My code can be found here: https://bitbucket.org/ashley_dunn/compilers-project-1.
I'm starting out by doing some research, specifically reading these:
- http://matt.might.net/articles/standalone-lexers-with-lex/
- http://flex.sourceforge.net/manual/index.html#Top
- http://matt.might.net/articles/lexing-and-syntax-highlighting-in-javascript/
- http://docs.python.org/3/reference/lexical_analysis.html
Output:
There are 8 valid tokens to parse:
- (NEWLINE) -- for a logical newline.
  - possible newlines include: \n \r \r\n
- (INDENT) -- for a logical increase in indentation.
  - should be spaces, not tabs
- (DEDENT) -- for a logical decrease in indentation.
- (ID name) -- for an identifier.
  - possible identifiers match this regex: [A-Za-z_][A-Za-z_0-9]*
- (LIT value) -- for a literal value.
  - possible literals are too numerous to copy pasta here, so check them out in the Python lexical analysis reference listed above.
- (KEYWORD symbol) -- for a keyword.
  - possible keywords include: False class finally is return None continue for lambda try True def from nonlocal while and del global not with as elif if or yield assert else import pass break except in raise
- (PUNCT text) -- for operators and delimiters.
  - possible delimiters include: ( ) [ ] { } , : . ; @ = += -= *= /= //= %= &= |= ^= >>= <<= **=
  - possible operators include: + - * ** / // % << >> & | ^ ~ < > <= >= == !=
- (ENDMARKER) -- for the end of the input.
- If you encounter a lexical error, print (ERROR "explanation") and quit.
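To make the ID / KEYWORD / PUNCT rules above concrete, here is a rough Python sketch. It is not the code from my repository. The names `lex_line`, `KEYWORDS`, `PUNCT`, `ID_RE` and `PUNCT_RE` are made up for illustration, and it assumes the input is one logical line containing only names, punctuation and spaces (no literals, no indentation). The one subtle point it shows is that punctuation has to be matched longest-first, so that `**=` is not split into `*`, `*` and `=`.

```python
import re

# Keyword list copied from the token description above.
KEYWORDS = frozenset("""
False class finally is return None continue for lambda try True def from
nonlocal while and del global not with as elif if or yield assert else
import pass break except in raise
""".split())

# Delimiters and operators copied from the token description above.
PUNCT = """
( ) [ ] { } , : . ; @ = += -= *= /= //= %= &= |= ^= >>= <<= **=
+ - * ** / // % << >> & | ^ ~ < > <= >= == !=
""".split()

ID_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")
# Try longer alternatives first so "**=" wins over "**" and "*".
PUNCT_RE = re.compile(
    "|".join(re.escape(p) for p in sorted(PUNCT, key=len, reverse=True))
)


def lex_line(text):
    """Tokenize one logical line (assumption: names, punctuation, spaces only)."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos] == " ":          # skip blanks between tokens
            pos += 1
            continue
        m = ID_RE.match(text, pos)
        if m:
            word = m.group()
            # A name that appears in the keyword list is a KEYWORD, otherwise an ID.
            kind = "KEYWORD" if word in KEYWORDS else "ID"
            tokens.append(f"({kind} {word})")
        else:
            m = PUNCT_RE.match(text, pos)
            if not m:
                # Lexical error: report it and stop, as the spec asks.
                tokens.append(f'(ERROR "unexpected character {text[pos]!r}")')
                return tokens
            tokens.append(f"(PUNCT {m.group()})")
        pos = m.end()
    tokens.append("(NEWLINE)")
    return tokens
```

For example, `lex_line("x **= y")` gives `(ID x)`, `(PUNCT **=)`, `(ID y)`, `(NEWLINE)`. A real lexer would also need the literal rules and the indentation handling, which this sketch leaves out.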
- a little more logic for indent/dedent
- support for literals
- support for keywords
- more operators and delimiters
- and support for EOF
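For the indent/dedent logic mentioned in the list above, the Python lexical analysis reference describes a stack of indentation widths. Below is a small sketch of that idea in Python, not my actual implementation. The function name `indentation_tokens` is hypothetical, it only looks at leading spaces (treating a leading tab as an error is my own assumption, based on "spaces, not tabs"), and it emits only the structural tokens, skipping whatever is on the rest of each line.

```python
def indentation_tokens(lines):
    """Return INDENT/DEDENT/NEWLINE/ENDMARKER tokens for a list of source lines."""
    stack = [0]          # indentation widths currently open; 0 is never popped
    out = []
    for line in lines:
        stripped = line.lstrip(" ")
        if not stripped.strip():
            continue     # blank lines do not affect indentation
        if stripped.startswith("\t"):
            # Assumption: tabs in indentation are rejected outright.
            out.append('(ERROR "tab in indentation")')
            return out
        width = len(line) - len(stripped)
        if width > stack[-1]:
            # Deeper than the current level: open one new block.
            stack.append(width)
            out.append("(INDENT)")
        else:
            # Shallower: close blocks until we reach a matching level.
            while width < stack[-1]:
                stack.pop()
                out.append("(DEDENT)")
            if width != stack[-1]:
                out.append('(ERROR "inconsistent dedent")')
                return out
        out.append("(NEWLINE)")   # stands in for the end of this logical line
    # At end of input, close every block that is still open.
    out.extend("(DEDENT)" for _ in stack[1:])
    out.append("(ENDMARKER)")
    return out
```

Calling `indentation_tokens(["if x:", "    y", "z"])` yields `(NEWLINE)`, `(INDENT)`, `(NEWLINE)`, `(DEDENT)`, `(NEWLINE)`, `(ENDMARKER)`. The stack is what lets a single line produce several `(DEDENT)` tokens at once when it drops back more than one level.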