The Pseudocode Translator Project: Compilers Project 1: Getting Started

I'm working on a lexer for Python.

My code can be found here: https://bitbucket.org/ashley_dunn/compilers-project-1.

I'm starting out by doing some research, specifically reading these:

Based on the little reading I have done so far, I'm leaning towards Racket for the lexer (since I'm using it in my Programming Languages class, and it seems like an appropriate tool for the job).

Output:

The lexer emits 8 kinds of tokens:

(NEWLINE) -- for a logical newline.

possible newlines include:
\n \r \r\n
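
Here is a minimal Python sketch of recognizing the three newline forms. It is for illustration only and is not the Racket or lex implementation. It assumes `\r\n` should count as a single newline, and the names `NEWLINE_RE` and `split_physical_lines` are hypothetical. It splits physical lines only; deciding which of them end a logical line is a separate step.

```python
import re

# Try \r\n first so a Windows line ending is not counted as two newlines.
NEWLINE_RE = re.compile(r"\r\n|\r|\n")

def split_physical_lines(source):
    """Split source text on any of the three newline forms."""
    return NEWLINE_RE.split(source)
```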

(INDENT) -- for a logical increase in indentation.

should be spaces, not tabs

(DEDENT) -- for a logical decrease in indentation.
(ID name) -- for an identifier.

possible identifiers match this regex: [A-Za-z_][A-Za-z_0-9]*
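
To try that regex out, here is a small Python sketch, again illustrative and not the project's actual code. The helper name `match_identifier` is hypothetical.

```python
import re

# The identifier pattern from the project description.
ID_RE = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")

def match_identifier(text, pos=0):
    """Return the identifier that starts at pos, or None if there is none."""
    m = ID_RE.match(text, pos)
    return m.group(0) if m else None
```

Because the pattern cannot start with a digit, `match_identifier("1abc")` finds no match, while `match_identifier("foo_1 = 2")` stops at the space.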

(LIT value) -- for a literal value.

possible literals are too numerous to copy pasta here, so check them out in this link.

(KEYWORD symbol) -- for a keyword.

possible keywords include:

False      class      finally    is         return
None       continue   for        lambda     try
True       def        from       nonlocal   while
and        del        global     not        with
as         elif       if         or         yield
assert     else       import     pass
break      except     in         raise
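
Every keyword also matches the identifier regex, so one common approach is to match a word first and then look it up in a keyword set. The Python sketch below assumes that approach. The function name `classify_word` and the tuple representation of tokens are hypothetical.

```python
# The keyword table above, stored as a set for fast membership tests.
KEYWORDS = frozenset("""
False class finally is return
None continue for lambda try
True def from nonlocal while
and del global not with
as elif if or yield
assert else import pass
break except in raise
""".split())

def classify_word(word):
    """Tag an already-matched word as a KEYWORD or an ID token."""
    kind = "KEYWORD" if word in KEYWORDS else "ID"
    return (kind, word)
```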

(PUNCT text) -- for operators and delimiters.

possible delimiters include:

(       )       [       ]       {       }
,       :       .       ;       @       =
+=      -=      *=      /=      //=     %=
&=      |=      ^=      >>=     <<=     **=

possible operators include:

+       -       *       **      /       //      %
<<      >>      &       |       ^       ~
<       >       <=      >=      ==      !=
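
Several of these tokens are prefixes of others (`*`, `**`, `**=`), so the lexer has to prefer the longest one that fits. A minimal Python sketch of that longest-match rule follows. It assumes a simple linear scan, and the names `PUNCT`, `PUNCT_BY_LENGTH`, and `match_punct` are hypothetical.

```python
# Delimiters and operators from the two tables above.
PUNCT = [
    "(", ")", "[", "]", "{", "}",
    ",", ":", ".", ";", "@", "=",
    "+=", "-=", "*=", "/=", "//=", "%=",
    "&=", "|=", "^=", ">>=", "<<=", "**=",
    "+", "-", "*", "**", "/", "//", "%",
    "<<", ">>", "&", "|", "^", "~",
    "<", ">", "<=", ">=", "==", "!=",
]

# Longest first, so "**=" is tried before "**" and "*".
PUNCT_BY_LENGTH = sorted(PUNCT, key=len, reverse=True)

def match_punct(text, pos=0):
    """Return the longest operator or delimiter starting at pos, or None."""
    for p in PUNCT_BY_LENGTH:
        if text.startswith(p, pos):
            return p
    return None
```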

(ENDMARKER) -- for the end of the input.
If you encounter a lexical error, print (ERROR "explanation") and quit.
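
The error rule could look something like this in Python. This is only a sketch: the exit status of 1, the lack of escaping inside the explanation, and the names `format_error` and `fail` are all assumptions, not part of the project description.

```python
import sys

def format_error(explanation):
    """Build the error token text in the (ERROR "explanation") shape."""
    return '(ERROR "%s")' % explanation

def fail(explanation):
    """Print the error token and stop lexing."""
    print(format_error(explanation))
    sys.exit(1)  # assumed exit status; the description only says "quit"
```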

The sample file provided in the project description (written in lex, which I may end up using instead of Racket) already takes care of newline, indent/dedent (mostly), id, and some operators/delimiters. So I need to add:

a little more logic for indent/dedent
support for literals
support for keywords
more operators and delimiters
and support for EOF
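
The indent/dedent logic from the list above is usually handled with a stack of indentation widths. The Python sketch below shows the idea under some assumptions: a tab in the indentation is treated as an error, blank and comment-only lines are skipped, and lines continued inside brackets are ignored. The function name `indent_tokens` and the `"LINE"` placeholder are hypothetical.

```python
def indent_tokens(lines):
    """Return layout tokens for a list of source lines (newlines removed)."""
    stack = [0]    # indentation widths of the currently open blocks
    tokens = []
    for line in lines:
        stripped = line.lstrip(" ")
        if not stripped or stripped.startswith("#"):
            continue  # blank and comment-only lines produce no tokens
        if stripped.startswith("\t"):
            raise ValueError("tabs are not allowed in indentation")
        width = len(line) - len(stripped)
        if width > stack[-1]:
            stack.append(width)
            tokens.append("INDENT")
        else:
            # Close every block that is deeper than this line.
            while width < stack[-1]:
                stack.pop()
                tokens.append("DEDENT")
            if width != stack[-1]:
                raise ValueError("dedent does not match any outer level")
        tokens.append("LINE")     # stand-in for the line's own tokens
        tokens.append("NEWLINE")
    # At end of input, close any blocks that are still open.
    while len(stack) > 1:
        stack.pop()
        tokens.append("DEDENT")
    tokens.append("ENDMARKER")
    return tokens
```

For example, `indent_tokens(["if x:", "    y", "z"])` places an INDENT before the second line's tokens and a DEDENT before the third's, then ends with ENDMARKER.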

Thursday, September 19, 2013
