I talked to my professor about the issues I was running into, specifically when lexing string and bytes literals. Apparently the way to handle this is lexical states, which I would have known if I were actually taking the compilers class. I found this page and this one, which look like really good references! So the plan is to create 4 start states for the different kinds of strings: one for a single quote ('), one for a double quote ("), one for three single quotes ('''), and one for three double quotes ("""). Once the lexer is in one of those states, it only has to look for the matching closing delimiter, so I can handle each kind of string appropriately.
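To make the four-state idea concrete for myself, here is a minimal sketch in plain Python rather than in the lexer generator's own syntax. It is not the project's code: the names `DELIMITERS` and `lex_strings` are made up for illustration, and it assumes that a backslash escapes the next character and that single-quoted strings may not span lines.

```python
# Hypothetical sketch: DELIMITERS and lex_strings are illustrative names only.
# Triple quotes are listed first so they are tried before the one-character quotes.
DELIMITERS = ('"""', "'''", '"', "'")

def lex_strings(source):
    """Return a list of (delimiter, body) pairs for the string literals in source."""
    tokens = []
    state = None  # None is the default state; otherwise the delimiter that opened the string
    body = []
    i = 0
    while i < len(source):
        if state is None:
            for delim in DELIMITERS:
                if source.startswith(delim, i):
                    state = delim  # enter the start state for this kind of string
                    body = []
                    i += len(delim)
                    break
            else:
                i += 1  # not a string start; a real lexer would match other tokens here
        elif source[i] == "\\" and i + 1 < len(source):
            body.append(source[i:i + 2])  # keep the escape so an escaped quote cannot close the string
            i += 2
        elif source.startswith(state, i):
            tokens.append((state, "".join(body)))
            i += len(state)
            state = None  # back to the default state
        elif source[i] == "\n" and len(state) == 1:
            raise SyntaxError("newline inside a single-quoted string")
        else:
            body.append(source[i])
            i += 1
    if state is not None:
        raise SyntaxError("unterminated string literal")
    return tokens
```

The useful part is that inside a triple-quoted state a lone " or ' is just an ordinary character, which is exactly the case that is awkward to express with a single regular expression.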
I also learned that all of the logic for indents/dedents is already in place, so I really don't have to do much of anything for that. The only change I need to make is to the output of his function: it currently prints "{" for an indent and "}" for a dedent, and it should print "(INDENT)" for an indent and "(DEDENT)" for a dedent.
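For reference, this is roughly how I understand the usual stack-based approach to indents and dedents. It is a sketch in Python, not my professor's function: `indent_tokens` is a made-up name, and it assumes indentation uses spaces only and that blank lines are ignored.

```python
# Hypothetical sketch of stack-based indent tracking; not the provided function.
def indent_tokens(lines):
    """Return the stripped lines interleaved with (INDENT)/(DEDENT) markers."""
    stack = [0]  # indentation widths of the currently open blocks
    out = []
    for line in lines:
        if not line.strip():
            continue  # assumption: blank lines do not affect indentation
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)
            out.append("(INDENT)")
        else:
            while width < stack[-1]:
                stack.pop()
                out.append("(DEDENT)")
            if width != stack[-1]:
                raise IndentationError("dedent does not match any outer level")
        out.append(line.strip())
    while len(stack) > 1:  # close any blocks still open at the end of the input
        stack.pop()
        out.append("(DEDENT)")
    return out
```

With this shape, switching from braces to the required output is only a matter of changing the two marker strings.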
Last but not least, I was using "EOF" for the end-of-file marker, but that was matching the literal string "EOF" in the input rather than the actual end of the file. I changed it to "<<EOF>>" and now that's working.
My plan is to finish this and start the next project by Tuesday. Look for one more update on this project (hopefully to say that I figured everything out!), and then I'll start work on the Python parser!