I ended up with 4 start states for string literals and 4 for bytes literals. They were pretty straightforward and massively easier than what I was originally doing. My code is still a wee bit buggy, though. For example, here's some output for some regexes I tried in my string_start_states.l file:
([^\\]|"\\".)*
"""What up, bee?!"""
dsf
(LIT "What up, bee?!"""
dsf
")
([^\\"]|"\\".)*
"""What up, "bee"?!"""
(LIT "What up, ")"(LIT "bee")"(LIT "?!")
([^\\"]|"\\".)*|"\""|"\"\""
"""What up, "bee"?!"""
(LIT "What up, ")(LIT """)(LIT "bee")(LIT """)(LIT "?!")
The first gets into an infinite loop because the "0 or more of anything except a backslash" matches the 3 quotes that are supposed to end the state. The second treats the quotes as single quotes and the third is just so wrong...
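As a rough illustration of why the first two patterns misbehave, here is a small Python sketch that translates them into Python's `re` syntax and runs them on the text that follows the opening quotes. This is not the flex code itself: the names `FIRST`, `SECOND`, `LOOKAHEAD` and `body_match` are made up for the example, and the third pattern is only one possible way to let lone quotes through. It relies on a negative lookahead, which flex does not offer, so in flex the same idea would presumably need separate rules inside the start state.

```python
import re

# Python translations of the flex patterns ("\\" in flex is a literal backslash).
FIRST = re.compile(r'(?:[^\\]|\\.)*')    # anything except a backslash, or an escape pair
SECOND = re.compile(r'(?:[^\\"]|\\.)*')  # same, but double quotes are excluded too

# Hypothetical alternative (an assumption, not from the post): a quote is
# allowed in the body unless it is the start of the closing """.
LOOKAHEAD = re.compile(r'(?:[^\\"]|\\.|"(?!""))*')


def body_match(pattern, text):
    """Return what the pattern consumes from the start of text."""
    return pattern.match(text).group(0)


# Input as seen after the opening """ has already been consumed.
after_open_1 = 'What up, bee?!"""\ndsf\n'
after_open_2 = 'What up, "bee"?!"""'

# The closing quotes are not backslashes, so the first pattern swallows
# them along with everything after them.
print(repr(body_match(FIRST, after_open_1)))      # 'What up, bee?!"""\ndsf\n'

# The second pattern stops at the first quote, even a lone one.
print(repr(body_match(SECOND, after_open_2)))     # 'What up, '

# The lookahead variant keeps lone quotes and stops before the closing """.
print(repr(body_match(LOOKAHEAD, after_open_2)))  # 'What up, "bee"?!'
```

The first result lines up with the output above, where the literal runs past the closing quotes and picks up dsf as well.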
But I ran my lexer on the provided test code and the output was as expected except in two places -- both of which involve \ as a line-continuation character. >_<
So I ought to fix that and escape sequences in strings. But my program is FINALLY essentially done, and that feels great!