Tuesday, November 5, 2013

Compilers Project 2: Parsing with Derivatives

I definitely don't have the best handle on parsing with derivatives... but! Here's what I know based on this and this, which are based on this and this:
  • the derivative of a regex with respect to a character produces a second regex
  • the second regular expression "matches what the original expression would match, assuming it had just matched the character"
Here's what I think I know based on that information:
  • the derivative of a regex with respect to a character eliminates all strings that do not start with the character, and eliminates the character from strings that do start with the character.
Example: the derivative of the following language

{ "I", "am", "a", "happy", "camper" }

with respect to a yields

{ "m", ε }

Where ε is the empty string (since the 'a' is eliminated, but not because there wasn't a match. The null pattern, ∅, is used for no match (i.e. "I"))
Of course, that's a trivial example. In a "real-world" context, a match is against a whole word that is matched character by character. So a more realistic example would be to match against the word "happy":
"I am a happy camper"
Dh(c) = "appy" 
because everything is eliminated with respect to 'h' -- except "happy", which derives to "appy". 
Next, the derivative is taken in respect to 'a':
"appy"
Da(c) = "ppy"
And so on until the result is the empty string. An empty string means that the match was successful, while the null pattern indicates that the match failed.



Next steps
Derp, which stands for Derivative Parsers, is a tool written by my professor in Racket to make parser easier. I'm planning on using derp to write my parser for Python.

The documentation can be found here: http://matt.might.net/teaching/compilers/spring-2013/derp.html

No comments:

Post a Comment