Thursday, September 12, 2013

Syntax Analysis Example 3

I don't know if I'm going to support Object Oriented Programming (OOP) in my "language" (at least not initially), but this example introduces some new features that I haven't encountered in the others, so I thought I'd include it. (I'll still comment on the class definition syntax, just in case I do decide to include OOP in the future).

class DisjointSet

    private int[] parent
    private int[] rank

    DisjointSet (n)
        parent = new int[n]
        rank = new int[n]
        for (i = 1; i  n; i++)
            parent[i] = i
            rank[i] = 0

    int find (x)
        while (x  parent[x])
            x = parent[x]
        return x

    union (x, y)
        rx = find(x)
        ry = find(y)
        if rx == ry
            return
        else if rank[rx] > rank[ry]
            parent[ry] = rx
        else if rank[rx] < rank[ry]
            parent[rx] = ry
        else
            parent[rx] = ry
            rank[ry] = rank[ry] + 1

First of all, the definition syntax is pretty close already. All the first line needs is a colon after the class name. Python doesn't do member variable declarations, so I guess I would just ignore those two lines (although, it does have a strange way of defining visibility, so maybe I would want to keep track of them being private?).

Python uses __init__ ( ) for constructor syntax, instead of a method with the same name as the class. I think this would be pretty easy to translate (just replace the name with __init__ and keep the same arguments). However, some languages can have multiple constructors, and users might want to emulate this in their pseudocode, whereas Python can only have one init function. This problem is solved in Python by using default values in the init function that can be overridden, but it might be tricky to try and meld multiple constructors together this way...

One last OOP concept: since Python doesn't do member variable declarations, every class variable must be prefixed with "self" (e.g. self.parent). I would probably handle this by storing any variables that are initialized in the constructor somewhere, and then appending "self." to the front of every instance of these variables. And speaking of the self keyword, this is also required to be the first keyword in any class method declaration. If I can properly recognize classes and functions in general, I think it would be relatively easy to add "self, " to the beginning of every methods' argument list.

Alright, new for loop syntax! This one has a few things that the "for each" syntax doesn't have: parenthesis and semicolons. In general I was thinking about allowing the user to use semicolons, but just ignoring them. However, that wouldn't really work in this case. I think I would just allow this syntax to be valid (since it's such a prevalent construct in some many languages). In a way this is easy. It's essentially equal to "for i in range (1, n+1)" (I do need to pay attention to whether the operator is less than or less than equal to, in order to determine whether I need to add one, since Python's range goes up to n-1). The tricky part is the increment value... Python's range naturally increments by 1, but you can add an optional argument to specify how much it should increment or decrement by. Since the last term in the standard for loop is an expression, I would have to parse this individually to determine the user's intent (e.g. the increment operator indicates that the loop should increment by 1 every time, but i *=2 indicates that the loop should increment by i*2 every time).

There are some instances in here in which my professor used symbols that aren't found on a standard keyboard (≠ and ≤). I think I can safely ignore these, since most people won't be able to type these, and I think he only included them because this code was in some slides. 

This example also includes a while loop. This is pretty much the same as in Python, except for the missing colon, so I would just add that during translation.

Two last things: first, is the introduction of "else if". This should be allowed, but it will be replaced by "elif" during translation. And finally, his "if" and "else if" lines don't have parenthesis. I'll have to think about whether I ever want to allow this, but for now I think this will be a compiler/interpreter error (expected '(' line __).

CONCLUSION:

  • OOP?
  • Standard for loop syntax (initial; final; increment) should be allowed.
  • Standard while loop syntax should be allowed.
  • "else if", "elsif", "elseif", and "elif" should be allowed and treated as an "elif".


No comments:

Post a Comment