Dec 21, 2012

A simple turtle graphics Domain Specific Language (DSL) parser

I've been working through some of the exercises in 'The Pragmatic Programmer' while using it to mentor a colleague.  In particular, I spent a bit of time working on the Domain Specific Language section and considering creating little DSLs to provide flexible control. I've painfully implemented DSLs in SystemVerilog/ UVM sequences in the past and thought about how I typically do this: build a parser, create tokens, then dispatch various execution functions to implement the commands.  Often, the pain of building all these pieces in a language like C or SV is enough of a barrier that I wouldn't even start.  

The example in the book is a simple Logo/ Turtle control language.  Pen up, Pen down, draw, turn, etc. It tries to simplify things by using single letter command codes and only one optional argument.

P 2


W 2

N 1

E 1 

S 1


I know Python has a Logo/ Turtle engine built in so I decided to try to write a command parser/ dispatcher that would work with that, and let me write commands in a text file and have them execute Python Turtle commands.

Here's an example script (you can see the results of this at the end of this post):

color "green"
goto 0 -50 # comments are allowed
circle 50
color "red"
write "Done!"
sleep 2

I thought initially about having to parse out all the commands and arguments, then writing a large switch/ case statement (or the Python equivalent with a dictionary).  After thinking about it a little longer I realised I didn't have to do that at all.  I could use the Introspection in Python to look up available methods and if they exist, call them. In fact, as not finding the method will just cause an exception, I can just try executing any command and if it fails, catch it and move on.

So after parsing the input text stream (throw away comments, break up tokens using whitespace) then I just try to execute the command in the global namespace.  I've pulled all the turtle functions into that namespace, so any turtle function is a valid command in my little parser.  The globals()[command[0]] in the code below looks up the function in the global namespace and then calls it, using the other parts of the command (command[1:]), after they've each been processed through the Python eval function to convert them from strings to whatever format they happen to be (numbers or strings mainly).  The final trick in this is using the * operator to take take a list and pass it as one argument after another to the function:

globals()[command[0]](*(map(eval, command[1:])))

And that's all that's needed to implement a full Logo language parser and execution engine. The command syntax is fault tolerant and reports errors, with line numbers.  New commands can be added easily, by defining new functions.  They'll be automatically supported as they are added to the namespace.

Being able to pull something like this together quickly means that writing little Domain Specific Languages is possible and quite a low bar.  Doing something similar in C is often more daunting and even worse in a language like SystemVerilog, with such a poor string and file handling library.  There's a real value in being able to program at such a high level, that can greatly enhance what's possible or likely to be attempted in a verification environment. You could do this in SystemVerilog, but how often would you even think to attempt it, without rejecting it as too much work?


# pull all the turtle commands into the global namespace, so they are valid commands
from turtle import *

# Use these 3 lines to make 'sleep' and 'exit' useable commands
from time import sleep
from sys import exit, argv

# Given a handle to a series of strings of commands, do them
def parse_and_draw(commands):
    for (line_number, line) in enumerate(commands):
        line = line.split('#')[0]  # throw away comments

        if line:  # if there is anything left after getting rid of comments

            command = line.strip().split(' ')  # parse using spaces to delimit tokens  
                                               # a big limitation of this, we can't have strings with spaces
                                               # e.g.,  "hello world" won't work as it'll get split up
                                               # into ['"hello', 'world"'] neither of which bits are valid when eval'ed

            if command[0]:  # if we have any command left 
                            # (e.g., an indented comment would dissappear)

                    # The meat of the dispatcher is the next line
                    # using a try/ except means we can always try to run any command
                    # and assume that it is valid and catch if it isn't.
                    # globals() returns a dictionary of every function defined in the global namespace
                    # including all the turtle commands because of the from turtle import *
                    # command[0] is used as a key to look up the function name
                    # we then pass all of the other tokens (command[1:]) through eval (using map)
                    # and pass them as arguments to the function we looked up

                    globals()[command[0]](*(map(eval, command[1:])))

                    # a key error occurs if we don't find command[0] in the global namespace
                except KeyError:
                    print 'Unknown command', command[0], 'on line ', line_number

                    # Some other error occurred (e.g., the called function raised an Exception)
                    # report it here and continue on (if we didn't catch it, the program would end)
                except Exception as e:
                    print 'Invalid command', command[0], 'on line ', line_number,  e

if __name__ == '__main__':
    # open the first file on the command line, get commands from that and run them
    commands = open(argv[1]).xreadlines()

    # note that the parse and draw routine works on a list of commands, it doesn't know about files
    # or anything else.  This seperation is is useful, as we can get commands from anywhere


12:21:12 10:37 AM

There are comments.