Lukas Renggli

Composite Grammars with PetitParser

Defining the Grammar

As an example let’s create a composite parser using the same expression grammar we built in the last blog post:

PPCompositeParser subclass: #ExpressionGrammar
   instanceVariableNames: ''
   classVariableNames: ''
   poolDictionaries: ''
   category: 'PetitTutorial'

Again we start with the grammar for an integer number. Define the method number in ExpressionGrammar as follows:

ExpressionGrammar>>number
   ^ #digit asParser plus token trim ==> [ :token | token value asNumber ]

Every production in ExpressionGrammar is specified as a method that returns its parser. Productions refer to each other by reading the respective instance variable of the same name. This is important to be able to create recursive grammars. The instance variables themselves are typically not written to as PetitParser takes care to initialize them for you automatically.

Next we define the productions term, prod, and prim. Contrary to our previous implementation we do not define the production actions yet; and we factor out the parts for addition (add), multiplication (mul), and parenthesis (parens) into separate productions. This will give us better reusability later on. We let Pharo automatically add the necessary instance variables as we refer to them for the first time.

ExpressionGrammar>>term
   ^ add / prod

ExpressionGrammar>>add
   ^ prod , $+ asParser trim , term

ExpressionGrammar>>prod
   ^ mul / prim

ExpressionGrammar>>mul
   ^ prim , $* asParser trim , prod

ExpressionGrammar>>prim
   ^ parens / number

ExpressionGrammar>>parens
   ^ $( asParser trim , term , $) asParser trim

Last but not least we define the starting point of the expression grammar. This is done by overriding start in the ExpressionGrammar class:

ExpressionGrammar>>start
   ^ term end

Instantiating the ExpressionGrammar gives us an expression parser that returns a default abstract-syntax tree:

parser := ExpressionGrammar new.
parser parse: '1 + 2 * 3'.       " --> #(1 $+ #(2 $* 3)) "
parser parse: '(1 + 2) * 3'.     " --> #(#($( #(1 $+ 2) $)) $* 3) "

Defining the Evaluator

Now that we have defined a grammar we can reuse this definition to implement an evaluator. To do this we create a subclass of ExpressionGrammar called ExpressionEvaluator

ExpressionGrammar subclass: #ExpressionEvaluator
   instanceVariableNames: ''
   classVariableNames: ''
   poolDictionaries: ''
   category: 'PetitTutorial'

and we redefine the implementation of add, mul and parens with our evaluation semantics:

ExpressionEvaluator>>add
   ^ super add ==> [ :nodes | nodes first + nodes last ]

ExpressionEvaluator>>mul
   ^ super mul ==> [ :nodes | nodes first * nodes last ]

ExpressionEvaluator>>parens
   ^ super parens ==> [ :nodes | nodes second ]

The evaluator is now ready to be tested:

parser := ExpressionEvaluator new.
parser parse: '1 + 2 * 3'.       " --> 7 "
parser parse: '(1 + 2) * 3'.     " --> 9 "

Similarly — as an exercise — a pretty printer can be defined by subclassing ExpressionGrammar and by redefining a few of its productions:

parser := ExpressionPrinter new.
parser parse: '1+2 *3'.          " --> '1 + 2 * 3' "
parser parse: '(1+ 2 )* 3'.      " --> '(1 + 2) * 3' "

Posted by Lukas Renggli at 27 November 2010, 3:14 pm with tags tutorial, petitparser, smalltalk, pharo link

Lukas Renggli

Composite Grammars with PetitParser

Archive

Tags

Defining the Grammar

Defining the Evaluator