The New Pier Parser

The latest version of Pier introduces a new Wiki parser. The previous parser was built using SmaCC and dates back to the very early versions of SmallWiki. The new parser is hand-written. This might look like a big step back, but there were several compelling reasons to stop using an EBNF-based parser:

  • Extensibility: The hand-written parser is fully pluggable and can be extended without having to specify a new grammar. Of course, this can be done as part of a crosscutting extension.
  • Complexity: The EBNF of the Wiki syntax got very complex over time. With every change, new rules had to be defined in the different contexts of the paragraph, table, or preformatted text. Adding new formatting was a nightmare. The internals of link definitions had already been parsed manually for quite some time.
  • Speed: The hand-written parser is about four times faster than the old one when run over all the pages of my web site.
  • Tests: The hand-written parser is easier to test, as there is no generated code. Of course, all the existing parser tests still pass.

The new parser supports simple formatting, represented as first-class nodes in the resulting Wiki AST. Of course, all the new and old tags can be arbitrarily nested: italic text with a superscript part, monospaced text with a bold part, etc. In my next post I will demonstrate how to extend the parser with a new node type.
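To make the idea of pluggable, nestable formatting nodes concrete, here is a minimal sketch in Python rather than Smalltalk. It is not Pier's actual implementation: the names `Text`, `Format`, `MARKUPS` and `parse_inline`, as well as the delimiters in the table, are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Text:
    """Leaf node holding plain text."""
    value: str


@dataclass
class Format:
    """First-class formatting node; children may be Text or nested Format nodes."""
    kind: str
    children: list = field(default_factory=list)


# Hypothetical markup table (delimiter -> node kind); not Pier's real syntax.
MARKUPS = {'""': "bold", "''": "italic", "==": "monospaced", "^^": "superscript"}


def parse_inline(source, markups=MARKUPS):
    """Parse inline markup into a list of nodes using the given markup table."""
    nodes, _ = _parse(source, 0, None, markups)
    return nodes


def _parse(source, pos, closing, markups):
    """Recursive descent: read until `closing` (or end of input) is reached."""
    nodes, buffer = [], []

    def flush():
        # Turn any pending plain characters into a Text node.
        if buffer:
            nodes.append(Text("".join(buffer)))
            buffer.clear()

    while pos < len(source):
        if closing is not None and source.startswith(closing, pos):
            flush()
            return nodes, pos + len(closing)
        for delimiter, kind in markups.items():
            if source.startswith(delimiter, pos):
                flush()
                # Recurse so that formatting can nest arbitrarily deep.
                children, pos = _parse(source, pos + len(delimiter), delimiter, markups)
                nodes.append(Format(kind, children))
                break
        else:
            buffer.append(source[pos])
            pos += 1
    # In this sketch an unclosed markup simply ends at the end of the input.
    flush()
    return nodes, pos
```

In this sketch, adding a new kind of formatting means adding an entry to the table instead of rewriting a grammar, for example `parse_inline("__x__", {**MARKUPS, "__": "underline"})`.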

Posted by Lukas Renggli at 12 April 2007, 1:25 am with tags pier, parser

Comments

Marvellous, fantastic, wonderful...

Posted by Damien Cassou at 12 April 2007, 1:25 pm