Epic Epoch Update

posted in The Bag of Holding

Published July 26, 2008

True to form, I can't resist the urge to fit in a little more polishing and feature work on Epoch before the first release. Right now I'm putting together the syntax error handling.

As anyone with experience in language implementation knows, handling syntax errors intelligently requires a string of minor miracles. Usually, this phase leaves the programmer indebted to several evil entities, all of which want a piece of his soul. There may also be virgin sacrifices involved, depending on what level of user-friendliness the errors have.

In Epoch, there are currently two types of errors: syntactical errors, and identifier resolution errors. (Obviously this will increase to include type-system violations and other such goodies as features are added to the language.) Using boost::spirit's assertive parsers, capturing syntactical errors is a breeze; very precise information on the location and cause of the error can be delivered to the user.
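To make the idea of an assertive parse step concrete, here is a minimal hand-rolled sketch in plain C++. It deliberately does not use boost::spirit, and it is not Epoch's actual code: `SyntaxError`, `Cursor`, and `parseParenPair` are hypothetical names invented for illustration. The point is only that once the parser is committed to a construct, a mismatch can be reported with the exact line and column where it happened.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical error type: carries the position where an expectation failed.
struct SyntaxError : std::runtime_error {
    std::size_t line;
    std::size_t column;
    SyntaxError(const std::string& what, std::size_t l, std::size_t c)
        : std::runtime_error(what), line(l), column(c) {}
};

// Minimal cursor over the source text that tracks line and column (1-based).
class Cursor {
public:
    explicit Cursor(const std::string& text) : text_(text) {}

    bool atEnd() const { return pos_ >= text_.size(); }

    // Consume one character, updating the line/column bookkeeping.
    void advance() {
        assert(!atEnd());
        if (text_[pos_] == '\n') { ++line_; column_ = 1; }
        else { ++column_; }
        ++pos_;
    }

    void skipWhitespace() {
        while (!atEnd() &&
               (text_[pos_] == ' ' || text_[pos_] == '\t' || text_[pos_] == '\n'))
            advance();
    }

    // "Assertive" match: a mismatch here is a hard error at the current
    // position rather than a signal to backtrack and try something else.
    void expect(char wanted) {
        skipWhitespace();
        if (atEnd() || text_[pos_] != wanted)
            throw SyntaxError(std::string("expected '") + wanted + "'",
                              line_, column_);
        advance();
    }

private:
    std::string text_;
    std::size_t pos_ = 0;
    std::size_t line_ = 1;
    std::size_t column_ = 1;
};

// Toy grammar for the sketch: an opening paren followed by a closing paren.
void parseParenPair(const std::string& source) {
    Cursor cursor(source);
    cursor.expect('(');
    cursor.expect(')');
}
```

Feeding this the two-line input `"(\n  ]"` throws a `SyntaxError` whose position is line 2, column 3, which is exactly where the stray `]` sits.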

Identifier resolution errors aren't so convenient. For example, consider the case where we call a non-existent function Foo. Because of the way the Epoch VM is designed, we don't ever actually look for the code of Foo until the call is executed at runtime. This laziness has a small cost: by the time we are executing the program, the original text source has been discarded, and all we have is the binary form of the code. This means we can't show the user where the error actually occurred.

Solving this is actually fairly simple. Parsing of an Epoch program will be divided into three phases:

Syntactic De-Sugaring
Epoch supports what I call dynamic sugar, where syntax elements can be added to the language from within an Epoch program itself. This allows for very powerful DSL creation capabilities, as well as handy shorthand for various operations. Think of it like operator overloading on crack.

The first phase of Epoch parsing is to convert this sugary version of the program into the "pure" syntax, which is very simple and looks a bit like the bastard child of Lisp and C. This phase is implemented by an ad-hoc parser rather than boost::spirit in order to handle user-defined syntax properly. The output is a pure program with identical behaviour to the sugary input program.
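As a rough illustration of what a de-sugaring pass does, here is a tiny C++ sketch. Everything in it is an assumption made up for the example: the `SugarTable` type, the `defineSugar` and `desugar` helpers, the whitespace-separated `a + b` input, and the `add(a, b)` output are not Epoch's real syntax or implementation. It only shows the general shape: a table of registered sugars drives a purely textual rewrite into function-call form.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sugar table: infix token -> name of the "pure" function.
using SugarTable = std::map<std::string, std::string>;

// Register a new piece of sugar (in the real language this would come
// from the program being parsed; here it is just a function call).
void defineSugar(SugarTable& table, const std::string& token,
                 const std::string& function) {
    assert(!token.empty() && !function.empty());
    table[token] = function;
}

std::vector<std::string> splitOnWhitespace(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    std::string tok;
    while (in >> tok) tokens.push_back(tok);
    return tokens;
}

// Rewrite a single "lhs op rhs" expression into "fn(lhs, rhs)".
// Anything that does not match a registered sugar passes through untouched.
// No validation is done on the operands; that is left to a later phase.
std::string desugar(const std::string& expr, const SugarTable& sugar) {
    const std::vector<std::string> t = splitOnWhitespace(expr);
    if (t.size() == 3) {
        const auto it = sugar.find(t[1]);
        if (it != sugar.end())
            return it->second + "(" + t[0] + ", " + t[2] + ")";
    }
    return expr;
}
```

With `+` registered as `add`, `desugar("a + b", sugar)` yields `add(a, b)`. Nonsense input such as `+ + +` is happily rewritten to `add(+, +)`, since this pass makes replacements without checking that the result is valid.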

Syntax Checking and Lexical Scope Population
The de-sugaring parser is "dumb" - it makes syntactic replacements without verifying their validity. The pure syntax is then verified by the second phase of parsing, the syntax checker. This phase is implemented in boost::spirit and serves two purposes.

First and most obviously, it detects any syntax issues and reports them. Secondly, it creates a lexical scope for each code block as needed, and registers all defined variables and functions into their appropriate scopes. (Yes, Epoch supports nested functions.) When this phase ends, there are two possible outcomes. Either the program contains a syntax error and parsing fails, or the program is clean, and the VM now holds a complete list of every function and variable in the program itself.
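The scope bookkeeping described above could look something like the following C++ sketch. This is an assumption about the general technique, not the Epoch VM's actual data structure; the `Scope` type and its `openChild`, `define`, and `isVisible` members are hypothetical names. Each code block gets its own scope, and each scope keeps a link to the block that encloses it.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Hypothetical scope record: the names defined directly in one code block,
// plus a link to the enclosing block's scope.
struct Scope {
    Scope* parent = nullptr;
    std::set<std::string> names;
    std::vector<std::unique_ptr<Scope>> children;

    // Open a nested scope for an inner block (for example a nested function).
    Scope& openChild() {
        children.push_back(std::make_unique<Scope>());
        children.back()->parent = this;
        return *children.back();
    }

    // Register a definition; returns false if the name already exists
    // in this exact scope.
    bool define(const std::string& name) {
        assert(!name.empty());
        return names.insert(name).second;
    }

    // Walk outward through the enclosing scopes looking for the name.
    bool isVisible(const std::string& name) const {
        for (const Scope* s = this; s != nullptr; s = s->parent)
            if (s->names.count(name) != 0) return true;
        return false;
    }
};
```

A nested function defined inside `main` can see `main`'s siblings through the parent link, while the outer scope cannot see names that were defined only in the inner one.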

Conversion to Binary Operation Form
The final stage of parsing involves actually analyzing each statement and expression in the program, and constructing the binary equivalents in the VM itself. This phase is also implemented via boost::spirit, and uses a very similar grammar to the second phase, with different semantic actions. This phase can only fail in one way: if identifier resolution fails. Each function call and variable reference is checked to see if the appropriate variable/function actually exists, using the scope data constructed during phase 2.

Once this final phase finishes, the VM holds a complete binary representation of the program, ready for execution and guaranteed to be free of both syntax errors and identifier resolution errors.
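Here is a minimal C++ sketch of why checking identifiers at parse time fixes the error-reporting problem: every reference is still attached to its source position when it is checked. The `Reference` and `ResolutionError` types and the `resolveAll` function are hypothetical, and a flat set of known names stands in for the real per-block scope data, so treat this as an illustration of the idea rather than the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical record of one identifier use, kept with its source line.
struct Reference {
    std::string name;
    std::size_t line;
};

// Hypothetical error report that can point back at the original source.
struct ResolutionError {
    std::string message;
    std::size_t line;
};

// Check every reference against the names collected by the earlier phase.
// Nothing is executed here; this is purely a parse-time check.
std::vector<ResolutionError> resolveAll(const std::vector<Reference>& refs,
                                        const std::set<std::string>& known) {
    std::vector<ResolutionError> errors;
    for (const Reference& ref : refs) {
        assert(ref.line > 0);  // line numbers are 1-based in this sketch
        if (known.count(ref.name) == 0)
            errors.push_back({"unknown identifier '" + ref.name + "'", ref.line});
    }
    return errors;
}
```

If the program calls a non-existent function `Foo` on line 7, the result contains a single error carrying that line number, which is precisely the information that is gone by the time a lazily resolved call would fail at runtime.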

Obviously, as more features get implemented, the work of the phase 2 parser will become more and more sophisticated. Eventually, phase 2 will be far and away the most complex and time-consuming of the parse phases, because of things like closures and higher-order functions. Partial function application and other tricks will also have to be handled by the phase 2 parser.

I'm still working (slowly) on getting the code polished up and ready for release, but at the moment I'm sidetracked on this whole error-handling thing, because it's just too damn fun. Seeing the language come together is immensely rewarding.

Sadly, there's an upcoming deadline of rather high urgency at work, so I can't afford much time at all for Epoch tinkering. Because of that time sink, the release will have to wait a bit, but I expect to have it ready in a few weeks at the latest. Don't miss it - Epoch is going places.



ApochPiQ

Author
