In my spare time, I design and implement programming languages. I'm currently trying to build a language which more-or-less combines the performance of C, the learnability of Python, and the expressive power of LISP. This is a fairly ambitious goal--and one I've already failed to accomplish several times--but I think it's worth some effort. One of my friends has codenamed this effort "Mason".

Milestone 1: Weaving the Metaobjects

Over the last month or two of weekends, I worked out a type system which (1) is dynamic-language friendly and (2) supports parameterized types. It's a mixture of CLOS and Dylan, with some imported concepts from C++'s template system. Since I wanted the type system to support runtime introspection, I provided a full metaobject protocol (which is just a fancy name for a bunch of classes named Class, Type, Slot, etc., which allow you to introspect your objects).

Metaobject protocols are a bit hard to bootstrap, because there's a Class object describing the class Class, which contains a bunch of Slot objects, each one of which has slots, and so on, ad nauseam. Supporting parameterized types makes everything even more exciting, because suddenly you have classes like Template, InstantiatedClass and TypeVariable floating around. Before you know it, your runtime library requires about 4K of the gnarliest data structures you've ever seen.
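To make the circularity concrete, here is a minimal sketch in Python of how such a knot might be tied. It is not Mason's actual runtime: the MetaObject helper, the bootstrap function, and the choice of exactly two slots on Class are all assumptions made for illustration. The trick it shows is to allocate the core objects first with no class pointer, then patch the pointers afterwards so that Class ends up describing itself.

```python
class MetaObject:
    """A toy runtime object: a pointer to its class plus named slot values.

    Hypothetical helper for illustration only; not part of Mason.
    """

    def __init__(self, cls=None, **values):
        self.cls = cls
        self.values = dict(values)


def bootstrap():
    # Step 1: allocate the core metaobjects before any class pointer exists.
    class_class = MetaObject(name="Class", slots=[])
    slot_class = MetaObject(name="Slot", slots=[])

    # Step 2: tie the knot. Class is an instance of itself, and so is Slot's class.
    class_class.cls = class_class
    slot_class.cls = class_class

    # Step 3: describe each class's slots using Slot instances.
    def make_slot(name):
        return MetaObject(slot_class, name=name)

    class_class.values["slots"] = [make_slot("name"), make_slot("slots")]
    slot_class.values["slots"] = [make_slot("name")]
    return class_class, slot_class


class_class, slot_class = bootstrap()
```

Following the cls pointer from any slot leads to Slot, then to Class, and then to Class again forever, which is the "and so on" part of the problem.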

I decided to generate these data structures by writing a "weaver" in Python. The weaver parses class declarations, instantiates templates, and generates over 1,500 lines of static variable declarations in C.

Test-Driven Development

Unfortunately, I'm going to need more than just a few objects to get this language off the ground. I'll also need a small-as-possible bootstrap interpreter so the upper levels of the system can be self-hosting.

Since designing a small, efficient bootstrap interpreter is a non-obvious task, I'm taking the lazy way out: test-driven development, as suggested in XpForOptimizingCompilers.

Basically, I start out with an "interpreter" that only runs one program--"Hello, world!". This is easy; the interpreter can just ignore the source code and print the output directly. Then I add a second program which prints something different, and modify the interpreter to start looking at the source. As I keep adding new programs--and new complexity--I refactor mercilessly at the slightest sign of duplication. The interpreter grows organically, and never includes anything beyond the most essential functionality.
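The first two steps of that process can be sketched in a few lines of Python. This is only an illustration: the print "..." statement syntax is made up for the example and is not Mason syntax, and the function names are hypothetical. The first version passes the first test by ignoring its input; the second test forces the next version to actually read the source.

```python
# Each test case is (program source, expected output).
# The print "..." syntax is invented for this sketch.
TESTS = [
    ('print "Hello, world!"', "Hello, world!\n"),
    ('print "Goodbye"', "Goodbye\n"),
]


def interpret_v1(source):
    """Step 1: the only known program prints a greeting, so ignore the source."""
    return "Hello, world!\n"


def interpret_v2(source):
    """Step 2: a second program exists, so start looking at the source."""
    prefix = 'print "'
    output = []
    for line in source.splitlines():
        line = line.strip()
        if line.startswith(prefix) and line.endswith('"'):
            # Keep the text between the opening and closing quotes.
            output.append(line[len(prefix):-1] + "\n")
    return "".join(output)


def failing(interpreter):
    """Return the sources of all test programs the interpreter gets wrong."""
    return [src for src, expected in TESTS if interpreter(src) != expected]


v1_failures = failing(interpret_v1)
v2_failures = failing(interpret_v2)
```

Here v1_failures contains only the second program and v2_failures is empty. Each new test program would be added to TESTS first, and the interpreter changed only as far as needed to make the list of failures empty again.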

I've added a few extra rules:

  • All C code must use Mason data structures, not C data structures. Since upper layers of the compiler will eventually be written in Mason, this will prevent me from developing a nasty glue layer.
  • I can use as much domain expertise as I want, but I can only use it in two places: When I'm writing test cases, and when I'm refactoring working code into a new shape. If I find myself agonizing over design decisions, I need to stop using domain knowledge and start experimenting.
  • Strange loops (loops which involve self-reference across metalevels) are a sign of good language design, and the essence of what makes LISP good.

This is my Nth attempt at this project in the past several years. We'll see how far I can make it this time.