Mason Update: The Weaver Has Woven
In my spare time, I design and implement programming languages. I'm currently trying to build a language which more or less combines the performance of C, the learnability of Python, and the expressive power of LISP. This is a fairly ambitious goal--and one I've already failed to accomplish several times--but I think it's worth some effort. One of my friends has codenamed this effort "Mason".
Milestone 1: Weaving the Metaobjects
Over the last month or two of weekends, I worked out a type system
which is (1) dynamic-language friendly and (2) supports parameterized
types. It's a mixture of CLOS and Dylan, with some imported concepts
from C++'s template system. Since I wanted the type system to support
runtime introspection, I provided a full metaobject protocol (which is
just a fancy name for a bunch of classes named Class,
Type, Slot, etc., which allow you to introspect your
objects).
Metaobject protocols are a bit hard to bootstrap, because there's a
Class object describing the class Class, which contains a
bunch of Slot objects, each one of which has slots, and so on,
ad nauseam. Supporting parameterized types makes everything
even more exciting, because suddenly you have classes like
Template, InstantiatedClass and TypeVariable
floating around. Before you know it, your runtime library requires
about 4K of the gnarliest data structures you've ever seen.
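To make the circularity concrete, here is a minimal Python sketch of the bootstrapping problem. It is not Mason's actual runtime: the names MObject, klass, class_class and slot_class are made up for illustration, and parameterized types are left out entirely. The idea is to allocate the metaobjects first with their class pointers unset, then go back and "tie the knot" once everything exists.

```python
class MObject:
    """Hypothetical base for every metaobject; `klass` points at the Class describing it."""
    def __init__(self):
        self.klass = None  # filled in later, once the describing Class exists


class Slot(MObject):
    """Describes one named field of a class."""
    def __init__(self, name):
        super().__init__()
        self.name = name


class Class(MObject):
    """Describes a class: its name and the slots its instances carry."""
    def __init__(self, name, slot_names):
        super().__init__()
        self.name = name
        self.slots = [Slot(n) for n in slot_names]


# Step 1: allocate the metaobjects with their klass pointers still unset.
class_class = Class("Class", ["name", "slots"])
slot_class = Class("Slot", ["name"])

# Step 2: tie the knot. Class describes itself, and it describes Slot too.
class_class.klass = class_class
slot_class.klass = class_class

# Step 3: every Slot object is an instance of the Slot class.
for c in (class_class, slot_class):
    for s in c.slots:
        s.klass = slot_class
```

After step 3, following klass pointers from any object eventually lands on class_class, which points back at itself. A static C version cannot run these steps at startup for free, which is part of why generating the fully linked structures ahead of time is attractive.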
I decided to generate these data structures by writing a "weaver" in Python. The weaver parses class declarations, instantiates templates, and generates over 1,500 lines of static variable declarations in C.
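As a rough illustration of the weaver idea (not the real weaver), the following Python sketch parses a toy declaration syntax and emits C static declarations. The input syntax, the C struct names Slot and Class, and their field layout are all assumptions made up for this example. Template instantiation is omitted, and every class is assumed to have at least one slot.

```python
import re

# Toy declaration syntax (an assumption for this sketch): class Name { slot; slot }
DECL = re.compile(r"class\s+(\w+)\s*\{([^}]*)\}")


def parse(source):
    """Return a list of (class_name, [slot_names]) pairs found in `source`."""
    classes = []
    for name, body in DECL.findall(source):
        slots = [s.strip() for s in body.split(";") if s.strip()]
        classes.append((name, slots))
    return classes


def weave(source):
    """Emit C static declarations for each parsed class.

    Assumes hypothetical C structs `Slot { const char *name; }` and
    `Class { const char *name; int slot_count; Slot *slots; }`.
    """
    lines = []
    for name, slots in parse(source):
        slot_array = f"{name}_slots"
        lines.append(f"static Slot {slot_array}[] = {{")
        for slot in slots:
            lines.append(f'    {{ "{slot}" }},')
        lines.append("};")
        lines.append(
            f'static Class {name}_class = {{ "{name}", {len(slots)}, {slot_array} }};'
        )
    return "\n".join(lines)


c_code = weave("class Point { x; y }")
```

Here c_code holds a Point_slots array with one entry per slot, followed by a Point_class record that points at it. The real output also has to wire up the circular references between metaobjects, which is where most of the 1,500 lines presumably come from.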
Test-Driven Development
Unfortunately, I'm going to need more than just a few objects to get this language off the ground. I'll also need the smallest possible bootstrap interpreter so the upper levels of the system can be self-hosting.
Since designing a small, efficient bootstrap interpreter is a non-obvious task, I'm taking the lazy way out: test-driven development, as suggested in XpForOptimizingCompilers.
Basically, I start out with an "interpreter" that only runs one program--"Hello, world!". This is easy; the interpreter can just ignore the source code and print the output directly. Then I add a second program which prints something different, and modify the interpreter to start looking at the source. As I keep adding new programs--and new complexity--I refactor mercilessly at the slightest sign of duplication. The interpreter grows organically, and never includes anything beyond the most essential functionality.
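The first two steps of that process might look something like the Python sketch below. The real interpreter is written in C against Mason data structures; the print "..." statement syntax and the function names here are invented purely to show the shape of the workflow.

```python
# Each test case pairs a program with the output it must produce.
# The `print "..."` syntax is a made-up stand-in for real Mason source.
HELLO = ('print "Hello, world!"', "Hello, world!\n")
GOODBYE = ('print "Goodbye."', "Goodbye.\n")


def interpreter_v1(source):
    """Passes the first test by ignoring the source entirely."""
    return "Hello, world!\n"


def interpreter_v2(source):
    """The second test forces the interpreter to actually read the source."""
    prefix = 'print "'
    output = []
    for line in source.splitlines():
        line = line.strip()
        if line.startswith(prefix) and line.endswith('"'):
            # Take the text between the opening and closing quotes.
            output.append(line[len(prefix):-1] + "\n")
    return "".join(output)


def passes(interpreter, test):
    """Run one (source, expected_output) test case against an interpreter."""
    source, expected = test
    return interpreter(source) == expected
```

interpreter_v1 passes HELLO but fails GOODBYE; interpreter_v2 passes both while still knowing nothing except how to print a string literal. Each new test program is what justifies the next bit of machinery.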
I've added a few extra rules:
- All C code must use Mason data structures, not C data structures. Since upper layers of the compiler will eventually be written in Mason, this will prevent me from developing a nasty glue layer.
- I can use as much domain expertise as I want, but I can only use it in two places: When I'm writing test cases, and when I'm refactoring working code into a new shape. If I find myself agonizing over design decisions, I need to stop using domain knowledge and start experimenting.
- Strange loops (loops which involve self-reference across metalevels) are a sign of good language design, and the essence of what makes LISP good.
This is my Nth attempt at this project in the past several years. We'll see how far I can make it this time.