Mason Update: The Weaver Has Woven
In my spare time, I design and implement programming languages. I'm currently trying to build a language which more-or-less combines the performance of C, the learnability of Python, and the expressive power of LISP. This is a fairly ambitious goal--and one I've already failed to accomplish several times--but I think it's worth some effort. One of my friends has codenamed this effort "Mason".
Milestone 1: Weaving the Metaobjects
Over the last month or two of weekends, I worked out a type system which (1) is dynamic-language friendly and (2) supports parameterized types. It's a mixture of CLOS and Dylan, with some imported concepts from C++'s template system. Since I wanted the type system to support runtime introspection, I provided a full metaobject protocol (which is just a fancy name for a bunch of classes named Class, Type, Slot, etc., which allow you to introspect your objects).
Metaobject protocols are a bit hard to bootstrap, because there's a Class object describing the class Class, which contains a bunch of Slot objects, each one of which has slots, and so on, ad nauseam. Supporting parameterized types makes everything even more exciting, because suddenly you have classes like Template, InstantiatedClass and TypeVariable floating around. Before you know it, your runtime library requires about 4K of the gnarliest data structures you've ever seen.
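To make the circularity concrete, here is a minimal Python model of the bootstrap problem. It is only a sketch, not Mason's actual runtime: the class_of attribute, the slot names, and the two-step "allocate first, patch pointers afterwards" strategy are all assumptions made for illustration, and parameterized types are left out entirely.

```python
class Slot:
    """Hypothetical metaobject describing one field of a class."""

    def __init__(self, name):
        self.name = name
        self.class_of = None  # patched once the metaobject for Slot exists


class Class:
    """Hypothetical metaobject describing a class and its slots."""

    def __init__(self, name, slots):
        self.name = name
        self.slots = slots
        self.class_of = None  # patched once the metaobject for Class exists


# Step 1: allocate every metaobject with its class pointer left empty,
# because the objects those pointers should reference do not exist yet.
class_class = Class("Class", [Slot("name"), Slot("slots")])
slot_class = Class("Slot", [Slot("name")])

# Step 2: tie the knot. The metaobject for Class is an instance of itself,
# and every Slot metaobject is an instance of the metaobject for Slot.
class_class.class_of = class_class
slot_class.class_of = class_class
for cls in (class_class, slot_class):
    for slot in cls.slots:
        slot.class_of = slot_class
```

After step 2, following class_of from any object eventually lands on class_class and stays there, which is the loop that makes generating these structures by hand so tedious.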
I decided to generate these data structures by writing a "weaver" in Python. The weaver parses class declarations, instantiates templates, and generates over 1,500 lines of static variable declarations in C.
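The general shape of such a generator can be sketched in a few lines of Python. Everything specific here is made up for illustration: the one-line "class Name: slot1 slot2" declaration syntax, the MasonSlot and MasonClass struct names, and the layout of the emitted initializers are assumptions, not the real weaver's input or output. The sketch also skips template instantiation.

```python
def parse_declarations(text):
    """Parse lines of the (hypothetical) form 'class Name: slot1 slot2'."""
    classes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        head, _, body = line.partition(":")
        keyword, name = head.split()
        if keyword != "class":
            raise ValueError(f"unexpected declaration: {line!r}")
        classes.append((name, body.split()))
    return classes


def emit_c(classes):
    """Emit C static declarations; assumes every class has at least one slot."""
    out = []
    for name, slots in classes:
        # MasonSlot and MasonClass are hypothetical C struct names.
        entries = ", ".join(f'{{ "{slot}" }}' for slot in slots)
        out.append(f"static MasonSlot {name}_slots[] = {{ {entries} }};")
        out.append(
            f'static MasonClass {name}_class = {{ "{name}", {len(slots)}, {name}_slots }};'
        )
    return "\n".join(out)


if __name__ == "__main__":
    print(emit_c(parse_declarations("class Point: x y")))
```

Generating the declarations as static C data means the metaobjects exist before any code runs, so the runtime never has to construct them at startup.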
Test-Driven Development
Unfortunately, I'm going to need more than just a few objects to get this language off the ground. I'll also need a small-as-possible bootstrap interpreter so the upper levels of the system can be self-hosting.
Since designing a small, efficient bootstrap interpreter is a non-obvious task, I'm taking the lazy way out: test-driven development, as suggested in XpForOptimizingCompilers.
Basically, I start out with an "interpreter" that only runs one program--"Hello, world!". This is easy; the interpreter can just ignore the source code and print the output directly. Then I add a second program which prints something different, and modify the interpreter to start looking at the source. As I keep adding new programs--and new complexity--I refactor mercilessly at the slightest sign of duplication. The interpreter grows organically, and never includes anything beyond the most essential functionality.
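The first two iterations might look something like the following Python sketch. The real bootstrap interpreter is written in C, and the print "..." statement used here is an invented stand-in for Mason syntax, so treat the names and the grammar as assumptions.

```python
import re


def interpret_v1(source):
    """Iteration 1: only one test program exists, so the source is ignored."""
    return "Hello, world!\n"


# Hypothetical statement syntax: print "some text"
PRINT_STATEMENT = re.compile(r'^\s*print\s+"([^"]*)"\s*$')


def interpret_v2(source):
    """Iteration 2: a second test forces the interpreter to read its input."""
    output = []
    for line in source.splitlines():
        if not line.strip():
            continue
        match = PRINT_STATEMENT.match(line)
        if match is None:
            raise SyntaxError(f"unsupported statement: {line!r}")
        output.append(match.group(1) + "\n")
    return "".join(output)


# The growing list of (program, expected output) pairs drives every change.
TESTS = [
    ('print "Hello, world!"', "Hello, world!\n"),
    ('print "Goodbye, world!"', "Goodbye, world!\n"),
]

if __name__ == "__main__":
    for program, expected in TESTS:
        assert interpret_v2(program) == expected
```

The point of the exercise is that interpret_v1 passes the first test and fails the second, and that failure is the only justification for writing a parser at all.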
I've added a few extra rules:
- All C code must use Mason data structures, not C data structures. Since upper layers of the compiler will eventually be written in Mason, this will prevent me from developing a nasty glue layer.
- I can use as much domain expertise as I want, but I can only use it in two places: When I'm writing test cases, and when I'm refactoring working code into a new shape. If I find myself agonizing over design decisions, I need to stop using domain knowledge and start experimenting.
- Strange loops (loops which involve self-reference across metalevels) are a sign of good language design, and the essence of what makes LISP good.
This is my Nth attempt at this project in the past several years. We'll see how far I can make it this time.