Why Hygienic Macros Rock

Posted by Eric Fri, 13 Sep 2002 00:00:00 GMT

I've recently been reading a lot of excellent essays on programming language design by Paul Graham. Paul and I agree about a number of things: (1) LISP is a beautiful and powerful family of languages, even by modern standards, (2) all existing dialects of LISP are lacking a certain something, and (3) programmatic macros are a Good Idea.

What Are Programmatic Macros?

Programmatic macros, to put it simply, allow the programmer to add new control structures to a programming language. To do this, the programmer writes some code which runs in the compiler. This code transforms the new control structure into something the compiler already understands. A trivial example, in Common LISP:

(defmacro unless (condition &body body)
  ;; Transform (unless cond body...) into
  ;; (if (not cond) (progn body...)).
  `(if (not ,condition)
       (progn ,@body)))

This defines a control structure unless in terms of the existing control structures if and progn.

Programmatic macros are similar to C preprocessor macros, but (1) they allow you to work directly with the parse tree, instead of the raw textual source code, and (2) they allow you to run arbitrary code to produce the output. They're typically used to create domain-specific languages on top of an existing LISP dialect, which--like any domain-specific language--make solving hard problems easier.
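To make "working directly with the parse tree" concrete, here's a rough sketch in plain Scheme. It isn't a macro at all, just an ordinary function that rewrites an unless form represented as a list, which is roughly what the body of a programmatic macro does. The name expand-unless and the sample form are made up for illustration.

```scheme
;; Hypothetical helper, for illustration only: takes the form
;; (unless cond body ...) as a list and returns the rewritten form.
(define (expand-unless form)
  (let ((condition (cadr form))   ; the test expression
        (body (cddr form)))       ; the remaining body forms
    `(if (not ,condition) (begin ,@body))))

;; The input is quoted, so nothing here is evaluated; we only
;; rearrange the tree.
(expand-unless '(unless (ready? x) (wait) (retry)))
;; => (if (not (ready? x)) (begin (wait) (retry)))
```

A real macro system runs code like this inside the compiler and then compiles the result, instead of handing you the list back.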

What are Hygienic Macros?

Hygienic macros are (essentially) macros which Do The Right Thing with local variable names. They're controversial because Doing The Right Thing makes it easier to write simple macros, and quite a bit harder to write extremely complex macros. Here's a very simple hygienic macro, in Scheme:

(define-syntax unless
  (syntax-rules ()
    ((unless condition body ...)
     (if (not condition) (begin body ...)))))

Unlike the above example, this definition of unless doesn't get confused by (say) a local redefinition of not. Similar safeguards would apply to any local variables defined within the macro's expansion--they wouldn't get confused with local variables in body. In other words, the compiler knows a lot more about the macro expansion, and is doing some fairly complicated transformations behind the scenes.
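Here's a small sketch of both safeguards, assuming a Scheme with standard syntax-rules. The names my-unless, my-or, capture-demo and shadowed-not-demo are invented for this example (my-unless just avoids clashing with a built-in unless).

```scheme
;; Same shape as the unless macro above, under a different name.
(define-syntax my-unless
  (syntax-rules ()
    ((_ condition body ...)
     (if (not condition) (begin body ...)))))

;; The caller rebinds not locally. The not inside the macro template
;; still refers to the standard procedure, so the body runs.
(define (shadowed-not-demo)
  (let ((not (lambda (x) x)))
    (my-unless #f 'ran)))

;; my-or introduces its own temporary, tmp, in the expansion.
(define-syntax my-or
  (syntax-rules ()
    ((_ a b)
     (let ((tmp a))
       (if tmp tmp b)))))

;; The caller also has a variable named tmp. The macro's tmp is kept
;; distinct from it, so the second argument still means the caller's tmp.
(define (capture-demo)
  (let ((tmp 5))
    (my-or #f tmp)))

(shadowed-not-demo) ; => ran
(capture-demo)      ; => 5
```

A naive textual expansion of (my-or #f tmp) would give (let ((tmp #f)) (if tmp tmp tmp)), which evaluates to #f instead of 5. That's the variable capture problem in miniature.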

Why I'm Convinced Hygienic Macros Are Better

After playing around with DrScheme--a truly amazing Scheme environment--I'm definitely convinced that hygienic macros are worth the added difficulties they inflict upon the authors of complex macros.

The other day, I implemented a begin/var macro in DrScheme because our users were getting sick of Scheme's cumbersome syntax for declaring variables. begin/var can be used as follows:

(define (silly-function)
  ;; Returns the list (15 20).
  (begin/var
    (var x 10)
    (set! x (+ x 5))
    (var y 20)
    (list x y)))

var works like Perl's my declaration--it declares a new local variable which can be seen until the end of the current scope. The implementation of begin/var requires 17 lines of slightly crufty macro code.

This macro was marginally harder to write in DrScheme than it would be in Common LISP. But because DrScheme's hygienic macro system has deep knowledge of how my macro works, my macro is extremely well-integrated into the environment: syntax highlighting works correctly, automatic variable renaming works correctly, cross-referencing works correctly, and so do all the other slick DrScheme IDE features. In other words, hygiene doesn't just protect you against simple bugs such as variable capture--it allows your code to be formally analyzed by tools that simply could not exist for Common LISP.
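The real 17-line implementation isn't reproduced here. As a rough idea of the technique, though, a stripped-down begin/var can be sketched with nothing but syntax-rules. This is a simplified stand-in that assumes var is not otherwise bound and that every block ends with an ordinary expression; it leaves out whatever else the real macro handles.

```scheme
;; Simplified sketch, not the original implementation.
;; Each (var name init) wraps the rest of the block in a let, so the
;; variable is visible until the end of the enclosing begin/var.
(define-syntax begin/var
  (syntax-rules (var)
    ;; A declaration followed by at least one more form.
    ((_ (var name init) next rest ...)
     (let ((name init))
       (begin/var next rest ...)))
    ;; The final expression supplies the value of the block.
    ((_ expr) expr)
    ;; Any other form: evaluate it, then continue with the rest.
    ((_ expr next rest ...)
     (begin expr (begin/var next rest ...)))))

(define (silly-function)
  (begin/var
    (var x 10)
    (set! x (+ x 5))
    (var y 20)
    (list x y)))

(silly-function) ; => (15 20)
```

Because the expansion is just nested let forms, the tools can see exactly where x and y are bound, which is what lets renaming and cross-referencing keep working.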

Counterarguments (and Implications for Arc)

Non-hygienic macros might still be a good choice for languages which (1) are aimed at advanced macro hackers and (2) won't ever require particularly advanced IDE support. Paul Graham's Arc language definitely fits criterion (1), but I think his dreams of really excellent profiling tools may run counter to criterion (2).



  1. Scott said 2106 days later:

    To what extent can hygienic macros be implemented in terms of non-hygienic ones, and vice versa? I’m more familiar with Emacs Lisp-style macros than CL or Scheme macros, but it seems like the hygiene involves automatically writing code with gensyms to avoid variable capture and other compile-time analysis. If one is implemented in terms of the other, it would probably make the most sense to have two different forms, e.g. hmacro for declaring hygienic macros, and macro for when you deliberately want to do things like variable capture.

  2. Eric said 2107 days later:

    Scott: That’s a good question, and I don’t actually know the answer.

    Most Scheme implementations take a distantly-related approach. They generally have two different hygienic macro systems:

    • A “low-level” macro system, which works roughly like LISP macros but provides hygiene by default, along with the ability to violate hygiene selectively. This tends to be a bit fiddly to use.
    • A “high-level” macro system, which is based on hygienic rewrite rules. This is very easy to use, and will work for nearly all common macros.

    Of course, using the low-level macro system, you can easily implement Lisp-style non-hygienic macros.

    The PLT people have recently added another very slick feature to the standard Scheme toolkit: the ability to selectively violate hygiene from inside rewrite rules, in a carefully controlled fashion. This is a really nice feature, and we’ve been using it at work with great success.

  3. Daniel said 2109 days later:

    I have actually read a lot about hygienic vs non-hygienic macros. I just cannot recall on which blogs and sites I read it.

    You might try smuglispweeny and lambda-the-ultimate blogs, though. And wikipedia.

    The thing is, gensyms handle some of the stuff hygienic macros handle, but not all of it. There are some complex cases which would require heroic efforts in LISP, and would be very error-prone at that.

    On the other hand, hygienic macros can access the runtime context, in a more verbose way.

    So, really, they each can be used for the same things, with varied levels of complexity depending on the task. But let me advocate hygienic macros a bit here.

    There once was a language called Forth. Forth was interesting in that not only was it written in itself, but all the internals were available to any program written in it, and, furthermore, they could be changed at will. You could change the parser, for instance, or you could change how it performed compilation.

    Needless to say, it was powerful in the way such languages are. The thing is… Forth lacked abstraction. Yes, you could change the compiler. But, of course, you’d have to know how the specific compiler you were changing worked, and I don’t mean simply dialect differences. The most famous Forth generated indirect-threaded code, quite a few generated direct-threaded code, some generated subroutine-threaded code, others generated bytecode, and others compiled directly to machine code.

    So it was very powerful stuff, which had very little portability. Well, let’s be honest. It had no portability at all.

    And here is the thing with non-hygienic macros… the programmers using someone’s macro also have to know how that macro works. Either that, or you add stuff to your macro until it is, effectively, hygienic.

    So, by using hygienic macros, you guarantee a level of portability to them similar to library functions.

  4. Scott said 2132 days later:

    Daniel: Ok, thanks.

    I’m acquainted with Forth. I read both Starting Forth and Thinking Forth a few days after posting the above comment, actually; I love the concatenative approach, though I find Forth a bit too low-level for my taste. (My favorite language is currently OCaml, so you can guess how I feel about working without any type system whatsoever.) I’ve also looked a little at Factor, but not in depth.

    That said, the source to a Forth interpreter in i386 assembly and Forth was one of the single most mind-blowing programming things I’ve ever read. When I saw that the interpreter had a bit for whether the code was being compiled or evaluated, and immediate functions could toggle it at runtime, everything suddenly made sense. So much flexibility emerging from so few constructs!

