...and how to fake Lisp macros with Template Haskell

(I wrote this article in response to a comment by sigfpe. You may find it pretty dry reading, unless you want to build domain-specific languages in Haskell. Proceed at your own risk.)

Haskell's built-in `Monad` type has some serious limitations. We can fix those limitations using a number of advanced Haskell techniques, including Template Haskell, Haskell's closest equivalent to Lisp macros.

We can illustrate the limitations of `Monad` with an example from math. In set theory, we can define a set by specifying how to compute each element:

{ xy : x ∈ {1,2,4}, y ∈ {1,2,4} }

We can read this as, "the set of all xy, where x is one of {1,2,4}, and y is one of {1,2,4}." To calculate the answer, we first multiply together all the possible combinations:

1×1=1, 1×2=2, 1×4=4, 2×1=2, 2×2=4, 2×4=8, 4×1=4, 4×2=8, 4×4=16

We then collect up the answers, and---because we're working with sets--we throw away the duplicates:

{1,2,4,8,16}

``````listExample = do
x <- [1,2,4]
y <- [1,2,4]
return (x*y)
``````

But when we run this, Haskell gives us lots of duplicate values:

``````> listExample
[1,2,4,2,4,8,4,8,16]
``````

Our problem: We're using lists (which can contain duplicate values) to represent sets (which can't). Can we fix this by switching to Haskell's `Data.Set`?

``````import qualified Data.Set as S

-- This doesn't work.
setExample = do
x <- S.fromList [1,2,4]
y <- S.fromList [1,2,4]
return (x*y)
``````

Unfortunately, this code fails spectacularly. A Haskell monad is required to work for any types `a` and `b`:

``````class Monad m where
return :: a -> m a
fail :: String -> m a
(>>=) :: m a -> (a -> m b) -> m b
``````

But `Data.Set` only works for some types. Specifically, it requires that values of type `a` can be ordered:

``````data (Ord a) => Set a = ...
``````

As it turns out, we can make `Data.Set` into a monad. But be warned: The solution involves some pretty ugly Haskell abuse.

### Splitting `Monad` in half

If we want to put a restrictions on `a` and `b`, we'll need to move them into the signature of `Monad`. My first attempt looked like this:

``````{-# LANGUAGE MultiParamTypeClasses,
UndecidableInstances #-}

import Prelude hiding (return, fail, (>>=))
import qualified Prelude

-- This won't work:
class NewMonad m a b where
return :: a -> m a
fail :: String -> m a
(>>=) :: m a -> (a -> m b) -> m b
``````

Unfortunately, the type `b` doesn't appear anywhere in `return` or `fail`. This makes the type checker sad. But there's a workaround, discovered by oleg. We can split `Monad` into two pieces:

``````class Monad1 m a where
return :: a -> m a
fail :: String -> m a

(>>=) :: m a -> (a -> m b) -> m b
``````

Using these type classes, we can finally make `Set` a monad:

``````instance (Ord a) =>
return = S.singleton
fail _ = S.empty

instance (Ord a, Ord b) =>
m >>= f = (setJoin . S.map f) m
where setJoin = S.unions . S.toList
``````

This gets us very close to our goal:

``````setExample :: S.Set Int
setExample =
S.fromList [1,2,4] >>= \x ->
S.fromList [1,2,4] >>= \y ->
return (x*y)
``````

When we run this code, the duplicates are gone:

``````> setExample
fromList [1,2,4,8,16]
``````

### Rebuilding `do`: Macros

Unfortunately, GHC won't let us use the built-in `do` syntax. Even though we've carefully hidden the regular `>>=` operator, GHC goes digging around in the libraries and finds it anyway. So we need to somehow replace the built-in `do` with a new version that uses our redefined `>>=`.

Update: As Brandon points out below, we can get the same result with `-fno-implicit-prelude`, which will force GHC to use whatever `>>=` is in scope. Thanks, Brandon!

We can build our own version of `do` using Template Haskell. Template Haskell allows us to generate code at compile-time, in fashion similar to Lisp macros. In the code below, `\$(...)` means "insert some code here," and `[|...|]` parses an expression and returns it as a data structure.

``````setExample' :: S.Set Int
setExample' = \$(restricted [|do
x <- S.fromList [1,2,4]
y <- S.fromList [1,2,4]
return (x*y) |])
``````

The function `restricted` maps a parsed expression to another parsed expression.

As it turns out, we're not allowed to define `restricted` (or any of our earlier type classes) in the same file as `setExample'`. So we need to create a file "RestrictedMonad.hs" and move all our definitions into it:

``````{-# LANGUAGE MultiParamTypeClasses,
UndecidableInstances,

restricted
) where

import Prelude hiding (return, fail, (>>=))
import qualified Prelude

import qualified Data.Set as S

-- Our earlier definitions go here...
``````

Now, we can define our "macro":

``````-- Maps a quoted expression to a quoted
-- expression.
restricted :: Q Exp -> Q Exp
restricted code = do
(DoE stmts) <- code
expand stmts
``````

Before we can define `expand`, we need one more piece. Template Haskell is based on monads, and we've redefined the built-in monad operators. To generate our replacement code, we'll need to use the built-in `return` operator:

``````ret :: a -> Q a
ret = Prelude.return
``````

Now we're ready to define `expand`. This function expands a parsed `do`-body into a series of function calls. The implementation gets a little messy at times, due to limitations in Template Haskell.

``````-- pat <- expr; stmts...  We use lamE here
-- because we can't insert 'pat' directly
-- into the [|...|] form.
--
-- Note that we don't call 'fail' on
-- pattern-match failure, but we should.
expand (BindS pat expr:stmts) =
[| \$(ret expr) >>=
\$(lamE [ret pat]
(expand stmts)) |]

-- let decls...; stmts...
expand (LetS decls:stmts) =
letE (fmap ret decls)
(expand stmts)

-- The final expression in the 'do'.
expand (NoBindS expr:[]) = ret expr

-- expr; stmts...
expand (NoBindS expr:stmts) =
[| \$(ret expr) >>=
(\_ -> \$(expand stmts)) |]

expand stmts =
error ("Malformed 'do': "++show stmts)
``````

There are three well-known laws that all monads must obey. But there are some lesser-known monad laws that are automatically enforced in standard Haskell, thanks to the type checker. But now that we allow our monad to use the operations defined by `Ord`, we need to prove these extra laws manually.

This should be a complete list:

``````-- The standard monad laws.
join . return == id
join . fmap return == id
join . join == join . fmap join

-- The other monad laws: Return and join
-- are natural transformations.
fmap f . return == return . f
fmap f . join == join . fmap (fmap f)

-- The functor laws.
fmap id == id
fmap (f . g) == fmap f . fmap g
``````

Ultimately, I'd like to see `Data.Set` become a monad in standard Haskell.