# How to make Data.Set a monad

**...and how to fake Lisp macros with Template Haskell**

(I wrote this article in response to a comment by sigfpe. You may find it pretty dry reading, unless you want to build domain-specific languages in Haskell. Proceed at your own risk.)

Haskell's built-in `Monad`

type has some serious limitations. We can fix those limitations using a number of advanced Haskell techniques, including Template Haskell, Haskell's closest equivalent to Lisp macros.

We can illustrate the limitations of `Monad`

with an example from math. In set theory, we can define a set by specifying how to compute each
element:

{xy:x∈ {1,2,4},y∈ {1,2,4} }

We can read this as, "the set of all *xy*, where *x*
is one of {1,2,4}, and *y* is one of {1,2,4}." To calculate the
answer, we first multiply together all the possible combinations:

1×1=1, 1×2=2, 1×4=4, 2×1=2, 2×2=4, 2×4=8, 4×1=4, 4×2=8, 4×4=16

We then collect up the answers, and---because we're working with sets--we throw away the duplicates:

{1,2,4,8,16}

Can we do the same thing in Haskell? Well, using Haskell's list monad, we can write:

```
listExample = do
x <- [1,2,4]
y <- [1,2,4]
return (x*y)
```

But when we run this, Haskell gives us lots of duplicate values:

```
> listExample
[1,2,4,2,4,8,4,8,16]
```

Our problem: We're using lists (which can contain duplicate
values) to represent sets (which can't). Can we fix this by switching to
Haskell's `Data.Set`

?

```
import qualified Data.Set as S
-- This doesn't work.
setExample = do
x <- S.fromList [1,2,4]
y <- S.fromList [1,2,4]
return (x*y)
```

Unfortunately, this code fails spectacularly. A Haskell monad is required
to work for *any* types `a`

and `b`

:

```
class Monad m where
return :: a -> m a
fail :: String -> m a
(>>=) :: m a -> (a -> m b) -> m b
```

But `Data.Set`

only works for some types. Specifically, it
requires that values of type `a`

can be ordered:

```
data (Ord a) => Set a = ...
```

As it turns out, we can make `Data.Set`

into a monad. But be
warned: The solution involves some pretty ugly Haskell abuse.

### Splitting `Monad`

in half

If we want to put a restrictions on `a`

and `b`

,
we'll need to move them into the signature of `Monad`

. My first
attempt looked like this:

```
{-# LANGUAGE MultiParamTypeClasses,
UndecidableInstances #-}
import Prelude hiding (return, fail, (>>=))
import qualified Prelude
-- This won't work:
class NewMonad m a b where
return :: a -> m a
fail :: String -> m a
(>>=) :: m a -> (a -> m b) -> m b
```

Unfortunately, the type `b`

doesn't appear anywhere in
`return`

or `fail`

. This makes the type checker sad.
But there's a workaround, discovered by oleg. We can split
`Monad`

into two pieces:

```
class Monad1 m a where
return :: a -> m a
fail :: String -> m a
class (Monad1 m a, Monad1 m b) =>
Monad2 m a b where
(>>=) :: m a -> (a -> m b) -> m b
```

Using these type classes, we can finally make `Set`

a monad:

```
instance (Ord a) =>
Monad1 S.Set a where
return = S.singleton
fail _ = S.empty
instance (Ord a, Ord b) =>
Monad2 Set.Set a b where
m >>= f = (setJoin . S.map f) m
where setJoin = S.unions . S.toList
```

This gets us very close to our goal:

```
setExample :: S.Set Int
setExample =
S.fromList [1,2,4] >>= \x ->
S.fromList [1,2,4] >>= \y ->
return (x*y)
```

When we run this code, the duplicates are gone:

```
> setExample
fromList [1,2,4,8,16]
```

### Rebuilding `do`

: Macros

Unfortunately, GHC won't let us use the built-in `do`

syntax.
Even though we've carefully hidden the regular `>>=`

operator,
GHC goes digging around in the libraries and finds it anyway. So we need
to somehow replace the built-in `do`

with a new version that
uses our redefined `>>=`

.

**Update:** As Brandon points out below, we can get the same result with `-fno-implicit-prelude`

, which will force GHC to use whatever `>>=`

is in scope. Thanks, Brandon!

We can build our own version of `do`

using Template
Haskell. Template Haskell allows us to generate code at compile-time,
in fashion similar to Lisp macros. In the code below, `$(...)`

means "insert some code here," and `[|...|]`

parses an
expression and returns it as a data structure.

```
setExample' :: S.Set Int
setExample' = $(restricted [|do
x <- S.fromList [1,2,4]
y <- S.fromList [1,2,4]
return (x*y) |])
```

The function
`restricted`

maps a parsed expression to another parsed
expression.

As it turns out, we're not allowed to define `restricted`

(or
any of our earlier type classes) in the same file as
`setExample'`

. So we need to create a file "RestrictedMonad.hs"
and move all our definitions into it:

```
{-# LANGUAGE MultiParamTypeClasses,
UndecidableInstances,
TemplateHaskell #-}
module RestrictedMonad (
Monad1, Monad2, return, fail, (>>=),
restricted
) where
import Prelude hiding (return, fail, (>>=))
import qualified Prelude
import qualified Data.Set as S
import Language.Haskell.TH
import Language.Haskell.TH.Syntax
-- Our earlier definitions go here...
```

Now, we can define our "macro":

```
-- Maps a quoted expression to a quoted
-- expression.
restricted :: Q Exp -> Q Exp
restricted code = do
(DoE stmts) <- code
expand stmts
```

Before we can define `expand`

, we need one more piece. Template
Haskell is based on monads, and we've redefined the built-in monad
operators. To generate our replacement code, we'll need to use the
built-in `return`

operator:

```
ret :: a -> Q a
ret = Prelude.return
```

Now we're ready to define `expand`

. This function expands a parsed `do`

-body into a series of
function calls. The implementation gets a little messy at times, due to limitations in
Template Haskell.

```
-- pat <- expr; stmts... We use lamE here
-- because we can't insert 'pat' directly
-- into the [|...|] form.
--
-- Note that we don't call 'fail' on
-- pattern-match failure, but we should.
expand (BindS pat expr:stmts) =
[| $(ret expr) >>=
$(lamE [ret pat]
(expand stmts)) |]
-- let decls...; stmts...
expand (LetS decls:stmts) =
letE (fmap ret decls)
(expand stmts)
-- The final expression in the 'do'.
expand (NoBindS expr:[]) = ret expr
-- expr; stmts...
expand (NoBindS expr:stmts) =
[| $(ret expr) >>=
(\_ -> $(expand stmts)) |]
expand stmts =
error ("Malformed 'do': "++show stmts)
```

For more information on Template Haskell, see Template Meta-programming for Haskell and DSL Implementation in MetaOCaml, Template Haskell, and C++ (PDF).

### Some lesser-known monad laws

There are three well-known laws that all monads must obey. But
there are some lesser-known monad laws that are automatically
enforced in standard Haskell, thanks to the type checker. But now that we
allow our monad to use the operations defined by `Ord`

, we
need to prove these extra laws manually.

This *should* be a complete
list:

```
-- The standard monad laws.
join . return == id
join . fmap return == id
join . join == join . fmap join
-- The other monad laws: Return and join
-- are natural transformations.
fmap f . return == return . f
fmap f . join == join . fmap (fmap f)
-- The functor laws.
fmap id == id
fmap (f . g) == fmap f . fmap g
```

Ultimately, I'd like to see `Data.Set`

become a monad in standard Haskell.

Want to contact me about this article? Or if you're looking for something else to read, here's a list of popular posts.

If you use -fno-implicit-prelude, GHC will use whatever (>>=) is in scope instead of pulling in (Prelude.>>=) behind your back; see section 7.3.5 of the User’s Guide.

Nice! That looks like it probably needed a lot of persistence to get working.

There’s also this which I haven’t tried yet.

Brandon: Thanks! That’s really good to know.

sigfpe: The Template Haskell bits were actually pretty interesting. I’ve been meaning to look at Haskell code-generation for a while, and I’m pleased to discover that things mostly work.

There are definitely some missing pieces in the API, though—not all of Haskell’s grammar is actually available, and splicing is very limited. So Template Haskell is less powerful than Lisp macros (and it lacks the runtime code-generation support of MetaOCaml). But it’s still powerful enough to allow for some interesting applications.

I recently used the GADT method to implement a restricted monad on my blog.

But, I prefer the solution with template Haskell. Good idea !

Yeah, very cool stuff!

It might be that Associated Types could provide a cleaner solution (see here), but this one is

good enoughfor me.alpheccar: Note that you don’t need Template Haskell if you use

`-fno-implicit-prelude`

, as Brandon points out.Still, the Template Haskell solution is definitely interesting in its own right: There are some nice features there, which could be useful for other things.

pepe: Oh, that’s

cool. That might be the right way to get something like this into standard Haskell.The TH solution is very interesting indeed, since one would probably want to have code using your restricted monad as well as vanilla monads in the same module.

In that case, rebinding the do notation via -fnoimplicit-prelude is not an option. Otherwise, you need to keep code that uses a restricted monad in a separate module, which would get annoying very quickly.

Nice one! I ran into the problem of trying to define a monad on a subcategory of the category of types rather than on the whole thing a while back, but couldn’t get anywhere with it – nice to see some solutions!

Interesting! Scala’s support for subtyping, higher-kinded types (as of 2.5.0, or in a nightly build), and implicit arguments makes it easy to define type classes that abstract over type class contexts (such as Ord a => in the case of Set as an instance of Monad).

A simplified version (without the implicits to fully model type classes) looks like this: (full source)

Essentially, this says Monad is a type constructor that takes a type T, a type constructor MyType, and a type constructor Bound. More specifically:

PS: If you’d like more details, let me know (adriaan |at| cs/kuleuven/be) — I’ve been working on adding support for type constructor polymorphism (as we tend to call higher-kinded types etc.) to the Scala compiler over the past couple of months.

The disadvantage of this approach is, that in a function you need a (Monad2 m a b) for each transition from a type ‘a’ to another type ‘b’. Wolfgang Jeltsch gave a solution on Haskell-Cafe which only needs a constraints like (Monad1 m a, Monad1 m b). It uses type families and existential quantification for packing and unpacking the Ord dictionary.

http://www.haskell.org/pipermail/haskell-cafe/2008-March/041084.html

Here an interactive link to Monad instance for Data.Set, again on Haskell-Cafe.