# How to make Data.Set a monad

**...and how to fake Lisp macros with Template Haskell**

(I wrote this article in response to a comment by sigfpe. You may find it pretty dry reading, unless you want to build domain-specific languages in Haskell. Proceed at your own risk.)

Haskell's built-in `Monad`

type has some serious limitations. We can fix those limitations using a number of advanced Haskell techniques, including Template Haskell, Haskell's closest equivalent to Lisp macros.

We can illustrate the limitations of `Monad`

with an example from math. In set theory, we can define a set by specifying how to compute each
element:

{xy:x∈ {1,2,4},y∈ {1,2,4} }

We can read this as, "the set of all *xy*, where *x*
is one of {1,2,4}, and *y* is one of {1,2,4}." To calculate the
answer, we first multiply together all the possible combinations:

1×1=1, 1×2=2, 1×4=4, 2×1=2, 2×2=4, 2×4=8, 4×1=4, 4×2=8, 4×4=16

We then collect up the answers, and---because we're working with sets--we throw away the duplicates:

{1,2,4,8,16}

Can we do the same thing in Haskell? Well, using Haskell's list monad, we can write:

```
listExample = do
x <- [1,2,4]
y <- [1,2,4]
return (x*y)
```

But when we run this, Haskell gives us lots of duplicate values:

```
> listExample
[1,2,4,2,4,8,4,8,16]
```

Our problem: We're using lists (which can contain duplicate
values) to represent sets (which can't). Can we fix this by switching to
Haskell's `Data.Set`

?

```
import qualified Data.Set as S
-- This doesn't work.
setExample = do
x <- S.fromList [1,2,4]
y <- S.fromList [1,2,4]
return (x*y)
```

Unfortunately, this code fails spectacularly. A Haskell monad is required
to work for *any* types `a`

and `b`

:

```
class Monad m where
return :: a -> m a
fail :: String -> m a
(>>=) :: m a -> (a -> m b) -> m b
```

But `Data.Set`

only works for some types. Specifically, it
requires that values of type `a`

can be ordered:

```
data (Ord a) => Set a = ...
```

As it turns out, we can make `Data.Set`

into a monad. But be
warned: The solution involves some pretty ugly Haskell abuse.

### Splitting `Monad`

in half

If we want to put a restrictions on `a`

and `b`

,
we'll need to move them into the signature of `Monad`

. My first
attempt looked like this:

```
{-# LANGUAGE MultiParamTypeClasses,
UndecidableInstances #-}
import Prelude hiding (return, fail, (>>=))
import qualified Prelude
-- This won't work:
class NewMonad m a b where
return :: a -> m a
fail :: String -> m a
(>>=) :: m a -> (a -> m b) -> m b
```

Unfortunately, the type `b`

doesn't appear anywhere in
`return`

or `fail`

. This makes the type checker sad.
But there's a workaround, discovered by oleg. We can split
`Monad`

into two pieces:

```
class Monad1 m a where
return :: a -> m a
fail :: String -> m a
class (Monad1 m a, Monad1 m b) =>
Monad2 m a b where
(>>=) :: m a -> (a -> m b) -> m b
```

Using these type classes, we can finally make `Set`

a monad:

```
instance (Ord a) =>
Monad1 S.Set a where
return = S.singleton
fail _ = S.empty
instance (Ord a, Ord b) =>
Monad2 Set.Set a b where
m >>= f = (setJoin . S.map f) m
where setJoin = S.unions . S.toList
```

This gets us very close to our goal:

```
setExample :: S.Set Int
setExample =
S.fromList [1,2,4] >>= \x ->
S.fromList [1,2,4] >>= \y ->
return (x*y)
```

When we run this code, the duplicates are gone:

```
> setExample
fromList [1,2,4,8,16]
```

### Rebuilding `do`

: Macros

Unfortunately, GHC won't let us use the built-in `do`

syntax.
Even though we've carefully hidden the regular `>>=`

operator,
GHC goes digging around in the libraries and finds it anyway. So we need
to somehow replace the built-in `do`

with a new version that
uses our redefined `>>=`

.

**Update:** As Brandon points out below, we can get the same result with `-fno-implicit-prelude`

, which will force GHC to use whatever `>>=`

is in scope. Thanks, Brandon!

We can build our own version of `do`

using Template
Haskell. Template Haskell allows us to generate code at compile-time,
in fashion similar to Lisp macros. In the code below, `$(...)`

means "insert some code here," and `[|...|]`

parses an
expression and returns it as a data structure.

```
setExample' :: S.Set Int
setExample' = $(restricted [|do
x <- S.fromList [1,2,4]
y <- S.fromList [1,2,4]
return (x*y) |])
```

The function
`restricted`

maps a parsed expression to another parsed
expression.

As it turns out, we're not allowed to define `restricted`

(or
any of our earlier type classes) in the same file as
`setExample'`

. So we need to create a file "RestrictedMonad.hs"
and move all our definitions into it:

```
{-# LANGUAGE MultiParamTypeClasses,
UndecidableInstances,
TemplateHaskell #-}
module RestrictedMonad (
Monad1, Monad2, return, fail, (>>=),
restricted
) where
import Prelude hiding (return, fail, (>>=))
import qualified Prelude
import qualified Data.Set as S
import Language.Haskell.TH
import Language.Haskell.TH.Syntax
-- Our earlier definitions go here...
```

Now, we can define our "macro":

```
-- Maps a quoted expression to a quoted
-- expression.
restricted :: Q Exp -> Q Exp
restricted code = do
(DoE stmts) <- code
expand stmts
```

Before we can define `expand`

, we need one more piece. Template
Haskell is based on monads, and we've redefined the built-in monad
operators. To generate our replacement code, we'll need to use the
built-in `return`

operator:

```
ret :: a -> Q a
ret = Prelude.return
```

Now we're ready to define `expand`

. This function expands a parsed `do`

-body into a series of
function calls. The implementation gets a little messy at times, due to limitations in
Template Haskell.

```
-- pat <- expr; stmts... We use lamE here
-- because we can't insert 'pat' directly
-- into the [|...|] form.
--
-- Note that we don't call 'fail' on
-- pattern-match failure, but we should.
expand (BindS pat expr:stmts) =
[| $(ret expr) >>=
$(lamE [ret pat]
(expand stmts)) |]
-- let decls...; stmts...
expand (LetS decls:stmts) =
letE (fmap ret decls)
(expand stmts)
-- The final expression in the 'do'.
expand (NoBindS expr:[]) = ret expr
-- expr; stmts...
expand (NoBindS expr:stmts) =
[| $(ret expr) >>=
(\_ -> $(expand stmts)) |]
expand stmts =
error ("Malformed 'do': "++show stmts)
```

For more information on Template Haskell, see Template Meta-programming for Haskell and DSL Implementation in MetaOCaml, Template Haskell, and C++ (PDF).

### Some lesser-known monad laws

There are three well-known laws that all monads must obey. But
there are some lesser-known monad laws that are automatically
enforced in standard Haskell, thanks to the type checker. But now that we
allow our monad to use the operations defined by `Ord`

, we
need to prove these extra laws manually.

This *should* be a complete
list:

```
-- The standard monad laws.
join . return == id
join . fmap return == id
join . join == join . fmap join
-- The other monad laws: Return and join
-- are natural transformations.
fmap f . return == return . f
fmap f . join == join . fmap (fmap f)
-- The functor laws.
fmap id == id
fmap (f . g) == fmap f . fmap g
```

Ultimately, I'd like to see `Data.Set`

become a monad in standard Haskell.

Want to contact me about this article? Or if you're looking for something else to read, here's a list of popular posts.

`-fno-implicit-prelude`

, as Brandon points out. Still, the Template Haskell solution is definitely interesting in its own right: There are some nice features there, which could be useful for other things. pepe: Oh, that'scool. That might be the right way to get something like this into standard Haskell.