Refactoring probability distributions, part 1: PerhapsT

Posted by Eric Kidd Wed, 21 Feb 2007 08:06:00 GMT

(Warning: This article is a bit more technical than most of my stuff. It assumes prior knowledge of monads and monad transformers.)

Martin Erwig and Steve Kollmansberger wrote PFP, a really sweet Haskell library for computing with probabilities. To borrow their example:

die :: Dist Int
die = uniform [1..6]

If we roll a die, we get the expected distribution of results:

> die
1  16.7%
2  16.7%
3  16.7%
4  16.7%
5  16.7%
6  16.7%

If you haven’t seen PFP before, I strongly encourage you to check it out. You can use it to solve all sorts of probability puzzles.

Anyway, I discovered an interesting way to implement PFP using monad transformers. Here’s what it looks like:

type Dist = PerhapsT ([])

uniform = weighted . map (\x -> (x, 1))

In other words, Dist can be written by adding some semantics to the standard list monad.

Perhaps: A less specific version of Maybe

First, let’s define a simple probability type:

newtype Prob = P Float
  deriving (Eq, Ord, Num)

instance Show Prob where
  show (P p) = show intPart ++ "." ++ show fracPart ++ "%"
    where digits = round (1000 * p)
          intPart = digits `div` 10
          fracPart = digits `mod` 10

Thanks to the deriving (Num) declaration, we can treat Prob like any other numeric type.

We can now define Perhaps, which represents a value with an associated probability:

data Perhaps a = Perhaps a Prob
  deriving (Show)

Now, this is just a generalization of Haskell’s built-in Maybe type, which treats a value as either present (probability 1) or absent (probability 0). All we’ve added is a range of possibilities in between: Perhaps x 0.5 represents a 50% chance of having a value.

Note that there’s one small trick here: When the probability of a value is 0, we may not actually know it! But because Haskell is a lazy language, we can write:

Perhaps undefined 0

We’ll need a convenient way to test for this case, to make sure we don’t try to use any undefined values:

neverHappens (Perhaps _ 0) = True
neverHappens _             = False

So Perhaps is just like Maybe. As it turns out, they’re both monads, and they both have an associated monad transformer.

First, a monad

Like Haskell’s standard Maybe type, Perhaps is a monad.

instance Functor Perhaps where
  fmap f (Perhaps x p) = Perhaps (f x) p

instance Monad Perhaps where
  return x = Perhaps x 1
  ph >>= f  | neverHappens ph  = never
            | otherwise        = Perhaps x (p1 * p2)
    where (Perhaps (Perhaps x p1) p2) = fmap f ph

We can also define a slightly extended interface:

class Monad m => MonadPerhaps m where
  perhaps :: a -> Prob -> m a
  never :: m a

instance MonadPerhaps Perhaps where
  never = Perhaps undefined 0
  perhaps = Perhaps

Next, a monad transformer

What if we’re already using a monad, and we want to layer Perhaps-style semantics over the top? Well, we can always use a monad transformer. The following code is presented without further comment:

newtype PerhapsT m a = PerhapsT { runPerhapsT :: m (Perhaps a) }

instance MonadTrans PerhapsT where
  lift x = PerhapsT (liftM return x)

instance Monad m => Functor (PerhapsT m) where
  fmap = liftM

instance Monad m => Monad (PerhapsT m) where
  return = lift . return
  m >>= f = PerhapsT bound
    where bound = do
            ph <- runPerhapsT m
            case ph of
              (Perhaps x1 p1)  | p1 == 0    -> return never
                               | otherwise  -> do
                (Perhaps x2 p2) <- runPerhapsT (f x1)
                return (Perhaps x2 (p1 * p2))

Update: I removed the instance MonadPerhaps (PerhapsT m), because not all derived monads (including the one described here) will actually want to provide access to never and perhaps.

Recreating PFP

From here, everything is simple. We can implement PFP by applying our monad transformer to the standard list monad:

type Dist = PerhapsT ([])

uniform = weighted . map (\x -> (x, 1))

weighted :: [(a, Float)] -> Dist a
weighted [] =
  error "Empty probability distributuion"
weighted xws = PerhapsT (map weight xws)
  where weight (x,w) = Perhaps x (P (w / sum))
        sum = foldl' (\w1 (_,w2) -> w1+w2) 0 xws

What does this refactoring buy us? Well, it turns out that we can reuse PerhapsT elsewhere…

Part 2: Random sampling

Tags , , , ,

Comments

  1. Dan P said about 6 hours later:

    Refactorings of monads are cool and this is a really nice example.

    But even cooler is when you refactor into already existing monads. I think Perhaps is simply Writer Prob and PerhapsT Prob is WriterT Prob. You need to define Prob to be a Float but with lazy multiplication so that when evaluating 0*x, 0 is returned without evaluating x. You also need to make Prob an instance of Monoid so that mappend is ordinary multiplication.

    With this, I think your definition Monad (PerhapsT m) comes for free from the definition of WriterT. (At least I’ve written some code to do this and it seems to work.)

    Curiously I was coming at something related from a different direction. I’ve been toying with writing yet another monad tutorial where monads are considered to be a form of tainting. You can use this to ‘quarantine’ code you consider tainted from clean code (in the same way the IO monad sequesters impure code). I was going to approach the Writer monad with something that looks remarkably like your Perhaps monad, ie. as a number attached to some data that gives a level of trust. But I didn’t make the neat connection with probability theory that you’ve just made.

  2. Eric said about 7 hours later:

    I had noticed the underlying monoid structure of Perhaps, but I hadn’t connected it to WriterT. Thanks!

    I discovered this particular monad transformer while trying to generalize PFP in various directions. There was a nasty smell in my code, and this is what fell out of the refactoring.

    I suspect there’s a lot more WriterT monads lurking out there—monoids are such a basic structure that they can’t help but turn up again and again.

  3. Dan P said about 12 hours later:

    generalize PFP in various directions.

    Quantum computation perhaps?

    I suspect there’s a lot more WriterT monads lurking out there

    Yes, so many that I think that ‘Writer’ is a terrible name as that really only captures one specific use.

  4. Eric said about 16 hours later:

    Quantum computation perhaps?

    Very tempting. But it’s pretty far down my list, and besides, I’m still a little bit confused about how join would work, and what to do about mixed states.

    I think that this is really your area of expertise. :-)

  5. Porges said 66 days later:

    I don’t quite understand all the code, but this is cool :)

    ...I’m still a little bit confused about how join would work…

    Wouldn’t it be along the lines of (where P = Perhaps to make it fit better):

    join P (P x p1) p2 = P x (p1 * p2)
  6. Samer said 73 days later:

    This is interesting stuff – I must say monad transformers still seem a bit like magic to me! Following on from what Dan said about using WriterT, it seems you don’t even need to declare a Monoid instance for probabilities – you can just use the Product type from Data.Monoid. You can also parameterise by the type of probabilities and get, eg, rational probabilities for free. Remarkably, you get all of this from a type synonym:

    type ProbMT p = WriterT (Product p) ([])

    (Like I said – magic!) With this you can write, eg,

    lift [1,2,3] >>= \x->tell (Product (1/3))>>return x :: ProbMT Rational Int

    to get an exact distribution by thirds over {1,2,3}. (Note that tell is used to reweight a value.)

  7. Eric said 74 days later:

    Yeah, monad transformers are sweet, though they confuse me bit at times.

    Dan and I have discussed a version of WriterT which (a) has a more sensible name, and (b) only supports tell. The rest of the WriterT API is fairly pernicious in this case.

    Another cool fact is that you can replace the product with (almost?) any metric space, and experiment with things like possibility theory.

  8. Samer said 74 days later:

    I’ve been playing around this a bit and I keep coming up against the same problem – this is possibly not the right place to ask as it’s more general than just probability monads, but since we’re already discussing this…

    Anyway, the problem is that I want to declare a type constructor as an instance of some class, but to implement the class methods, I need constraints on the type argument. Eg, if I want to implement a discrete probability monad using a Data.Map to manage the distribution, I might have

    import qualified Data.Map as Map

    newtype ProbMap a = PM (Map.Map a Double)

    When I come to declare an instance of Monad, I need to use some Map functions:

    instance Monad ProbMap where return x = PM (Map.singleton x 1)

    This doesn’t compile because Map requires the type of x to be an instance of Ord. But the type of x doesn’t appear anywhere in the instance declaration. How can I say that ProbMap is a Monad only for types a such that (Ord a)?

  9. Eric said 74 days later:

    The good news: You can make Data.Map a monad.

    The bad news: It’s really ugly.

    I would be delighted if this were fixed in a future version of Haskell.

  10. Robert said 408 days later:

    With regards to what Dan P said, there is a quantum IO monad:

    http://www.cs.nott.ac.uk/~txa/talks/qnet06.pdf

Comments are disabled