Refactoring probability distributions, part 1: PerhapsT
(Warning: This article is a bit more technical than most of my stuff. It assumes prior knowledge of monads and monad transformers.)
Martin Erwig and Steve Kollmansberger wrote PFP, a really sweet Haskell library for computing with probabilities. To borrow their example:
die :: Dist Int
die = uniform [1..6]
If we roll a die, we get the expected distribution of results:
> die
1 16.7%
2 16.7%
3 16.7%
4 16.7%
5 16.7%
6 16.7%
If you haven't seen PFP before, I strongly encourage you to check it out. You can use it to solve all sorts of probability puzzles.
Anyway, I discovered an interesting way to implement PFP using monad transformers. Here's what it looks like:
type Dist = PerhapsT ([])
uniform = weighted . map (\x -> (x, 1))
In other words, Dist can be written by adding some semantics to the standard list monad.
Perhaps: A less specific version of Maybe
First, let's define a simple probability type:
newtype Prob = P Float
deriving (Eq, Ord, Num)
instance Show Prob where
show (P p) = show intPart ++ "." ++ show fracPart ++ "%"
where digits = round (1000 * p)
intPart = digits `div` 10
fracPart = digits `mod` 10
Thanks to the deriving (Num) declaration, we can treat Prob like any other numeric type.
We can now define Perhaps, which represents a value with an associated probability:
data Perhaps a = Perhaps a Prob
deriving (Show)
Now, this is just a generalization of Haskell's built-in Maybe type, which treats a value as either present (probability 1) or absent (probability 0). All we've added is a range of possibilities in between: Perhaps x 0.5 represents a 50% chance of having a value.
Note that there's one small trick here: When the probability of a value is 0, we may not actually know it! But because Haskell is a lazy language, we can write:
Perhaps undefined 0
We'll need a convenient way to test for this case, to make sure we don't try to use any undefined values:
neverHappens (Perhaps _ 0) = True
neverHappens _ = False
So Perhaps is just like Maybe. As it turns out, they're both monads, and they both have an associated monad transformer.
First, a monad
Like Haskell's standard Maybe type, Perhaps is a monad.
instance Functor Perhaps where
fmap f (Perhaps x p) = Perhaps (f x) p
instance Monad Perhaps where
return x = Perhaps x 1
ph >>= f | neverHappens ph = never
| otherwise = Perhaps x (p1 * p2)
where (Perhaps (Perhaps x p1) p2) = fmap f ph
We can also define a slightly extended interface:
class Monad m => MonadPerhaps m where
perhaps :: a -> Prob -> m a
never :: m a
instance MonadPerhaps Perhaps where
never = Perhaps undefined 0
perhaps = Perhaps
Next, a monad transformer
What if we're already using a monad, and we want to layer Perhaps-style semantics over the top? Well, we can always use a monad transformer. The following code is presented without further comment:
newtype PerhapsT m a = PerhapsT { runPerhapsT :: m (Perhaps a) }
instance MonadTrans PerhapsT where
lift x = PerhapsT (liftM return x)
instance Monad m => Functor (PerhapsT m) where
fmap = liftM
instance Monad m => Monad (PerhapsT m) where
return = lift . return
m >>= f = PerhapsT bound
where bound = do
ph <- runPerhapsT m
case ph of
(Perhaps x1 p1) | p1 == 0 -> return never
| otherwise -> do
(Perhaps x2 p2) <- runPerhapsT (f x1)
return (Perhaps x2 (p1 * p2))
Update: I removed the instance MonadPerhaps (PerhapsT m), because not all derived monads (including the one described here) will actually want to provide access to never and perhaps.
Recreating PFP
From here, everything is simple. We can implement PFP by applying our monad transformer to the standard list monad:
type Dist = PerhapsT ([])
uniform = weighted . map (\x -> (x, 1))
weighted :: [(a, Float)] -> Dist a
weighted [] =
error "Empty probability distributuion"
weighted xws = PerhapsT (map weight xws)
where weight (x,w) = Perhaps x (P (w / sum))
sum = foldl' (\w1 (_,w2) -> w1+w2) 0 xws
What does this refactoring buy us? Well, it turns out that we can reuse PerhapsT elsewhere...
Part 2: Random sampling
Want to contact me about this article? Or if you're looking for something else to read, here's a list of popular posts.
Refactorings of monads are cool and this is a really nice example.
But even cooler is when you refactor into already existing monads. I think Perhaps is simply Writer Prob and PerhapsT Prob is WriterT Prob. You need to define Prob to be a Float but with lazy multiplication so that when evaluating 0*x, 0 is returned without evaluating x. You also need to make Prob an instance of Monoid so that mappend is ordinary multiplication.
With this, I think your definition Monad (PerhapsT m) comes for free from the definition of WriterT. (At least I’ve written some code to do this and it seems to work.)
Curiously I was coming at something related from a different direction. I’ve been toying with writing yet another monad tutorial where monads are considered to be a form of tainting. You can use this to ‘quarantine’ code you consider tainted from clean code (in the same way the IO monad sequesters impure code). I was going to approach the Writer monad with something that looks remarkably like your Perhaps monad, ie. as a number attached to some data that gives a level of trust. But I didn’t make the neat connection with probability theory that you’ve just made.
I had noticed the underlying monoid structure of Perhaps, but I hadn’t connected it to WriterT. Thanks!
I discovered this particular monad transformer while trying to generalize PFP in various directions. There was a nasty smell in my code, and this is what fell out of the refactoring.
I suspect there’s a lot more WriterT monads lurking out there—monoids are such a basic structure that they can’t help but turn up again and again.
> generalize PFP in various directions.
Quantum computation perhaps?
> I suspect there’s a lot more WriterT monads lurking out there
Yes, so many that I think that ‘Writer’ is a terrible name as that really only captures one specific use.
Quantum computation perhaps?
Very tempting. But it’s pretty far down my list, and besides, I’m still a little bit confused about how
joinwould work, and what to do about mixed states.I think that this is really your area of expertise. :-)
I don’t quite understand all the code, but this is cool :)
Wouldn’t it be along the lines of (where P = Perhaps to make it fit better):
join P (P x p1) p2 = P x (p1 * p2)This is interesting stuff – I must say monad transformers still seem a bit like magic to me! Following on from what Dan said about using WriterT, it seems you don’t even need to declare a Monoid instance for probabilities – you can just use the Product type from Data.Monoid. You can also parameterise by the type of probabilities and get, eg, rational probabilities for free. Remarkably, you get all of this from a
type synonym:
> type ProbMT p = WriterT (Product p) ([])
(Like I said – magic!) With this you can write, eg,
> lift [1,2,3] >>= \x→tell (Product (1/3))>>return x :: ProbMT Rational Int
to get an exact distribution by thirds over {1,2,3}. (Note that tell is used to reweight a value.)
Yeah, monad transformers are sweet, though they confuse me bit at times.
Dan and I have discussed a version of
WriterTwhich (a) has a more sensible name, and (b) only supportstell. The rest of theWriterTAPI is fairly pernicious in this case.Another cool fact is that you can replace the product with (almost?) any metric space, and experiment with things like possibility theory.
I’ve been playing around this a bit and I keep coming up against the same problem – this is possibly not the right place to ask as it’s more general than just probability monads, but since we’re already discussing this…
Anyway, the problem is that I want to declare a type constructor as an instance of some class, but to implement the class methods, I need constraints on the type argument. Eg, if I want to implement a discrete probability monad using a Data.Map to manage the distribution, I might have
> import qualified Data.Map as Map
> newtype ProbMap a = PM (Map.Map a Double)
When I come to declare an instance of Monad, I need to use some Map functions:
> instance Monad ProbMap where
> return x = PM (Map.singleton x 1)
This doesn’t compile because Map requires the type of x to be an instance of Ord. But the type of x doesn’t appear anywhere in the instance declaration. How can I say that ProbMap is a Monad
only for types a such that (Ord a)?
The good news: You can make Data.Map a monad.
The bad news: It’s really ugly.
I would be delighted if this were fixed in a future version of Haskell.
With regards to what Dan P said, there is a quantum IO monad:
http://www.cs.nott.ac.uk/~txa/talks/qnet06.pdf