Since I mentioned the idea of monoids as formal models of computation, John
Armstrong made the natural leap ahead, to the connection between monoids and monads – which are
a common feature in programming language semantics, and a prominent language feature in Haskell, one of my favorite programming languages.
Monads are a category theoretic construction. If you take a monoid, and use some of the constructions we’ve seen in the last few posts, you can build a meta-monoid; that is,
a monoid that’s built from monoid-to-monoid mappings – essentially, the category of
small categories. (Small categories are categories whose collection of objects is a
set, not a proper class.)
We’re going to look at constructs built using objects in that category. But first (as usual), we need to come up with a bit of notation. Suppose we have a category, C. In the category of categories, there’s an identity morphism (which is also a functor) from C to C. We’ll
call that 1_C. And given any functor T:C→C (that is, from C to itself),
we’ll say that powers of that functor are formed by self-composition of T: T² = T∘T; T³ = T∘T∘T, etc. Finally, given a functor T,
there’s an identity natural transformation from T to T, which we’ll call 1_T.
So, now, as I said, a monad is a construct in this category of categories – that is, a particular
category with some additional structure built around it. Given a category, C, a monad on C
consists of three parts:
- T:C→C, a functor from C to itself.
- A natural transformation, η:1_C→T
- A natural transformation, μ:T²→T
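To make the three parts concrete, here is a minimal Haskell sketch that uses the list functor as T. The names `eta` and `mu` are hypothetical, chosen only to mirror η and μ; Haskell's own `Monad` class is phrased in terms of `return` and `>>=` instead, so this is an illustration of the definition rather than how the standard library spells it.

```haskell
module Main where

-- T is the list functor: it sends a type a to [a],
-- and a function f to fmap f (which is map f for lists).

-- eta : 1_C -> T. Wrap a value in a one-element list.
-- (Hypothetical name, mirroring the Greek letter in the text.)
eta :: a -> [a]
eta x = [x]

-- mu : T^2 -> T. Flatten a list of lists by one level.
-- (Hypothetical name; the standard library calls this 'concat' or 'join'.)
mu :: [[a]] -> [a]
mu = concat

main :: IO ()
main = do
  print (fmap (+ 1) [1, 2, 3 :: Int])  -- the functor acting on a function: [2,3,4]
  print (eta (3 :: Int))               -- [3]
  print (mu [[1, 2], [3 :: Int]])      -- [1,2,3]
```

Here a value of type `[[a]]` plays the role of T² applied to `a`, and `mu` collapses the two layers of T into one.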
C, T, η, and μ must satisfy some coherence conditions, which
basically mean that they must make two diagrams commute. First, a
requirement that, in terms of natural transformations, μ gives the same result
however it collapses T³ down to T. Written as an equation: μ∘Tμ = μ∘μT.
And then, a requirement on μ and η with respect to T (basically
making η into a meta-identity for this meta-monoid): μ∘Tη = μ∘ηT = 1_T.
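As a spot check of these two conditions, the sketch below tests them on sample values for the list functor. It assumes the usual reading of the conditions as μ∘Tμ = μ∘μT and μ∘Tη = μ∘ηT = 1_T, with Tμ written as `fmap mu` and μT as `mu` applied at the outer level. The names `eta`, `mu`, `assocHolds`, and `unitHolds` are hypothetical, and checking a few inputs is of course not a proof.

```haskell
module Main where

-- eta and mu for the list functor (hypothetical names mirroring the text).
eta :: a -> [a]
eta x = [x]

mu :: [[a]] -> [a]
mu = concat

-- First condition: collapsing T^3 to T gives the same result
-- whether we flatten the inner layers first or the outer layers first.
assocHolds :: [[[Int]]] -> Bool
assocHolds xsss = (mu . fmap mu) xsss == (mu . mu) xsss

-- Second condition: wrapping with eta (inside or outside) and then
-- flattening with mu gets us back where we started.
unitHolds :: [Int] -> Bool
unitHolds xs = (mu . fmap eta) xs == xs && (mu . eta) xs == xs

main :: IO ()
main = do
  print (assocHolds [[[1], [2, 3]], [[4]], []])  -- True
  print (unitHolds [1, 2, 3])                    -- True
```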
μ and η basically play the role of turning C into a meta-meta-monoid. A monoid is
basically a category; then we play with it, and construct the category of categories – the first
meta-monoid. Now we’re taking a self-functor of the meta-monoid, and using natural
transformations to build a new meta-meta-monoid around it.
One of the key things to notice here is that we’re building a monoid whose objects are,
basically, functions from monoids to monoids. We’ve gone meta out the wazoo – but it’s given us
something really interesting.
We start with the category. From the category, we get the functor – a structure preserving map
from the category to itself. The monad focuses on the functor – the transition from C to C: using
natural transformations, it defines an equivalence – not an equality, but an equivalence – between
multiple applications of the functor and a single application.
In terms of programming languages, we can think of C as a state. An application
of the functor T is an action – that is, a mapping from state to state. What the monad
does is provide a structure for composing actions. We don’t need to write the state – it’s
implicit in the definition of the functor/action. The monad says that if we have an action “X” and an action “Y”, we can compose them into an action “X followed by Y”. What the natural transformation says is that “X followed by Y” is an action – we can compose sequences of
actions, and the result is always an action – which we can compose further, producing other actions.
So at the bottom, we have functions that are state-to-state transformers. But we don’t
really need to think much about the complexity of a state-to-state transition. What
we can do instead is provide a collection of primitive actions – which are themselves
written as state-to-state transitions – and then use those primitives to build
imperative code – which remains completely functional under the covers, and yet has
all of the properties that we would want from an imperative programming system –
ordering, updatable state, etc.
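Here is a rough sketch of that idea: a hand-rolled state monad in which an action is literally a state-to-state function that also returns a result. Everything in it (`Action`, `getState`, `putState`, `tick`, `threeTicks`) is a hypothetical name invented for illustration; it follows the same pattern as the standard `State` monad from the transformers library, but it is not that library's code.

```haskell
module Main where

-- An action takes a state and returns a result together with a new state.
newtype Action s a = Action { runAction :: s -> (a, s) }

instance Functor (Action s) where
  fmap f (Action g) = Action $ \s ->
    let (a, s') = g s in (f a, s')

instance Applicative (Action s) where
  pure a = Action $ \s -> (a, s)
  Action f <*> Action g = Action $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (Action s) where
  -- "X followed by Y": run X, then feed its result and new state to Y.
  Action g >>= k = Action $ \s ->
    let (a, s') = g s
    in runAction (k a) s'

-- Primitive actions, written directly as state-to-state functions.
getState :: Action s s
getState = Action $ \s -> (s, s)

putState :: s -> Action s ()
putState s = Action $ \_ -> ((), s)

-- Imperative-looking code built from the primitives: return the
-- current counter value and increment the counter.
tick :: Action Int Int
tick = do
  n <- getState
  putState (n + 1)
  return n

threeTicks :: Action Int [Int]
threeTicks = do
  a <- tick
  b <- tick
  c <- tick
  return [a, b, c]

main :: IO ()
main = print (runAction threeTicks 10)  -- ([10,11,12],13)
```

Note that `tick` and `threeTicks` never mention the state explicitly; only the two primitives touch it, and the `Monad` instance threads it through in order.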
Below is a really simple piece of Haskell code using the IO monad. What the monad does is play
with IO states. The category is the set of IO states. Each action is a transformation from state to
state. The state is invisible — it’s created at the beginning of the “do”, and each
subsequent statement is implicitly converted to a state transition function.
    hello :: IO ()
    hello = do
      print "What is your name?"
      x <- getLine
      print (concat ["Hello", x])
So in the code above, “print "What is your name?"” is an action from an IO state to an IO state. It’s composed with
“x <- getLine” – which is, implicitly,
another transition from an IO state to an IO state, which includes an implicit variable
definition; and that’s composed with the final “print (concat ["Hello", x])”. The monad
lets us program completely in terms of the actions, without worrying about how to pass the states.
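To see the composition explicitly, the same program can be written without “do”, since do-notation is syntactic sugar for the monad operators `>>` (sequence two actions) and `>>=` (sequence two actions, passing the first one's result to the second). The sketch below shows both versions side by side; `helloDesugared` and the pure helper `greeting` are hypothetical names added only for illustration.

```haskell
module Main where

-- Pure helper (hypothetical) so the string-building step can be
-- checked without doing any IO.
greeting :: String -> String
greeting x = concat ["Hello", x]

-- The do-notation version, as in the text.
hello :: IO ()
hello = do
  print "What is your name?"
  x <- getLine
  print (greeting x)

-- The same program with the composition spelled out.
-- Each line is an action; >> and >>= glue them into one bigger action.
helloDesugared :: IO ()
helloDesugared =
  print "What is your name?" >>
  getLine >>= \x ->
  print (greeting x)

main :: IO ()
main = helloDesugared
```

In neither version does the IO state appear anywhere in the code: the operators are the only place where one action is connected to the next.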