Meta out the wazoo: Monads and Monoids

Since I mentioned the idea of monoids as a formal models of computations, John
Armstrong made the natural leap ahead
, to the connection between monoids and monads - which are
a common feature in programming language semantics, and a prominent language feature in href="http://scienceblogs.com/goodmath/goodmath/programming/haskell/">Haskell, one of my favorite programming languages.

Monads are a category theoretic construction. If you take a monoid, and use some of the constructions we've seen in the last few posts, we can move build a meta-monoid; that is,
a monoid that's built from monoid-to-monoid mappings - essentially, the category of
small categories. (Small categories are categories whose collection of objects are a
set, not a proper class.)

We're going to look at constructs built using objects in that category. But first (as usual), we need to come up with a bit of notation. Suppose we have a category, C. In the category of categories, there's an identity morphism (which is also a functor) from C to C. We'll
call that 1C. And given any functor from T:C→C (that is, from C to itself),
we'll say that exponents of that functor are formed by self-compositions of T: T2=TºT; T3=TºTºT, etc. Finally, given a functor T,
there's a natural transformation from T to T, which we'll call 1T.

So, now, as I said, a monad is a construct in this category of category - that is, a particular
category with some additional structure built around it. Given a category, C, a monad on C
consists of three parts:

  • T:C→C, a functor from C to itself.
  • A natural transformation, η:1C→T
  • A natural transformation μ:T2→T

C, T, η, and μ must satisfy some coherence conditions, which
basically mean that they must make the following two diagrams commute. First, we
show a requirement that in terms of natural transformations, μ is commutative in
how it maps T2 to T:

i-3708df09409a7a561b25a0597ce652f3-monad-prop1.jpg

And then, a commutativity requirement on μ and η with respect to T (basically
making μ and η into a meta-identity for this meta-monoid):

i-7446dbf8a04d1331f612c90223413813-monad-prop2.jpg

μ and η basically play the role of turning C into a meta-meta-monoid. A monoid is
basically a category; then we play with it, and construct the category of categories - the first
meta-monoid. Now we're taking a self-functor of the meta-monoid, and and using natural
transformations to build a new meta-meta-monoid around it.

One of the key things to notice here is that we're building a monoid whose objects are,
basically, functions from monoids to monoids. We've gone meta out the wazoo - but it's given us
something really interesting.

We start with the category. From the category, we get the functor - a structure preserving map
from the category to itself. The monad focuses on the functor - the transition from C to C: using
natural transformations, it defines an equivalence - not an equality, but an equivalence - between
multiple applications of the functor and a single application.

In terms of programming languages, we can think of C as a state. An application
of the functor T is an action - that is, a mapping from state to state. What the monad
does is provide a structure for composing actions. We don't need to write the state - it's
implicit in the definition of the functor/action. The monad says that if we have an action "X" and an action "Y", we can compose them into an action "X followed by Y". What the natural transformation says is that "X followed by Y" is an action - we can compose sequences of
actions, and the result is always an action - which we can compose further, producing other
compound actions.

So at the bottom, we have functions that are state-to-state transformers. But we don't
really need to think much about the complexity of a state-to-state transition. What
we can do instead is provide a collection of primitive actions - which are themselves
written as state-to-state transitions - and then use those primitives to build
imperative code - which remains completely functional under the covers, and yet has
all of the properties that we would want from an imperative programming system -
ordering, updatable state, etc.

Below is a really simple piece of Haskell code using the IO monad. What the monad does is play
with IO states. The category is the set of IO states. Each action is a transformation from state to
state. The state is invisible -- it's created at the beginning of the "do", and each
subsequent statement is implicitly converted to a state transition function.

hello :: IO ()
hello =
  do
    print "What is your name?"
    x <- getLine
    print (concat ["Hello", x])

So in the code above, "print "What is your name""is an action from an IO state to an IO state. It's composed with x <- getLine - which is, implicitly,
another transition from an IO state to an IO state, which includes an implicit variable
definition; and that's composed with the finat "print". The actions are sequenced - they occur in the correct order, and each passes its result state to next action. The monad
lets us program completely in terms of the actions, without worrying about how to pass the states.

More like this

As promised, I'm finally going to get to the theory behind monads. As a quick review, the basic idea of the monad in Haskell is a hidden transition function - a monad is, basically, a state transition function. The theory of monads comes from category theory. I'm going to assume you know a little…
In the last post on groups and related stuff, I talked about the algebraic construction of monoids. A monoid is, basically, the algebraic construction of a category - it's based on the same ideas, and has the same properties; just the presentation of it is different. But you can also see a monoid…
One of the questions that a ton of people sent me when I said I was going to write about category theory was "Oh, good, can you please explain what the heck a *monad* is?" The short version is: a monad is a category with a functor to itself. The way that this works in a programming language is that…
Time to come back to category theory from out side-trip. Category theory provides a good framework for defining linear logic - and for building a Curry-Howard style type system for describing computations with *state* that evolves over time. Linear logic provides a way of defining a valid *model*…

Mark, good stuff on category theory as usual.

Talking about John Armstrong, I think it would be helpful for other readers if you could add his blog to your blogroll (I think you just overlooked that.)

Now here's something I didn't realize until one of the students in my category theory seminar last semester helped tease apart: a Haskell monad is actually not a monad. It's a closely related structure called at "Kleisli triple". There's a way of going back and forth between monads and triples, but the data you give in a Haskell program is actually that of the triple corresponding to the monad we're thinking of.

> a Haskell monad is actually not a monad

Here is the documentation for Monad in the Haskell libraries:

http://cvs.haskell.org/Hugs/pages/libraries/base/Control-Monad.html

Due to a silly mistake in the library, Monad doesn't derive from Functor, but everyone who needs it defines a Functor instance for their monads. So we have an endofunctor, and two natural transformations, return and join. return and join are supposed to satisfy the equations satisfied by eta and mu in a monad. So I'm a bit confused about why you say Haskell monads aren't really monads.

John:

a Haskell monad is actually not a monad. It's a closely related structure called at "Kleisli triple".

As I understand it, the notions are isomorphic until you get to 2-categories.
However, there is some historical justification, in that Haskell Monads as originally posed used fmap/return/join as the basis functions rather than fmap/return/bind.

By Pseudonym (not verified) on 11 Mar 2008 #permalink

The category you discuss here - the category of IO states with arrows being state transitions - is all nice and well, but it's not the category "normally" considered in Haskell code descriptions; and Haskell Monads are monads in the rather different category Hask of Haskell datatypes with functions as morphisms as well.

Thus, IO is a functor that takes a datatype to a datatype encoding IO actions. And the monad structure on IO basically tells us that once we've started allowing changes to the World State, we can re-declare our intent to allow changes to the World State without really changing anything.

Is there a good way to view the state categories we get in IO (and in the State monad) as ... collapsed subcategories of Hask, with all objects in IOState collapsed to the datatype classed objects in IO Hask? It seems like there should be potential for a "categorification" process here where a state-ish datatype gets blown up to a large subcategory of state transitions encoded using some sort of virtual type thingies...