Why oh why Y?

By goodmath on August 31, 2006.

So in the last few posts, I've been building up the bits and pieces that turn lambda calculus into a useful system. We've got numbers, booleans, and choice operators. The only thing we're lacking is some kind of repetition or iteration.

In lambda calculus, all iteration is done by recursion. In fact, recursion is a pretty natural way of expressing iteration. It takes a bit of getting used to, but anyone who's programmed in a language like Scheme, ML, or Haskell for a while gets very used to idea, and feels frustrated coming back to a language like Java, where you need to write loops.

It can be a bit difficult if you're not used to thinking recursively. I wrote [an explanation of recursion][recursion] which you can go read if you're not used to what recursion is or how it works.

be found here). But since functions in lambda calculus don't have names, that means that we resort to something tricky. It's called the Y combinator, aka the lambda fixed point operator.

Let's start by looking at a simple recursive function outside of the lambda calculus. The factorial function, n!, is the standard example:

factorial(n) = 1 if n = 0, or factorial(n) = n*factorial(n-1) if n > 0

If we want to start trying to write that in lambda calculus, we'd need a couple of tools... We need a test for equality to zero, and we need a way of multiplying numbers; and we need a way of subtracting one.

For testing equality to zero, we'll use a function named *IsZero*, which takes three parameters: a number, and two values. If the number is 0, it returns the first value; if it's not 0, then it returns the second value.

For multiplication - multiplication is an iterative algorithm, so we can't write multiplication until we work out recursion. But we'll just handwave that for now, and have a function *Mult x y*.

And finally, for subtracting one, we'll use *Pred x* for the predecessor of x - that is, x - 1.

So - a first stab at factorial, written with the recursive call left with a blank in it, would be:

*λ n . IsZero n 1 (Mult n (**something** (Pred n)))*

Now, the question is, what kind of "something" can we plug in to there? What we're really like to do is plug in a copy of the function itself:

*Fact ≡ λ n . IsZero n 1 (Mult n (Fact (Pred n)))*

How can we do that? Well, the usual way of plugging something in to a lambda calculus function is through a parameter:

*Fact ≡ (λ f n . IsZero n 1 (Mult n (f (Pred n)))) Fact*

Of course, we can't plug in a copy of the function as its own parameter that way: the name *Fact* doesn't exist in the expression in which we're trying to use it. You can't use an undefined name - and in lambda calculus, the *only* way to bind a name is by passing it as a parameter to a λ expression. So what can we do?

The answer is to use something called a *combinator*. A combinator is a special kind of *higher order function* which can be defined without reference to anything but function applications. (A higher order function is a function which takes functions as parameters and returns functions as results). The Y combinator is the special, almost magical function that makes recursion possible. Here's what it looks like:

Y ≡ λ y . (λ x . y (x x)) (λ x . y (x x))

If you look at it, the reason for calling it Y is because it is *shaped* like a Y. To show you that more clearly, sometimes we write lambda calculus using trees. Here's the tree for the Y combinator:

Why is the Y combinator an answer to our problem in defining the factorial function? The Y combinator is something called a *fixed point* combinator. What makes it special is the fact that for any function *f*, *Y f* evaluates to *f Y f*; which evaluates to *f (f Y f)*; which evaluates to *f (f (f Y f))*. See why it's called Y?

Let's try walking through "*Y f*":

1. Expand Y: "*(λ y . (λ x . y (x x)) (λ x . y (x x))) f*"
2. β: "*(λ x . f (x x)) (λ x . f (x x))*
3. β again: "*f (λ x . f (x x)) (λ x . f (x x))*"
4. Since "*Y f = (λ x . f (x x)) (λ x . f (x x))*", what we just got in step three is "f Y f".

See, there's the magic of "Y". No matter what you do, you can't make it consume itself. Evaluating "*Y f*" will produce another copy of *f*, and leave the "Y f" part untouched.

So how do we use this crazy thing?

Remember our last attempt at defining the factorial function? Let's look at it again:

*Fact ≡ (λ f n . IsZero n 1 (Mult n (f (Pred n)))) Fact*

Let's rewrite it just a tiny bit to make it easier to talk about:

*Metafact ≡ (λ f n . IsZero n 1 (Mult n (f (Pred n))))*

*Fact ≡ Metafact Fact*

Now - the trick is, "Fact" is not an identifier defined inside of "Fact". How do we let "Fact" reference "Fact"? Well, we did a lambda abstraction to let us pass the "Fact" function as a parameter; so what we needed to do is to find a way to write "Fact" that lets us pass it to itself as a parameter.

What does "Y f" do? It expands into a call to "f" with "Y f" as its first parameter. And "Y f" will expand into "f Y f" - that is, a call to "f" with "Y f" as its parameter. In other words, Y f turns "f" into a recursive function with *itself* as its first parameter.

So the factorial function is:

*Fact ≡ Y Metafact*

*(Y metafact)* is the parameter value of "f" in the metafact lambda; when we do β on the function, if n is zero, then it just returns 1. If it's not zero, then we get the call to *f (Pred n)*. *f* betas to *Y metafact*. Which does that funky magic copying thing, giving us *metafact (Y metafact) (Pred n)*.

Voila, recursion. s

I learned about the Y combinator back in my undergrad days, which would place it around 1989 - and I still find it rather mystifying. I do understand it now, but I can't imagine how on earth anyone ever figured it out!

If you're interested in this, then I highly recommend getting a copy of the book [The Little Schemer][schemer]. It's a wonderful little book - set up like a childrens' book, where the front of each page is a question; and the back of each page is the answer. It's written in a delightfully playful style, it's very fun and engaging, and it will not only teach you to program in Scheme.

As an important side-note there are actually a couple of different versions of the Y combinator. There are different ways of evaluating lambda calculus: given an expression like *(λ x y . x \* y) 3 ((λ z. z \* z) 4)*"
we can do it in two different orders: we can first do the beta on "*(λ x y . x \* y)*",which would give us: "*3 \* ((λ z . z \* z) 4)*".

Or, we could beta "*((λ z . z \* z) 4)*" first: "*(λ x y . x \* y) 3 (4 \* 4)*". Nn this case, the two orders end up with the same result; but that's not always the case. Sometimes the order of evaluation matters - and the way that the Y combinator works is one of those times. One order will result in expanding the Y infinitely; the other will result in a clean recursive function.

The first order is what we call *lazy evaluation*: don't evaluate the parameters to a function until they're needed. (This is also pretty much the same thing as what we sometime call *by name* parameter passing.) The second is called *eager evaluation* : always evaluate parameters *before* the functions that they're passed to. (In real programming languages, Lisp, Scheme, and ML are lambda-calculus based languages that use eager evaluation; Haskell and Miranda are lambda calculus based languages that use lazy evaluation.) The Y combinator I described above is the Y for *lazy* evaluation. If we used eager evaluation, then Y combinator above wouldn't work - in fact, it would copy Ys forever. There is another version of Y which works for eager evaluation - in fact, it's the one described in "The Little Schemer", and they explain it so much better than I do that I'll just recommend again that you head for whatever bookstore you prefer, and buy yourself a copy.

[recursion]: http://goodmath.blogspot.com/2006/03/clarifying-recursion.html
[schemer]: http://www.amazon.com/gp/redirect.html?link_code=ur2&tag=goodmathbadma-…

More like this

Esther Lederberg, Discover of Lambda Prophage, RIP

Biologist Esther Lederberg died recently--unfortunately, it wasn't very well covered (the NY Times was only a month late...).

Programs are Proofs: Models and Types in Lambda Calculus

Lambda calculus started off with the simple, untyped lambda calculus that we've been talking about so far. But one of the great open questions about lambda calculus was: was it sound? Did it have a valid model?

A Lambda Calculus rerun

I'm on vacation this week, so I'm recycling some posts that I thought were particularly interesting to give you something to read. ------------

From Lambda Calculus to Cartesian Closed Categories

This is one of the last posts in my series on category theory; and it's a two parter. What I'm going to do in these two posts is show the correspondence between lambda calculus and the [cartesian closed categories][ccc].

I'm not sure if this makes Y less mysterious, but the intuitive leap is self-application.

You have the equation

Fact = Metafact Fact

If we make the assumption that Fact is of the form QQ (some function Q applied to itself) then we have:

QQ = Metafact QQ

We still don't know what Q is, but we know that Q operating on Q produces some expression involving Q. Let's just generalize to assume that Q operating on any function f produces the same expression, but with Q replaced by f:

Qf = Metafact (ff)

Now, the right side no longer involves Q! So we can write:

Q = Î» f . Metafact (f f)

Remembering that Fact = QQ, we can write:

Fact = (Î» f . Metafact (f f)) (Î» f . Metafact (f f)) = Y Metafact

Thank you for such beautiful essays.

Could you please advice me on how to draw such graphs/diagrams on a blog?

Thank you,
N

I see that John Baez has endorsed your blog threads on Lambda calculus and Category theory as "... here are some fun, easy places to start ..." on August 24, 2006 at 'The n-Category Cafe'.

http://golem.ph.utexas.edu/category/

Let's just generalize to assume that Q operating on any function f produces the same expression, but with Q replaced by f:

For those that haven't seen the reasoning before I suspect the above part is probably the only confusing step. The point is that the only way that Q can appear in the beta-reduction of QQ is by having e reference x, assuming Q = lambda x.e. Why? Well, if this wasn't the case then Q would have to appear literally in e (its own body). But this would lead to an infinite regress as Mark already demonstrated in the article.

Advertisment

Donate

ScienceBlogs is where scientists communicate directly with the public. We are part of Science 2.0, a science education nonprofit operating under Section 501(c)(3) of the Internal Revenue Code. Please make a tax-deductible donation if you value independent science communication, collaboration, participation, and open access.

You can also shop using Amazon Smile and though you pay nothing more we get a tiny something.

Science 2.0

Science Codex

More by this author

Moving on

August 2, 2010

Finally, at long last, I can tell you what I've been up to with finding a new home for this blog. I've created a new, community-based science blogging site, called Scientopia. With the help of many wonderful people, we're ready. We launched this morning. So to continue following GM/BM - along with…

Goodbye, Scienceblogs

July 7, 2010

So my decision is made. I'm closing up around here. I'm in the process of working out exactly where I'm going to go. With any luck, Seed will leave this blog here long enough for me to post an update with the new location. But I'm through with Seed and ScienceBlogs.

Seed, Conflicts of Interest, and Sleaze

July 6, 2010

As my friend Pal wrote about, Seed Media Group, the corporate overlords of the ScienceBlogs network that this blog belongs to, have apparently decided that blog space in these parts is now up for sale to advertisers. We've been advertiser supported since I joined up with SB. I've never minded…

Searching for Topics

June 28, 2010

As regular readers have no doubt noticed by now, posting on the blog has been slow lately. I've been trying to come back up to speed, but so far, that's been mainly in the form of bad math posts. I'd like to get back to the good stuff. Unfortunately, the chaos theory stuff that I was…

Saturday Recipe: Ginger Scallion Sauce

June 26, 2010

Today's recipe is something I made this week for the first time, and trying it was like a revelation. It's simple to make, it's got an absolutely spectacularly wonderful flavor - light and fresh - and it's incredibly versatile. It's damned near perfect. It's scallion ginger sauce, and once you try…