While I was waiting for stuff to install on my new machine, I was doing some browsing around the web, and came across an interesting article at a blog called “The Only Winning Move”, titled [Scheme Death Knell?](http://theonlywinningmove.blogspot.com/2006/10/scheme-death-knell.html). It’s not a bad article at all, but I disagree with its conclusions, and I thought it would be interesting to
discuss why.
The article was brought on by *another* blog post, this one at [Lambda the Ultimate](http://lambda-the-ultimate.org/) suggesting that someone translate Douglas Hofstadter’s columns introducing Scheme to replace the Scheme with Haskell.
Josh over at TOWM was slightly irritated by this idea, and concludes that the only reason
why people are suggesting that is because scheme is getting too old to be cool and trendy,
and so they want to switch to something newer.
I disagree.
Haskell is a *very* different language from Scheme, and I think that there are some genuinely good
reasons for teaching Haskell instead of Scheme. I *also* think there are some genuinely good reasons for using Scheme instead of Haskell. It’s not really clear to me which is better, but it’s worth contrasting the two to help understand the tradeoff.
Syntax
——–
Syntax is in many ways a minor issue. But for people just learning to program, it *can* be a big deal. And it’s *not* the case that simpler is always better. I’ve actually taught an introduction to computer class using Scheme, and the syntax was a *big* problem.
I think that syntactically, Haskell is a better language for beginners. Haskell syntax is very redundant, which is *good* for people. Compare these two little snippets:
(define (fact n)
(if (= n 0)
1
(* n (fact (- n 1)))))
fact n = if n == 0
then 1
else n * fact(n-1)
Which is easier to read? And the Haskell can get even easier to read and write using
pattern matching:
fact 0 = 1
fact n = n * fact (n-1)
Try this comparison:
(define (fold init reducer list)
(if (eq? list ())
init
(reducer (car list) (fold init reducer (cdr list)))))
versus:
fold init reducer [] = init
fold init reducer l:ls = reducer l (fold init reducer ls)
The difference gets *much* more obvious with bigger pieces of code. Between the deliberate
redundancies in Haskell’s syntax, and the way that pattern matching lets you decompose programs, the Haskell is significantly clearer.
There’s another syntax thing that’s a *lot* clearer with Haskell. Both Scheme and Haskell
are strongly oriented towards programming in a style where functions are values. When you work that way, one of the most fundamental things you want to do is create new functions. And one of the
most common kinds of functions you end up writing is currys: that is, versions of already existing
functions with *some* parameters bound. So, for example, suppose you want to search a list for
things with a key of 3. In Scheme, you’d pass a function (lambda (x) (eq? x 3));
in Haskell, you don’t need to wrap it into a lambda; you can just write the function with missing parameters. For example, the function to compare to 3 is (3==). Which of those two is easier to read?
Typing
———-
This is going to be more controversial. I’m a big fan of strong typing; many other people aren’t. But I’m going to try to avoid my personal opinion about what language I’d like to program in, and focus on the aspects that affect teaching.
Beginners are *very* prone to make certain kinds of mistakes as they learn. The idea that the string “123″, the integer 123, and the floating point number 123.0 are three *different* things is actually remarkably tricky for beginners. I can’t even begin to tell you how many times I’ve sat in a lab with a student getting a Scheme error that they couldn’t understand
because they were mixing up data types. Strong typing is *really* good for helping catch a lot of
beginners errors, and giving them sensible error messages. It’s particularly good in languages like Haskell that have really powerful type inference systems, so that most beginners programs don’t need any actual type declarations.
On the other hand, getting from point of being a beginner to being really skilled in a language, Haskell types *can* start to get in the way. That wonderful type inference comes at a cost: there’s a fairly complicated system of type “classes” and “instances” to describe how types fit into the polymorphic type inference setup.
Basic Semantics
—————-
At a very basic level, the semantics of Scheme and Haskell are extremely similar. They’re both basically lambda calculus evaluators, and they both model data in very similar ways. There are two
main differences:
1. *Purity*: Haskell is pure lambda calculus: no assignments, no side effects; Scheme is
more of a hybrid, with pure semantics when you want them, and assignments when you do.
2. *Evaluation Order*: Scheme uses eager evaluation; Haskell uses lazy evaluation. What that means is that in Scheme, a function call “(f x y)” will evaluate “x” and “y” before invoking “f”; but in Haskell “f x y”
Both of these are tradeoffs, and it’s hard to say which is preferable.
The purity of Haskell is fantastic at times: there are definitely times, especially in simple beginner programs, where pure programs are much clearer than the corresponding non-pure/imperative program. Try, if you can, to remember your reaction the first time you saw “x=x+1″. And again, from the beginners perspective, the pure approach *prevents* beginners from making some common mistakes. For example, if you can’t use global variables, then you won’t. Once you’re a skilled programmer, recognizing the places where a global is appropriate is something you’re expected to be able to do. But if you teach beginners, in every class, you’ll find at least 20% who don’t pass parameters the way they should – they put some or all of the values that should be parameters into globals.
From the other side, some things are *much* more difficult working in a pure style. Some algorithms are far more natural written in an imperative style; and much more importantly, many things like I/O are *intrinsically* based on state change. The pure approach generally uses monads to work around this; but there are problems with the monadic approach. Monads can be *extremely* difficult to understand, and even experts can have trouble when they need to mix different kinds of monadic actions. (For example, using a state monad to allow you to write some efficient state-based algorithm, mixed with I/O.)
Laziness has a similar problem. It’s very convenient to be able to write code without worrying
about what order things happen in. In fact, it’s so convenient that *most* programming languages include *at least* one version of lazy operators for use in conditionals (except they call them “short-circuit” operators).
And again, from the other side, laziness can be incredibly confusing. Not knowing what order things are going to happen in can result in things happening in an unexpected order. Mix in monads in the wrong way, and you can get utterly incomprehensible results.
Advanced Semantics
———————
On a more advanced level, the semantics of Haskell are clearly superior. But I’m not saying that for the reasons that you usually hear in discussions about functional languages. What you normally hear is a story about correctness proofs, and how you can do correctness proofs easily in pure functional languages, but it’s much harder in non-pure languages. I think that’s a pile of nonsense. I mean, sure, on some level, it’s true that it’s easier to write a proof about a pure functional program than it is to write one about a program that’s chock-full of assignments. But the simple fact of the matter is, outside of trivial classroom examples, I’ve never seen anyone actually write a correctness proof. I’ve rarely seen anyone *specify* the semantics of what they intend their program to do well enough to be able to write a correctness proof.
The reason that I like the semantics of Haskell is very simple: they’re simple.
If you’ve got a class full of first year computer scientists, you *can* teach them to read and understand the full formal semantics of Haskell. You can make it completely non-mysterious. Everything can be explained by the standard lambda calculus α, β, and η rules, with no hidden complexities. It’s all remarkably straightforward.
In non-pure languages, that’s just not the case. Even Scheme, which has by far the simplest and cleanest denotational semantics of any non-pure language I’ve ever seen, has enough incomprehensible gunk in its formal semantics that I wouldn’t attempt to teach it before *at least* third year undergrad, and probably later.
A lot of people will say that that’s no big deal: why needs to know the formal semantics anyway? And for *most* programmers, they’re probably right. But for someone who wants to *really understand* what’s going on, the fact that the semantics *are* approachable and comprehensible *is* a very good thing.
Conclusions
—————
In the end, do I like the idea of re-writing the Hofstadter tutorials in Haskell, instead of Scheme?
I think it’s a great idea. Haskell is a great language, and I think that on balance,
it would be a great experience for interested beginners to have something like Hofstadters
essays to guide them through learning to programming using a language like Haskell. Not as a *replacement* for the Scheme ones, but as an alternative.