*(This is a heavily edited repost of the first article in my original
Haskell tutorial.)*

*(I’ve attempted o write this as a literate haskell program. What that
means is that if you just cut-and-paste the text of this post from your
browser into a file whose name ends with “.lhs”, you should be able to run it
through a Haskell compiler: only lines that start with “>” are treated as
code. The nice thing about this is that this blog post is itself a
compilable, loadable Haskell source file – so I’ve compiled and tested
all of the code in here in exactly this context.)*

Haskell is a functional programming language. At the simplest

level, what that means is that you write programs by writing functions – and the

functions are truly mathematical functions: they take a set of inputs, and generate

a set of outputs, and the result of the function call is dependent solely

on the values of the inputs. So, the best way to start out

looking at Haskell is to look at how to write basic functions.

So let’s dive right in a take a look at a very simple Haskell definition

of the factorial function:

> simplest_fact n = if n == 0 > then 1 > else n * simplest_fact (n - 1)

This is the classic implementation of the factorial. Some things to

notice:

- In Haskell, a function definition uses no keywords. Just “
`name`

“

params = impl - Function application is written by putting

things side by side. So to apply the factorial function to`x`

, we

just write`simplest_fact x`

. Parens are only used for managing precedence.

The parens in “`fsimplest_act (n - 1)`

” aren’t there because the function

parameters need to be wrapped in parens, but because otherwise, the default

precedence rules would parse it as “`(simplest_fact n) - 1`

“. - Haskell uses infix

syntax for most mathematical operations. That doesn’t mean that they’re

special: they’re just an alternative way of writing a binary function

invocation. You can define your own mathematical operators using normal

funcion definitions. - There are no explicit grouping constructs in Haskell:

no “{/}”, no “begin/end”. Haskell’s formal syntax uses braces, but the

language defines how to translate indentation changes into braces; in

practice, just indenting the code takes care of grouping.

That implementation of factorial, while correct, is not how most Haskell

programmers would normally implement it. Haskell includes a feature called

*pattern matching*, and most programmers would use pattern matching

rather than the `if/then`

statement for a case like that. Pattern matching

separates things and often makes them easier to read. The basic idea of

pattern matching in function definitions is that you can write what *look
like* multiple versions of the function, each of which uses a different set of

patterns for its parameters (we’ll talk more about what patterns look like in

detail in a later post); the different cases are separated not just by

something like a conditional statement, but they’re completely separated at

the definition level. The case that matches the values at the point of call is

the one that is actually invoked. So here’s a more stylistically correct

version of the factorial:

> pattern_fact 0 = 1 > pattern_fact n = n * pattern_fact (n - 1) >

In the semantics of Haskell, those two are completely equivalent. In fact,

the deep semantics of Haskell are based on pattern matching – so it’s more

correct to say that the if/then/else version is translated into the pattern

matching version than vice-versa! At a low level, both will basically be

expanded to the Haskell pattern-matching primitive called the

“`case`

” statement, which selects from among a list of patterns:

> cfact n = case n of > 0 -> 1 > _ -> n * cfact (n - 1)

Another way of writing the same thing is to use Haskell’s list type. Lists

are a very fundamental type in Haskell. They’re similar to lists in Lisp,

except that all of the elements of a list must have the same type. A list is

written between square brackets, with the values in the list written inside,

separated by commas, like:

[1, 2, 3, 4, 5]

As in lisp, the list is actually formed from pairs, where the first

element of the pair is a value in the list, and the second value is the rest

of the list. Pairs are *constructed* in Haskell using “`:`

“,

so the list could also be written in the following ways:

1 : [2, 3, 4, 5] 1 : 2 : [3, 4, 5] 1 : 2 : 3 : 4 : 5 : []

If you want a list of integers in a specific range, there’s a shorthand

for it using “`..`

“. To generate the list of values from

`x`

to `y`

, you can write:

[ x .. y ]

Getting back to our factorial function, the factorial of a number “n” is

the product of all of the integers from 1 to n. So another way of saying that

is that the factorial is the result of taking the list of all integers from 1

to n, and multiplying them together:

listfact n = listProduct [1 .. n]

But that doesn’t work, because we haven’t defined `listProduct`

yet. Fortunately, Haskell provides a ton of useful list functions. One of the

useful types of list operations is *fold*, which comes in two versions:

`foldl`

and `foldr`

. What folds do is iterate over a

list using some functions to combine elements. It takes a function to merge

two values, and an initial value for the merge. Then it takes the first

element in the list, and merges it with the initial value of the result. Then

it takes the second value, and merges it with the result of the first. And so

on.

The difference between `foldl`

and `foldr`

is the associative order in which they do things: `foldl`

starts

from the left, and foldr starts from the right.

To illustrate, suppose we had a list `[1, 2, 3, 4, 5, 6]`

. If

we did `foldl (*) 1 [1, 2, 3, 4, 5, 6]`

, it would evaluate as

“(((((1 * 1) * 2) * 3) * 4) * 5) * 6″. That is, it would start with 1, and

then 2, and then 3, and so on. `foldr`

would group

right-associative; it would do “1*(2*(3*(4*(5*(6*1)))))”. Since `*`

is associative, the result value doesn’t matter. But it can make a significant

difference in performance; the way that the Haskell ends up evaluating

function calls, you can easily wind up with programs where `foldr`

can be much faster that `foldl`

.

So, using fold, you can write the factorial using lists:

> listfact n = listProduct [ 1 .. n ] > where listProduct lst = foldl (*) 1 lst

One thing that some Haskell programmers like to do is to write

what they call *points-free* functions: that is, functions

that do not need to name any of their parameters. You can do an amazing

amount of stuff in points-free Haskell. Personally, I’m not a fan of

points-free programming: I find it much harder to read. But some people

love it.

To go points-free, you define a function in terms

of series of compositions of other functions. The elements of

those compositions take care of pushing the values around to

get them to the right place. For factorial, it’s pretty easy. There’s

a function called “`enumFromTo`

“, which takes two

parameters, m and n, and generates a list of the values from m to n. Through

a haskell feature called currying, if you take a two-parameter

function and only give it one parameter, you get back a one-parameter

function. So:

> enumFromOne = enumFromTo 1

is exactly the same as:

enumFromOne n = enumFromTo 1 n

There’s a list function, “`product`

” which takes the product of

all of the elements of a list – it’s basically a shorthand for “`zipwith`

“. Finally, if you’ve got two functions f and g, you can

(*) 1

write the composition of the two functions as `g . f`

. So,

to write a points-free factorial function, you could just write:

> pointsfree_fact = product . (enumFromTo 1)

I need to explain one more list function before moving on. There’s a

function called “`zipWith`

>” for performing an operation pairwise

on two lists. For example, given the lists “`[1,2,3,4,5]`

” and

“`[2,4,6,8,10]`

“, “`zipWith (+) [1,2,3,4,5]`

” would result in “

[2,4,6,8,10]`[3,6,9,12,15]`

“.

Now we’re going to jump into something that’s going to seem really

strange. One of the fundamental properties of how Haskell runs a program is

that Haskell is a *lazy* language. What that means is that no

expression is actually evaluated until its value is *needed*. So you

can do things like create *infinite* lists – since no part of the list is

computed until it’s needed, you can create something that defines an infinite

list – but since you’ll never access more than a finite number of elements of

it, it’s no problem. Using that, we can do some really clever things, like

define the complete series of fibonnaci numbers:

> fiblist = 0 : 1 : (zipWith (+) fiblist (tail fiblist))

This looks very strange if you’re not used to it. But if we tease it apart a

bit, it’s really pretty simple:

- The first element of the fibonacci series is “0”.
- The second element of the fibonacci series is “1”.
- For the rest of the series, take the full fibonacci list, and

line up the two copies of it, offset by one (the full list, and the

list without the first element), and add the values pairwise.

That third step is the tricky one. It relies on the laziness of Haskell:

Haskell won’t compute the *nth* element of the list until you

*explicitly* reference it; until then, it just keeps around the

*unevaluated* code for computing the part of the list you haven’t

looked at. So when you try to look at the *n*th value, it will compute

the list *only up to* the *n*th value. So the actual computed

part of the list is always finite – but you can act as if it wasn’t. You can

treat that list *as if* it really were infinite – and retrieve any

value from it that you want. Once it’s been referenced, then the list up to

where you looked is concrete – the computations *won’t* be repeated.

But the last tail of the list will always be an unevaluated expression that

generates the *next* pair of the list – and *that* pair will

always be the next element of the list, and an evaluated expression for the

pair after it.

Just to make sure that the way that “`zipWith`

” is working in

“`fiblist`

” is clear, let’s look at a prefix of the parameters to

`zipWith`

, and the result. (Remember that those three are all actually the

same list! The diagonals from bottom left moving up and right are the same

list elements.)

fiblist = [0 1 1 2 3 5 8 ...] tail fiblist = [1 1 2 3 5 8 ... ] zipWith (+) = [1 2 3 5 8 13 ... ]

Given that list, we can find the *n*th element of the list very

easily; the *n*th element of a list *l* can be retrieved with

“`l !! n`

“, so, the fibonacci function to get the *n*th

fibonacci number would be:

> list_fib n = fiblist !! n

And using a very similar trick, we can do factorials the same way:

> factlist = 1 : (zipWith (*) [1..] factlist) > factl n = factlist !! n

The nice thing about doing factorial this way is that the values of all of

the factorials *less than* n are also computed and remembered, so the

next time you take a factorial, you don’t need to repeat those

multiplications.