Basics: Algorithm

By goodmath on February 27, 2007.

A kind reader pointed out that I frequently mention algorithms, but that I haven't defined them in the basics posts. To me, they're so fundamental to the stuff I do for a living that I completely forgot that they're an interesting idea.

It's really pretty simple: an algorithm is description of a mechanical procedure for performing a mathematical task. But that's actually a fairly sophisticated idea - the general concept of things that describe a procedure, and which can be discussed, described, and reasoned about as entities in their own right is something quite different from nearly anything that came before it.

The term algorithm has an interesting and humorous origin. Most people believe that it's named after a Persian mathematician named "Al-Khwarizmi". That's actually not quite true. While the term is derived from Al-Khwarizmi's name, it's actually an accident. In fact, what happened was that Al-Khwarizmi wrote a book about how to perform calculations using the Hindi numbering system (which is what we know as arabic numerals). Several hundred years later, the text was translated into Latin, and in the process of transliterating the name "Al-Khwarizmi" into Latin, the title of the book became "Algorithmi on Indian Numbers". "Algorithmi" looks like a latin plural noun - and so many European mathematicians interpreted the title of the book not as "Al-Khwarizmi talking about mathematical procedures using Indian numbers", but "Mathematical Procedures using Indian Numbers". And the term stuck.

So - an algorithm is a description mathematical procedure - and it's possible to reason about an algorithm as an entity itself. Let's look at a simple example, just to get the sense of it.

Suppose I have a list of N numbers in random order. I want to get the list into
increasing order - smallest first, largest last, ever number smaller than the number that comes after it.

One of the simplest algorithms for this is called bubble sort:

Given: a list L₁...L_N of N numbers in random order.
Produce: a list of those N numbers in increasing order
Do:
   repeat the following until the list no longer changes:
      look at each pair of adjacent numbers L_i and L_i+1:
      if L_i > L_i+1 then swap L_i and L_i+1

This will produce a sorted list. Let's run through a quick example: the list [3,5,2,1,4].

First pass through the list:
1. Compare 3 and 5. 3<5 so move on to the next pair.
2. Compare 5 and 2. 5>2, so swap. List becomes [3,2,5,1,4]
3. Compare 5 and 1. 5>1, so swap. List becomes [3,2,1,5,4]
4. Compare 5 and 4. 5>4, so swap. List becomes [3,2,1,4,5]
The list changed, so we take another pass. Second pass:
1. Compare 3 and 2. 3>2, so swap. List becomes [2,3,1,4,5]
2. Compare 3 and 1.3>1, so swap. List becomes [2,1,3,4,5]
3. Compare 3 and 4. 3<4, so move on.
4. Compare 4 and 5. 4<5, so move on.
The list changed, so we do another pass. Third pass:
1. Compare 2 and 1. 2>1, so swap. List is now: [1,2,3,4,5].
2. Next four comparisons are (2,3), (3,4), (4,5), none of which cause swaps.
The list changed, so another pass. Fourth pass, everything is in order - each pair compares without causing changes. So the result is [1,2,3,4,5].

What can we say about this algorithm? Well, for one thing, it's slow. Given a list of N elements, we might need to do up to N passes over the list, each of which will do N-1 comparisons, each of which might cause a swap. So in the worst case, this will require N²-N comparisons and swaps to get the list into order. Our example wasn't even a worse case, and it ended up needed to do 16 comparisons (4 passes, 4 comparisons per pass), and 6 swaps. In the best case - the list is already in order - this algorithm will take one pass - N-1 comparisons, 0 swaps.

There are even some tricks using Kolmogorov-Chaitin information theory where we can show that on average, Bubble-sort will tend more towards the worse end of its possible performance than the best - but that's beyond the scope of a basics post, except to point out that we can apply interesting mathematical analyses to the algorithm itself to figure out what it's average performance should be, even without running it.

More like this

If I Were You: Perceptual Illusion of Body Swapping

Everybody's talking about the Body Swap Illusion. So why should I bother. Here's what you need to know: The original paper is here, at PLoS. Here are some blog posts on the topic:

Credit Default Swaps: Gambling as Insurance

So, the financial questions keep coming. I'm avoiding a lot of them, because (A) they bore me, and (B) I'm really not the right person to ask. I try to stay

Help a ScienceBlogger

Remember those classic psychology experiments where people walk around for a couple days with glasses that either flip the image up/down or right/left and then their brain seems to adjust to the swap? I'm curious about the effect and would love to get a hold of some of the glasses they used.

Book Review: Moore, The Swap

Now that you have done a post on both algorithms and Turing Machines, you should do a post on Church's Thesis.

What's your favourite algorithm?

I enjoy your method of teaching Mark, I've been reading your blog for 3 months now (or so), always a nice read, especially the plonky ID drama! :)

This just adds to a growing list of re-readable material! :)

If you do the Church-Turing thesis next you might also tack on this "Elementary Proof that P =/= NP".

Since the question was asked, my favourite algorithm is the Hopcroft state minimisation algorithm. It's one of the simplest algorithms which has the following properties:

1. What it does is easy to understand.

2. Writing a simple algorithm to solve the problem is easy, and it's easy to understand.

3. Taking one of the simplest algorithms, and making a very simple (a one line change in a typical programming language) modification to it, improves the running time from O(n^2) to O(n log n).

4. This modification is precisely the opposite of what you'd expect.

Good work Mark, as usual. And this short post of J. Shallit over at Recursivity makes a nice complement, I think.

it's average performance

While I'm sure yesterdays human calculators were interested in how much time (how many operation) they used, I don't think they were much concerned with the number of papers. I have a feeling that computer science may have contributed with the complementary algorithm performance measure of needed memory space.

As someone who has a little programing experience (java), and understands the programing perspective of an algorithm, and the silliness that is a bubble sort, but has very basic math skills. Can you explain the difference between an algorithm used in programing, and an algorithm used in mathematics. Please excuse me if this seems odd, but my math skills suck. =( I'm working on it...

-- jolt

The synchronicity of the web - speaking of performance measures, Jeffrey Shallit has a post where he relates his article to Spiked magazine on the "greatest innovation in my field":

In theoretical computer science, the greatest innovation is the realization that algorithms are mathematical objects, and can be rigorously analyzed in terms of their consumption of scarce resources, including space, time, and randomness.

( http://recursed.blogspot.com/2006/12/greatest-innovation-of-theoretical… )

Hmm. Randomness - is that a scarce resource due to the use of pseudorandom algorithms and/or finite number size? And if so, wouldn't that fall back to the constraints in time and space?

(Btw, every computer could easily have a circuit who gives good random numbers from a noise or quantum process - and it seems such circuits are WIP now.)

Is there any chance you might write about some other sorting algorithms?

I find them to be fascinating. Anyone who is interested in such things will probably like this site:
http://www.cs.ubc.ca/~harrison/Java/sorting-demo.html

Hmm. Randomness - is that a scarce resource due to the use of pseudorandom algorithms and/or finite number size? And if so, wouldn't that fall back to the constraints in time and space?

I'm not quite what Shallit meant by that statement, but there are two distinct ways in which algorithms use randomness:

1.) Pseudoramdomness, which as you mention has some overlap with time and space. It however distinct due to it's use, for instance, as a statistical judgement of whether the output you get is "rigged" or not in randomized algorithms.

2.) Incompressibility: Turing machines are modeled with self-delimiting versions of their individual inputs bordered by "blanks", which goes back to the definition of "randomness" in Kolmogorov complexity, which is used to minimize redundancy.

Jesus Christ, I'm on my way to typo-city for a stay at the word-gap hotel. Sorry for the messiness of the above post. :(

It might be helpful to discuss a simple algorithm that people are familiar with, if you're trying to teach what an algorithm is in general. For instance, the algorithm that people use to multiply numbers of several digits (which is made possible by the use of arabic numerals), where you line up the numbers one above the other and multiply the digits, carrying the tens to the next column, etc.

I enjoyed learning sorts in computer science classes and comparing them to my own sorting algorithms. Many of them (but not heapsort!) had obvious analogues in my own behavior.

Randomness as a resource: Even on the idealized machines, the question of how many, say, uniformly random bits, you need for some algorithm is often an important question. Do I just need 4 random bits say, or do I need some number based on the size of the input. That is why it is looked at as a resource in the same way space or time are.

(Btw, every computer could easily have a circuit who gives good random numbers from a noise or quantum process - and it seems such circuits are WIP now.)

Intel had such a device, and included it on certain consumer-level Intel motherboards produced between 1999 and 2001. You may somewhere have a seven-year-old computer with a hardware random number generator in it.

It turned out that there wasn't really a need in most of what drives the consumer electronics market for a device - as of now, good-enough randomness is cheap even without such a hardware device.

Thanks Tyler, markk and Daniel for setting me straight:

It however distinct due to it's use, for instance, as a statistical judgement

Ah yes, Shallit may have meant it as a quality resource.

Even on the idealized machines, the question of how many, say, uniformly random bits, you need for some algorithm is often an important question.

I didn't think of use as an internal metric, but that makes sense too.

Intel had such a device, and included it on certain consumer-level Intel motherboards produced between 1999 and 2001.

Um, it was likely an old memory. I guess I should have checked the current situation first.

as of now, good-enough randomness is cheap even without such a hardware device

Glad to hear it. Considering the above remarks it seems I misread Shallit completely here.

It's easy to give examples of algorithms, as you have done. But it seems that "algorithm" is very difficult to *define* in a precise way.

Why? Let's give the example of sorting a list of integers. You can write many computer programs that perform this task, but we don't want to say that every different computer program implements a different algorithm. There are many programs that implement a bubble sort algorithm which differ only in trivial implementation details such as the location of temporary storage. We also can't define it in terms of the actual function calculated: we want to be able to say that bubble sort and merge sort are different algorithms, even though they calculate the exactly the same function.

We would like our understanding of the concept of "algorithm" to define some kind of equivalence relation on computer programs, such that any two programs that implement the same algorithm are defined to be equivalent. But the correct definition of such an equivalence relation seems quite mysterious to me...

Alex, what you discuss seems to me like the problems of defining for example the concept of species. Depending on what you study (living populations or fossils) you need different conceptions.

Here you need different scope depending on if you study the idealized algorithm, the intended implementation or the real algorithm.

Advertisment

Donate

ScienceBlogs is where scientists communicate directly with the public. We are part of Science 2.0, a science education nonprofit operating under Section 501(c)(3) of the Internal Revenue Code. Please make a tax-deductible donation if you value independent science communication, collaboration, participation, and open access.

You can also shop using Amazon Smile and though you pay nothing more we get a tiny something.

Science 2.0

Science Codex

More by this author

Moving on

August 2, 2010

Finally, at long last, I can tell you what I've been up to with finding a new home for this blog. I've created a new, community-based science blogging site, called Scientopia. With the help of many wonderful people, we're ready. We launched this morning. So to continue following GM/BM - along with…

Goodbye, Scienceblogs

July 7, 2010

So my decision is made. I'm closing up around here. I'm in the process of working out exactly where I'm going to go. With any luck, Seed will leave this blog here long enough for me to post an update with the new location. But I'm through with Seed and ScienceBlogs.

Seed, Conflicts of Interest, and Sleaze

July 6, 2010

As my friend Pal wrote about, Seed Media Group, the corporate overlords of the ScienceBlogs network that this blog belongs to, have apparently decided that blog space in these parts is now up for sale to advertisers. We've been advertiser supported since I joined up with SB. I've never minded…

Searching for Topics

June 28, 2010

As regular readers have no doubt noticed by now, posting on the blog has been slow lately. I've been trying to come back up to speed, but so far, that's been mainly in the form of bad math posts. I'd like to get back to the good stuff. Unfortunately, the chaos theory stuff that I was…

Saturday Recipe: Ginger Scallion Sauce

June 26, 2010

Today's recipe is something I made this week for the first time, and trying it was like a revelation. It's simple to make, it's got an absolutely spectacularly wonderful flavor - light and fresh - and it's incredibly versatile. It's damned near perfect. It's scallion ginger sauce, and once you try…