programming

A bunch of people have sent me links to an article about MapReduce. I've hesitated to write about it, because the currently hyped MapReduce stuff was developed, and is extensively used, by Google, my employer. But the article is really annoying, and deserves a response. So I'm going to be absolutely clear: I am not commenting on this in my capacity as a Google employee. (In fact, I've never actually used MapReduce at work!) This is strictly on my own time, and it's purely my own opinion. If it's the dumbest thing you've ever read, that's my fault, not Google's. If it's the most brilliant thing…
Today is the 70th birthday of Donald Knuth. If you don't know who Knuth is, then you're not a programmer. If you're a programmer and you don't know who Knuth is, well... I have no idea what rock you've been hiding under, but you should probably be fired. Knuth is one of the most famous and accomplished people in the computer science community. He's done all sorts of great research, published a set of definitive textbooks on computer algorithms, and of particular interest to me, implemented a brilliant, hideous, beautiful, godawful piece of software called TeX. When I went to grad school,…
I've gotten both some comments and some e-mail from people in response to my mini-rant about Erlang's macros. I started replying in comments, but it got long enough that I thought it made sense to promote it up to a top-level post. It's not really an Erlang issue, but a general language issue, and my opinions concerning it end up getting into some programming language design and philosophy issues. If, like me, you think that things like that are fun, then read on! I'll start at the absolute beginning. What's a macro? A macro is a meta-structure in a language which defines a translation of…
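Since "macro" is the crux of the whole discussion, here's a toy illustration of that core idea: a macro defines a translation applied to program text before the compiler proper ever sees it. This is a hypothetical Haskell sketch of Erlang-style `?NAME` expansion, not how any real preprocessor actually works:

```haskell
import Data.Char (isAlphaNum)

-- A macro table maps macro names to the text they expand into.
type MacroTable = [(String, String)]

-- Expand Erlang-style ?NAME references in program text.
expand :: MacroTable -> String -> String
expand table ('?':rest) =
  let (name, rest') = span isAlphaNum rest
  in case lookup name table of
       Just body -> body ++ expand table rest'
       Nothing   -> '?' : name ++ expand table rest'   -- unknown: leave as-is
expand table (c:cs) = c : expand table cs
expand _     []     = []

-- > expand [("PI", "3.14159")] "area = ?PI * r * r"
-- "area = 3.14159 * r * r"
```

The essential point the sketch makes is that the expansion happens at the level of program text, with no knowledge of the language's semantics.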
One of the problems with many of the interesting graph algorithms is that they're complex. As graphs get large, computations over them can become extremely slow. For example, graph coloring is NP-complete - so, as far as anyone knows, the time to run a true optimal graph coloring algorithm on an arbitrary graph grows exponentially in the size of the graph! So, quite naturally, we look for ways to make it faster. We've already talked about using heuristics to get fast approximations. But what if we really need the true optimum? The other approach to making it faster, when you really want the true optimum, is…
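To make that cost concrete, here's a minimal Haskell sketch of the brute-force route to the true optimum: try k = 0, 1, 2, … colors, with a backtracking search at each k. The edge-list representation and the names are my own choices for illustration:

```haskell
import Data.Maybe (isJust, listToMaybe)

type Vertex = Int
type Graph  = [(Vertex, Vertex)]   -- simple edge list, assumed for the sketch

-- Try to color the given vertices with colors 1..k, backtracking on conflict.
colorWith :: Int -> Graph -> [Vertex] -> Maybe [(Vertex, Int)]
colorWith k g = go []
  where
    clash acc v c =
      any (\(u, cu) -> cu == c && ((u, v) `elem` g || (v, u) `elem` g)) acc
    go acc []     = Just acc
    go acc (v:vs) =
      listToMaybe [ sol | c <- [1 .. k]
                        , not (clash acc v c)
                        , Just sol <- [go ((v, c) : acc) vs] ]

-- The chromatic number: the smallest k that works. This search is the
-- part that blows up exponentially on hard instances.
chromatic :: Graph -> [Vertex] -> Int
chromatic g vs = head [ k | k <- [0 ..], isJust (colorWith k g vs) ]
```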
Colored Petri Nets The big step in Petri nets - the one that really takes them from a theoretical toy to a serious tool used by protocol developers - is the extension to colored Petri nets (CPNs). Calling them "colored" is a bit of a misnomer; the original idea was to assign colors to tokens, and allow color-based expressions on the elements of the net. But study of analysis models quickly showed that you could go much further than that without losing the basic analyzability properties that are so valuable. In the full development of CPNs, you can associate essentially arbitrary collections…
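To make "tokens with colors" concrete, here's a rough Haskell sketch of the shape of a CPN: tokens carry data values, and transitions get guards over those values. The types and names are illustrative only; real CPN tools are far richer than this:

```haskell
import Data.Maybe (listToMaybe)

type Place = String

-- A tiny example "color set": tokens carrying data, not just dots.
data Token = IntTok Int | PairTok Int Int
  deriving (Eq, Show)

type Marking = [(Place, [Token])]            -- tokens currently at each place

data Transition = Transition
  { inputs :: [Place]                        -- consume one token from each
  , guard  :: [Token] -> Bool                -- the color-based condition
  , fire   :: [Token] -> [(Place, Token)]    -- tokens produced, and where
  }

-- A transition is enabled when every input place can supply a token and
-- the guard accepts the chosen tokens. (Naively picks the first token.)
enabled :: Transition -> Marking -> Bool
enabled t m =
  case mapM (\p -> lookup p m >>= listToMaybe) (inputs t) of
    Just toks -> guard t toks
    Nothing   -> False
```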
Since my post on datatypes for my π-calculus language, I've gotten a bunch of questions from people who (I guess) picked up on the series after the original post where I said that the idea of the series was to see if I could create a programming language based on it. The questions are all variations on "Why design another programming language? Do you really think anyone will ever use it?" I'll answer the second question first. Of course I realize that the chances that anyone but me, and maybe a couple of readers, will ever use it are close to zero. But that's not the point. Which brings us…
For the basics, I wrote a bunch of stuff about sorting. It seems worth taking a moment to talk about something related: binary search. Binary search is one of the most important and fundamental algorithms, and it shows up in all sorts of places. It also has the amazing property that despite being simple and ubiquitous, it's virtually always written wrong. There's a bit of subtlety in implementing it correctly, and virtually everyone manages to put off-by-one indexing errors into their implementations. (Including me; last time I implemented a binary search, the first version included one of the…
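For the record, here's one careful version in Haskell (my sketch, not the post's code), with the invariant spelled out; the midpoint calculation and the boundary updates are exactly where the off-by-one bugs usually creep in:

```haskell
import Data.Array

-- Search a sorted array, maintaining: if x is present, its index is in [lo, hi].
bsearch :: Ord a => Array Int a -> a -> Maybe Int
bsearch arr x = go lo0 hi0
  where
    (lo0, hi0) = bounds arr
    go lo hi
      | lo > hi   = Nothing                       -- interval empty: not present
      | otherwise =
          let mid = lo + (hi - lo) `div` 2        -- overflow-safe midpoint
          in case compare (arr ! mid) x of
               LT -> go (mid + 1) hi              -- x must be right of mid
               GT -> go lo (mid - 1)              -- x must be left of mid
               EQ -> Just mid
```

On `listArray (0, 4) [1, 3, 5, 7, 9]`, searching for `7` yields `Just 3`, and searching for `4` yields `Nothing`.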
This came up in a question in the post where I started to talk about π-calculus, but I thought it was an interesting enough topic to promote it up to a top-level post. If you listen to anyone talking about computers or software, there are three words you'll constantly hear: parallel, concurrent, and distributed. At first glance, it sounds like they mean the same thing, but in fact, they're three different things, and the differences are important. The connection between them is that they're all terms that describe systems made up of computers and software that are doing more than one…
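A tiny Haskell contrast may help pin down the first two terms (this sketch assumes the `parallel` package and GHC's threaded runtime; distributed computation would add separate machines and message passing, which won't fit in a few lines):

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar
import Control.Parallel (par, pseq)

-- Parallel: one computation split across processors. Same answer as
-- `sum xs`, but the two halves may be evaluated simultaneously.
parSum :: [Int] -> Int
parSum xs = left `par` (right `pseq` (left + right))
  where
    (as, bs) = splitAt (length xs `div` 2) xs
    left  = sum as
    right = sum bs

-- Concurrent: independent threads interleaving around shared state.
concurrentDemo :: IO ()
concurrentDemo = do
  counter <- newMVar (0 :: Int)
  _ <- forkIO (modifyMVar_ counter (return . (+ 1)))
  _ <- forkIO (modifyMVar_ counter (return . (+ 1)))
  threadDelay 100000            -- crude: wait for the threads to finish
  readMVar counter >>= print
```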
I feel like a bit of a change of pace, so I'm trying a bit of an experiment. Re-reading Backus's old FP work reminds me of what I was doing the last time I read it, which was back in grad school. At the time, I was working on some stuff involving parallel computation, and I discovered Robin Milner's π-calculus as a tool for describing parallel computation. You can think of π-calculus as being a sort of parallel (pun intended) to the λ-calculus: in sequential, single-threaded computation, λ-calculus can be a great tool for describing what things mean. But λ-calculus has absolutely no way of…
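As a taste of what the calculus looks like, here's the usual π-calculus process grammar transcribed as a Haskell datatype (my own sketch; names are plain strings for simplicity):

```haskell
type Name = String

data Process
  = Nil                         -- 0: the inert process
  | Par Process Process         -- P | Q: run P and Q in parallel
  | Send Name Name Process      -- x<y>.P: send name y on channel x, then P
  | Recv Name Name Process      -- x(y).P: receive on x, bind it to y in P
  | New Name Process            -- (new x) P: a fresh private channel
  | Repl Process                -- !P: unboundedly many copies of P
  deriving Show

-- The classic "forwarder": receive a name on a, pass it along on b.
forward :: Process
forward = Recv "a" "v" (Send "b" "v" Nil)
```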
In my earlier post about John Backus, I promised to write something about his later work on functional programming languages. While I was in a doctor's office getting treated for an awful cough, I re-read his 1977 Turing Award Lecture. Even 30 years later, it remains a fascinating read, and far from being dated, it's positively astonishing to see both how far-sighted Backus was, and how little progress we've actually made. Backus started from a pretty solid perspective. Almost all of the common programming languages that we see - ranging from Backus's own 1950s version of Fortran, to the…
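For readers who haven't seen the lecture: Backus's FP builds programs by combining functions, with no named data variables. A loose Haskell rendering of a few of his combining forms (my approximation, not FP's actual notation):

```haskell
-- FP's "construction" [f, g]: apply several functions to one argument.
construction :: (a -> b) -> (a -> c) -> a -> (b, c)
construction f g x = (f x, g x)

-- FP's "insert" (/+) is a fold; "apply to all" (alpha f) is map.
insertPlus :: Num a => [a] -> a
insertPlus = foldr (+) 0

-- Backus's inner-product example: insert-plus of apply-to-all-times
-- of the paired vectors.
ip :: Num a => [a] -> [a] -> a
ip xs ys = insertPlus (map (uncurry (*)) (zip xs ys))
```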
Many people would probably say that things like computability and the halting problem aren't basics. But I disagree: many of our basic intuitions about numbers and the things that we can do with them are actually deeply connected with the limits of computation. This connection of intuition with computation is an extremely important one, and so I think people should have at least a passing familiarity with it. In addition to that, one of the recent trends in crappy arguments from creationists is to try to invoke ideas about computation in misleading ways - but if you're familiar with what…
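The heart of the halting problem is a diagonalization argument, which can be sketched in Haskell. The `halts` function below is the assumption being refuted: no such total function can exist, because `contrary`, run on its own source text, would have to both halt and not halt.

```haskell
type Source = String   -- a program, represented as text

-- The assumed decider: halts p x is True iff program p halts on input x.
-- Deliberately left undefined; the whole point is that it can't be written.
halts :: Source -> Source -> Bool
halts = undefined

contrary :: Source -> ()
contrary p
  | halts p p = contrary p   -- if p would halt on itself, loop forever
  | otherwise = ()           -- if p would loop on itself, halt
```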
As long as I'm doing all of these basics posts, I thought it would be worth explaining just what a Turing machine is. I frequently talk about things being Turing equivalent, and about effective computing systems, and similar things, which all assume you have some clue of what a Turing machine is. And as a bonus, I'm also going to give you a nifty little piece of Haskell source code that's a very basic Turing machine interpreter. (It's for a future entry in the Haskell posts, and it's not entirely finished, but it does work!) The Turing machine is a very simple kind of theoretical computing…
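The promised interpreter isn't in this excerpt, but to show how little machinery a Turing machine needs, here's my own minimal sketch (not the post's code): a zipper for the tape and a map for the transition table.

```haskell
import qualified Data.Map as M

data Move = L | R deriving (Eq, Show)

type TMState = String
type Symbol  = Char
-- Transition table: (state, symbol read) -> (new state, symbol written, move).
type Rules   = M.Map (TMState, Symbol) (TMState, Symbol, Move)

-- The tape as a zipper: cells to the left (reversed), head cell, cells right.
data Tape = Tape [Symbol] Symbol [Symbol]

blank :: Symbol
blank = '_'

move :: Move -> Tape -> Tape
move L (Tape (l:ls) c rs) = Tape ls l (c : rs)
move L (Tape []     c rs) = Tape [] blank (c : rs)   -- grow the tape with blanks
move R (Tape ls c (r:rs)) = Tape (c : ls) r rs
move R (Tape ls c [])     = Tape (c : ls) blank []

-- One step: look up the rule for (state, head symbol), write, and move.
step :: Rules -> TMState -> Tape -> Maybe (TMState, Tape)
step rules q (Tape ls c rs) = do
  (q', c', mv) <- M.lookup (q, c) rules
  return (q', move mv (Tape ls c' rs))

-- Run until no rule applies (the machine halts).
run :: Rules -> TMState -> Tape -> (TMState, Tape)
run rules q t = case step rules q t of
  Nothing       -> (q, t)
  Just (q', t') -> run rules q' t'
```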
In the comments thread of the post on Turing Equivalent vs Turing Complete, there's been a flurry of questions and comments about some stuff involving computational complexity. So I thought it would be interesting to write a couple of posts about computational complexity. I'm not going to get into a lot of detail, because I would need to dig out some books that I really don't want to read again :-). But I can do the basics without them. It'll take a few posts to get through this. The logical place to begin is: What is computational complexity? When we look at a computation, one of the…
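Since the series is heading toward asymptotic complexity, the standard definition worth keeping in mind (textbook form, not the post's own wording) is:

```latex
f(n) \in O(g(n)) \iff \exists c > 0,\ \exists n_0,\ \forall n \ge n_0 :\ f(n) \le c \cdot g(n)
```

That is, f is O(g) when, from some point on, g bounds f up to a constant factor; constants and small inputs are deliberately ignored.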
In my discussion with Sal Cordova in this post, one point came up which I thought was interesting, and worth taking the time to flesh out as a separate post. It's about the distinction between a Turing equivalent computing system, and a Turing complete computation. It's true that in informal use, we often tend to muddy the line between these two related but distinct concepts. But in fact, they are distinct, and the difference between them can be extremely important. In some sense, it's the difference between "capable of" and "requires"; another way of looking at it is "sufficient" versus "…
As you may have noticed, lately, I've been fascinated by Haskell. I haven't done anything much in it until quite recently; it's been sitting in my to-do queue for a long time. This weekend, I was hacking away on a Haskell implementation of an interesting (but currently unimplemented) language from the Esolang wiki. For the most part, it went astonishingly smoothly, until I got to the point of putting things together, when I ran into a problem combining two monads, which is one of the typically difficult problems in real Haskell programming. What surprised me a bit when I hit this is how hard…
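The standard escape hatch for combining two monads is a monad transformer stack. As a hedged sketch (using `StateT` from the transformers/mtl libraries), here's State layered over IO, so one computation can both carry a counter and read input; whether a transformer fits the particular pair of monads in the post is a separate question:

```haskell
import Control.Monad.State

-- Count non-empty lines from stdin until a blank line: the State layer
-- carries the count, the IO layer does the reading.
countLines :: StateT Int IO ()
countLines = do
  line <- lift getLine          -- lift the IO action into the stack
  if null line
    then return ()
    else do
      modify (+ 1)              -- use the State layer
      countLines

main :: IO ()
main = do
  (_, n) <- runStateT countLines 0
  putStrLn ("lines: " ++ show n)
```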
One of my fellow ScienceBloggers, [Karmen at Chaotic Utopia](http://scienceblogs.com/chaoticutopia/2006/11/puzzling_at_a_simpleminde…) pointed out a spectacularly stupid statement in [Casey Luskin's critique of Carl Zimmer][lutkin] (*another* fellow SBer) at the Discovery Institute's "Center for Science and Culture". Now normally, I might not pile on to the tear-down of Casey (not because he doesn't deserve it, but because often my SciBlings do such a good job that I have nothing to add); but this time, he's crossed much too far into *my* territory, and I can't let that pass without at least a…
I came across an article yesterday about programming languages, which hit on one of my major peeves, so I can't resist responding. The article is at greythumb.org, and it's called [Programmer's rant: what should and should not be added to C/C++](http://www.greythumb.org/blog/index.php?/archives/152-Programmers-rant-…). It's a variation on the extremely common belief that C and C++ are the best languages to use when you need code to run fast. They're not. They're good at things that need to get very close to the hardware - not in the efficiency sense, but in the sense of needing to be able…
While I was waiting for stuff to install on my new machine, I was doing some browsing around the web, and came across an interesting article at a blog called "The Only Winning Move", titled [Scheme Death Knell?](http://theonlywinningmove.blogspot.com/2006/10/scheme-death-knell.html). It's not a bad article at all, but I disagree with its conclusions, and I thought it would be interesting to discuss why. The article was brought on by *another* blog post, this one at [Lambda the Ultimate](http://lambda-the-ultimate.org/) suggesting that someone translate Douglas Hofstadter's columns introducing…
Over in my post accepting my victory as the biggest geek on ScienceBlogs, an interesting discussion about beginners learning to program got started in the comments. It was triggered by someone mentioning David Brin's article in Salon about how terrible it is that computers no longer come with BASIC. The discussion was interesting; I think it's interesting enough to justify a top-level post. A few days ago in Salon, David Brin published an article (no link, because Salon is a pay-site), lamenting the fact that computers no longer come with BASIC interpreters, and how that was going to wreck…
Back at my old digs last week, I put up a post about programming languages and types. It produced an interesting discussion, which ended up shifting topics a bit, and leading to a very interesting question from one of the posters; since the answer actually does involve math, I'm going to pop it up to the front page here. In the discussion, I argued that programmers should know and use many different programming languages; and that that's not just a statement about today's programming languages, but something that I think will always be true: that there will always be good reasons…