When I was an undergrad, I almost took a degree in linguistics because I was so fascinated by languages, especially by the rate and patterns of change that languages undergo. So of course, I was excited to read two fascinating papers that were published in this week's issue of Nature. These papers find that individual words evolve in a predictable manner and this rate of evolution depends upon their frequency of use. Further, this predictability can be defined mathematically.
To test this hypothesis, one group of researchers from Harvard University set out to determine the rate at which irregular verbs change into "regularized verbs" and compared that to their frequency of use. Another group from the University of Reading, UK, compared the frequency of word use to the rate of replacement by another word for four different languages. Despite their different approaches, both groups report similar findings: the more frequently a word is used, the less likely it will undergo change, a phenomenon that mirrors the evolutionary rate for genes.
The study conducted by Erez Lieberman and Jean-Baptiste Michel and their team of evolutionary biologists, all from Harvard University, focused on conjugation changes to 177 irregular Old English verbs. They found that 145 of these irregular verbs were still irregular in Middle English, while only 98 remain irregular today.
An irregular verb is an action word that does not follow regular conjugation rules, so it has a peculiar past tense; examples include sing/sang/sung and go/went/gone. Regular verbs, by contrast, are simply modified by adding -ed to the infinitive to create the simple past and past participle forms; for example, talk/talked/talked. Interestingly, the team mentions that new verbs entering the English language universally follow regular conjugation rules; for instance, google/googled/googled.
Curiously, even though fewer than 3% of all English verbs are irregular, the ten most commonly used verbs are irregular (see table at right for a representative sampling of irregular verbs in the study). Thus, there is a heavy bias towards irregular verbs maintaining their "irregularities" when they are used often, whereas infrequently used irregular verbs are regularized by adding -ed to the infinitive. It is unclear why the -ed rule emerged as dominant over the other seven (at least) classes of irregular verbs, but it is probably because the other rules, and their unusual conjugations, are easy to forget.
To study the evolution of irregular verbs, Lieberman and Michel's team first identified 177 irregular English verbs. They then used the English-language CELEX corpus, a lexical and textual database containing 17.9 million words, primarily from American and British English text sources. CELEX determines the frequency of use for any given word within the database. Using these data, the team calculated the rate at which the 177 irregular verbs became "regularized" over time, and compared that to their frequency of use (figure 1a, below).
By leaving out frequency data for the most commonly used irregular verbs (be, have, come, do, find, get, give, go, know, say, see, take, think), which are most resistant to change, Lieberman and Michel's group found that the least frequently used irregular verbs changed, or regularized, fastest. In fact, there was a mathematical relationship that described this rate of change such that those verbs that were used 100 times less frequently evolved 10 times faster (figure 1b, above). Thus, the team used this mathematical relationship to predict the half-life for regularization for other irregular verbs (see table 1, above). According to their hypothesis, only 83 of the original 177 irregular verbs will retain their unusual conjugations by 2500.
Using these data, the team predicts that the half-lives of 'be' and 'have' are 38,800 years, making these two verbs the most resistant to regularization. The team also predicts that the next irregular verb to regularize will be wed/wed/wed, because it is the least frequently used modern irregular verb. In fact, this verb is already being replaced by wed/wedded/wedded in the four major English-language dictionaries.
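The "100 times less frequent, 10 times faster" relationship described above amounts to a square-root scaling: the regularization rate goes as one over the square root of a verb's frequency, so its half-life goes as the square root of its frequency. Here is a minimal Python sketch of that arithmetic. The function names are my own, the exponential-decay reading of "half-life" is an assumption, and the numbers it prints are illustrations of the scaling, not values taken from the paper.

```python
import math


def relative_regularization_rate(frequency_ratio: float) -> float:
    """Rate relative to a reference verb, for a verb used `frequency_ratio`
    times as often. Assumes rate is proportional to 1 / sqrt(frequency)."""
    return 1.0 / math.sqrt(frequency_ratio)


def scaled_half_life(reference_half_life_years: float, frequency_ratio: float) -> float:
    """Half-life of a verb used `frequency_ratio` times as often as a
    reference verb. Assumes half-life is proportional to sqrt(frequency)."""
    return reference_half_life_years * math.sqrt(frequency_ratio)


def probability_still_irregular(elapsed_years: float, half_life_years: float) -> float:
    """Chance that a verb is still irregular after `elapsed_years`.
    Assumes simple exponential decay (hypothetical reading of 'half-life')."""
    return 0.5 ** (elapsed_years / half_life_years)


if __name__ == "__main__":
    # A verb used 100 times less often than the reference verb.
    print(relative_regularization_rate(1 / 100))
    # Illustrative only: scale the 38,800-year half-life quoted for 'be' and 'have'.
    print(scaled_half_life(38_800, 1 / 100))
    # Chance that a verb with a 38,800-year half-life is still irregular in 500 years.
    print(probability_still_irregular(500, 38_800))
```

Under these assumptions, a hypothetical verb used 100 times less often than 'be' would have a half-life one tenth as long.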
So based on this study, it is perhaps not surprising to learn that infrequently used irregular verbs evolved fastest. For instance, 'stode', the antiquated past tense for 'study,' was regularized to 'studied' and 'holp,' the past tense of 'help,' became the modern 'helped.'
In fact, another study published in the same issue of Nature revealed the same basic rate of linguistic evolution. That study, by Mark Pagel, an evolutionary bio-informaticist at the University of Reading, UK, and his group, also suggested that less commonly used words evolve faster. Pagel's group found this by comparing frequency of word use to the rate of replacement with a different word for four languages (Greek, Spanish, English and Russian). Further, based on their findings, Pagel's group estimates that frequency of word use explains half the overall rate of language evolution.
Interestingly, these linguistic data are analogous to the way that genes evolve: those genes that are expressed most frequently, such as the so-called "housekeeping genes", are least likely to change, whereas genes that encode rarer and more specialized functions have less selective pressure and thus change faster.
"Quantifying the evolutionary dynamics of language" by Erez Lieberman, Jean-Baptiste Michel, Joe Jackson, Tina Tang & Martin A. Nowak. Nature 449:713-716 (11 October 2007 | doi:10.1038/nature06137) [PDF].
"Frequency of word-use predicts rates of lexical evolution throughout Indo-European history" by Mark Pagel, Quentin D. Atkinson & Andrew Meade. Nature 449:717-720 (11 October 2007 | doi:10.1038/nature06176) [PDF].
Is there no end to your brain? You are such a renaissance woman. Do you repair car engines and paint portraits as well? Actually, I found this fascinating...but want the days to slow down so I can digest all the interesting stuff you send out to the blogosphere.
Just skimmed through the data, but burn, burning, burn_t_ is hardly a regular verb... Or am I misreading something?
I think burn is one of those lesser used rules referred to early on in the article, making them regular, but not in the most commonly known way. Some examples I can think of are spell > spelt and learn > learnt. Obviously these examples evolved. :)
This is so awesome! Unfortunately, your .pdf links take you right back to the post and not to the papers themselves...
Oh, I see, I skimmed too fast.
It's pretty interesting stuff, though.
I wonder if anyone has done similar work with the old-english plurals (ox -> oxen, for example)?
If anyone's interested, a few days ago I wrote up a more detailed post on the Pagel et al paper for my blog, henry.
I wonder why irregular verbs start out irregular?
Do all verbs start out irregular and then regularise over time?
#7: one way verbs become irregular is by combining the present tense of one word with the past tense of a different word that has fallen out of use in the present. IIRC, this is how we ended up with go/went; went used to be the past tense of "wendan" which means "to go".
Thanks, that I understand but in that case it was mixing different languages (Saxon and Angle possibly). Was the original past tense of go (in its original language) goed?
What I'm really asking, I guess, is: are all irregular verbs caused by such mixing of languages, or do some verbs start out irregular in their original language, and if so, why?
If irregularity is caused by adoption then we are reverting to type when we regularise.
Chris' Wills #9: Irregular verbs are born from phonological variations in regular verbs, and by compelling analogy with existing irregulars, as well as from suppletion (the process described by Lila #8).
The bewildering vowel changes in English irregular verbs are the remnants of formerly-regular verb classes, on which the now-dominant -ed class has encroached over the centuries. Many of these vowel changes used to be accompanied by suffixes, and they were (originally) predictable by the sounds surrounding them. An example of such a fossil class is the "swim" group, which shows i/a/u vowels in the present, past, and past participle forms. Almost all of the verbs in that class end in -m, -n, or -ng, which hints at what the enabling context used to be.
A wonderful introduction to this subject is Steven Pinker's Words and Rules.
ACW #10 Thank you. I'll have a look at Steve Pinker's book when I can.