Where the variation comes from.

Evolution proceeds by the action of several different forces on heritable variation. Natural selection increases the frequency of variants that allow individuals to produce more offspring who, themselves, produce offspring. Genetic drift changes the frequency of variants through random sampling of individuals from one generation to the next. Population subdivision divides the variation into isolated groups where other forces (selection, drift, etc.) act upon it. But where does all this variation come from?

Given the title of the post, the subtitle of the post, and your general understanding of biology, it should be pretty obvious that the variation comes from mutation. The purpose of this essay is to explain the different types of mutations that can contribute to heritable variation in populations. We will also explore how evolutionary forces act upon the different mutations.

Genomes are divided into chromosomes. Each chromosome contains a unique set of DNA sequences. Some organisms contain two copies of each chromosome (one from their mother and one from their father), and others only have one copy of each chromosome (inherited from a single parent). Each chromosome is made up of sequences that perform specific functions (some encode proteins, some determine when the protein coding sequences are expressed, and others encode other functional elements) and non-functional sequences (junk DNA). We will not concern ourselves much with the specific categories the DNA sequences can fall into.

For the purpose of this treatment of mutations, we will divide the types of mutations into four classes:

  • Substitutions: changing the information in the genome.

  • Rearrangements: rearranging the information in the genome.

  • Insertions: increasing the amount of information in the genome.

  • Deletions: decreasing the amount of information in the genome.

Substitutions, also known as point mutations, result in the change of a nucleotide into another nucleotide. DNA sequences are made up of a series of four different nucleotides: adenine, thymine, guanine, and cytosine (symbolized by the letters A, T, G, and C, respectively). A genome sequence consists of an arrangement of these nucleotides in a specified order (think about it as a book written in a language consisting of four letters). If one of the nucleotides is changed, a substitution or point mutation has occurred.
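To make the "book written in four letters" picture concrete, here is a minimal sketch of a point mutation on a sequence stored as a string. The function name `substitute` and the example sequence are made up for illustration; nothing here models how substitutions actually arise in a cell.

```python
NUCLEOTIDES = "ATGC"

def substitute(seq: str, pos: int, new_base: str) -> str:
    """Return a copy of seq with the nucleotide at index pos changed to new_base."""
    if not 0 <= pos < len(seq):
        raise IndexError("position is outside the sequence")
    if new_base not in NUCLEOTIDES:
        raise ValueError(f"not a nucleotide: {new_base!r}")
    if seq[pos] == new_base:
        raise ValueError("new base equals the old base, so nothing changed")
    # Everything before and after the mutated position is left untouched.
    return seq[:pos] + new_base + seq[pos + 1:]

original = "ATGGCCTAA"                  # hypothetical toy sequence
mutant = substitute(original, 4, "T")   # C -> T at index 4
print(original)  # ATGGCCTAA
print(mutant)    # ATGGTCTAA
```

Note that the length of the sequence does not change: a substitution alters the information at one position without adding or removing anything.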

The effects of a substitution depend on the type of sequence in which the mutation occurs. If it occurs in a DNA sequence that lacks any function (so-called junk DNA), it will be neutral. Point mutations can also occur in sequences that encode proteins or sequences that regulate the expression of protein coding sequences. A substitution within a protein coding sequence may alter the protein that the sequence encodes. If so, the protein may be rendered nonfunctional, in which case the individual harboring that mutation will be less fit than other individuals. There is also the possibility that the mutation renders the individual more fit because the new protein sequence is better than the old one. Similar scenarios could be imagined for point mutations in other functional DNA sequences.
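As a rough illustration of how a substitution in a coding sequence may or may not alter the encoded protein, the sketch below translates a toy sequence before and after a point mutation. The codon table is a deliberately tiny subset of the standard genetic code, and the function names (`translate`, `classify`) and sequences are hypothetical. The labels describe only the change in protein sequence; they say nothing about the fitness of the individual carrying the mutation.

```python
# A small subset of the standard genetic code ("*" marks a stop codon).
CODON_TABLE = {
    "ATG": "M", "TGG": "W", "GCC": "A", "GCT": "A",
    "GTC": "V", "TGA": "*", "TAA": "*",
}

def translate(seq: str) -> str:
    """Translate complete codons, reading from the first nucleotide."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

def classify(original: str, mutant: str) -> str:
    """Label a substitution by its effect on the encoded protein sequence."""
    before, after = translate(original), translate(mutant)
    if before == after:
        return "synonymous"      # same protein despite the DNA change
    stop_before, stop_after = before.find("*"), after.find("*")
    if stop_after != -1 and (stop_before == -1 or stop_after < stop_before):
        return "nonsense"        # a stop codon now appears earlier
    return "missense"            # one amino acid replaced by another

reference = "ATGTGGGCCTAA"                  # translates to M W A stop
print(classify(reference, "ATGTGGGCTTAA"))  # synonymous (GCC -> GCT)
print(classify(reference, "ATGTGGGTCTAA"))  # missense   (GCC -> GTC)
print(classify(reference, "ATGTGAGCCTAA"))  # nonsense   (TGG -> TGA)
```

A synonymous change is a good candidate for a neutral mutation even inside a gene, while a premature stop is a likely route to the nonfunctional protein described above.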

Genomic rearrangements include events such as fusions and fissions of chromosomes, inversions, and translocations (see figure below). The scale of these events can range from a region as small as a gene (a small part of a chromosome) to large portions of chromosomes to entire chromosomes. Fusion events, such as the one that occurred in the human lineage after its divergence from chimpanzees, join together two complete chromosomes, whereas translocations occur when part of a chromosome is moved to another part of the same chromosome or to a different chromosome.
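A minimal sketch of fusion and translocation, treating each chromosome as an ordered list of gene labels. The labels and the function names `fuse` and `translocate` are invented for illustration, and the sketch only covers a translocation between two different chromosomes.

```python
def fuse(chrom_a: list, chrom_b: list) -> list:
    """Join two complete chromosomes end to end."""
    return chrom_a + chrom_b

def translocate(source: list, start: int, end: int,
                target: list, insert_at: int) -> tuple:
    """Move source[start:end] into target at index insert_at."""
    segment = source[start:end]
    new_source = source[:start] + source[end:]
    new_target = target[:insert_at] + segment + target[insert_at:]
    return new_source, new_target

chr1 = ["a", "b", "c", "d"]   # hypothetical gene order
chr2 = ["w", "x", "y", "z"]

print(fuse(chr1, chr2))
# ['a', 'b', 'c', 'd', 'w', 'x', 'y', 'z']
print(translocate(chr1, 1, 3, chr2, 2))
# (['a', 'd'], ['w', 'x', 'b', 'c', 'y', 'z'])
```

In both cases the total gene content is unchanged. Only the arrangement differs, which is what separates rearrangements from insertions and deletions.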


Inversions rearrange the genetic content within a single chromosome. They can also play an important role in speciation or contain alleles that confer fitness benefits. One reason is that inversions suppress recombination between different arrangements (chromosomes carrying different inversions do not easily exchange alleles). But it's important to understand that rearrangements occur at a much lower frequency than point mutations. Comparing fitness effects of rearrangements and substitutions is a bit trickier, and we won't deal with that here.
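For readers who like to see the operation spelled out, here is a small sketch of an inversion on one strand of a sequence. Because DNA is double stranded, flipping a segment means the strand we are reading now carries the reverse complement of the original segment. The function name `invert` and the sequence are hypothetical.

```python
# Map each nucleotide to its base-pairing partner.
COMPLEMENT = str.maketrans("ATGC", "TACG")

def invert(seq: str, start: int, end: int) -> str:
    """Replace seq[start:end] with its reverse complement."""
    segment = seq[start:end]
    flipped = segment.translate(COMPLEMENT)[::-1]
    return seq[:start] + flipped + seq[end:]

chromosome = "AAACCGTTT"            # hypothetical toy sequence
inverted = invert(chromosome, 3, 6)
print(inverted)                     # AAACGGTTT
print(invert(inverted, 3, 6))       # AAACCGTTT (a second inversion restores it)
```

The sketch shows only the change in arrangement. It does not model the suppression of recombination between arrangements described above.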

The next two classes of mutation result in the addition or loss of genetic material in the genome. Insertions increase the net content of a genome. There are multiple sources of the genetic material that enters the genome. It may arise de novo (synthesis of novel sequence), it may come from outside the genome, or it may be a duplicated copy of something from within the genome of interest. Insertions of novel sequence tend to be small (on the order of a few nucleotides), whereas insertions of material from outside or within the genome can be as large as a single gene or even multiple genes. We will focus on these types of insertions.

The ability of material from outside of a genome to insert into the genome in question (and our understanding of such events) depends on what types of sequence we are studying. In eukaryotes (plants, animals, fungi, and various microbes), viral sequences often move from individual to individual, inserting themselves into genomes. This horizontal transmission of genetic information is even more common in bacteria due to biological properties of these organisms. The uptake of genetic material from extragenomic sources has the potential to cause disease (e.g., viruses) or may lead to novel genes which allow an organism to perform a new function (e.g., bacteria picking up genes for antibiotic resistance from other individuals or from the environment).

Genomic information can also be duplicated within a genome. This can occur via various mechanisms, sometimes even aided by virus-like sequences moving within a single genome. Entire blocks of genetic material can be duplicated via mechanisms that we're still working to understand. Another common mechanism occurs when DNA sequences are transcribed to RNA, then reverse transcribed back into DNA and inserted back into the genome. Duplications allow genes or other DNA sequences to explore mutational space that would be inaccessible if they only existed in a single copy. That's because many point mutations are deleterious, but if there is a copy of a sequence that maintains the original function, a duplicate copy can accumulate mutations that interfere with the original function. Many of these mutations will lead to a non-functional duplicate copy, but some may lead to a sequence with a new function that would not be possible with a single copy because the single copy must maintain the original function.
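The logic of "one copy keeps the old function while the other is free to change" can be sketched in a few lines. This toy example assumes a tandem duplication (the copy lands right next to the original), which is only one of the possible outcomes; the function name `duplicate` and the sequences are made up.

```python
def duplicate(seq: str, start: int, end: int) -> str:
    """Tandem duplication: insert a copy of seq[start:end] right after it."""
    return seq[:end] + seq[start:end] + seq[end:]

gene = "ATGGCCTAA"                  # hypothetical gene
genome = "TT" + gene + "CC"         # gene occupies indices 2 to 10

duplicated = duplicate(genome, 2, 11)
print(duplicated)                   # TTATGGCCTAAATGGCCTAACC

# Mutate one base in the second copy (index 15 is C in the duplicate).
diverged = duplicated[:15] + "T" + duplicated[16:]
print(diverged)                     # TTATGGCCTAAATGGTCTAACC

# The original copy is still intact, so its function is still available.
print(gene in diverged)             # True
```

With a single copy, the same C to T change would have replaced the only working version of the gene.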

In addition to accumulating new content, a genome can also lose content. These events can occur on various scales, ranging from a few nucleotides to large chunks of chromosomes. Larger deletions tend to be more deleterious than smaller ones, but the content of a deletion affects its fitness cost as well. For example, the deletion of a small region containing an essential gene will be far more deleterious than a large deletion of non-functional sequence. Not all deletions will be deleterious, and it's possible that they may even confer a fitness benefit if they delete a sequence that is deleterious. We can gain a fair bit of understanding of the evolutionary dynamics by exploring the frequencies of deleted sequences in natural populations. Three recent studies (reviewed here, here, and here) have performed such an analysis and found many common deletions within human populations.
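The point that content matters more than size can be sketched by asking whether a deletion overlaps an annotated region. Everything in this example is hypothetical: the coordinates, the single "essential gene", and the function names `delete` and `overlaps`. Intervals are half-open, so `(start, end)` covers positions `start` up to but not including `end`.

```python
def delete(seq: str, start: int, end: int) -> str:
    """Remove seq[start:end] from the sequence."""
    return seq[:start] + seq[end:]

def overlaps(deletion: tuple, region: tuple) -> bool:
    """True if two half-open (start, end) intervals share any position."""
    return deletion[0] < region[1] and region[0] < deletion[1]

print(delete("AAACCCGGG", 3, 6))    # AAAGGG

essential_gene = (100, 130)         # hypothetical annotation
small_deletion = (110, 115)         # 5 nucleotides, inside the gene
large_deletion = (200, 800)         # 600 nucleotides of non-functional sequence

print(overlaps(small_deletion, essential_gene))  # True
print(overlaps(large_deletion, essential_gene))  # False
```

Under these assumptions the 5 nucleotide deletion is the one to worry about, even though the other deletion removes 120 times as much sequence.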

The different types of mutations vary in the frequency at which they occur (point mutations are more common than rearrangements, insertions, and deletions), but there is also variation within the classes. For example, different sizes and types of rearrangements, insertions, and deletions occur at different frequencies. Additionally, certain substitutions are more frequent than others (see here for more details). And the fitness costs of these mutations depend on multiple factors, including the size of the events, in which types of sequence they occur or which sequences they contain, and what other mutations are associated with the mutations.


I left out a lot, Reed :)

Emoticons aside, are you talking about the ability of recombination to induce rearrangements, substitutions, and indels OR recombination shuffling genetic diversity amongst homologs? In other words, recombination as the ultimate cause of the above mutations, or the results of simple meiotic recombination?

The results of simple meiotic recombination.

There is a lot of stuff floating out there where people (scientists included) argue that "recombination" not "mutation" causes nearly all of biological variation. It pops up in creationist literature sometimes.

This is just a misunderstanding because recombination is a type of mutation, in that it changes DNA sequences, and because recombination has no effect if other forms of mutation hadn't already taken place.

"recombination is a type of mutation" -- agree completely.

"recombination has no effect if other forms of mutation hadn't already taken place" partly agree, partly disagree, maybe ambiguous.

Recombination can move a gene into a set of genes that is regulated differently than the gene was before it moved. Hence it can be very different developmentally, can be much more or much less linked to another gene, and can be promoted or repressed or otherwise affected by environmental things that wouldn't have affected it before.

I've presented over a dozen papers on mathematical population biology and mathematical proteomics and the like, and am currently over 80 pages into writing a paper hosted at a wiki, on the Shannon channel capacity of evolution by natural selection -- triggered by an ID advocate who, I thought, asked an interesting question, however imperfectly articulated. I'd like to quote parts of your simplified explanation of Mutation in the paper, with proper citation, okay?

Dialog is great. This science blog is great. Keep up the fine work!

I'm not a geneticist, and no pro on evolution, but doesn't ToE require mutations to be random? Recombination, from what I've read, is not a random event. If randomness is taken out of variation, then natural selection cannot play a creative role; instead, creativity comes from within the organism itself. KC

Let's just abandon the term random (because lots of people throw that term around without any formal understanding of probability) and refer to mutation as stochastic. All the forms of mutation are stochastic events. We can assign a probability that the event will occur, but we cannot say for certain that an event will occur. They are random draws from some distribution.

Recombination is just as "random" as all other forms of mutation.