(Disclaimer: this is not my field but the paper looked interesting so here goes …)
Promoters, enhancers and other DNA regulatory elements that turn gene transcription on or off are important. We’ve known this for quite a while. Many would argue that metazoans all have the same major gene families. Getting closer to us, most vertebrates have the same types of cells and have very similar genes and gene counts. That is not surprising, as most genes encode the different tools that go into making each major cell type found in all vertebrates. To rephrase this idea in a different manner (so that you won’t forget), all vertebrates have neurons, muscle cells and fibroblasts, and thus it is not surprising that vertebrates have almost the same collection of genes. What distinguishes one vertebrate from another is how these cells are specified and placed together. Activation of slightly different gene programs leads to modifications in cell migration and tissue patterning, and thus body shape. Thus in vertebrates it is thought (by many) that evolution works to a large extent on gene regulation … in other words, selection acts on
1) the DNA that turns on and off genes, and
2) the proteins that carry out this turning on/off of genes (transcription factors).
Now that doesn’t mean that protein-coding sequences have no role to play in vertebrate evolution. In fact, from recent debates (outside my field), it would seem that changes to protein-coding areas of the genome probably contribute significantly to evolution. But non-coding DNA regulatory elements are likely to play bigger roles.
From the ENCODE paper (see this entry) we learned that about half the conserved bits of DNA in mammals correspond to transcripts that go on to specify proteins. Much of the rest of this conserved DNA is controlling gene expression. Now in a new PLoS Genetics paper, we get a better look at how these Conserved Non-coding Elements (CNCs) vary across mammals and across vertebrates.
This analysis (from human, chimp, dog, mouse, rat, chicken, fugu and zebrafish genomes) demonstrates that there are quite a lot of these CNCs that are conserved, but also quite a number that are undergoing rapid evolution.
Here is a figure from the paper that indicates how these CNCs are changing in the mouse genome relative to the other mammals. In panel A we see how the mouse CNCs stack up to their mammalian counterparts. A p-value near 0 indicates an increased rate of change, and a p-value near 1 indicates a decreased rate.
From the text:
At the significance level of 0.001, 1027 (1.2%) and 503 (0.6%) mammalian CNCs show speed-ups and slow-downs, respectively. Among amniotic CNCs, 228 (1.4%) and 106 (0.6%) show speed-ups and slow-downs, respectively on the mouse lineage.
In panel B we get a closer look at the fast and slowly evolving CNCs in mouse:
Fast- and slow-evolving CNCs are indicated in red and blue, respectively. The violet dashed horizontal line shows the genome-wide average substitution rate on the mouse lineage for unconstrained regions near the fast-evolving CNCs.
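To make the p-value reading concrete, here is a minimal sketch of how per-CNC p-values like those in panel A could be turned into speed-up and slow-down calls. It assumes a one-sided test in which slow-downs are called at p > 1 - 0.001; that is my reading of the description, not something the quoted text spells out. The function name `classify_cnc` and the example p-values are hypothetical.

```python
ALPHA = 0.001  # significance level quoted from the paper


def classify_cnc(p_value, alpha=ALPHA):
    """Label one CNC from its lineage-specific p-value.

    Assumption: the test is one-sided, so p near 0 means a speed-up and
    p near 1 (here, p > 1 - alpha) means a slow-down.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value < alpha:
        return "speed-up"
    if p_value > 1.0 - alpha:
        return "slow-down"
    return "no significant change"


# Hypothetical p-values, made up for illustration only.
example_p_values = {"cnc_a": 0.0002, "cnc_b": 0.43, "cnc_c": 0.9997}
labels = {name: classify_cnc(p) for name, p in example_p_values.items()}
print(labels)
```

Under this reading, most CNCs land in the middle bucket, and only the two tails get counted as speed-ups or slow-downs.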
So what does this tell us?
We estimate that 68% (54,643/81,957) of the mammalian CNCs evolve at a single rate. The remaining nonneutral CNCs show rate changes on at least one lineage.
So about 2/3rds of the mammalian CNCs are changing at a constant rate in every lineage. Of the 1/3rd that is left, half are slowing down in some lineages (but are changing at the same rate in the other lineages) and half are speeding up, again in only a subset of the lineages. As for humans and chimps,
… there are 638 and 530 CNCs that show rate speed-ups on the human and chimpanzee lineages, respectively, far more than the four and eight CNCs, respectively, showing slow-downs.
Dogs also show more speed-ups than slow-downs. Since these speed-ups are faster than the expected rate of neutral change, these changes are likely due to an acceleration in evolution (i.e. positive selective change).
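For anyone who wants to check the arithmetic behind the numbers quoted above, here is a small sketch using only the counts that appear in the quotes. One assumption on my part: that the mouse-lineage percentages are taken relative to the same 81,957 mammalian CNCs. Note that the raw ratio 54,643/81,957 comes out at roughly two-thirds, a little under the quoted 68%; I can't tell from the quote where the difference comes from.

```python
# Counts copied from the quoted passages.
total_mammalian_cncs = 81_957
single_rate_cncs = 54_643

single_rate_fraction = single_rate_cncs / total_mammalian_cncs
print(f"single-rate fraction: {single_rate_fraction:.1%}")

# Mouse lineage, mammalian CNCs (assumed denominator: all 81,957 CNCs).
mouse_speedup_fraction = 1_027 / total_mammalian_cncs
mouse_slowdown_fraction = 503 / total_mammalian_cncs
print(f"mouse speed-ups: {mouse_speedup_fraction:.2%}")
print(f"mouse slow-downs: {mouse_slowdown_fraction:.2%}")

# Human and chimpanzee lineages: speed-ups versus slow-downs.
speedups = {"human": 638, "chimpanzee": 530}
slowdowns = {"human": 4, "chimpanzee": 8}
ratios = {lineage: speedups[lineage] / slowdowns[lineage] for lineage in speedups}
for lineage, ratio in ratios.items():
    print(f"{lineage}: {ratio:.1f} speed-ups per slow-down")
```

The human and chimp ratios are the striking part: speed-ups outnumber slow-downs by well over an order of magnitude on both lineages.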
So what is changing? From the paper:
We next looked at whether CNCs showing significant rate speed-ups are more likely to be in the proximity of particular kinds of genes, using the PANTHER GO database. A significant difficulty in this sort of analysis is that even for those CNCs that act as cis-regulators, it is unknown which of the nearby genes is being regulated. However, as a rather imperfect proxy for this we simply used, for each CNC, the nearest gene (in either orientation). For each branch of the mammalian tree, we divided the CNCs into those with increased rate on that branch (by AIC) and used CNCs evolving under the null model as “neutral” controls. We looked at whether particular biological process categories were enriched among the nearest genes of the selected CNCs compared to the neutral CNCs.
For mammalian CNCs, there is significant enrichment of the process categories “amino acid activation” and “other coenzyme and prosthetic group metabolism” on the dog and the lineage leading to the common ancestor of mouse and rat (rodent lineage), respectively, at p < 0.05 after Bonferroni adjustment. We also tested whether any categories show repeated evidence for enrichment on different branches of the tree. For mammalian CNCs, the "sensory perception" category appears in the top ten enriched biological processes for three out of the seven lineages. However, in summary, we view these GO associations as rather tentative, since none of them is highly significant or highly repeatable across branches of the tree. Complete results from this analysis are presented in Tables S6 and S7.
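The two steps described in these passages (assign each CNC to its nearest gene, then ask whether a category is over-represented, with a Bonferroni adjustment) can be sketched roughly as follows. This is only an illustration under my own assumptions: the quoted text does not say which enrichment test was used, so I have swapped in a one-sided hypergeometric test, and every name, coordinate and count below is hypothetical.

```python
from dataclasses import dataclass
from math import comb


@dataclass(frozen=True)
class Interval:
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def gap(a, b):
    """Base pairs between two intervals on one chromosome (0 if they touch or overlap)."""
    return max(0, a.start - b.end, b.start - a.end)


def nearest_gene(cnc, genes):
    """Closest gene on the CNC's chromosome, ignoring strand; None if there is none."""
    candidates = [g for g in genes if g.chrom == cnc.chrom]
    return min(candidates, key=lambda g: gap(cnc, g)) if candidates else None


def enrichment_p(hits, selected, category_total, population):
    """One-sided hypergeometric P(X >= hits): chance of seeing at least `hits`
    category genes among `selected` draws from `population` genes, of which
    `category_total` are in the category. (Assumed test, not the paper's.)"""
    upper = min(selected, category_total)
    ways = sum(
        comb(category_total, i) * comb(population - category_total, selected - i)
        for i in range(hits, upper + 1)
    )
    return ways / comb(population, selected)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return {name: min(1.0, p * m) for name, p in p_values.items()}


# Hypothetical coordinates, made up for illustration.
genes = [
    Interval("GENE_A", "chr1", 1_000, 5_000),
    Interval("GENE_B", "chr1", 20_000, 30_000),
    Interval("GENE_C", "chr2", 100, 900),
]
cnc = Interval("cnc_1", "chr1", 6_000, 6_200)
print(nearest_gene(cnc, genes).name)

# Hypothetical counts: (category genes near sped-up CNCs, category genes overall).
n_selected, n_population = 50, 1_000
category_counts = {"category_x": (12, 72), "category_y": (3, 60)}
raw = {
    name: enrichment_p(hits, n_selected, total, n_population)
    for name, (hits, total) in category_counts.items()
}
adjusted = bonferroni(raw)
print(adjusted)
```

The sketch also shows why the authors call the nearest-gene step an imperfect proxy: `nearest_gene` picks whichever gene is closest in base pairs, with no way of knowing whether that gene is the one actually being regulated.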
I looked at these tables (especially this one) and … yeah it looks like a collection of random items, although two of the categories near rapidly changing human CNCs were “sensory perception” and “neuronal activities” …