Multicellular Eukaryotic Genomes are Teh Suck

The human genome is one big, bloated motherfucker. It's almost all non-protein-coding DNA. The same is true for many other eukaryotic genomes. Sure, some of it has a function. But a whole lot of it (and maybe most of it) is just junk.

There are some who point to a relationship between genome size and organismal complexity and argue that those large genomes are necessary to explain the complexity they observe. There are others who disagree -- T.R. Gregory at Genomicron being one of the more vocal objectors in the geno-blogosphere. First off, how do you measure complexity? Second of all, what species will you sample for your measures of genome size? And how will you measure genome size? By number of basepairs? By number of genes?

Anyway, a couple of weeks ago, Gregory posted this figure from a Science News article entitled Genome 2.0:


The amount of non-coding DNA is plotted for various species with sequenced genomes. The bars in light blue are bacterial species, the dark gray is brewers' yeast, the green bar is a plant, the purple bars are two invertebrates, and the orange bar is humans. The article suggests that humans have so much non-coding DNA because it serves an important function -- it's what makes us so gosh darn complex. There's also the underlying "chain of being" message in the figure, with the "primitive" species on the left end of the X-axis and the "advanced" species on the right end.

But if you thought that figure was bad, Larry Moran provided an even better worse one:


Here, the great chain of being is even more prominently displayed. There's also so much paraphyly and polyphyly it'll make a cladist's head explode. Why group unicellular eukaryotes together, excluding all other eukaryotes? Why group invertebrates together, excluding vertebrates? Why group chordates, excluding vertebrates? Why exclude humans from the vertebrates? And, despite what some people may claim, fungi and plants are distant relatives.

Based on the position of the dog's ass in the second bar graph, Gregory has dubbed these types of plots Dog's Ass Plots (DAPs):

Indeed, the human datum would accurately be placed roughly below the dog's ass in this figure if it included a proper sampling of diversity.

Here are some other recent posts from Genomicron on the topic of the misrepresentation of genome size and non-coding DNA:

And, don't forget, creationists love to misuse genome size data when arguing against evolution.


I agree this complexity axis is nonsense, but I wonder if %noncoding could be tied to something else... perhaps reproductive rate or longevity? Genome size (by base pairs)? Number of distinct species within 1,000,000 years of evolution?

Whatever it correlates to, that will probably just correlate back to each species' tolerance for transposons and retroviruses.

Hi, I agree that this junk DNA is overestimated. Some of this non-coding DNA could have a regulatory or structural function, or could also act as a buffer against mutation. Maybe "junk" is not the most appropriate word to describe it. In other words, if these non-coding regions were really useless, they would have been discarded by evolution many millions of years ago.

On the other hand, the genome is just a draft. The non-coding regions are assigned based on conclusions about the ORFome. There are probably many small genes in that huge stretch of non-coding DNA that were initially rejected in the ORFome study. Remember that not all genes code for a protein, and many small RNAs have functional properties.

Best Regards,