Junk DNA is still junk


The ENCODE project made a big splash a couple of years ago. It is a huge project not only to ask what the sequence of a strand of human DNA was, but to analyze and annotate it and try to figure out what it was doing. One of the very surprising results was that in the sections of DNA analyzed, almost all of the DNA was transcribed into RNA, which sent the creationists and the popular press into unwarranted flutters of excitement that maybe all that junk DNA wasn't junk at all, if enzymes were busy copying it into RNA. This was an erroneous assumption; as John Timmer pointed out, the genome is a noisy place, and coupled with the observation that the transcripts were not evolutionarily conserved, this suggested that these were non-functional transcripts.

Personally, I fall into the "it's all junk" end of the spectrum. If almost all of these sequences are not conserved by evolution, and we haven't found a function for any of them yet, it's hard to see how the "none of it's junk" view can be maintained. There's also an absence of support for the intervening view, again because of a lack of evidence for actual utility. The genomes of closely related species have revealed very few genes added from non-coding DNA, and all of the structural RNA we've found has very specific sequence requirements. The all-junk view, in contrast, is consistent with current data.

Larry Moran was dubious, too: the transcripts could easily be artifactual.

The most widely publicized result is that most of the human genome is transcribed. It might be more correct to say that the ENCODE Project detected RNAs that are either complementary to much of the human genome or lead to the inference that much of it is transcribed.

This is not news. We've known about this kind of data for 15 years and it's one of the reasons why many scientists over-estimated the number of human genes in the decade leading up to the publication of the human genome sequence. The importance of the ENCODE project is that a significant fraction of the human genome has been analyzed in detail (1%) and that the group made some serious attempts to find out whether the transcripts really represent functional RNAs.

My initial impression is that they have failed to demonstrate that the rare transcripts of junk DNA are anything other than artifacts or accidents. It's still an open question as far as I'm concerned.

I felt the same way. ENCODE was spitting up an anomalous result, one that didn't fit with any of the other data about junk DNA. I suspected a technical artifact, or an inability of the methods used to properly categorize low frequency accidental transcription in the genome.

Creationists thought it was wonderful. They detest the idea of junk DNA. That the gods would scatter wasteful garbage throughout our precious genome by intent was unthinkable, so any hint that it might actually do something useful is enthusiastically seized upon as evidence of purposeful design.

Well, score one for the more cautious scientists, and give the creationists another big fat zero (I think the score is somewhere in the neighborhood of a big number requiring scientific notation to be expressed for the scientists, against a nice, clean, simple zero for the creationists). A new paper has come out that analyzes transcripts from the human genome using a new technique, and, uh-oh, it looks like most of the early reports of ubiquitous transcription were wrong.

Here's the authors' summary:

The human genome was sequenced a decade ago, but its exact gene composition remains a subject of debate. The number of protein-coding genes is much lower than initially expected, and the number of distinct transcripts is much larger than the number of protein-coding genes. Moreover, the proportion of the genome that is transcribed in any given cell type remains an open question: results from "tiling" microarray analyses suggest that transcription is pervasive and that most of the genome is transcribed, whereas new deep sequencing-based methods suggest that most transcripts originate from known genes. We have addressed this discrepancy by comparing samples from the same tissues using both technologies. Our analyses indicate that RNA sequencing appears more reliable for transcripts with low expression levels, that most transcripts correspond to known genes or are near known genes, and that many transcripts may represent new exons or aberrant products of the transcription process. We also identify several thousand small transcripts that map outside known genes; their sequences are often conserved and are often encoded in regions of open chromatin. We propose that most of these transcripts may be by-products of the activity of enhancers, which associate with promoters as part of their role as long-range gene regulatory sites. Overall, however, we find that most of the genome is not appreciably transcribed.

So, basically, they directly compared the technique used in the ENCODE analysis (the "tiling" microarray analysis) to more modern deep sequencing methods, and found that the old results were mostly artifacts of the protocol. They also directly examined the pool of transcripts produced in specific tissues, and asked what proportion of them came from known genes, and what part came from what has been called the "dark matter" of the genome, or what has usually been called junk DNA. The cell's machinery to transcribe genes turns out to be reasonably precise!

To assess the proportion of unique sequence-mapping reads accounted for by dark matter transcripts in RNA-Seq data, we compared the mapped sequencing data to the combined set of known gene annotations from the three major genome databases (UCSC, NCBI, and ENSEMBL, together referred to here as "annotated" or "known" genes). When considering uniquely mapped reads in all human and mouse samples, the vast majority of reads (88%) originate from exonic regions of known genes. These figures are consistent with previously reported fractions of exonic reads of between 75% and 96% for unique reads, including those of the original studies from which some of the RNA-Seq data in this study were derived. When including introns, as much as 92%-93% of all reads can be accounted for by annotated gene regions. A further 4%-5% of reads map to unannotated genomic regions that can be aligned to spliced ESTs and mRNAs from high-throughput cDNA sequencing efforts, and only 2.2%-2.5% of reads cannot be explained by any of the aforementioned categories.
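As a quick sanity check on the figures quoted above, the percentages can be added up directly. This is only an arithmetic sketch: the numbers are the ones in the quoted passage, while the variable and function names are made up for illustration.

```python
# Read-accounting figures from the passage quoted above, as (low, high) percentages.
exonic = (88.0, 88.0)         # uniquely mapped reads in exons of known genes
annotated = (92.0, 93.0)      # exons plus introns of known genes
est_supported = (4.0, 5.0)    # unannotated regions that align to spliced ESTs/mRNAs
unexplained = (2.2, 2.5)      # reads in none of the categories above


def span_sum(*ranges):
    """Add (low, high) ranges end to end."""
    return (sum(r[0] for r in ranges), sum(r[1] for r in ranges))


def span_diff(a, b):
    """Loosest bounds on a - b for two (low, high) ranges."""
    return (a[0] - b[1], a[1] - b[0])


# Share of reads that fall in introns of known genes.
intronic_only = span_diff(annotated, exonic)

# The three top-level categories should cover roughly all reads.
total = span_sum(annotated, est_supported, unexplained)

print("intronic only:", intronic_only)
print("total accounted for:", total)
```

The categories sum to about 100% (the small slack comes from the quoted ranges being rounded), so the unexplained "dark matter" reads really are just a couple of percent of the total.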

Furthermore, when they looked at where the mysterious transcripts are coming from, they found that they come most frequently from regions of DNA near known genes, not just out of deep intergenic regions. This also suggests that they're an artifact, like an extended transcription of a gene, or that they come from other possibly regulatory bits, like pasRNA (promoter-associated small RNAs). There's a growing cloud of xxxRNA acronyms out there, but while some of them may be extremely useful, like siRNA, they're still tiny as a fraction of the total genome. Don't look for demolition of the concept of junk DNA here.

There clearly are still mysteries in there (they do identify a few novel transcripts that come up out of the intergenic regions), but they are small and rare, and the fact of their existence does not imply a functional role, since they could simply be byproducts of other processes. Demonstrating that they actually do something will require experiments in genetic perturbation.

The bottom line, though, is the genome is mostly dead, transcriptionally. The junk is still junk.


van Bakel H, Nislow C, Blencowe BJ, Hughes TR (2010) Most "Dark Matter" Transcripts Are Associated With Known Genes. PLoS Biology 8(5):1-21.


All this science on a religion bashing blog.

Does anyone know where I can find a good religion bashing blog?

By Rev. BigDumbChimp (not verified) on 19 May 2010 #permalink

You know Rev, I was thinking the exact same thing.

Interesting stuff.

I await with bated breath, the big I TOLD YOU SO, BITCHES!!! post on Sandwalk. With much less anticipation, I await Casey Luskin's excuses about how this result is wrong/doesn't prove anything/Oh Look! Abiogenesis doesn't work!/.....

By Eamon Knight (not verified) on 19 May 2010 #permalink

The finding that "Most "Dark Matter" Transcripts Are Associated With Known Genes" is not inconsistent with the previous claims. Remember, the claim was not about the level of transcription (known genes were accepted as the source of most transcripts). The previous claim was that there was 'some' level of transcription over most of the genome. I guess there will be a bit of nitpicking between the various groups over the term 'appreciable'.

Well yes, but they can ignore anything at all.

Even if 98% of the DNA were not junk, they'd still need to explain junk like vestigials and why genome duplications provided a lot of the "junk DNA" that was turned into other functions. Gee, does the "designer" have to wait for a whole genome duplication to start working, or, indeed, does the Designer produce such duplications in order to start designing?

Neither fits their Designer, who clearly is God, but then what in earth or in the heavens does?

Glen D
http://tinyurl.com/mxaa3p

By Glen Davidson (not verified) on 19 May 2010 #permalink

The Black-Eyed Peas' shortest song ever:

♬"Whatcha gonna do with all that junk
All that junk inside your nucleus?"♩
♫"I'ma do nothing. Most of it is just garbage."♪

By Brownian, OM (not verified) on 19 May 2010 #permalink

Now now, there's more ways for a piece of DNA to carry out a function than just holding protein information.

Some of the junk DNA (heterochromatin) is clearly involved in how homologous chromosomes are able to pair for meiotic segregation. In that case, it's not the specific sequences themselves, but the size and composition of the heterochromatic regions that impact pairing efficacy. Likewise, centromeres are not identified by a specific sequence (well, except in some yeasts with very specialized single-microtubule centromeres) but by an epigenetic state based on both the junk DNA and the modifications of their histones. And the coevolution of the tail of centromere-specific histones (CenpA/Cid) with the predominant centromeric satellite sequences in a species is an intriguing observation.

Consider the analogy of how much of your house do you actually "use". I mean, there's the space you're personally taking up, and the spot where your computer is, but then there's all that empty space everywhere that just isn't doing anything...

Junk DNA is like the big bezel around the edge of an iPad. It doesn't actually do anything itself, but the device would be much harder to use without it.

The Black-Eyed Peas' shortest song ever:
♬"Whatcha gonna do with all that junk
All that junk inside your nucleus?"♩
♫"I'ma do nothing. Most of it is just garbage."♪

whew

that put a hurtin' on me

By Rev. BigDumbChimp (not verified) on 19 May 2010 #permalink

I and a few others have been doing online battle with a raving fundie about junk DNA. He's one of the ones who had read the previous literature that got the god squad so excited.

When I pressed him once recently to explain the logic he used to come to the conclusion that that data supported creationism, his response was one simple sentence: "God don't make junk."

Thanks for posting this, PZ.

Time now to go make a creationist's head explode.

Glen Davidson:

Well yes, but they can ignore anything at all.
Even if 98% of the DNA were not junk, they'd still need to explain junk like vestigials and why genome duplications provided a lot of the "junk DNA" that was turned into other functions. Gee, does the "designer" have to wait for a whole genome duplication to start working, or, indeed, does the Designer produce such duplications in order to start designing?
Neither fits their Designer, who clearly is God, but then what in earth or in the heavens does?

Of course they can still explain the junk; THE FALL silly. Of course they have no way to explain the about face from 'the genome is perfect' to 'the genome bears all the flaws of the fall'. Science can change but religions shouldn't do that too often. Makes the rubes begin to question the whole infallibility bit.

By Dornier Pfeil (not verified) on 19 May 2010 #permalink

Maybe that's why I collect so much junk: my genes are full of collected junk and I'm trying to represent!

give the creationists another big fat zero

You are much too kind. The negative numbers were invented so that we would have an appropriate scale against which to measure the accomplishments of the creationists. And it's not bounded below!

Even if the junk DNA does turn out to have some function, what's the point of poison ivy?

By cervantes (not verified) on 19 May 2010 #permalink

Well I'm not a biologist and I don't really know much about junk DNA, but it seems pretty absurd that there would be a bright red line between junk DNA and functional DNA. There's got to be DNA that used to be functional but it is so long ago it has been orphaned. There's got to be junk DNA which hits upon a random combination whereby it becomes functional in some small degree. If this is true then we only await the ability to analyze DNA to the levels that we can pick up on that nuance.

Of course, this is not what creationists are looking for. But the point is that we should not say all junk DNA is junk; the point should be to say:
-The genetic process is messy and DNA looks entirely like what we'd expect from naturalistic explanation.
-There is some DNA which is useless and plenty more which is suboptimal which is not what we'd expect from a theistic explanation.

By Prometheus (not verified) on 19 May 2010 #permalink

whole new context for the junk in my trunk...I need to go somewhere that explains this s-l-o-w-l-y. Is there a "DNA transcription for dummies" site anywhere?

Of course they can still explain the junk; THE FALL silly. Of course they have no way to explain the about face from 'the genome is perfect' to 'the genome bears all the flaws of the fall'. Science can change but religions shouldn't do that too often. Makes the rubes begin to question the whole infallibility bit.

While I am sure many fundies are going to be making an about face because they have largely rejected the teachings about the Fall for other reasons, others have actually been saying the genome bears the marks of the Fall for a long time.

You'd think the delusional would have better imaginations. I don't know why Creationists even bother when there are much simpler and less mentally demanding ways for them to ignore reality:

- God made that so-called "junk" DNA to serve a purpose that is too complicated for mere mortals to comprehend. (If science does happen to figure out a purpose then we can praise God for his infinite genius. If science does not discover a purpose then we can praise God for his incomprehensible genius.)
- God created the junk DNA. It serves no purpose. So what? God can do what he wants, it's His universe.
- God put that junk DNA in there to test our faith.
- Satan put it there to sow doubt.

I'll go no further. It just always baffles me when they look to science to confirm that their favorite series of fantasy novels is actually real. For a group that supposedly holds science in universal contempt, they seem suspiciously hungry for scientific validation.

There's got to be DNA that used to be functional but it is so long ago it has been orphaned.

There is. Quite a lot of it actually. They're called pseudogenes. They have been described as "fossil" genes.

Then there are the relics of retroviral infections and jumping genes. These had a function once, too (that function being replicating itself like mad throughout the host genome), just one that didn't do anything useful for the host.

There's got to be junk DNA which hits upon a random combination whereby it becomes functional in some small degree

This is much rarer, since it requires a fairly specific series of mutations to render a previously transcriptionally silent stretch of DNA transcriptionally active and subject to appropriate regulation. But even this has been documented a few times. The antifreeze gene in some Antarctic (or was it the Arctic ones? - they're different and unrelated) ice fish is one example I recall. This was probably helped by the fact that the sequence necessary for an effective antifreeze peptide is not strongly constrained. There was also a case of a parasitic wasp species that incorporated DNA (the protein product actually) originally from a retrovirus infection into its venom.

It should be noted that the mere existence of junk DNA is not actually evidence for evolution, per se. Evolution allows for both the presence and absence of junk DNA, and in fact proposes mechanisms to explain either case (purifying selection for rapid cell division very quickly eliminates junk DNA, which is why bacteria tend to have very little of it). It's the distribution of the junk DNA across lineages, and the specific sequences of the junk DNA, their similarities and differences between lineages, that is the evidence for evolution and common ancestry.

But the mere existence of junk DNA is powerful evidence AGAINST creation/design by an intelligent, benevolent designer (it isn't evidence against a stupid, malicious, or indifferent designer, though). The creationists are quite justified in hating/fearing it.

Remembering though that you have to look absolutely throughout all of our lifespan to be sure. For eg there are the reports that find bursts of endogenous retroviral expression very early after the first division of a fertilised egg and again around the time of implantation and placental formation. Since retroviral sequences take up much of the 'junk' dna the pronouncement that it is all junk without looking allwhen is premature.

I am also reminded of a talk we had once by a lab visitor who had found a chicken gene that turned on (in terms of rna at least) in a restricted area of the blastoderm (when the embryo is basically just a plate of cells) for about 2-3hours. Then turns off again. They were unable to detect expression at any other stage of development. Such a gene might be called a pseudo gene in humans and assigned to the junk as well.

I am reminded as well of the limb enhancer of mouse Myf5 which lies some 50kb upstream from the gene itself. There have been a number of cases of genes knocked out which have affected expression of neighbouring genes, not because the knocked out gene was functionally upstream in some way but because knocking it out altered the spacing of essential regulatory regions (enhancers). So some 'junk' is packing. Not highly functional, but functionally necessary none the less.

There are those engaged in trying to delete large regions of 'junk' in the mouse. Let's wait for their results in more detail until we pronounce, shall we?

By Peter Ashby (not verified) on 19 May 2010 #permalink

God put that junk DNA in there to test our faith.

I'll add one more, slightly more charitable possibility.

God put that junk DNA in there as enrichment for his pet humans, to distract our impressionable minds from naughty thoughts.

While reading up on the Lenski affair, I remembered a philosophical observation (in particular the resulting insult) about the intelligent design movement: most of its arguments are arguments for ignorance; we don't know, so it was some other Intelligence that we can't understand. In other words, we (ID'ers) want to stay too dumb to understand.
It's not Intelligent Design, it is Counterintelligent Design.

Junk DNA is like the big bezel around the edge of an iPad. It doesn't actually do anything itself, but the device would be much harder to use without it.

this is a poor assumption, not based on any real evidence whatsoever.

In fact, there are many genes which transcribe well known and used proteins that have no "bezels" around them.

For eg there are the reports that find bursts of endogenous retroviral expression very early after the first division of a fertilised egg and again around the time of implantation and placental formation

sorry, but that still makes it junk DNA wrt the individual hosting it.

In fact, it has ALWAYS been a part of the definition of junk DNA that it has included random retroviral insertions.

I am also reminded of a talk we had once by a lab visitor who had found a chicken gene that turned on (in terms of rna at least) in a restricted area of the blastoderm (when the embryo is basically just a plate of cells) for about 2-3hours.

again, STILL JUNK

There are those engaged in trying to delete large regions of 'junk' in the mouse. Let's wait for their results in more detail until we pronounce, shall we?

Those experiments have been run for decades, with pretty clear results so far. If you were more familiar with the literature on the subject, you probably would be less inclined to make such statements.

Such a gene might be called a pseudo gene in humans and assigned to the junk as well.

Would it? My understanding of the definition of "pseudogene" was that it was recognizably a gene by sequence (by homology with other, known functional genes either in the same species or in related species), but contained mutations such that it could NOT be transcribed, EVER. Such as a premature stop codon, a frameshift into gibberish, or a screwed up promoter.

If the sequence had a valid reading frame such that we know that it could be transcribed, I don't think we'd call it a pseudogene, even if we had no idea at all about when or where it actually was transcribed.

John Avise in "Inside the Human Genome: Evidence for Non-intelligent Design" pointed out that it's the ugly and messy that distinguishes non-ID from ID (and human DNA certainly looks evolved).

ID proponents are convinced that the human genome contains only useful information, so when I ask why human DNA contains 3.1 billion base pairs per nucleus and the marbled lungfish 130 billion and ask why the lungfish requires roughly 40 times as much information as humans, they usually say that it's because the lungfish has roughly 40 times as much DNA (d'oh).

By waynerobinson4 (not verified) on 19 May 2010 #permalink

When I pressed him once recently to explain the logic he used to come to the conclusion that that data supported creationism, his response was one simple sentence: "God don't make junk."

At which point you said you didn't quite follow, and asked him to express his logic as a syllogism, right? Something like

God don't make junk.
The genome ain't junk.
Therefore God made the genome.

Oh, wait, that's a fallacy of affirmation of the consequent.
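To see why that form of argument fails, one can brute-force the truth table. The sketch below assumes "God don't make junk" is read as the conditional "if God made it, then it is not junk"; that reading, and the variable names, are assumptions made for illustration.

```python
from itertools import product


def implies(p, q):
    """Material implication: p -> q."""
    return (not p) or q


# g: "God made the genome"; j: "the genome is junk"
counterexamples = [
    (g, j)
    for g, j in product([True, False], repeat=2)
    if implies(g, not j)   # premise 1: if God made it, it is not junk
    and (not j)            # premise 2: the genome is not junk
    and not g              # ...and yet the conclusion "God made it" is false
]

print(counterexamples)  # [(False, False)]
```

One row satisfies both premises while the conclusion is false (not junk, and not made by God), so the premises cannot force the conclusion.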

Time now to go make a creationist's head explode.

You'll never manage that when neither of you knows what logic is.

By truth machine, OM (not verified) on 19 May 2010 #permalink

But the mere existence of junk DNA is powerful evidence AGAINST creation/design by an intelligent, benevolent designer (it isn't evidence against a stupid, malicious, or indifferent designer, though). The creationists are quite justified in hating/fearing it.

So why don't they hate the Bible, with it's stupid, malicious God?

By truth machine, OM (not verified) on 19 May 2010 #permalink

"it's" -> "its" (stupid)

By truth machine, OM (not verified) on 19 May 2010 #permalink

Don't more long lived animals have more junk DNA?

I read that introns, while completely not conserved through natural selection for their coding specifically, *increase* the selection against small mutations nearby in the genome. So perhaps it functions then as a sort of highlight on indispensable regions which aren't all that useful to mess with, because any mutations in that region would have a much greater probability of being negative to host survival rather than neutral or beneficial.

Introns would allow mutated offspring unlikely to reach reproductive age to be miscarried at the 8 cell stage or so, a benefit to a species which puts lots of effort and energy into its young before reaching reproductive age. So, bacteria would have few to none, while humans would have quite a lot.

I'm not sure if this is supported in research more in depth than that I've read.

Of course, introns are a small percentage of 'junk' DNA.

I remember once being told by a creationist I was arguing in bad faith because he claimed that there was no such thing as junk DNA. In support he showed that a small percentage of Junk DNA had a function. So apparently I was meant to conclude that all Junk DNA had function because a small part of it did...

Then again, this was someone who maintained "there are no transitional fossils". Having an honest argument with a creationist is like having sex with an imaginary girl.

sorry, but that still makes it [retroviral sequences] junk DNA wrt to the individual hosting it.
In fact, it has ALWAYS been a part of the definition of junk DNA that it has included random retroviral insertions.

Except that Trophoblast giant cells in the placenta look remarkably like cells infected with a class of retroviruses leading to the idea that we placental mammals owe our difference from marsupials at least in part to a virus or viruses. The very early burst of retroviral expression are also very restricted in terms of time indicating at least that they are under some sort of control.

I also spent time analysing the mouse myogenin gene's regulatory sequences. A minimal 330 bp promoter is sufficient to recapitulate embryonic expression, but fairly weakly. If you go up to 1 kb the promoter drives very strong and reliable expression. Pretty well all of the extra sequence is a LINE element. Junk? Or an enhancer?

To my mind the jury is still out for good reasons. I am not saying all retroviral sequences are biologically functional to the good of the host organism, just that some may well be and there are good reasons to go looking, and people are doing so.

By Peter Ashby (not verified) on 19 May 2010 #permalink

OK, since this is a mainly science thread....

Allow me to express my ignorance...forgive me, I come from the world of physics, and haven't taken a course in biology since high school...30+ years ago...

Genes I'm ok with...
DNA, I'm ok with.
I think I have a fundamental handle on the basics of DNA replication A-T, G-C stuff...
This junk DNA... am I correct in assuming that this is sections of the DNA strand that doesn't have any particular genomic purpose which has been discovered yet?

And now to go out on a limb..
So, it could be possible that there may be some useful, vestigial purpose for that DNA, just not any use associated with genes, correct? Or no? (I'm thinking something like "with this number of molecules between the genes..it folds better" kind of use...no I'm not advancing a hypothesis, just trying to think of a style of vestigial purpose)

Thanks to all who care to educate me :)

By maddogdelta (not verified) on 19 May 2010 #permalink

Having an honest argument with a creationist is like having sex with an imaginary girl.

I call the latter "masturbation", and it's way more fun than the former!

By Weed Monkey (not verified) on 19 May 2010 #permalink

I'm completely out of my depth, so forgive me if this sounds silly. But like Platypus asked, I wonder if it's plausible that the non-transcribing junk DNA serves some more basic function, such as "physical spacer". Specifically, might non-transcribed sections of DNA be conserved or selected for if their physical size separates or in some other way physically mediates between or among transcribing sections in such a way that this benefits the transcribing sections? Or has something to do with crossover?

If that's dumb, I don't mind hearing that.:-)

By Josh, Official… (not verified) on 19 May 2010 #permalink

My half-baked idea for a "purpose" for junk DNA (which I must stress is not based upon any solid evidence) is that it's effectively cannon fodder for mutagens.

Consider that most (not all, but most) of the environmental causes of mutations are relatively fixed quantities and do only localized damage: for each high energy photon impact, or reactive oxygen species, etc. you get one spot of damage. Adding a whole lot of junk then gives a big decoy where such damage is inconsequential, while diluting the impact upon the critical components.

Given that there seems to be an optimum mutation rate for evolution - too few and it happens too slowly, too many and you lose discrimination between adaptive and deleterious mutations - one might then expect organisms, over time, to evolve towards having a junk DNA content which optimizes the functional DNA mutation rate in their particular environment.

Pure speculation, like I said - but I like the idea.
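The dilution argument can be put into a few lines of arithmetic. This is only a sketch of the idea as stated: it assumes a fixed number of damage events per generation landing at uniformly random positions, and every number in it is hypothetical rather than measured. It says nothing about errors that scale with genome length, such as replication mistakes.

```python
def expected_functional_hits(hits_per_generation, functional_bp, junk_bp):
    """Expected number of hits that land in functional DNA.

    Assumes a fixed number of hits per generation, each falling on a
    uniformly random base pair of the genome (the premise of the argument).
    """
    genome_bp = functional_bp + junk_bp
    return hits_per_generation * functional_bp / genome_bp


functional = 60_000_000  # hypothetical amount of functional DNA, in base pairs
hits = 100               # hypothetical fixed number of damage events

for junk in (0, 60_000_000, 540_000_000):
    print(junk, expected_functional_hits(hits, functional, junk))
```

With these made-up numbers, going from no junk to nine times as much junk as functional DNA cuts the expected hits on functional DNA from 100 to 10.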

By tristan.croll (not verified) on 19 May 2010 #permalink

Now now, there's more ways for a piece of DNA to carry out a function than just holding protein information.

Some of the junk DNA (heterochromatin) is clearly involved in how homologous chromosomes are able to pair for meiotic segregation. In that case, it's not the specific sequences themselves, but the size and composition of the heterochromatic regions that impact pairing efficacy. Likewise, centromeres are not identified by a specific sequence (well, except in some yeasts with very specialized single-microtubule centromeres) but by an epigenetic state based on both the junk DNA and the modifications of their histones. And the coevolution of the tail of centromere-specific histones (CenpA/Cid) with the predominant centromeric satellite sequences in a species is an intriguing observation.

A salamander's centromeres are bigger than our entire genome, aren't they? If we lacked junk DNA altogether, we could afford much smaller chromosomes still... and wouldn't need big centromeres.

Heterochromatin is involved in pairing of homologous chromosomes? Aren't the entire chromosomes involved anyway?

The antifreeze gene in some Antarctic (or was it the Arctic ones? - they're different and unrelated) ice fish is one example I recall.

Yes, the Antarctic ones.

Remembering though that you have to look absolutely throughout all of our lifespan to be sure. For eg there are the reports that find bursts of endogenous retroviral expression very early after the first division of a fertilised egg and again around the time of implantation and placental formation. Since retroviral sequences take up much of the 'junk' dna the pronouncement that it is all junk without looking allwhen is premature.

Most of those retroviral sequences are pseudogenes, however. They can't be expressed. The viable retroviruses are a small fraction!

And those are silenced most of the time. Things like DNA methylation and histone modification are greatly remodeled in the early embryo; while this happens, some retroviruses apparently get transcribed.

Would it? My understanding of the definition of "pseudogene" was that it was recognizably a gene by sequence (by homology with other, known functional genes either in the same species or in related species), but contained mutations such that it could NOT be transcribed, EVER. Such as a premature stop codon, a frameshift into gibberish, or a screwed up promoter.

Such mutations don't prevent transcription, they prevent translation of the RNA into a functional protein; and indeed, AFAIK, many pseudogenes are transcribed and translated all the way to the first premature stop codon, yielding a useless truncated protein that goes into the proteasome. Unholy waste it is.

BTW, homology is an inference, not an observation. I know there are still molecular biologists who say "homology" when they mean "similarity"; they're wrong.

If the sequence had a valid reading frame such that we know that it could be transcribed, I don't think we'd call it a pseudogene, even if we had no idea at all about when or where it actually was transcribed.

Correct.

Don't more long lived animals have more junk DNA?

No, why? Birds live much longer than mammals of the same size, and yet have considerably less junk DNA.

Introns would allow mutated offspring unlikely to reach reproductive age to be miscarried at the 8 cell stage or so, a benefit to a species which puts lots of effort and energy into its young before reaching reproductive age. So, bacteria would have few to none, while humans would have quite a lot.

And the single-celled Amoeba proteus would have few to none... while in reality it has a genome size of 100 billion bp, IIRC.

Note also that your argument says nothing about the size of introns, only about their number.

Except that Trophoblast giant cells in the placenta look remarkably like cells infected with a class of retroviruses leading to the idea that we placental mammals owe our difference from marsupials at least in part to a virus or viruses.

True; there is indeed a promoter of retroviral origin involved.

This doesn't explain why over half of our genome consists of rotting retrovirus corpses in all stages of decay, does it. We have twenty-mumble thousand genes of our own, and thirty-four thousand still recognizable retrovirus genes...

I am not saying all retroviral sequences are biologically functional to the good of the host organism, just that some may well be and there are good reasons to go looking, and people are doing so.

A few have been coopted for such purposes. The rest is trash.

How do I know? Because only 4 % of the human genome shows evidence of being under selection.

This junk DNA... am I correct in assuming that these are sections of the DNA strand that don't have any particular genomic purpose which has been discovered yet?

That's a small part of it. The rest is sections of the DNA strand that are known to lack any function.

And now to go out on a limb..
So, it could be possible that there may be some useful, vestigial purpose for that DNA, just not any use associated with genes, correct? Or no? (I'm thinking something like "with this number of molecules between the genes..it folds better" kind of use...no I'm not advancing a hypothesis, just trying to think of a style of vestigial purpose)

This hypothesis fails both parts of the onion test.

By David Marjanović (not verified) on 19 May 2010 #permalink

My half-baked idea for a "purpose" for junk DNA (which I must stress is not based upon any solid evidence) is that it's effectively cannon fodder for mutagens.

Not only does this again fail both parts of the onion test*, the logic behind it doesn't hold. There isn't a fixed number of DNA polymerase mistakes per genome, there's a (more or less) fixed number per number of nucleotides. Add more nucleotides, and you'll get more mutations per genome. External mutagens, too, aren't diverted from their course by the presence of more DNA elsewhere.
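The per-nucleotide point can be put in numbers. This is only a rough sketch: the error rate and the genome sizes below are assumed round figures chosen for illustration, not measurements, and `expected_copy_errors` is a made-up helper name.

```python
# Illustrative sketch only. All numbers here are assumptions, not data.
ERROR_RATE_PER_BP = 1e-8  # assumed copying errors per base pair per replication


def expected_copy_errors(functional_bp, junk_bp, rate=ERROR_RATE_PER_BP):
    """Return (errors in functional DNA, errors in the whole genome)."""
    functional_errors = functional_bp * rate
    total_errors = (functional_bp + junk_bp) * rate
    return functional_errors, total_errors


# Same amount of functional DNA, with and without a large load of junk.
lean = expected_copy_errors(functional_bp=100_000_000, junk_bp=0)
bloated = expected_copy_errors(functional_bp=100_000_000, junk_bp=2_900_000_000)

print("no junk:       functional=%.2f total=%.2f" % lean)
print("mostly junk:   functional=%.2f total=%.2f" % bloated)
```

With a fixed rate per nucleotide, adding junk raises the total number of copying errors per genome but leaves the expected number landing in functional DNA unchanged, so under this assumption the junk shields nothing.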

The only function I can see is the fact that cell size depends on genome size in eukaryotes. Smaller cells mean a higher surface/volume ratio, and that means faster metabolism; selection for faster metabolism results in smaller amounts of junk DNA, and has done so in dinosaurs, bats, and pterosaurs (bone cells leave measurable holes in fossil bones).

* As T. Ryan Gregory mentions, our genome is by no means the smallest, not even among vertebrates. The fugu (Takifugu rubripes) packs at least as many genes as ours (probably a bit more) into just 390 million base pairs, 1/8 of what we carry around.

By David Marjanović (not verified) on 19 May 2010 #permalink

I read the paper.

I don't think it means what you think it means.

Put up or shut up.

By David Marjanović (not verified) on 19 May 2010 #permalink

... what has been called the "dark matter" of the genome, or what has usually been called junk DNA.

What is the justification for using an astrophysical metaphor guaranteed to cause confusion and opportunities for creo mischief in an area so far removed from astrophysics?

Junk DNA is relatively passive and functionally just along for the ride, as might be said of interstellar "dark matter", but the analogy fails on most other levels. It's not as bad as "God particle", but a better-thought-out term - like "junk DNA" - seems in order.

By Pierce R. Butler (not verified) on 19 May 2010 #permalink

I don't think it means what you think it means.

Yawn, another troll all mouth, no evidence. Unless this is super idjit banned troll CW (makes sign of crossed tentacles).

By Nerd of Redhead, OM (not verified) on 19 May 2010 #permalink

The problem that I have with junk DNA is that it must be energetically expensive to keep it hanging round if it does no good, and if bacteria have a method of weeding it out why don't we bother?

I feel quite strongly that over the course of the next century uses will be found for much of it. But I could be wrong.

By Janet Holmes (not verified) on 19 May 2010 #permalink

Rev. BigDumbChimp:

All this science on a religion bashing blog.

Does anyone know where I can find a good religion bashing blog?

:D No complaints from me.

Although I have a lot of trouble wrapping my head around the concept of junk DNA. It seems like a gigantic waste of time, energy, and resources to replicate large portions of the genome that do absolutely nothing.

If there's always an initiation region that marks the beginning of a coding region, it doesn't seem terribly unlikely that somewhere along the way DNA Polymerase would have acquired a mutation that allowed it to skip past the non-coding (and non-promoting and non-transcription factor) sections of the genome.

But I'm no geneticist.

By protoposthuman (not verified) on 19 May 2010 #permalink

The Black-Eyed Peas' shortest song ever:

♬"Whatcha gonna do with all that junk
All that junk inside your nucleus?"♩
♫"I'ma do nothing. Most of it is just garbage."♪

I'll see that and, ahem, raise.

"1: Cut a hole in a box.
2: Put your junk in the box..."

By Standard Curve (not verified) on 19 May 2010 #permalink

Not only does this again fail both parts of the onion test*, the logic behind it doesn't hold.

Well, I did say it was half-baked - and your link shows me it's not even original. Oh well.

However:

There isn't a fixed number of DNA polymerase mistakes per genome, there's a (more or less) fixed number per number of nucleotides. Add more nucleotides, and you'll get more mutations per genome.

Yes, from plain old copying errors, sure. But that's not the only source of mutations by any stretch.

External mutagens, too, aren't diverted from their course by the presence of more DNA elsewhere.

... is simply false, for most mutagens.

UV damage, for example, is pretty much 1:1. A UVB photon, upon hitting DNA, tends to cause backbone breaks, which repair enzymes sometimes fix wrongly. UVA is too weak to break bonds, but instead excites bases into reactive states, the most common result of which is pyrimidine dimers. These are, again, occasionally repaired wrongly.

The key point, though, is that the repair errors for both of these mechanisms follow predictable patterns, so that one can infer which mutation was caused by which form of radiation. This has been used, for example, to show that in skin cancer biopsies, mutations in p53 seem to originate more often from UVA than UVB damage.

It's a similar case with chemical mutagens - intercalating agents, for example. Each individual molecule can only do so much damage before it's disposed of.

Take the limiting cases of a genome that's 100% functional, and one that's 99+% junk. Given an identical environment, and considering only damage from external sources, the rate of mutations in functional sequences is going to be far, far, higher in the former.
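To make the limiting case concrete, here is a small sketch. It rests entirely on the premise of this argument, namely that each cell takes a fixed number of external hits regardless of how much DNA it carries, and that hits land uniformly along the genome. The hit count and genome sizes are assumed figures, and `functional_hits` is a hypothetical helper.

```python
# Sketch of the "fixed mutagen budget" premise. All numbers are assumptions.
HITS_PER_CELL = 1000  # assumed: external hits per cell, independent of genome size


def functional_hits(total_hits, functional_bp, junk_bp):
    """Expected hits landing in functional DNA if hits fall uniformly at random."""
    genome_bp = functional_bp + junk_bp
    return total_hits * functional_bp / genome_bp


# Same functional DNA in both cases; only the amount of junk differs.
all_functional = functional_hits(HITS_PER_CELL, functional_bp=30_000_000, junk_bp=0)
mostly_junk = functional_hits(HITS_PER_CELL, functional_bp=30_000_000, junk_bp=2_970_000_000)

print("100% functional genome:", all_functional)
print("99% junk genome:       ", mostly_junk)
```

The dilution only appears because the hit count is held constant. If the number of hits instead grows with the amount of DNA present, the effect vanishes.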

Yes, biology tends to be very, very messy. And yes, junk DNA mostly seems to arise from various errors. Actually, scratch that last - all DNA sequences seem to arise originally from errors. But that doesn't mean that, simply by being there, it can't be serving a valid purpose.

By tristan.croll (not verified) on 19 May 2010 #permalink

@41
paraphrasing: "since it is energy-expensive to have it there, then it must have a function, although we don't know what it is" - is that what you meant?

If it is, then that is approaching from an engineering viewpoint.

Yet, in reality, many if not most natural systems exist not energetically-optimally. Just two examples: it is more energy-consuming for chimpanzees to walk on all four limbs than on two (which they can), yet they most often locomote using 4. And the trite example of the appendix in humans: those cells are living and so use some of our energy, and although they may provide some minor function, we could and can survive without them.

The point is: organisms survive very well even if some of the processes in their bodies use more energy than is necessary for that survival. Start with the reality of processes in organisms and trace back the energy flow and you see all sorts of waste - yet the organism still functions. Start instead with a plan to produce the most efficient conversion of energy, and you would not see extant natural organisms.

The problem that I have with junk DNA is that it must be energetically expensive to keep it hanging round if it does no good, and if bacteria have a method of weeding it out why don't we bother?

It's not that expensive energetically. A lot of cells in the body are terminally differentiated and nondividing. It isn't like DNA turns over like proteins or RNA.

It is also expensive to get rid of it. The thought is that organisms reach an equilibrium or steady state between useless DNA accumulating and cells eliminating it.

None of this looks like what an intelligent designer would do but it all looks like kludges piled on kludges by the blind watchmaker.

Testing noncoding DNA for function is experimentally within reach. It's been done. Someone deleted megabases of DNA from the mouse genome and then made cloned mice. No phenotype, nothing happened except the DNA is gone.

Most junk is still junk.

With the exception of regulatory regions, tRNAs, siRNAs, a handful of other non-protein encoding RNAs and histone binding sites, most of the extensively sprawling DNA sequence we have in us is useless junk. If I remember correctly I think around 8% of the DNA was from retroviral insertions. Maybe we should call this spam DNA? :p

It may help evolution, i.e. duplications that exist in silenced junk DNA accruing mutations and later re-activating under new regulatory elements, that's what we understand to be the evolution of many of the Hox genes. But for the time being, it's almost as much functionless junk as the Tea Party, but then again, no genome could work with 100% trash.

sciencedaily.com

Remarkably, the new research, recently published in Current Biology, shows that these early estimates were spot on - in total, we all carry 100-200 new mutations in our DNA. This is equivalent to one mutation in each 15 to 30 million nucleotides. Fortunately, most of these are harmless and have no apparent effect on our health or appearance.

We are all mutants.

Another way to look at noncoding DNA. Each person born carries 100-200 de novo mutations.

Most of us by definition are at least average humans. This indicates that from sciencedaily, "most of these [mutations] are harmless and have no apparent effect on our health or appearance." Because most of the genome is noncoding DNA and not all that important and that is where most mutations occur.
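The quoted figures are easy to check with back-of-envelope arithmetic. The sketch below assumes a round 3 billion base pairs for the genome and uses the 1.2 % protein-coding share mentioned elsewhere in this thread. It also assumes new mutations fall uniformly along the genome, which is a simplification.

```python
# Back-of-envelope check. Genome size and coding share are assumed round figures.
GENOME_BP = 3_000_000_000  # assumed round figure for the human genome
CODING_FRACTION = 0.012    # assumed protein-coding share (1.2 %)


def bp_per_mutation(new_mutations, genome_bp=GENOME_BP):
    """Nucleotides per new mutation for a given number of de novo mutations."""
    return genome_bp / new_mutations


def expected_coding_hits(new_mutations, coding_fraction=CODING_FRACTION):
    """Expected mutations landing in protein-coding sequence, assuming uniform spread."""
    return new_mutations * coding_fraction


for n in (100, 200):
    print(n, "mutations:",
          "one per %.0f million nt," % (bp_per_mutation(n) / 1e6),
          "about %.1f in coding sequence" % expected_coding_hits(n))
```

On these assumptions, 100-200 new mutations work out to one per 15 to 30 million nucleotides, and only a mutation or two per person would be expected to fall in protein-coding sequence at all.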

#44 makes an excellent point. The "Junk" DNA sections might have a kind of buffering effect against random genomic mutations. Bacterial cells (in general) divide much faster than eukaryotic cells, in an "immortal" line. By comparison human cells hang around for a lot longer and are expected to operate the entire time without the same kinds of replication and selection to keep them in the game. So, a kind of bet hedging against mutation seems a reasonable explanation.

By jsteemson (not verified) on 19 May 2010 #permalink

There's no mutagen lightning rod. I disbelieve the hypothesis for safeguard against mutation.

There's no mutagen lightning rod.

Why?

By tristan.croll (not verified) on 19 May 2010 #permalink

Actually, no I slightly lie here. There are some mutation hotspots, usually in CG rich DNA regions (that are associated with chromatin condensation and transcription silencing). CG rich areas can sometimes result in the DNA polymerase enzyme slipping. Spontaneous deamination can also change methylated Cs to Ts, but enzymes usually repair it.

Fun fact: Because Mormons breed so fast they were used as human guinea pigs to track down colon cancer mutations. Mormon families (with large numbers of children) who had a case of colon cancer were genetically screened. It was found that an inherited methylation of a genetic regulatory region caused the hereditary colon cancer.

#43: This might be splitting hairs, but those kinds of UV induced mutations aren't passed on in many multicellular organisms because they don't occur in the germ line*. Germ-line mutations are most often the result of replication error.

*BUT...single celled organisms often have large genomes. The winner (as far as chromosome numbers go) I think belongs to a radiolarian**

**BUT...single celled organisms are often protected from UV by the water that they live in. Or not?

New line of thought: In some lineages, the quantity of DNA fluctuates pretty wildly. For example, Pines have much more DNA than their sister taxon (spruces), most likely due to the proliferation of a retrotransposon. I don't see a need to invoke a selectional advantage to the retrotransposon, as long as it isn't disadvantageous either.

OK...more like stream of consciousness than splitting hairs.

By Antiochus Epiphanes (not verified) on 19 May 2010 #permalink

The problem that I have with junk DNA is that it must be energetically expensive to keep it hanging round if it does no good, and if bacteria have a method of weeding it out why don't we bother?

The "method" bacteria have of getting rid of junk DNA is purifying selection. Basically it means all the bacteria that do have junk DNA are outcompeted by bacteria that do not - ie they die.

DNA replication is the rate and energy limiting step of bacterial reproduction. The less DNA a bacterium has, the faster it can reproduce. There is thus a big competitive advantage for bacteria to have small genomes, and so bacterial genomes are honed by natural selection to be very streamlined, containing the bare minimum number of genes that that particular bacterium needs to survive in its environment, and no more. And definitely any noncoding DNA that appears due to a mutation is strongly selected against and weeded out.

The largest bacterial genomes we see are around 9,000-10,000 kbp (roughly 10 million bp) or so.

But this means that there is very strong selection pressure in bacteria AGAINST the accumulation of complexity. Very few new genes, no matter how advantageous, are so advantageous that they can offset the disadvantage in slower DNA replication (and hence slower reproduction) that just having the extra base pairs entails.

When bacteria are faced with environmental change, they typically activate or increase their ability to do lateral gene transfer. Those bacteria lucky enough to stumble upon genes helpful or needed to survive the new environmental change quickly take over the population. But once the environment changes and the new genes are no longer essential, they are quickly lost again.

For 3.5 billion years bacteria have been going through this merry-go round of gaining genes and losing genes as the environment changes, always remaining bacteria, always remaining below that upper limit of base pairs. As a group they possess a mind-boggling diversity of biochemical tricks, but no single bacterial population ever gets to accumulate any large portion of that diversity of complexity for itself.

But eukaryotes are different. Thanks to their mitochondria they have a surfeit of energy. DNA replication is no longer rate limiting for their cell division. The extra energy allows them to adopt lifestyles where they do not have to compete directly with bacteria for speed of cell division (chief among them growing bigger in size and eating the smaller bacteria).

This means that purifying selection for rate of DNA replication DOES NOT APPLY for eukaryotes. The cost in energy and time to replicate extra DNA becomes quite minor, if it even exists at all.

This means eukaryotes are free to accumulate large genomes and accumulate complexity - to evolve tricks like multicellularity, embryogenesis, and intelligence.

One of the biggest contributors to both the evolution of complexity and the accumulation of junk DNA in eukaryotes is multiple rounds of whole genome duplication. In a bacterium such a mutation is a catastrophe - the bacterium will now be able to reproduce only half as fast. Even if the duplication somehow provides a huge advantage of new functionality, there is virtually no way that this advantage could be big enough to offset a halving of its reproductive rate. The bacterium is therefore doomed.
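How fast "doomed" arrives can be sketched with simple exponential growth. The model below assumes unlimited growth, that doubling time scales directly with genome size, and a 20-minute doubling time for the original strain. All of these are simplifying assumptions for illustration, and `mutant_fraction` is a hypothetical helper.

```python
# Toy competition model. Assumes unlimited exponential growth and that a
# duplicated genome exactly doubles the time needed per cell division.
def mutant_fraction(hours, wild_doubling_min=20.0, mutant_doubling_min=40.0,
                    start_fraction=0.5):
    """Share of the population made up of the genome-duplicated strain."""
    wild_generations = hours * 60 / wild_doubling_min
    mutant_generations = hours * 60 / mutant_doubling_min
    wild = (1 - start_fraction) * 2 ** wild_generations
    mutant = start_fraction * 2 ** mutant_generations
    return mutant / (wild + mutant)


for hours in (0, 1, 5, 10):
    print("after %2d h: duplicated strain is %.6f of the population"
          % (hours, mutant_fraction(hours)))
```

Even starting at half the population, the slower strain is diluted to a vanishing share within hours in this toy model, which is why a new function would need an enormous payoff to compensate.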

But in a eukaryote such a mutation is almost wholly neutral - it just gets an extra copy of each of its genes. Over time the extra copies accumulate mutations which are also neutral because at least one copy remains unmutated and functional. The majority of random change is destructive, so most of the mutated extra copies become non-functional, some accumulating so much degenerate change over time that their original sequence is wholly wiped out and we can barely trace their origin. But a small fraction of the duplicated genes get mutations that change their function slightly. Now, instead of having a single genetic function, the eukaryote gets two. And since the cell is no longer penalized for having extra DNA, even a very tiny advantage from a new function gets preserved.

In short, the accumulation of junk DNA is an inevitable consequence of the undirected mechanisms that also produce increasing complexity and allow for the accumulation of novel traits. It remains because the penalty for keeping it is small, and easily offset by the advantage accrued by the acquisition of those new traits.

This might be splitting hairs, but those kinds of UV induced mutations aren't passed on in many multicellular organisms because they don't occur in the germ line*. Germ-line mutations are most often the result of replication error.

Very true. However, in vertebrates at least they often do lead to cancer, which tends to be not very conducive to the passing on of genes...*

Anyway, the point I'm trying to make is that it's not an all-or-nothing question. Non-conserved, non-coding and non-regulatory still doesn't mean it's serving no purpose whatsoever. Doesn't mean it is serving a purpose either, but biology has a funny habit of co-opting accidents to its advantage.

Oh, and in reply to MolBio @51: I'm not suggesting junk DNA somehow attracts mutagenic stimuli in preference to functional DNA (as your lightning rod analogy would suggest) - just that, simply by being there, it decreases the probability of any particular event harming a functional region.

*As an aside, this brings up an interesting question that I'm guessing has been answered somewhere, but I've never seen the answer: we know that the inherited mutation rate in humans is somewhere around 100-200 mutations per generation. However, if you were to take an adult human and sequence the genome of a bone marrow stem cell sample vs., say, a sample of near-terminally-differentiated keratinocytes, I wonder how many mutations we'd see? My guess is that it'd be far, far higher than 100-200.

By tristan.croll (not verified) on 19 May 2010 #permalink

Amphiox--Well explained.

Some means of dealing with a lot of extra DNA was required of protoeukaryotes, because what was to become the nuclear genome was bombarded with DNA from endosymbionts. IIRC, meiotic recombination has been proposed as a means of fixing interruption errors in these rapidly expanding genomes. The evolution of linear rather than circular genomes also provided eukaryotes with a means of rapid replication at multiple replication forks, while avoiding the problem of supercoiling inherent in circular chromosomes.

By Antiochus Epiphanes (not verified) on 19 May 2010 #permalink

amphiox@53:

A most excellent and clear summary, thanks! I may yet have to re-think my position.

By tristan.croll (not verified) on 19 May 2010 #permalink

So, a kind of bet hedging against mutation seems a reasonable explanation.

It may seem so at first glance, but it actually isn't reasonable.

"Hedging against mutation" or anything else for that matter, requires foresight. Natural selection, of course, is not capable of foresight.

The other thing is that the bigger your genome, the more base-pairs you have exposed to whatever environmental mutations you care to name, and the higher your mutation rate. This means that having lots of extra DNA around doesn't really change the risk of mutation per base pair, which means it doesn't really change the risk of mutation in your coding genes at all, and so cannot protect your coding DNA from mutational events.

Bacterial cells (in general) divide much faster than eukaryotic cells

Bacterial cells do so because they must, in order to compete with other bacterial cells. Eukaryotic cells in multicellular organisms do not have to (all surrounding cells are genetically identical), and therefore don't. Indeed, they have mechanisms that prevent any one of their number from dividing too quickly, and when those mechanisms fail, it's called cancer.

By comparison human cells hang around for a lot longer and are expected to operate the entire time without the same kinds of replication and selection to keep them in the game.

Most human cells are continuously replaced. The average human cell turns over no more and no less quickly, on average, than any other eukaryotic cell with a similar metabolic rate. There are only a few cell types that terminally differentiate and endure while remaining metabolically active and functional for long periods of time.

@54: I have always been interested in the role of somaclonal mutation in generating diversity in long-lived multicellular organisms that rarely reproduce sexually. Mutation alone couldn't really catch up with recombination as a diversity generator...yet these things are often less genetically uniform than their breeding system would suggest.

By Antiochus Epiphanes (not verified) on 19 May 2010 #permalink

The other thing is that the bigger your genome, the more base-pairs you have exposed to whatever environmental mutations you care to name, and the higher your mutation rate. This means that having lots of extra DNA around doesn't really change the risk of mutation per base pair, which means it doesn't really change the risk of mutation in your coding genes at all, and so cannot protect your coding DNA from mutational events.

I don't think I can agree with this. UV exposure, for example, is mostly a function of surface area - and the surface~l^2, volume~l^3 relationship still holds at the cellular scale, no?

Similarly, for chemical damage: in any multicellular organism, at least, the rate of production of potentially DNA-damaging reactive species is going to be a relative constant - or at least independent of DNA content.

So for these sources of mutation, it is most certainly not clear that the rate of mutation per base pair is a constant independent of genome size.

By tristan.croll (not verified) on 19 May 2010 #permalink

All this science on a religion bashing blog.

Does anyone know where I can find a good religion bashing blog?

That was a great first comment for this post, Rev. I am so sick of every controversial social-issue post on every other science blog being harangued by idiots that require their favorite bloggers to withhold their opinions on current events.

By chrstphrgthr (not verified) on 19 May 2010 #permalink

Who cares about the UV mutation rate? That's only a handful of cells and by the time you get cancer you've already passed along your genes.

Who cares about the UV mutation rate? That's only a handful of cells and by the time you get cancer you've already passed along your genes.

... says the organism with a large amount of junk DNA buffering against UV damage. ;-)

That's the thing, isn't it? We have no good control organism - a multicellular eukaryote with a very low junk DNA content - to compare against.

By tristan.croll (not verified) on 19 May 2010 #permalink

#40,

I feel quite strongly that over the course of the next century uses will be found for much of it.

Input for a random number generator? Pie filling? (DNA and bacon pie!) Or there could be some way of reading it out as scripts for cartoon sitcoms or episodes of Lost.

By John Scanlon FCD (not verified) on 19 May 2010 #permalink

Peter Ashby @19

Remembering though that you have to look absolutely throughout all of our lifespan to be sure.

Surely not just the individual lifetime, but the lifetime of the species.

Ichthyic @22 Wikipedia:

Many noncoding DNA sequences have very important biological functions. Comparative genomics reveals that some regions of noncoding DNA are highly conserved, sometimes on time-scales representing hundreds of millions of years, implying that these noncoding regions are under strong evolutionary pressure and positive selection.

Several people above have echoed that: non-coding doesn't necessarily mean non-functional. I've argued before that PZ is unnecessarily and even dangerously dogmatic on junk DNA.

By clausentum (not verified) on 19 May 2010 #permalink

The UV thing fails the onion test (linked above) anyway. Why does an onion need so much more UV protection than a human? You're jumping on something incredibly minor and saying that it's the reason when a much simpler explanation would suffice.

You're jumping on something incredibly minor and saying that it's the reason when a much simpler explanation would suffice.

No - I'm saying that buffering against external mutagenic influences is a possible mechanism by which junk DNA may play some adaptive role.

By tristan.croll (not verified) on 19 May 2010 #permalink

The problem that I have with junk DNA is that it must be energetically expensive to keep it hanging round if it does no good, and if bacteria have a method of weeding it out why don't we bother?

I feel quite strongly that over the course of the next century uses will be found for much of it.

Bacteria don't have a method for specifically getting rid of it. They're just under strong selection for fast replication and therefore small genome size. There are no enzymes that can recognize useless DNA as such and cut it out; the only way to get rid of it is random deletions plus natural selection against deleting genes that are still needed; for most eukaryotes, the latter factor simply outweighs selection for a smaller genome.

If junk DNA has a use, why does only 4 % of the human genome show evidence of being under selection?

(And this already includes regulatory sequences and stuff. Protein-coding genes are only 1.2 % of the human genome.)
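For a sense of scale, those percentages can be turned into base-pair counts. The genome size used here is an assumed approximate figure of 3.2 billion base pairs, and `share_in_bp` is just an illustrative helper.

```python
# Converts the quoted percentages into base pairs. Genome size is an assumption.
GENOME_BP = 3_200_000_000  # assumed approximate human genome size


def share_in_bp(percent, genome_bp=GENOME_BP):
    """Number of base pairs corresponding to a percentage of the genome."""
    return genome_bp * percent / 100


under_selection = share_in_bp(4)    # all sequence showing evidence of selection
protein_coding = share_in_bp(1.2)   # protein-coding genes only
no_detected_selection = GENOME_BP - under_selection

print("under selection:       %.0f Mbp" % (under_selection / 1e6))
print("protein-coding:        %.1f Mbp" % (protein_coding / 1e6))
print("no detected selection: %.0f Mbp" % (no_detected_selection / 1e6))
```

On that assumed genome size, roughly 128 million base pairs show evidence of selection, and about 3 billion do not.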

And what about the onion test?

Although I have a lot of trouble wrapping my head around the concept of junk DNA. It seems like a gigantic waste of time, energy, and resources to replicate large portions of the genome that do absolutely nothing.

It is a gigantic waste of time, energy, and resources to replicate large portions of the genome that do absolutely nothing.

You see, natural selection doesn't kill off all except the ideal. It preserves all that are good enough. We can afford to carry all that junk around and replicate it, so we do, as described in comment 53.

If there's always an initiation region that marks the beginning of a coding region, it doesn't seem terribly unlikely that somewhere along the way DNA Polymerase would have acquired a mutation that allowed it to skip past the non-coding (and non-promoting and non-transcription factor) sections of the genome.

DNA polymerase doesn't recognize any specific sequences that have anything to do with transcription or translation. It requires replication initiation sites, which have nothing to do with transcription or translation, and then simply replicates and replicates till it randomly falls off.

It's a similar case with chemical mutagens - intercalating agents, for example. Each individual molecule can only do so much damage before it's disposed of.

Take the limiting cases of a genome that's 100% functional, and one that's 99+% junk. Given an identical environment, and considering only damage from external sources, the rate of mutations in functional sequences is going to be far, far, higher in the former.

No, because a smaller genome is a smaller target. Literally smaller in spatial dimensions. It's harder to hit. Fewer mutagens will hit it.

If I remember correctly I think around 8% of the DNA was from retroviral insertions. Maybe we should call this spam DNA? :p

That's for a very strict definition of "retroviral insertion". If you count all the defunct retrotransposons and stuff, the number climbs to 54 %, IIRC.

This indicates that from sciencedaily, "most of these [mutations] are harmless and have no apparent effect on our health or appearance." Because most of the genome is noncoding DNA and not all that important and that is where most mutations occur.

No. If we had no junk DNA at all, most mutations would still be harmless, because they wouldn't change the amino acids genes code for, and because regulatory elements rarely have precise sequence requirements.

And if there is less to mutate, fewer mutations will happen in total. :-|

The "Junk" DNA sections might have a kind of buffering effect against random genomic mutations. Bacterial cells (in general) divide much faster than eukaryotic cells, in an "immortal" line. By comparison human cells hang around for a lot longer and are expected to operate the entire time without the same kinds of replication and selection to keep them in the game. So, a kind of bet hedging against mutation seems a reasonable explanation.

The other way around, as described in comment 53: because bacteria are under selection for fast reproduction, they're under selection for small genome size, while we are not.

Most human cells are continuously replaced. The average human cell turns over no more and no less quickly, on average, than any other eukaryotic cell with a similar metabolic rate. There are only a few cell types that terminally differentiate and endure while remaining metabolically active and functional for long periods of time.

OK, but few if any eukaryotic cells take less than 24 h to reproduce... under ideal conditions, Escherichia coli reproduces every 20 minutes...

I don't think I can agree with this. UV exposure, for example, is mostly a function of surface area - and the surface~l^2, volume~l^3 relationship still holds at the cellular scale, no?

So a longer genome has more surface than a shorter one in absolute terms.

What does it matter that it's smaller in relative terms? UV doesn't care, and neither do chemical mutagens.

Who cares about the UV mutation rate? That's only a handful of cells and by the time you get cancer you've already passed along your genes.

Indeed, it seems that the selection for dark skin in humans in equatorial regions is not against skin cancer but against UV destroying folic acid in the blood. Folic acid is necessary for embryogenesis. Skin cancer develops way too slowly to impact reproduction.

That's the thing, isn't it? We have no good control organism - a multicellular eukaryote with a very low junk DNA content - to compare against.

I have already mentioned Takifugu rubripes: 1/8 of our genome size, yet at least as many genes as we have.

Sure, it lives in water, which protects against UV to some extent. But water also gives easier access to chemical mutagens, and the fugu is so small that few if any of its cells are protected from UV by virtue of being far from the body surface – we aren't going to get UV-induced mutations in our bone marrow.

No - I'm saying that buffering against external mutagenic influences is a possible mechanism by which junk DNA may play some adaptive role.

This still fails the onion test.

Remember, onions make a living from standing around in the sun all day. If anything needs UV protection, they do.

By David Marjanović (not verified) on 19 May 2010 #permalink

Posted by: Zeno | May 19, 2010 3:52 PM

give the creationists another big fat zero

You are much too kind. The negative numbers were invented so that we would have an appropriate scale against which to measure the accomplishments of the creationists. And it's not bounded below!

Surely this is what imaginary numbers are for.

By GravityIsJustATheory (not verified) on 20 May 2010 #permalink

Ichthyic @22 Wikipedia:

Many noncoding DNA sequences have very important biological functions. Comparative genomics reveals that some regions of noncoding DNA are highly conserved, sometimes on time-scales representing hundreds of millions of years, implying that these noncoding regions are under strong evolutionary pressure and positive selection.

Several people above have echoed that: non-coding doesn't necessarily mean non-functional. I've argued before that PZ is unnecessarily and even dangerously dogmatic on junk DNA.

"Many" does not mean "most".

I repeat:

1.2 % of the human genome consists of protein-coding genes. All the rest is non-coding.

4 % of the genome show evidence of being under natural selection. (This includes the protein-coding genes, but also genes that code for RNA which isn't translated into proteins – tRNA, rRNA, and a long list of regulatory RNAs. It also includes promoters, enhancers, silencers, binding sites for histones, and so on and so forth.) All the rest is useless. If it weren't useless, there'd be selection pressure on its sequence, but evidently there isn't.

The Wikipedia quote you give talks about 2.8 % of the human genome. You seem to believe it talks about 98.8, but it doesn't.
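To make the bookkeeping explicit, here is a minimal sketch of the arithmetic, using only the percentages quoted above (1.2 % protein-coding, 4 % under selection). The variable names are just illustrative.

```python
# Percentages of the human genome, as quoted in this comment.
protein_coding = 1.2      # protein-coding genes
under_selection = 4.0     # everything showing evidence of selection

# Non-coding DNA in total: everything that is not protein-coding.
non_coding_total = round(100.0 - protein_coding, 1)

# Non-coding DNA that is nevertheless under selection (the part the
# Wikipedia quote is about).
functional_non_coding = round(under_selection - protein_coding, 1)

# The remainder shows no evidence of selection.
no_selection = round(100.0 - under_selection, 1)

print(non_coding_total, functional_non_coding, no_selection)
```

So "many non-coding sequences are functional" and "most of the genome is junk" are both compatible with these numbers.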

By David Marjanović (not verified) on 20 May 2010 #permalink

David Marjanović @ 69
ok: thanks for the clarification, and I've looked up the onion test you mentioned, and, yes, that also puts things in perspective.

By clausentum (not verified) on 20 May 2010 #permalink

Junk DNA is still junk.

That's not true and even if it was true it's because God made it that way. Also, there's no transitional fossils, there's the Cambrian thing, Darwin changed his mind on his death bed, and Hitler was an atheist evolutionist who quit the Catholic Church because he wanted to kill Jews for Darwin. Also, PZ hates God.

By a.human.ape (not verified) on 20 May 2010 #permalink

Bernard@70...

Compelling abstract. I'll have to check out the paper from my office later. Thanks.

By Antiochus Epiphanes (not verified) on 20 May 2010 #permalink

Correlation between genome size and induced mutation rates.

Eh, yes, but that's from 1975. When something that old has been so completely forgotten, there's usually a good reason for it.

By David Marjanović (not verified) on 20 May 2010 #permalink

I feel quite strongly that over the course of the next century uses will be found for much of it.

We already have. As a means for figuring out the evolutionary history and relationships between organisms. How do we know that chimpanzees are humans' closest living relatives? Junk DNA. How do we know that gorillas first branched away from the lineage that led to humans/chimps, followed by humans, followed by bonobos (from chimps)? Junk DNA. How do we know that eukaryotic cells arose from an endosymbiosis between a bacterium and an archaean? Junk DNA (well, one among several lines of evidence). How do we know that a critical event in the evolution of vertebrates from stem chordates was a double whole genome duplication? Junk DNA. How do we prove paternity in court? Junk DNA. Or fingerprint the semen from a rape? Junk DNA.

I don't think I can agree with this. UV exposure, for example, is mostly a function of surface area - and the surface~l^2, volume~l^3 relationship still holds at the cellular scale, no?

The information contained in DNA is arranged linearly - i.e. in one dimension. Since the width of a DNA molecule doesn't change no matter how long it is, the relative risk of each individual base pair to UV radiation is not affected by length/surface area considerations.

Unless you coil the DNA into three dimensional shapes, and wrap it in a protective casing of proteins. Which, granted, is the case.

However, there is still a problem here. If junk DNA did in fact have a function as a buffer/protection against mutagens, you should expect a spatial arrangement, with the coding genes protected in the central areas, and the junk all surrounding it, at the surface. This is not the case. In fact, the actively transcribed genes are generally exposed at the surface.

Secondly, if Junk DNA really did function as a protection against mutation, its sequence should have been optimized over time to the sequences that were best for protecting against mutations. We already know that different sequences have different susceptibility to mutations. But it is not.

Finally, there is the issue again of foresight. We know when the majority of Junk DNA in eukaryotes was first accumulated. A large amount of it was very early, right at the beginning with the very first eukaryotic cells. (There's a single-celled amoeba with four times as much total DNA as humans and maybe a third of the coding genes, for example). If Junk DNA was intended to function as a buffer against mutation for long-lived multicellular organisms, then something must have foreseen the need for such a buffer 2 billion years before it was required, and put it into the first eukaryotic cells. (Maybe creationists shouldn't hate Junk DNA so much after all - it's evidence of foresight! Not particularly intelligent, or capable, or benevolent foresight, but still....)

Otherwise, Junk DNA would have to be an effective mutation buffer in those first eukaryotic cells, which reproduced in maybe 24 hours or so. Natural selection works on individuals. For mutation buffering to be an adaptive function, the mutation rate has to be high enough that a single eukaryotic cell, with a life cycle of about 1 day, faces a selectively relevant risk of catastrophic, life- or reproduction-limiting mutations within a single 24 h lifetime. We know what the mutation rates in eukaryotes are. They are not that high, not by many orders of magnitude.

Cancer isn't a reproduction limiting event for long lived multicellular organisms today only because over time all of us have evolved many overlapping and extremely powerful defenses against it. It is only after these mechanisms fail after many, many years, that we get cancer (or we're really unlucky and are born with defective defenses to begin with).

I'm willing to speculate that for the very first multicellular organisms, cancer was a real threat and a major limitation on how big they could grow and how long they could live. It was probably only after the anti-cancer defense mechanisms evolved that larger and longer lived multicellular critters could appear.

P Z also posted his blog entry above here:

http://pandasthumb.org/archives/2010/05/junk-dna-is-sti.html

Unfortunately, the notorious troll and loon John A Davison invaded that space and now no one can comment there because P Z or another moderator has shut down the ability to do that.

So I made an entry on MY blog about it:
http://circleh.wordpress.com/2010/05/20/the-sheer-incompetence-of-john-…

Unlike Davison, I DO know how to run a blog properly!

I just have to chime in and say that even though the DNA doesn't code, it's unimaginative to just call it junk. According to evolution, this DNA is the remnant of previously functional DNA, and it could well mutate and become functional. Just because it's currently in an evolutionary intermediary state doesn't mean it's junk. Vestigial appendages, like the human appendix, though they seem to serve no purpose, could well mutate into something useful.

By notVerneant (not verified) on 20 May 2010 #permalink

A quick correction to some of my prior posts:

The largest known bacterial genomes don't contain just 9000-10000 bps. That's actually ludicrously puny. It's actually 9000-10000 genes. (The smallest are in the vicinity of 500 or so).

And there are a few known prokaryotes (actually archaeans and not bacteria, I think) that have a small bit of junk DNA. But not much.

According to evolution, this DNA is the remnant of previously functional DNA, and it could well mutate and become functional.

Those parts that become functional again, of course, won't be junk DNA anymore. If they become transcribed (see Ice Fish antifreeze) again, then they become part of the coding DNA. If they become incorporated into regulatory elements, then they become regulatory DNA.

If one could forgive a mildly teleological analogy, then think of evolution as a tinkerer - a particularly dumb, scatterbrained, and forgetful tinkerer, but one with tremendous persistence, working tirelessly (and brainlessly) all the time. The Junk DNA is the byproduct of his tinkering, the bits and pieces and scraps, failed prototypes and abandoned early models, that pile up in the corners of his workshop. It is all available for future tinkering, and very rarely some piece of this junk does get converted into something useful again. But much more often it is the reverse, and previously useful stuff gets broken by his incompetent tinkering, and added to the pile of junk.

And occasionally he might go and trade some junk with the tinkerer next door.

And sometimes a thief breaks in, gets stuck in the debris, dies, and his corpse gets added to the pile.

If the workshop is very small, then the junk gets thrown out in order to clear space to work.

If the workshop is large, then he doesn't bother to waste his energy lugging the junk out to the pavement for the garbage collector to take. It just piles up.

@81: Thanks for that, an interesting read. I'm now willing to accept that this idea is (a) waaay old, and (b) not particularly well supported. As the author said, though, it does have intuitive appeal, doesn't it?

For those interested (and who have access) I just spent a very interesting half hour skimming through this review on the subject of mutation rate variation in multicellular eukaryotes. Upshot is that (as one would expect) there are lots of overlapping mechanisms determining mutation rates.

There was reference in there to one study that I found really quite interesting (Hum. Mol. Genet. 2, 173–182 (1993)) - which showed that the spontaneous mutation rate in a reporter gene varied over 60-fold depending upon where it integrated into the genome of a human cell line. That's really pretty cool.

By tristan.croll (not verified) on 20 May 2010 #permalink

@82

I think we're having a semantic argument on what constitutes junk. If it doesn't seem to be immediately useful, to me, it's not junk. It's material. If it truly is junk, I think it would weigh down chances for survival and would be eliminated in short order. I think that there must be a reason for the gene to be that length, even if we can't discern it. Like someone here posted, even if the use of the DNA isn't to code, it could serve some other function. DNA might be the smallest sub-machine that the coding DNA can control, like a muscle, or eye. It's a mechanism to be manipulated.

By notVerneant (not verified) on 20 May 2010 #permalink

David Marjanović @67 (more because you've misunderstood me than because I'm seriously defending the hypothesis at this point):

Take the limiting cases of a genome that's 100% functional, and one that's 99+% junk. Given an identical environment, and considering only damage from external sources, the rate of mutations in functional sequences is going to be far, far, higher in the former.

No, because a smaller genome is a smaller target. Literally smaller in spatial dimensions. It's harder to hit. Fewer mutagens will hit it.

and

I don't think I can agree with this. UV exposure, for example, is mostly a function of surface area - and the surface~l^2, volume~l^3 relationship still holds at the cellular scale, no?

So a longer genome has more surface than a shorter one in absolute terms. What does it matter that it's smaller in relative terms? UV doesn't care, and neither do chemical mutagens.

OK, try this. You say our genome is around 4% functional, 96% junk. Therefore it's 25 times as big in volume as it would be if all the junk was removed. However, making for the sake of simplicity the assumption of a spherical nucleus, it has only around 8.5 times the cross-sectional area - which, since radiation tends to travel in straight lines, is what ultimately defines radiation exposure. Thus, even if your distribution of junk and functional DNA is entirely random, by expanding the volume of the genome 25-fold you get an approximately (25/8.5)=~3-fold reduction in per-base-pair radiation exposure. That's not exactly negligible.
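For anyone who wants to check the numbers, here is a rough sketch of that back-of-the-envelope calculation. It assumes exactly what is stated above (a spherical nucleus, constant DNA density, junk and functional DNA mixed at random); the function name `exposure_factors` is made up for illustration.

```python
def exposure_factors(functional_fraction):
    """Return (volume factor, cross-section factor, per-base reduction).

    Assumes a spherical nucleus whose volume scales linearly with genome
    size, so cross-sectional area scales as volume ** (2/3).
    """
    # How much bigger the genome is than its functional part alone.
    volume_factor = 1.0 / functional_fraction
    # Cross-sectional area of a sphere scales with volume to the 2/3 power.
    area_factor = volume_factor ** (2.0 / 3.0)
    # Radiation intercepted scales with area, bases with volume, so the
    # per-base exposure drops by volume / area = volume ** (1/3).
    reduction = volume_factor / area_factor
    return volume_factor, area_factor, reduction


vol, area, reduction = exposure_factors(0.04)
print(f"volume x{vol:.1f}, cross-section x{area:.2f}, "
      f"per-base exposure reduced ~{reduction:.2f}-fold")
```

Under these assumptions the reduction is just the cube root of the volume factor, which is why a 25-fold larger genome buys only about a 3-fold drop in per-base exposure.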

By tristan.croll (not verified) on 20 May 2010 #permalink

@85

You're both arguing with the assumption that a shorter, smaller genome has the same chance of being hit by radiation as a long one. It has a smaller chance but a hit would do more damage, I think.

By notVerneant (not verified) on 20 May 2010 #permalink

@86,

No, not quite. The point is that while in absolute terms a smaller genome will get hit by less radiation, the somewhat counter-intuitive result is that, as long as the geometry and volumetric density of DNA remain comparable, the radiation flux per base is substantially higher for the smaller genome.

By tristan.croll (not verified) on 20 May 2010 #permalink