Junk DNA, Revisited

Some bio-bloggers are atwitter over an article by Wojciech Makalowski on Scientific American's website about Junk DNA. I'm a little late to the game because, well, I've been really busy looking at sequences to determine if they are junk DNA. Is it irony? Is it coincidence? Who cares? It's an opportunity to discuss semantics, and I love semantics.

Those of you who have hung around here for a while know this topic often comes up at evolgen (remember this, this, and this . . . hell, here's what a search for Junk DNA turns up). Long story short, I can't stand the term junk DNA, but I do agree with Dan Graur that Junk DNA is a valid null hypothesis. And that's it. The majority of any eukaryotic genome is made up of non-functional sequences that are slightly deleterious, but they persist because selection against these sequences is too weak to purge them (i.e., nothing in evolution makes sense except in the light of population genetics).
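For anyone who wants to see the "selection is too weak" point in numbers, here is a rough sketch using Kimura's diffusion approximation for the fixation probability of a single new mutation. The function name, the population size, and the selection coefficients are illustrative assumptions of mine, not measurements from any genome, and the sketch assumes a diploid population whose effective size equals its census size.

```python
import math

def fixation_probability(s, N):
    """Kimura's diffusion approximation for a new mutation (one copy among 2N).

    s: selection coefficient (negative = deleterious); hypothetical values only.
    N: diploid population size, assumed equal to the effective size.
    """
    if s == 0:
        # Neutral case: fixation probability equals the initial frequency.
        return 1.0 / (2 * N)
    # (1 - exp(-2s)) / (1 - exp(-4Ns)), written with expm1 for accuracy near s = 0.
    return math.expm1(-2 * s) / math.expm1(-4 * N * s)

if __name__ == "__main__":
    N = 1000  # illustrative population size
    neutral = fixation_probability(0.0, N)
    for s in (-1e-5, -1e-3, -1e-2):
        ratio = fixation_probability(s, N) / neutral
        print(f"s = {s:g}: fixation probability relative to neutral = {ratio:.3g}")
```

When |s| is much smaller than 1/(4N), the deleterious mutation fixes almost as often as a neutral one, which is the sense in which selection cannot purge it. Once |s| is well above that threshold, fixation becomes vanishingly rare.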

The blogorific brouhaha started with Larry Moran's response, and then Alex Palazzo jumped in. The biggest flaw with Makalowski's article, and one that Larry points out, is that Wojciech attempts to answer the question "What is junk DNA, and what is it worth?", but instead spends most of the time describing repetitive DNA. He never actually gets around to answering the question. Makalowski goes on to point out a bunch of examples of repetitive DNA being coopted (exapted?) by genomes to become transcriptional enhancers. At least he avoids saying that transposable elements don't become parts of protein coding genes (see here for why that's not surprising).

Even though some repetitive sequences become functional elements, most of those repeats are just filler. Junk. Not useful, but not bad enough to be worth eliminating. I'm sure there's an appropriate analogy, but I'm not clever enough to coin it. But even though the junk can perform fancy molecular tricks, like inducing rearrangements, it is still junk. You can think of those pieces of junk as mutational hotspots. The junk is isolated to regions of the genome where it can do as little harm as possible (the hypothetical analogy I would have introduced were I more clever would have been extended here). That is the evidence for selection on these sequences: it purges them from the neighborhood of functional regions.

I'm usually annoyed by uses of the term "junk DNA" in the popular literature because they treat the discovery of functional non-protein coding sequences as some big surprise. This time, I'm bothered by a treatment that suggests a function for all junk DNA. I guess I can't win.

I rather like the distinction between "junk" and "garbage" as originally coined.

The problem is, not knowing the function of a non-protein coding sequence does not mean the same as knowing that there is no function.

When we have found some non-protein coding sequence to be functional, did we not find it essentially by chance? Without being able to make better estimates of the percentage of other non-protein coding sequences that are functional?

Okay, so we know much of the Genome. We know much of the Proteome. We now know a lot of the Metabolome.

How does it all connect? How does it all correlate under a range of conditions that we don't know?

Calling the parts we don't know "Junk DNA" is like calling putative Dark Matter and Dark Energy "Junk Cosmos."

Remember the special issue of Nature -- was it their 125th anniversary issue? -- on "The Frontiers of Ignorance?"

That's roughly where we are. We don't know what "Junk DNA" does and doesn't do in given organisms.

We don't know what it is that we don't know.

We don't know how much there is of what we don't know, compared to what we do know.

What if what we don't know is more than we think it is?

That's not just a semantic issue, is it?

I'll let you in on a well-kept secret. The people who write about junk DNA aren't as stupid as you think we are.

Over the past 30 years we've accumulated a bus-load of evidence that large parts of the mammalian genome are truly junk. This isn't an argument from ignorance as you imply. We leave those sorts of arguments to the creationists.

To take just one example, think of pseudogenes. We're not just guessing that pseudogenes are non-functional; we have data. There are almost as many junk pseudogenes in the human genome as there are functional genes.

