You know that organisms develop, grow, and function in part because genes code for proteins that form the building blocks of life or that function as working bioactive molecules (like enzymes). You also know that most DNA is junk, only a couple percent actually coding for anything useful. Most importantly, however, you know that everything you know is wrong. Right?

The “Junk DNA” story is largely a myth, as you probably already know. DNA does not have to code for one of the few tens of thousands of proteins or enzymes known for any given animal, for example, to have a function. We know that. But we actually don’t know a lot more than that, or more exactly, there is not a widely accepted dogma for the role of “non-coding DNA.” It does really seem that scientists assumed for too long that there was no function in the DNA. But it is also true that the discovery of a function of non-coding DNA is intrinsically harder (in many cases) than discovering a function for a protein which is coded for by a particular gene. The protein is easier to deal with because it may be quite large (compared to a bit of RNA from a non-coding region of the genome) and often quite blatant (highly bioactive). In other cases, the protein may be organized in a way that is impossible to miss … like your hair for example. It could be argued that a hair is a single, huge molecule that does not require much sophistication of machinery to observe. Building-block protein … the proteins used to make tissues, cell membrane, etc. … is pretty hard to miss.

The difference between observing the function of bits of RNA from non-coding regions of the genome and observing the function of proteins is like trying to figure out by observing from a distance the meaning of mail being carried around by postal workers, in trucks, being delivered to office buildings, in a city with all kinds of infrastructure, etc. The people are enzymes, the trucks are transport proteins, the buildings and infrastructure are structural proteins, and that is all pretty obvious. But while it might be obvious that the mail being carried around by the postal workers has a function, figuring out what that function is requires a very different scale of observations. If early on in the process someone discovered that most of the “mail” is “junk” (you know, junk mail…) then you might start ignoring the mail and focus on the trucks and the roads and the different kinds of postal workers.

But increasingly, we understand that there is meaning in the mail, and a current paper in PNAS asserts that there may be meaning in the molecules known as non-coding RNA. Non-coding RNA, or ncRNA, is RNA that got there during transcription … when the DNA unzips and makes a template for the production of a protein … but does not itself get involved in the protein coding process. This RNA is floating around in living tissue all the time. If it were simply “artifact” … meaning it just fell off during protein coding, like the sawdust and bent useless nails that accumulate on your garage floor when you are building new bookshelves … then its distribution in the tissue would have a random pattern. If, on the other hand, certain ncRNAs showed up in certain tissues in a way not explained by the “sawdust” model, maybe they do have a function.

Here is part of the abstract from the paper by Mercer et al.:

A major proportion of the mammalian transcriptome comprises long RNAs that have little or no protein-coding capacity (ncRNAs). Only a handful of such transcripts have been examined in detail, and it is unknown whether this class of transcript is generally functional or merely artifact. … we identified 849 ncRNAs (of 1,328 examined) that are expressed in the adult mouse brain and found that the majority were associated with specific neuroanatomical regions, cell types, or sub cellular compartments. … Comparisons between the expression profiles of ncRNAs and their associated protein-coding genes revealed complex relationships that, in combination with the specific expression profiles exhibited at both regional and subcellular levels, are inconsistent with the notion that they are transcriptional noise or artifacts of chromatin remodeling. Our results show that the majority of ncRNAs are expressed in the brain and provide strong evidence that the majority of processed transcripts with no protein-coding capacity function intrinsically as RNAs.

The research was done by looking at data from an “atlas” – the Allen Brain Atlas (ABA) – of the mouse brain, which is a huge pile of information on DNA known to be expressed in each part of the mouse brain. (There are atlases for other species, including humans, and the present study also looked in the human atlas.)

One example of patterning in ncRNA involves antisense transcription (which is common in mammalian genomes):

Transcriptional profiling has shown that antisense transcription is prevalent in the mammalian genome (33), and several studies indicate its importance in regulating diverse biological functions. We identified 44 ncRNAs in the ABA that are antisense to the exons of protein-coding genes… These antisense ncRNAs often share varied and complex expression relationships with their sense protein-coding transcripts. For example, P-rex1, a gene involved in neuronal migration, and its antisense ncRNA partner are both expressed in the cerebral cortex (Fig. 2b). However, in the cerebellum, P-rex1 is specifically expressed in the Purkinje cell layer, whereas the associated antisense ncRNA is expressed within the granular and molecular layer (Fig. 2c).

That’s a little thick, but we can parse it. This observation as well as other observations made in this study are relevant because of the following hypothesis: ncRNA is an artifact, or a side effect, of gene expression. This predicts a random but correlated pattern of association … where a gene is expressed, the associated ncRNA bits should be found, more or less, like the sawdust from your bookshelves is found on the floor of your garage only when you make bookshelves in your garage. If the ncRNA is missing sometimes where related transcription happens, or if it is more abundant than it should be, depending on what tissue you look in (your garage vs. your bathroom… different tissues in this analogy), then this hypothesis is untenable. This would suggest that ncRNA is not functionless.
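
The logic of that test can be sketched in a few lines of Python. This is only a toy illustration: the presence/absence calls below are my simplified reading of the P-rex1 passage quoted above, and the names `PREX1_CALLS`, `discordant_regions`, and `fits_sawdust_model` are made up for the example. A real analysis would work with expression levels and proper statistics, not yes/no calls.

```python
# Toy presence/absence calls: region -> (gene expressed?, antisense ncRNA expressed?).
# A simplified, hypothetical encoding of the P-rex1 passage quoted above,
# not data taken from the paper.
PREX1_CALLS = {
    "cerebral cortex": (True, True),
    "cerebellum, Purkinje cell layer": (True, False),
    "cerebellum, granular layer": (False, True),
    "cerebellum, molecular layer": (False, True),
}


def discordant_regions(calls):
    """Return the regions where exactly one of the two transcripts is expressed."""
    return [region for region, (gene, ncrna) in calls.items() if gene != ncrna]


def fits_sawdust_model(calls):
    """True if the ncRNA shows up where, and only where, the gene is expressed."""
    return len(discordant_regions(calls)) == 0


if __name__ == "__main__":
    for region in discordant_regions(PREX1_CALLS):
        print("mismatch:", region)
    print("fits sawdust model:", fits_sawdust_model(PREX1_CALLS))
```

Under these toy calls the three cerebellar layers are mismatches, so the simple “sawdust” prediction fails for this gene. That is a reason to doubt the artifact hypothesis, but it does not by itself show what the ncRNA does.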

ADDED: I recommend this critique of the study at Genomicron.


Mercer, Tim R., Marcel E. Dinger, Susan M. Sunkin, Mark F. Mehler, and John S. Mattick. (2008) Specific expression of long noncoding RNAs in the mouse brain. PNAS 105(2): 716-721. Open access.

Comments

  1. #1 TR Gregory
    January 15, 2008

    Hi Greg,

    I actually don’t have any criticism of the study, I am just preparing for the inevitable hyperextrapolation to “see, it’s ALL functional” that is on the way, as always.

    Cheers,

    R

  2. #2 RPM
    January 15, 2008

    The “Junk DNA” story is largely a myth, as you probably already know.

    No, I don’t already know. The majority of the human genome is non-functional. Why not describe it as junk? Sure, a small fraction of the genome is functional (including protein coding genes, non-translated transcribed sequences, and transcriptional regulatory regions), but most of it can be best described as junk.

  3. #3 Greg Laden
    January 15, 2008

    It’s funny how people are so touchy about Junk DNA. It’s not like I killed your puppy or something…

    It may be that every single nucleotide that is not part of a codon that is ultimately transcribed and translated to an amino acid chained into a protein is absolutely without function. Then, the classical term “Junk DNA” fits.

    I do regret saying “The ‘Junk DNA’ story is largely a myth, as you probably already know. DNA does not have to code for one of the few tens of thousands of proteins or enzymes known for any given animal, for example, to have a function. We know that.” (even though it is utterly true if you include DNA that “codes for” things that are not proteins, control regions, and effects of genome size, and possibly other things such as suggested in this paper) and then following that with “we actually don’t know a lot more than that, or more exactly, there is not a widely accepted dogma for the role of ‘non-coding DNA.’” I should have made the second point use more words and the first point use fewer words, then reversed the order of the points. Then what I was saying would have been more obvious to people skimming the post…

    Jeesh… and I thought anthropologists were touchy…

  4. #4 Sigmund
    January 16, 2008

    RPM, while it is clear that the majority of the human genome is non-functional, what we are dealing with is not defined islands of functional elements (coding, functional RNA and non-coding regulatory sequences) separated by seas of non-functional DNA. What we are increasingly finding is large numbers of important regulatory elements scattered throughout the genome. Sequences that were once ‘junk’ can later acquire function through mutations of one kind or the other. While the whole question of ‘junk DNA’ is a semantic point for genomicists, it is not that way for other biologists and the general public. They won’t know enough about microRNAs, antisense genes, enhancers, repressors, insulator elements, matrix attachment regions – all of which are still being identified within DNA sequences that were not found to be protein encoding. What the public don’t understand is that genomicists have always assumed that such regulatory elements do exist within the non-coding sequences – we just haven’t characterized them sufficiently yet. This is the reason why the Discovery Institute has started trumpeting functional elements found within previously described ‘junk DNA’ as some sort of success of their predictive abilities. The

  5. #5 apalazzo
    January 16, 2008

    There’s a problem here and elsewhere. About 50-70% of the genome is continuously transcribed. Yet only 1-2% of it codes for proteins. So why? There may be some ncRNAs that may perform a specific task, and the authors here show that <1% of the ncRNA may be expressed in an interesting pattern. But it still looks like the rest doesn’t do much. In contrast there is plenty of evidence out there that the act of transcription is important to remodel how the DNA is packed. And this paper does not provide any evidence that the genome contains “mostly functional” ncRNAs.

  6. #6 apalazzo
    January 16, 2008

    (damn what happened to the rest of my comment?)

    the authors show that <1% of the genome is expressed as ncRNAs with an interesting pattern. There is no evidence here that “most of the genome is functional”. Another peeve – there is no functional data in this paper. How do we know that these bits of RNA are important? Perhaps the interesting patterns are just incidental to patterns of transcription factors plus lots of cryptic transcriptional start sites? Or perhaps it is just stable bits of transcripts that survive the various nuclear environments differently. If someone knocks one down and sees some phenotype then people will pay attention; as it stands these are only interesting observations.

  7. #7 Larry Moran
    January 16, 2008

    There’s a problem here and elsewhere. About 50-70% of the genome is continuously transcribed.

    The “problem” is that what you say is not correct. It has not been demonstrated that 50-70% of the (mammalian) genome is “continuously” transcribed in the sense that you imply.

  8. #8 apalazzo
    January 16, 2008

    Larry,

    I had a whole paragraph under that sentence developing an idea. That paragraph vanished. I then retyped it and it vanished again … I won’t try a third time.

  9. #9 Greg Laden
    January 16, 2008

    apalazzo: That is because your comments are being censored!

    Only kidding. I’ve looked at this end and there is nothing there. You must be having a problem with your browser or something.

  10. #10 windy
    January 16, 2008

    It may be that every single nucleotide that is not part of a codon that is ultimately transcribed and translated to an amino acid chained into a protein is absolutely without function. Then, the classical term “Junk DNA” fits.

    You are free to argue that “Junk DNA” is a misnomer but this is really quite a strawman. The first sentence can’t be true since we know that noncoding promoters and such things exist. There’s nothing in the classical definition of junk DNA about every single noncoding nucleotide having to be non-functional!

  11. #11 Paul Nelson
    January 16, 2008

    The main reason I read articles with titles like this one is to see if there are any new functions discovered in the ncRNA (and ncDNA). I am sure many, many more will be discovered as time goes on, but one thing that strikes me – and never seems to be mentioned – is that our DNA – all of it, the coding parts (open reading frames), ribosomal parts, parts that are transcribed into miRNAs, SNPs, etc – as well as all the so called “Junk” – are part of our genome – our chromosomes. This makes up the “whole body of the genome”, and this is necessary for Meiosis and Mitosis. And that is what keeps us alive, as individuals and as a species. By analogy, it’s like thinking that the plumbing, wiring, doors, windows, furniture and all the accoutrements like computers, TVs etc. in your house are not “Junk” but everything else is. Try living in a house without bricks and mortar, or a roof. Or a floor. Or wooden framing. Might get a bit chilly in wintertime.