You know that organisms develop, grow, and function in part because genes code for proteins that form the building blocks of life or that function as working bioactive molecules (like enzymes). You also know that most DNA is junk, only a couple percent actually coding for anything useful. Most importantly, however, you know that everything you know is wrong. Right?
The “Junk DNA” story is largely a myth, as you probably already know. DNA does not have to code for one of the few tens of thousands of proteins or enzymes known for any given animal, for example, to have a function. We know that. But we actually don’t know a lot more than that, or more exactly, there is not a widely accepted dogma for the role of “non-coding DNA.” It does really seem that scientists assumed for too long that there was no function in the non-coding DNA. But it is also true that discovering a function of non-coding DNA is intrinsically harder (in many cases) than discovering a function for a protein that is coded for by a particular gene. The protein is easier to deal with because it may be quite large (compared to a bit of RNA from a non-coding region of the genome) and often quite blatant (highly bioactive). In other cases, the protein may be organized in a way that is impossible to miss … like your hair for example. It could be argued that a hair is a single, huge molecule that does not require much sophistication of machinery to observe. Building-block protein … the proteins used to make tissues, cell membrane, etc. … is pretty hard to miss.
The difference between observing the function of bits of RNA from non-coding regions of the genome and observing the function of proteins is like trying to figure out by observing from a distance the meaning of mail being carried around by postal workers, in trucks, being delivered to office buildings, in a city with all kinds of infrastructure, etc. The people are enzymes, the trucks are transport proteins, the buildings and infrastructure are structural proteins, and that is all pretty obvious. But while it might be obvious that the mail being carried around by the postal workers has a function, figuring out what that function is requires a very different scale of observations. If early on in the process someone discovered that most of the “mail” is “junk” (you know, junk mail…) then you might start ignoring the mail and focus on the trucks and the roads and the different kinds of postal workers.
But increasingly, we understand that there is meaning in the mail, and a current paper in PNAS asserts that there may be meaning in the molecules known as non-coding RNA. Non-coding RNA, or ncRNA, is RNA that got there during transcription … when the DNA unzips and makes a template for the production of a protein … but does not itself get involved in the protein coding process. This RNA is floating around in living tissue all the time. If it were simply “artifact” … meaning it just fell off during protein coding, like the sawdust and bent useless nails that accumulate on your garage floor when you are building new bookshelves … then its distribution in the tissue would have a random pattern. If, on the other hand, certain ncRNAs showed up in certain tissues in a way not explained by the “sawdust” model, then maybe they do have a function.
Here is part of the abstract from the paper by Mercer et al.:
A major proportion of the mammalian transcriptome comprises long RNAs that have little or no protein-coding capacity (ncRNAs). Only a handful of such transcripts have been examined in detail, and it is unknown whether this class of transcript is generally functional or merely artifact. … we identified 849 ncRNAs (of 1,328 examined) that are expressed in the adult mouse brain and found that the majority were associated with specific neuroanatomical regions, cell types, or subcellular compartments. … Comparisons between the expression profiles of ncRNAs and their associated protein-coding genes revealed complex relationships that, in combination with the specific expression profiles exhibited at both regional and subcellular levels, are inconsistent with the notion that they are transcriptional noise or artifacts of chromatin remodeling. Our results show that the majority of ncRNAs are expressed in the brain and provide strong evidence that the majority of processed transcripts with no protein-coding capacity function intrinsically as RNAs.
The research was done by looking at data from an “atlas” – the ABA – of the mouse brain, which is a huge pile of information on DNA known to be expressed in each part of the mouse brain. (There are atlases for other species, including humans, and the present study also looked in the human atlas.)
An example of patterning in ncRNA involves antisense transcription (which is common in the mammalian genome):
Transcriptional profiling has shown that antisense transcription is prevalent in the mammalian genome (33), and several studies indicate its importance in regulating diverse biological functions. We identified 44 ncRNAs in the ABA that are antisense to the exons of protein-coding genes… These antisense ncRNAs often share varied and complex expression relationships with their sense protein-coding transcripts. For example, P-rex1, a gene involved in neuronal migration, and its antisense ncRNA partner are both expressed in the cerebral cortex (Fig. 2b). However, in the cerebellum, P-rex1 is specifically expressed in the Purkinje cell layer, whereas the associated antisense ncRNA is expressed within the granular and molecular layer (Fig. 2c).
That’s a little thick, but we can parse it. This observation as well as other observations made in this study are relevant because of the following hypothesis: ncRNA is an artifact, or a side effect, of gene expression. This predicts a random but correlated pattern of association … where a gene is expressed, the associated ncRNA bits should be found, more or less, like the sawdust from your bookshelves is found on the floor of your garage only when you make bookshelves in your garage. If the ncRNA is missing sometimes where related transcription happens, or if it is more abundant than it should be, depending on what tissue you look in (your garage vs. your bathroom… different tissues in this analogy), then this hypothesis is untenable. This would suggest that ncRNA is not functionless.
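The logic of that test can be sketched in a few lines of Python. This is only a toy illustration, not the method used by Mercer et al.: the presence/absence calls below are a simplified reading of the P-rex1 example quoted above (I am assuming that "specifically expressed" in one layer means absent from the others, which the paper does not state in those words), and the function names and the tolerance parameter are made up for this sketch.

```python
# Toy presence/absence calls for one gene and its antisense ncRNA.
# ASSUMPTION: simplified from the P-rex1 passage quoted above; these are
# not actual values taken from the brain atlas.
expression = {
    "cerebral cortex": {"gene": True, "ncrna": True},
    "cerebellum, Purkinje cell layer": {"gene": True, "ncrna": False},
    "cerebellum, granular layer": {"gene": False, "ncrna": True},
    "cerebellum, molecular layer": {"gene": False, "ncrna": True},
}


def discordant_regions(calls):
    """Return the regions where the gene and its ncRNA do not co-occur."""
    return [region for region, c in calls.items() if c["gene"] != c["ncrna"]]


def fits_sawdust_model(calls, tolerance=0):
    """The 'sawdust' (artifact) model predicts the ncRNA shows up where,
    and only where, its gene is expressed. The hypothetical `tolerance`
    allows a few mismatches to be written off as measurement noise."""
    return len(discordant_regions(calls)) <= tolerance


if __name__ == "__main__":
    for region in discordant_regions(expression):
        print("mismatch:", region)
    print("consistent with sawdust model:", fits_sawdust_model(expression))
```

With these toy calls, the gene and the ncRNA part ways in all three cerebellar layers, so the sawdust model fails. A real analysis would of course need expression levels, many gene/ncRNA pairs, and a proper statistical test rather than a simple mismatch count.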
ADDED: I recommend this critique of the study at Genomicron.
Mercer, Tim R., Marcel E. Dinger, Susan M. Sunkin, Mark F. Mehler, and John S. Mattick. (2008) Specific expression of long noncoding RNAs in the mouse brain. PNAS, January 15, 2008, vol. 105, no. 2, pp. 716-721. Open Access Article.