If you missed it, today’s NY Times Science section has been dedicated to “The Gene”, a concept invented 99 years ago by Wilhelm Johannsen.
Overall, the articles were very good; however, as a scientist who wants to explain basic concepts of molecular biology to the masses, I have a few problems.
First, there is a misplacement of emphasis on how information flows from DNA to phenotype. The idea that the articles try to convey is that the old model went along these lines: DNA contains genes, each is copied into RNAs that are then translated into a certain type of protein … and then presto the end result is a fully formed organism. Now apparently the new model is that the DNA encodes more than genes, that it has all sorts of weird stuff, mostly noncoding RNAs, and that there is mass confusion in the biomedical sciences. There is also this thing called epigenetics (as in DNA methylation and histone modification), so our simple ideas have to be thrown out the window.
To this I say, WOT?
First of all there hasn’t been a clear paradigm shift in the biomedical sciences. In fact our view has essentially remained unchanged since the 1970s. DNA encodes three different types of information.
1 – Protein-coding genes. These are the “classic” genes that get transcribed into RNA that is subsequently spliced, processed and then exported into the cytoplasm where it is translated into proteins. These genes are highly conserved and contain ALL the information needed to make proteins. Proteins act as the tools, machines and scaffolds that are found inside and outside of the cell. They are highly versatile and have extremely complicated functions. They are modified, transported, and eventually destroyed. In any biological process, such as cell migration or cholesterol biosynthesis, proteins are the main players that determine how these activities will proceed. Most biologists out there study what proteins actually do.
2 – Genes that specify non-coding RNAs (ncRNAs). Here is where much of the hoopla has centered. These genes produce two types of RNAs: catalytic RNAs, known as ribozymes, that act like molecular machines, and non-catalytic small RNAs that modify the expression of classic genes.
The first class of RNAs, which includes ribosomal RNAs, tRNAs and snRNAs, comes from the most ancient genes in all of biology. They have been known to the biological community for the past 40 years and have some of the most important activities inside the cell, such as protein synthesis and RNA splicing. Are there other catalytic RNAs? There must be, but few have been found. Protein enzymes are just better – they are smaller and much more versatile than their RNA counterparts. The RNA enzymes that are still with us are probably too central to biological function for them to be replaced. There is by far more ribosomal RNA in every cell than there is DNA or protein. We are basically huge ribosome creatures. Our cells spend most of their efforts making ribosomes and regulating ribosomal function.
So now we come to the second class of ncRNA genes, those that specify the newly discovered small regulatory RNAs. Although proteins are generally more versatile, RNA does have one advantage: it can pair up quite easily with complementary sequences found on other RNA or DNA molecules. These 20ish-nucleotide-long RNA creatures employ this advantage to help recognize target RNA and/or DNA sequences that are then acted on by the real enzymes, proteins. So when a miRNA binds to a region of an mRNA, the RISC protein complex can then direct this mRNA to P-bodies where the RNA is silenced. Fundamentally, miRNAs regulate how mRNAs are translated.
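To make the base-pairing idea concrete, here is a minimal Python sketch of how a short RNA could "recognize" a target purely by complementarity. It is a toy model under loud assumptions: the function names and both sequences are hypothetical, only perfect Watson-Crick pairs count (no G-U wobble, no mismatches), and nothing here models RISC or any other protein.

```python
from typing import List

# Watson-Crick partners for RNA bases (assumption: wobble pairs are ignored).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the sequence that pairs with `rna`, written 5'->3'."""
    return "".join(PAIR[base] for base in reversed(rna))


def find_pairing_sites(small_rna: str, target: str) -> List[int]:
    """Return 0-based start positions in `target` where `small_rna` pairs perfectly."""
    site = reverse_complement(small_rna)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        # Resume one base further along so overlapping sites are also found.
        start = target.find(site, start + 1)
    return positions


# Hypothetical toy sequences, for illustration only.
small_rna = "ACGGUCAU"
target = "CCCAUGACCGUGGGAUGACCGU"
print(find_pairing_sites(small_rna, target))  # [3, 14]
```

The point of the sketch is only that the small RNA supplies the address; in the cell, what happens at that address is still the job of proteins.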
In addition to these two types of RNAs, people have noticed that there is all this non-specific RNA being transcribed off of non-conserved sequence. Most scientists believe that the content of this RNA is probably not important, although some believe that the act of transcription itself may play a role in regulating how the genome is organized.
That leads us to the last important bit of genomic code … the one that is constantly being ignored in popular science.
3 – DNA elements that modulate how genes are transcribed into RNA and how the genome is organized. There are promoters, enhancers, silencers and many other functional DNA bits. And this is the part that irritates me the most. These elements have been known forever!!!! But journalists and certain bioinformatics specialists either ignore or downplay them, or are simply ignorant of their existence. Take a look at this passage from Carl Zimmer’s article:
As part of the Encode project, scientists identified the location of variations in DNA that have been linked to common diseases like cancer. A third of those variations were far from any protein-coding gene. Understanding how noncoding RNA works may help scientists figure out how to use drugs to counteract genetic risks for diseases.
NO!!! The vast majority of these mutations that are away from protein-coding genes probably map to these DNA elements.
In some ways these elements are the most interesting bits of the genome but also some of the most complicated. They are not only ill-defined, but in addition we have no good way to predict how they will influence the transcription of nearby genes into RNA. This is the true black box of the genome. What the ENCODE results suggested was that these DNA elements are highly conserved, and make up a significant chunk of the genome (at the very least, the same % as the coding bits). The difference between a neuron and a liver cell is mostly due to the protein content, and this in turn is dictated mostly by how these DNA elements activate or repress the transcription of nearby genes. Sure, small ncRNAs and epigenetic mechanisms modulate gene expression, but a HUGE part of the picture lies in how these ill-defined DNA regulatory elements affect the transcription of protein-coding and non-coding genes.
Sure, it is unclear how transcription, epigenetic marks and DNA elements talk to each other to generate expression patterns, but DNA elements are a fundamental part of the puzzle, one that does not register in our public understanding of biological systems.