More "Junk" DNA is Not

By gregladen on December 12, 2007.

Some of the base pairs in a given genome are strung together into templates that code for proteins or RNA molecules. These are the classic "genes." Other base pairs probably have little or no function. Among the DNA that is not in classic gene-templates, however, there is a lot of important information, including "control regions."

How much of each "type" of DNA exists in a particular genome varies. A recent study suggests that the currently used methods for scanning DNA for regulatory sequences may systematically miss more than half of that information.

Looking specifically at the DNA surrounding the zebrafish phox2b gene, McGaughey et al., using five different measures of "evolutionary constraint" (indicating functionality of a particular DNA sequence), found that each method misses regulatory sequences. They estimate that between 29 and 61% of actual regulatory elements are missed by these commonly used methods. They conclude:

the noncoding functional component of vertebrate genomes may far exceed estimates predicated on evolutionary constraint.

When searching for non-coding but still functional (as regulatory) sequences, genetic researchers develop "training sets" of known functional sequences that software then uses to find matches elsewhere. The training sets are developed by messing around with an organism's DNA and observing the effects of this treatment on development. It is painstaking work, and it may be impractical to carry out this work on just any organism. Some organisms make better lab subjects than others owing to availability, the difficulties of providing the proper developmental environment, etc.

A key assumption of this method is that a given DNA sequence that is functional (a regulatory sequence) will not vary randomly across distantly related organisms because it is under selection.

The information derived from this detailed work can then be structured into statistical probes that can be applied with software to large genetic databases. In theory, a group of likely regulatory sequences would match homologous (similar by common descent) sequences in a wide range of organisms.
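To make the idea of a conservation "probe" concrete, here is a minimal sketch, assuming two sequences that have already been aligned to the same length. It slides a window along the alignment and reports windows whose percent identity clears a threshold. The function name, window size, and threshold are illustrative assumptions, not the settings of any of the five programs used in the study.

```python
def conserved_windows(seq_a, seq_b, window=10, threshold=0.85):
    """Scan a pairwise alignment and return (start, identity) for each
    window whose fraction of identical, non-gap positions >= threshold.

    seq_a and seq_b are assumed to be pre-aligned strings of equal length,
    with '-' marking gaps. All parameter values are hypothetical.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(len(seq_a) - window + 1):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        # Count positions where both sequences carry the same base.
        matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
        identity = matches / window
        if identity >= threshold:
            hits.append((start, identity))
    return hits


# Toy alignment: the first 10 bases are identical, the rest differ entirely.
example_a = "ACGTACGTAC" + "TTTTTTTTTT"
example_b = "ACGTACGTAC" + "GGGGGGGGGG"
example_hits = conserved_windows(example_a, example_b)
```

A real regulatory element that has drifted below the threshold would simply not appear in the list of hits, which is the kind of miss the study set out to measure.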

For this study, the researchers essentially tested the assumptions of this model by bringing it (the model) back to the lab bench. They found that each of the five methods missed a share of the actual on-the-ground (in the petri dish, as it were) regulatory sequences, at the rates cited above.
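The comparison between predictions and bench results can be sketched as an interval-overlap count: what fraction of the experimentally validated elements is touched by at least one predicted conserved region? The coordinates below are made up for illustration and are not taken from the paper; the function names are likewise hypothetical.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end)
    share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def detection_rate(validated, predicted):
    """Fraction of validated elements overlapped by any predicted interval."""
    if not validated:
        raise ValueError("need at least one validated element")
    found = sum(
        1 for v in validated if any(overlaps(v, p) for p in predicted)
    )
    return found / len(validated)


# Hypothetical coordinates (NOT data from the study).
validated = [(100, 300), (500, 650), (900, 1000), (1500, 1700)]
predicted = [(250, 400), (950, 1200)]

found_fraction = detection_rate(validated, predicted)
missed_fraction = 1 - found_fraction
```

Running this once per prediction method, against the same validated set, gives one detection rate per method, which is the shape of the comparison described above.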

In this study, the researchers sliced up the DNA sequence around the phox2b gene into small bits and tagged each bit with a fluorescent protein. Then they inserted each bit into zebrafish embryos. If the bit being tested turned out to be a regulatory sequence, it would become part of the growing tissues and the glowing protein would be visible. If not, no glowing. This resulted in several cool images such as this one:

(From Figure 1. In situ hybridization (ISH) of endogenous phox2b expression. ISH was performed on wild-type zebrafish embryos from 24 to 96 hpf using a dig-labeled phox2b RNA probe. (a) Dorsal view of 24-hpf embryo, illustrating phox2b expression in the hindbrain and anterior spinal column. (b) Dorsal view of 48-hpf embryo. hb, hindbrain; mo, medulla oblongata; ventral diencephalon (filled arrowhead), locus coeruleus (open arrowhead). (c) Lateral view of 48-hpf embryo. Rhombomeres of the hindbrain are numbered; mo, medulla oblongata; locus coeruleus (open arrowhead); cranial ganglia (black arrow). (d) Lateral view of the trunk of a 96-hpf embryo. Spinal cord (open dotted arrow) and ENS (open arrows).)

They uncovered a total of 17 discrete DNA segments that had the ability to make fish glow in the right cells. The team then analyzed the entire region around the phox2b gene using the five commonly used computer programs that compute sequence conservation; these established methods picked up only 29 percent to 61 percent of the phox2b regulators that McCallion's team identified in the zebrafish experiments.

The situation is further complicated by the phylogenetic distances that exist among living organisms. It can be safely assumed that although a particular regulatory sequence is constrained as to what its exact composition (of base pairs) can be, there is also room for variation, some random, some adaptive. This variation can be expected to increase, on average, with the phylogenetic distance between any two organisms. A statistical probe based on one set of organisms (say, one species of zebrafish) should work very well when applied to closely related organisms (say, some other species of zebrafish). But applying the same search method to more distantly related fish, such as fugu (puffer fish), would be different. These two types of fish separated about 350 million years ago, so there is a cumulative 700 million years of "evolutionary time" separating them. Using data derived from zebrafish to probe mammalian genomes would be even more extreme.
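The arithmetic behind "cumulative evolutionary time", and the way expected sequence difference grows with it, can be sketched as follows. The Jukes-Cantor (JC69) formula used here is a standard textbook model for unconstrained sequence; it is a swapped-in illustration, not the method used in the paper, and the substitution rate below is a purely hypothetical placeholder.

```python
import math


def cumulative_time(split_mya):
    """Both lineages evolve independently after the split, so the total
    branch length separating them is twice the time since divergence."""
    return 2 * split_mya


def jc69_expected_difference(subs_per_site):
    """Expected fraction of sites that differ between two sequences under
    the Jukes-Cantor model, given total substitutions per site."""
    return 0.75 * (1.0 - math.exp(-4.0 * subs_per_site / 3.0))


# Zebrafish vs. fugu: about 350 million years since the split (from the text).
total_my = cumulative_time(350)

# HYPOTHETICAL neutral rate in substitutions per site per million years,
# chosen only to show the shape of the curve.
assumed_rate = 0.001
neutral_difference = jc69_expected_difference(assumed_rate * total_my)
```

Under this model the expected difference rises with distance and levels off at 75%, the point where two sequences are no more alike than chance. A sequence under selection should sit well below the neutral curve, which is what conservation-based methods look for; an element that tolerates more change than expected slips past them.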

According to one of the paper's authors:

The problem with this approach ... is that it's often throwing the baby out with the bath water. So while we believe sequence conservation is a good method to begin finding regulatory elements, to fully understand our genome we need other approaches to find the missing regulatory elements.

Our data supports the recent NIH encyclopedia of DNA elements project, which suggests that many DNA sequences that bind to regulatory proteins are in fact not conserved. I hope this pilot shows that these types of analyses can be worthwhile, especially now that they can be done quickly and easily in zebrafish.

Sources

McGaughey, David M., Ryan M. Vinton, Jimmy Huynh, Amr Al-Saif, Michael A. Beer, and Andrew S. McCallion. Metrics of sequence constraint overlook regulatory sequences in an exhaustive analysis at phox2b. Genome Res. Published December 10, 2007, 10.1101/gr.6929408.

Johns Hopkins Press Release

One possible criticism of their work: is it possible that a particular isolated sequence might affect expression of GFP (or whatever they used), yet NOT be important for regulating phox2b?

I vaguely recall studies claiming that if you insert completely random bits of DNA upstream of a promoter-less reporter gene, a surprisingly high fraction will result in detectable expression.
