In an effort to help us define a few basic concepts, PZ started out by giving us a nice simple definition of a gene, but as he rightly noted:
I tell you right now that if I asked a half dozen different biologists to help me out with this, they'd rip into it and add a thousand qualifiers, and it would never get done.
Well, okay, technically speaking he didn't ask me for help. But, since I'm a biologist, as soon as I looked at the definition that he chose, from Modern Genetic Analysis (by Griffiths, Lewontin, Miller, and Gelbart), I couldn't help but find something wrong.
The definition from Griffiths et al. reads:
A gene is an operational region of the chromosomal DNA, part of which can be transcribed into a functional RNA at the correct time and place during development. Thus, the gene is composed of the transcribed region and adjacent regulatory regions.
But that sounds like a very nice definition and I know you agree with most of it. What is it that you don't like?
I have a problem with the words "chromosomal" and "DNA," and, I guess, even the phrase "the correct time and place during development."
Why are they problems?
First, the easy part. I disagree with the phrase "correct time and place during development" because of cancer cells. Cancer cells contain genes that do not behave nicely. The genes involved in turning a cell from the right path and onto the path of cancer are not getting transcribed at the correct time and place in development. Quite the contrary.
But they are still genes. So, it's easy to cut that part out of the definition.
Next, let's take the terms "chromosomal" and "DNA."
I object to those words because genes can be found on pieces of nucleic acid that are not part of a chromosome, and on some that are not DNA at all. Sometimes the extra-chromosomal nucleic acids are plasmids, sometimes they are viruses. Sometimes they are made of DNA, sometimes they are made of RNA. But they still contain genes.
Alright, then, what's your definition?
A gene is a heritable string of nucleotides that can be transcribed, creating a molecule with biological activity.
[1/22/2007 Added the term "heritable."]
I think this works because it covers all the exceptions and tricky situations that I know about.
The trickiest point of debate in my definition of a gene is whether or not we include the regulatory sequences. In practice, when we talk about cloning a gene, we consider a gene to be cloned when we've put the functional part of it into some other DNA molecule (a plasmid or virus). Cloning a "gene" doesn't require cloning the promoter, enhancer, or other kinds of regulatory sequences.
If we compare a gene to a light bulb, the regulatory sequences would be the switch that turns the light on and off, along with the dimmer mechanism that we use to control brightness. If I use a lamp as a metaphor, then, to me, the gene is the light bulb, and the regulatory sequences are the pieces that determine whether or not the light gets turned on.
And a pseudogene would be a light bulb that's missing the filament.
As I pointed out in the thread chez PZ, I think that the regulatory sequences have to be included. Promoter sequences may be hard to define. However, the same holds true for your definition because, at least in eukaryotes, transcription termination is not well defined. PolyA signals at which the primary transcript is truncated and polyadenylated can be easily identified. Still, what do you do in cases of alternatively used polyA signals? Are you dealing with two different genes then?
In addition, your claims
rather refer to cDNA cloning. If you want to interfere with a gene's function in vivo you indeed have to study its complete structure. If you use a cDNA in vivo you have to complement the missing endogenous functions of the gene by the addition of a promoter and a polyA signal. The only in vivo methods I know that don't require the addition of regulatory sequences are posttranscriptional gene silencing methods like anti-sense RNA, RNAi or intrabodies.
Having a light bulb may be fine. However, it only makes sense when you also have a lamp, a switch and a power line.
At the end of the day everybody uses the gene definition that is most appropriate for his work.
As you say, Sparc:
Yes, indeed. In many cases, people who clone genes are not cloning cDNA. Many eukaryotic genes (in C. elegans, protozoans, and yeast) and many viral genes do not have introns, and there are only a few introns in bacterial genes, so you can't assume that cloning DNA is synonymous with cloning cDNA.
When I define a "gene" I am not writing about all the pieces that you need to make a gene work, I'm writing about the way we use the word in biotechnology.
If you look a bit at PubMed, you can see all kinds of cases where people have, for example, cloned the GFP gene [GFP = green fluorescent protein] into constructs that include the regulatory signals, i.e. the promoter and polyA signals. Often, they're investigating the signals that turn on the activity of a specific promoter.
If you look at how people describe these activities, they do not describe the work as cloning part of the GFP gene into a construct; they almost always describe it as cloning the GFP gene into a construct.
So, while I agree that in a cell you would never find a gene sequence functioning all by itself, just as you would never find a light bulb glowing on its own without the lamp, when we talk to each other and when we publish in scientific journals, we tend to discuss the regulatory signals that control gene expression as entities that are separate from the coding signals, which are the gene.
Now, about your question of whether alternatively polyadenylated transcripts should be considered one gene or two, well, I think that deserves its own post.
The question remains, though, whether they used the term gene properly. IMO, cloning the cDNA of a gene should be differentiated from the isolation of a complete gene. Even leaving aside the regions outside the transcribed sequences, you still run into problems with including UTRs and introns in your definition.
Another issue one should keep in mind is that genes are categorized according to the polymerase used to transcribe them (PolI, PolII and PolIII genes). Since each polymerase requires distinct promoter sequences (and different termination signals), it appears useful to include regulatory sequences in the gene definition.
PS: I don't like commenting on the same issue on different blogs too much. Therefore I will also put this comment on PZ's blog and continue there.
"A gene is a string of nucleotides that can be transcribed, creating a molecule with biological activity."
This definition says nothing whatever about heritability, and I therefore find it misleading, at best. Of course, PZ's definition only implies heritability through including the word "chromosomal."
The real "basics" definition of gene should probably include something along the lines of "the smallest unit of biochemical information capable of being transmitted from generation to generation of an organism."
Chezjake,
you raise a good point about heritability. I will modify the definition, then, to say:
The idea that it would be "the smallest unit of biochemical information" wouldn't be correct, since that would include proteins and things like prions. I think that while prions can replicate, in a sense, it's not correct to call them genes.
Thank you, Sandra.
I'd temporarily forgotten about prions, although I'm not sure they qualify as "organisms" either.
There's another "basics" question -- how do you define an organism? Most dictionary definitions would seem to exclude viruses. Is this appropriate?