Introns are parts of a gene that do not contain coding information; they have to be spliced out of precursor RNA to form mature messenger RNA (mRNA). Ask most biologists and they'll tell you that in "higher eukaryotes" all genes have introns. All? They may reply, "Well, not quite". The most famous examples of intronless genes are the histone genes. Many tRNA genes are also intronless. But just how many intronless genes are there in the human genome? Well, I just stumbled onto this site: Genome SEGE - Intronless Genes in Eukaryotes.
Here's a couple of graphs from the SEGE (Single Exonic Genes in Eukaryotes) website:
These stats were compiled in 2004; I'll have to do some searches in PubMed/NCBI to see what the current count is.
Here's an explanation for why we see different numbers (and sizes) of introns in different taxa.
From the website you link to:
1. Single exon genes are identified using the CDS FEATURE definition in the genome data.
It seems like they're ignoring introns located in the 5' and 3' UTRs.
2. Potential pseudogenes are identified by scanning for a polyA tail within 1000 nt from the stop codon and are preceded by a polyA signal.
I think they're trying to say that they excluded recently retroposed genes from their data. I don't know why they refer to them as pseudogenes. Pseudogenes don't necessarily need to be retroposed and retroposed genes aren't always pseudogenes.
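The two criteria quoted above can be sketched in a few lines of Python. This is only an illustration, not the SEGE pipeline: the function names, the GenBank-style location strings, the `AATAAA` signal motif and the minimum tail length of 15 are all assumptions made for the example. Note that, exactly as pointed out above, counting CDS segments says nothing about introns in the UTRs.

```python
import re

def cds_segments(location: str) -> int:
    """Count coordinate ranges in a GenBank-style CDS location string.

    Assumes ranges look like 100..200, optionally with < or > markers.
    Single-base segments (e.g. "join(5,10..20)") are not handled.
    """
    return len(re.findall(r"<?\d+\.\.>?\d+", location))

def is_single_exon_cds(location: str) -> bool:
    """True if the CDS is one uninterrupted range (UTR introns are invisible here)."""
    return cds_segments(location) == 1

def looks_retroposed(downstream: str, window: int = 1000, min_tail: int = 15) -> bool:
    """Hypothetical check for a polyA signal followed by a polyA run.

    `downstream` is the genomic sequence starting right after the stop codon.
    The motif and `min_tail` are illustrative assumptions, not SEGE's values.
    """
    region = downstream[:window].upper()
    signal = region.find("AATAAA")
    if signal == -1:
        return False
    # Look for a run of A's anywhere after the first signal, inside the window.
    return "A" * min_tail in region[signal + 6:]
```

For example, `is_single_exon_cds("join(100..200,300..450)")` is false because the coding sequence is split across two ranges, while `is_single_exon_cds("complement(1200..2300)")` is true even if the gene had an intron in its 5' UTR.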
Check your timestamp, Doug! This is a bio-blog, not a physics one. No time travelling!
Sorry about the time travelling - was going to post this tomorrow. I forgot to change the "published" mode to "scheduled" mode.
I'm sure that the real number must be lower than 10% of all genes. Also, the total # of genes they survey is close to 30,000 ... I'll have to check whether many of these "intronless genes" have been recently classified as pseudogenes. The last count of the # of genes in the human genome (according to Eric Lander) is ~19,000.
Lots of math in that paper! Funny enough, Lynch mentions that there are too many introns to be explained by neutral mechanisms. The surplus of introns was thus likely caused by positive selection for introns. Why? You need introns for NMD (nonsense mediated decay) which weeds out genes that accumulate premature stop codons (incidentally there are very few genes with introns in the 3' UTR ... such genes would be effectively destroyed by the NMD machinery). In contrast to vertebrates, NMD is intron independent in yeast and flies ... it is thought that NMD was intron dependent in the earliest eukaryotic ancestor, as plants also use introns to trigger NMD. Another point that he doesn't address is that in higher eukaryotes, the act of splicing catalyzes nuclear export. Thus introns serve to enhance expression. This may have evolved as a defence against the expression of genes derived from foreign pathogens such as viruses that have compact genomes. With these two phenomena it becomes harder to see how there could be so many intronless genes in the human genome.
Why? You need introns for NMD (nonsense mediated decay) which weeds out genes that accumulate premature stop codons
Are you sure about this? I assume that it is more likely that the majority of premature stop codons are generated by wrong splicing. Thus, NMD would account for the elimination of wrongly spliced transcripts rather than act as a control reaction against PTCs fixed in the genome.
In addition, there was a recent paper (unfortunately I can't remember the authors) claiming that alternative splicing is a means to downregulate the majority of transcripts via NMD as a reaction to stress.
A reference to my claim
it is more likely that the majority of premature stop codons are generated by wrong splicing.
Hillman RT, Green RE, Brenner SE (2004): An unappreciated role for RNA surveillance.
Genome Biol. 5(2):R8
BACKGROUND: Nonsense-mediated mRNA decay (NMD) is a eukaryotic mRNA surveillance mechanism that detects and degrades mRNAs with premature termination codons (PTC+ mRNAs). In mammals, a termination codon is recognized as premature if it lies more than about 50 nucleotides upstream of the final intron position. More than a third of reliably inferred alternative splicing events in humans have been shown to result in PTC+ mRNA isoforms. As the mechanistic details of NMD have only recently been elucidated, we hypothesized that many PTC+ isoforms may have been cloned, characterized and deposited in the public databases, even though they would be targeted for degradation in vivo. RESULTS: We analyzed the human alternative protein isoforms described in the SWISS-PROT database and found that 144 (5.8% of 2,483) isoform sequences amenable to analysis, from 107 (7.9% of 1,363) SWISS-PROT entries, derive from PTC+ mRNA. CONCLUSIONS: For several of the PTC+ isoforms we identified, existing experimental evidence can be reinterpreted and is consistent with the action of NMD to degrade the transcripts. Several genes with mRNA isoforms that we identified as PTC+--calpain-10, the CDC-like kinases (CLKs) and LARD--show how previous experimental results may be understood in light of NMD.
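The 50-nucleotide rule stated in the abstract above can be written out as a small Python check. This is a minimal sketch under stated assumptions: the function name, the use of exon lengths in mRNA coordinates, and measuring the distance from the first nucleotide of the stop codon are choices made for illustration, not details taken from the paper.

```python
def is_ptc_plus(exon_lengths, stop_pos, threshold=50):
    """Apply the mammalian 50-nt rule to one spliced transcript.

    exon_lengths: exon lengths in transcript order (mRNA coordinates).
    stop_pos: 0-based mRNA position of the first nt of the stop codon
              (an assumption; the exact reference point is not specified here).
    Returns True if the stop codon lies more than `threshold` nt upstream
    of the final exon-exon junction.
    """
    if len(exon_lengths) < 2:
        # No introns were removed, so there is no junction to measure from.
        return False
    last_junction = sum(exon_lengths[:-1])  # position where the last exon begins
    return last_junction - stop_pos > threshold
```

With exons of 300, 200 and 400 nt, the last junction sits at position 500: a stop codon at position 100 is flagged as premature, while one at position 480 (only 20 nt upstream) or anywhere in the last exon is not. A single-exon transcript is never flagged by this rule, which is why intronless genes are interesting in this discussion.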
I may be wrong about the stress issue. Still, there are hypotheses coupling NMD with regulation of gene expression rather than elimination of genetically fixed PTCs:
Lewis BP, Green RE, Brenner SE (2003): Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans.
Proc Natl Acad Sci U S A. 100(1):189-92
To better understand the role of alternative splicing, we conducted a large-scale analysis of reliable alternative isoforms of known human genes. Each isoform was classified according to its splice pattern and supporting evidence. We found that one-third of the alternative transcripts examined contain premature termination codons, and most persist even after rigorous filtering by multiple methods. These transcripts are apparent targets of nonsense-mediated mRNA decay (NMD), a surveillance mechanism that selectively degrades nonsense mRNAs. Several of these transcripts are from genes for which alternative splicing is known to regulate protein expression by generating alternate isoforms that are differentially subjected to NMD. We propose that regulated unproductive splicing and translation (RUST), through the coupling of alternative splicing and NMD, may be a pervasive, underappreciated means of regulating protein expression.
Good point, sparc. Thanks for the refs. I must say that if a transcript is targeted for degradation by NMD, it is likely that it will be destroyed before researchers can isolate the cDNA. Despite what I wrote (that PTCs accumulate genetically), I would suspect that most PTCs are not generated by genetic mutations but by aberrant mRNA transcription and aberrant mRNA processing. Any mistakes in transcription or splicing that cause frame shifts will generate PTCs. NMD will destroy these transcripts, so I think of it as a surveillance mechanism. Now, the cell may exploit NMD to regulate genes under certain conditions such as stress (in fact, parts of the NMD machinery accumulate in what are called stress granules), but as I stated before, most NMD substrates will be destroyed rapidly and be hard to isolate without inhibiting NMD itself.
I would like to understand the topic of intronless pseudogenes in relation to RNA and cDNA. Could you please explain?