In my work, I've investigated mRNA distribution in cells. Many aspects of mRNA metabolism and regulation seem dependent on splicing. And so I've been doing some digging with respect to the survey of intronless genes that I wrote about yesterday. According to their bioinformatic analysis of the human genome, there are over 3000 coding genes that do not contain introns. Here's a comment on this survey from RPM:
From the website you link to:
1. Single exon genes are identified using the CDS FEATURE definition in the genome data.
It seems like they're ignoring introns located in the 5' and 3' UTRs.
Now it is very unlikely that genes would have introns in their 3'UTRs, as this would activate nonsense-mediated decay (NMD), which weeds out any transcripts that have premature termination codons (PTCs). Any stop codon occurring before a splice site (exon-exon junction) is identified as a PTC by the cell. But these genes may have introns within the 5'UTRs. I decided to check this out. I screened a very small subset of the SEGE (Single Exonic Genes in Eukaryotes) website by typing "Endoplasmic Reticulum" and "human" into their search tool. I got seven hits:
-phosphatidylinositol glycan, class C (PIGC)
-dolichyl-phosphate mannosyltransferase polypeptide 3 isoform 1 (DPM3)
-dolichyl-phosphate mannosyltransferase polypeptide 3 isoform 2 (DPM3)
-bradykinin receptor B1 (BDKRB1)
-cytochrome P450, family 8, subfamily B, polypeptide 1 (CYP8B1)
Now all these genes seem real and are expressed. As for introns, five of them had introns within the 5'UTR (so RPM, it looks like you're right), and none had introns within the 3'UTR, as I suspected. Only two of the hits were bona fide intronless transcripts (DPM3 isoform 2, and the cytochrome P450 member). Now having written this, here is what I'm asking: does anyone know of a REAL list of intronless transcripts in the human genome?
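For anyone who wants to repeat this kind of check on more than a handful of genes, here is a minimal sketch of the idea in Python. It assumes you already have exon coordinates and the CDS start and end for a plus-strand transcript; the function name `classify_introns` and the coordinates in the usage note are made up for illustration and do not come from SEGE or any real annotation.

```python
from typing import Dict, List, Tuple

# (start, end) in genomic coordinates, 1-based inclusive, plus strand (assumption)
Exon = Tuple[int, int]


def classify_introns(exons: List[Exon], cds_start: int, cds_end: int) -> Dict[str, int]:
    """Count introns lying in the 5'UTR, the CDS, and the 3'UTR.

    An intron is the gap between two consecutive exons. Minus-strand
    transcripts would need their coordinates flipped before calling this.
    """
    counts = {"5utr": 0, "cds": 0, "3utr": 0}
    ordered = sorted(exons)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start <= cds_start:
            # the whole intron ends before the first coding nucleotide
            counts["5utr"] += 1
        elif prev_end >= cds_end:
            # the whole intron starts after the last coding nucleotide
            counts["3utr"] += 1
        else:
            counts["cds"] += 1
    return counts


def is_single_exon_cds(exons: List[Exon], cds_start: int, cds_end: int) -> bool:
    """True if the coding sequence is uninterrupted (a CDS-only definition)."""
    return classify_introns(exons, cds_start, cds_end)["cds"] == 0


def is_intronless_transcript(exons: List[Exon]) -> bool:
    """True only if the whole transcript is a single exon."""
    return len(exons) == 1
```

With hypothetical exons `[(100, 200), (300, 500)]` and a CDS running from 320 to 450, `is_single_exon_cds` returns `True` while `is_intronless_transcript` returns `False`, which is exactly the discrepancy RPM pointed out.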
Any stop codon occurring before a splice site (exon-exon junction) is identified as a PTC by the cell.
This is not correct: PTCs only induce NMD if they are located >50nt upstream of the next splice donor site.
An upcoming paper in the December issue of Molecular Biology and Evolution (Hong X, Scofield DG, Lynch M (2006): Intron Size, Abundance, and Distribution within Untranslated Regions of Genes. Mol Biol Evol 23(12):2392-404 (Epub 2006 Sep 15)) describes 5,236 human transcripts with 2,721 introns in 5'-UTRs, 37,508 introns in the CDS and only 75 in 3'-UTRs. The respective numbers for 4,527 murine transcripts are 2,490, 35,376 and 54.
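To put those counts in proportion, here is a quick back-of-the-envelope calculation in Python. The numbers are copied from the comment above (I'm assuming the mouse figures are listed in the same 5'-UTR, CDS, 3'-UTR order); the helper name `region_fractions` is just for illustration.

```python
# Intron counts per region, as quoted in the comment above
intron_counts = {
    "human": {"5'UTR": 2721, "CDS": 37508, "3'UTR": 75},
    "mouse": {"5'UTR": 2490, "CDS": 35376, "3'UTR": 54},
}


def region_fractions(by_region):
    """Return the share of all introns that falls in each region."""
    total = sum(by_region.values())
    return {region: n / total for region, n in by_region.items()}


for species, by_region in intron_counts.items():
    shares = region_fractions(by_region)
    summary = ", ".join(f"{region}: {share:.2%}" for region, share in shares.items())
    print(f"{species}: {summary}")
```

In both species the 3'-UTR share comes out well under 1% of all introns counted.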
I guess that I'll have to cross every t and dot every i from now on. To be complete for the non-RNA people: the EJC (exon-junction complex), which must be kicked off to prevent NMD, gets deposited about 20-24 nt upstream of the exon-exon junction, which is why a stop codon has to sit more than roughly 50-55 nt upstream of a junction to trigger NMD.
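The rule discussed in these comments can be written down as a small heuristic. This is only a sketch under simplifying assumptions: positions are in spliced mRNA coordinates, the 50 nt cutoff is the value quoted above, and the function name `predicted_nmd_target` is hypothetical. Real NMD has plenty of exceptions that a one-line distance check will not capture.

```python
from typing import List


def predicted_nmd_target(stop_codon_end: int,
                         junctions: List[int],
                         min_distance: int = 50) -> bool:
    """Apply the simple distance rule for NMD.

    stop_codon_end: mRNA position (1-based) of the last nucleotide of the stop codon.
    junctions: mRNA positions of the last nucleotide of every exon that is
               followed by another exon (i.e. the exon-exon junctions).
    min_distance: how far upstream of a junction the stop codon must sit
                  (50 nt here, an assumption taken from the comment above).
    """
    if not junctions:
        # intronless transcript: no exon-exon junction, so the rule never fires
        return False
    # if any junction is far enough downstream, the most 3' one is too
    last_junction = max(junctions)
    return last_junction - stop_codon_end > min_distance
```

For example, a stop codon ending at position 900 with a junction at position 1000 would be flagged, while a stop codon ending at 970 (only 30 nt upstream of that junction) would not, and neither would any stop codon in an intronless transcript.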
BTW thanks for the ref. I'll have to look at it ... maybe they note the number of bona fide "intronless transcripts" in the human genome.
BTW, does anybody know of any non-terminal, constitutively spliced exon? I have tried to find one in ASD by a trial-and-error approach, but every gene that came to my mind was annotated as alternatively spliced, even exons that have been described as constitutive in older publications (maybe this is biased by the EST alignment approach ASD is employing).
In trying to find this list of intronless genes, I keep bumping into papers that use the SEGE database as their source of intronless genes ... arrrggghhh!!!!
(I did find out that 90% of GPCR-encoding genes in the human genome are intronless.)