Distinct RNA Binding Proteins for Distinct Classes of mRNAs

Over the last few years it has become increasingly clear that gene expression is partially regulated at the mRNA level.

What do I mean by that?

In eukaryotic cells, the first step of gene expression occurs in the nucleus when regions of DNA are transcribed into RNA. These "transcripts" then encounter RNA binding proteins (RBPs), some of which act to process the RNA into a mature message, while others simply bind the mRNA. The whole collection of RNA and its associated proteins is often referred to as the ribonucleoprotein particle (RNP). The protein content of the RNP dictates whether the RNA is spliced, exported from the nucleus to the cytoplasm, transported to various cytoplasmic sites, translated by the ribosome into protein, or degraded. You can then imagine that in any given cell, gene expression relies on the levels of various

1) transcription factors, which dictate which mRNAs are going to be synthesized
2) RBPs, which dictate how these mRNAs are going to be treated

In addition, the types of RBPs bound to any one transcript may vary over its lifetime. Some may accompany the mRNA from the nucleus to the cytoplasm and then fall off, some may bind when the transcript has reached its cytoplasmic destination, some may bind during translation, and others may only recognize the RNA at the very end of its life.

So how many of these RBPs are there? In yeast there are over 500. If you take into account the size of the yeast genome (roughly 6,000 protein-coding genes), that means that close to 10% of the protein-coding genes of this organism encode RBPs!

So now that we have some context, I would like to describe some results from a recent PLoS Biology paper where scientists examined the types of mRNAs bound to 40 different RNA binding proteins in budding yeast, S. cerevisiae.

Here are some interesting points from the study:

- The number of mRNAs bound by each RBP varied greatly, from a dozen to over a thousand (so 0.2% to 25% of all mRNA types encoded by the yeast genome).
- 9 RBPs accounted for about 75% of all the RNA-RBP interactions.
- Each type of mRNA was associated, on average, with 2.8 RBPs from the list of 40. (If we were to extrapolate up, that would mean that each transcript is bound by about 35 of the 500 RBPs found in the yeast genome.)
- 40% of the RBPs associated with unprocessed mRNAs, indicating that they bound to the transcripts before they were mature.

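The extrapolation in the third point is simple proportional scaling, and a minimal sketch makes the assumption behind it explicit. The function name below is made up for illustration. The calculation assumes that the 40 sampled RBPs are representative of all 500, which may well not hold given that 9 of them accounted for most of the interactions.

```python
def extrapolate_rbps_per_mrna(mean_bound_in_sample, sampled_rbps, total_rbps):
    """Scale the mean number of RBPs per mRNA from the sampled RBPs to all RBPs.

    Assumes (hypothetically) that the sampled RBPs behave like the unsampled ones.
    """
    if sampled_rbps <= 0:
        raise ValueError("sampled_rbps must be positive")
    return mean_bound_in_sample * total_rbps / sampled_rbps


# Numbers quoted in the post: 2.8 RBPs per mRNA among 40 sampled, ~500 RBPs total.
estimate = extrapolate_rbps_per_mrna(2.8, 40, 500)
print(f"about {estimate:.0f} RBPs per transcript")
```

If the sampled RBPs were chosen because they bind many targets, the true number per transcript would be lower than this estimate.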
One of the most surprising aspects of the paper was that each RBP tended to associate with mRNAs encoding related proteins. Hints of this type of trend had been uncovered previously when it was demonstrated that the yeast protein Puf3 bound predominantly to mRNAs encoding mitochondrial proteins. How general is this phenomenon? I've included a table from the new paper where the authors calculate how each RBP (columns) binds to transcripts that encode various types of proteins (rows, A) or various cellular activities (rows, B):


Yes, it appears that each RBP associates with distinct classes of transcripts. So what does this mean? Well, it would suggest that the levels of any one given RBP could potentially affect a whole family of related mRNAs. In other words, RBPs may help to coordinate the expression of related genes at the level of mRNA metabolism.

Hogan DJ, Riordan DP, Gerber AP, Herschlag D, and Brown PO
Diverse RNA-Binding Proteins Interact with Functionally Related Sets of RNAs, Suggesting an Extensive Regulatory System
PLoS Biol (2008) 6: e255

Gerber AP, Herschlag D, Brown PO
Extensive association of functionally and cytotopically related mRNAs with Puf family RNA-binding proteins in yeast.
PLoS Biol (2004) 2:E79


That's fucking fascinating! This is yet another example of an independent gene expression regulatory mechanism--joining transcriptional factors and small non-coding RNAs--that provides opportunities for exponential increase in regulatory flexibility. Were they able to identify mRNA sequence motifs that predicted which RBPs would be bound to a particular transcript?

And kudos to these authors for publishing something in PLoS Biology that would have had an excellent shot at making it into a high-profile closed-access journal.

How are we supposed to be able to separate the wheat from the chaff in a study like this? Based on the results, it seems that there is no single protein that is not a "specific RNA binding protein"...

Those 40 hand-picked proteins need, at a minimum, to be compared to 40 other random proteins to see how specific the employed purification was.

Comrade PP,

Were they able to identify mRNA sequence motifs that predicted which RBPs would be bound to a particular transcript?

They tried using two separate search algorithms, but they restricted their searches (for obvious reasons) to the UTRs. (They would have missed the SSCR, a motif I discovered that is present in the signal-sequence coding region of mRNAs encoding secreted proteins.) In some cases their predictions matched what was known to be the RBPs' binding motifs; in other cases they missed it. Unfortunately they did not try to biochemically validate their predictions.


They did TAP-tag and purify two random proteins to control for nonspecific mRNA binding. From the paper:

Nce102 was associated with eight distinct RNAs, whereas Bud27 was associated with two putative mRNA targets; interestingly, one of these putative targets (RPA190) was reproducibly enriched more than 300-fold, and both targets were lost when immunopurifications were performed in the absence of Mg2+ (unpublished data). Because neither Nce102 nor Bud27 was known or expected to associate with RNA, the RNAs identified as their targets may be spurious, but we cannot exclude the possibility that the RNA interactions we found for these two proteins are real and significant. Regardless, they provide a benchmark estimate of the number of RNA targets falsely identified for other RBPs.

One thing that did annoy me was that the authors did not try to estimate the absolute number of RBP molecules bound per RNA molecule (as opposed to a qualitative analysis of what "types" of RBPs bound to what "types" of mRNAs). You could imagine that RBP X may be tightly bound to transcripts A, B and C but weakly bound to transcripts D and E. It would be informative if we could find out that for every 100 transcripts from gene E, 21 molecules of RBP X were bound. With this type of quantitative analysis we could better estimate what the background association between a protein and some random mRNA molecule might be.
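To make that concrete, here is a rough sketch of the kind of number I have in mind: RBP molecules bound per transcript copy, with a background level subtracted. The function names and the background value are invented for illustration; nothing like this was measured in the paper.

```python
def occupancy(rbp_molecules_bound, transcript_copies):
    """Average number of RBP molecules bound per transcript copy."""
    if transcript_copies <= 0:
        raise ValueError("transcript_copies must be positive")
    return rbp_molecules_bound / transcript_copies


def background_corrected(observed, background):
    """Subtract a background occupancy, never going below zero."""
    return max(observed - background, 0.0)


# Example from the text: 21 molecules of RBP X per 100 transcripts of gene E.
observed = occupancy(21, 100)

# Hypothetical background, e.g. measured with a protein that should not bind RNA.
background = occupancy(5, 100)

print(f"observed occupancy: {observed:.2f}")
print(f"above background:   {background_corrected(observed, background):.2f}")
```

An interaction whose corrected occupancy is near zero would be hard to distinguish from nonspecific sticking, however reproducibly it shows up.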

Yeah, I saw it. That's my point - two is not enough! Particularly two that were hand-picked. A good, clean experiment on a complex system typically should have more controls than experimental points. There is always, always, always a background. And the background can be protein-dependent. AND concentration-dependent. Considering that the downstream RNA step is not quantitative, and no attempt was made to assess the tagged proteins' expression levels, I would worry a lot about what the sequencing really tells me. More than that, considering the crudeness of the method, I wouldn't be surprised if 10 individual experiments done with the SAME protein but by different hands produced overlapping but different results.
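One simple way to put a number on "overlapping but different" is the Jaccard index between the target lists from two replicate purifications: the size of the intersection divided by the size of the union. The sketch below uses made-up target names and is not something taken from the paper.

```python
def jaccard(targets_a, targets_b):
    """Jaccard index between two collections of target mRNAs (1.0 = identical)."""
    a, b = set(targets_a), set(targets_b)
    union = a | b
    if not union:
        # Two empty target lists are treated as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical target lists from two purifications of the same tagged protein.
replicate_1 = {"mRNA_A", "mRNA_B", "mRNA_C", "mRNA_D"}
replicate_2 = {"mRNA_B", "mRNA_C", "mRNA_D", "mRNA_E"}

print(jaccard(replicate_1, replicate_2))  # 3 shared out of 5 total -> 0.6
```

Comparing this value between replicates of the same protein and between unrelated proteins would show how much of the apparent specificity survives experimental noise.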

Alex says,

Over the last few years it has become increasingly clear that gene expression is partially regulated at the mRNA level.

I was teaching this over twenty-five years ago using the common bacteria and phage examples. Starting about twenty years ago I began adding eukaryotic examples.

Does that count as "last few years"? :-)

I don't understand why your generation likes to pretend that these are recently discovered phenomena. Is it because the current generation doesn't know their history?

By Larry Moran (not verified) on 16 Nov 2008

OK Larry let me rephrase that.

Over the last few years it has become increasingly clear that the expression of the vast majority of genes is regulated, in part, at various steps of mRNA metabolism.

My excuses to the old generation ;)

the vast majority of genes are regulated in part at various stages of mRNA metabolism

But that, too, is a total truism. The age-old observation that mRNA levels correlate poorly with protein expression levels is all you need to conclude that.

The age-old observation that mRNA levels correlate poorly with protein expression levels is all you need to conclude that.

That observation could be ascribed to variable translation rates for each mRNA and different half-lives for the resulting protein products. It was never clear whether steps such as mRNA nuclear export and cytoplasmic localization played important roles in regulating gene expression.

Larry is precisely correct (not unusually for him). The degree to which nearly the entire field of eukaryotic RNA regulation simply ignores decades' worth of first-rate work done in bacterial and bacteriophage systems is nothing short of fucking disgraceful.

It is terrible, shoddy scholarship. A fucking disgrace.

By George Smiley (not verified) on 18 Nov 2008

"It is terrible, shoddy scholarship. A fucking disgrace."

Lets vent mutherfuckers! Mediocrity is queer, and it still breeds. Careless people don't give a fart about scholarship. Bring back electric shock therapy.