Exponential Decay of Quality Data

We've noticed that our cumulative knowledge of any individual process is inversely proportional to the number of researchers striving (i.e. contaminating) to gather data.

Take APC, no not that APC, but the Adenomatous Polyposis Coli ... too many people study the damn thing and ... who knows what it does.

And the Golgi? Who knows where it goes in mitosis? (And frankly, who cares?)

But if our theory is correct, we're in big trouble. From the not so latest Molecular Cell:

The biomedical literature is growing at a double-exponential pace; over the last 20 years, the total size of MEDLINE (the database searched by PubMed) has grown at a 4.2% compounded annual growth rate, and the number of new entries in MEDLINE each year has grown at a compounded annual growth rate of 3.1% (see Figure 1). There are now more than 16,000,000 publications in MEDLINE; more than three million of those were published in the last 5 years alone. The number of MEDLINE entries with a 2005 publication date was 666,029--more than 1800 per day.
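
For anyone who wants to sanity-check the numbers in that quote, here is a back-of-the-envelope sketch in Python. The function names are ours, not the article's, and it assumes a 365-day year and growth rates that stay constant, which is a simplification.

```python
import math

def entries_per_day(entries_per_year, days_per_year=365):
    """Average number of new entries per day (assumes a 365-day year)."""
    return entries_per_year / days_per_year

def doubling_time_years(annual_growth_rate):
    """Years for a quantity to double at a constant compounded annual rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)

# Figures quoted from the article: 666,029 entries dated 2005,
# 4.2% annual growth in total size, 3.1% annual growth in new entries.
print(f"Entries per day in 2005: {entries_per_day(666_029):.0f}")
print(f"Doubling time at 4.2%: {doubling_time_years(0.042):.1f} years")
print(f"Doubling time at 3.1%: {doubling_time_years(0.031):.1f} years")
```

So the "more than 1800 per day" claim holds up, and at a steady 4.2% a year the whole pile would double in a bit under 17 years.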

Science is doomed to drown in its own excrement.

But wait, there's more. In the next couple of years we predict the emergence of a new discipline: Systems Biological PubMed Searches, where the behavior of grad students/postdocs "PubMed-ing" will be simulated to ... well, we are not sure why we would want to do this. But hold on, THE FUTURE IS NOW! Only instead of Systems Biological PubMed Searches, it's called Biomedical Language Processing. Oy vey. Read the rest of this in the Molecular Cell article.

Ref: Lawrence Hunter and K. Bretonnel Cohen. Biomedical Language Processing: What's Beyond PubMed? Mol Cell (2006) 21:589-594


"We've noticed that our cumulative knowledge of any individual process is inversely proportional to the number of researchers striving (i.e. contaminating) to gather data."

I second that. p53. It does everything and nothing.

Let's have a race between the literature in PubMed and the sequences in NCBI. On your marks, get set, go... I'll check back in one year.

With the incredible boom of publications and information that is available, we have to be able to manage all that data, not find a way of reducing it (were you suggesting this?). I found "Biomedical Language Processing" very useful when digging into new fields, and for having a quick look at a protein's function. Take a look at iHOP (http://www.ihop-net.org/UniPub/iHOP/), a very good tool for this, and much better than PubMed for this type of question. Also, without tools like this, it is impossible for our brains to process all this data. We must learn how to use these tools in combination with our brain's unique capacity for generating knowledge, and only then will science move forward.

I think that the growth in databases such as MEDLINE and PubMed has only increased the need for more specific tools that researchers can use in their particular field. I have noticed, especially over the past couple of years, that postdocs in our lab are seeking out other sources of literature or other material specifically suited to their target area of research.