Exponential Decay of Quality Data

We've noticed that our cumulative knowledge of any individual process is inversely proportional to the number of researchers striving (i.e. contaminating) to gather data.

Take APC, no not that APC, but the Adenomatous Polyposis Coli ... too many people study the damn thing and ... who knows what it does.

And the Golgi? Who knows where it goes in Mitosis? (and frankly who cares)

But if our theory is correct, we're in big trouble. From the not so latest Molecular Cell:

The biomedical literature is growing at a double-exponential pace; over the last 20 years, the total size of MEDLINE (the database searched by PubMed) has grown at a 4.2% compounded annual growth rate, and the number of new entries in MEDLINE each year has grown at a compounded annual growth rate of 3.1% (see Figure 1). There are now more than 16,000,000 publications in MEDLINE; more than three million of those were published in the last 5 years alone. The number of MEDLINE entries with a 2005 publication date was 666,029--more than 1800 per day.

Science is doomed to drown in its own excrement.

But wait, there's more. In the next couple of years we predict the emergence of a new discipline: Systems Biological PubMed Searches, where the behavior of grad students/postdocs "PubMed-ing" will be simulated to ... well, we are not sure why we would want to do this. But hold on, THE FUTURE IS NOW! Only instead of Systems Biological PubMed Searches, it's called Biomedical Language Processing. Oy vey. Read the rest of this in the Molecular Cell article.

Ref: Lawrence Hunter and K. Bretonnel Cohen. Biomedical Language Processing: What's Beyond PubMed? Mol Cell (2006) 21:589-594.

"We've noticed that our cumulative knowledge of any individual process is inversely proportional to the number of researchers striving (i.e. contaminating) to gather data."

I second that. p53. It does everything and nothing.

Let's have a race between the literature in PubMed and the sequences in NCBI. On your marks, get set, go... I'll check back in one year.

With the incredible boom in publications and available information, we have to be able to manage all that data, not find a way of reducing it (were you suggesting this?). I found "Biomedical Language Processing" very useful when digging into new fields and when taking a quick look at a protein's function. Take a look at iHOP (http://www.ihop-net.org/UniPub/iHOP/), a very good tool for this, and much better than PubMed for this type of question. Also, without tools like this, it is impossible for our brains to process all this data. We must learn how to use these tools in combination with our brain's unique capacity for generating knowledge; only then will science move forward.

I think that the growth in databases such as MEDLINE and PubMed has only increased the need for more specific tools that researchers can use in their particular fields. I have noticed, especially over the past couple of years, that postdocs in our lab are seeking out other sources of literature and other material specifically suited to their target area of research.