“What’s in a name? that which we call a rose
By any other name would smell as sweet”
– Juliet, from Romeo and Juliet by William Shakespeare
It seems we have two topics: why do we need a new name at all, and why don’t the current names (biologist, computational biologist, bioinformatician, etc.) work? What really distinguishes a digital biologist from a regular, garden-variety biologist? Why isn’t a digital biologist a computational biologist?
So, I brought along two “show and tell” items today, a picture and a job posting, to help me explain.
First, the job posting. I saw this today on LinkedIn and thought it fit rather well.
Let’s parse the job post
Computational Biologist Washington Univ.
The Genome Analysis and Informatics Technology (GAIT) Center at Washington University School of Medicine in St. Louis is seeking Computational Biologists. … The GAIT center seeks to hire individuals to analyze DNA sequencing data in collaboration with biologists and physicians [the emphasis is mine].
Note – the employer makes a distinction between computational biologists and biologists. He expects the computational biologist to collaborate with biologists. He does not expect the computational biologist to be one.
The ideal candidate has a degree in computer science, biomedical engineering, or biology, …
As far as knowledge goes, the candidate most likely has a degree in computer science or engineering. Biologists aren’t excluded since biology is listed, too. Last.
…is fluent in one or more of the following languages: C, C++, Java, Python, or Ruby,
In other words, the candidate needs to be able to program but doesn’t seem to need any special abilities related to biology.
Don’t they expect a computational biologist to know any biology?
Well, towards the end, we get this:
…and has taken at least one undergraduate level Genetics class.
Okay – so all you need to have to be a computational biologist is a computer science degree and one genetics class.
Maybe a graph will help
Now, for our second show and tell item, I drew a Venn diagram.
It occurred to me, from our discussion the other day, that another important difference between digital biologists and regular biologists is the source of the biological data. Biologists are data producers and digital biologists are mostly data consumers.
Artificial data, oh my!
I’m specifying that biologists, digital and otherwise, work with biological data because computational biologists, statisticians, and bioinformaticians sometimes work with artificial data. I know, if you’re a biologist, the idea of artificial data is really weird and a bit suspicious. It certainly surprised me to learn that people made artificial data. But mathematical biologists and statisticians like these sorts of things. They find it helpful to have data sets that really are random, like a random collection of DNA sequences, or a set that follows a Poisson or normal distribution. The most efficient way to get these data sets is to make them.
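For the curious, here is roughly what "making" such data can look like. This is only a minimal sketch in Python using the standard library, not anyone's actual pipeline. The function names, the seed, the sequence length, and the distribution parameters are all made up for illustration, and the Poisson sampler uses Knuth's simple multiplication method, which is fine for small means.

```python
import math
import random


def random_dna(length, rng):
    """Return a DNA string with each base drawn uniformly from A, C, G, T."""
    return "".join(rng.choices("ACGT", k=length))


def poisson_sample(mean, rng):
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


# Hypothetical settings: a fixed seed makes the artificial data reproducible.
rng = random.Random(42)

sequences = [random_dna(50, rng) for _ in range(10)]            # random DNA
counts = [poisson_sample(3.0, rng) for _ in range(1000)]        # Poisson, mean 3
measurements = [rng.gauss(170.0, 10.0) for _ in range(1000)]    # normal, mean 170, sd 10

print(sequences[0])
print(sum(counts) / len(counts))
print(sum(measurements) / len(measurements))
```

Because the generating process is known exactly, a statistician can check whether a method recovers the right answer, which is something wet lab data can't offer.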
Anyway, unlike mathematical biologists and statisticians, biologists and digital biologists are more likely to use data that come from wet lab experiments.
Hey buddy, where’d you get that data?
The next factor is “where did the data come from?” I’m well aware that many biologists outsource some of their wet lab data collection to core laboratories. But for the most part, biologists get data from wet lab experiments or from activities where they go out and collect samples.
Digital biologists, on the other hand, get most, if not all, of their data from others: either public databases or collaborators.
I think the data source is an important difference between us. I often have to explain to school groups who want to tour Geospiza that the company doesn’t have a lab. Our work environment looks more like “The Office,” with blue cubicles, than a student’s vision of a high-tech science lab.
And this brings me to the last point – why do we need a name at all?
Having a name is like wearing a name tag. You don’t wear a name tag for yourself, you wear it to help others. Just like we have mathematical biologists, computational biologists, evolutionary biologists, and so on, I think having a name helps clarify what makes us different.
Digital biologists need a name because there are students and prospective scientists who don’t know about the things we do. They don’t know that we have all this really interesting data, already out there, in public databases that we can study. They don’t know that we can study the data with existing software tools. And we need to think about the next generation, and about how to get them interested in learning how to find data, evaluate data, and use it to understand biology.