Bug hunting is a BLAST

By sporte on July 23, 2007.

Last week I found a bug in the new NCBI BLAST interface.

Of course, I reported it to the NCBI help desk so it will probably get fixed sometime soon. But it occurred to me, especially after seeing people joke about whether computer science is really a science or not, that it might surprise people to learn how much of the scientific method goes into testing software and doing digital biology.

tags: blast, software testing, scientific method, science education

What happens when the scientific method isn't used?

I wrote earlier in January about applying scientific principles from the wet bench world to the world of computer work, and how appalled I was when it seemed that some crystallographers had left out some of the basic principles of doing good science - i.e. positive controls. The retraction of five crystallography papers left me wondering if, perhaps, the crystallographers had either missed learning the scientific method - or if they believed somehow that it didn't apply to computer work.

(I instructed the computer to run these calculations, so of course the computer is running the calculations correctly!)

Of course the scientific method does apply to computer experiments just as much as it applies to experiments at the wet bench. Software testing is a really good example.

I don't do very much testing, but every now and then, we all get called in to help find bugs and get them fixed before a new version of our software suite gets released to the public. During these times, I've found that having a scientific background and understanding the scientific method is invaluable.

I've even been able to apply the scientific method to testing and identifying bugs in other people's software, as this post will describe. Certainly, I'd rather not find bugs, but it is reassuring to know why programs are behaving a certain way.

Bug hunting is a BLAST
Earlier this summer, I had a strange experience in our Chautauqua course. We were using blastn to test some primer sequences and try to figure out if the primers would detect the correct sequences in the database. We had some really weird results and I couldn't figure out what was happening - at least not during our course. The college professors taking our course kept getting different results than I did, even though (so we thought) all the parameters were the same. After the course, I decided that the strange results were probably due to the new interface, and that I could solve the problem by logging out of my account before doing a search.

That was a nice hypothesis, but it was wrong. The explanation wasn't that simple.

Naturally, I found this out by accident while giving a BLAST workshop for beginners last week at the Fralin Biotechnology conference.

(I make all my best discoveries at the front of a classroom).

I decided to show the teachers attending the workshop the parameters that were getting used in a blast nucleotide search. Happily, we all clicked the Algorithm parameters link at the bottom of the NCBI BLAST web form.

Only we found a surprise.

Their web forms showed this:

While mine showed this:

It was puzzling to say the least.

We also looked at our BLAST results to see which parameters were used by BLAST. Were the parameters shown in the form the same as the parameters used by the program?

They were.

In science terms, we recognized an unexpected phenomenon and we observed it more than once.

But what was going on?

Since I use the scientific method, the next steps were to propose an explanation and to see if I could repeat the phenomenon and predict when it would occur.

At the workshop, everyone in the room was using a Windows computer except for me. So our first hypothesis was that the strange result occurred because I was using a Mac.

And sure enough, I could repeat the behavior. Macs and PCs showed different parameters.

But there was something else.

Remember, you can never compare experiments where you've changed multiple variables. Here was a good example. It wasn't just the platforms that were different. I was using Safari on the Mac and the teachers were using either Firefox or IE on the Windows computers.

The next step was to try to reproduce the experiment, but with fewer variables. Part of the scientific method also involves testing alternative explanations for phenomena. I decided to investigate using Safari and Firefox, side by side, on my Mac.

That was the answer. When I used Safari to access NCBI BLAST and clicked the radio button in front of "More dissimilar sequences," nothing happened.

When I used Firefox, two things were different. First, in Firefox, selecting the NR database automatically caused the Nucleotide collection database to be selected. Second, when I clicked the radio button in front of "More dissimilar sequences," the parameters for the Match/Mismatch scores changed.

Good science is always reproducible. These results were reproducible, too.
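
For anyone who likes to see this kind of reasoning written down, here is a minimal sketch in Python of the "change one variable at a time" rule. It is only an illustration, not a tool I used at the workshop. The field names (os, browser, scores_changed) are made up for the example, and the Windows rows are my reading of what the teachers saw rather than a separate recorded test.

```python
from itertools import combinations

# One record per test: the conditions, and whether clicking
# "More dissimilar sequences" changed the Match/Mismatch scores.
# The Mac rows follow the side-by-side test described above; the Windows
# rows are an assumption based on what the teachers' forms showed.
observations = [
    {"os": "Mac", "browser": "Safari", "scores_changed": False},
    {"os": "Mac", "browser": "Firefox", "scores_changed": True},
    {"os": "Windows", "browser": "Firefox", "scores_changed": True},
    {"os": "Windows", "browser": "IE", "scores_changed": True},
]

FACTORS = ("os", "browser")   # the variables we might blame
OUTCOME = "scores_changed"    # what we observed


def differing_factors(a, b):
    """List the factors whose values differ between two observations."""
    return [f for f in FACTORS if a[f] != b[f]]


def controlled_comparisons(obs):
    """Return (factor, a, b) for every pair of observations that differs
    in exactly one factor and also differs in outcome."""
    found = []
    for a, b in combinations(obs, 2):
        diff = differing_factors(a, b)
        if len(diff) == 1 and a[OUTCOME] != b[OUTCOME]:
            found.append((diff[0], a, b))
    return found


for factor, a, b in controlled_comparisons(observations):
    print(f"Only '{factor}' differs: {a[factor]} vs {b[factor]}")
```

The Mac-with-Safari versus Windows-with-Firefox comparison is skipped because two things changed at once. Only the Safari versus Firefox pair on the same Mac counts as evidence, and it points at the browser.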

I still don't know which of the two behaviors is correct. But now, at least, I know that there's a problem.

And I'm reminded, as usual, that you should not take anything for granted.

Go ahead. Click that Algorithm parameters link at the bottom of the BLAST form. Make sure that you know what experimental conditions (i.e. parameters) you're using when you run BLAST.

Those values are just as important as the conditions that you use for doing PCR.
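
If you keep a record of the parameters for each search, comparing two runs becomes a mechanical job. The sketch below, in Python, shows one way to do it. The parameter names and the numbers are placeholders I made up for the example. They are not the values from the workshop, and the function is not part of any NCBI tool.

```python
def diff_params(mine, theirs):
    """Return {name: (my_value, their_value)} for every parameter that
    differs between two recorded searches. A parameter that is missing
    from one record shows up as None on that side."""
    names = sorted(set(mine) | set(theirs))
    return {
        name: (mine.get(name), theirs.get(name))
        for name in names
        if mine.get(name) != theirs.get(name)
    }


# Hypothetical records, written down at the time each search was run.
my_search = {
    "database": "nr",
    "word_size": 11,
    "match_score": 1,
    "mismatch_score": -2,
}
their_search = {
    "database": "nr",
    "word_size": 11,
    "match_score": 2,
    "mismatch_score": -3,
}

for name, (my_value, their_value) in diff_params(my_search, their_search).items():
    print(f"{name}: mine = {my_value}, theirs = {their_value}")
```

With records like these, "we all used the same parameters" stops being an assumption and becomes something you can check.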

POSTSCRIPT: And just like so many things in science, just when you think you know the answer, you can find out that there were a few more details that you missed. I realized tonight that the problem only occurs with Safari 3, and not with Safari 2. It's the penalty for trying beta-version software, I guess.
