People keep sending me this link to an article by Jonah Lehrer in the New Yorker: The Decline Effect and the Scientific Method, which has the subheadings of "The Truth Wears Off" and "Is there something wrong with the scientific method?" Some of my correspondents sound rather distraught, like they're concerned that science is breaking down and collapsing; a few, creationists mainly, are crowing over it and telling me they knew we couldn't know anything all along (but then, how did they know…no, let's not dive down that rabbit hole).
I read it. I was unimpressed with the overselling of the flaws in the science, but actually quite impressed with the article as an example of psychological manipulation.
The problem described is straightforward: many statistical results from scientific studies that showed great significance early in the analysis are less and less robust in later studies. For instance, a pharmaceutical company may release a new drug with great fanfare that showed extremely promising results in clinical trials, and then later, when numbers from its use in the general public trickle back, shows much smaller effects. Or a scientific observation of mate choice in swallows may first show a clear preference for symmetry, but as time passes and more species are examined or the same species is re-examined, the effect seems to fade.
This isn't surprising at all. It's what we expect, and there are many very good reasons for the shift.
Regression to the mean: As the number of data points increases, we expect the average values to regress to the true mean…and since often the initial work is done on the basis of promising early results, we expect more data to even out a fortuitously significant early outcome.
The file drawer effect: Results that are not significant are hard to publish, and end up stashed away in a cabinet. However, as a result becomes established, contrary results become more interesting and publishable.
Investigator bias: It's difficult to maintain scientific dispassion. We'd all love to see our hypotheses validated, so we tend to consciously or unconsciously select results that favor our views.
Commercial bias: Drug companies want to make money. They can make money off a placebo if there is some statistical support for it; there is certainly a bias towards exploiting statistical outliers for profit.
Population variance: Success in a well-defined subset of the population may lead to a bit of creep: if the drug helps this group with well-defined symptoms, maybe we should try it on this other group with marginal symptoms. And it doesn't…but those numbers will still be used in estimating its overall efficacy.
Simple chance: This is a hard one to get across to people, I've found. But if something is significant at the p=0.05 level, that still means that 1 in 20 experiments with a completely useless drug will still exhibit a significant effect.
Statistical fishing: I hate this one, and I see it all the time. The planned experiment revealed no significant results, so the data is pored over and any significant correlation is seized upon and published as if it was intended. See previous explanation. If the data set is complex enough, you'll always find a correlation somewhere, purely by chance.
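A toy simulation makes the first two items on that list concrete. This is only a sketch under made-up assumptions: the true effect size (0.1), the study size (20), the crude one-sided cutoff, and the name `simulate_decline` are all illustrative and not taken from any real study. It runs many small studies of the same modest effect, "publishes" only the ones that clear the cutoff, and then replicates each published study once with fresh data.

```python
import random
import statistics

def simulate_decline(true_effect=0.1, n_per_study=20, n_studies=2000, seed=0):
    """Simulate small studies of one modest true effect (noise sd = 1).

    A study is "published" only if its estimate clears a rough one-sided
    significance cutoff. Each published study is replicated once, and the
    replication is recorded whatever it shows.
    """
    rng = random.Random(seed)
    std_error = 1 / n_per_study ** 0.5
    threshold = 1.645 * std_error  # roughly one-sided p < 0.05 (assumed cutoff)

    def run_study():
        sample = [rng.gauss(true_effect, 1) for _ in range(n_per_study)]
        return statistics.fmean(sample)

    published, replications = [], []
    for _ in range(n_studies):
        estimate = run_study()
        if estimate > threshold:              # everything else goes in the file drawer
            published.append(estimate)
            replications.append(run_study())  # fresh data, no selection applied
    return statistics.fmean(published), statistics.fmean(replications)

initial, followup = simulate_decline()
print(f"mean published initial effect: {initial:.2f}")
print(f"mean effect on replication:    {followup:.2f}")
```

Nothing about the underlying effect changes between the two rounds. The published estimates are inflated only because they were selected for being large, so the replications drift back toward the true value.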
Here's the thing about Lehrer's article: he's a smart guy, he knows this stuff. He touches on every single one of these explanations, and then some. In fact, the structure of the article is that it is a whole series of explanations of those sorts. Here's phenomenon 1, and here's explanation 1 for that result. But here's phenomenon 2, and explanation 1 doesn't work…but here's explanation 2. But now look at phenomenon 3! Explanation 2 doesn't fit! Oh, but here's explanation 3. And on and on. It's all right there, and Lehrer has explained it.
But that's where the psychological dimension comes into play. Look at the loaded language in the article: scientists are "disturbed," "depressed," and "troubled." The issues are presented as a crisis for all of science; the titles (which I hope were picked by an editor, not Lehrer) emphasize that science isn't working, when nothing in the article backs that up. The conclusion goes from a reasonable suggestion to complete bullshit.
Such anomalies demonstrate the slipperiness of empiricism. Although many scientific ideas generate conflicting results and suffer from falling effect sizes, they continue to get cited in the textbooks and drive standard medical practice. Why? Because these ideas seem true. Because they make sense. Because we can't bear to let them go. And this is why the decline effect is so troubling. Not because it reveals the human fallibility of science, in which data are tweaked and beliefs shape perceptions. (Such shortcomings aren't surprising, at least for scientists.) And not because it reveals that many of our most exciting theories are fleeting fads and will soon be rejected. (That idea has been around since Thomas Kuhn.) The decline effect is troubling because it reminds us how difficult it is to prove anything. We like to pretend that our experiments define the truth for us. But that's often not the case. Just because an idea is true doesn't mean it can be proved. And just because an idea can be proved doesn't mean it's true. When the experiments are done, we still have to choose what to believe.
I've highlighted the part that is true. Yes, science is hard. Especially when you are dealing with extremely complex phenomena with multiple variables, it can be extremely difficult to demonstrate the validity of a hypothesis (I detest the word "prove" in science, which we don't do, and we know it; Lehrer should, too). What the decline effect demonstrates, when it occurs, is that just maybe the original hypothesis was wrong. This shouldn't be disturbing, depressing, or troubling at all, except, as we see in his article, when we have scientists who have an emotional or profit-making attachment to an idea.
That's all this fuss is really saying. Sometimes hypotheses are shown to be wrong, and sometimes if the support for the hypothesis is built on weak evidence or a highly derived interpretation of a complex data set, it may take a long time for the correct answer to emerge. So? This is not a failure of science, unless you're somehow expecting instant gratification on everything, or confirmation of every cherished idea.
But those last few sentences, where Lehrer dribbles off into a delusion of subjectivity and essentially throws up his hands and surrenders himself to ignorance, are unjustifiable. Early in any scientific career, one should learn a couple of general rules: science is never about absolute certainty, and the absence of black & white binary results is not evidence against it; you don't get to choose what you want to believe, but instead only accept provisionally a result; and when you've got a positive result, the proper response is not to claim that you've proved something, but instead to focus more tightly, scrutinize more strictly, and test, test, test ever more deeply. It's unfortunate that Lehrer has tainted his story with all that unwarranted breast-beating, because as a summary of why science can be hard to do, and of the institutional flaws in doing science, it's quite good.
But science works. That's all that counts. One could whine that we still haven't "proven" cell theory, but who cares? Cell and molecular biologists have found it a sufficiently robust platform to dive ever deeper into how life works, constantly pushing the boundaries of uncertainty.









Comments
Posted by: Enkidum
|
December 30, 2010 6:41 PM
Nice post. I have to say that a certain amount of "fishing" in new data is warranted, surely. When you're doing research in a fairly unexplored area, surely you want to look at more than the simple hypothesis you started with, especially if you have a complex data set. Wouldn't you want to pore over that data, look for what seem like sensible patterns, even if you didn't expect them, draw tentative conclusions about those patterns, and offer them up as worthy of further investigation?
Because if that's not warranted, then I'd better reconsider the last few years of my career.
Posted by: j-brisby
|
December 30, 2010 6:43 PM
I love this post. Thanks, PZ.
Posted by: Glen Davidson
|
December 30, 2010 6:43 PM
Wow, so replicability is still important in science?
Who knew?
Glen Davidson
Posted by: PZ Myers
|
December 30, 2010 6:48 PM
Fishing is a start. Yes, you should always look for surprising and unexpected effects in your data, but that shouldn't be the conclusion that is published. It should become the premise for more focused investigations.
The best example I can think of for doing this badly is some old work in, of all things, astrology. There was a study by the Gauquelins in which they just looked at piles and piles of birth data from all kinds of people, and pulled out any correlations they could find. They claimed to find something called the Mars effect, where some specific association was found between Mars and athletic ability.
Total nonsense, of course. The only surprising thing they could have discovered in such a scattershot study was that there were no correlations at all.
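To put a rough number on that: if each test has a 1 in 20 chance of a false positive, the chance of at least one false positive across many tests climbs quickly. The sketch below assumes the tests are independent, which real correlations pulled from one data set usually are not, so treat it as an illustration rather than an exact figure. The function name is just for this example.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n_tests independent tests of a
    true null hypothesis comes out 'significant' at level alpha.
    Independence between tests is an assumption of this sketch."""
    return 1 - (1 - alpha) ** n_tests

for n_tests in (1, 20, 100):
    print(n_tests, round(chance_of_false_positive(n_tests), 3))
```

With 20 independent looks at pure noise, the chance of at least one "significant" hit is already about 64%.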
Posted by: daniel.ocampo.daza
|
December 30, 2010 6:54 PM
As a young scientist, early in my scientific career, one of the best things I've learned is that science is not about discovering "truth" in any objective sense, it's about quantifying doubt. That simple perspective solves the "problem" the article is trying to sell... The really interesting thing to discuss is why the idea that science is all about discovering "truth", or "proving", is so hard to shake.
Posted by: clemxxx
|
December 30, 2010 6:54 PM
Ideas are not Facts although They're not a Lie,
and with the gath'ring of more Data
Hypotheses can die.
- Not H.P. Lovecraft -
Posted by: Amphiox, OM
|
December 30, 2010 7:04 PM
I can't believe Lehrer even used the word "prove" in that conclusion. Experiments and the scientific method do not, never did, and never claimed to, "prove" anything. That's not how they work.
Experiments disprove hypotheses.
Lehrer should have known this. He should be ashamed of that conclusion.
Ashamed.
Posted by: https://www.google.com/accounts/o8/id?id=AItOawmbrIhrpZ1_Ka3H4UEtwjsVRA7YMO8nUYE
|
December 30, 2010 7:06 PM
I've always said for years to pals of mine that statistics, when used alone, are absolute bubkes as evidence of anything.
It's all too easy to twist a sampling you have so that the statistic says whatever you want it to say. There's an old, but very relevant book called "How To Lie With Statistics" that describes in almost frightening detail what one can do to make a statistic say the exact opposite of what the facts say while still, when actually examining the data itself, evaluates as true.
I can intentionally grab, say, 30 red heads, 2 blondes, and a brunette, then use that biased sample to try and establish that 90.91% of the population are redheads by not actually specifying WHAT population I am referring to, so it seems global in scale. After all, look at my statistic! See how 90.91% are red heads, but only 6.1% are blond and a shockingly low 3% are brunettes. Of course, anyone with common sense can just walk into a crowd of people and see that this statistic would be pure crap, but even a crowd of people is biased, as it's hardly every single person on the entire planet this instant.
So, with statistics, I like to see something "physical" that could back it up. Supporting evidence.
Posted by: Teh Merkin
|
December 30, 2010 7:09 PM
This is one of the posts that makes sure I keep coming back. I am not a scientist, but after a couple of years of visiting here, I feel like I understand the scientific mindset, and I really appreciate the education I am getting.
That, plus I love to LOL at religious idiots. LOVE it.
Thank you!
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 30, 2010 7:19 PM
Googlemess @8: "I've always said for years to pals of mine that statistics, when used alone, are absolute bubkes as evidence of anything."
This is a meaningless statement. Statistics are a tool. You can use them to lie or you can use them to illustrate the truth. Do you eschew the use of a crescent wrench because it can also be used as a murder weapon?
Googlemess: "It's all too easy to twist a sampling you have so that the statistic says whatever you want it to say."
Yes, you can, but then you aren't doing statistics or science, but merely engaging in a sophisticated type of lying.
Googlemess: "So, with statistics, I like to see something "physical" that could back it up."
This is the first sensible thing you've said. A mechanism that is consistent with the statistical evidence makes both stronger.
The problem here is that too many scientists and (especially) engineers don't understand statistics and probability. However, here's a hint: dismissing it ain't gonna get you to understand it any better.
Posted by: Wazza
|
December 30, 2010 7:28 PM
PZ, you're missing a word in the first sentence of that last paragraph...
http://xkcd.com/54/
Posted by: https://www.google.com/accounts/o8/id?id=AItOawmbrIhrpZ1_Ka3H4UEtwjsVRA7YMO8nUYE
|
December 30, 2010 7:29 PM
#10
I stand corrected.
And no, I'm not a Googlemess, I've been lurking here for some time and I got my information from that book.
Posted by: james.michael.thompson
|
December 30, 2010 7:30 PM
It's not bad to be skeptical of science, but PZ is right - the last few sentences in this article are inane. Here's an excerpt:
We like to pretend that our experiments define the truth for us. But that’s often not the case. Just because an idea is true doesn’t mean it can be proved. And just because an idea can be proved doesn’t mean it’s true. When the experiments are done, we still have to choose what to believe.
Who are "we" that like to pretend that experiments define truth? This is silly. A more appropriate name for "the decline effect" is "improved statistical power by gathering more data." An effect that demonstrates the value of gathering more data is hardly an indictment of the scientific method. In fact, critical discussions of how to improve scientific practices are a topic of current research:
Why Most Published Research Findings are False
This paper has been cited more than 100 times, and is one of the most-read papers in all of the PLoS journals.
Lehrer's New Yorker piece makes an old mistake in thinking about science. Some people think that because scientific opinion changes when we acquire new data, science is somehow flawed. In fact, this willingness to change our opinions in the face of new data is the greatest strength of science.
Posted by: Kel, The Privileged View From Nowhere
|
December 30, 2010 7:31 PM
I believe in the computer in front of me - and all that entails...
Posted by: Jeffrey A. Myers
|
December 30, 2010 7:45 PM
God botherers love to prattle on about the alleged failings of science and the scientific method while sending messages from a phone utilizing the latest advances in materials science via controlled bursts of electromagnetic radiation transmitted via satellites in orbit thanks to our understanding of gravitation while driving a car that makes use of GPS that relies on our understanding of the speed of light and the effects of time dilation at increased velocities.
God botherers hate science except when they don't.
Posted by: ginckgo
|
December 30, 2010 7:49 PM
I wonder how long it will take for Lehrer's article to pop up in the Global Warming denialosphere as another supposed example of how you can't trust them scientists. It would be twisted in just the way this one did: http://wattsupwiththat.com/2010/12/27/climate-change-and-the-corruption-of-science-where-did-it-all-go-wrong/ (dang I hate linking to them)
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 30, 2010 7:51 PM
Googlemess,
There is no offense intended. It is a common shorthand here at Pharyngula. Give me another name to call you and I'll stick to that.
FWIW, I FUCKING HATE the book "How to Lie with Statistics." The fucker had a chance to actually write something that illustrated the power of statistics when properly used, and instead, he just bought into the old Disraeli meme (lies, damnable lies and statistics). First, I find an information theory perspective helpful in understanding these issues.
Let's say you are gathering data and fitting a statistical model to the data using some goodness of fit criterion (e.g. maximum likelihood, least squares, chi-square...) Initially, your fit parameters jump all over the place--you don't have enough data to constrain the model.
Eventually, your parameters settle down, and now you think you understand the data. The thing is that, overwhelmingly, most of your data is coming from the center of the distribution (the mode) precisely because it is the most probable value. Suddenly you get an "outlier" that knocks your parameters back out of their accustomed range. Do you discard it? Big mistake. That "outlier" may well be from the tail of your data distribution, and as such it is giving you A TON of information. Outliers are rare gifts of information.
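One way to read "a ton of information" is Shannon surprisal: the information in an observation is minus the log of its probability, so rare values carry more bits. The sketch below assumes a standard normal model and approximates the probability of landing in a small bin as density times bin width. The function name, the bin width, and the choice of surprisal as the measure are assumptions made for illustration.

```python
import math

def surprisal_bits(x, mean=0.0, sd=1.0, width=0.1):
    """Approximate information (in bits) from observing a value in a small
    bin of the given width around x, under an assumed normal model."""
    z = (x - mean) / sd
    density = math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))
    return -math.log2(density * width)  # bin probability ~ density * width

for x in (0, 2, 4):
    print(x, round(surprisal_bits(x), 2))
```

Under this model, a point four standard deviations out carries about 11.5 bits more than a point at the mode, which is why throwing it away is costly.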
Jonah Lehrer's piece is crap. The problems he identifies could all be solved by 1) ensuring that research doesn't succumb to MBAs with visions of dollar signs dancing in their heads; 2) ensuring that scientists know enough statistics to do the simple stuff and know when to turn to a professional statistician when it gets beyond simple.
Posted by: Kamaka
|
December 30, 2010 7:52 PM
www.google.com/accounts/o8/id?id=AItOawmbrIhrpZ1_Ka3H4UEtwjsVRA7YMO8nUYE @ 12
Googlemess is not some kind of insult.
Sign off with a handle or name, please.
Posted by: raven
|
December 30, 2010 7:52 PM
Another science bashing article. Big Deal.
Science is the basis of modern 21st century civilization and the basis for American economic and military leadership.
The US government knows this which is why it spends 180 billion USD on R&D, a lot of which is military related. Kill US science, and eventually you end up with a third world nation. Our competitors know it too. That is why China, India, and others are pouring as much money as they can into...science.
What has xianity done lately? Besides sponsor xian terrorism and assassinate a few MDs, not much. We won't bother pointing out that the last time xianity ruled was known as The Dark Ages.
Posted by: SteveL
|
December 30, 2010 7:54 PM
That open source database of results thing sounds like a good idea.
Posted by: raven
|
December 30, 2010 7:56 PM
This is pure Postmodernism. It also was shown to be wrong about science a decade ago.
Postmodernism doesn't work with science because it ignores an obvious fact. There is only one reality, one real world.
You can choose what to believe, but no matter what, the earth isn't flat and the sun doesn't orbit it.
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 30, 2010 7:59 PM
One of the better statistical references I've found on line is here--lots of good stuff in this on-line text:
http://www.itl.nist.gov/div898/handbook/
Check it out next time you have a big dataset to crunch.
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 30, 2010 8:05 PM
ginckgo: "(dang I hate linking to them)"
Then don't. Every hit puts money in Tony "Micro" Watts' pocket, and I would not piss on that dude if he were on fire.
Posted by: KG
|
December 30, 2010 8:06 PM
Yeah, right. When smallpox dies out, we still have to choose to believe that vaccination works. When we see photographs of the Earth from space, we still have to choose to believe it's round. When computers work, we still have to choose to believe in semiconductors.
Posted by: Pierce R. Butler
|
December 30, 2010 8:11 PM
Googlemess @ # 8: I can intentionally grab, say, 30 red heads, 2 blondes, and a brunette...
You can try, but my estimate is that you'll be hospitalized (or morgue-ized) before finishing a quarter of your specimen-collection process.
raven @ # 21: Tom the idiot...
Who dat? The person you're quoting, as noted by our esteemed host, is named Jonah. (A former SciBlogger, sad to say.)
Posted by: Ibis3, féministe avec un titre française de fantaisie
|
December 30, 2010 8:14 PM
Didn't read the original, but I'd like to add a factor to your list (though not technically a reason for the "Decline Effect" it certainly exacerbates the perception of a problem):
• Premature and/or Incompetent Mainstreaming: Much of what passes for science reporting in the mainstream media gives an erroneous and utterly hyped view of what is often preliminary in the original or uncorroborated by other studies. The potential bias of the researchers is almost always overlooked in favour of controversial headlines.
Whatever "Decline Effect" there is, it is magnified geometrically for the lay person.
Posted by: raven
|
December 30, 2010 8:16 PM
Oh. Never heard of him, I'm happy to say.
I just saw Lehrer and put a Tom in front of it without paying much attention. IIRC, there is a musician with that name.
Posted by: feralboy12
|
December 30, 2010 8:18 PM
I choose to believe that I didn't just hit my head on the cupboard door.
There is no pain. There is no pain. There is no pain.
Posted by: Lynna, OM
|
December 30, 2010 8:19 PM
Thanks so much, PZ. I was hoping that you would weigh in.
I read the article when it first came out and threw the magazine across the room when I came to that last sentence, "When the experiments are done, we still have to choose what to believe."
Lehrer had not provided a platform for that conclusion at all.
I immediately suspected him of getting an under-the-table payday from the Disco Institute.
Someone really wanted Lehrer to provide fodder for the "you believe what you want, and I'll believe in God" crowd. The article is almost okay. It pretends to be wholly truthful by being mostly truthful and then veering off into not-truthful-at-all land at the crucial moment of conclusion.
In addition to the improbable, intellectually impoverished conclusion, Lehrer's loaded language throughout will be recognized by working scientists as eyebrow-raising hyperbole, but general readers will be gulled.
Despicable. And very disappointing when it comes to The New Yorker, which used to have a better fact-checking department, better editors, and a higher ethical standard. This article is a disservice to the magazine's readers.
The only thing this article "proves" is that Lehrer can't be trusted.
Posted by: alexandersafir
|
December 30, 2010 8:21 PM
I read this last night and thought I should be quick and write something before someone beats me to it. PZ is too quick on the draw.
Posted by: raven
|
December 30, 2010 8:26 PM
I'll add here that this is the first I've heard of the decline effect in statistics.
But it can go the other way as well. Sometimes the data gets better with time and experience.
One trial started out with unexciting data and we thought of canceling it. Then the data got better and better. Real world use has reinforced that with years of data and followup.
Seen that in other experiments as well.
Posted by: Legion
|
December 30, 2010 8:38 PM
Ray @17:
Bingo!
It's the corruption of science by the revenue-obsessed, pencil pushing, bean counting MBAs that leads to a lack of confidence in science.
The rubes may not know science, but they know when they've been lied to by slick marketing gimmicks that lie about the actual science. This opens the door for religious ideologists to demonize science.
I'd like to see the science community push back against the suits, whose main goal seems to be to turn a short-term profit no matter the long-term costs.
Posted by: Enkidum
|
December 30, 2010 8:59 PM
@ocampa #5
I'm not a huge fan of this way of phrasing things. It's true that any statistical test is about quantifying how certain you are (=how much you doubt) that a given hypothesis is false. But almost any null hypothesis is chosen with a positive alternative in mind. And there is no kind of quantification of how certain we are (=how much we doubt) about the positive hypothesis, because there are always an infinite number of alternative hypotheses that we haven't yet ruled out (which is why we don't report p-values for the hypotheses we actually care about, most of the time). We just hope that we have managed to rule out the most important alternatives. So the theories we are actually supporting, the reason we actually do our day to day work in the lab, are far from having doubt quantified in the way you describe.
Sorry, that's a bit of a rant for one of my first posts here. At any rate, I think a lot of scientists have gotten scared away from talking about "truth" (even to the point where they always put it in scare quotes), when in fact we should be more assertive about it - I believe x to be true for reasons y and z.
Posted by: Nerdette
|
December 30, 2010 9:10 PM
Thank you, PZ. This was a great read and very heartening after a frustrating argument with the son of Christian missionaries that climaxed with this little gem: "I hope that you stop using "science" as some clutch point because most science is really just mostly shoddy and questionable statistics in the end."
The entire argument was over a Ricky Gervais quote: "Saying atheism is a belief system is like saying not going skiing is a hobby. I’ve never been skiing. It’s my biggest hobby. I literally do it all the time." My Christian friend said it was a play on words, calling atheism "a strict belief system defined on the foundation that you don't believe something exists" and couldn't be swayed otherwise.
My inarticulate nature botched the argument, and once he laid down the above gem, I tossed my hands in the air and stopped trying. This may be the wrong thread for it, but what do you Pharynguloids think? Science is a tool, but is it also a belief system?
As Rachel Weisz beautifully said in 'Agora': "I believe in philosophy."
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 30, 2010 9:17 PM
Enkidum,
In part the reason we phrase the whole problem of significance in terms of a null hypothesis is because you can show that you can never say definitively whether a model is the "true" model--just which of two models is closer. There will always be an indeterminate constant.
On the other hand, I think a lot of scientists have become way too "post modern" about this. If you have done your stats right, and you really have 95% confidence (not a manufactured CL by "bump hunting"), then you can pretty much take that to the frigging bank.
Posted by: raven
|
December 30, 2010 9:20 PM
No. That is Postmodernism nonsense. It is dead wrong.
Science is how we understand and describe the real world. There is only one real world.
That is why Postmodernism failed.
Try not believing in the Law of Internal Combustion. Cars will still start and run anyway. Reality doesn't care what you believe, it just is.
Science built our modern technological civilization. Not bad for shoddy statistics. What has xianity done in the last 2,000 years except endless wars and a Dark Age?
Posted by: Nerdette
|
December 30, 2010 9:31 PM
Okay, I'm willing to accept that. Is there a name for a scientific belief system then? (I'm drawing from the wiktionary definition: The basis on which beliefs are based. For example a religious belief system is based on faith and dogma whereas a scientific belief system is based on observation and reason.) Personally, I don't like the word "belief" at all - I find it limiting, but I'm finding it difficult to find a replacement. I do admit to being inarticulate.
Posted by: speedweasel
|
December 30, 2010 9:43 PM
@ #17
I fucking love Ben Goldacre's book "Bad Science," because he frequently writes from the position of the 'science subverter' in order to demonstrate various aberrations of the scientific method.
Then once he has you horrified by very real examples of assholes peddling bullshit disguised as science, he goes on to highlight the value of rigour, clarity and integrity as a means of thwarting this abuse.
First he gets your attention and makes you care, then he drops the ScienceTM on you.
I'm guessing "How to Lie with Statistics," doesn't utilise this same approach to learning?
Posted by: Enkidum
|
December 30, 2010 9:48 PM
a-ray-etc
"you can never say definitively whether a model is the "true" model--just which of two models is closer"
Right, but it's even worse than that, because you never have any kind of proof that those are the only two possible models. False dichotomies are a huge problem in my area, at least - someone shows that the effects predicted by Theory A do not hold, and takes this as support for Theory B. If it were that simple, I'd already have several Nobel prizes...
Posted by: llewelly
|
December 30, 2010 9:49 PM
Jonah Lehrer's article is so disappointing I'm wondering if I should stop recommending his book, How We Decide.
Posted by: speedweasel
|
December 30, 2010 9:51 PM
Oops! That ScienceTM previewed perfectly.
....Frackin' SB and its frackin' amateur IT boffin(s). Wouldn't let them write my shopping list.
Posted by: llewelly
|
December 30, 2010 10:00 PM
Nerdette | December 30, 2010 9:31 PM:
A substantial portion of skeptics and scientists are trying to replace "believe" with "accept". At this point, I do not see any sign such a usage change might improve anyone's understanding.
The important part, is to keep your belief, or your acceptance of the evidence, in proportion to the evidence available, and more importantly, to be prepared to change your belief if better evidence points elsewhere.
Posted by: Crudely Wrott , Drinking Solo Since Death's Back On The Wagon
|
December 30, 2010 10:02 PM
Lehrer piteously pens,
Science is in no way a human fallibility -- rather it is the single enabling power of humanity.
Observe most any person going about their daily business; you will note that they look (observe), remember (take notes), compare (make preliminary judgments), try doing things differently based upon the observations and judgments (experiment) and then they examine the results. These are normal steps (carelessly generalized) in how your average human goes about doing . . . whatever they do.
These steps are also the fundamental blueprint of doing science. The main difference between your average human and your average scientist is that the scientists observe this protocol to a much higher degree and facilitate their progress by means of maddeningly tedious record keeping. Throw in the fact that a fair number of working scientists are youngish, sociable and gregarious. Moreover they are often excited about what they are doing. Science is a most deeply ingrained trait in humans and the evidence for it is standing before us.
Science at one time was limited to counting days and reckoning seasonal change. At some point knapping stone was added. These are certainly technologies that arose from our ancestors doing basic science. And today we manipulate individual atoms with wit and aplomb!
Science! It's built in, bitches!
Posted by: llewelly
|
December 30, 2010 10:20 PM
And one other thing. When will the decline effect apply to climate scientists' projections of sea level rise? I've been watching them for 22 years, and they sure haven't been declining!
Posted by: sacculina
|
December 30, 2010 10:22 PM
I think we are being too hard on Jonah Lehrer here; while this may not be a good article, I don't think we can count him among the enemies of science. I quite enjoyed Proust Was a Neuroscientist.
Posted by: Crudely Wrott , Drinking Solo Since Death's Back On The Wagon
|
December 30, 2010 10:42 PM
It can't be said enough that science is not for finding what is true.
What it is primarily for is answering questions. Secondarily but probably more useful, it is for allowing one to ask more questions, deeper questions, better informed questions.
This process, if applied on a regular basis has a distinct benefit -- a more or less steady stream of answers. These lead, as we have learned, to more better questions.
Most anyone, it is assumed but not proven, could easily see where such a process might lead . . .
Posted by: xyx
|
December 30, 2010 11:07 PM
I strongly disagree that scientists shouldn't be concerned about the decline effect. It is a huge problem in medicine that treatments are approved and implemented without sufficient evidence for their efficacy. More generally, it is just not good for the transmission of knowledge that so many papers are published that claim much stronger support for their hypotheses than justified. It is misleading to both other scientists and the public. Maybe this isn't a problem of "science" or the "scientific method" but it is a big problem of scientific institutions. It's not a crisis in all fields, but I think it is approaching crisis-level in medical science (where I have worked).
Moreover, it's a solvable problem. Raise the required standards for making conclusions, lower standards for publication. It's hopeless to expect top-tier journals to publish negative results unless they negate some well-known positive result, but they should be published somewhere. The "publish or perish" culture is also at fault, and not necessary. People are more likely to find positive results if feeding their families depends on it. There's also a general lack of statistical knowledge among researchers, which is understandable, because medical researchers (or psychologists, etc.) are not trained as statisticians. This is also a solvable problem.
Posted by: shgstewart
|
December 30, 2010 11:22 PM
I love it when you talk discourse analytics, PZ. :)
Posted by: Amy(T)
|
December 31, 2010 12:19 AM
AP Moller and ESP research as an example? Yeah, if you have/do shitty research it gets overturned, that's science. A lot of that stuff should never have been published in the first place.
Posted by: Pierce R. Butler
|
December 31, 2010 12:23 AM
xyx @ # 47: ... because medical researchers (or psychologists, etc.) are not trained as statisticians.
Funny how just one uninformed clause can torpedo an otherwise commendable comment.
Posted by: MadScientist
|
December 31, 2010 1:25 AM
When Jerry Coyne brought up that article several weeks ago I pointed out how horrible it was. Among other things of course the author pretends that the observed phenomenon applies to *all* scientific claims. However, the last time I checked gravity it was still working according to Newton's description and the planets were still orbiting as described by Kepler and Newton. The sun was still fusing hydrogen in much the same way as it has been for over 4 billion years and the sun still rises in the east and sets in the west unless you're within the polar circles. I thought it was a notoriously bad article from start to finish - the sort of thing I'd expect to see in National Enquirer. The whole article was a big lie along the lines of "scientists dispute evolution".
Posted by: GrueBleen
|
December 31, 2010 1:30 AM
"If you torture the data long enough, it will confess [to anything]."
Timely to remember Ronald Coase, two days after his 100th birthday.
Posted by: cody.cameron
|
December 31, 2010 1:34 AM
Excellent post. I just had to mention a bit, I think Feynman said it, about an embarrassing event in the history of physics, where over time the measured value of the electron charge drifted (following Millikan's oil drop experiment). It turned out that Millikan was a great experimentalist but had used the wrong value for the viscosity of air, and so his calculation was wrong. So other experimenters biased their data, dropping outliers to steer their values towards his. But we learned from that and we know better now (he says).
Also, I've wondered about the antidepressants, which I'm told often have a low margin of effectiveness over controls... I figure, if I were in a drug study, whether consciously or not, I think I'd be trying to determine whether I was on the placebo or not. And if I had little or no side effects I'd probably start to question the veracity, which if I were actually on the placebo, would reduce its effectiveness, right? So there would be a slight advantage just to having a pill with noticeable effects over one with none, right?
Does anyone know if they try to compensate for this effect? I imagine there are ethical questions to giving a placebo that induces specific (negative) side effects?
Posted by: GrueBleen
|
December 31, 2010 1:36 AM
MadScientist @ #51,
But why hasn't anybody, and especially Lehrer, mentioned Ioannidis? Surely Lehrer's article is just a low-grade ripoff of Ioannidis's more careful research?
Posted by: The Sailor
|
December 31, 2010 2:09 AM
"Raise the required standards for making conclusions, lower standards for publication."
Don't lower the standards, lower the cost. It's freaking amazing how much it costs my research team to publish text, much less include grayscale or, FSM forbid, color images in a peer-reviewed journal.
And stop publishing scientists just because they have a rep. Publishing should be blind as to who wrote the paper.
There's nothing wrong about the scientific method, but scientists keep fucking it up. I hate the petty arguments that reviewers have with each other and authors that they know. [/rant]
Posted by: raven
|
December 31, 2010 2:16 AM
One of the things that bothers me about this article is that the claim is just wrong.
Science is cumulative and self consistent. Anything that has stood the test of time is likely to stand up as true forever.
The speed of light is still 3x10exp9 cm/seconds, a mole is 6.023 10exp23 particles, germs still cause disease, automobiles can be run on hydrocarbons, smallpox is still extinct, and so on.
About the only time there might be a decline effect is in recent findings on the cutting edge in fields where you have lots of noise and are looking for a weak signal. Medicine sometimes, psychology, behavioral science, maybe others. I doubt if physics has this problem too often for too long. And there is also a rise effect. Sometimes things look better with time.
And there is definitely an improvement with age effect. We get more precise measurements of constants as our instrumentation improves and more measurements are made. Things that are wrong and not replicatable, get discarded sooner or later. Accumulating data clarifies murky pictures.
We always wondered where the Neanderthals fit in. All we had were a few bones here and there and some artifacts. Then technology improved and we sequenced their whole damn genome from a scrap of 33,000 year old bone.
Really, the more I see of this Jonah Lehrer's idea, the more it looks like GIGO, garbage in, garbage out. Or he needed a quick bit of cash to pay the rent.
Posted by: raven
|
December 31, 2010 2:20 AM
For nerdette above who was asking what the difference is between science and religion. The difference is huge.
The sidebar had one explanation.
Posted by: ArabiaTerra
|
December 31, 2010 2:46 AM
@ nerdette
I get this one thrown at me all the time on the climate change thread on my local forum.
My standard response is:
I do not have a "belief" in climate change, I have a conclusion, based on an honest evaluation of the evidence.
Posted by: Cath the Canberra Cook
|
December 31, 2010 3:14 AM
This is really weird because I've just read his book "Proust was a Neuroscientist" in which he does much the same thing. Every chapter is fascinating, well worth reading - a study of some aspect of human perception and/or consciousness that some artist had an insight into before it was scientifically validated. In each case, it *was* scientifically validated. And yet he ends up with a moral that Art may teach us things that Science Can Not Know. The opposite of what each of his stories said. Huh?
I kept also feeling that he was cherry-picking. There were plenty of artists with other views, some of whom he discusses. So presumably no matter what Science had discovered there could have been some Artist to precede the discovery, but that's another issue.
Posted by: JoeB
|
December 31, 2010 3:28 AM
Uh, Raven...
You might want to take another look at your speed of light!
Posted by: Phill Marston
|
December 31, 2010 5:48 AM
Perhaps for 'science' one should read 'medicine'. In my field, geology, an observational science, hypotheses confirmed by early results often seem tenuous and it takes a long time for most of them to be confirmed or not (think of Wegener's 'continental drift'). With 'breakthroughs' in medicine readily seized on, or driven by, pharmaceutical companies it would seem likely that the early confirmation of hypotheses is often adopted for reasons of short-term profit rather than scientific rigour.
Posted by: Keith Kloor
|
December 31, 2010 6:13 AM
Sometimes I think an article like this is judged more by concerns over how it might be received in anti-scientific quarters.
#54:
Ioannidis is in the piece.
Anyway, there's been a lot of discussion of this piece in the science blogosphere, much of it along the lines of PZ's criticism. I think people are overreacting a tad. I have a more favorable write-up at my site, and which also places the article in the context of relevant articles that preceded it:
http://www.collide-a-scape.com/2010/12/09/where-science-is-flawed/
For links to varied reax on the New Yorker piece in the science blogosphere and Lehrer's elaboration at his Wired blog, see the updates at end of my post.
Posted by: ericthehalfabee
|
December 31, 2010 6:32 AM
@ Nerdette #37
If you're looking for a specific system as opposed to a name for any scientific belief system, then I think Humanism would fit the bill.
Posted by: RickK
|
December 31, 2010 7:32 AM
Yep, it's true. Science doesn't work.
As I sit here in front of a powerful but affordable home computer using a global digital communications network to browse pictures sent back from a probe around Saturn so I can show them to my healthy children who've never had a classmate die from disease, I can say categorically that Jonah Lehrer is right and the scientific method doesn't work.
Posted by: Comstock
|
December 31, 2010 8:05 AM
@ Cath #59:
I think you've hit it on the head why Lehrer's writing is so frustrating. It is cherry picking, then drawing some sweeping conclusion about What It All Means, in a vague but certainly grand sense. I found Proust Was a Neuroscientist to be aggravating in the extreme for this reason. In fact, I find a lot of Lehrer's writing to be shallow, made all the more annoying by his habit of whipping up his subjects as something profound in that educated-yet-plainspoken voice so common in benevolent-explainer science writing. This New Yorker story is right along those same lines.
Posted by: Otis
|
December 31, 2010 8:32 AM
What is meant, precisely, by a decline effect, if people are comparing P values from drug trials to P values derived from larger populations? For example, a P value of 0.001 obtained in a trial study compared to a later value of 0.004, and then say that things are getting worse. This would be an incorrect interpretation of the statistical tests. Would it not?
Posted by: Antiochus Epiphanes
|
December 31, 2010 8:49 AM
What RickK said.
Posted by: broboxley OT
|
December 31, 2010 9:01 AM
Science is accurate measurement and repeatability. As for the decline effect, anyone who has worked as a sysadmin understands that. When we have a product in the lab, test it repeatedly, load test it, then promote it to the wild, we understand that all kinds of strange outliers will occur. This changes our understanding of the product we are trying to build and improves upon it.
Posted by: adam.koncz
|
December 31, 2010 9:42 AM
I fail to understand how any of Lehrer's arguments prove the decline of science.
As long as there is an educated community that has the means to disprove any disprovable hypothesis, science will flourish.
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 9:44 AM
Ultimately, the problem with Lehrer: He's a bullshitter. He tells a story, but it's always with a view toward persuading the reader to his theme--proof by anecdote. And his conclusions are always sufficiently vague that he can claim he was "misunderstood" rather than misleading. There is a special circle of hell for such people:
-they play muzak all the time;
-the light is so soft that all outlines become indistinct;
-everything anyone says is linguistically correct, but has zero information content
-the seats are soft but never quite comfortable
-and the cuisine is perfectly done, but just ever so slightly annoying and has no nutritional content, so everybody is suffering from malnutrition
-and so on.
So I'm afraid I disagree. Lehrer is precisely the enemy because he is so damned imprecise. He strikes me as someone who couldn't quite hack science, and so he went into journalism--where he fucked that up even as he succeeded.
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 9:48 AM
Crudely Wrott: "What it is primarily for is answering questions. Secondarily but probably more useful, it is for allowing one to ask more questions, deeper questions, better informed questions."
Not quite. The aim of science is to yield understanding. Yes, nature can only answer questions we pose with simple answers--number or yes or no. However, ultimately we derive enough understanding that we anticipate the answers. Fortunately, nature doesn't change her story.
Posted by: raven
|
December 31, 2010 9:49 AM
Another way to look at it.
Science converges asymptotically on the truth over time. It never gets there but it gets closer and closer. And all truths are provisional, new data can falsify them any time.
This is different from religion.
Religion gets it totally wrong in the beginning. It diverges over time. There are now 38,000 different xian sects, varying wildly in beliefs. More are created every year by schisms, revelations, and con men.
What falsifiable claims religion makes end up being falsified. They believe it anyway. Many of the claims are unfalsifiable, meaning that there is no way to determine whether they are right or not. That is why there are hundreds or thousands of religions that seriously disagree. And why religions traditionally settle belief disputes by murders and wars.
Posted by: Gus Snarp
|
December 31, 2010 9:51 AM
You've said that the simple chance explanation is a hard one to get across. Count me as one of those it's hard to get it across to, I guess. It seems to me that 1 in 20 experiments still should not return a significant result at the 0.05 significance level. 1 in 20 tests should return a positive result, but I don't think that extends to 1 in 20 experiments. At least not if they are adequately designed and have sufficient sample size. That's the whole point of using large sample sizes, isn't it? Or am I doing it wrong?
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 9:51 AM
Raven,
Might one say that science provides understanding rather than "TRUTH". While there likely is an ultimate "TRUTH", there is no ultimate understanding.
Posted by: Nerdette
|
December 31, 2010 10:28 AM
Heh, yes, it is, and I didn't ask what the difference was between science and religion. I asked for a name for a scientific belief system, and then provided a definition for belief system, since properly defining terms on the internet often helps avoid miscommunication...
Good call.
Posted by: KG
|
December 31, 2010 10:44 AM
Of course there's an ultimate truth! Assume that there is no ultimate truth. Then the ultimate truth is that there is no ultimate truth - so there is an ultimate truth. QED ;-)
Admittedly, an intuitionist would not accept this proof by contradiction.
Posted by: raven
|
December 31, 2010 10:46 AM
Posted by: Hank Roberts
|
December 31, 2010 10:50 AM
Philip K. Dick, "How to Build a Universe That Doesn't Fall Apart Two Days Later"
http://deoxy.org/pkd_how2build.htm
In which he also wrote:
Posted by: Multicellular
|
December 31, 2010 10:58 AM
Gus @ #73
If something is significant at p=0.05 it simply means that you have a 5% chance of making what is known as a Type I error - incorrectly rejecting a result as not significant when, in fact, it was. So if we did 20 experiments we accept that one set of results will be rejected even though it was significant. We must therefore assume that 1 in 20 results will be significant (which doesn't mean the overall effect is there, just that one result showed significance, which could be due to any of the reasons PZ cited, and we rejected it). Increasing your sample size to 100 will improve the overall results but assuming a p=0.05 still means you accept that 5 out of the 100 will be incorrectly rejected even though they show significance. If you want to reduce your chances of falsely rejecting a significant result you can set a lower p-value (e.g., p=0.01 or a 1% chance of falsely rejecting data).
I'm not a statistician, just a grad student who had this drilled into my head taking research stats, so if I got any of this wrong I'm sure other more experienced posters will correct me.
Posted by: WCorvi
|
December 31, 2010 11:04 AM
Richard Feynman made the point that you have to be careful you don't get fooled (in doing science), and the one most likely to fool you is YOU!
We tend to think that scientists impartially look at the data, and draw the best conclusions. That isn't true - what really happens is that there are strong opinions (biases?) on both sides, and eventually the data support one over the other. Science is impartial, but individual scientists are not.
This is the problem with UFO's, bigfoot, nessie, etc - the data haven't improved one iota over the last 50 years. There are many cases of scientists refusing to accept mediocre data, but eventually the idea becoming accepted - rocks falling from the sky (meteors) is one such.
Posted by: Gus Snarp
|
December 31, 2010 11:20 AM
@Multicellular - I think you might be right. But I'm still not sure. It's been too long since I took statistics, and I don't do research so I don't keep it up. But my thinking was that the 0.05 significance level is based on the results of a given experiment. It is the likelihood, based on the sample size, among other factors, that the mean of your sample is different from the mean of the population. It applies only to a single experiment. But I suppose that if you conducted the exact same experiment 20 times with exactly the same sample size, then yes, one of those should have a different result. But I'm still not sure that extrapolation to other experiments holds. Arrgh. And I did well in statistics.
Posted by: HertfordshireChris
|
December 31, 2010 11:43 AM
Ibis3 #26 said
The general view of respondents is that everything is well with science and I would like to widen the point made above to say that if an idea gets accepted rapidly (fanned by commercial and media interests) it can be accepted as the established answer by the scientific community itself – and anyone questioning the “establishment view” can find themselves thrown out on the rubbish heap. I know, it happened to me. In many ways what happened was as if the computer was some kind of god whose basic fundamentals could not be questioned.
I was trained as a scientist to ask questions – and when I had found a possible answer, to ask further questions to test its validity. In 1965 I decided, for career advancement reasons, to ditch formal science and move into commercial computing – and without realising it took my science research philosophy with me. Within two years I had become a “sales consultant” whose actual job was to help identify the future market (for the early 1970s) for large main frame computers for what remained, after mergers, of the company which built the pioneering LEO computers. I spent a lot of time talking to everyone from the research teams and the engineers designing circuit boards through to the data processing managers of large companies. I came up with a “shocking” research question.
I suggested that (at the time) everyone was taking the programmable calculating machine philosophy of the digital computer for granted – and as a result we had black-box systems with poor human comprehension interfaces which were unable to cope with dynamically changing open-ended real world problems. No-one had ever sat down and looked at fundamentally different architectures. In particular I suggested a “white-box system” where mutual human-computer “understanding” was the starting point, and where it was possible to evolve the system's behaviour as requirements changed. One way of looking at what I was trying to do was to build an information mirror that not only stored the user's own view of the task, but was also able to reflect his way of processing the information.
The research was eventually abandoned because I failed to get research grants and because I became exhausted and disillusioned after a family suicide. However the underlying reason was that the computer establishment acted just like a religious hierarchy when its foundations were questioned.
There is of course a big difference between thinking of the computer as a god and thinking of Jesus as a God. Belief in the power of computers demonstrably works (so it must be right) – while many people can see no evidence (apart from the placebo effect) for any positive benefit in believing in Jesus. Except to a few very highly specialised software engineers the modern personal computer is a mysterious black box which conducts mysterious “rites of passage” on the information. The computer is now held in such high esteem that virtually every school child is being taught to program one (just like learning the catechism) and those who cannot are made to feel inferior. Anyone who does not recognise the power of the computer is old-fashioned or a Luddite, and their views can be dismissed without a thought. (Does any of this remind you of anything?)
I believe the real problem is that I was questioning the underlying philosophy. We all know the story of VHS and Betamax. OK Betamax lost out because it was a year or two too late. My difficulty was that I started questioning the underlying assumptions of the computer some 30 years too late .... So much investment has been made in terms of money, careers, data bases, commercial organisations, etc, that it is too expensive to even consider going back to first principles.
Whether my approach would have been viable in an open competition is one thing. The question you should ask yourself, as a scientist, is when did you last ask whether the current black box approach to information processing is the only approach to handling real world information? Did the human brain evolve to be like a computer or does it work in a very different way? If your answer to the latter is that the human brain is different have you wondered why electronic information processing machines have not yet been built that way? Having been through the rejection mill I feel I know the answer.
If you say there is no need to ask these questions because computers are so successful, you are in effect proving that the “worship of computers” is a belief system – and hence that it is valid to compare at least some aspects of what is generally accepted as science with religions.
Posted by: Nerdette
|
December 31, 2010 11:56 AM
You already established this. I accepted it. I asked for the name of a scientific belief system, one that is based on the tools of science, such as observation and reason. I provided the definition of belief system (#37)- if you would like to suggest another definition, I'm quite open to it.
Posted by: Quodlibet
|
December 31, 2010 12:06 PM
Try telling that to my husband, a PhD clinical neuropsychologist with extensive graduate training in research statistics and design. He's one of the sharpest people I know for using and interpreting statistics correctly.
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 12:10 PM
Gus and Multicellular,
The real answer to your question depends on 1)what model you are using for your errors, vs. 2)what the distribution of errors actually is.
Statistical tests tend to be based on rather simple error models--e.g. Normal or Lognormal--mainly because these models are tractable. Even if the actual error distribution doesn't follow these simple models, they often work because of the Central Limit Theorem (which says that the sum or average of many independent errors with finite variance tends toward a Normal distribution, whatever the shape of the individual errors).
OTOH if your results have systematic errors and your model assumes only random errors, then your analysis will be incorrect.
In general, if results are significant at the 5% level, it means that once in 20 times, random fluctuations in your data could give you a result of that significance or higher. So why not go to higher significance? In part, it's because of the cost of obtaining such large samples. However, I think if we go much higher in significance, we start to reach the tails of the distribution, where departures from Normality in the actual distribution may start to distort the results.
Posted by: David Marjanović
|
December 31, 2010 12:14 PM
What a thread! I've bookmarked two of the links in this discussion. :-)
I still don't understand what you mean by "a scientific belief system". A theory/hypothesis/speculation?
"The base on which beliefs are based" – are there beliefs in science? And if so, isn't that base simply science or the scientific method (falsification & parsimony)?
We can't knowingly build "electronic information[-]processing machines" (computers for short) the way the brain works, because we don't understand how the brain works in sufficient detail.
:-|
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 12:15 PM
If science and reason dictate "belief", then I'd call that "rationalism". Either that or the Reality-Based Community.
Posted by: Kagehi
|
December 31, 2010 12:16 PM
Think you missed one:
Diet variance - In most trials, the key feature is to limit variables as much as possible. In many cases this can include the general diet of the people involved, either intentionally or unintentionally (say, someone is diabetic, so eats differently than the general population to start with, and that is your initial target group to test with). Differences in intake of any number of things could later skew the results, or even be counteractive in some odd cases, with no way of being able to test 100% of every possible thing some fool might be putting in their mouth.
Posted by: Orac
|
December 31, 2010 12:17 PM
A "buddy" weighed in on this a while ago:
http://www.sciencebasedmedicine.org/?p=8987
So did Steve Novella:
http://theness.com/neurologicablog/?p=2580
What irritated me about Lehrer's article is that he presented the decline effect as though it were something mysterious and unexpected. Then he went all borderline postmodernist on us. Very annoying.
Posted by: pelican's-point
|
December 31, 2010 12:28 PM
Rhetorical: What is not emotional about profit-making?
Substance: One such fundamental emotional attachment is the dismissal or refusal to consider contradicting evidence to what we believe to be true about the universe and our place in it. For scientists that typically includes the scientific knowledge they currently possess.
By keeping such beliefs sacred (emotionally trusted) they (we) gain a sense of security - we feel that the universe is knowable and that we can more safely guide ourselves through it. It makes us feel in control of our destiny and able to protect ourselves from danger. These are non-conscious but very strong forces, of course. They are feelings. That's why they can be so tenacious.
Such emotional responses are part of human nature - not something that can be reasoned away. And so, even the best, most objective scientists have them.
Posted by: raven
|
December 31, 2010 1:00 PM
Whatever a scientific belief system means.
A variety of answers have been given, summarized.
Humanism, Scientism, Philosophical Naturalism, Materialism, Rationalism, Reality Based Community.
Pick one or pick all.
Posted by: xyx
|
December 31, 2010 1:02 PM
All I'm saying is that most researchers are trained primarily in their specific field of study and not statistics. I know graduate students get some training in statistics, and I realize that some get quite a lot, and that is good. But in my experience, most medical researchers know how to use various statistical tools but do not have a deep understanding of their meaning. And this is reflected in the literature.
It is worth reading this article, which I think is a much better popular description of the problem than the Lehrer article.
Posted by: xyx
|
December 31, 2010 1:09 PM
This is wrong. p=.05 means there is a 5% chance of getting a result at least as extreme as your result given that the null hypothesis is correct.
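A minimal sketch of that definition in Python, for anyone who wants to check it by simulation. It assumes a two-sided z-test with a known standard deviation and a true effect of exactly zero; the function names, sample sizes, and number of runs are illustrative choices, not anything from the thread. If the definition holds, about 5% of no-effect experiments should reach p < 0.05 whether the sample is small or large.

```python
import math
import random


def two_sided_p(sample, sigma=1.0):
    """Two-sided z-test p-value for H0: the true mean is 0 (sigma assumed known)."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # Probability, under H0, of a |z| at least this extreme.
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rate(n, experiments=4000, alpha=0.05, seed=0):
    """Fraction of null experiments (no real effect) that still reach p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p(sample) < alpha:
            hits += 1
    return hits / experiments


small = false_positive_rate(n=10)    # small samples
large = false_positive_rate(n=200)   # much larger samples
print(small, large)
```

Under these assumptions both rates should land near 0.05. A larger sample buys more power to detect a real effect; it does not lower the false-positive rate at a fixed threshold.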
Posted by: Enkidum
|
December 31, 2010 1:23 PM
@HertfordshireChris
The brain-as-a-computer model still carries a fair bit of influence in cognitive science, imho mostly due to inertia. But I think virtually no researcher under 40 takes it very seriously any more. I.e. no one who studies vision (that I've ever met) takes David Marr's theory of vision as anything more than a useful early step in modelling.
As for why no one builds electronic processing machines that way, lots of researchers try to, and even more try to simulate that architecture on a standard digital computer. For those who are trying to understand the brain, then, your point really doesn't hold. (I know you've been out of the field for a while, but hell, artificial neural networks have been big since the 50's!)
As for why the computing industry in general doesn't build computers that way, it's because the purpose of computing isn't to simulate the human brain - it's to build useful tools. Digital computers are cheap, easy to build, and remarkably powerful for a vast number of tasks. And the user interfaces are now remarkably intuitive as well. Why fix something that ain't broken?
In short, I think your computing-as-religion analogy is out to lunch, not to mention completely off topic...
Posted by: Enkidum
|
December 31, 2010 1:32 PM
@ a bunch of people re stats...
Replication pretty much wipes out this worry. Either a result is replicated, in which case your p
The trouble is that in many fields - psychology being the one I know about - there isn't much of an incentive to replicate. Thus there are still "effects" being presented as "the truth" in first-year social psych textbooks, but no one has managed to replicate them in 30 years (and believe me people have tried). But no one can get a publication out of a null result.
I believe this is not the case in, e.g., physics. Once it becomes required practice in psych for results to be replicated by independent researchers before they are taken seriously, then we'll have a lot of bullshit out of the way. (Of course this would also come with its own host of problems.)
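How an unreplicated literature can overstate an effect can be sketched with a small simulation. Everything here is an assumption made for illustration: a small real effect of 0.1 standard deviations, studies of 20 measurements each, and a rule that only results clearing a one-sided 5% z-test get published while the rest stay in the file drawer.

```python
import math
import random
import statistics

TRUE_EFFECT = 0.1   # assumed small real effect, in standard deviations
N = 20              # assumed number of measurements per study


def run_study(rng):
    """Observed mean effect from one study of N noisy measurements."""
    return statistics.fmean(rng.gauss(TRUE_EFFECT, 1.0) for _ in range(N))


def is_significant(mean_effect):
    """One-sided z-test at roughly the 5% level, with sigma assumed known (1.0)."""
    return mean_effect / (1.0 / math.sqrt(N)) > 1.645


rng = random.Random(1)
first_wave = [run_study(rng) for _ in range(5000)]
# Only significant results are "published"; the rest go in the file drawer.
published = [m for m in first_wave if is_significant(m)]
# Each published finding is then re-run once, with no selection applied.
replications = [run_study(rng) for _ in published]

print(round(statistics.fmean(published), 2),
      round(statistics.fmean(replications), 2))
```

In this toy setup the published average has to sit well above the true effect, because only the lucky draws pass the filter, while the unselected replications should average out close to 0.1. That gap is a decline effect with nothing mysterious behind it.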
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 1:45 PM
xyx,
I work as an applied physicist, and it has been my experience that most researchers do not have a deep understanding of probability and statistics. Period. No need to qualify what field. Researchers learn what they need to for doing their research, and most of them learn recipes, not fundamentals.
There will hopefully be a few in a field who do understand statistics well enough to keep things from going off the rails, and who will know to consult with a real statistician when they are out of their depth. There will also be a few who push the envelope in the field, bringing in new techniques, and once in a great while even improving on existing statistical practice.
However, probability and statistics is a very deep pool to swim in. Even mathematicians are not fully satisfied with its basis, and it pays to appreciate that it is possible to lie to yourself with statistics so that you instead use it to illustrate the truth.
Posted by: Nerdette
|
December 31, 2010 1:52 PM
Which is why I don't like the word belief - it can either be taken as 'a vague idea in which confidence is placed' or 'acceptance of what is true'. But as one commenter mentioned above, it's unlikely that its replacement in communications will improve any understanding of intent.
Taking 'belief' to be 'acceptance of what is true', then a 'belief system' would be the processes an individual takes to accept their reality - one can either do it through dogma and faith or through the scientific tools of logic and observation. It does sound rather hokey, but since it was the term that began the whole ordeal, it was difficult to move away from it.
Also a good call.
I know, I did in #75. Hence, my confusion as to why you kept backtracking.
Posted by: https://me.yahoo.com/a/isgWt2w1p9c5fkgFME9jirct#ef5d3
|
December 31, 2010 1:57 PM
Am I wrong or are all the examples in psychology? Do other sciences have the same problem?
Posted by: xyx
|
December 31, 2010 2:00 PM
a_ray_in_dilbert_space,
Good post. I fully agree with you.
Posted by: cd jauer
|
December 31, 2010 2:07 PM
Years of petroleum exploration, a very applied scientific endeavor in which failure can result in a very short career, gave me two scientific maxims:
1) If you seek the truth, be prepared for it to be ugly.
2) Mother Nature is a bitch.
PZ, you should give lectures on the philosophy of scientific exploration!
Posted by: barney.oran
|
December 31, 2010 2:08 PM
I think these are important ideas that everyone needs to learn. I've had this argument with people where I say "study after study has shown no statistically significant effect" and they say "I specifically remember hearing of a study that proved an effect." Doh! I try to explain regression toward the mean or something like the file drawer effect, but people who believe one study they agree with over 10 they don't are predisposed to not understand those explanations.
I just hope quantum mechanics doesn't get proven wrong and all our nice electronics stop working....
Posted by: The Sailor
|
December 31, 2010 2:24 PM
My experience in medical research has led me to the conclusion that most MDs don't know squat about statistics.
He is an outlier, and one anecdote does not an argument make. Making statistical sense of medical data is something that most researchers are not prepared for. Hire a statistician, or be lucky enough to have one already on staff (like the above anecdote).
I'm not a statistician, I can provide the data, and I can do the pre/post processing that provides the data, but someone good at stats has to analyze it. And every time I see a researcher throw out inconvenient data points so their curve fits better I want to scream.
Posted by: Enkidum
|
December 31, 2010 2:38 PM
"And every time I see a researcher throw out inconvenient data points so their curve fits better I want to scream."
Outlier rejection is a perfectly reasonable practice in principle. It's just most people don't do it unless their data seems odd to them, which is essentially trying to make their curve fit better. Even in this case, using principled rejection techniques ought to get rid of some of the worry - but I agree with you that not everyone uses these techniques, and that even if they did the worry would still be there.
Posted by: The Sailor
|
December 31, 2010 3:17 PM
@ me at #103 I should have written "And every time I see a researcher throw out inconvenient data points so their curve fits better I want to scream squirm." There, fixed it for me.
Posted by: The Sailor
|
December 31, 2010 3:39 PM
I love science, I love standing on the shoulders of giants, but most of all I like learning new things.
I even like learning that I'm wrong.
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 4:57 PM
The Sailor: "And every time I see a researcher throw out inconvenient data points so their curve fits better I want to scream squirm."
Enkidum: "Outlier rejection is a perfectly reasonable practice in principle."
Actually, outlier rejection is a black art for the following reasons.
First, you must have some criterion for rejecting the data. Unless there is a KNOWN reason why the data is "bad", this presumes some sort of model. How do you know that your model is appropriate for the data?
Second, outliers often convey very important information--e.g. the presence of a second rare mode, the shapes of the tails or just that it is possible to get outliers (which may indicate another mechanism is affecting the data). Often the outliers can be the most valuable data points because they inform us of the limitations of our models.
Third, the outliers are often the first indication that something might be wrong with the model or the apparatus...
So, it is sometimes necessary to reject outliers. It should never be done solely on the basis of their numerical value. Even looking for a post hoc justification can be risky. Usually to be safe, I look at what the data are trying to tell me both with and without the "outliers".
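The "with and without" comparison can be sketched in a few lines. This is only an illustration, assuming made-up measurements and a median/MAD flagging rule; the rule, the 3.5 cutoff, and the function names are assumptions for the example, not anything specified in the comment. Points are flagged rather than deleted, and the estimate is reported both ways.

```python
import statistics

def flag_outliers(values, threshold=3.5):
    """Mark points far from the median, using a modified z-score.

    The score is based on the median absolute deviation (MAD).
    The 3.5 cutoff is a common convention, assumed here for illustration.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]

def summarize(values):
    """Report estimates with and without the flagged points."""
    flags = flag_outliers(values)
    kept = [v for v, flagged in zip(values, flags) if not flagged]
    return {
        "flagged": [v for v, flagged in zip(values, flags) if flagged],
        "mean_all": statistics.mean(values),
        "mean_without_flagged": statistics.mean(kept),
        "median_all": statistics.median(values),
    }

# Hypothetical measurements; 25.0 is the suspicious point.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 25.0]
report = summarize(data)
for key, value in report.items():
    print(key, value)
```

If the two means tell different stories, that difference is itself a result worth reporting, and the flagged point still needs an explanation before it is dropped.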
Posted by: HertfordshireChris
|
December 31, 2010 5:17 PM
Enkidum #94 says:
It is interesting to see you dismiss my comments with the same zealous comments that a fundamentalist xtian would dismiss the comments of an atheist. i.e. Don't talk rubbish and don't talk it here. Such Talk offends my beliefs.
This topic relates to the supposed limitations of science and my experience is that scientists are no different to other people which means that when they get on a roll (both singly and collectively) the ideals of “science” can go out of the window and "Belief" takes over. A significant number of scientists (particularly second rate ones) can behave just like a fundamentalist xtian when you question the foundations of their belief, especially if the questioner has a very different background so can be dismissed as “out of touch”. (= not having read the sacred books)
My “problem” was that before I ever came near a computer I had experience of processing very large quantities of often poorly structured information manually in several different areas. I switched to the big end of the commercial data processing business. (In 1965 the company had some 250,000 customers and 5,000 product lines. - all batch processing using magnetic tape storage) Perhaps because of my science research background, plus previous experience of providing complex R & D information for use in board level decisions in the pharmaceutical industry, I took a particular interest as to what was causing the various, often large scale, glitches that occurred. The company was planning to move to direct access storage (which we take for granted by the gigabyte today) and terminal working and I was asked to familiarise myself with the sales contract system. I misunderstood my task, which was to get myself ready to be one of a team of very intelligent systems analysts who would draw up a precise pre-definition of the new invoicing system. Instead I used my first hand experience of working in large open ended manual information processing systems, plus my experience of the faults that developed in the existing computer system. I worked on the assumption that what was required was a system which allowed sales management to control a dynamically open-ended system that could rapidly adjust to market changes. Because of my pre-computer information processing background I honestly had no idea that I was suggesting anything which was not obvious.
I could go on with the history of how this developed into an unconventional architecture - which would definitely be off topic. The distinction between the “conventional computer philosophy” and “my approach” is highlighted by the underlined words. Do you predefine a task in advance, or do you provide a dynamically open-ended framework (on the basis that the world is too complex to precisely predefine) and simply provide a white-box system which the people who use the system can understand and drive? In a way you can say that the computer industry is like a railway, where specialist engineers build railway lines to take users where they want to go – while I was proposing a helicopter where the user can go where he wants and I, as the system designer, do not care, or need to know, where the user might decide to go.
You mention artificial neuron networks and I would hope that with the vast increases in computer power, and the thousands of dedicated people who have worked on them, some progress would have been made. However I suspect that once allowance is made for the number of man-hours and computer power invested in this field, and in the artificial intelligence field, progress has been remarkably slow. Perhaps I am wrong and artificial neuron networks have now advanced to the stage where they allow sales staff to run commercial systems involving hundreds of thousands of sales transactions a day.
You mention the vast improvements in user interfaces – which I never queried – the existing systems are still mysterious black boxes where the user has no idea what is actually going on inside – while my idea was to have a system which was a user friendly white box at the hardware processor level, and where if there was a problem with a task the computer could only communicate in the user-defined application language. Modern computers may perform miracles (Praise be the Almighty Computer) but the level of understanding of language shown by, for example, Google is pathetic.
OK there may well be misunderstandings about terminology. What I mean by information and what you mean by information may be different. You may think that an ipod with a sophisticated user interface is all that is needed for an electronic information processing system to understand and support the information needs of its owner. I think that the level of mutual understanding is actually very low.
So back to your response. Such disagreements in terms of reference between scientists behaving in the “science mode” are common, and are sorted out by asking questions to explore the areas of apparent disagreement. You don't ask questions. Like any fundamentalist who is convinced that he is right, and who is faced with someone who questions his belief system, you retaliate by trying to belittle your opponent.
If you are not careful you will make my point for me by demonstrating that your belief that computers are based on a well-researched foundation is more important to you than your adherence to the scientific method.
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 5:46 PM
HertfordshireChris,
Waving a "religion" flag at a group of atheists--particularly atheists who are often told their atheism is a "religion", that "Darwinism" is a religion, that global warming is a "religion"...is probably not the best way to get your ideas across.
Now as to your thesis in general, I think you might have a leg to stand on if computers were designed for a single purpose--e.g. research, gaming... Indeed, when they were purely research tools a lot more research went into optimizing design, etc.
What has happened in the last 30 years is that 1)computers have become multi-purpose boxes used by practically EVERYONE... well except my parents and uncle; 2)thirty years of exponential growth in density, computing power, etc. has removed most of the constraints under which computers labored in their infancy (e.g. reliability, memory, speed of operations...). The net result is that sloppiness is not only not punished, it's almost rewarded.
There's nothing wrong with your thesis per se, regardless of whether it is correct or not. However, ixne on the eligion-re.
Posted by: Enkidum
|
December 31, 2010 5:59 PM
Chris, chill out. I said I thought you were off topic, and wrong. I still think you are. Other than that, I didn't ascribe any beliefs to you, belittle you, or anything else. I also tried to give a number of precise examples of where I thought you were wrong. There's nothing religious about that.
I can't go back in time and understand exactly what the problem you encountered with sales systems was. But it strikes me that your real problem isn't that computer scientists and programmers are religious, it is that what you want them to do is difficult. In fact, there are reasons to suspect that something like natural language processing (and other forms of ai) is pretty much the hardest problem that anyone has ever tried to solve in history. (I am not being hyperbolic here - it may literally be the most difficult thing that humans have ever tried to do. It might not, but it's in the running.) I don't know what you mean by a "dynamically open ended" system, exactly, but it sounds like it's much the same issue. Intelligence is hard. That's why there isn't very much of it in the universe. You're right that we don't have intelligent computers, and that even those who are trying to make computers intelligent are barking up a lot of wrong trees. You may even be right that some of this is due to adherence to ideas simply because they are the ideas other people like, and that this has something in common with religious belief. But I think you're forgetting how hard this problem is.
Wanting a white box at the hardware level is a different issue entirely. Again, I don't think it's a particularly good idea - if you want to use a tool, you want it to be a black box - you don't want to think about its insides. You don't need to know how your muscles work in order to use your arm - why should your computer be any different?
Posted by: JennieL
|
December 31, 2010 6:35 PM
Gus @73:
OK, try this. Suppose you want to know if the mean value gained from your sample group, X, demonstrates whatever effect you're investigating. You want to know how likely it is that you could have found a mean value of X just by taking a random sample of the population. The less likely it is that you could have come up with X by mere random sampling, the more likely it is that there is some real effect (e.g., that some investigational drug works).
Now suppose that the real population mean value is Y. Because the values of individuals in the population are distributed around Y, when you take a random sample from the population, the mean value of the samples will vary around Y in a normal (bell-shaped) distribution. Most of your samples will have mean values fairly close to Y, but every now and again your sample will consist of more outlying individuals, and return a sample mean which is quite far away from Y.
The normal distribution is a probability distribution, telling us how probable it is that you will get a sample mean at various distances from the (real, underlying) population mean. A p=0.05 value means that you will get a value at least that far away from the population mean 5% of the time. That is, 1 in 20 random samples from the population will yield a sample mean that diverges that much from the population mean.
Adopting an 'alpha' of 0.05 means that you've decided that you will consider a 1 in 20 chance unlikely enough to reject the null hypothesis if you get a result that could only happen by chance 5% of the time.
The Type 1 error rate (alpha) is the probability that you will reject the null hypothesis when the null hypothesis is actually true. Thus, supposing that you decided to take repeated samples of the same population, on average 5% of the time you would get a result which led you to reject the null hypothesis even though it was actually just an outlying sample mean.
That's what PZ is saying: that on average 1 in 20 random samples will have means that are significant at the 0.05 level - because that's what 0.05 significance means.
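The 1-in-20 figure can be checked by simulation. Below is a rough sketch, assuming normally distributed data with a known standard deviation of 1 and a two-sided z-test; those choices, the sample size of 30, and the function name are simplifying assumptions for illustration. Every simulated experiment samples a population where the null hypothesis is true, so any "significant" result is a Type 1 error.

```python
import random
import statistics

def false_positive_rate(n_experiments=20000, sample_size=30, seed=1):
    """Fraction of null experiments that come out 'significant' at alpha = 0.05."""
    rng = random.Random(seed)
    # Two-sided 5% critical value of the standard normal, about 1.96.
    z_crit = statistics.NormalDist().inv_cdf(0.975)
    rejections = 0
    for _ in range(n_experiments):
        # The true population mean is 0, so the null hypothesis holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        # z statistic for the sample mean, with the population sd known to be 1.
        z = statistics.fmean(sample) * sample_size ** 0.5
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_experiments

rate = false_positive_rate()
print(f"share of null experiments flagged significant: {rate:.3f}")
```

The printed share should sit close to 0.05, which is the point: with no real effect anywhere, roughly one experiment in twenty still clears the bar.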
I remember seeing a 'study' published on homeopathic remedies for migraines, where they claimed to find a significant effect, but then noticed that their 'placebo' and 'remedy' groups were too different at the start of the trial to properly compare. Looking at their numbers, the groups were actually so different in initial incidence of migraine that, if you just compared the 'treatment' group to the 'placebo' group before the study started, you would reject the hypothesis of no difference at 99% significance!
Posted by: HertfordshireChris
|
December 31, 2010 6:42 PM
Just a quick further comment
If I suggested that Science was a religion I didn't mean to - what I was saying is that my experience is that in some areas at least there are scientists (and more particularly technologists in the computer industry) who behave in a similar way to fundamentalist believers when their views are questioned. This can actually lead to ideas which question their "beliefs" remaining unfunded, or otherwise discouraged. The very fact that comparing scientific attitudes (as actually practised - rather than the scientific ideal) with religious attitudes is like a red rag to a bull for many atheists (of which I am one) suggests that they are not sufficiently good scientists to stand back, see the wood for the trees, and realise that there are fundamentalist believers in all walks of life, including scientists.
As to the question of a white/black box, the most important thing about any tool is that you and it can work together in symbiosis. If you want a tool to work efficiently it is important that the user knows enough about how it works to get the best out of it. The more the user understands about the tool the more he can get out of it. If your experience of computers is that you prefer a black box because the inside is incomprehensible that is your choice. A white box can always be used as a black box, but can also allow creative working together with understanding. I doubt whether you have ever thought how an information white box should interact with its user - or even if one was possible - and your rejection is based on extrapolating your experience of computers.
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 6:43 PM
Jennie L., How do you know your errors are normally distributed?
Posted by: JennieL
|
December 31, 2010 6:44 PM
a_ray_in_dilbert_space OM:
I'm simplifying for explanatory purposes. :-)
Posted by: The Sailor
|
December 31, 2010 6:48 PM
"Google, is pathetic."
Really? If you can do it better then you too can be rich. Got Facebook or Youtube?, you too can be rich.
Your entire TL;DR screed just sounds to me like you got your fee fees hurt and don't want to play anymore. I get it, you are an unrecognized genius ... who has nothing better to do than whine about it.
It's just software, buy a machine and emulate your model and market it ... or ... there are plenty of analog computers, they're making a comeback, but they still require work and software.
Science is not dead, and it did not throw you on a rubbish heap. A commercial enterprise that you voluntarily went to work for, and where you couldn't play well with others, did.
GOYA and do amazing work, then I will be happily amazed.
Posted by: Enkidum
|
December 31, 2010 6:57 PM
"It you want a tool to work efficiently it is important that the user knows enough about how it works to get the best out of it."
Disagree entirely. At least if "enough" means that the user needs to know anything about the actual processes by which the tool accomplishes its goals. You don't need to be a mechanic to drive a car, etc.
Now if you're saying that tools should, in principle, be designed such that we can all be mechanics, at least to a certain extent, I agree with you. I think, for example, Apple's policy of closing off the insides of their devices so you can't even look at them is wrong (frankly, I think it should be illegal). I like the idea of being able to look at the guts of anything. But I'm a nerd, and most people aren't.
In order to use a tool efficiently, you want its use to be completely transparent - if you have to think about its operation at all, it's not efficient. This is why babies aren't good at walking - it's not just that they haven't got the requisite muscles, it's that each movement has to be thought about (to the extent that babies think). It isn't until the process of walking becomes automatized that we are good at it (and automatized means "black box" - you try and think about how you walk and watch yourself trip over!).
Posted by: Amphiox, OM
|
December 31, 2010 7:01 PM
Three reasons come to mind.
First, we don't yet know enough about how the human brain is built to reliably and economically replicate the process.
Second, electronic information processing machines are useful tools because they are able to do certain tasks much, much more easily and quickly than the human brain can, and they can do that because they are built in a very different way.
Third, if you really, really want or need an information processing system that works like a human brain, built with an architecture like a human brain, there are already a variety of reliable traditional methods for obtaining access to one. And at least one reliable traditional method for making more.
Posted by: Amphiox, OM
|
December 31, 2010 7:04 PM
Outlier rejection is the single most common cause for missing important, perhaps field-shaking, findings.
Outlier rejection would be Ian Fleming throwing out his fungi-contaminated petri dishes.
Posted by: Enkidum
|
December 31, 2010 7:20 PM
Yeah, but sometimes an outlier is just an outlier. Measurement error is real. I think there's probably more to be said for not chucking out 99 studies because of outliers than for keeping them in that 1 case where the outlier actually tells you something useful.
Posted by: Otis
|
December 31, 2010 7:26 PM
Data showing a hole in the ozone layer over Antarctica was "filtered out" for decades.
Posted by: Amphiox, OM
|
December 31, 2010 7:28 PM
Sure, but if you just toss them out you'll never know. The intellectually honest thing to do is to keep the outlier and do the hard work of making more measurements/improving your measurement technique until the outlier is no longer statistically significant (in other words you include the outlier in your data set, report it, and show that it is really an outlier).
Or you do the extra work necessary to prove that the outlier is a true measurement error, so you can reject not because it is an outlier, but because it is an error.
Posted by: JennieL
|
December 31, 2010 7:28 PM
a_ray_in_dilbert_space, OM:
I didn't mean my #113 to be flippant. I was just talking about mean values and normal distributions as the simplest way to explain the notion of 0.05 alpha.
But now that I think about it, I had thought that the Central Limit Theorem guaranteed that the distribution of sample means would be approximately normally distributed (even if the underlying variable is not normally distributed), provided that number of samples is sufficiently large and samples are independent.
Though IANAStatistician (yet... ;-) so I'd welcome correction if I'm wrong.
Posted by: The Sailor
|
December 31, 2010 7:31 PM
Ian Fleming would never have thrown out his petri dish, Q would have made it a weapon that's far more effective than throwing.
Posted by: Otis
|
December 31, 2010 7:35 PM
Did James Bond discover penicillin?
Posted by: Amphiox, OM
|
December 31, 2010 7:36 PM
Ha! Got me good Sailor.
And the funny thing is I don't even read, or like, James Bond.
Posted by: John Morales
|
December 31, 2010 7:49 PM
JennieL, not all probability distributions are continuous, and not all variables are random.
Even if they were, skew is a complicating factor.
Posted by: Enkidum
|
December 31, 2010 7:54 PM
Can I just add that I have never actually used outlier rejection? (Never needed to - it wouldn't have helped confirm my hypotheses :) !) The prevailing opinion seems to be against me, but I dunno - I've had enough stats people tell me they think it's a decent idea that I'm not about to throw it away entirely.
Posted by: The Sailor
|
December 31, 2010 7:55 PM
Otis, James Bond did not discover penicillin, but he did go thru quite a bit of it.
++++++++++++++
amphiox, all in good fun ... especially since I was quite dickish up above to HsomethingChris.
++++++++++++++
My fav stats quote is my brother's: "There are liars, outliers, and out and out liars."
Posted by: HertfordshireChris
|
December 31, 2010 8:17 PM
It's already New Year here and I am off to bed.
At the time I had the key idea I very tentatively told my boss and he asked David Caminer and John Pinkerton - who were leading pioneers in building commercial computers - they built the Leo 3 - who moved me quickly into research, told me to tell no one, and work started on taking out patents.
Shortly afterwards I found my wife invited a lot of friends from the local church into our house, and the Vicar (the only other male present) asked me what I did. I suddenly realised that he understood what I was proposing far better than most of my highly computer literate colleagues at work and I had better shut up. The point is that he knew nothing about computers and so had no preconceptions.
Basically I found that with a few notable exceptions the more someone knew about programming computers the more they wanted my box to be a "programmable object" when the basic idea is that the partition of information into "program" and "data" is unnecessary and can actually make it harder to understand how my idea worked.
Whether the idea would have worked as a fully blown alternative for computers is actually irrelevant. The point is that I was asking a fundamental question as to whether an electronic processing machine based on a programmable calculating machine was a good foundation to meet the needs of humans processing large quantities of often poorly structured non-numerical information.
I suspect that others were working along other lines and hit similar obstacles. What one had in the 1960s (and later) was an overwhelming belief that the technology was right (because it was so successful) and that research meant exploiting the technology for all it was worth - and blue sky research which didn't promise to make a profit within weeks could not be funded.
As I said earlier I gave up because of the stresses of a family suicide and a head of Department who deliberately obstructed my research because I did not have a research grant. A year after I had decided to drop the idea for personal health reasons a paper was published in the Computer Journal (the top UK refereed journal at the time) and shortly after the head who had forced me to leave was asked to quietly find a job somewhere else as the University wanted to avoid any scandal over his bullying of other staff (and from my experience, students).
Since then I have been researching in a completely different field (only using computers as tools), and only recently decided to see what had been happening. I in no way dispute that wonders have been achieved using existing computer approaches - but the progress in the specific areas which I was investigating seems comparatively trivial apart from the scale effect of millions of man hours slaving away with ever more powerful hardware.
Amphiox says "First, we don't yet know enough about how the human brain is built to reliably and economically replicate the process". A lot of effort has been put in trying to answer this one. One possibility - which as scientists we should not overlook - is that we are looking in the wrong place. If alternative approaches are crushed at the early blue sky stage because they don't fit in with the existing establishment funding mechanisms, and people who suggest them are ridiculed, what hope is there of ever breaking out of the mould?
Posted by: Amphiox, OM
|
December 31, 2010 8:24 PM
You can take some solace in history. Moulds often do get broken if the need is great enough. If the need never becomes great enough, you could make the argument that the mould did not need to be broken.
Posted by: JennieL
|
December 31, 2010 8:27 PM
Hi John Morales,
Thanks - I know that. You're right that I was assuming random samples. But I wasn't talking about the distribution of the values of variables themselves, I was talking about the distribution of sample means of values of the variable. My understanding is that the distribution of sample means will be approximately Normal given certain conditions (random, independent samples, sufficient number of samples). This will be the case even for variables which don't themselves have random distribution, and doesn't (AFAIK) require that the variables be continuous. It seems to me that this would be true, for example, even with a binomially distributed variable (even though I can't imagine why you'd bother taking a sample mean as opposed to a proportion there).
For example, if you go here, there's a java applet that lets you specify an initial distribution of a variable, then do a sampling distribution of means, medians, variance etc. and allows you to overlay a normal distribution over it.
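The same kind of experiment can be run in a few lines instead of an applet. This is a minimal sketch, assuming an exponential distribution as the skewed starting variable and a sample size of 50; both are arbitrary choices for illustration, as are the function names. It compares the skewness of the raw values with the skewness of the sample means.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation in standard-deviation units."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)

def sample_means(rng, sample_size, n_samples):
    """Means of n_samples independent samples from an exponential variable."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]

rng = random.Random(2)
raw = [rng.expovariate(1.0) for _ in range(5000)]
means = sample_means(rng, sample_size=50, n_samples=5000)

skew_raw = skewness(raw)      # strongly right-skewed
skew_means = skewness(means)  # much closer to 0, as for a normal curve
print(f"skewness of raw values:   {skew_raw:.2f}")
print(f"skewness of sample means: {skew_means:.2f}")
```

The sample means are far more symmetric than the underlying variable, though with 50 observations per sample some skew remains; "approximately Normal" improves as the sample size grows.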
Posted by: The Sailor
|
December 31, 2010 8:51 PM
Happy New Year HertfordshireChris. And everyone else of course.
But I can't help but add, god is dead, science is just fine, thank you.
Posted by: https://www.google.com/accounts/o8/id?id=AItOawnrGiPZRjcJk8qywZaZn0PvaU_1BEyc8J0
|
December 31, 2010 10:03 PM
The article focuses almost exclusively on medical and biological problems where population and statistical significance are major experimental struggles. The author finishes with an audacious effort to indict even non-statistical experimental results by citing the flawed gravity versus depth experiment in a mine. The experiment measured little g versus depth and assumed that the rock density beneath a flat plain was uniform - thus any deviation meant that gravity was changing anomalously with depth. The experimental flaw is obvious but the press hyped the result as showing errors in gravity theory.
As was said about Atlas Shrugged, "this...should not be laid down lightly but flung with great force."
BCW
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
December 31, 2010 10:33 PM
JennieL, The mean will indeed be normally distributed, but the mean will only give you a rough idea of the center of the distribution, and it won't tell you how it's related to other estimates of central tendency--e.g. mode and median.
What is more, the width of the distribution of sample means depends on the population standard deviation--which we do not know.
All this is mainly important if you care about the extremes of the distribution, but that is in fact the case in many disciplines (e.g. engineering) and it probably ought to be the case in others (e.g. medicine)
Posted by: JennieL
|
January 1, 2011 1:26 AM
Hi a_ray_in_dilbert_space,
I get the feeling there's some misunderstanding about what conversation we're having. I was trying to answer the question about why we would expect 1 in 20 samples to be statistically significant at the 0.05 significance level. I used the example of sample means, which I assumed to be roughly normally distributed, for the purposes of explaining this. You wanted to know how I knew the distribution of errors was normal, and admittedly I was just stipulating that for simplicity, but there's a theoretical justification for thinking that a distribution of sample means must be roughly normal.
So, everything you say is true, and I'm aware of it, but I'm a bit puzzled as to why it makes my original explanation faulty. Obviously for purposes of interpretation and analysis you need more information. This doesn't mean that the distribution of sample means is not in fact roughly normal (ceteris paribus) and it doesn't change the fact that if you adopt an alpha of 0.05, you'll get a 'statistically significant' sample mean on average 5% of the time even if there is no effect of whatever it is you're studying.
Posted by: a_ray_in_dilbert_space, OM, A little FUCKING ray of sunshine
|
January 1, 2011 10:09 AM
JennieL,
The problem is that the analysis assumes normality, and while it is true that the sample mean will be distributed normally about the actual mean, if the population is not normal, our sample standard deviations will tend to skew our confidence intervals on the mean. See:
http://mathworld.wolfram.com/SampleVarianceDistribution.html
When we take one-sided tolerance limits to define our confidence intervals appropriate for sample size, these will be skewed by the fact that we don't really know the errors on standard deviation.
If all we care about is the mean, this really only becomes important for very high confidence intervals. However, in many cases, the extremes of the distribution are likely more important than the central behavior. Try taking data from a Cauchy distribution and treating it with the assumption of normality and see what happens to your significance.
http://mathworld.wolfram.com/CauchyDistribution.html
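One facet of the Cauchy suggestion can be sketched in a few lines. The snippet compares how the spread of sample means behaves as the sample size grows, for normal data and for standard Cauchy data. It is an illustration, not a full significance analysis: the sample sizes, replicate count, seed and helper names are arbitrary choices, and the interquartile range is used as the measure of spread because the Cauchy distribution has no finite variance.

```python
import math
import random
from statistics import quantiles


def iqr_of_sample_means(draw, n, replicates=500):
    """Interquartile range of the means of `replicates` samples of size n."""
    means = [sum(draw() for _ in range(n)) / n for _ in range(replicates)]
    q1, _, q3 = quantiles(means, n=4)
    return q3 - q1


rng = random.Random(1)
draws = {
    "normal": lambda: rng.gauss(0.0, 1.0),
    # Standard Cauchy via the inverse-CDF transform of a uniform variate.
    "cauchy": lambda: math.tan(math.pi * (rng.random() - 0.5)),
}

results = {name: {} for name in draws}
for name, draw in draws.items():
    for n in (4, 400):
        results[name][n] = iqr_of_sample_means(draw, n)
        print(f"{name:>6}  n={n:<4} IQR of sample means = {results[name][n]:.3f}")
```

For normal data the spread of the sample means shrinks as n grows; for Cauchy data it does not, because the mean of n standard Cauchy variables is itself standard Cauchy. Averaging more data buys nothing there, so intervals built on the normal assumption are not trustworthy.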
Posted by: dweisman
|
January 1, 2011 1:55 PM
Like all of Lehrer's works, this essay makes too many allowances to gain the attention of the non-scientist reader. I'm left with the impression that he really has very little to actually say.
Posted by: John Phillips, FCD
|
January 1, 2011 7:43 PM
Gus Snarp, it simply means that, with a particular experimental setup and the statistical tools used, they are accepting a result with P=0.05 as being statistically a sure thing, so to speak. However, this also means that there is a 5% chance that the significance of the result is purely by chance. If the designed experimental setup looked for p=0.01 that would mean that only a result of 99% would be accepted as statistically significant. Or conversely, a 1% chance that the statistical significance arose purely by chance. It doesn't matter whether you do one, twenty or a hundred experiments, a result with P=0.05 means that there is a 1 in 20 chance that there is no statistical significance to the result as it is purely due to chance.
Posted by: Sven DiMilo
|
January 2, 2011 12:00 AM
Just off the road; have not read any comments.
Great post, man.
Posted by: unbound
|
January 2, 2011 11:22 PM
I think that the main challenge to science isn't the scientific method, but the massive misunderstanding of what statistical analysis is. Statistics are simply observations (granted, very good ones)...but there are far too many studies that come to conclusions based on (typically) very marginal statistically significant trends. This is a far cry from the historical scientific method where observations that didn't correspond with the theory meant that you needed to reexamine the theory.
If you are showing a trend that 60% of X correlates to a positive indication of the theory, that's nice and work should continue to refine the theory...because a 60% correlation means there is 40% that doesn't correlate well at all. I'm oversimplifying a bit, but this is a big issue. I've read countless studies (especially around diets) where the research stops at this level, and the researchers claim their theory is solid in their papers. The theory should be tweaked to account for the data better...but nothing happens until the next researcher produces another study with the same marginal correlation. Where is the improvement?
Until more researchers become more responsible about their work and their claims, this remains an easy target for the non-scientific groups to attack. It would be nice to blame the groups...but the poor research that relies on these marginal statistical analyses really needs to stop being sold as confirmation of any theories.
Posted by: jamesfromkenya
|
January 3, 2011 11:45 AM
What I find surprising is how this is quickly turned into a science versus religion argument; disdain, condescension and emotional language cloud what clarifications many of the persons posting here are trying to make. Jonah Lehrer makes some interesting points and is off the mark on some, but what these posts show is that many scientists are invested in their work in a way that is more reminiscent of religionists than dispassionate empiricists. PZ's article isn't much of an improvement on Lehrer's: he agrees that the explanations he posits for the 'decline effect' have in fact been made by Lehrer in his article and then takes us down the road of 'psychological manipulation' with some psychological manipulation of his own. I guess that is what gets the posts going.
Posted by: John Phillips, FCD
|
January 3, 2011 2:21 PM
jamesfromkenya, I am curious, where do you see the psychological manipulation by PZ? All I see, apart from where he agrees with some of the bits Lehrer wrote, is him stating that science is hard and simply pointing out how, in his opinion, Lehrer had used weighted language and also disagreeing with the conclusion of the article that science is in trouble. After all, it is scientists themselves, using the scientific method, who eventually correct the errors, biases or even occasional outright fraud of other scientists, not usually the general public or even so called science reporters/writers. And yes, I did deliberately use loaded language with the so called label for science reporters/writers. But in my decades of experience of reading MSM science reports and articles, the majority of MSM science reporters routinely prove that they know as much about science as the average creationist.
Posted by: del
|
January 3, 2011 6:59 PM
JennieL / Dilbert Space
Hi. FWIW I think Jennie's comments correctly address the question at hand: i.e., "what is it about the conventions of significance testing that results in 5% of apparent findings being 'false positives' even absent other problems?"
Non-normally distributed data, for its part, can certainly be added to the "other problems" list, but while Mandelbrot and Taleb have gotten a lot of mileage out of this problem in regards to financial risk modelling in particular I think non-normality is less of a practical problem for many other disciplines, and even where it is a problem, isn't there always someone who comes along and re-runs the initial, say, ordinary least squares analysis with some manner of log or other transform?
Posted by: natselrox
|
January 4, 2011 9:12 AM
Jonah Lehrer has posted a reply to some of the criticisms made here.
Sure, as scientists, we have a few common enemies and should be united in our fight to spread reason, but we shouldn't be so overwhelmed by enthusiasm that we forget the necessity to improve the scientific method from within. As of now, Lehrer's concerns are somewhat valid although I detest the pessimistic tone of his article. Science is not perfect but it's all we got!
Posted by: David Marjanović
|
January 4, 2011 9:54 AM
I don't understand any of that. In physics, and in science in general, this kind of negative result is easier to publish than yet another boring confirmation. "Widely accepted theory wrong!!!1! We need to rethink everything we thought we knew!eleventyone!!" is a good headline. Most journals only accept contributions that are new and newsworthy; PLoS ONE is the great big exception.
Are you trying to imply that psychology isn't being done as science...? :-)
Posted by: Orac
|
January 4, 2011 10:54 PM
Actually, as John Ioannidis has shown, in any individual clinical trial, the odds of a false positive result are actually much higher than 5%, particularly the more scientifically implausible the hypothesis:
http://scienceblogs.com/insolence/2007/09/the_cranks_pile_on_john_ioannidis_work_o.php
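The dependence on prior plausibility can be made concrete with a simplified Bayes'-rule calculation. This is a sketch in the spirit of that argument, not a reproduction of Ioannidis's full model (which also accounts for bias and multiple competing teams). The function name, the 80% power figure and the example priors are assumptions chosen for illustration.

```python
def false_positive_share(prior, power=0.8, alpha=0.05):
    """Share of 'significant' results that are false positives.

    prior: assumed probability that the tested hypothesis is actually true.
    power: probability of a significant result when the hypothesis is true.
    alpha: probability of a significant result when it is false.
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return false_positives / (true_positives + false_positives)


for prior in (0.5, 0.1, 0.01):
    share = false_positive_share(prior)
    print(f"prior plausibility {prior:>4}: {share:.0%} of significant results are false")
```

With a coin-flip prior the share of false positives stays near alpha, but it climbs steeply as the hypothesis becomes less plausible, even though the significance threshold never changes.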
Posted by: frankenstein monster
|
January 6, 2011 5:55 AM
Solution. Take the hypothesis 'truth wears off' and test it as many times as possible. If it is true, then its own truth will wear off, and it becomes false. The problem is thus self-eliminating.
Posted by: bayesrules
|
January 6, 2011 11:21 AM
There's a huge amount of confusion about the meaning of p-values here. Part of it is due to confusing the alpha level of a test, which is set in advance of looking at the data, with the p-value, which is a function of the data.
If I do an alpha level test at 0.05, then I am saying that I will reject the null hypothesis if the p-value I get from the data turns out to be less than or equal to 0.05; In such a case the only probability statement that I can make is that IF the null hypothesis is TRUE, then the probability that I will incorrectly reject the null is 0.05. That is, my Type I error rate is 0.05.
HOWEVER, if I observe p=0.05, then it is NOT the case that there is a probability of 0.05 that I have committed a Type I error. The alpha level refers to the probability over the entire interval p=0.00 to p=0.05, not to its end point. And, in fact, if you draw a picture, getting p=0.05 is more likely than getting a smaller value, and since the alpha level is referring to the entire interval, it must be the case that getting p=0.05 is MORE likely than the alpha level would indicate.
In fact, there is no way to interpret an observed p-value as a probability. It isn't the probability that the results were "due to chance," since whatever result you get was due to chance (your experiment involves chance processes). The probability that you got the result you did "by chance" is 100%! It is not the probability that the null is false; Only a Bayesian analysis can give you that probability, and the p-value is purely a frequentist concept.
All of this is notwithstanding the fact that it is very difficult to devise hypotheses and tests of those hypotheses that actually have an exactly known null hypothesis, because in real life there are always experimental defects that will render the null hypothesis actually false. In real life, I actually don't need any data whatsoever to know that the null hypothesis + experiment I am doing is wrong. The only question is how much data you have to take to show this fact.
The eminent statistician Jim Berger has an excellent page on p-values, that includes a Java applet that allows you to verify that the numerical value of an observed p-value cannot be interpreted as a probability. It can be found here.
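A rough version of that kind of check can be sketched as a simulation. To be clear about the assumptions: this is not Berger's applet, half of the simulated hypotheses are taken to be true nulls, the real effects are given a fixed z-scale size of 2.8 (roughly 80% power at alpha = 0.05), and a two-sided z-test is used. All names and numbers are illustrative, and the answer changes with them.

```python
import math
import random


def null_share_near_threshold(n_sims=200_000, effect_z=2.8,
                              lo=0.04, hi=0.05, seed=2):
    """Among experiments with lo <= p <= hi, the fraction where the null was true."""
    rng = random.Random(seed)
    null_hits = 0
    total_hits = 0
    for _ in range(n_sims):
        # Assumed mix: half of all tested hypotheses are true nulls.
        null_is_true = rng.random() < 0.5
        z = rng.gauss(0.0 if null_is_true else effect_z, 1.0)
        # Two-sided p-value for a z statistic.
        p = math.erfc(abs(z) / math.sqrt(2))
        if lo <= p <= hi:
            total_hits += 1
            null_hits += null_is_true
    return null_hits / total_hits


share = null_share_near_threshold()
print(f"true nulls among results with p between 0.04 and 0.05: {share:.0%}")
```

Under these assumptions the share of true nulls among results that just squeak under 0.05 comes out well above 5%, which is the point: the observed p-value is not the probability that a Type I error was made.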
Posted by: furiousnegro
|
January 20, 2011 12:38 PM
Found on reddit:
As always, PZ Myers shows himself to be a deceivingly articulate angry idiot, and he completely misses the point of the article.
The article uses this example to illustrate one of its central points:
Why would the effect become more difficult to find over time? Why would the effect appear to shrink over time? If the first result was a fluke, then subsequent papers testing the hypothesis should have almost immediately shown little to no effect. This is not regression to the mean.
The explanation suggested by the author is that the system is biased: journals tend to accept papers with positive results, and reject papers with negative results. Ie:
Now, what can we extrapolate from this? The 'hottest' scientific theories will be even more prone to this sort of unintentional bias. Which theories among our most dearly held are nothing more than an elaborate type I error?
Myers finishes with this stunning insight:
Yes. Generally speaking, science works. Any layman could tell you this - we drive cars, talk on cell phones, and undergo organ transplants - it's obvious that science, more or less, does work. I doubt the author of the article would deny this.
But not all science is amenable to this sort of "yea obviously it works" verification.
It's obvious that quantum mechanics is accurate for most intents and purposes because we use lots of technology based on the theory. The endpoint is something so mind blowing ('magic', to quote Arthur C. Clarke) that the theory must be true - quantum mechanics must be true, because I'm typing on a keyboard and communicating with someone on the other side of the world.
It's obvious that the field of transplantation medicine is legit. If you take someone's liver out, they die. But somehow, doctors can take someone's liver out, replace it with someone else's liver, and the person will live. That's a magical endpoint. This is not the kind of science the article is criticizing.
[Let me preface this paragraph by saying that I'm not necessarily knocking any of the following fields.] Now, consider theories that are often not amenable to this sort of verification ("bland" theories). Antidepressants, vaccines, psychology, economics, climatology, parapsychology, etc. These theories, to date, have not yielded magical "no shit the theory is accurate" results (possible exception - smallpox vaccine), and they likely never will. There's nothing magical about reducing the incidence of disease - nutrition and sanitation will do that. Instead, the evidence for these is based on statistical arguments.
It's this bland science that the article is rightfully criticizing.
So Myers is right, who cares if we haven't exactly proven cell theory? It works.
But cell theory is magical. Obviously it works - we can look through a piece of glass and watch little round things flitting around. They can attack each other, grow, shrink, reproduce, etc.
Myers deceptively fails to point out the difference between bland science and magical science. And this distinction is important, because bland science requires faith in the system of peer review whereas magical science does not.
His first implication is that all bland science eventually works itself out into magical science - this isn't true. There may never be a time when we can say, "oh yea, antidepressants obviously work" - that assessment will likely always be based on our faith in the system.
His second implication is that the system of peer review will eventually work out the good bland science from the bad science - this isn't necessarily true either. If that peer review system is flawed (the article shows it may very well be), then many of the things we consider true may in fact be false.
As usual, PZ Myers misses the mark by a mile.
Posted by: Ing: PhD Trollologist
|
January 20, 2011 12:43 PM
OOOOOOOOOOOOH what a surprise Vaccination and climatology!
*hands a magnet bracelet* here's your prize.
Posted by: Gaebolga
|
January 20, 2011 1:11 PM
Well, no, it actually isn't true, at least in the sense that it will be impossible for anyone to prove it true using the scientific method. The best we've got is "it has still failed to be proven untrue."
Every scientific theory we know - every single one - is wrong. The ones we still use are simply far, far less wrong than the ones we've discarded. And the ones we will use in the future, the ones that will come to replace the ones we currently use, will be less wrong still.
And yet still wrong.
And that right there is the core of the "science: it works, bitches!" line of argument you seem to have a problem with. Science isn't in the business of finding truth with a capital T; science finds things that work, rules that can effectively predict how things will turn out in particular instances, and revises its theories when evidence comes along that doesn't fit.
Science can't be broken out arbitrarily into "bland" and "magical" variants; it's all just science, and science works. You may not like it, but that's your problem; science has been happily and effectively working this way for over a hundred years, with spectacular results.
Posted by: https://www.google.com/accounts/o8/id?id=AItOawmgTasB5-I0ORBYS-DEwgqhfW1LpAnv5eY
|
January 20, 2011 5:02 PM
"That's all this fuss is really saying. Sometimes hypotheses are shown to be wrong, and sometimes if the support for the hypothesis is built on weak evidence or a highly derived interpretation of a complex data set, it may take a long time for the correct answer to emerge. So?"
Well no, that isn't even close to all this article is saying. Every scientist, even the pseudo ones, knows that a hypothesis may end up being proven wrong. That's why "proposed explanation for an observable phenomenon" is within the definition of hypothesis - it is only proposed.
This article is saying that the observable phenomena themselves, such as the percentage of female birds attracted to symmetric males, change depending on who is doing the looking. These observations will be used to support the above-mentioned false hypothesis. The false hypothesis, along with accompanying false observations, gets through the peer review process, is published and gets to the point that it becomes widely accepted.
Yes, the scientific process works as long as there are enough people willing to question the original results and attempt replication, but more often than not this does not happen with any rapidity. The process works but it doesn't turn on a dime, so at any given point in time we can expect a number of false hypotheses to be widely accepted for any number of reasons (most of them psychological).
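How a publication filter alone can produce an apparent decline is easy to sketch. The simulation below rests on loud assumptions: a small but real effect of 0.2 standard deviations, studies of 20 subjects each, first reports that are published only when they are positive and 'significant', and replications that are reported whatever they find. None of these numbers come from the article; they are placeholders for illustration.

```python
import random


def simulate_decline(true_effect=0.2, n=20, n_studies=5000, seed=3):
    """Mean effect in 'published' first reports versus unfiltered replications."""
    rng = random.Random(seed)
    # Standard error of a study mean, assuming a known standard deviation of 1.
    standard_error = 1 / n ** 0.5

    def run_study():
        return sum(rng.gauss(true_effect, 1.0) for _ in range(n)) / n

    published = []
    replications = []
    for _ in range(n_studies):
        estimate = run_study()
        # Only positive, 'significant' first findings make it into print.
        if estimate / standard_error > 1.96:
            published.append(estimate)
            # The follow-up is reported regardless of its outcome.
            replications.append(run_study())
    return sum(published) / len(published), sum(replications) / len(replications)


published_mean, replication_mean = simulate_decline()
print(f"mean published effect:   {published_mean:.2f}")
print(f"mean replication effect: {replication_mean:.2f}")
```

In this toy setup the published effects overstate the true effect, and the replications fall back toward it, without the underlying effect changing at all and without anyone misbehaving.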