statistics

A reader sent me a link to *yet another* purported [Bayesian argument for the existence of god][unwin], this time by a physicist named Stephen Unwin. It's actually very similar to Swinburne's argument, which I discussed back at the old home of this blog. The difference is the degree of *dishonesty* demonstrated by the author. As usual, you can only see the entire argument if you [buy his book][buymybook]. But from a number of reviews of the book, and a self-interview posted on his personal website, we can get the gist. Scientific American's [review][sciam] has the best concise description of…
Joe Morgan is a Hall of Fame baseball player and a former member of Cincinnati's Big Red Machine. He is also a commentator for ESPN and a strong opponent of all the newfangled baseball statistics. Anyone who has listened to an ESPN broadcast of Major League Baseball has heard Morgan criticize the Moneyball style of managing baseball teams. There are some interesting parallels between Little Joe's position on baseball statistics and creationists' dissent from science. Ideally, baseball statistics should objectively measure the performance of individual players. Traditional baseball…
Last September, Bruce Lahn and colleagues published a couple of papers on the evolution of two genes responsible for brain development in humans (ASPM and Microcephalin). A group led by Sally Otto published a criticism of the analysis performed by Lahn's group in last week's issue of Science (JP has written a good summary on GNXP). Lahn and colleagues issued an excellent response to that criticism. The original papers on ASPM and Microcephalin argued that the patterns of polymorphism and linkage disequilibrium at the two loci were inconsistent with our current understanding of demographic…
A press release from the UK Green Party says: A survey carried out by the Green Party shows overwhelming opposition to Government nuclear plans Energy Survey shows 87% of public opposed to new nuclear power stations 89% agree 'the Government had already decided what they wanted to do about nuclear power stations before this debate started' 66% will take part in mass protests against nuclear power, if new stations are approved So far so good, except... The data reported covers a survey of 524 people interviewed between 01 February and 18th April 2006. Fieldwork was split between 324 people…
Last night, a reader sent me a link to *yet another* wretched attempt to argue for the existence of God using Bayesian probability. I *really* hate that. Over the years, I've learned to dread Bayesian arguments, because *so many* of them are things like this, where someone cobbles together a pile of nonsense, dressing it up with a gloss of mathematics by using Bayesian methods. Of course, it's always based on nonsense data; but even in the face of a lack of data, you can cobble together a Bayesian argument by *pretending* to analyze things in order to come up with estimates. You know, if you…
Since my amateur discursion into stochasticity appears to have flushed out all of the mathematically savvy, I'm going to pose a real-life statistics question for you. I have some data that are non-normally distributed (in fact, they really don't seem to fit any distribution well--and, yes, I've tried various transformations and the data still don't fit anything). If the data were normally distributed, I would like to perform ANOVA to partition the sources of variance (using percent sums of squares). Since the data aren't normally distributed, ANOVA is not the right test to use. Because I'm…
Dishonest Dembski: the Universal Probability Bound One of the dishonest things that Dembski frequently does that really bugs me is take bogus arguments, and dress them up using mathematical terminology and verbosity to make them look more credible. An example of this is Dembski's *universal probability bound*. Dembski's definition of the UPB from the [ICSID online encyclopedia][upb-icsid] is: >A degree of improbability below which a specified event of that probability >cannot reasonably be attributed to chance regardless of whatever >probabilistic resources from the known universe…
As I've frequently said, statistics is an area which is poorly understood by most people, and as a result, it's an area which is commonly used to mislead people. The thing is, when you're working with statistics, it's easy to find a way of presenting some value computed from the data that will appear to support a predetermined conclusion - regardless of whether the data as a whole supports that conclusion. Politicians and political writers are some of the worst offenders at this. Case in point: over at [powerline][powerline], they're reporting: >UPI reports that Al Gore's movie, An…
I recently got a real prize of a link from one of my readers. He'd enjoyed the [Swinburne][swinburne] article, and had encountered this monstrosity: an alleged [probability of Christianity][prob] argument *significantly worse* than Swinburne. [swinburne]: http://goodmath.blogspot.com/2006/04/mind-numbingly-stupid-math.html "My shredding of Swinburne" [prob]: http://www.biblebelievers.org.au/radio034.htm "Mathematical Probability that Jesus is the Christ" The difference between Swinburne and this bozo (whose name I can't locate on the site) is that at least Swinburne made *some* attempt to use…
Andy Clark has written a review of comparative evolutionary genomics for Trends in Ecology and Evolution. His review deals with identifying functional regions of the genome and inference of both positively and negatively selected sequences. Clark is one of the leaders in the field of evolutionary genetics (and now genomics), actively participating in the analysis of both the human and Drosophila genomes. He also brings a solid understanding of biology, as well as an appreciation of statistical rigor. You can sense his excitement about the union of molecular biology and evolution in the…
The Nature Newsblog is reporting that mathematicians have shown that scoring begets more scoring in ~~soccer~~ ~~football~~ association football. I don't have access to the Nature News article, but it appears that World Cup goals cannot be modeled as Poisson random variables. Wondering why I called it association football? Do you know where the term 'soccer' comes from? Read on below the fold. I had no idea why Americans play 'soccer' and Brits play 'football' until a few months ago when a football-loving Englishman clarified it all for me. Depending on where you live, 'football' can mean very…
A poll of 1,200 undergrads at 100 colleges in the United States found that 73% of the students think iPods are "in". One tenth of all old people know that "in" means "hip". Half of all old people think "hip" means "the thing I just got replaced". Drinking beer and stalking Facebook tied for second most "in" thing -- scoring affirmative amongst "71%" of the students. Sorry, got a little bit too aggressive with the quotes; I promise it won't happen again. Given my infatuation with alcohol, I figured this problem needed to be addressed. By problem, I mean the 29% who don't think beer is the…
I've gotten an absolutely unprecedented number of requests to write about RFK Jr's Rolling Stone article about the 2004 election. RFK Jr's article tries to argue that the 2004 election was stolen. It does a wretched, sloppy, irresponsible job of making the argument. The shame of it is that I happen to believe, based on the information that I've seen, that the 2004 presidential election was stolen. But RFK Jr's argument is just plain bad: a classic case of how you can use bad math to support any argument you care to make. As a result, I think that the article does just about the worst thing…
This paper is rather timely considering I just finished reviewing methods for detecting natural selection. Jonathan Pritchard's group has scanned SNP data from three populations (Europeans, East Asians, and Nigerians) for signatures of positive natural selection. The authors used measures of polymorphism to detect natural selection. In their approach, they polarized polymorphic SNPs as ancestral and derived (kind of like a Fay and Wu test) using the other populations as outgroups. In this type of test, high-frequency derived SNPs are a hallmark of recent positive selection; the authors…
Polymorphism and Divergence This is the eighth of multiple postings I plan to write about detecting natural selection using molecular data (i.e., DNA sequences). The introduction can be found here. The first post described the organization of the genome, and the second described the organization of genes. The third post described codon-based models for detecting selection, and the fourth detailed how relative rates can be used to detect changes in selective pressure. The fifth post dealt with classical population genetics methods for detecting selection using allele and genotype frequencies…
My advisor received an email from a fairly prominent geneticist regarding some results published by Dobzhansky over fifty years ago. The geneticist had done some back-of-the-envelope calculations and noticed some trends that had been overlooked for half a century. We happened to have the animals to replicate the experiments (and I was planning on doing some similar experiments) so my advisor had me perform the crosses. I ended up with a negative result -- I did not see the same trends that Dobzhansky and colleagues observed. I guess you could say my negative result was a positive…
One of the most important developments in evolutionary biology in the past few decades has come without much fanfare outside of a small circle of population geneticists. The early models of population genetics were limited when it came to analyzing the nucleotide sequence polymorphism data that began to appear in the 1980s. New statistical techniques were developed to analyze this data, and they all fell under the umbrella of coalescent theory. If you want to understand the evolution of populations, you're missing a lot if you do not understand the coalescent. When I wrote about the best…
If you like sports (specifically hockey) and you like statistics, two posts from Tom Benjamin's NHL Blog are must-reads (available here and here). With help from Dave Savit, a math professor at the University of Arizona, Tom describes how hockey can be modeled using a Poisson distribution. There are also Poisson Standings for the NHL season. Some have called this Moneyball for hockey. More stuff below the fold. The idea of the model is that goals can be considered Poisson random variables. You can calculate the expected number of goals scored by a team in a single game using the number…
If you have not read it, go check out Nicholas Wade's article on doctored images in scientific publications. This is especially pertinent given the recent Hwang Woo Suk stem cell debacle. There is nothing all that revolutionary, but Wade gives a nice review and introduces us to some of the editors who are trying to catch the cheaters. In commenting on the article, John Hawks brings up a good point regarding Photoshop: "I don't worry too much about Photoshopping illustrations of fossils. Instead I worry about two things. "One is picture selection. It is easy to choose pictures that…