statistics

The Real Flaw with Scratch Lottery Cards

mikethemadbiologist | February 6, 2011

Wired has a fascinating article about a statistician who figured out how to beat the odds on the scratch-off lottery tickets--that is, pick cards that are more likely to produce winning combinations. And by "more likely", I mean getting it right up to 95 percent of the time. But the article mentions only in passing the real problem with lotteries: While approximately half of Americans buy at least one lottery ticket at some point, the vast majority of tickets are purchased by about 20 percent of the population. These high-frequency players tend to be poor and uneducated, which is why critics…

On Educational Metrics and Special Sandwiches

mikethemadbiologist | January 27, 2011

Over at Open Left, jeffbinnc pithily summarizes all of the metrics of which educational 'reformers' are fond: Then, to illustrate just how the focus on more and better tests is going to be raised to the levels of panacea, the CAP rolled out a new report last week that based just about everything on the notion that test scores are the be-all and the end-all of education attainment in our country. As I noted in a Quick Hit here on Open Left, CAP's analysis of school performance "relied on the results of 2008 state reading and math assessments in fourth grade, eighth grade, and high school"…

I Hope Obama Doesn't Mention 'Educational Efficiency' in the State of the Union: On Misusing the Residual

mikethemadbiologist | January 24, 2011

At this point in Obama's term, I'm simply hoping that the things I care about, like Social Security and education, aren't mentioned by Obama in his State of the Union address. I've given up on thinking he'll actually institute good policy (lost hope, if you will), and am just attempting a holding action. This is the administration that told Democrats they had to choose between school funding and feeding hungry children, and also has imposed Education Secretary Arne Duncan's failed ideas on the entire country, so I'm not optimistic. Matthew Yglesias draws our attention to a Center for…

A Critical Cause of the Decline Effect: When Weak Effects Meet Small Sample Size

mikethemadbiologist | January 6, 2011

A couple of weeks ago, Jonah Lehrer wrote about the Decline Effect, where the support for a scientific claim often tends to decrease or even disappear over time (ZOMG! TEH SCIENTISMZ R FALSE!). There's been a lot of discussion explaining why we see this effect and how TEH SCIENTISMZ are doing ok: PZ Myers gives a good overview, and David Gorski and Steven Novella also address this topic. Monday, a long-time reader asked me about the article, and I raised another issue that hadn't--or so I thought--been broached: There's also a sample size issue: if the effect should be weak and sample…

Benford's Law of Amazon Rankings

drorzel | December 31, 2010

Late last year, Matthew Beckler was nice enough to make a sales rank tracker for How to Teach Physics to Your Dog. Changes in the Amazon page format made it stop working a while ago, though, and now Amazon reports roughly equivalent data via its AuthorCentral feature, with the added bonus of BookScan sales figures. So I've got a new source for my book sales related cat-vacuuming. Still, there's this great big data file sitting there with thousands of hourly sales rank numbers, and I thought to myself "I ought to be able to do something else amusing with this..." And then Corky at the Virtuosi…
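For readers who want to try this kind of check themselves, here is a minimal sketch of a Benford's Law comparison. Benford's Law predicts that a leading digit d shows up with frequency log10(1 + 1/d). The function names and the `sample_ranks` values are made up for illustration; they are not the actual sales rank data and this is not necessarily how the analysis in the post was done.

```python
import math

def benford_expected(digit: int) -> float:
    """Expected share of values whose leading digit is `digit` under Benford's Law."""
    return math.log10(1 + 1 / digit)

def leading_digit(n: int) -> int:
    """First decimal digit of a nonzero integer (sales ranks are always positive)."""
    return int(str(abs(n))[0])

def leading_digit_shares(values):
    """Observed share of each leading digit 1-9 in a list of positive integers."""
    counts = {d: 0 for d in range(1, 10)}
    for v in values:
        counts[leading_digit(v)] += 1
    total = len(values)
    return {d: c / total for d, c in counts.items()}

# Hypothetical sales ranks, invented for this example only.
sample_ranks = [1234, 187, 2950, 1102, 15873, 942, 1311, 3420, 176, 2088]

observed = leading_digit_shares(sample_ranks)
for d in range(1, 10):
    print(d, round(benford_expected(d), 3), observed[d])
```

With only ten made-up values the observed shares are far too noisy to mean anything; a file with thousands of hourly ranks would give a fairer comparison.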

A Dead Salmon: Bestest Control Experiment EVAH!

mikethemadbiologist | December 14, 2010

When analyzing data, understanding the limitations of your data is critical. One of the things we need to understand is significance: how strong does an effect have to be to be considered not a result of random chance. Typically, we assume that if an effect has a five percent or less probability of occurring due to random chance, then it is "significant." But significance becomes very problematic when making many simultaneous assessments. If we make one hundred assessments (e.g., comparisons) and not a single one is actually different (assume that the Omniscience of the Mad Biologist is…
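The arithmetic behind the hundred-assessments scenario can be sketched in a few lines. This assumes the tests are independent and that no real differences exist; the function names are hypothetical, and the Bonferroni threshold is included only as one standard correction, not necessarily the one the post goes on to discuss.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Average number of 'significant' results when every null hypothesis is true."""
    return n_tests * alpha

def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Chance of one or more false positives across independent tests."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test cutoff that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests

n_tests, alpha = 100, 0.05
print(expected_false_positives(n_tests, alpha))                     # about 5
print(round(prob_at_least_one_false_positive(n_tests, alpha), 3))   # 0.994
print(bonferroni_threshold(n_tests, alpha))                         # about 0.0005
```

In other words, with one hundred comparisons at the five percent level, finding at least one spurious "significant" result is close to a certainty.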

Robert Samuelson and Why The Washington Post Deserves to Go Broke

mikethemadbiologist | September 21, 2010

Robert Samuelson has a penchant for willingly misinterpreting data. Time was, the newspaper bidness considered that to be a bad thing. Given his track record on Social Security, which led me to create the Samuelson Unit, it should be no surprise whatsoever that Samuelson screws up educational data. Bob Somerby, rightfully offended by Samuelson's false claim that students have made no educational gains over the last forty years, asks, "Does Robert Samuelson hate black kids? It's always possible he doesn't--but he certainly seems to enjoy misstating their academic gains." Somerby: Among 17-…

"I Need to Know How Many Patients I Need": This Is My Scientific (and Statistical) Life

mikethemadbiologist | September 19, 2010

Not to pick on anyone, but this is what I go through with far too many projects: "My grant is due tomorrow. I just need to know...." And: "I can put you on my grant for a 0.5% FTE..."

Simple Answers to Complex Medical Questions

drorzel | September 1, 2010

There's a new medical study of the effects of alcohol consumption that finds a surprising result: Controlling only for age and gender, compared to moderate drinkers, abstainers had a more than 2 times increased mortality risk, heavy drinkers had 70% increased risk, and light drinkers had 23% increased risk. A model controlling for former problem drinking status, existing health problems, and key sociodemographic and social-behavioral factors, as well as for age and gender, substantially reduced the mortality effect for abstainers compared to moderate drinkers. However, even after adjusting…

Some More Thoughts About Value-Added Teacher Evaluation

mikethemadbiologist | August 19, 2010

Tuesday, I criticized the LA Times' use of the 'value-added' approach for teacher evaluation. There were many good comments, which I'll get to tomorrow, but Jason Felch of the LA Times pointed me to the paper describing the methodology. I'm not happy with the method used. First, I was right to have concerns about the linearity of test scores. Consider the mean score for each quartile: highest = 852; second highest = 768; third highest = 730; fourth highest = 682. What this means is that an increase from 40th percentile to 50th is not the same as an increase from 50th to 60th. Now, as far…
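The quartile means quoted above make the non-linearity easy to see: if percentile rank tracked score linearly, the gaps between adjacent quartile means would be roughly equal. A quick check, using only the four numbers from the excerpt (the variable names are mine):

```python
# Mean test score for each quartile, as quoted in the excerpt.
quartile_means = {
    "highest": 852,
    "second highest": 768,
    "third highest": 730,
    "fourth highest": 682,
}

means = list(quartile_means.values())

# Score gap between each quartile and the one directly below it.
gaps = [upper - lower for upper, lower in zip(means, means[1:])]
print(gaps)  # [84, 38, 48]
```

The top gap is about twice the size of the other two, so a fixed jump in percentile rank corresponds to very different score changes depending on where a student starts.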

Some Concerns About the LA Times' Attempt at Teacher Evaluation

mikethemadbiologist | August 17, 2010

The LA Times has taken upon itself to rate school teachers in Los Angeles. To do this, the LA Times has adopted the 'value-added' approach (italics mine): Value-added analysis offers a rigorous approach. In essence, a student's past performance on tests is used to project his or her future results. The difference between the prediction and the student's actual performance after a year is the "value" that the teacher added or subtracted. For example, if a third-grade student ranked in the 60th percentile among all district third-graders, he would be expected to rank similarly in fourth grade…
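To make the quoted description concrete, here is a bare-bones sketch of the idea: value added is the actual result minus the predicted one. The percentile figures below are invented, and averaging across a teacher's students is my simplifying assumption, not the LA Times' actual model, which is more involved.

```python
from statistics import mean

def value_added(predicted: float, actual: float) -> float:
    """Gap between a student's actual and predicted percentile rank."""
    return actual - predicted

def teacher_value_added(students) -> float:
    """Average gap across a teacher's students (a simplifying assumption)."""
    return mean(value_added(predicted, actual) for predicted, actual in students)

# Hypothetical (predicted, actual) percentile ranks for one classroom.
students = [(60, 65), (40, 38), (75, 75)]

print([value_added(p, a) for p, a in students])  # [5, -2, 0]
print(teacher_value_added(students))             # 1
```

Everything then hinges on how good the prediction is: any error in the predicted rank lands directly in the "value" attributed to the teacher.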

The Statistics of Tea Partying and Interracial Trust, Part I

mikethemadbiologist | August 12, 2010

By way of Digby, I came across this poll of white attitudes towards various ethnicities (including whites) based on self-identified Tea Party support (note: respondents were only from four states, NV, MO, GA, and NC). One of the things that struck me while looking at the data (pdf and pdf) was the extent to which whites who identified strongly with the Tea Party didn't trust other whites. 72% of skeptics of the Tea Party thought other whites were trustworthy, while Tea Partiers thought only 49% of whites were trustworthy (p = 0.0022). The other significant result, and which is puzzling is…

Some Thoughts About the Statistics of the Human Microbiome

mikethemadbiologist | August 10, 2010

Reporting on the human microbiome--the microorganisms that live on and in us--is quite the rage these days. As someone who is involved in NIH's Human Microbiome Project, it's a pretty exciting time because the size and scale of the data we're able to generate is unprecedented. This also means we have to figure out how to not only generate, but also analyze these data. One kind of data we generate is 16S rRNA sequences, which are found in all bacteria and can be used as a 'barcode' to identify and quantify the bacteria in a community without having to culture each species. A…

85% of Statistics are False or Misleading

gboustead | August 5, 2010

Numbers don't lie, but they tell a lot of half-truths. We have been raised to think that numbers represent absolute fact, that in a math class there is one and only one correct answer. But less emphasis is put on the fact that in the real world numbers don't convey any information without units, or some other frame of reference. The blurring of the line between the number and the quantity has left us vulnerable to the ways in which statistics can deceive us. When numbers are poorly or incorrectly defined, contemporary audiences can be manipulated into thinking opinions are fact. Charles…

C Is for Cookie, but Is It Good Enough for Delayed Gratification and Linear Regression?

mikethemadbiologist | August 4, 2010

Melody Dye at Child's Play has an interesting post about the famous (or infamous) cookie experiments, which involved observing children presented with a cookie and then left alone in a room. If they wait long enough, they get another cookie (and they know this). If not, then clearly they are doomed to fail in life: Twenty years later, having spent long hours forgetting those misplaced moments, a new crop of experimental psychologists will add insult to injury, and call you, and ask pointed questions about your education, your…

Doubts About the STAR Study: How Much Is Kindergarten Really Explaining?

mikethemadbiologist | July 29, 2010

Since I've been writing a lot about education, I have some brief thoughts about the NY Times report by David Leonhardt about some findings from Tennessee's Project STAR, which tracked the long-term outcomes of a randomized trial of kindergartners (slides from a presentation are available as a pdf): Just as in other studies, the Tennessee experiment found that some teachers were able to help students learn vastly more than other teachers. And just as in other studies, the effect largely disappeared by junior high, based on test scores. Yet when Mr. Chetty and his colleagues took another…

Poverty and Science Education in Massachusetts

mikethemadbiologist | July 29, 2010

Yesterday, I described the relationship between low income and poor performance in English and math in Massachusetts (see the post for methodological details). Well, I've saved the worst for last--science education: Just to remind everyone, the horizontal axis is the percentage of children in a school who qualify for free lunch, and the vertical axis is the percentage of children who, according to their MCAS scores, are either classified as "Needs Improvement" or "Warning/Failing" in science. The R²--how much of the school to school variation is accounted for by variation in school lunch…

Poverty and Learning in Massachusetts

mikethemadbiologist | July 28, 2010

I've described before how there is a significant correlation between poverty and educational performance when we use state-level data. But as I pointed out, one of the interesting things is that the residual--the difference between the expected scores for a given state and the actual scores--can be quite large for some states (e.g., Massachusetts does much better than expected, Arkansas much worse). We can learn a lot from these differences (i.e., what does MA do differently from Arkansas). But if we look at only one state, can we determine what the effect of poverty is? To do this, I've…

Lies, Damn Lies, and Educational Statistics: Can We Even Count Students?

mikethemadbiologist | May 6, 2010

Forget about measuring student outcomes. Can we even measure student numbers? A couple of weeks ago, I started pulling data from the NY Times website that displays the citywide testing scores (I was interested in exploring the relationship between poverty and test scores at a finer resolution than I had previously). Here's the problem: the No Child Left Behind (NCLB) numbers--the ones the federal government uses--and the state numbers don't agree. I'm not referring to educational outcomes: they don't even have the same number of students. Let's look at New York City. The NCLB numbers*…

Using Statistics to Create The Ultimate TEDTalk

grrlscientist | May 2, 2010

In a brilliantly tongue-in-cheek analysis, Sebastian Wernicke turns the tools of statistical analysis on TEDTalks, to come up with a metric for creating "the optimum TEDTalk" based on user ratings. How do you rate it? "Jaw-dropping"? "Unconvincing"? Or just plain "Funny"? After making a splash in the field of bioinformatics, Sebastian Wernicke moved on to the corporate sphere where he motivates and manages multidimensional projects. You can get your copy of…
