With the annual Office Pool Season now open, lots of people are queueing up to offer advice on how to fill out the bracket sheets for our national foray into illegal gambling:
- Inside Higher Ed offers a bracket based on graduation rates, with a Final Four of Florida, Virginia, Michigan State, and, um, Holy Cross. Perhaps this isn't the way to bet...
- If you'd like statistical backing for your bad picks, Ken Pomeroy is breaking down the regions, predicting the West and Midwest today.
- The New York Times offers two strategies for winning contests: 1) foolishly pick the wrong team, for example, confusing George Mason with George Washington (the "Secretary Effect" strategy), and 2) submit a whole bunch of entries (the "Shotgun" strategy).
None of these are behind my ace picks for this year (though I will end up filling out at least five brackets for different contests), but I am tempted to put together a bracket sheet based on the rankings of the physics Ph.D. programs of the various schools.
If you would like to pit your picking prowess against the brilliant minds at ScienceBlogs, you can enter the ScienceBlogs group at the Yahoo tournament contest. If you're interested, email Dave Munger (dave at wordmunger dot com), and he'll send you the group name and password.
Now that Ken Pomeroy has posted the log5 estimates for all teams, we can do an experiment. Here are the first round matchups, along with the bootstrap probability of the team in the 1st column beating the other:
tm1 -- tm2 -- p
Kentucky -- Villanova -- 0.5009
Purdue -- Arizona -- 0.51
Illinois -- Va. Tech -- 0.51
Arkansas -- USC -- 0.55
Mich. St. -- Marquette -- 0.61
Xavier -- BYU -- 0.62
Boston Coll. -- Texas Tech -- 0.649
Creighton -- Nevada -- 0.67
Georgia Tech -- UNLV -- 0.70
Indiana -- Gonzaga -- 0.7117
Vanderbilt -- GW -- 0.7157
Louisville -- Stanford -- 0.7181
Butler -- Old Dominion -- 0.7239
Notre Dame -- Winthrop -- 0.803
Wash. St. -- Oral Roberts -- 0.8151
S. Illinois -- Holy Cross -- 0.8289
Oregon -- Miami (OH) -- 0.8602
Duke -- VCU -- 0.8621
Maryland -- Davidson -- 0.8642
Tennessee -- LB St -- 0.874
Texas -- New Mex. St. -- 0.8843
Virginia -- Albany -- 0.8961
Pittsburgh -- Wright St. -- 0.8971
Texas A&M -- Pennsylvania -- 0.9507
Wisconsin -- TAMU - CC -- 0.9591
Georgetown -- Belmont -- 0.9659
Memphis -- North Texas -- 0.9803
UCLA -- Weber St. -- 0.9808
Ohio St. -- C. Conn. -- 0.9862
Kansas -- Play-In -- 0.9919
N. Carolina -- E. Kentucky -- 0.9926
Florida -- Jackson State -- 0.9972
Messy. Sorry about that.
I've ordered the games by increasing p. Let's bin the p's into 0.05 chunks. Four teams (Kentucky, Purdue, Illinois, Arkansas) have between a 50% and 55% chance of winning according to these estimates. If all four teams win (or lose), that would be unlikely according to our log5 estimates. How unlikely? The binomial probability of 0 for 4 with p=0.52 (the average of the 4 p's) is about 5%. If all four teams win, there is probably something wrong with the log5 estimates.
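For anyone who wants to check that arithmetic, here is a minimal sketch in Python. It assumes the four games are independent and, as in the comment, treats every game as having the same win probability (the average of the four p's, rounded to 0.52). The helper name `binom_pmf` is my own, not anything from Pomeroy's site.

```python
from math import comb

# Win probabilities for the four teams in the 50-55% bin,
# copied from the table above (Kentucky, Purdue, Illinois, Arkansas).
probs = [0.5009, 0.51, 0.51, 0.55]


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Assumption: one shared probability for all four games,
# the average of the four p's rounded to two decimals.
p_avg = sum(probs) / len(probs)
p_rounded = round(p_avg, 2)

none_win = binom_pmf(0, len(probs), p_rounded)  # 0 for 4
all_win = binom_pmf(len(probs), len(probs), p_rounded)  # 4 for 4

print(f"average p = {p_avg:.4f}, rounded to {p_rounded}")
print(f"P(0 for 4) = {none_win:.3f}")
print(f"P(4 for 4) = {all_win:.3f}")
```

Under those assumptions 0 for 4 comes out at about 5%, matching the figure above, and 4 for 4 at about 7%.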
We can test the rest of the games the same way. (Yay! I'm doing science!) Most interesting are the 9 teams that have between a 95% and 99.7% chance of winning. My bootstrap estimate for all of those teams winning is 82%, and for exactly one of those teams losing, 17%. I estimate a 1.4% chance of two of those teams losing -- a good level of statistical significance.
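Those three numbers can also be checked without resampling. The sketch below is not the bootstrap used above. It is an exact calculation that assumes the nine games are independent and builds the distribution of the number of losses one game at a time. The function name `loss_count_distribution` is hypothetical.

```python
# Win probabilities for the nine teams in the 95%+ bin,
# copied from the table above (Texas A&M through Florida).
probs = [0.9507, 0.9591, 0.9659, 0.9803, 0.9808,
         0.9862, 0.9919, 0.9926, 0.9972]


def loss_count_distribution(win_probs):
    """Return a list where entry k is P(exactly k of the teams lose).

    Assumes the games are independent. Each game either leaves the
    loss count unchanged (favorite wins) or raises it by one.
    """
    dist = [1.0]  # before any games: zero losses with certainty
    for p in win_probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * p            # favorite wins
            new[k + 1] += mass * (1 - p)  # favorite loses
        dist = new
    return dist


dist = loss_count_distribution(probs)
two_or_more = 1 - dist[0] - dist[1]

print(f"P(no losses)       = {dist[0]:.3f}")
print(f"P(exactly 1 loss)  = {dist[1]:.3f}")
print(f"P(exactly 2 losses) = {dist[2]:.3f}")
print(f"P(2 or more losses) = {two_or_more:.3f}")
```

With the table's probabilities this gives roughly 82%, 17%, and 1.4% for zero, one, and two losses, in line with the bootstrap figures. The chance of two or more losses is only slightly higher than the chance of exactly two.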
So, Chad: how about it? I'll post a public denouncement of log5, statsgeekery, and slide rules if two or more of those 9 teams with 95%+ chance to win end up losing in the first round. Can I count on you to take up the other end of this wager?
So, Chad: how about it? I'll post a public denouncement of log5, statsgeekery, and slide rules if two or more of those 9 teams with 95%+ chance to win end up losing in the first round. Can I count on you to take up the other end of this wager?
Which would be what?
Maybe I'm just cranky this morning, but this seems really silly to me. Would two losses by top teams actually convince you of anything? I know that I wouldn't be terribly impressed by the fact that the system was able to predict with confidence that all the #1 and #2 seeds would win, given that #16 seeds are 0-88, and #15 seeds are something like 4-84 (Richmond, Santa Clara, Hampton, Coppin State -- am I missing any?).
A convincing demonstration would need to involve the statistical method predicting outcomes that are at odds with the conventional wisdom (sort of like Pomeroy's insistence that the Pac-10 was overrated, while every talking head in the business was declaring it the Best Conference Ever earlier this season). Predicting that the high seeds will win is not terribly interesting.
Two upset wins would persuade me that the system was useless. The probability of that happening by chance alone is 1%. That is the significance level I use in my work; why wouldn't I use the same level for things of less importance?
You're right that it wouldn't be a fair bet. As you point out, I can only lose, not win. The way to do this fairly is to have a baseline, and measure the success rate against that. I just couldn't think of a baseline at 2 in the morning, and I still can't think of one now.
I guess what I'm looking for is a set of potential results that would show you the value of these estimates. Since I don't know anything about college basketball (I can't even name three players, and can only name a few teams because I had to type them up above), I can't tell you what, if any, estimates run counter to conventional wisdom. Do you see any matchup estimates above that run counter to conventional wisdom?