Mixing Memory

An entrée of Cognitive Science with an occasional side of whatever the hell else I want to talk about.

The Basics of Statistics V: A Quick Example

Category: Statistics
Posted on: July 12, 2007 4:29 PM, by Chris

So the last post was pretty dense, and I haven't used an example since the first post, so I thought I'd throw one out there that you can play with. In what follows, I pretend to use the equations, but I'm actually doing all this in Excel. If you've got Excel, here are some helpful functions. AVERAGE gives you the mean of a range of numbers, VAR gives you the variance, and STDEV gives you the standard deviation. Note that VAR and STDEV give you the variance and standard deviation for a sample (i.e., using n-1 instead of n). If you want population variance and standard deviation, use VARP and STDEVP. Also, since I'm often interested in getting the standard error quickly, I usually use this formula to get it in Excel: STDEV(range)/SQRT(COUNT(range)). Or you can just do the math by hand if you enjoy that sort of thing. On to the problem.
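If you don't have Excel, Python's standard `statistics` module has rough equivalents of the same functions. This is only a minimal sketch: the list `values` is made up to stand in for a spreadsheet range (it is not data from this post), and the comments name the Excel function each call is meant to mirror.

```python
import math
import statistics

# Made-up numbers standing in for an Excel range; not data from this post.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)            # like AVERAGE(range)
sample_var = statistics.variance(values)  # like VAR(range): divides by n-1
sample_sd = statistics.stdev(values)      # like STDEV(range)
pop_var = statistics.pvariance(values)    # like VARP(range): divides by n
pop_sd = statistics.pstdev(values)        # like STDEVP(range)

# like STDEV(range)/SQRT(COUNT(range))
std_error = sample_sd / math.sqrt(len(values))

print("mean:", mean)
print("sample variance / sd:", sample_var, sample_sd)
print("population variance / sd:", pop_var, pop_sd)
print("standard error:", std_error)
```

The sample versions always come out a little larger than the population versions, because they divide by n-1 rather than n.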

Imagine we're given a problem: what is the average height of male liberal bloggers in Washington DC? We could go out and survey every male liberal blogger in Washington DC, but since it's DC, there are about 3 gazillion liberal bloggers there, and we don't have the time or money to survey them all. So instead, we randomly select 30 male liberal bloggers from DC and get their heights. Here are their heights in inches (sorry metric people, but DC is in the States):

62
62
76
68
59
62
60
74
63
60
63
68
74
61
69
60
66
57
66
72
74
64
71
47
66
66
71
60
60
77

First we compute the mean with ΣX/n. ΣX = 1958, n = 30, and 1958/30 = 65.27. So x̄ = 65.27 inches.

Then we compute the variance, using Σ(X - x̄)²/(n-1). Remember, we're using n-1 because we're trying to estimate the variance of the population using a sample (see post III). Using that equation, we get a variance of s² = 43.65. Take the square root of that, and we get the standard deviation, s = 6.61.
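If you'd like to check the arithmetic without a spreadsheet, here is a small Python sketch of the same two steps. The variable names are my own choices, and the `statistics.variance` call is only there as a cross-check on the hand-rolled n-1 formula.

```python
import statistics

# Heights in inches of the 30 sampled bloggers, copied from the list above.
heights = [62, 62, 76, 68, 59, 62, 60, 74, 63, 60,
           63, 68, 74, 61, 69, 60, 66, 57, 66, 72,
           74, 64, 71, 47, 66, 66, 71, 60, 60, 77]

n = len(heights)
total = sum(heights)
mean = total / n

# Sample variance: sum of squared deviations from the mean, over n-1.
squared_devs = [(x - mean) ** 2 for x in heights]
variance = sum(squared_devs) / (n - 1)
sd = variance ** 0.5

# Cross-check against the standard library's sample variance.
assert abs(variance - statistics.variance(heights)) < 1e-9

print(f"n = {n}, sum = {total}")
print(f"mean = {mean:.2f}")
print(f"variance = {variance:.2f}")
print(f"standard deviation = {sd:.2f}")
```

Rounded to two decimals, this should print the same mean, variance, and standard deviation as the hand calculation.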

Next we need to compute some measure of our confidence that our sample mean represents our population mean. Obviously, it's unlikely that with one sample we'd get the exact mean of the population, so we need to decide how confident we want to be about our mean, and then find a range within which we can be that confident that the population mean occurs. Let's choose 95% confidence, and since we don't know the population's standard deviation and are therefore using the t-distribution, look up the critical t-value (see post IV) for 95% confidence in a t-table. We have 30 observations, so our degrees of freedom is equal to 30 - 1, or 29. In the table, scroll down to 29, then over to .025 (not .05, see post IV). There you'll find that our critical t-value is 2.045.

One last step before we can get confidence intervals: we have to get the standard error. Using the sample's standard deviation as an estimate of the population's standard deviation, we compute the standard error (s_x̄) with:

s_x̄ = s/√n

Plugging in our numbers, we get 6.61/√30, which gives us a standard error of 1.21. Now we can use our confidence interval equation from post IV, which will give us:

65.27 - (2.045 x 1.21) ≤ μ ≤ 65.27 + (2.045 x 1.21)

Doing the math, we get this for our 95% confidence interval: 62.80 ≤ μ ≤ 67.74.

We report back that according to our sample, the average height of a male liberal blogger in Washington DC is 65.27 inches, and that we're 95% confident that the average height is between 62.80 and 67.74 inches.
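The standard error and confidence interval steps can be sketched in Python too. This is a sketch under a couple of assumptions: the constant `T_CRITICAL` is hardcoded as 2.045, which is what a standard t-table lists for 29 degrees of freedom with .025 in each tail, and the variable names are mine. If you have SciPy installed, `scipy.stats.t.ppf(0.975, 29)` should give you the same critical value without a table.

```python
import math
import statistics

# Heights in inches of the 30 sampled bloggers, copied from the list above.
heights = [62, 62, 76, 68, 59, 62, 60, 74, 63, 60,
           63, 68, 74, 61, 69, 60, 66, 57, 66, 72,
           74, 64, 71, 47, 66, 66, 71, 60, 60, 77]

n = len(heights)
mean = statistics.mean(heights)
sd = statistics.stdev(heights)  # sample standard deviation (n-1)

# Critical t for 95% confidence with n - 1 = 29 degrees of freedom,
# read from a t-table (.025 column). Hardcoded, not computed here.
T_CRITICAL = 2.045

std_error = sd / math.sqrt(n)
margin = T_CRITICAL * std_error
lower = mean - margin
upper = mean + margin

print(f"standard error = {std_error:.2f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Because the script carries full precision instead of rounding the standard error to 1.21 first, the last digit of the interval can differ slightly from a hand calculation.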

Comments

1

This is fairly off topic, but I am having trouble running a Tukey test, which might be because it's the wrong test to run for what I need.

Basically I have run an ANOVA and I need to check for variance between groups.

I have four variables, the first has two factors, the other three have 3 factors. Which gives 54 rows of data. There is one trial run and nine replicates, so essentially 10 trials all together.

For whatever reason, SPSS won't let me run an ANOVA with a Tukey test on this data. So I downloaded PHStat2 for Excel, which ran the test fine. I looked up the Q value, entered it, and all seemed fine. Except PHStat ran a Tukey on columns and not rows. So I transposed the columns, but now I don't know how to calculate the Q value for more than 10 groups.

Is Tukey the right test? If so, how do I look up the Q value for my data?

Posted by: Webs | August 5, 2007 4:40 PM
