Good Math, Bad Math

Basics: Normal Distributions

In general, when we gather data, we expect to see a particular pattern in
the data, called a normal distribution. A normal distribution is one
where the data is evenly distributed around the mean in a very regular way,
which when plotted as a histogram results in a bell curve. There are a lot of
ways of defining “normal distribution” formally, but the simple intuitive idea of it
is that in a normal distribution, things tend towards the mean – the closer a
value is to the mean, the more you’ll see it; and the number of values on
either side of the mean at any particular distance is equal.

If you plot a set of data with a normal distribution
on a graph, you get something that looks like a bell, with the hump of the
bell positioned at the mean.

[Figure: bell curve]

For example, here’s a graph that I generated using random numbers. I
generated 1 million random integers between 1 and 10; divided them into groups
of ten; and then took the sum of each group of 10. The height of the graph at
each x coordinate is the number of times the sum was that
number. The mean came out to approximately 55, the mode was 55, and the median
was 55 – which is what you’d hope for in a normal distribution. The number of
times that 55 occurred was 432,000. 54 came up 427,000 times; 56 came up
429,000. 45 came up 245,000 times; 35 came up 38,000 times; and so on. The
closer a value is to the mean, the more often it occurs in the population;
the farther it is from the mean, the less often it occurs.
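
Here is a minimal Python sketch of that kind of experiment, assuming the random numbers are integers from 1 to 10 inclusive. It is not the program that produced the graph: the function name `sum_histogram`, the fixed seed, and the choice of 100,000 groups are all assumptions made so the sketch runs quickly, so the raw counts will be smaller than the ones quoted above.

```python
import random
from collections import Counter
from statistics import mean, median


def sum_histogram(num_groups, group_size=10, low=1, high=10, seed=0):
    """Sum `group_size` random integers in [low, high], `num_groups` times.

    Returns the list of sums and a Counter mapping each sum to how often it occurred.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    sums = [
        sum(rng.randint(low, high) for _ in range(group_size))
        for _ in range(num_groups)
    ]
    return sums, Counter(sums)


sums, counts = sum_histogram(100_000)

print("mean:  ", mean(sums))
print("median:", median(sums))
print("mode:  ", counts.most_common(1)[0][0])

# Counts fall off as we move away from the mean of 55.
for value in (35, 45, 54, 55, 56, 65, 75):
    print(value, counts[value])
```

The counts for 45 and 65, or for 35 and 75, should come out roughly equal, which is the "same number of values at the same distance on either side" property described earlier.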

In a perfectly normal distribution, you’ll get a perfectly smooth bell
curve. In the real world, we don’t see perfect normal distributions, but most
of the time in things like surveys, we expect to see something close. Of
course, that’s also the key to how a lot of statistical misrepresentation is
created – people exploit the expectation that there’ll be a normal
distribution, and either don’t mention, or don’t even check, whether the
distribution is normal. If it is not normal, then many of
the conclusions that you might want to draw don’t make sense.

For example, the salary example from the mean, median, and mode post relies on exactly this. The median is so different from the mean because the distribution is severely skewed away from a normal distribution. (Remember, in a proper normal distribution, the number of values at the same distance on either side of the mean should be equal. But in this example, the mean was 200,000; if you went plus 100,000, you’d get one value; if you went minus 100,000, you’d also get one – which looks good. But if you went plus 200,000, you’d still get just one; if you went minus 200,000, you’d get twenty-one values!) But it’s a common rhetorical trick to take a very abnormal distribution, not mention that it’s abnormal, and quote something about the mean in order to support an argument.
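
To make the symmetry check concrete, here is a small Python sketch. The salary list is hypothetical, invented for this example; it is not the data from the earlier post. The helper `counts_around_mean` is also just a name chosen for the sketch.

```python
from statistics import mean, median

# Hypothetical salaries, invented for this sketch (NOT the data from the
# earlier post): twenty modest salaries, one manager, one executive.
salaries = [30_000] * 20 + [60_000, 500_000]


def counts_around_mean(values, distance):
    """Count values within `distance` below the mean and within `distance` above it."""
    m = mean(values)
    below = sum(1 for v in values if m - distance <= v < m)
    above = sum(1 for v in values if m < v <= m + distance)
    return below, above


print("mean:  ", round(mean(salaries)))
print("median:", median(salaries))

# In a roughly normal data set the two counts would be close to each other.
for distance in (25_000, 50_000, 500_000):
    below, above = counts_around_mean(salaries, distance)
    print(f"within {distance:>7,} of the mean: {below} below, {above} above")
```

A lopsided below/above count like this is a quick warning that quoting the mean alone will mislead.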

For example, the last round of tax cuts put through by the Bush administration was very strongly biased towards wealthy people. But during the last presidential election, in speech after speech, ad after ad, we heard about how much the average American taxpayer saved as a result of the tax cuts. In fact, most people didn’t get much; a fair number of people saw an effective increase because of the AMT; and a small number of people got huge cuts.

[Figure: bimodal distribution]

For a different example, the high school that I went to in New Jersey was considered one of the best schools in the state for math. But the vast majority of the math teachers there were just horrible – the school had three or four really great teachers, and a dozen jackasses who should never have been allowed in front of a classroom. But the top performing math students in the school did so well that we significantly raised the mean for the school, making it look as though the typical student in the school was good at math. In fact, if you looked at a graph of the distribution of scores, what you would see would be what’s called a bimodal distribution: there would be two bells side by side – a narrow bell toward the high end of the scores (corresponding to the scores of that small group of students with the great teachers), and a shorter wide bell well to its left, representing the rest of the students.
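
A bimodal distribution like that is easy to simulate. The sketch below mixes two made-up groups of scores; every number in it (group sizes, centers, spreads, the seed) is a hypothetical chosen for illustration, not data from any real school.

```python
import random
from collections import Counter
from statistics import mean, median

rng = random.Random(7)  # fixed seed (an assumption) so the sketch is repeatable


def clamp(score):
    """Keep a simulated score inside the 0-100 range."""
    return max(0.0, min(100.0, score))


# Hypothetical numbers, invented for illustration only.
top_group = [clamp(rng.gauss(92, 2)) for _ in range(40)]   # small group, narrow bell
rest = [clamp(rng.gauss(55, 15)) for _ in range(160)]      # everyone else, wide bell
scores = top_group + rest

print(f"school mean:   {mean(scores):.1f}")
print(f"school median: {median(scores):.1f}")
print(f"mean of the 160 'rest' students: {mean(rest):.1f}")

# Crude text histogram in 5-point bins (a score of exactly 100 goes in the last bin).
bins = Counter(min(int(s // 5) * 5, 95) for s in scores)
for start in range(0, 100, 5):
    print(f"{start:3d}-{start + 4:<3d} {'#' * bins[start]}")
```

The school-wide mean lands several points above the mean of the larger group, even though only a fifth of the simulated students are in the high-scoring hump.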

Comments

  1. #1 bwv
    January 15, 2007

    Income is one of those quantities that follows a power law distribution (i.e. if x% of the population makes $D per year then 1/n of x% will make n*D), and the normal distribution is a poor model for it. Using a normal distribution to describe stock market returns is also problematic, as the tails tend to follow a power law (and therefore have theoretically infinite variance, so the distribution never quite converges to normal à la the central limit theorem). The 1987 crash was a -27 standard deviation event, which under a normal distribution should occur with a probability of only about 7.4 × 10^-161 – so rare that if you invested for the age of the universe you would still not expect to see it.
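
The size of that tail probability can be checked with a few lines of Python. This is a sketch under stated assumptions: it treats daily returns as exactly standard normal, and the figures of 252 trading days per year and 13.8 billion years for the age of the universe are round numbers supplied here, not taken from the comment.

```python
import math


def normal_lower_tail(z):
    """P(Z <= z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


p = normal_lower_tail(-27.0)
print(f"P(Z <= -27) = {p:.3e}")

# Assumed round figures: 252 trading days a year, universe about 13.8e9 years old.
trading_days = 252 * 13.8e9
print(f"expected number of such days over that span: {p * trading_days:.3e}")
```

Using `math.erfc` rather than `1 - cdf` matters here: subtracting from 1 would lose the answer entirely to floating-point rounding.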

    In regard to the Bush tax cuts they were a boon to the middle class and the very wealthy saw little benefit (except for the reduction in cap gains rates) because they tend to all be in AMT.

  2. #2 Rich
    January 15, 2007

    Reminds me of a binomial (I think) distribution random number generator that someone told me about years ago.

    If you have a uniform distribution random number generator that yields numbers between 0.0 and 1.0 (from ‘(double)rand()/RAND_MAX’ say), add 12 uniform random numbers together and subtract 6.0.

    The result will be a fast, pretty good, simulation of a normal distribution with a mean of 1.0 and a standard deviation of 1. (for Monte Carlo simulations, etc).

    Excuse me if my terminology is weak: I’m an armchair mathematician at best who’s picked up a couple of tricks over the years.

  3. #3 Ahcuah
    January 15, 2007

    Let me recommend Benoit Mandelbrot’s The (Mis)Behavior of Markets. He pretty convincingly demonstrates that markets do not follow a normal distribution. Instead, they follow something more like 1/(1+x^2), which just doesn’t tail off appropriately. [Yes, that's the same Mandelbrot who invented fractals.]

    That means that events that, if we used a normal distribution, we would think would never occur, instead happen with surprising frequency. Certain stock market crashes fall into that category.

    [I also wonder about certain weather events. How many 100-year floods did we have in a given area last century?]

  4. #4 Rich
    January 15, 2007

    Remembered wrong. I should know better: the algorithm above gives a mean of 0.0 and a standard deviation of 1.0.

    Look up first, then open mouth.
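
The sum-of-twelve trick works because a uniform number on [0, 1) has mean 1/2 and variance 1/12, so twelve of them add up to mean 6 and variance 1. A minimal Python sketch (the function name, seed, and sample size are choices made here, not part of the original comment):

```python
import random
from statistics import mean, stdev


def approx_standard_normal(rng):
    """Approximate a standard normal draw as (sum of 12 uniforms) - 6."""
    return sum(rng.random() for _ in range(12)) - 6.0


rng = random.Random(42)  # fixed seed (an assumption) for repeatability
samples = [approx_standard_normal(rng) for _ in range(100_000)]

print(f"mean:  {mean(samples):+.4f}")
print(f"stdev: {stdev(samples):.4f}")
print(f"min:   {min(samples):.3f}")
print(f"max:   {max(samples):.3f}")
```

One limitation worth knowing: the result can never fall outside the range -6 to +6, so the extreme tails of a true normal distribution are cut off.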

  5. #5 bwv
    January 15, 2007

    Ahcuah:

    Yes, the Mandelbrot book is excellent, if perhaps marred by some overzealous iconoclasm. I am not sure how useful his multi-fractal models are in pricing derivatives or other applications where lognormal distributions are used. Levy Stable Distributions are a good compromise as they can handle skewness and fat tails.

    As far as algorithms go, the quickest one I have found in VBA programming is to map the percentiles of the distribution to an array (say of dim 100) and then pull numbers by addressing the array: y = array(int(100*rnd)). The other easy Excel cell function is NORMSINV(RAND()).
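
Both approaches are forms of inverse transform sampling: feed a uniform random number through the inverse of the normal cumulative distribution function. Here is a Python sketch of the two, using `statistics.NormalDist.inv_cdf` from the standard library (Python 3.8+). The function names and the choice to store the midpoint of each percentile bucket are assumptions of this sketch, not details from the comment.

```python
import random
from statistics import NormalDist, mean, stdev

STD_NORMAL = NormalDist(0.0, 1.0)


def build_percentile_table(size=100):
    """Precompute one normal value per bucket, taken at the bucket's midpoint.

    Using midpoints avoids calling inv_cdf at 0 or 1, where it is undefined.
    """
    return [STD_NORMAL.inv_cdf((i + 0.5) / size) for i in range(size)]


def sample_from_table(table, rng):
    """Table lookup: pick a bucket uniformly at random."""
    return table[int(len(table) * rng.random())]


def sample_inverse_cdf(rng):
    """Exact inverse-CDF draw, the analogue of NORMSINV(RAND())."""
    u = rng.random()
    while u == 0.0:  # inv_cdf requires 0 < p < 1
        u = rng.random()
    return STD_NORMAL.inv_cdf(u)


rng = random.Random(1)  # fixed seed (an assumption) for repeatability
table = build_percentile_table()
table_draws = [sample_from_table(table, rng) for _ in range(50_000)]
exact_draws = [sample_inverse_cdf(rng) for _ in range(50_000)]

print(f"table: mean {mean(table_draws):+.3f}, stdev {stdev(table_draws):.3f}")
print(f"exact: mean {mean(exact_draws):+.3f}, stdev {stdev(exact_draws):.3f}")
```

The table version is fast but coarse: with 100 buckets it can only ever return 100 distinct values, and nothing beyond the outermost bucket's value.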

  6. #6 RBH
    January 16, 2007

    Ahcuah wrote

    That means that events that, if we used a normal distribution, we would think would never occur, instead happen with surprising frequency. Certain stock market crashes fall into that category.

    Depending on the depth of historical data used to make the estimate, the October 19, 1987, U.S. stock market one-day decline (“crash”) was on the order of a 20 standard deviation event. That’s a once in multiple lifetimes of the universe event.

    Financial markets, particularly derivatives markets, violate any number of assumptions routinely made in estimating market risk, including assumptions about distributional stationarity, temporal price independence, continuous price movement, and liquidity (the assumption that there will be a willing buyer at some price). To be blunt, anyone who depends on the usual sorts of statistical estimates of risk in the markets (e.g., Value at Risk and its kin) is standing blindfolded on a tiny little island in a lake of quicksand.

  7. #7 Ahcuah
    January 16, 2007

    RBH wrote:

    To be blunt, anyone who depends on the usual sorts of statistical estimates of risk in the markets (e.g., Value at Risk and its kin) is standing blindfolded on a tiny little island in a lake of quicksand.

    Absolutely. But to relate back to Mark’s piece, it is always way too easy to assume that a distribution will be “normal” without actually examining the underlying assumptions. One of those critical assumptions is lack of history (each event is independent of the previous event). One can get away with that, usually, in the hard-physical sciences, but it is easy to forget that the violation of that is endemic elsewhere. Of course, with markets, there actually is a memory involved, and “while past results cannot guarantee future results”, they certainly have an influence on them.

  8. #8 bwv
    January 16, 2007

    As Mandelbrot mentions in his book, most phenomena of importance in the world are governed by power laws, not Gaussian bell curves. The financial markets are a good example of this. In addition to Mandelbrot, I would recommend Nassim Nicholas Taleb’s Fooled by Randomness (http://www.fooledbyrandomness.com/) for another good book on the subject.

  9. #9 Mark C. Chu-Carroll
    January 16, 2007

    Folks:

    Let’s try not to get sidetracked here. I didn’t say that incomes or financial markets follow normal distributions. I just used them as an example of how making an invalid assumption of normal distribution can be misleading. So all of this debate about the proper distribution model for markets comes down to “Yeah, it’s not a normal distribution”, but it’s going to be very off-putting to people who aren’t already familiar with this stuff.

  10. #10 Torbjörn Larsson
    January 16, 2007

    I am not sure how useful his multi-fractal models are in pricing derivatives or other applications where lognormal distributions are used.

    It is interesting to know why different distributions occur. I think I just realized that lognormal distributions (independent factors contribute multiplicatively) are situated between normal distributions (independent factors contribute additively) and power-law distributions (factors contribute in many ways, on all scales).

    Hmm. I have encountered lognormal distributions as describing some side effect growth process, for example size distributions of slag or bubble inclusions in melts. Suits, between the simple and the complex.

  11. #11 BWV
    January 16, 2007

    My understanding is that Lognormals (i.e. natural log of [Price at time 1 / Price at time 0] is normally distributed) are used in finance because:

    A) the price of an asset can never fall below $0, which would be theoretically possible with a normal distribution

    B) you can do math in continuous, rather than discrete time – allowing the use of the exponential function, stochastic calculus & differential equations (like the Black-Scholes model)

    Now these criteria would apply to anything with a growth rate over time as the basic process is geometric Brownian motion
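
Point A can be illustrated with a small simulation. The sketch below compares a price built from normally distributed log returns (a discrete-time stand-in for geometric Brownian motion) with a price built from normally distributed additive changes. The starting price, volatility, step count, and path count are all hypothetical values picked for the illustration.

```python
import math
import random


def lognormal_price(s0, sigma, steps, rng):
    """Final price when each step's LOG return is normal(0, sigma)."""
    log_price = math.log(s0)
    for _ in range(steps):
        log_price += rng.gauss(0.0, sigma)
    return math.exp(log_price)  # exp of a finite number is always positive


def additive_price(s0, sigma, steps, rng):
    """Final price when each step's absolute CHANGE is normal(0, sigma * s0)."""
    price = s0
    for _ in range(steps):
        price += rng.gauss(0.0, sigma * s0)
    return price  # nothing stops this from going below zero


rng = random.Random(3)  # fixed seed (an assumption) for repeatability
S0, SIGMA, STEPS, PATHS = 100.0, 0.10, 100, 2_000  # hypothetical parameters

lognormal_finals = [lognormal_price(S0, SIGMA, STEPS, rng) for _ in range(PATHS)]
additive_finals = [additive_price(S0, SIGMA, STEPS, rng) for _ in range(PATHS)]

print("lowest lognormal price:", min(lognormal_finals))
print("lowest additive price: ", min(additive_finals))
print("additive paths ending below zero:", sum(p < 0 for p in additive_finals))
```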

  12. #12 Bob O'H
    January 16, 2007

    As someone brought up a method for simulating a normal distribution, one of the faster methods used nowadays is called the Monty Python method.

    Bob

  13. #13 CCP
    January 17, 2007

    data ARE
    data ARE
    data ARE

  14. #14 billb
    January 17, 2007

    CCP: English is a living language. “Data” has become the plural with singular construction for the word “datum.” And by “has become,” I mean “has been used this way despite the efforts of prescriptivists since 1807 (according to my copy of the OED)”. :)

  15. #15 Mark Seecof
    January 19, 2007

    Someone mentioned weather. It can be handy to view weather excursions (and many other types of natural events) as examples of “1/f noise.” Intuitively, it’s hard to predict stuff like city-destroying tornadoes because the bigger they are, the less often they occur. Even if some are–unbeknownst to us–cyclic, it’s hard to get a baseline on them to figure out their periods or whatever.

  16. #16 Jonathan Vos Post
    January 20, 2007

    Mark: in my opinion, you are wise to have made the caveat: “I didn’t say that incomes or financial markets follow normal distributions. I just used them as an example of how making an invalid assumption of normal distribution can be misleading.”

    To even begin with markets, one would have to explain where volatility comes from (it’s merely assumed in Black-Scholes) and even more fundamentally, explain “heteroskedasticity.” I may be mildly published in Mathematical Economics, but I leave those two to the real experts.