Applied Statistics

Random matrices in the news

Mark Buchanan wrote a cover article for the New Scientist on random matrices, a heretofore obscure area of probability theory that his headline writer characterizes as “the deep law that shapes our reality.”

It’s interesting stuff, and he gets into some statistical applications at the end, so I’ll give you my take on it.

But first, some background.

About two hundred years ago, the mathematician/physicist Laplace discovered what is now called the central limit theorem, which states that, under certain conditions, the average of a large number of small random variables has an approximately normal (bell-shaped) distribution. A bit over 100 years ago, social scientists such as Galton applied this theorem to all sorts of biological and social phenomena. The central limit theorem, in its generality, is also important in the information that it indirectly conveys when it fails.

For example, the distribution of the heights of adult men or women is nicely bell-shaped, but the distribution of the heights of all adults has a different, more spread-out distribution. This is because your height is the sum of many small factors and one large factor–your sex. The conditions of the theorem are that no single factor (or small number of factors) should be important on its own. For another example, it has long been observed that incomes do not follow a bell-shaped curve, even on the logarithmic scale. Nor do sizes of cities and many other social phenomena. These “power-law curves,” which don’t fit the central limit theorem, have motivated social scientists such as Herbert Simon to come up with processes more complicated than simple averaging (for example, models in which the rich get richer).
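A small simulation can make the height example concrete. This is only a sketch under made-up assumptions: each "height" is the sum of 40 small uniform factors (arbitrary units), and the sex effect is a hypothetical shift of plus or minus 5 units. The function names and all the numbers are invented for illustration. Excess kurtosis is used as a rough check of bell shape: it should be near zero for a normal distribution and clearly negative for a flattened or two-humped one.

```python
import random
import statistics

def simulate_heights(rng, n_people, sex_effect, n_factors=40):
    """Each height is a sum of many small factors plus one large one (sex).

    sex_effect=0 gives a single-sex group; otherwise each person is shifted
    by +sex_effect or -sex_effect with equal probability. All units and
    effect sizes here are illustrative assumptions, not real height data.
    """
    heights = []
    for _ in range(n_people):
        small = sum(rng.uniform(-1.0, 1.0) for _ in range(n_factors))
        large = rng.choice((-sex_effect, sex_effect))
        heights.append(small + large)
    return heights

def excess_kurtosis(xs):
    """Roughly 0 for a bell curve; negative for a flat or two-humped shape."""
    mean = statistics.fmean(xs)
    var = statistics.fmean((x - mean) ** 2 for x in xs)
    fourth = statistics.fmean((x - mean) ** 4 for x in xs)
    return fourth / var ** 2 - 3.0

rng = random.Random(0)
one_sex = simulate_heights(rng, 10_000, sex_effect=0.0)
all_adults = simulate_heights(rng, 10_000, sex_effect=5.0)

for label, xs in (("one sex", one_sex), ("all adults", all_adults)):
    print(f"{label:>10}: sd = {statistics.pstdev(xs):.2f}, "
          f"excess kurtosis = {excess_kurtosis(xs):+.2f}")
```

With only small factors the kurtosis should come out close to zero. Adding the single large factor should widen the distribution and push the kurtosis well below zero, which is the central limit theorem failing in an informative way.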

The central limit theorem is an example of an attractor–a mathematical model that appears as a limit as sample size gets large. The key feature of an attractor is that it destroys information. Think of it as being like a funnel: all sorts of things can come in, but a single thing–the bell-shaped curve–comes out. (Or, for other models, such as that used to describe the distribution of incomes, the attractor might be a power-law distribution.) The beauty of an attractor is that, if you believe the model, it can be used to explain an observed pattern without needing to know the details of its components. Thus, for example, we can see that the heights of men or of women have bell-shaped distributions, without knowing the details of the many small genetic and environmental influences on height.

Now to random matrices. A random matrix is an array of numbers, where each number is drawn from some specified probability distribution. You can compute the eigenvalues of a square matrix–that’s a set of numbers summarizing the structure of the matrix–and they will have a probability distribution that is induced by the probability distribution of the individual elements of the matrix. Over the past few decades, mathematicians such as Alan Edelman have performed computer simulations and proved theorems deriving the distribution of the eigenvalues of a random matrix, as the dimension of the matrix becomes large.

It appears that the eigenvalue distribution is an attractor. That is, for a broad range of different input models (distributions of the random matrices), you get the same output–the same eigenvalue distribution–as the sample size becomes large. This is interesting, and it’s hard to prove. (At least, it seemed hard to prove the last time I looked at it, about 20 years ago, and I’m sure that it’s even harder to make advances in the field today!)
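Here is a rough way to see this "different inputs, same output" behavior on a laptop. It is a sketch under specific assumptions, not a general statement: the matrices are symmetric, the entries are independent with mean 0 and variance 1, and everything is scaled by 1/sqrt(n). For that setup the limiting eigenvalue distribution is the Wigner semicircle on [-2, 2], whose second and fourth moments are 1 and 2. To stay within the Python standard library, the code avoids an eigenvalue solver and uses the identities mean(lambda^2) = tr(A^2)/n and mean(lambda^4) = tr(A^4)/n. The function names are invented for illustration.

```python
import math
import random

def random_symmetric(n, draw):
    """n x n symmetric matrix with i.i.d. entries from draw(), scaled by 1/sqrt(n)."""
    scale = 1.0 / math.sqrt(n)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            a[i][j] = a[j][i] = draw() * scale
    return a

def eigenvalue_moments(a):
    """Second and fourth moments of the eigenvalues, computed from traces.

    For symmetric A, entry (i, j) of A^2 is the dot product of rows i and j,
    tr(A^2) is the sum of its diagonal, and tr(A^4) is the sum of the squares
    of all its entries.
    """
    n = len(a)
    a2 = [[sum(x * y for x, y in zip(a[i], a[j])) for j in range(n)]
          for i in range(n)]
    m2 = sum(a2[i][i] for i in range(n)) / n
    m4 = sum(v * v for row in a2 for v in row) / n
    return m2, m4

rng = random.Random(0)
n = 120
root3 = math.sqrt(3.0)

# Three quite different entry distributions, all with mean 0 and variance 1.
inputs = {
    "normal": lambda: rng.gauss(0.0, 1.0),
    "uniform": lambda: rng.uniform(-root3, root3),
    "coin flip": lambda: rng.choice((-1.0, 1.0)),
}

moments = {name: eigenvalue_moments(random_symmetric(n, draw))
           for name, draw in inputs.items()}

for name, (m2, m4) in moments.items():
    print(f"{name:>9}: mean(lambda^2) = {m2:.3f}, mean(lambda^4) = {m4:.3f}")
```

All three rows should land near 1 and 2 even though the entries themselves look nothing alike. Matching two moments is of course much weaker than matching the whole distribution, so treat this as a sanity check rather than a demonstration of the theorem.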

Now, to return to the news article. If the eigenvalue distribution is an attractor, this means that a lot of physical and social phenomena which can be modeled by eigenvalues (including, apparently, quantum energy levels and some properties of statistical tests) might have a common structure, just as, at a similar level, we see the normal distribution and related functions in all sorts of unusual places.

Consider this quote from Buchanan’s article:

Recently, for example, physicist Ferdinand Kuemmeth and colleagues at Harvard University used it to predict the energy levels of electrons in the gold nanoparticles they had constructed. Traditional theories suggest that such energy levels should be influenced by a bewildering range of factors, including the precise shape and size of the nanoparticle and the relative position of the atoms, which is considered to be more or less random. Nevertheless, Kuemmeth’s team found that random matrix theory described the measured levels very accurately.

That’s what an attractor is all about: different inputs, same output.

Thus, I don’t quite understand this quote:

Random matrix theory has got mathematicians like Percy Deift of New York University imagining that there might be more general patterns there too. “This kind of thinking isn’t common in mathematics,” he notes. “Mathematicians tend to think that each of their problems has its own special, distinguishing features. But in recent years we have begun to see that problems from diverse areas, often with no discernible connections, all behave in a very similar way.”

This doesn’t seem like such a surprise to me–it seems very much in the tradition of mathematical modeling. But maybe there’s something I’m missing here.

Finally, Buchanan turns to social science:

An economist may sift through hundreds of data sets looking for something to explain changes in inflation – perhaps oil futures, interest rates or industrial inventories. Businesses such as Amazon.com rely on similar techniques to spot patterns in buyer behaviour and help direct their advertising.

While random matrix theory suggests that this is a promising approach, it also points to hidden dangers. As more and more complex data is collected, the number of variables being studied grows, and the number of apparent correlations between them grows even faster. With enough variables to test, it becomes almost certain that you will detect correlations that look significant, even if they aren’t. . . . even if these variables are all fluctuating randomly, the largest observed correlation will be large enough to seem significant.

This is well known. The new idea is that mathematical theory might enable the distribution of these correlations to be understood for a general range of cases. That’s interesting but doesn’t alter the basic statistical ideas.
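The multiple-comparisons point is easy to reproduce. The sketch below uses made-up settings: 60 independent noise variables with 50 observations each, and a rule-of-thumb cutoff of 2/sqrt(n) for a correlation that would look significant at roughly the 5% level if it were the only one tested. None of these numbers come from the article.

```python
import math
import random

def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)
n_obs, n_vars = 50, 60  # illustrative sizes only

# Pure noise: every variable is independent of every other one.
data = [[rng.gauss(0.0, 1.0) for _ in range(n_obs)] for _ in range(n_vars)]

abs_corrs = [abs(correlation(data[i], data[j]))
             for i in range(n_vars) for j in range(i + 1, n_vars)]

threshold = 2.0 / math.sqrt(n_obs)  # rough single-test 5% cutoff (assumption)
n_flagged = sum(r > threshold for r in abs_corrs)

print(f"pairs tested: {len(abs_corrs)}")
print(f"largest |r|: {max(abs_corrs):.2f} (cutoff {threshold:.2f})")
print(f"pairs beyond the cutoff: {n_flagged}")
```

With 1,770 pairs, a few percent of them should clear the cutoff by chance alone, and the largest correlation should sit well above it even though there is no signal anywhere in the data.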

Beyond this, I think there’s a flaw in the idea that statistics (or econometrics) proceeds by blindly looking at the correlations among all variables. In my experience, it makes more sense to fit a hierarchical model, using structure in the economic indexes rather than just throwing them all in as predictors. We are in fact studying the properties of hierarchical models when the number of cases and variables becomes large, and it’s a hard problem. Maybe the ideas from random matrix theory will be relevant here too.

Buchanan writes:

In recent years, some economists have begun to express doubts over predictions made from huge volumes of data, but they are in the minority. Most embrace the idea that more measurements mean better predictive abilities. That might be an illusion, and random matrix theory could be the tool to separate what is real and what is not.

I’m with most economists here: I think that, on average, more measurements do mean better predictive abilities! Maybe not if you are only allowed to look at correlations and least-squares regressions, but if you can model with more structure then, yes, more information should be better.

Comments

  1. #1 Stagyar zil Doggo
    April 14, 2010

    That is, for a broad range of different input models (distributions of the random matrices), you get the same output–the same eigenvalue distribution–as the sample size becomes large.

    Speaking from some familiarity with Linear Algebra but with no knowledge of Random Matrix theory, I’ll guess that this common Eigenvalue Distribution is seen only for the principal (or largest few) Eigenvalues.

    Do probability distributions of corresponding Eigenvectors also approach some standardized distributions?

  2. #2 Blake Stacey
    April 15, 2010

    The analogue of the central limit theorem in random-matrix theory is the Wigner semicircle law (and no, it doesn’t apply to just the principal eigenvalues).

  3. #3 Stagyar zil Doggo
    April 17, 2010

    Thx, Blake.

  4. #4 Some Random Prof
    June 15, 2010

    Excellent post. Would that the original article had done half as good a job of explaining as you did.

    Off to read more,

    SRP

  5. #5 Yaser Helmy
    July 1, 2010

    Hi Blake,

    If I understand you correctly, the theory revolves around studying the distribution of the eigenvalues of a matrix of random variables, each of which, coming from a possibly different distribution – i.e. a[1,2] ~ Normal, a[1,1] ~ Power Law.

    If the above is correct, then what information can the distribution reveal about the structure of the variables? I mean, does this mean that the distributions of the eigenvalues can give us a better prediction about the future for example than assuming a normal or a power law distribution? Or, does it provide better information regarding the correlation of different variables, each of which belonging to a different distribution?

    Excuse my ignorance, since i’m not a statistician. However, since I read the horrible article in New Scientist, and I can’t stop thinking about the number of problems that this idea can’t potentially solve :)

    Y.

  6. #6 Yaser Helmy
    July 1, 2010

    Correction:

    Horrible Article – Amazing Article.
    this ideal can’t potentially solve – this idea can potentially solve.

    Sorry for the confusion