The world wide web can be understood as a giant matrix of associations (links) between various nodes (web pages). At an abstract level, this is similar to human memory, which also consists of a matrix of associations (learned relationships, or neuronal connections) between various nodes (memories, or the distributed representations constituting them). In the new issue of Psychological Science, Griffiths et al. ask whether Google’s famously accurate and fast PageRank algorithm for internet search might behave similarly to the brain’s algorithm – whatever that might be – for searching human memory.
About PageRank
The PageRank algorithm is based on the assumption that the most important nodes in a network are those linked to by a large number of other nodes, which are themselves linked to by a large number of other nodes, which themselves… and so on. This “recursive definition of importance” is formalized in Google’s algorithm to efficiently calculate the rankings of different web pages, and to return the most highly ranked web pages that also fit a certain search term.
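The recursive definition above can be sketched as a simple power iteration. This is a minimal illustration, not Google's production algorithm: the function name `pagerank`, the toy `links` graph, and the choice to spread the rank of link-less pages evenly are assumptions made for the example, and the damping factor of 0.85 is the value usually quoted in public descriptions of PageRank.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Estimate PageRank by power iteration.

    links: dict mapping each node to the list of nodes it links to.
    Returns a dict mapping each node to its rank (ranks sum to 1).
    """
    # Collect every node that appears as a source or a target.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    # Start from a uniform guess and refine it repeatedly.
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node splits its rank equally among its outgoing links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a node with no outgoing links spreads
                # its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by three other pages.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page "c" ends up with the highest rank, and page "a" comes second even though only one page links to it, because that one page is the important "c".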
Search in Human Memory
One way of graphing the associative structure of human memory is simply to ask human subjects to generate words which are strongly associated with other words. Averaged across many subjects, the frequency of those generated words reflects the “associate frequency” of the words in human memory. You might think of this result as “MemoryRank” instead of PageRank.
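As a rough sketch of the measure described above, associate frequency can be computed by pooling every subject's responses and counting how often each word is produced. The function name `associate_frequency` and the sample cue-response pairs are hypothetical, and the normalization to proportions is a choice made here for readability rather than something taken from the study.

```python
from collections import Counter


def associate_frequency(responses):
    """Share of all pooled responses accounted for by each word.

    responses: iterable of (cue, response) pairs across all subjects.
    """
    counts = Counter(word for _cue, word in responses)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Hypothetical free-association data, pooled across subjects.
responses = [
    ("doctor", "nurse"), ("doctor", "nurse"), ("doctor", "hospital"),
    ("hospital", "nurse"), ("hospital", "sick"),
    ("sick", "doctor"),
]
freq = associate_frequency(responses)
for word in sorted(freq, key=freq.get, reverse=True):
    print(word, round(freq[word], 3))
```

Note that this count treats every cue identically: a word gets the same credit whether it was produced in response to an important cue or an obscure one.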
How well does PageRank account for human memory?
Griffiths et al. note one critical difference between PageRank and the “associate frequency” measure of human memory: the latter doesn’t account for the fact that some cues are strongly associated with more words than others. This is captured by PageRank’s recursive definition of importance.
To evaluate which ranking scheme better predicts human data, the two methods were used on a large set of verbal associations, all generated by humans in response to each of over 5,000 words. The result of this process was two ranks for each word in the set – one generated according to PageRank and one according to associate frequency. This list was then culled to include only those words generated by a set of 50 adults, each of whom had been asked to generate the first word that came to mind in response to each letter of the alphabet (excluding 5 low-frequency letters).
If either PageRank or associate frequency were a perfect model of human memory, then the human data should be completely predictable: humans should always pick the word that has the highest rank among those starting with the desired letter.
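The prediction logic described above can be sketched as follows: restrict the vocabulary to words starting with the cued letter, order them by score, and see where the human response lands. The helper names and the `scores` values are hypothetical and stand in for whichever ranking scheme is being tested; this is not the authors' actual analysis code.

```python
def rank_for_letter(scores, letter):
    """Words starting with `letter`, ordered from highest score to lowest."""
    candidates = [word for word in scores if word.startswith(letter)]
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(candidates, key=lambda word: (-scores[word], word))


def position_of_response(scores, letter, response):
    """1-based position of a human response among the candidates.

    Returns None if the response is not in the scored vocabulary.
    """
    ordered = rank_for_letter(scores, letter)
    if response not in ordered:
        return None
    return ordered.index(response) + 1


# Hypothetical scores from some ranking scheme (PageRank, associate
# frequency, or word frequency).
scores = {"dog": 0.30, "day": 0.20, "door": 0.12, "cat": 0.25, "car": 0.10}

print(rank_for_letter(scores, "d"))
print(position_of_response(scores, "d", "day"))
```

Under a perfect model every human response would sit at position 1, so a scheme that places responses closer to the top of the list on average is the better predictor.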
The result:
“PageRank outperformed both associate frequency and word frequency as a predictor” of those words generated by humans in response to each letter of the alphabet. And this wasn’t due merely to the training set – Griffiths et al. manipulated the training set in various ways, and in all cases, PageRank came out on top (relative to associate frequency and word frequency).
What does this mean?
It turns out that PageRank is mathematically equivalent to a large number of other formalisms that are used in cognitive science. For example, severely limited connectionist networks (limited insofar as connection weights are equalized across all projections from a certain node) are mathematically equivalent to PageRank: the activation in such a network should ultimately settle on those nodes in proportion to their PageRank. Likewise, PageRank can also be considered an estimate of “priors” in a Bayesian network (with some simplifying assumptions about likelihood).
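One way to see the connectionist equivalence is to write both ideas as the same equation. The notation below is an assumption introduced for illustration: L is the link matrix (L_{ij} = 1 if node j links to node i), p is the vector of PageRank scores, and a^{(t)} is the activation of the network at time step t. The simplest, undamped form of PageRank is used.

```latex
% PageRank: each node's score is the sum of the scores of the nodes
% linking to it, with each linking node splitting its score equally
% among its outgoing links.
p_i = \sum_j M_{ij}\, p_j,
\qquad
M_{ij} = \frac{L_{ij}}{\sum_k L_{kj}}

% Spreading activation with equalized outgoing weights uses the same
% matrix M as its update rule:
a^{(t+1)} = M\, a^{(t)}

% A fixed point of the update satisfies a = M a, which is the same
% equation that defines p.
```

When the network is connected well enough for the activation to settle (every node reachable from every other, with no strict cycling), repeated updates converge to a vector proportional to p.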
So Google’s PageRank may accomplish network search in ways that can also be implemented in other frameworks widely used in cognitive psychology. However, PageRank (at least as it is known in the public domain) makes the strong simplifying assumption that all associations from a particular node contribute equally to the importance of each of the connected nodes.
Although this assumption may be necessary for Google’s purposes, it is clear that no such limitation exists in the brain. After all, the most widely recognized algorithm for neural computation – Hebbian learning – works precisely because it modifies the weight from one node to another independently of the weights from that node to all other nodes.
Is Google in my brain?
No one is suggesting that Larry Page has discovered the secret to the organization of human memory. In fact, it’s clear that some of PageRank’s (public) assumptions about the structure of networks do not hold – for example, the idea that the importance of a single node is distributed equally through all its connections. Much better models of verbal processing abound in cognitive psychology (see, for example, LSA). Still, Griffiths et al. compellingly demonstrate that the advantageous qualities of PageRank do indeed generalize from the world wide web to the semantic networks present in the brain.