Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E (Statistical, Nonlinear, and Soft Matter Physics), 69(2), 26113-21. (just go here)
This article, like the ones from Barabasi, sort of kicked off this flurry of research. They use a divisive clustering technique - so they start with the whole network, and break the connections with the highest betweeness. See figure.
See how if you remove
that one line, how you completely break up the thing? That line has
high betweenness. So they calculate that for all of the lines using
whatever method, then take the line with the highest out, then
re-calculate and remove, and again. They then go on to talk about the
actual algorithm to use to efficiently do all of this betweenness
calculating and give some examples. There's a lot in this
article, though, because they next talk about how to figure out when
you're done and if you've got decent communities. This measure is
modularity (see the article for the definition), but basically it's 0
if random and 1 is the maximum. If you calculate Q at each step, then
you can stop when it's highest. Note that any given node can only be in
one community, unfortunately. (in real life, people are nearly always
in multiple communities)Reichardt, J., & Bornholdt, S. (2006). When are networks truly modular? Physica D, 224(1-2), 20-26. doi: 10.1016/j.physd.2006.09.009 (or look here)
They review Newman and Girvan and suggest a new way that groups connected nodes and separates non-connected nodes. They go through a process and end up with an equation that's apparently like a Hamiltonian for a q-state Potts spin glass (dunno, ask a physicist if you need more info on that!). This method allows for overlapping communities because there could be times when you could move a node from one community to the next without increasing the energy. They compared it for some standard graphs and it did better than N-G. Instead of just stopping by minimizing modularity, they compare the modularity to a random graph with the same degree distribution.
Reichardt, J., & Bornholdt, S. (2007). Clustering of sparse data via network communities-a prototype study of a large online market. Journal of Statistical Mechanics: Theory and Experiment, P06016. doi:10.1088/1742-5468/2007/06/P06016
In this one they test the spin glass community detection method against the German version of ebay to look for market segmentation. The network has bidders as nodes, and if they bid on the same item there is an edge. The spin glass method was successful at pulling out clusters and using odds ratios, the authors showed that these clusters corresponded to groupings of subject categories. The Q was much higher than it would be for a random graph.

Christina K. Pikas is a science and engineering librarian in a special library as well as a doctoral student in information studies.


