Over at Faraday’s Cage, Cherish has a very nice post on Fourier series, following on an earlier post on Fourier transforms in the Transformers movie. She gives a nice definition of the process in the earlier post:
A Fourier Transform takes a signal and looks at the waves and then shows us the frequencies of all the waves. If we only have a single sine wave, like above, we will have a frequency that is zero everywhere except for the frequency of that sine wave. More complicated signals will be made up of several of these different frequencies and thus will have several peaks. The idea is that you could move back and forth between the period of the wave and the frequency by using the Fourier Transform. If you only have the frequency information, for example, you can use that to figure out which waves you need. Add them together, and you have your original signal back.
Of course, you might not see the point in this when it’s expressed that way– it sounds sort of like a game, just flipping back and forth between different representations of the same thing. This is actually an extremely powerful analysis tool, though, because it lets you pick patterns out of what otherwise can look like random noise.
In order to demonstrate, I will do what may be the dorkiest thing I’ve ever done here, namely taking the Fourier transform of the traffic on this blog. Here’s a graph showing the number of pageviews per day for the last 1024 days of Uncertain Principles (for technical reasons, it needs to be a power of 2):
Looking at that, you might be saying “This is just noise– a bunch of low-level hash with a handful of big spikes up above it.” Which is true– the really big spike corresponds to the Many Worlds, Many Treats post hitting Boing Boing, and the smaller, more recent one is this year’s Speed of God post. But is there some hidden pattern here that we’re not seeing?
The next graph is the Fourier transform of the traffic data:
I’ve zoomed in the vertical scale, because there’s always an enormous spike at a frequency of zero, again for technical reasons that don’t really matter. What you see here is a measure of the “power” at a given frequency, with the frequency measured in units of “per day.” This “power” tells you how much of a sine wave at that frequency you would need to include in order to recreate the original graph of blog traffic.
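For anyone who wants to poke at that zero-frequency spike, here is a minimal sketch in Python with NumPy (not the tool used for the graphs in this post). The pageview numbers are made up and the variable names are just for illustration. It assumes a standard unnormalized discrete Fourier transform, in which the zero-frequency term is simply the sum of all the data points, so it dwarfs everything else unless you subtract the average first.

```python
import numpy as np

# Made-up daily pageview counts (an assumption, not the real blog data).
rng = np.random.default_rng(1)
pageviews = rng.poisson(lam=1000, size=1024).astype(float)

# Real-input FFT: element 0 is the zero-frequency term.
spectrum = np.fft.rfft(pageviews)
dc_term = spectrum[0].real
largest_other = np.abs(spectrum[1:]).max()

# Subtracting the mean before transforming removes the zero-frequency spike.
centered = np.abs(np.fft.rfft(pageviews - pageviews.mean()))

print("zero-frequency term:", dc_term)
print("sum of the data:    ", pageviews.sum())
print("largest other term: ", largest_other)
print("zero-frequency term after removing the mean:", centered[0])
```

Zooming in the vertical scale, as in the graph above, is the other way to keep that one term from flattening everything else.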
This still looks like a bunch of crap with two spikes rising above the noise: one at around 0.14 day⁻¹, the other at around 0.28 day⁻¹. What does that mean for our data? That tells us that there’s a big concentration of power at a couple of frequencies, which indicates that there’s a strong recurring pattern in the traffic signal with that frequency.
So what is it? It’s maybe a little easier to understand if I plot the power as a function of one over the frequency, which is to say, the period in days:
OK, in order to see anything, you need to plot it on a log scale (which is why this isn’t usually done), but I’ve taken the liberty of labelling the peaks. The biggest peak in the signal is at a period of 7.014 days, meaning that there is a pattern in the data that repeats every 7 days. What’s that? Well, it’s the days of the week. If we zoom in on the original traffic graph a bit, you can see it:
Traffic is generally lower on weekends, so every seven days, you see a dip in the numbers. This is just visible in the raw data, but really pops out in the Fourier transform.
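The same effect is easy to reproduce with fake data. The sketch below, in Python with NumPy, builds 1024 days of invented traffic (a baseline of 1000 views, a weekend dip of 300, and random noise of about 100, all numbers I made up) and looks for the strongest non-zero frequency. The helper `power_spectrum` is a hypothetical name, and its normalization is one common choice rather than anything the original analysis necessarily used. With 1024 daily samples the frequency bins are spaced 1/1024 per day apart, so the bin closest to once a week is 146/1024, which works out to a period of about 7.014 days.

```python
import numpy as np

def power_spectrum(counts):
    """Return (frequencies in 1/day, power) for evenly spaced daily counts."""
    counts = np.asarray(counts, dtype=float)
    n = counts.size
    spectrum = np.fft.rfft(counts)
    power = np.abs(spectrum) ** 2 / n      # one common normalization
    freqs = np.fft.rfftfreq(n, d=1.0)      # d=1.0 means one sample per day
    return freqs, power

# Invented traffic: flat on five days, lower on two, plus noise.
rng = np.random.default_rng(0)
days = np.arange(1024)
is_weekend = (days % 7) >= 5
traffic = 1000.0 - 300.0 * is_weekend + rng.normal(0.0, 100.0, size=days.size)

freqs, power = power_spectrum(traffic)

# Skip index 0, the zero-frequency term, when looking for the peak.
peak_index = 1 + np.argmax(power[1:])
peak_freq = freqs[peak_index]
peak_period = 1.0 / peak_freq

print(f"strongest frequency: {peak_freq:.4f} per day")
print(f"corresponding period: {peak_period:.3f} days")
```

The weekly dip is hard to pick out by eye in a plot of `traffic`, but it dominates the spectrum once the zero-frequency term is set aside.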
What’s with the 3.5 day peak? Well, the weekly variation isn’t really sinusoidal– it’s kind of flat for five days, then drops down for two. That sort of pattern requires additional frequencies that are harmonics of the fundamental pattern, as discussed in Cherish’s posts. 3.5 days is half a week, so this is twice the frequency of the main signal, reinforcing the idea of a weekly pattern. The next harmonic would be at a frequency of around 0.43 day⁻¹, or a period around 2.33 days (a third of a week), but it would also be small enough that it would get lost in the other stuff.
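To see how quickly the harmonics fall off, here is a rough sketch in Python with NumPy for an idealized, noise-free version of the weekly pattern: five days at one level, two days at a lower one. The levels (1 and 0) and the 146-week length are my own choices for illustration; using a whole number of weeks puts each harmonic exactly on a frequency bin. The harmonics of a 7-day pattern sit at 1/7, 2/7, and 3/7 per day, which is to say periods of 7, 3.5, and about 2.33 days.

```python
import numpy as np

# Idealized week: "flat for five days, then drops down for two".
weeks = 146
week = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0])
signal = np.tile(week, weeks)          # 1022 days, an exact number of weeks
n = signal.size

# Amplitude of the sine wave needed at each frequency bin.
amplitude = 2.0 * np.abs(np.fft.rfft(signal)) / n

# With exactly `weeks` repetitions, harmonic k lands on bin k * weeks.
harmonics = {k: amplitude[k * weeks] for k in (1, 2, 3)}

for k, a in harmonics.items():
    print(f"harmonic {k}: frequency {k / 7:.3f} per day, "
          f"period {7 / k:.2f} days, amplitude {a:.3f}")
```

In this idealized case the third harmonic comes out at roughly a quarter the size of the fundamental, which fits with it being the first one to sink into the noise in real data.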
So that’s why scientists use Fourier analysis, and why it’s important enough to merit a (clunky and essentially undocumented) tool in the Excel analysis tool pack. Fourier analysis lets you find recurring patterns in large data sets, even when those patterns aren’t obvious from looking at the raw data. The result is kind of trivial in this case– I had thought there might be something at a period of around a year, because traffic usually dips significantly in December, but there isn’t enough data to really see it (there might be a peak at about 170 days, if you squint, but I’m not sure that’s real).
(Back when I was tracking baby feeding times and making colorful graphs, I had intended to follow up with a Fourier analysis of the feeding data, but things got busy at work, and I never found the time. That probably would’ve been more interesting, given that I didn’t see as clear a pattern in the raw data, but I no longer have SigmaPlot at home, so I can’t get at that data.)