Why Antibunching Equals Photons

In my post about how we know photons exist, I make reference to the famous Kimble, Dagenais, and Mandel experiment showing "anti-bunching" of photons emitted from an excited atom. They observed that the probability of recording a second detector "click" a very short time after the first was small. This is conclusive evidence that photons are real, and that light has discrete particle-like character. Or, as I said in that post:

This anti-bunching effect is something that cannot be explained using a classical picture of light as a wave. Using a wave model, in which light is emitted as a continuous sinusoidal wave, you would expect some probability of a detector "click" even at very short times. In fact, you can easily show that any wave-like source of light must have a probability of recording a second click immediately after the first one that is at least as big as the probability of recording a second click after a long delay. Most of the time, the probability is actually higher at short times, not lower. A decrease in the probability of a second detection at short times is something that can only be explained by the photon model.

Since then, I have had a steady trickle of comments asking for the detailed explanation of why you can't, or protesting that you can too explain anti-bunching with a wave model. I generally try to avoid posting lots of equations on the blog, but this is unavoidably mathematical, so I will put the explanation below the fold.

If you want to characterize the detector signal from a wave model of light, you need to look at the intensity of the light, and in particular at the fluctuations of that intensity. For a classical source of light waves, we can describe this mathematically as a constant average intensity I0 plus a fluctuating term that is a function of time δ I(t). So, the intensity at any given time t is:

I(t) = I0 + δ I(t)

If you just had the constant intensity, you would always have exactly the same probability of measuring a "click" at your detector in any given instant of time. The fluctuating term δ I(t) gives you a way to describe changes in this probability, and can be either positive (indicating an increased chance of a "click") or negative (indicating a decreased chance of a "click"). It's equally likely to be positive or negative at any given instant in time.

Of course, your detector can't give you a perfect and instantaneous snapshot of the intensity, so what we really measure is the average intensity over some measurement interval, which we write as:

<I(t)> = <I0 + δ I(t)> = <I0> + <δ I(t)>

(Basically, angle brackets indicate an average over some time interval, and the average of the sum of two quantities is equal to the sum of the averages.) Looking at our definitions above, we can immediately see that <δ I(t)> = 0-- if it didn't, that would just represent a change in the average intensity I0, so we could trivially redefine everything so that the average of the fluctuating term is zero.

With me so far? This is just the definition of the way you deal with characterizing the energy flow in a classical wave model of light. You can do the whole thing in terms of the electric field amplitude if you really want to-- the math is slightly more tedious (because intensity is the average square of the electric field amplitude), but works out exactly the same in the end. I haven't said anything about the mathematical details of <δ I(t)>, so this is all perfectly general.
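
In case it helps to see this concretely, here is a minimal numerical sketch (Python with NumPy; the particular noise model is an arbitrary assumption, chosen only for illustration) of splitting a fluctuating classical intensity into a constant average plus a zero-mean fluctuating term:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy classical intensity: any positive, fluctuating signal will do
# (this particular mix of a slow modulation and white noise is an
# arbitrary assumption, purely for illustration).
t = np.linspace(0.0, 1.0, 100_000)
intensity = 1.0 + 0.3 * np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(t.size)

# Define I0 as the long-time average, and delta_I as whatever is left over.
I0 = intensity.mean()
delta_I = intensity - I0

print("I0        =", I0)
print("<delta I> =", delta_I.mean())   # zero to numerical precision, by construction
```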

If you want to know the probability of getting a second detector "click" a short time τ after the first, the quantity you need to describe this is the intensity correlation:

<I(t)I(t+τ)> = <(I0 + δ I(t))(I0 + δ I(t+τ))>

This is just saying that you take the intensity at time t and multiply it by the intensity a short time τ later, and take the time average of that product. The right-hand side just plugs in the definition of those two intensities in the mathematical notation we used earlier. If we multiply out the right-hand side, and use the fact that the average of a sum is the sum of the averages, we get:

<I(t)I(t+τ)> = <I0I0> + <I0δ I(t)> + <δ I(t+τ)I0> + <δ I(t)δ I(t+τ)>

This may look a little scary, but if you go through it term by term, it's really pretty simple. The first term is just the average intensity squared, which is some positive number. The next two terms are the average of the intensity (just a number) multiplied by the fluctuating term, which by definition averages to zero. So, the next two terms are zero. That leaves us with just the first and last terms to deal with:

<I(t)I(t+τ)> = <I0I0> + <δ I(t)δ I(t+τ)>
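
As a quick sanity check on the algebra, here is a minimal sketch (Python/NumPy again, with an assumed toy noise model) that computes both sides of this expression for a simulated classical intensity and confirms that the cross terms drop out:

```python
import numpy as np

rng = np.random.default_rng(1)

# Same kind of toy decomposition as before: a constant average plus a
# zero-mean fluctuating term (the noise model is an arbitrary assumption).
n = 200_000
I0 = 1.0
noise = 0.2 * rng.standard_normal(n)
delta_I = noise - noise.mean()          # enforce <delta I> = 0 exactly
intensity = I0 + delta_I

def time_avg_product(x, y, lag):
    """Time average of x(t) * y(t + lag) over the overlapping samples."""
    return np.mean(x[: n - lag] * y[lag:])

for lag in (0, 3, 300):
    lhs = time_avg_product(intensity, intensity, lag)
    rhs = I0 * I0 + time_avg_product(delta_I, delta_I, lag)
    print(f"lag={lag:3d}: <I(t)I(t+tau)> = {lhs:.5f}, "
          f"<I0 I0> + <dI(t) dI(t+tau)> = {rhs:.5f}")
# The two columns agree: the cross terms involving <I0 delta I> average away,
# just as in the algebra above.
```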

So, what is <δ I(t)δ I(t+τ)>? This is the average of the product of the fluctuating term at time t, and the fluctuating term a time τ later. We can't give a precise value of this without specifying the mathematical form of δ I(t), but we can say a couple of things about limiting cases.

For large values of τ, we expect the average <δ I(t)δ I(t+τ)> to be equal to zero, because the fluctuations are supposed to be random. That means that there shouldn't be any particular correlation between the fluctuation at one instant and the fluctuation at any later time-- if you could use the value of δ I(t) to predict δ I(t+τ), then the fluctuations wouldn't be random, would they?

The other delay at which we can pin down the value of <δ I(t)δ I(t+τ)> is τ = 0, where we have:

<δ I(t)δ I(t+0)> = <δ I(t)²>

That is, for zero delay between "clicks," the final term in our correlation is the average of the square of the fluctuating term at time t. Since the intensities we're dealing with are by definition real numbers, this has to be a number greater than or equal to zero. There's no way for the square of a real number to average out to be a negative number.

This means that, for a classical wave picture of light, the probability of getting a second detector "click" immediately after the first "click" has to be at least as big as the probability of getting the second click a long time later (large values of τ, where the correlation term averages to zero). The very best you can do with a classical source is to find a flat correlation function-- that is, exactly the same value everywhere-- which you can get using a laser. Thermal sources tend to show "bunching" effects-- that is, a correlation at short times that is significantly greater than the correlation at long times (as in the famous Hanbury Brown and Twiss experiments).
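
To make the contrast concrete, here is a minimal sketch comparing a constant, laser-like intensity with a chaotic, thermal-like one. The chaotic-light model (a complex Gaussian field with a finite correlation time) is a standard textbook assumption, not anything taken from the Kimble, Dagenais, and Mandel experiment:

```python
import numpy as np

rng = np.random.default_rng(2)

def g2(intensity, lag):
    """Normalized intensity correlation <I(t)I(t+lag)> / <I>^2."""
    n = intensity.size
    return np.mean(intensity[: n - lag] * intensity[lag:]) / np.mean(intensity) ** 2

n = 200_000

# "Laser-like" source: constant intensity, no fluctuations at all.
laser = np.full(n, 1.0)

# "Thermal-like" (chaotic) source: intensity = |E|^2 for a complex Gaussian
# field E with correlation time tau_c, generated here by an AR(1) filter on
# white noise (an assumed, standard model for chaotic light).
tau_c = 50
a = np.exp(-1.0 / tau_c)
noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) * np.sqrt(1 - a**2)
E = np.zeros(n, dtype=complex)
for i in range(1, n):
    E[i] = a * E[i - 1] + noise[i]
thermal = np.abs(E) ** 2

for lag in (0, 10, 50, 500):
    print(f"lag={lag:4d}   laser g2 = {g2(laser, lag):.3f}   thermal g2 = {g2(thermal, lag):.3f}")

# Expected behavior: the laser column stays flat at 1.000, while the thermal
# column starts near 2 at lag=0 and decays toward 1 -- "bunching".  Neither
# ever dips below its long-delay value, which is the classical bound derived
# above; anti-bunching would require exactly such a dip.
```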

The anti-bunching effect observed by Kimble, Dagenais, and Mandel-- and in every other correlation experiment with single-photon sources that has been done since-- is a lower value of this correlation at short times than at long times, which is impossible unless you know a way to make the square of a real number turn out negative. You can find more detailed versions of this calculation in any textbook on quantum optics. Since the argument makes no assumptions about the mathematical form of the fluctuations beyond insisting that the intensity be a real number, there's no way to get around the problem while still using a wave model of light. Thus, anti-bunching measurements conclusively show that light has a particle nature, and thus that photons are real.


I realized after posting this that I ought to say something about the time scales involved: The times we're talking about here are long compared to the period of oscillation for a light wave. The averaging time is longer than an optical cycle, and the delay time τ in the correlation is much longer than that.

Visible light has a frequency of hundreds of terahertz (a couple of femtoseconds per cycle), while most photon detectors have response times in the nanosecond range, so this isn't a terribly difficult requirement to meet.
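
For concreteness, a quick back-of-the-envelope check (the numbers are round, assumed values, not the specs of any particular detector):

```python
# Compare the optical period to a typical detector response time.
optical_frequency = 5e14          # ~500 THz, roughly green light
optical_period = 1 / optical_frequency
detector_response = 1e-9          # ~1 ns, a typical photon-counting detector

print(f"optical period   ~ {optical_period:.1e} s")                  # ~2e-15 s
print(f"detector window  ~ {detector_response:.0e} s "
      f"({detector_response / optical_period:,.0f} optical cycles)")  # ~500,000 cycles
```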

To be contrary for a moment...

"If you just had the constant intensity, you would always have exactly the same probability of measuring a 'click' at your detector in any given instant of time."

Isn't this sort of the crucial assumption? Is there a justification for it that doesn't involve presupposing a particle model for light? Suppose you had a detector that "bins" the incoming light, storing the incoming energy until enough of it has accumulated to emit a click, rather than just clicking with a probability proportional to the instantaneous intensity? Such a detector would exhibit anti-bunching even for purely wavelike light.

(For what it's worth, please note that I don't actually doubt the existence of photons, nor do I know of a way in which to construct my theoretical detector. I'm just trying to understand the anti-bunching argument a little better.)

Isn't this sort of the crucial assumption? Is there a justification for it that doesn't involve presupposing a particle model for light? Suppose you had a detector that "bins" the incoming light, storing the incoming energy until enough of it has accumulated to emit a click, rather than just clicking with a probability proportional to the instantaneous intensity?

That's what this model is. The intensity is a measure of the rate at which energy flows from the light into the detector. Your detector is a quantum system that makes a transition from the "no detection" state to a higher energy "click" state. This transition is inherently probabilistic-- that is, the thing you can calculate from a quantum model is the probability of this transition occurring at any given instant.

You can't think of the detector as a classical bucket that takes some time to fill with energy before it clicks, because we know that atoms have discrete energy states, and obey quantum rules. Not without throwing out all of quantum mechanics, which would create much bigger problems than explaining anti-bunching.
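
To see how this plays out, here is a minimal Monte Carlo sketch of that kind of probabilistic detector driven by a classical fluctuating wave. The chaotic-light model and the detection efficiency are assumptions chosen purely for illustration; the point is that the coincidence rate at short delays never comes out below the long-delay rate:

```python
import numpy as np

rng = np.random.default_rng(3)

# A classical, fluctuating intensity: |E|^2 for a complex Gaussian field with
# a finite correlation time (same assumed chaotic-light model as the earlier sketch).
n, tau_c = 400_000, 50
a = np.exp(-1.0 / tau_c)
noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) * np.sqrt(1 - a**2)
E = np.zeros(n, dtype=complex)
for i in range(1, n):
    E[i] = a * E[i - 1] + noise[i]
intensity = np.abs(E) ** 2

# Probabilistic detector: in each time bin the chance of a click is
# proportional to the instantaneous classical intensity.  The efficiency
# here is an arbitrary small number so that clicks are rare.
p_click = np.clip(0.02 * intensity, 0.0, 1.0)
clicks = rng.random(n) < p_click

# Coincidence counting: given a click at time t, how often is there another
# click a delay tau later?
click_times = np.flatnonzero(clicks)
for lag in (1, 10, 50, 500):
    first = click_times[click_times + lag < n]
    rate = np.mean(clicks[first + lag])
    print(f"lag={lag:4d}: P(second click | first click) = {rate:.4f}")

# The short-delay rate comes out higher than the long-delay one (bunching);
# for a perfectly constant intensity it would be flat.  Feeding any classical
# intensity into a detector like this never makes the short-delay rate come
# out lower, which is what anti-bunching would require.
```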

Thanks for returning to this subject, which I find very interesting. As in the original post, I have some reservations about the final conclusion.

First the most obvious issue - even assuming everything else is fine the fact that light is not a wave doesn't in itself prove it has to be a particle, especially when one takes into account effects like interference which convincingly show it cannot be a simple particle.

I also have more specific reservations, for example you basically assume what you want to prove when you say "for large values of τ, we expect the average <δ I(t)δ I(t+τ)> to be equal to zero, because the fluctuations are supposed to be random."

This of course means anti-bunching is impossible. But why "fluctuations are supposed to be random"? Light is emitted by complex matter which may impose very complex patterns on the emitted light.

Maybe antibunching is a proof that fluctuations are in fact not random, and not that photons are real.

Then there is another assumption, that detectors really measure the average intensity over some measurement interval. As above, the anti-bunching might be seen as proof that this assumption is wrong and that the interaction between light and detectors is far more complex than that.

All in all I would say that what you have described only excludes the most simplistic wave model of light.

What I like about this is that it leads to a nice lesson about how the photoelectric effect, usually purported to be proof of the photon, can actually be explained semiclassically-- quantized atomic levels, but a classical EM wave impinging on the solid. Which makes antibunching even more important.

I'm glad you recommend Greenstein and Zajonc's "The quantum challenge", I found their discussion very clear and approachable. Would you agree or disagree with the following statement?:

"Photons clearly exist, but their existence is only relevant in experiments involving coincidence counting."

The "correct" explanation of the photoelectric effect raises a fun aesthetics question: When should we resort to a 'deeper' theory? I assume string theory can explain the photoelectric effect, too, but it's probably a lousy tool for the job. Certainly the math for treating light classically and photoelectric detectors with QM is complicated, but I can follow every step of it, starting from Maxwell's and Schrodinger's equations. The photon explanation seems simpler, but I suspect this is only because so much of the model is assumed; I certainly don't know how to describe the interaction of a metal surface and a photon from first principles.

I've never done coincidence-counting experiments, so Maxwell's equations for light and Schrodinger's equation for electrons and protons explain every experiment I've ever done. Switching from a semiclassical model to a photon model has yet to simplify anything, or provide me any useful insight. So, my tastes are to use the most shallow theory possible at all times.

I believe you quantum optics guys that photons exist and they are important, but I think the photon model is a little more popular than it deserves. Biologists doing single-molecule experiments LOVE to talk about photons (check out some papers on PALM or STORM, good stuff). I'm 98% convinced the photon model is 100% irrelevant to this work. Crotchety old Lamb had a similar sentiment:

"The photon concepts as used by a high percentage of the laser community have no scientific justification"
( http://www.springerlink.com/content/h16g2307204h5654/ )

Finally, I suspect replacing Maxwell with the photon model can be actively harmful to visualization and intuition. Have you ever had an important physical insight guided by the photon model? Was it unrelated to coincidence counting?

Paul: First the most obvious issue - even assuming everything else is fine the fact that light is not a wave doesn't in itself prove it has to be a particle, especially when one takes into account effects like interference which convincingly show it cannot be a simple particle.

It shows that light has particle-like properties in addition to wave-like ones. The conventional method of describing this situation is to describe the light as a stream of photons, which are not strictly particles or strictly waves, but are a third class of object-- the term I usually use is "quantum particle"-- that has characteristics normally associated with both.

There isn't any experiment that will definitively show that light is made up of classical-type particles like billiard balls, because it's not. These experiments conclusively show that light has particle-like characteristics, though, and thus is made up of photons.

I also have more specific reservations, for example you basically assume what you want to prove when you say "for large values of τ, we expect the average <δ I(t)δ I(t+τ)> to be equal to zero, because the fluctuations are supposed to be random."

This of course means anti-bunching is impossible. But why "fluctuations are supposed to be random"? Light is emitted by complex matter which may impose very complex patterns on the emitted light.

If the fluctuations were not random, we would be able to see that by monitoring the intensity of the light directly, and we do not see those sorts of intensity patterns in classical sources of light (that is, light sources where there are large numbers of photons involved).

There may also be an issue of Fourier components-- that is, in order to make a classical wavepacket that would mimic photon behavior, you would need to include a huge range of different frequencies, which would show up as a broad spectrum of emission rather than a narrow spectral line of the sort known to be emitted by atoms. I haven't worked that through, though.

Then there is another assumption, that detectors really measure the average intensity over some measurement interval. As above, the anti-bunching might be seen as proof that this assumption is wrong and that the interaction between light and detectors is far more complex than that.

We understand the operating principles of the detectors (basically, the photoelectric effect) well enough to rule that out. The proper quantum model of the detector is a system with discrete bound states coupled to a continuum of free-electron states-- anything else would fail to reproduce well known properties of the detectors.

Andrew: I'm glad you recommend Greenstein and Zajonc's "The quantum challenge", I found their discussion very clear and approachable. Would you agree or disagree with the following statement?:

"Photons clearly exist, but their existence is only relevant in experiments involving coincidence counting."

I think that is probably true in the context of table-top optics experiments. I don't think that it is a correct statement about the totality of physics.

In particular, the quantization of the electromagnetic field is kind of central to quantum electrodynamics (QED), which is one of the most precisely tested theories in the history of science. You need to quantize the field in order to understand the Lamb shift and the g-factor of the electron, for example, and the latter has been measured to something like 14 decimal places, and found to agree perfectly with a theoretical prediction that includes photons.

QED, in turn, is kind of central to the Standard Model of particle physics, which again is quite well understood, and agrees very well with experiments. If anything, the Standard Model and QED are almost too successful-- lots of people would've liked to see one of them break down before now.

I'm sorry that you personally have never found photons enlightening, but they are a critical part of the modern understanding of physics, and the photon picture is extremely useful for providing insights into high-energy physics, among other areas. While it may be possible to understand many atomic (and even chemical or biological) processes using classical models, the photon-based explanation is almost always more concise and convenient (the few serious objections to it usually suggest teaching everything in terms of quantum field theory instead, which strikes me as unrealistic for undergrad physics majors, let alone chemists or biologists). And given that the electromagnetic field is unquestionably quantized, I see no good reason why the photon language shouldn't be used.

Good answer! I wasn't thinking of the high-energy guys at all. My context is mostly nonlinear optics and biology.

I'm looking forward to my first experimental experience where the photon concept becomes essential. I find I don't usually understand something deeply until I've tried to build it or simulate it.

Perhaps I'm being a bit naive, but isn't the single-photon Young's interference experiment a pretty good indicator of the particle-like nature of light? I suppose there may be a way to account for the observed results (individual photons deposit energy in discrete, localized locations, while an ensemble produces an interference pattern) with a quantum theory of matter, but the simplest and most enlightening explanation is that a "continuous" light wave consists of a large number of discrete "bits" of energy.

Perhaps I'm being a bit naive, but isn't the single-photon Young's interference experiment a pretty good indicator of the particle-like nature of light? I suppose there may be a way to account for the observed results (individual photons deposit energy in discrete, localized locations, while an ensemble produces an interference pattern) with a quantum theory of matter, but the simplest and most enlightening explanation is that a "continuous" light wave consists of a large number of discrete "bits" of energy.

I think you can explain all that with a large number of quantized detectors and a classical field illuminating the whole bunch. You might be able to back a correlation measurement out of it by looking at arrival statistics, but most double-slit experiments don't do this.

It occurs to me that I didn't mention Cavity QED above; that's an area that might require the photon picture without being a coincidence counting experiment. You might be able to explain most of it with the same semiclassical business, though.

Quantum Information is another area that makes a lot of use of photons, though Bell test and teleportation experiments mostly use coincidence counting at some level, so they might not be an exception. It's certainly much easier to talk about Bell's Inequality in terms of photons than classical waves, though. I kind of doubt you could do it in terms of pure waves, actually, but I haven't put all that much thought into it.

(I know they talk about those in The Quantum Challenge, but I'm not sure whether the statement about coincidence counting comes before or after those chapters...)

You can't think of the detector as a classical bucket that takes some time to fill with energy before it clicks, because we know that atoms have discrete energy states, and obey quantum rules. Not without throwing out all of quantum mechanics, which would create much bigger problems than explaining anti-bunching.

Got it. So the quantum nature of the detector is critical to prove the existence of the photon by anti-bunching, and the hypothetical detector I described is non-physical. I figured it was something like that, but I couldn't quite put it together. Thanks!

Thanks for the explanations, Chad. If possible, I would like to talk a bit more about the fluctuations being random. You say:

"If the fluctuations were not random, we would be able to see that by monitoring the intensity of the light directly, and we do not see those sort of intensity patterns in classical sources of light (that is, light sources where there are large numbers of photons involved)."

First, what do you mean by "monitoring the intensity of light more directly"? We can never access light directly, can we? It is always processed by quantum matter first, even if it's an eye.

Also such correlations may not be present in regular light or may be swamped when intensity is high enough.

For a very rough analogy, tap water experiences antibunching (single droplets) when the tap is almost completely closed, but if you open it more, the correlations are gone. (I'm not saying the same mechanism is responsible, of course; it's just an illustration of the fact that simply raising the intensity can completely alter the nature of the phenomenon.)

I've also read a bit more about anti-bunching (though it's hard to find free and accessible material) and it looks to me like it is a very rare state of light which needs very special emitting matter to be present.

For example, this article states in the abstract (I cannot access the full text):
"It is shown in some detail that nonlinear interaction mechanisms like multiphoton absorption and parametric three-wave interaction are suited to change the photon statistical properties of incident (in most cases coherent) light such that the output field will be endowed with antibunching properties."
http://rmp.aps.org/abstract/RMP/v54/i4/p1061_1

So antibunching is far from a generic property of light, and so not simply a consequence of hypothetical photons; rather, it is a property of light emitted in very special circumstances by non-linear materials. To me this means the assumption that there are no correlations in intensity cannot be made. What is the role of those non-linearities, if not to introduce correlations into the emitted light? If no correlations were needed, a generic material would suffice.

The abstract also mentions resonance fluorescence of a single atom as a source of antibunched light, but there it is even more obvious that there have to be correlations, as the atom needs time for re-excitation.

As for the Fourier components, maybe they are there, maybe their intensity is too weak to be detectable in this regime, or maybe the modulation is due to the overlap of many oscillators with the same frequency but a complex time-dependent phase pattern. This last possibility is hinted at in the abstract mentioned above, since they talk about "transforming phase fluctuations produced in a Kerr medium into antibunching-type intensity fluctuations".