In every cop drama there’s a scene where a suspect is being questioned in an interrogation room. The room contains a large mirror, and behind that mirror the detectives and district attorneys are observing and arguing about the progress of the case. The mirror is a two-way mirror.
These kinds of mirrors aren’t complicated. Light shines on them, and some fraction is reflected back while some fraction passes through. The suspect in the brightly lit room can’t see the dark room beyond the mirror because the bright room light washes out the much smaller amount coming from the adjacent dark room.
We use the same concept in the lab. A device called a beam splitter is essentially a two-way mirror. Beam splitters come in various types and styles for various purposes, but let’s pretend we have one that reflects exactly half the incoming light and transmits the other half. I’ll draw a very simplified schematic of this sort of mirror:

Here the incoming light is a, the half that’s reflected is b, and the half that’s transmitted is c. Dig into the classical electrodynamic details of this and you’d find that the transmitted wave will generally undergo a pi/2 phase shift relative to the reflected one, but that’s a detail. For classical light as described by Maxwell’s equations this works just fine. Since pretty much all light is produced either incoherently by light bulbs and the sun or whatnot, or coherently by lasers and related devices, our description of the beam splitter can pretty much stop there.
But if we’re dealing with very small amounts of light – down to the level of individual photons – it turns out this description fails to work properly. A photon is a discrete quantum of light, so it can’t be split in two. Any single photon you shoot at the beam splitter will instead have a 50:50 chance of going one way or another. For large numbers of photons this reproduces the classical case, but for small numbers of photons this statistical description is required.
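Here is a minimal sketch of that statistical picture, assuming an ideal, lossless 50:50 beam splitter and perfect detectors. The function name `fire_single_photons` and the fixed random seed are made up for illustration; the only physics in it is that each photon shows up whole in exactly one output.

```python
import random

def fire_single_photons(n_photons, seed=0):
    """Send n_photons one at a time into an idealized 50:50 beam splitter.

    Each photon is detected whole in exactly one output: 'b' (reflected)
    or 'c' (transmitted), each with probability 1/2. Losses and detector
    inefficiency are ignored (an assumption of this sketch).
    """
    rng = random.Random(seed)
    counts = {"b": 0, "c": 0}
    for _ in range(n_photons):
        counts[rng.choice("bc")] += 1
    return counts

# Small runs are lumpy; large runs approach the classical half-and-half split.
for n in (10, 1000, 100000):
    counts = fire_single_photons(n)
    print(n, counts, "fraction reflected:", counts["b"] / n)
```

With only a handful of photons the split can be noticeably uneven, while for large numbers the reflected fraction settles close to one half, which is the classical result.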
That’s not so strange. What’s strange is what happens if you shoot two photons of the same frequency at the beam splitter at the same time, one from the left in the picture and one from the top. Then there are three possibilities: both photons might exit via path b, both photons might exit via path c, or one photon could take b while the other takes c. So the situation we have is this:

In our particular situation we have a 1-photon state at input a, a 1-photon state at input d, and unknown states that we have to calculate at outputs b and c. Now we have one of those cases where doing the math is easy, but for the math to make any sense I’d have to teach you how to quantize an electric field. That’s a very math-intensive project, so we’ll skip it for now. What we find is that the two possibilities “both photons are reflected” and “both photons are transmitted” have opposite signs in their contributions to the process “one photon in each output”. As such the amplitude of that process ends up being zero. The only possibilities left are “both exit via b” and “both exit via c”.
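Without quantizing anything, the bookkeeping of that cancellation can be sketched in a few lines. The assumptions here are mine: an ideal, lossless 50:50 beam splitter, two perfectly indistinguishable photons, and the pi/2 phase shift placed on the transmitted amplitude (only the relative phase between reflection and transmission matters for the result). The factor of sqrt(2) for two photons landing in the same output is the standard rule for identical bosons, which I’m quoting rather than deriving.

```python
import math

# Assumed convention: ideal lossless 50:50 beam splitter.
r = 1 / math.sqrt(2)    # reflection amplitude
t = 1j / math.sqrt(2)   # transmission amplitude, carrying the pi/2 phase shift

# Photon from input a: reflected into b (amplitude r), transmitted into c (amplitude t).
# Photon from input d: transmitted into b (amplitude t), reflected into c (amplitude r).

# Two indistinguishable ways to end up with one photon in each output:
both_reflected = r * r      # a -> b and d -> c
both_transmitted = t * t    # a -> c and d -> b
amp_one_each = both_reflected + both_transmitted

# Both photons in the same output: one route each, times sqrt(2)
# because two identical bosons occupy the same mode.
amp_both_b = math.sqrt(2) * r * t   # a -> b and d -> b
amp_both_c = math.sqrt(2) * t * r   # a -> c and d -> c

probs = {
    "one in each": abs(amp_one_each) ** 2,
    "both in b": abs(amp_both_b) ** 2,
    "both in c": abs(amp_both_c) ** 2,
}
for outcome, p in probs.items():
    print(outcome, round(p, 6))
```

The “both reflected” term comes out as +1/2 and the “both transmitted” term as -1/2, so the one-in-each amplitude vanishes and the remaining probability is shared equally between “both in b” and “both in c”.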
But until you actually measure where the photons are, you have a superposition of those two states. If you find that one of the photons is in path b, you know for a fact that you will also find the other photon in path b. And if you find one in path c, you know the other one is also in path c. Until you measure, you can’t know which one of those cases it will be.
Breathless popular press articles, like the one we talked about earlier this week, commonly describe these processes as though both photons are simultaneously in path b and in path c. After all, that article described the oscillator in the experiment as both vibrating and not vibrating. While to some extent it’s just a matter of semantics, I feel that kind of description is more confusing than helpful. A quantum superposition is not two contradictory things magically happening at the same time; it’s a combination of different and sometimes mutually exclusive states, only one of which will actually end up being observed. Now these states are real: each state in the superposition can interfere with the other states and affect their probability of being observed once the system actually interacts with something. But the actual observation will only involve one of these states, chosen on a probabilistic basis. Quantum mechanics is an odd and frequently counterintuitive theory, but like every other accepted theory in physics it is rigidly bound by the rules of mathematics and logic. It’ll give you strange results, but it’ll never give you impossible ones.
UPDATE: Reworded this paragraph slightly as per suggestion in the comments.