Most of what would ordinarily be blogging time this morning got used up writing a response to a question at the
Physics Stack Exchange. But having put all that effort in over there, I might as well put it to use here, too…
Values collected as part of a scientific investigation; may be qualified as ‘science data’. This includes uncalibrated values (raw data), derived values (calibrated data), and other transformations of the values (processed data).
In response, he got a note saying:
You have a bias here towards observational data. Need to recognize that a lot of data comes from models and analyses.
The question is phrased as, basically, “What constitutes ‘data’?” but really it’s about the status given to simulation results within science.
This is, of course, a politically loaded question, which is probably why it got this response at the AGU, where there are people who work on climate change issues. Given the concerted effort in some quarters to cast doubt on the science of climate change in part by disparaging models and simulations as having lesser status, scientifically, it’s not surprising that people would be a little touchy about anything that seems to lean in that direction.
As for the actual status of models and simulations, that varies from (sub)field to (sub)field, more or less in accordance with how difficult it is to interpret experimental or observational data. My own field of experimental Atomic, Molecular, and Optical physics has a fairly clear divide between experiment and theory, largely because the experiments we do are relatively unambiguous: an atom either absorbs light or doesn’t, or it’s either in this position or that one. We do need simulations to compare to some experiments, but there’s never much question that those are theory, and not part of the experiment. The correspondence between things like density distributions observed in experiments and those generated by simulations is often close to perfect, differing only by a bit of noise in the experimental data.
When you get to nuclear and particle physics, where the detectors are the size of office buildings, the line gets a little fuzzier. The systems they use to detect and identify the products of collisions between particles are so complicated that it’s impossible to interpret what happens without a significant amount of simulation. As a result, experimental nuclear and particle physicists spend a great deal of time generating and analyzing simulated data, in order to account for issues of detector efficiency and so on. I don’t think they would call these results “data” per se, but computational simulation is an absolutely essential part of experimental physics in those fields, and those simulations are accorded more status than they would be in AMO physics. Experimental nuclear and particle physicists spend almost as much time writing computer code as theoretical AMO physicists, at least from what I’ve seen.
The situation gets even more complicated when you get to parts of physics that are fundamentally observational rather than experimental. If you’re a particle physicist, you can repeat your experiments millions or billions of times, and build up a very good statistical understanding of what happens. If you’re an astrophysicist or a geophysicist, you only get one data run: we have only one observable universe, and only one Earth within it to study. You can’t rewind the history of the observable universe and try it again with slightly different input parameters. Unless you do it in a simulation.
My outsider’s understanding of those fields is that simulation and modeling are accorded a much higher status than in my corner of physics, just out of necessity. If you want to use a physical model to explain some geological or astrophysical phenomenon, the only way you can really do it is by running a whole lot of simulations, and showing that the single reality we observe is a plausible result of your models. Correctly interpreting and establishing correspondences between simulations and observation is a subtle and complicated business, and constitutes a huge proportion of the work in those communities.
I don’t know that many astrophysicists, and even fewer geophysicists, so I don’t know the terminology they use. My impression of the astrophysics talks I’ve seen is that they wouldn’t put such simulation results on the same level as observational data or experiments, but then my sample isn’t remotely representative. It may well be that there are fields in which model results are deemed “data” in the local jargon.
Of course, part of the reason for moving this over here is that there will be many more geology/climate science types hanging around ScienceBlogs than there are at the Physics Stack Exchange, so there’s a good chance of getting some clarification from within the relevant communities. I’d even extend the question to fields outside the physical sciences: I know even less about biology than geophysics, so for all I know this is a question that comes up there, too. If you work in a field where simulation results are commonly termed “data,” leave a comment and let me know.