[This correspondence started with an email from McKitrick commenting on this post. I've edited it to remove most of the quoted text from previous emails. Further discussion is here.]
I saw your suggestion about how to test whether the increase in average T was an artifact of the changed sample. I can see 2 problems with it. First, there was a change post-1990 in the quality of data in stations still operating, as well as the number of stations. Especially in the former Soviet countries after 1990, the rate of missing monthly records rose dramatically. So you need a subset of stations operating continuously and with reasonably continuous quality control.
Second, if in this subset you observe an upward trend comparable to the conventional global average, then to prove that this validates the global average you have to argue that the subset is a randomly chosen, representative sample of the whole Earth. Of course, if this were true the temperature people would only use the continuously-available subset for their data products. It isn't, which is why they don't. It would leave you with a sample biased towards US and European cities, so it is not representative of the world as a whole. The large loss in the number of stations operating (50% in a few years) was not random in a geophysical sense. It was triggered by economic events: stations were closed in part if they were relatively costly to operate or if the country experienced a sudden loss of public sector resources. One can conjecture what the effect of that discontinuity was, but to test the conjecture, at some point you have to guess at what the unavailable data would have said if they were available. Because of that, I cannot see how one can devise a formal test of the representativeness of the subsample.
None of this means that those researchers with access to the raw data can't propose and implement such tests as you propose (I wish they would). But it means that when you are shown a graph of the "global temperature" you are not in a position to know why the 1990 numbers are higher than those in 1980, given at least two confounded explanations.
The raw data is freely available for download. YOU are a researcher with access to the raw data. I downloaded it and was able to approximately reproduce your graph here. Your graph seems to have a severe problem with the right hand scale--you show a peak of 15,000 stations when it was actually 6,000.
I then did another one, just using stations with data from every year between 1980 and 2000. The resulting average is reasonably close to the one shown in the GISS graphs and shows that the 90s were warmer than all previous decades. You must have accessed the GHCN data to produce your graph. Why didn't you do the simple test I just did?
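A minimal sketch of the kind of test described above, for readers who want to try it themselves. It is not the code actually used, and it does not read the real GHCN file format: it assumes the data has already been reduced to hypothetical `(station_id, year, annual_mean_temp)` tuples. The function name and parameters are illustrative. It keeps only stations that report in every year from 1980 to 2000 and takes a simple unweighted average across them, which is cruder than the area-weighted method Hansen et al describe.

```python
from collections import defaultdict


def continuous_station_average(records, start=1980, end=2000):
    """Average temperature per year over continuously reporting stations.

    records: iterable of (station_id, year, annual_mean_temp) tuples.
    This input layout is an assumption for illustration, not the GHCN format.
    """
    required_years = set(range(start, end + 1))

    # Group the readings by station: {station_id: {year: temp}}
    by_station = defaultdict(dict)
    for station_id, year, temp in records:
        by_station[station_id][year] = temp

    # Keep only stations with a reading in every required year.
    kept = sorted(
        station_id
        for station_id, years in by_station.items()
        if required_years.issubset(years)
    )

    # Unweighted mean across the kept stations, for every year they report.
    all_years = sorted({year for s in kept for year in by_station[s]})
    averages = {}
    for year in all_years:
        temps = [by_station[s][year] for s in kept if year in by_station[s]]
        averages[year] = sum(temps) / len(temps)

    return kept, averages
```

Comparing the resulting series with the average over all stations gives a rough indication of how much the change in the station sample matters, though it does not address whether the continuous subset is itself representative.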
Furthermore, Hansen et al give a detailed explanation of how their graph was constructed here:
There is nothing wrong with my graph--the same counts can be seen here.
[Why didn't you do the simple test I just did?]
For the reasons set out in my email. However, I have a paper forthcoming in Climate Research, which was presented at the January American Meteorological Society meetings, that implements a more thorough set of tests.
[Why are you raising objections that were addressed in H&R?]
I do not claim that adjustments are not being made, only that there is no formal test of their adequacy. In denying this you are making a stronger claim than H&R do, in either their '99 paper or their updated analysis in '01.
[the same counts can be seen here]
Well, they're wrong as well (assuming that they are from the GHCN dataset). You even reproduce the graph from Peterson that shows that the peak was 6,000 as figure 2 in your paper.
I do not find your reasons for not doing the test persuasive. However, I would like to present your side in my post. Is it OK if I quote from your email?
Thank you for checking first on quoting the emails. You may do so; my preference would be that you reproduce the whole correspondence unedited.