Evan Jones speaks

Just over a month ago, RP Sr wrote a ludicrous post [WebCite] about the "game-changing" Watts Et al 2012 (I've just realised quite how illiterate that is, too: Et shouldn't be capitalised, and "al" is spelt "al.". McNider Et Al even gets the Al capitalised). I was less impressed; indeed, disappointed. Everyone took the piss out of RP, quite deservedly. It didn't take long for those who read it carefully (not me, sorry) to find flaws in the paper; VV is one such. There were promises of updates (or so people tell me) but it all seems quiet. However, perhaps unbeknownst to many, Evan Jones [*] has made some extensive comments here discussing the paper, and ongoing work. So, if you're interested, I recommend reading why-wattss-new-paper-is-doomed-to-fail-review/#comments.

For myself, I think I'd like to see the revised version of the paper before I commit more time to working out what it's about and whether it's of any importance. I remain dubious that the time-varying bias has been demonstrated, or that the difference in methodology and results from previous work has been justified, or that the comparison with the satellite record has been addressed.

Since we're talking about comments, I'd like to say that (a) the comments have been very interesting recently, thank you; but (b) please extend courtesy to those visiting, perhaps in particular those you disagree with. This will be enforced.

[*] Or someone claiming to be him. But I've no reason to believe it isn't EJ.

Refs

* Why Watts’s new paper is doomed to fail review
* Watts disappoints
* Virtual water from Eli
* DA pulls some of my words from a comment
* xkcd on Muller


Er, why say please then?

[It's the familiar iron fist in the velvet glove. Better than the iron fist in the rubber glove -W]

By Steve Bloom (not verified) on 01 Sep 2012 #permalink

Velvet gets dirty and wet when used for those purposes.

By Eli Rabett (not verified) on 01 Sep 2012 #permalink

The short version: Leroy is gray lit. That's OK if the standards are used appropriately, but these guys aren't. If they want to use it the way they are, they need to do the science to back it up. That means going out and systematically measuring potential biases in the real world (*lots* of work) and only then using the outcome to QC the USHCN network. It does *not* mean assuming that Leroy's biases are valid for a given station based just on photos.

[There does seem to be a very heavy reliance on the overwhelming virtues of the new Leroy -W]

By Steve Bloom (not verified) on 01 Sep 2012 #permalink

The expression et al. is an abbreviation for the Latin et alia, which, being from a foreign language, should in English be written in italics, although this is not done for common Latin abbreviations, e.g., viz., i.e., etc.

By Alastair McDonald (not verified) on 02 Sep 2012 #permalink

Alastair, shouldn't that be "... for common Latin abbreviations, e.g., e.g., viz., i.e., etc., etc."? ;)

Color me dense, but I fail to see how an increasing trend could be induced by a constant offset in a value. Please explain the physics here.

By Rattus Norvegicus (not verified) on 02 Sep 2012 #permalink

Another bit of denseness, but Jones seems to be claiming that they have found that (for some class of stations) Tmin is increasing faster than Tmax. Doesn't basic greenhouse theory say that this should be the case? Is Jones really claiming that rediscovering the wheel is earthshaking?

By Rattus Norvegicus (not verified) on 02 Sep 2012 #permalink

JBL,

Nope :-)

But thanks for explaining my little joke!

Cheers, Alastair.

By Alastair McDonald (not verified) on 03 Sep 2012 #permalink

Rattus Norvegicus,

'Tmin is increasing faster than Tmax. Doesn’t basic greenhouse theory say that this should be the case?'

This seems to be a common belief and is broadly intuitive, but physical models don't back it up.

Stenchikov and Robock (1995) seems to be one of the earliest papers to fully investigate causes behind diurnal trends in terms of the physics. They found very little difference between Tmax and Tmin in their modelled direct longwave response to CO2 forcing (aka the greenhouse effect - if anything, slightly more warming during the day). However, they found the overall no-feedback response produced warming at night and little trend during the day, due to the downward shortwave absorption effect of CO2 rather than the greenhouse effect.

Aerosol forcing has a clearer no-feedback effect on DTR (Diurnal Temperature Range, Tmax - Tmin): absorbing and reflecting downward shortwave radiation to produce a large overall daytime cooling trend. However, they found that the feedbacks from this cooling (water vapour, clouds) cancelled out the DTR decrease.

Taken together, including feedbacks, the CO2 and aerosol changes produced strong enough overall warming (due to CO2) that aerosols and other shortwave effects could reduce Tmax and induce a DTR decrease.

The upshot is that models do generally predict a DTR decrease, but not due to the greenhouse effect. The problem is that GCMs don't seem to produce DTR decreases to the extent observed, even ones with strong negative aerosol forcing. So the age-old question is raised: which is right, models or data?

AIUI, urban heat island effects (I could have written 'AIUI UHI', but thought better of it) are thought to manifest mainly in Tmin measurements. One of the clearest examples I've seen of where this understanding has been applied is the HadCET data. If you subtract Tmin from Tmax there you'll get a flat curve from 1880 to 1980, followed by a DTR increase, whereas if you take the BEST UK data you get a large DTR decrease from 1950 to 1980, followed by an increase to the present. This difference occurs because the HadCET keepers, somewhat arbitrarily (in their own words), apply their UHI adjustment with a 75%/25% ratio in favour of Tmin. If this is the case, theoretically the large observed DTR decrease could indicate a UHI component in the data.

Another possibility that has been tabled is that boundary-layer effects cause more Tmin warming specifically at near-surface height than the "real" rate of global warming.
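As an aside, the HadCET/BEST DTR comparison described above is easy to reproduce in outline. A minimal sketch, assuming you already have annual Tmax and Tmin series in hand; the array names, years and synthetic numbers below are purely illustrative, not the real datasets:

```python
# Minimal sketch: compute DTR = Tmax - Tmin and fit a linear trend over a
# chosen window. The synthetic data below stand in for real HadCET/BEST series.
import numpy as np

def dtr_trend(years, tmax, tmin, start, end):
    """Least-squares DTR trend (degrees C per decade) over [start, end]."""
    years = np.asarray(years, dtype=float)
    dtr = np.asarray(tmax, dtype=float) - np.asarray(tmin, dtype=float)
    mask = (years >= start) & (years <= end)
    slope, _ = np.polyfit(years[mask], dtr[mask], 1)  # degrees C per year
    return slope * 10.0

# Invented annual series, only to show the call pattern:
years = np.arange(1880, 2011)
tmax = 14.0 + 0.005 * (years - 1880) + np.random.normal(0, 0.3, years.size)
tmin = 6.0 + 0.007 * (years - 1880) + np.random.normal(0, 0.3, years.size)
print(dtr_trend(years, tmax, tmin, 1880, 1980))
print(dtr_trend(years, tmax, tmin, 1980, 2010))
```

Run against the actual HadCET and BEST UK Tmax/Tmin series, this would show whether the flat-then-rising versus falling-then-rising DTR shapes really diverge as described.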

First: I think it is great that Evan Jones engaged so thoroughly on the previous post.

Second: When I skimmed the Watts et al. draft the first time, one thing I noted was that arguments regarding Leroy 1999 vs. 2010 were almost a red herring next to what seemed to me to be the more important difference between Watts and Fall, which was the TOBS siting... and I thought to myself that there is an easy way to check how much difference the Leroy classification makes versus the methodology: so, I see the discussion of the update that gets around TOBS, and I wonder if this check has been made.

To wit: run the Watts et al. methodology using Leroy 2010, and then run it again using Leroy 1999. Compare the two graphs. This tells you how much the Leroy change actually matters. If that change is small, well, then we know to scrutinize the rest of the methodology. If the change is big, then we know that the classification method used is actually important, and we can think about that. But until this fairly simple experiment is done, I see the "Leroy 2010!" justification as still a red herring...
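A rough sketch of the check being proposed here, assuming one has per-station temperature series and the two alternative class assignments; the function and variable names are placeholders of mine, not the paper's actual code or data:

```python
# Hedged sketch: compare mean temperature trends per siting class under two
# alternative classifications (e.g. Leroy 1999 vs Leroy 2010) of the same
# stations. All inputs are hypothetical placeholders.
import numpy as np

def station_trend(years, temps):
    """Linear trend in degrees C per decade for one station."""
    slope, _ = np.polyfit(np.asarray(years, float), np.asarray(temps, float), 1)
    return slope * 10.0

def mean_trend_by_class(stations, classification):
    """Average trend for each class under the given station -> class mapping."""
    trends = {}
    for station_id, (years, temps) in stations.items():
        cls = classification[station_id]
        trends.setdefault(cls, []).append(station_trend(years, temps))
    return {cls: float(np.mean(vals)) for cls, vals in trends.items()}

# stations: {id: (years, temps)}; leroy1999 / leroy2010: {id: class 1..5}
# print(mean_trend_by_class(stations, leroy1999))
# print(mean_trend_by_class(stations, leroy2010))
```

If the two printed dictionaries are nearly the same, the classification scheme is not what drives the result; if they differ substantially, it is.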

y'all ought to be engaging evan on the proper thread ...

Okay, so I am late to the show again.

That’s OK if the standards are used appropriately, but these guys aren’t. If they want to use it the way they are, they need to do the science to back it up. That means going out and systematically measuring potential biases in the real world (*lots* of work) and only then using the outcome to QC the USHCN network. It does *not* mean assuming that Leroy’s biases are valid for a given station based just on photos.

Yet Menne et al. (2010), Fall et al. (2011) and BEST all rely on (my) ratings, using Leroy (1999), and those are not held to be inadequate. The former two passed peer review and the latter was not rejected on those grounds.

If Leroy is "gray literature", its use as a basis for Meteo France and USCRN siting and its endorsement by WMO certainly leave on the whiter shade of pale side.

"Proper use" of Leroy (either 1999 or 2010) does not involve going out and redoing Leroy's study. Proper use means separating out warming and cooling factors without conflating them. In both studies (Fall and Watts) we consider the single variable of proximity to heat sink/source.

We are not attempting to determine what Leroy is attempting to demonstrate. Leroy is determining offset. Leroy's premise that nearby reflective surfaces (etc.) can affect the thermometer reading is neither controversial nor currently disputed.

We, on the other hand, are not concerned at all with offset. We are looking at trend, and trend, only. Furthermore, as in Fall, et al., we are not attempting to determine tactical cause. We are merely making observations.

Finally, we do not rely only on photos. Photos alone would be completely inadequate to determine heat sink proximity within radii up to 100 m. For that, we use satellite imagery.

By Evan Jones (not verified) on 02 Oct 2012 #permalink
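For what it's worth, here is a minimal sketch of how a proximity check from satellite-derived coordinates might be automated. The haversine formula is standard; the coordinates and the distance bins below are illustrative placeholders only, not Leroy's published criteria or the survey team's actual procedure:

```python
# Illustrative sketch only: given a station's coordinates and the coordinates
# of nearby heat sinks read off satellite imagery, find the nearest sink and
# bin it by distance. Thresholds are placeholders, NOT Leroy's criteria.
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(a))

def nearest_sink_distance(station, sinks):
    """station: (lat, lon); sinks: iterable of (lat, lon). Returns metres."""
    return min(haversine_m(station[0], station[1], s[0], s[1]) for s in sinks)

def distance_bin(d_metres):
    """Coarse distance bins; purely illustrative, not a Leroy rating."""
    if d_metres >= 100:
        return "no sink within 100 m"
    if d_metres >= 30:
        return "nearest sink 30-100 m"
    if d_metres >= 10:
        return "nearest sink 10-30 m"
    return "nearest sink under 10 m"

# Hypothetical usage:
# d = nearest_sink_distance((40.7128, -74.0060), [(40.7130, -74.0058)])
# print(round(d), "m ->", distance_bin(d))
```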

Color me dense, but I fail to see how an increasing trend could be induced by a constant offset in a value. Please explain the physics here.

We suspect that the excess heat released at Tmin time causes a (slightly) geometric rise in trend as the overall climate warms -- up to a certain point (i.e., Class 5), when the effect of omnipresent heat sink/waste heat begins to overwhelm the sensor. When the sensor is overwhelmed, it shows a higher trend than Class 1/2, but a lower trend than Class 4 (or even 3).

The other side of this coin is that if poorly sited stations warm faster during a warming phase, they would be expected to cool faster during a cooling phase in the spirit of "What goes up, must come down".

By Evan Jones (not verified) on 02 Oct 2012 #permalink
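To make the "overwhelmed sensor" idea concrete, here is a purely illustrative toy model (my own construction, not Evan's analysis or the paper's method): the measured series is a weighted mix of an amplified ambient signal and a nearly constant waste-heat source, with the amplification and weights invented so that the class ordering matches the observation he describes:

```python
# Toy model only; all numbers invented. A heat sink that re-releases heat in
# proportion to the ambient anomaly amplifies the measured trend; a dominant,
# nearly constant waste-heat source partially masks it, leaving the trend
# above Class 1/2 but below Class 3/4.
import numpy as np

years = np.arange(1979, 2009)
ambient = 0.02 * (years - years[0])          # "true" warming: 0.2 C/decade

def trend_per_decade(series):
    return np.polyfit(years, series, 1)[0] * 10.0

class12 = ambient                             # well sited: tracks ambient
class34 = 1.5 * ambient                       # heat sink amplifies the anomaly
w = 0.25                                      # weight of the waste-heat source
class5 = (1 - w) * 1.5 * ambient + w * 0.1 * ambient  # signal partially masked

for name, series in [("Class 1/2", class12),
                     ("Class 3/4", class34),
                     ("Class 5", class5)]:
    print(name, round(trend_per_decade(series), 2), "C/decade")
# Prints roughly 0.2, 0.3 and 0.23 C/decade respectively.
```

The arithmetic is the point: a mixing weight w mutes the measured trend by roughly a factor of (1 - w), which is one way a dominant artificial source could dampen the trend without any individual reading being inaccurate.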

MMM: We have indeed dealt with TOBS. (And MMTS conversion.) I did a quick run of the numbers using Leroy 1999 ratings and the results were much the same as in Fall, et al. Using Leroy 2010 made a whole lot of difference (much to our surprise, I might add).

By Evan Jones (not verified) on 02 Oct 2012 #permalink

This seems to be a common belief and is broadly intuitive, but physical models don't back it up.

Interestingly, we find a greater Tmin trend effect in urban areas than in non-urban.

By Evan Jones (not verified) on 02 Oct 2012 #permalink

Interestingly, we find a greater Tmin trend effect in urban areas than in non-urban.

That would be expected according to mainstream understanding.

When the sensor is overwhelmed, it shows a higher trend than Class 1/2, but a lower trend than Class 4 (or even 3).

In what sense would a sensor be overwhelmed? Are you suggesting some thermometers won't respond in a predictable fashion beyond a certain threshold temperature? Have you tried performing/reading up on laboratory tests of the equipment in question?

Are you suggesting some thermometers won’t respond in a predictable fashion beyond a certain threshold temperature?

Not at all; I am sure the raw, absolute readings are quite accurate at whatever temperature level is at issue. We are not talking about absolute readings, however; we are talking about the effect on trend.

I am suggesting that an overwhelming presence of artificial heat overwhelms and dampens the trend compared with the trend of Class 3 or 4 stations.

This is quite predictable and holds true no matter how we subdivide the data sample (i.e., by mesosite or equipment).

The fact that Class 4 stations begin to show this tendency in urban areas is consistent with this thesis. It appears that when siting is sufficiently poor, we get results that are higher than for the well sited stations but lower than for stations that are badly sited, but less badly so.

This is not an effect we predicted, but we observe it consistently throughout our mix of data.

By Evan Jones (not verified) on 02 Oct 2012 #permalink

"Using Leroy 2010 made a whole lot of difference (much to our surprise, I might add).

Pointer, please, to the data upon which Leroy (2010) is based.

"Leroy’s premise that nearby reflective surfaces (etc.) can affect the thermometer reading is neither controversial nor currently disputed."

The issues are magnitude of the effect and potential inappropriate grouping of disparate influences.

That one can pass peer review with a paper that discusses the effect of applying Leroy's standards does nothing to validate the standards themselves, BTW.

As far as the WMO's adoption of them goes, I would suggest to you that they're fine on a seat-of-the-pants basis, but should not be taken further than that, as you're trying to do.

By Steve Bloom (not verified) on 03 Oct 2012 #permalink

Not at all; I am sure the raw, absolute readings are quite accurate at whatever temperature level is at issue. We are not talking about absolute readings, however; we are talking about the effect on trend.

But the trend at a particular station is produced by a series of absolute readings, so how can the trend be dampened if the most recent absolute readings aren't?

The issues are magnitude of the effect and potential inappropriate grouping of disparate influences.

Yes. That is why we use -- only -- the proximity influence. The accuracy or inaccuracy of the offset magnitude is irrelevant, since we are not using those numbers or even addressing absolute offsets in the first place.

A trend change in either direction, or no trend change at all, would neither validate nor invalidate Leroy, who has nothing to say either way about trend.

But the trend at a particular station is produced by a series of absolute readings, so how can the trend be dampened if the most recent absolute readings aren’t?

Actually, overall, a very poorly sited station's trend is increased compared with good siting. But the increase is not as large as for moderately bad siting. That is the observation. It does not seem particularly counterintuitive to me that a large amount of waste heat could partially mask a small outside increase.

And, as I say, that is the observation. This paper (like Fall, et al.) is about observations.

By Evan Jones (not verified) on 03 Oct 2012 #permalink

And, as I say, that is the observation. This paper (like Fall, et al.) is about observations.

Except that's not really the case. The draft manuscript of the new paper made some very strong conclusions. You've proposed a conclusion above concerning this specific matter, or at least suggested a possible conclusion:

I am suggesting that an overwhelming presence of artificial heat overwhelms and dampens the trend compared with the trend of Class 3 or 4 stations.

That is not an observation. What I'm saying is that this suggestion, as you've formulated it above, doesn't seem to make sense. Can you provide an example to illustrate how the trend can be overwhelmed and dampened without absolute readings having been affected?

If you were to suggest that the sensors themselves are behaving non-linearly in response to different temperature thresholds that would be a testable hypothesis. What you've said about the sensor behaving as expected but somehow producing a dampened trend... well, I can't see how that's not ascribing supernatural causation.

Except that’s not really the case. The draft manuscript of the new paper made some very strong conclusions. You’ve proposed a conclusion above concerning this specific matter, or at least suggested a possible conclusion:

We observe that trend is higher for Class 4 than for Class 5. What defines Class 5 is overwhelming exposure to heat sinks/waste heat. We find this pervasive no matter what data sample is used.

We also observe that Class 4 stations in urban areas tend to behave like Class 5 stations in rural areas.

Furthermore, differences between well and poorly sited stations are greater in rural than in urban areas (although there is still quite a difference in urban areas).

These findings are consistent with the hypothesis that the most poorly sited station trends are being partially overwhelmed. They still show a higher trend than well sited stations, but not as high as the Class 3/4 range.

If your conclusions would be any different from mine, I would be interested to see them.

Let us say you exposed a station to, oh, 1000 degrees C. I suspect a change in climate of 1 degree might not show up in the readings. I think this is what may be happening when manmade heat influences reach perhaps 5+ degrees C.

But what is primary is the observation. I think a study to explain the reason for these observations would be interesting. But it is not the goal of the paper.

That is not an observation. What I’m saying is that this suggestion, as you’ve formulated it above, doesn’t seem to make sense. Can you provide an example to illustrate how the trend can be overwhelmed and dampened without absolute readings having been affected?

That is meaningless. I think you are misunderstanding what I have been saying. I have tried to re-explain above. If this doesn't make sense to you I would be interested to know why it doesn't, and what you would propose as an alternative explanation.

[The question makes sense to me. I'm unable to understand how you're unable to understand it. Perhaps you didn't really mean your original "without absolute readings having been affected?" -W]

By Evan Jones (not verified) on 03 Oct 2012 #permalink

Evan,

In an earlier post you said:

Not at all; I am sure the raw, absolute readings are quite accurate at whatever temperature level is at issue. We are not talking about absolute readings, however; we are talking about the effect on trend.

In your latest post you say:

Let us say you exposed a station to, oh, 1000 degrees C. I suspect a change in climate of 1 degree might not show up in the readings. I think this is what may be happening when manmade heat influences reach perhaps 5+ degrees C.

Can you not see these two statements are incompatible?

I can see that a strong outside constant (in this case, large amounts of waste heat) can mute a trend. And it is supported by the data.

By Evan Jones (not verified) on 17 Oct 2012 #permalink