Why Watts's new paper is doomed to fail review

Anyway, I haven’t got to the science yet.

My snarky side can't resist saying ... "despite having reached the end?"

It's "it's 'Watts (2009)'...."

Sorry. Couldn't resist!

BEST has published? All I see from looking around Google News is a bit of controversy about how they started publicizing their results before the paper went through the full review process (e.g. http://blogs.nature.com/news/2012/07/amid-criticism-berkeley-earth-exte…)

[Not published, no. But they claim "The Berkeley Earth team has submitted a fifth paper for publication; it has received journal peer review, and we are now posting it for a wider peer review from the scientific community" is available at http://berkeleyearth.org/beta/results-paper-july-8.pdf . But the link, for me, requires sign-in.

And note, still, that weird woo hand-wavy "has received journal peer review". WTF is that supposed to mean? If its been reviewed, and is now accepted, they would say so. If its been reviewed, and requires revision, they should say so -W]

Per the commenting on the WUWT Press Release thread, I have the impression that Evan Jones is the principle editor of the paper. The paper is nowhere near ready for submission, IMHO.

You can cite pretty much anything you like, peer-reviewed or not, as long as it's publicly accessible.

"Technical reports", in particular, are non-reviewed documents produced with the imprimatur of an institution and offered for the world to see. Some of them are pretty widely cited. And yes, people do cite their own technical reports (especially MIT people).

I guess citing his Heartland stuff would fall in that category, so I don't think it would necessarily "doom" it to fail review.

[I'm dubious. Journals can be fairly sniffy about what they allow you to cite - just being available isn't enough. An MIT technical report, maybe - but that's rather different to Heartland -W]

Do you have a link to the paper? Sometimes these things are useful in teaching -- give an anonymized version to the students and see if they can spot the silliness.

Perhaps I should steel myself with a shot of good scotch before reading it. Or maybe just have the scotch, and not bother...

[http://wattsupwiththat.files.wordpress.com/2012/07/watts-et-al_2012_dis… - I'll add that to the post, too -W]

William, try http://berkeleyearth.org/pdf/results-paper-july-8.pdf

[Thanks, that works. Those bozos should sort out their web site -W]

It is my impression that they use a lot of "raw averages" for their comparisons. I assume they are simply averaging over all existing stations without gridding or area weighting. I have explicitly asked for clarification at WUWT but did not get a response. This kind of averaging does not give you any meaningful results as changes in the station distribution will change the average.

Furthermore they compare the class 1,2 trend and class 3,4,5 class trend of the unhomogenized data to the NOAA trend calculated from the homogenized data including corrections for time of obeservation bias etc. It's barely discernible in the paper and omitted in the press release. Watts does seem to think that the completely unadjusted data is the correct temperature to use; when he compares his trend to the satellite data he argues that tropospheric amplification is up to 1.4 in some models and he needs 1.5. So without mentioning it Watts questions the vailidity of each and any adjustement done to NOAA final data, be it TOBS, SHAP or whatever. No discussion for this, just sneaking this in.

Given the nonsense above it's almost a minor point that all trends are listed to three significant digits, without the slightest indication of error ranges or significance levels.

Eschenbach 2011: So, despite a promise of transparency, to great fanfare BEST has released four pre-prints, based on admittedly “buggy” data, without the accompanying code or data to back them up.
http://wattsupwiththat.com/2011/11/01/pre-prints-and-pre-data/

I think that pretty much sums up last Sunday.

h/t PDA @ lucia's

Oh, so someone threw Eschenbach's denial of Mosher's claim back in his face,eh?

(You'll have to read the thread over there to understand.)

FWIW, Eli has been softly hinting about that a 2008 photograph tells you nothing about the state of anything in 1980. Spread the koan:)

Eli Rabett,

To be fair, if you know the date of the last station move, its not useless. Unfortunately I don't think Watts et al have gotten that far in their analysis yet.

So Watts basic argument is ...

We have a set of rather recent digital photographs wherein you will see many ponies.

And the cart before the pony ploy ...

"Until I came along with Watts 2009, they really weren’t looking closely at the issue. The SurfaceStations photography forced them into reaction mode, to do two things."

But right below that Tony the Tiger states ...

"Additionally, if they think they can get good data out of these stations with the myriad of adjustments they perform, why did they need to spend millions of dollars on the new Climate Reference Network commissioned in 2008 that we never hear about?"

Methinks 2008 comes before 2009. But wait, from the "we have this department"...

"The U.S. Climate Reference Network (USCRN) ... Experimental stations have been located in Alaska since 2002 and Hawaii since 2005, providing network experience in polar and tropical regions. ..."

So long before there ever was a WTFUWT? blog there was NOAA doing the right things to begin with in the first plase.

Somehow this episode of Tony the Tiger reminds me of that South Park episode ...

http://en.wikipedia.org/wiki/Dum_dum_dum_dum_dum

NOAA ... "Smart, smart, smart, smart, smart,.."

Tony the Tiger... "Dum, dum, dum, dum, dum..."

Somehow a set of recent digital photographs does not make one a subject matter expert in field measurements.

Eli writes "FWIW, Eli has been softly hinting about that a 2008 photograph tells you nothing about the state of anything in 1980"

As I understand it Watts et al took the unprecedented step of actually talking to people about the history of the stations :-P

Besides I would imagine it to be rare that sites become better over time, rather I would expect the norm to be more bitumen and concrete added in the immediate area over time so if its bad in the photo now and you dont have a good history of the station its safest to simply downgrade it.

This is selection criteria independent of the temperatures recorded at the station itself and that gives their method some validity (cf tree ring selection criteria)

Let's see, the title of this post is "Why Watts’s new paper is doomed to fail review" and then the (summarized) content of this post is: "Anyway, I haven’t got to the science yet."
So the conclusion is settled, now you only have to find the arguments?
Oh well, that just means I don't have to visit this site anymore.

[I'm not sure if you've heard of a concept we English have, its called "humour". You might want to investigate this concept some time, we find it makes life more enjoyable.

As to the substance: I was trying to gently hint that the paper seems to contain some newbie flaws that would require revision at the very least; and which suggest that Christy, although nominally an author, hasn't actually read it -W]

Check out orlowskis latest two articles on the register. One promoting the watts paper and the other attacking mullers. Apparently miller is bad for doing science by press release and not releasing data immediately. There's a very amusing paragraph critisizing journalists for promoting non-peer reviewed papers!

I left a comment questioning the different approach orlowski took to each paper, but so far it hasn't been published. I notice the article has just one comment and its marked as removed by the moderator....I wonder

"suggest that Christy, although nominally an author, hasn't actually read it -W"

Do you think that Watts has read it?

The WUWT headline is interesting: New study shows half the global warming in the USA is artificial.

First parsing 'global warming in the USA' is tricky. Parsing 'half being artificial' is easier - half wouldn't have happened but for humans.

Muller reckons almost all the recent land surface warming is artificial while Watts reckons only half of the "global / USA" warming is artificial.

(In three days working full time on this you'd think he could have come up with a different headline.)

Well, Tony missed something important

Though it uses a lot of words to get there, my (brief) reading of the paper suggests the synopsis is that they have two collections of stations, one with a significantly lower trend than the other....And that's pretty much it.

So, the question becomes: at this stage, is that enough? I suspect it could be given that the station collections have been divided on the basis of objective quality ratings (even if that rating system has been applied by Watts & McIntyre). However, it would need some major revisions. For one, it rambles on for pages about things which could be summed up in a few lines. It's far too long for what it actually says.
The paper also arrives at strong conclusions based on effectively zero evidence, such as 'These factors, combined with station siting issues, have led to a spurious doubling of U.S. mean temperature trends in the 30 year data period covered by the study from 1979 - 2008.' Since they haven't actually investigated the adjustment factors (TOBS, SHAPS) which lead to the greater trend they have no basis for this statement.

Here's an interesting paragraph from the paper: 'By way of comparison, the University of Alabama Huntsville (UAH) Lower Troposphere CONUS trend over this period is 0.25°C/decade and Remote Sensing Systems (RSS) has 0.23°C/decade, the average being 0.24°C/decade. This provides an upper bound for the surface temperature since the upper air is supposed to have larger trends than the surface (e.g. see Klotzbach et al (2011). Therefore, the surface temperatures should display some fraction of that 0.24°C/decade trend. Depending on the amplification factor used, which for some models ranges from 1.1 to 1.4, the surface trend would calculate to be in the range of 0.17 to 0.22, which is close to the 0.155°C/decade trend seen in the compliant Class 1&2 stations. '

This is wrong of course - the 1.1 - 1.4 figure would relate to global land+ocean TLT amplification. The global land relationship in CMIP3 models centres on a 1:1 relationship (0.8 to 1.2 between 1979 and 2005, 0.9 to 1.1 between 2010 and 2100), and land temperatures are what is at issue. However, the interesting thing is that McIntyre knows it's wrong, because Gavin pointed out his minunderstanding in a climateaudit thread last November.

Also, the CONUS surface-TLT amplification factor in models may well be different from the global average.

Link fail: climateaudit.org/2011/11/07/un-muddying-the-waters/

Though it uses a lot of words to get there, my (brief) reading of the paper suggests the synopsis is that they have two collections of stations, one with a significantly lower trend than the other….And that’s pretty much it.

That's my reading too. The paper can basically be summarised as 'Biases exist. We didn't bother to check if they matter.'

We know of at least one "journal" that will publish this "paper" as is.

E&E

The peer review will be done by Moe, Larry and Curly.

Turnaround from initial submittal to final publication will be immediate if not sooner.

FWIW McIntyre seems to have been surprised at his inclusion as an author. He is also being very non-committal about the paper, pointing out some of the more common observations (e.g. the TOBS confound problem discussed by Eli).
http://climateaudit.org/2012/07/31/surface-stations/

[That's interesting. McI is distinctly non-commital there. I notice he is keen to avoid having to look at the satellites; but as he says (as everyone says) once you've done that, there isn't much scope for the surface temperature record being very wrong -W]

McI:

When I had done my own initial assessment of this a few years ago, I had used TOBS versions and am annoyed with myself for not properly considering this factor. I should have noticed it immediately. That will teach me to keep to my practices of not rushing.

More than non-commital, he's saying he screwed up by not noticing that the paper he was helping on ignored TOBS issues.

Coming from McIntyre, even a tiny acknowledgement of having screwed up is absolutely huge. It's obvious that he just helped clean up a few details of the analysis without bothering to read the entire paper for comprehension ...

More than non-commital, he’s saying he screwed up by not noticing that the paper he was helping on ignored TOBS issues.

He's also putting only a very thin veneer over the underlying fact that the paper itself is really only fit for use in the smallest room in the house - and I don't mean for reading.

It's popcorn time, by all appearances.

I quite like this comment on CA http://climateaudit.org/2012/07/31/surface-stations/#comment-345330

"The fact that a statistician was brought in over the weekend to finish a paper does not reflect well on the entire effort. A scientific paper is not a homework assignment"

"Here’s an interesting paragraph from the paper: ‘By way of comparison, the University of Alabama Huntsville (UAH) Lower Troposphere CONUS trend over this period is 0.25°C/decade and Remote Sensing Systems (RSS) has 0.23°C/decade, the average being 0.24°C/decade. This provides an upper bound for the surface temperature since the upper air is supposed to have larger trends than the surface (e.g. see Klotzbach et al (2011). Therefore, the surface temperatures should display some fraction of that 0.24°C/decade trend. Depending on the amplification factor used, which for some models ranges from 1.1 to 1.4, the surface trend would calculate to be in the range of 0.17 to 0.22, which is close to the 0.155°C/decade trend seen in the compliant Class 1&2 stations. ‘"

Yes notice that this adjustment was made but not TOBS

Emerges, not arises, after all these things are audited.

McI is distinctly non-commital there.

To McIntyre's credit, he is now saying that he totally screwed up in not catching the fact that Watts ignores TOBS issues, is extremely annoyed at himself for having done so (and presumably allowing himself to be rushed into a quick round of help over a short weekend), and has every intention of making sure that TOBS issues are properly addressed, even if it means "redoing the statistics from the ground up".

Now, we'll see where this leads, but so far it's promising.

And the possibility of an interesting clash between Watts and McIntyre is promising, too, because I just can't imagine Watts letting go of his "50% of the warming is spurious" conclusion.

Has Christy said anything about the paper yet? Or is he also "surprised" to see himself listed as an author too?

Eli arising / emerging from the monitor reminds me of Wolfe's Long Sun tetralogy. Probably not what you had in mind at all, but I shall risk ridicule :-)

[You are exactly correct. http://mustelid.blogspot.co.uk/2005/04/i-dwindle-go-unnoticed-now.html -W]

When will we (if ever) see a list of station ID's (with numbered classifications) used in this, err, missive, cast down to Earth by the Holey Church of the Eternal Denier?

Just asking for the the list of station ID's and nothing else.

I don't need to see literally thousands of very recent 2D digital pictures of ponies, thank you very much. I don't need to see their overly verbose excuses for siting selection. And I don't need to see KML files for Google Earth. By the time they put that all together, the text alone, will be bigger than The Bible.

A simple two colume ASCII file, Column 1 = station ID, Column 2 = Classification (an integer between 1 and 5).

Simple enough data request. Watts Up With That?

So now RP, SR. is telling all the world that the "game changer" paper by you know is "[to be submitted to JGR]"

But, as usual, RP, Sr. just can't help but reference one of his own "game changer" literary works.

No denial yet (as to where it will be submitted) from AW and Co. as they have a post up pointing directly to RP, Sr. "game changing" essay.

JGR? Seriously? Are they trying to get soundly rejected on purpose? Meaning the "as is" paper. IMHO I do think so!

Zeke Hausfather has written what is probably the best possible review of Watt's draft given available information (or lack thereof) over at Lucia's.

You may be surprised to learn that his assessment is not exactly over-enthusiastic. :p

[You broke your HTML. I think you mean http://rankexploits.com/musings/2012/initial-thoughts-on-the-watts-et-a… -W]

Not only is Pielke Sr describing this as a game changer, he says:

Anthony has led what is a critically important assessment of the issue of station quality. Indeed, this type of analysis should have been performed by Tom Karl and Tom Peterson at NCDC, Jim Hansen at GISS and Phil Jones at the University of East Anglia (and Richard Muller). However, they apparently liked their answers and did not want to test the robustness of their findings.

Which is kind of ironic in view of the fact that even one of the authors (McIntyre) doesn't think the stats have been done proerly (yet at least). In other words, Roger is happy to blindly accept something that supports his viewpoint.

I remember when Pielke Sr was an interesting voice (and obviously has had a great career). Not one I necessarily agreed with, but generally thought provoking. In the last couple of years or so, however, he's become an utter embarrassment.

[RP Sr has drifted off. He's been like that for a while. Its a Linden thing, really. He can't bear to be part of the mainstream, and he feels personally slighted because, despite his relentless and shameless self-promotion, his work isn't widely recognised. Probably, he is now being outrageous in order to get a quote into the limelight of NYT -W]

Well now Roger thr Dodger does a walk back of sorts, all about TOBS now, "game changer" now on hold.

[Ah, you mean http://pielkeclimatesci.wordpress.com/2012/07/31/summary-of-two-game-ch… Yeeeessss... its a bit much (well, for anyone who wants to think they have any credibility) to call the paper a "game changer" complete with big logo, and then suddenly have to say "oh, hold on, actually I haven't read it..." -W]

And as usual he just has to reference himself yet again

So as I see it, this paper by Watts won't even make the IPCC AR5 WG1 submittal deadline (it's already 8/1/2012 over there).

https://www.ipcc-wg1.unibe.ch/AR5/AR5.html

“redoing the statistics from the ground up” rules out looking at anything like borehole data, but I suppose ... well, no, I don't.

"Well now Roger thr Dodger does a walk back of sorts, all about TOBS now, “game changer” now on hold."

Oh, God, that was quick. And he's been a co-project leader of the surface stations project from day one, so for all practical purposes, he's lying.

Wow, crap on a crutch.

So as I see it, this paper by Watts won’t even make the IPCC AR5 WG1 submittal deadline (it’s already 8/1/2012 over there).

Christy's #5 author on the paper, and he's scheduled to testify to the US Senate tomorrow, so I think you might be missing the intended target? Watts stated that he wanted to prime Christy for his testimony ...

It will be interesting if Christy (who I doubt actually read the paper, McI admits he hadn't) plays the Watts card ...

dhogaza,

Very true. I'm sure Inhofe will charge right into it, with the "It's all a cons piracy." Political theatre at it's best.

Just hope for someone else on that panel who points out how deeply flawed it really is, not fit for peer reviewed (or public) consumption. And I do mean forcefully.

It will be interesting to see though, how well he supports his, err Watts, own work.

I don't think Christy will use Watts' report. If he does, he will be openly ridiculed by everyone, touting a paper that within a few days of blog review is already shown to be based on flawed analysis.

What would be really funny if one of the senators (it can only be a Democrat) asks Christy about Watts' report. I'll gladly make popcorn for everyone if that happens!

@Paul S
July 31, 2:12 pm

You should apologize to all people who read that topic at CA. There is nothing more laughable than McIntyre berating Gavin for not writing all his code in R, because that is the high level language everyone uses. The man's inability to see the world in any other terms than what is suits him personally is amazing. Not one of his minions pulls him up on it.

Is Watts seriously trying to get this thing into IPCC? It's likely that if it were submitted for publication in current form it would have a very hard time, not least because the length needs to be cut in half.

I suppose it's a win-win for him. If it gets published, he builds his credibility. If it's rejected, he can wail against the godless left-wing fascist anarchist "gatekeepers."

1) Christy has indeed references the Watts study in his congressional testimony: http://epw.senate.gov/public/index.cfm?FuseAction=Files.View&FileStore_…

2) On bluegrue's point: reading the paper methodology, it seems to me that Anthony has averaged all stations within a given region, and then does an area-weighted averaging of the regions to get a US average. He uses the word "gridded" in his phrase, "gridded, area-weighted mean of the regional averages", but I don't think that word means what he thinks it means, because that phrase makes no logical sense as is. (if it was gridded, there would be no need to have done the regional averaging step first)

The fact that Watts cannot do a real gridded anomaly average after all these years of working in this field is an embarrassment. The TOBS ignorance is just the cherry on the embarrassment cake. (Apparently Watts is now claiming that he doesn't believe in TOBS because he doesn't think that observers actually changed their observation times when NOAA told them to).

Very glad to see someone is taking Watts to task over this nonsense. Goodness knows, I have tried...
http://lackofenvironment.wordpress.com/2012/08/01/muller-knows-best-wat…

MMM: 'Apparently Watts is now claiming that he doesn’t believe in TOBS because he doesn’t think that observers actually changed their observation times when NOAA told them to'

Is that really what he's saying? The BEST methodology for dealing with discontinuities such as TOBS does not use metadata, so doesn't care when anyone was told anything, but produces a final adjusted record remarkably similar to USHCN's. Does he provide any justification?

Ah, I was mistaken. Watts did use, for some of his analysis, a crude gridding routine, dividing the US into 26 six degree grid boxes, and it also appears he used a reasonable anomaly method, so that should address the major non-TOBS issues.

(better gridding methods do exist, but this is a big step up from just regional averaging) (note that Watts ALSO does raw averaging, and it isn't always clear which of his results are raw-average based and which aren't)

Just hope for someone else on that panel who points out how deeply flawed it really is, not fit for peer reviewed (or public) consumption. And I do mean forcefully.

I posted a heads-up over at Climate Progress on the thread regarding the upcoming hearing, but it was already 8:30 PM eastern time when I thought to do so.

I was clearly too slow ... it would've been nice if Romm had been able to prime a dem on the committee ...

Paul S: Here's the exchange I was referring to:

Nick Stokes:
The 2009 BAMS paper of Menne et al has a Fig 3 which shows the trend of observation times that observers actually reported. And Fig 4 shows the resulting effect of a TOBS adjustment on trends, based on the Fig 3 data and the known diurnal and day-to-day variability.

REPLY: Noooo…Fig. 3. Changes in the documented time of observation in the U.S. HCN. is about the times they assigned the observers. There’s no proof the observers adhere to it. – Anthony

Regarding BEST: Yes, I agree that BEST provides confirmation that the TOBS adjustment is reasonable. But for Watts, BEST is project-non-grata.

Harry,

Heh, much of McIntyre's input (probably the whole purpose of the post) on that page amounts to little more than preening for his audience.

McIntyre has pleaded ignorance on the contents of the paper, so my point is irrelevant now anyway. I have no reason to suspect Watts knows anything about TLT/surface temperature trend amplification.

MMM,

Just more Wattsian anecdotal evidence. Not much meat on them there bones

Wattsian logic will take qualification over quantification each and every time.

That most famous of Wattsian informal logical fallacies, the cherry pick.

Look see a pony, an exception to the rule is 100% proof that all other information is equally invalid.

I completely forgot that the testimony of Christy was likely prepared some time ago. It may thus explain Watts' rushjob: it needed to be ready before today.

More eggs on Christy's face.

REPLY: Noooo…Fig. 3. Changes in the documented time of observation in the U.S. HCN. is about the times they assigned the observers. There’s no proof the observers adhere to it. – Anthony

Watts is essentially claiming that most observers LIED ON THEIR DATASHEETS, entering bogus reset times, therefore that bit of metadata can't be trusted.

I don't think that a claim based on the assertion of lying observers will make it far through the review process ...

This is what Christy has to say about Watts paper in his congressional testimony:

"Watts et al. demonstrate that when humans alter the
immediate landscape around the thermometer stations, there is a clear warming signal
due simply to those alterations, especially at night. An even more worrisome result is
that the adjustment procedure for one of the popular surface temperature datasets actually
increases the temperature of the rural (i.e. best) stations to match and even exceed the
more urbanized (i.e. poor) stations. This is a case where it appears the adjustment
process took the spurious warming of the poorer stations and spread it throughout the
entire set of stations and even magnified it. This is ongoing research and bears watching
as other factors as still under investigation, such as changes in the time-of-day readings
were taken, but at this point it helps explain why the surface measurements appear to be
warming more than the deep atmosphere (where the greenhouse effect should appear.)"

So in spite of theTOBS problem, Christy still thinks it is good enough.

Link

Team Tony The Tiger (or T4) dribbles the ball down the court ... T4 takes the shot ... Team Reality Ultimately Endures (or TRUE) massively BLOCKS the shot ... the ball EXPLODES right in T4's face .. TRUE picks up the many shards of the ball off the floor ... dribbles said shards down the court ... massively SLAM DUNKS said shards ... the backboard EXPLODES ... the clock expires ... TRUE remains undefeared.

T4 regroups, after all they do have a pony.

Wow, I can't believe that Christy relied on the Watt's POS in his testimony. Credibility, buh-bye.

Now you see why Watts pushed it out this weekend ...

I watched the Senate hearing this morning. Christy didn't mention the Watts paper in his spoken testimony, but Senator Boxer (D from CA, and chair) absolutely slammed him for referring to it in writing, asking him whether it was peer-reviewed and how could he be relying on one unreviewed paper, she trusted in the many reviewed papers that supported warming. Christy looked like he wanted to crawl somewhere and hide.

Score one for our side!

Got a link to the replay?

Sorry, all I saw was a live stream, I assume they post a video later somewhere? I'd actually like to see it as I missed most of the second part of the hearing on impacts.

Never mind, I found the video at the epw site:

http://epw.senate.gov/public/index.cfm?FuseAction=Hearings.Choose&Heari…

Christy free for all begins around 120:00 or there abouts. Questions from Boxer around 130:00.

RN, did Christy have any credibility remaining prior to this? Don't think so.

By the way, if you are left unimpressed by citing of Heartland literature I wonder how you'll feel about the citation - '(as noted in online discussions at the time)' later in the paper?

Paul S: i'll have you know that the International Journal of Shit I Heard Down The Pub is a highly respected publication.

Arthur,

do you know how many days in advance the written testimony has to be submitted to the senate? Might be an interesting point in regard to the timeline of the fabrication of Watts et al.

And the possibility of an interesting clash between Watts and McIntyre is promising, too, because I just can’t imagine Watts letting go of his “50% of the warming is spurious” conclusion.

Watts runs McIntyre's server.

Andreas - I'm not sure there's a requirement - Boxer actually left open another 2 days for witnesses to submit written materials to be included in the record, so it sounds like it can be submitted even after the hearing itself. But I think the habit is to send it in at least a day early so at least some senate staff can prep questions. Like that one :)

I hadn't remembered that Boxer made that query about Anthony Watts the very first question for the witnesses, it looked very deflating to the contrarians there...

Christy's graphs looked rather dubious too - I wish somebody had asked whether they had been submitted for peer review. It sounds like from the written testimony he artificially reduced the UAH/RSS temperatures in the comparison - very shady thing to do.

"Christy’s graphs looked rather dubious too – I wish somebody had asked whether they had been submitted for peer review. It sounds like from the written testimony he artificially reduced the UAH/RSS temperatures in the comparison – very shady thing to do."

The other shady thing about that graph is the 1979-1983 baseline. 4 year baseline? And not even the first 4 years being examined? Is it coincidence that this 4 year period is warmer than the years immediately before and after?

Judging by the latest crop of comments at WUWT:

"Lucy Skywalker says:
August 2, 2012 at 3:27 pm
I shall be with you in spirit on 18 August ....But ah, I had to take time off to celebrate and support the “protestant reformation” in Science emerging with the historic nailing to the blog door of Watts et al 2012."

It may be time to toss another Lollard on the barbie.

Since Watts et al is not yet submittable in either form or content, it's up in the air if it will ever get to peer review. While the world anxiously awaits Watts' TOBS reanalysis, I wonder if he is at least correct that Leroy 2010 should supersede Leroy 1999 as a criterion for station adjustments.

" I wonder if he is at least correct that Leroy 2010 should supersede Leroy 1999 as a criterion for station adjustments."

Well, digging around a bit, I get this quote:

"[leroy 2010's methodology] endorsed by the World Meteorological Organization (WMO) Commission for Instruments and Methods of Observation (CIMO-XV, 2010) Fifteenth session, in September 2010 as a WMO-ISO standard, making it suitable for reevaluating previous studies on the issue of station siting."

While "endorsed" does not mean "ISO standard" yet (this requires a usually lengthy process among national-level standards organizations and eventually a vote by them, and I doubt the process is complete, though don't know for sure), it's a good sign.

"...the historic nailing to the blog door of Watts et al 2012."

Watts' triumphant revamping of the peer-review and publication processes of science is also being compared to the invention of the printing press by Gütenberg.

The comparison having first been made by Watts himself ...

Oh, ugh, that quote regarding "endorsed by the WMO" actually came from a re-post of Watts PR on his paper.

But I think it's probably accurate.

dhogaza,

The irony of Wattsian logic is truly amazing.

So Watts has never ever heard of ArXiv?

http://en.wikipedia.org/wiki/ArXiv

"It started in August 1991 as a repository for preprints in physics and later expanded to include astronomy, mathematics, computer science, nonlinear science, quantitative biology and, most recently, statistics."

I've never been quite sure that all ArXiv articles have been submitted to at least one peer reviewed journal though. But some real duds do appear to make it through.

But on the other hand, Watts most surely is not the first person to post a "paper" for review purposes prior to submission either.

And even if this paper is never submitted in any form to any journal, what Watts is suggesting is no formalities are necessary for publishing works of art on the internets. Which kind of goes without saying, as this form of internet "publishing" has been happening ever since there was the internet to begin with in the first place.

The difference here with Watts, is in his completely closed minded approach taken to the many people who have pointed out the rather serious flaws in this paper. He definitely isn't playing with a full deck of cards (on purpose), if you know what I mean.

Gütenberg or Galileo or Einstein, gosh there's a lot of famous people out there, which one will Watts choose to be tomorrow?

In the spirit of Carl Sagan, I would suggest Bozo the Clown.

Dhogaza, it's way past "serious flaws" and into "erroneous premise" territory. IOW there's no salvaging *this* paper. The paper that there's material for, one discussing the differences between Leroy 1998 and Leroy 2010, would be a different paper entirely. Or maybe, if he wanted to do more (and more subtle) work than he's capable of, he could examine whether the Leroy 2010 standards are valid by analyzing whether particular influences that affect the Leroy class are reflected in the USHCN record. The latter might even be kind of interesting, but it would also involve being very careful in the application of the Leroy standards. (Actually I suspect field work to take direct measurements of the influences of heat sources and sinks would be needed to do such a thing correctly, another reason why it won't happen.)

And I didn't want to let the thread end without it being pointed out as baldly as possible what an utter fool Watts was to imagine that this would not blow up in his face instantly.

Re the Leroy 2010 standards, the key point is that applying them is more art than science. A too-strict interpretation of e.g. the shadowing standard could demote an otherwise Class 1 station to Class 4. In the hands of a scientist, that's not a problem, since it would be understood how easily an incorrect class can be assigned. In the hands of Watts, it's guaranteed to be one.

BTW, I see in his withdrawal statement that Watts says he won't post the revision, but will just go ahead and submit it for publication. That will make it a lot easier just to slip the whole thing down the memory hole.

Re your famous person question, I vote for Helen Keller. :)

[The Leroy '10 stuff is interesting. I'm not familiar with this at all. Watts is very keen on it, but of course only because it produces a result he likes. If he liked Leroy '9, then of course *that* would be the one faithful standard and the '10 would be a modern corruption - see all the endless junk about "IPCC used to endorse the MWP".

But I haven't really seen anyone else commenting much on Lereoy versions -W]

"I wonder if he is at least correct that Leroy 2010 should supersede Leroy 1999 as a criterion for station adjustments."

To be clear on this, the 1998 standards were for siting only. For years right up until this recent draft, Watts had claimed otherwise, but note that the draft admits that they were never appropriate for adjustments (actually not adjustments, BTW, but addition of error bars). The 2010 standards claim to be suitable for adding error bars.

Now that I think about it, the fact that it's error bars and not adjustments is rather important. In effect, the former makes no claim that there really is an error. One would then want to examine the station record to see if there really is an issue.

A fellow named ChristianP has left some comments on Leroy 1999/2010 at the Blackboard.

[Thanks, that's interesting. It reinforces my view (well, the obvious view) that Leroy '10 isn't a magic bullet. The other point (which I think McK made in his review, but again its an obvious one) is that the classification that Watts did was right at the end, whereas what they need is classification through time. A constant cold bias isn't a problem for trends. Also the talk of "heat sinks" makes no sense to me -W]

Christy says in his committee testimony that temperature is an incorrect measurement for global warming, the correct measurement is in joules. I've seen this asserted elsewhere and wonder if it is true.

The correct measurement is Hiroshima Bombs for extra energy added because of GHGs: two every second.

What McK said has been Eli;s point for years. A 2008 photo tells you nothing about 1980.

Since the radioactive decay flux is about a hiroshima/ second, ten to the fourteenth J or so, this must really be the Anthropocene.

Note that shooting off all the world's nuclear arsenals could only sustain the rate of radiative bracket creep for a hay and a half.

They just don'tmake Ages like they used to .

If a hiroshima is 6x 10 exp 13 J, just how big is a Shima anyway ?

Neven at 9:01 pm, 3 August

The correct measurement is Hiroshima Bombs for extra energy added because of GHGs: two every second

James Hansen has stated that the the energy excess is equivalent to 400,000 Hiroshima bombs per day. If one does the arithmetic is actually works out to be 4.63 HB/second.

Since 1 Tsarbomba= 4.25 x 10 exp3 Hiroshima , to a first appoximation the Shima is about a bomba / hour, which is still enough to ruin your entire day.

re: "Srsly? He’s trying to cite Heartland trash in a real journal?"

The idea of citing something is to acknowledge a source to either given them credit or for people to turn to it for more info. People have even cited personal conversations just to give someone credit. It is simple concept, it is odd you have difficulty with it. I suppose those who care more about style than substance may run journals that limit cites to other journals even if that makes the content less useful and steals credit from those they are prevented from citing.

I suspect you care neither for substance nor even style but merely expressing your hatred for allowing anyone to cite those that dare disagree with your simplistic worldview.

[You cite personal conversations as pers. comm.. You don't pretend they are real papers -W]

"It is my impression that they use a lot of “raw averages” for their comparisons. I assume they are simply averaging over all existing stations without gridding or area weighting."

No, we use not one but two weighting measures. One uses 9 areas of the US. The other cuts up the US into 26 grid boxes.

Our revised data fully addresses the TOBS issue and also factors in MMTS conversion. Stations with documented moves post 2002 are dropped unless the prior location is known and rated.

We will include statistics for all dropped stations, as well, in order to establish that there is no cherrypicking at issue.

I will be happy to address any and all objections.

(P.S., I am NOT the primary editor of the paper. But I did do the vast bulk of the actual footwork, so I am in a position to comment.)

[Welcome! I'd be interested to know if you have any comments re the current state of the paper. As you'll see from the comments here, it looks from the outside as though promised updates haven't occurred -W]

Thank you. Here is some feedback. The paper is controversial in its conclusions and we expect it to be carefully scrutinized.

comments re the current state of the paper.

Well, I'm working on it very hard. Rome wasn't built in a day. Or even burnt in a day, for that matter. But I can tell you that the paper is very much alive and in active process.

To be fair, if you know the date of the last station move, its not useless. Unfortunately I don’t think Watts et al have gotten that far in their analysis yet.

Well, I dropped all the stations that have moves after 2002 (except for a small handful of cases, where we know the precise pre-move location and can therefore rate it). That data is available from MMS. Prior to that it is difficult to be consistent because the given coordinates become very imprecise very quickly and the remarks themselves are not uniform.

That does not solve the issue in its entirety, but it goes as far as we can.

(In our update, we will be including the data for all dropped stations partly for purposes of demonstrating that we are not indulging in cherrypicking.)

Do you think that Watts has read it?

I'm not sure he read it. (But he did write it.)

Since they haven’t actually investigated the adjustment factors (TOBS, SHAPS) which lead to the greater trend they have no basis for this statement.

We will have addressed TOBS, and I think very well.

To me, the very idea that SHAP is an overall positive trend adjustment is mindboggling. But TOBS is a legitimate concern.

So is MMTS, though one of considerably lesser magnitude. We will be addressing that as well.

We know of at least one “journal” that will publish this “paper” as is.

E&E

The peer review will be done by Moe, Larry and Curly.

Our previous paper on the subject (Fall et al., 2011) was published by the Journal of Geophysical Research. One of the reviewers was from NOAA. (We figure that was probably Moe.)

Our conclusions were different. But we published anyway. We certainly did not withhold the paper because it concluded that siting has little effect on trend. The answer, either way, is an important question for both its scientific and policy implications.

To be clear, our current paper refutes our earlier paper, and devotes much space as to the reasons therefor.

Also the talk of "heat sinks" makes no sense to me -W

The basic premise is that heat is absorbed by the sink during the day and then released at night, having a serious effect om Tmin.

The question we address is whether this effect on trend increases during a period of sustained warming, or whether it is constant, as per Menne, et al. (2010), and does not increase trend.

We conclude that the trend does indeed increase. Or, to put it another way, a heat sink does not only make a station warmer, it makes it warmier.

[Does the draft paper address that? I didn't see it, but then I only skimmed it If you could point me to the section, I would read it -W]

One might well hypothesize that if a poorly sited station warms faster during a sustained warming phase, it will therefore cool faster than a well sited station during a period of sustained cooling, as the effect 'undoes" itself. a sort of "what goes up must come down" type of argument.

A note on our use of Leroy in both our papers: We are using the proximity factors only. We are not considering shading or vegetation. This is important. For one thing, shading is a cooling effect that makes a rating worse -- and it is far more likely to affect a poorly sited station (thanks to proximity to structures).

So we are not rating a station as a class 4 because of a small nearby bush. (A small nearby power plant, maybe . . . )

I hope this answers some of your questions.

Another interesting observation is that stations with poor mesosite (i.e, urban and airports) have markedly superior microsite.

This gainsays the prejudice that urban sites have worse specific locations than those in rural or semi-rural environment.

Under 20% of stations in rural areas are Class 1\2, while 30% of stations in urban areas are Class 1\2.

I believe that because of this phenomenon, the fact that TOBS bias is more prevalent in rural areas does not "wash out" the differences between well and poorly (micro)sited stations.

OK, well, obviously, its “Watts (2009)” not “Watts, (2009)” but he’ll fix that eventually.

I concede Anthony does have a problem with punctuation prior to parentheses. (And parenthetical commas.)

I did fix all that, but unfortunately those fixes did not make it through to the release.

But, yes, it will ultimately be fixed.

[Does the draft paper address that? I didn't see it, but then I only skimmed it If you could point me to the section, I would read it -W]

Well, it does so indirectly. Bear in mind that the paper is not primarily about the underlying science, it is about observation.

After all, before one gets into the "why", one has to nail down the "what is".

We claim observe in the overall sampling poor siting affects Tmin more than Tmax.

I note that in our revised statistics, that the Tmin difference is greater in areas with poor mesosite than in areas with good mesosite, and that Tmax differences are more dominant in rural areas.

Overall, the differences in Tmean trend between good and poor microsite stations, while omnipresent, diminishes somewhat in areas with poor mesosite as omnipresent heat sink begins to overwhelm the trends (esp. Tmax).

Even in rural areas, Class 5 stations have lower Tmean trends than Class 4. And in urban areas, Class 4 stations trends dip to or even below those of Class 3 (although remaining higher than Class 1\2). Urban Class 5 stations are so overwhemed that their trend is dampened slightly below (though not significantly below) that of Class 1\2.

Overall, the combination of Class 3\4\5 stations Tmean trend remains higher than Class 1\2 even with urban and airport stations. but the rural "class differences" are greater.

For reference, 10% of sites in our study are urban, a bit over 5% are airport, with a fair bit of overlap between thew two..

We believe this is consistent with the heat sink/Tmin trend hypothesis.

Remember, it's not about whether urbanized areas are warmer. it's about whether they warm faster during a n overall warming trend.

[Yes, exactly. That is a key question. So I'm surprised to see you only address it indirectly, since it is so vital to your thesis. I was hoping you could direct me to the part of the paper that covered this issue -W]

Also remember, the gold speck here is the observation rather than the theories adduced to accommodate the observations. We are, of course, far more certain of the former than of the latter.

We conclude that the trend does indeed increase. Or, to put it another way, a heat sink does not only make a station warmer, it makes it warmier.

[Does the draft paper address that? I didn't see it, but then I only skimmed it If you could point me to the section, I would read it -W]

Well, yes. That is the main point of the paper.

Leory (both v. 1999 and 2010), inter multa alia agree that poor microsites are warmer (sic) than goodr microsites. That is not controversial; both sides agree.

The question, also addressed by Menne et al. (2010) and by us in Fall et al. (2011), is whether poorly sited stations warmed faster than the well sited stations, from 1979 - 2008 (or in Menne's case, from 1980 - 2010).

We find that they do indeed warm faster. And, as I say, that is the main point of the paper.

[And yet this appears to be highly methodology dependent, and to differ from previous results. Undoubtedly, you've looked into exactly why your results differ from previous, and explained why your new results are to be considered superior? Again, which section of the paper covers this? -W]

We also agree that there has been warming. We are, after all, lukewarmers. In order for a warming trend to be exaggerated by poor siting, there has to be a warming trend in the first place to exaggerate.

But, when it comes to global warming, size matters. (Not to mention the motion of the ocean.)

[And yet this appears to be highly methodology dependent, and to differ from previous results. Undoubtedly, you've looked into exactly why your results differ from previous, and explained why your new results are to be considered superior? Again, which section of the paper covers this? -W]

Yes, it is methodolggical and, yes, it differs from previous results. Including our own.

The paper itself attributes it to binning. This is correct, as far as it goes, but I will explain further (and will suggest that it be explained in the paper):

Leroy (1999) bases it rating on distance from heat source or sink, but does not take into account the size of the sink. Therefore a small garden path will have the same effect on rating as a parking lot. But Leroy (2010) accounts for area covered within radius.

For example, Leroy (1999) rates a station as Class 4 if within 10 m. of a heat source. But Leroy (2010) says that for a Class 4 rating, 10% of the area within a 10 m. radius must be heat sink. If you do the circle-segment dance, you will find that if a station that is 6.9 m. from the side of a house is NOT considered to be a Class 4 (unless there is other stuff near the station) because under 10% of the radius would be covered by the house.

This is a more lenient system (except for defining Class 5)than Lerroy (1999). So a larger number of stations are rated Class 1\2.

And the reason that this lowers the Class 1\2 average rather than raises it is that nearly all of the stations re-rated from Class 3 (or even 4) to Class 2 are Non-Airport stations.

This greatly decreases the proportion of Class 1\2 Airports compared with the total of all Class 1\2 stations. And airports have a much higher trend than non-airports. Therefore, the Class 1\2 average goes down instead of up even though the standards are loosened.

But it doesn't really matter why. Assuming Leroy is correct in that a < 10% area does not affect the readings, then what the result is is, well, what the result is. It is certainly logical that the area of heat sink is of primary importance.

When I get a chance, i will go over the paper again and provide page numbers and the like. Your questions are quite reasonable and certainly deserve answers. (And I feel certain that peer review will be a most bruising process -- far more so than it was last time.)

[I think others will be interested in your comments here; I'll write a post pointing them out, since many won't notice new comments on an old thread. Can I ask you to clarify the "lack of visible activity" question: the natural place for you reporting progress would be WUWT, I'd imagine, rather than here, so its a bit odd to see no sign of this activity, there -W]

Well, okay, so I must, perforce, consider this to be, to some degree, "enemy territory". But that's okay, I'm figuring peer review will be worse . . .

And there is no lack of activity, really, just a lack of visible activity. When I get the new data properly assembled and Anthony gets the new version written up, there will be plenty to see.

Meanwhile, I have no problem defending the paper and explaining what we have and will be addressing. If there is something that I consider to be seriously wrong, I will need to address it, after all.

We kinow what the overall MMTS adjustment is (via Menne 2010 and 2009). And we have cleverly addressed the TOBS issue in a manner that even the most unsympathetic will find satisfactory (you'll find out more about that later . . .).

SHAP is very minor after 1980. As for homogonization, I think that the professionals are going to have to rework that. What's happening -- I think -- is that the cooler-trend stations (which are mostly Class 1\2) are being considered as outliers and are being "brought in to conformity" with the majority -- which are poorly sited.

I think that for homogenization to get it right, there will have to be a siting adjustment prior to homogenization. unfortunately at this point NOAA does not concede that siting maters in respect to ttrend.

Not only will that screw up the homogonization procedure, but it also imperils any pairwise comparisons (because the comparisons may well be made between stations of different siting quality). This will inevitably affect MMTS and TOBS adjustments, both of which, as I understand it, involve pairwise comparisons.

Note that I am not saying all adjustments are wrong in concept, merely that by ignoring siting, they are currently being done wrong and will have to be readdressed.

The big objection to Watts et al. (2012) is TOBS. With that issue dealt with, I predict it will make it through peer review.

Evan Jones, the most visible issue with Watts et al. (in preparation) is TOBS. It was certainly not the only issue.

My concern, however, is a what I regard as an inadequacy with the new (and the old) classification system. Specifically, it takes no account of the difference in thermal properties between heat sinks and natural terrain.

To illustrate the issue, consider Watts classification of the Australian Bureau of Meteorology weather station at Ceduna SA as urban. It is located in arid country where natural ground cover covers only 20-50% of the terrain, and consists of water retentive vegetation with a very high silica content. Watts classifies the site as urban because there is a graded runway within 100 meters of the stevenson screen, which probably sees traffic of 4 or 5 light aircraft a day.

From my perspective, the graded airport runway would make no appreciable difference to the temperature record at that location. At Mount Isa, even a large brick building would make little difference in a terrain similar to Ceduna except for the presence of many large exposed rock surfaces, thermally little different from such a building.

Another example comes from the Australian Antarctic Territory where Watts has twice attributed warming trends to the nearby location (2-4 meters) of an insulated two man hut. (That the Stevenson screen in question has never been used for climate records is beside the point.)

Obviously these are not US examples, but it illustrates that classifications that do not take into account the change in thermal properties introduced by supposedly artificial structures are inadequate.

To my mind they also illustrate the fact that Watts is a biased observer whose own biases are likely to distort his application of any classification system, but that is another matter.

I do not understand most of the discussion above, I am afraid, but maybe I can make some helpful remarks on the time of observation bias.

Evan Jones: "This will inevitably affect MMTS and TOBS adjustments, both of which, as I understand it, involve pairwise comparisons."

The TOBS adjustments have been computed from hourly measurements. The pairwise homogenisation algorithm is a recent development to detect and correct additional non-documented inhomogeneities and was not involved in any way in the computation of the TOBS.

Evan Jones: "And we have cleverly addressed the TOBS issue in a manner that even the most unsympathetic will find satisfactory "

May I ask, is there need for a clever solution? You can download TOBS adjusted data from the NOAA homepage. If you do not believe these adjustments have been computed correctly, as Anthony Watts seems to do, you can restrict your analysis to stations for which the time of observation did not change.

If you do not believe these adjustments have been computed correctly, as Anthony Watts seems to do

Actually, it's worse, Watts has claimed that significant humbers of station monitors ignored instructions to change the TOBS and , in essence, rather than doing so lied on their data sheets (i.e. wrote in the new time but continued to take observations at the old time).

If the revised paper rests at all on that premise I predict a difficult future for it ...

Evan Jones:

This will inevitably affect MMTS and TOBS adjustments, both of which, as I understand it, involve pairwise comparisons.

VV already pointed out that you understand wrong, in regard to TOBS. Regarding MMTS adjustments, these are done immediately after TOBS adjustments are made:

"Temperature data at stations that have the Maximum/Minimum Temperature System (MMTS) are adjusted for the bias introduced when the liquid-in-glass thermometers were replaced with the MMTS (Quayle, et al. 1991). The TOB debiased data are input into the MMTS program and is the second adjustment. The MMTS program debiases the data obtained from stations with MMTS sensors. The NWS has replaced a majority of the liquid-in-glass thermometers in wooden Cotton-Region shelters with thermistor based maximum-minimum temperature systems (MMTS) housed in smaller plastic shelters. This adjustment removes the MMTS bias for stations so equipped with this type of sensor."

I see nothing in the description that discusses pairwise comparisons. Homogenization comes after TOBS and MMTS adjustments, not during/before.

Note that I am not saying all adjustments are wrong in concept, merely that by ignoring siting, they are currently being done wrong and will have to be readdressed.

If the measurement characteristics of liquid-in-glass vs. MMTS sensors is understood, why oh why would you have to take siting into effect? Siting issues are not going to wipe out or modify the differences in the physical characteristics of the two sensor types!

It is reasoning like this that makes me question whether you and Watts will successfully overturn decades of work done by professional scientists. You apparently don't understand the adjustment procedures which are used, even though they are clearly documented by NOAA complete with links to the relevant papers.

(oops ... the link to the Quayle paper is apparently messed up, so maybe "complete with links" is an overstatement, but at least the full title of each paper is available.)

May I ask, is there need for a clever solution? You can download TOBS adjusted data from the NOAA homepage. If you do not believe these adjustments have been computed correctly, as Anthony Watts seems to do, you can restrict your analysis to stations for which the time of observation did not change.

Indeed you can! *evil grin* (Love that MMS "Phenomena" Tab.)

Specifically, it takes no account of the difference in thermal properties between heat sinks and natural terrain.

Leroy (2010) is empirical. he is not attempting to do that. he merely notes that he measures a definite temperature bias when sensors are located near heat sinks.

He is, of course, examining offset, not trend. NOAA agrees insofar as offset goes, and adopted the 1999 version as a basis for siting its CRN network.

He is observing the WHAT, not the WHY. And the "what" is that the offset is biased.

We are also making an empirical observation, not of offset, but of trend, only. I cannot tell you the details of thermodynamic exchange variance of nasty gray stuff vs. funky green stuff.

The Leroy papers tell you, via observation, is that, on average, stations in proximity to the former ARE warmer. And what Watts et al. tells you, also via observation, is that during a period of sustained warming (1979 - 2008) they WARM FASTER.

I would expect that during a period of sustained cooling, they would cool faster. But i have no measurements for that.

dhogaza needs to explain why well sited station trends are adjusted upwards to match poorly sited station trends and not vice-versa.

NOAA makes its adjustments on the premise that siting may affect offset, but does not affect trend.

And, yes, I know when MMTS adjustment is made.

Siting issues are not going to wipe out or modify the differences in the physical characteristics of the two sensor types!

If one is well sited and the other poorly sited and one does not account for that, one is going to get an incorrect result. After all, Leroy documents the difference in offset between good and poor stations. And, as we demonstrate, the decadal trend difference between a well and poorly sited station is five times the difference of that between MMTS and CRS (Menne et al., 2009, 2010).

What I think is going on is that during the homogenization process, well sited stations are being identified as outliers and wind up being pasteurized. Can anyone explain this otherwise?

It is reasoning like this that makes me question whether you and Watts will successfully overturn decades of work done by professional scientists.

Maybe they should have taken six months out and looked at what we have been looking at?

[This is getting close to the "Galileo gambit". Its sort-of OK in a comment thread, but you're going to have to provide more detail in a paper to be credible. Ie, why your new results are better than the existing ones, which they contradict -W]

If the measurement characteristics of liquid-in-glass vs. MMTS sensors is understood, why oh why would you have to take siting into effect?

It is also understood that MMTS is a superior instrument. Is it also understood that MMTS units show far less warming than CRS even after MMTS offset adjustment? (Or am I the only one who could be bothered to make that measurement?)

[Unlikely -W]

If the revised paper rests at all on that premise I predict a difficult future for it …

It doesn't. VV is on the right track, here.

clearly documented by NOAA

THAT old thing? I've been all over that page any number of times. It identifies adjustments are and in what order, but it really doesn't have much to say about how they are made. We never get to find out why FILNET is such a whopping positive adjustment. Folks prefer taking measurements in the cold, maybe?

USHCN2 has a considerably longer-winded adjustment explanation. But, I note, with far fewer actual useful numbers appended. At least with the USHCN1 page you can tell (roughly) by how much and when the adjustments are occurring.

[This is getting close to the "Galileo gambit". Its sort-of OK in a comment thread, but you're going to have to provide more detail in a paper to be credible. Ie, why your new results are better than the existing ones, which they contradict -W]

Well, yes. I agree.

In a nutshell:

1.) NOAA's conclusions are based on Menne et al.

2.) Menne's paper uses Leroy (1999).

3.) Leroy himself realized the severe shortcomings of his 1999 paper and revised it in a very logical manner.

4.) We find that Tmean trend for compliant (i.e., Class 1\2 stations using Leroy 2010 proximity ratings) stations is fully 0.11C lower than that of poorly sited stations (Class 3\4\5). This includes consideration for both TOBS and MMTS.

5.) Unless LeRoy 2010 is wrong in a manner which reflects on our use of it or unless our ratings are incorrect (or both), our findings are consistent with the hypothesis that stations with poor sitings not only have higher readings, per se, but significantly higher trends as well.

6.) Stations with both good microsite and mesosite warm at a rate of approximately half their NOAA-adjusted trends, which, in turn, coincide with the adjusted trends for poorly sited stations.

7.) Therefore US ground surface temperature is exaggerated by that amount of difference.

It's really as simple as that.

I guess I will have to wait for the next manuscript, press release.

Sorry, Mr. Evans, but in the most generous case, your comments are unclear. Maybe try to write a more coherent text on the manuscript work page.

And, yes, I know when MMTS adjustment is made.

Then why did you mistate it as being a pairwise adjustment ala the homogenization step?

Siting issues are not going to wipe out or modify the differences in the physical characteristics of the two sensor types!

If one is well sited and the other poorly sited and one does not account for that, one is going to get an incorrect result. After all, Leroy documents the difference in offset between good and poor stations

Even if true, this is *separate* from and *independent* of the offset due to the switch in sensors. Leroy's work isn't going to tell you anything about that, nor justify your reducing the amount of the adjustment made to account for sensor change.

Good luck, Evan, you're going to need it ...

clearly documented by NOAA

THAT old thing? I’ve been all over that page any number of times. It identifies adjustments are and in what order, but it really doesn’t have much to say about how they are made.

The underlying papers are, as I pointed out, referenced. For the details you go to the underlying literature, that's why they reference the papers directly.

Regardless, you've clearly mistated how both TOBS and MMTS adjustments are made. The page isn't opaque. You've gone over it many times and haven't noticed that the only time they mention pairwise adjustments are when discussing homogenization?

You're going to have to show why:

1. your results differ not only from GISTemp etc which use TOBS, MMTS and homogenization but also reconstructions which *don't*, including BEST.

2. why your results are inconsistent with the satellite data.

Good luck.

Generosity accepted. Allow me to try again.

A.) In the presence of an unambiguous period of warming poorly sited stations appear have significantly higher Tmean trends than well sited stations, no matter what subset of the sample is considered.

B.) Urban (as defined by NASA) and airport sites also appear to have a higher Tmean trend than rural sites, especially for Class 1\2 stations.

C.) After full NOAA adjustment, the well sited station trends are increased to the same level as as the poorly sited stations.

D.) When referring to the "rural, no airports" subset, NOAA-adjusted trend is nearly twice that of well sited stations. Tmean trend is more than double if only majority-time MMTS stations are used, even after MMTS adjustment is applied.)

Those are the major findings. They are strictly empirical, although, naturally, we do have our prejudices and speculations as to the reasons they occur.

In the revised draft, TOBS will be dealt with and MMTS adjustment applied. Those were the primary objections to the preliminary release.

A.) In the presence of an unambiguous period of warming poorly sited stations appear have significantly higher Tmean trends than well sited stations, no matter what subset of the sample is considered.

B.) Urban (as defined by NASA) and airport sites also appear to have a higher Tmean trend than rural sites, especially for Class 1\2 stations.

So poorer sited sties have a higher trend than well-sited sites except for those that don't?

D.) When referring to the “rural, no airports” subset

There's really no point in using a classification scheme at all if you're just going to toss out the class 1\2 stations that give results you don't care for ...

In the revised draft, TOBS will be dealt with and MMTS adjustment applied. Those were the primary objections to the preliminary release.

These were two obvious blunders. There were other objections made as well, but my impression is folks didn't bother to dig deeply into the paper because the existence of these two obvious blunders were severe enough to roundfile it without further review.

If you folks have corrected these two obvious blunders (though your novel and non-standard treatment of TOBS changes raises the possibility of a repeated blunder), you can be sure that people will dig deeper to see where else you've gone wrong.

Those are the major findings. They are strictly empirical

On the surface, the impression the first paper gave of slicing and dicing the dataset until you got the result you wanted persists ...

Regardless, you’ve clearly mistated how both TOBS and MMTS adjustments are made.

Regardless, the paper does not deal with how those adjustments are made.

The TOBS issue is easily dealt with, as VV has pointed out, and we can easily extract and apply the (much smaller) overall MMTS adjustment impact from Menne 2010 and 2009.

[I'm surprised that you think you can so easily re-work your paper to include this major change, without (apparently) affecting the conclusions at all -W]

1. your results differ not only from GISTemp etc which use TOBS, MMTS and homogenization but also reconstructions which *don’t*, including BEST.

Yes, they do. The reason for that is that Menne and Muller (and Fall, for that matter) use Leroy (1999) as a basis for rating rather than Leroy (2010).

2. why your results are inconsistent with the satellite data.

Actually, all we have to do is demonstrate our observations. The new set of data is not inconsistent when one considers that satellites convert MW readings to determine LT rather than ST. There are many reasons to believe that ST trends will not be equal to LT trends.

To be clear, our revised dataset (dealing with TOBS and MMTS) shows a decreased (though still very large) difference between Class 1\2 station Tmean trends and NOAA-adjusted Tmean trends, but a slightly greater difference between Class 1\2 and Class 3\4\5 Tmean trends than our previous dataset.

Good luck.

#B^j

On the surface, the impression the first paper gave of slicing and dicing the dataset until you got the result you wanted persists …

Then you need to go below the surface.

The first release (and the revisions) show that no matter how you slice and dice the dataset, the results are the same.

We were not content to settle for just one slice. We had to make sure that not only did the whole yield the results, but that the parts did as well.

Regardless, the paper does not deal with how those adjustments are made.

But presumably you understand why your inability to understand something as simple as that NOAA page despite having read it several times might lead one to question whether or not you really are the next Galileo?

These were two obvious blunders.

We stated clearly that TOBS was an issue and we would be addressing it. After initial review, we decided we needed to deal with the issue immediately. I have done so.

Funny how those were not "obvious blunders" in Fall et al. (2011). But, then, the results from Fall et al.produced far less consternation than Watts et al. (2012), did they not?

There were other objections made as well, but my impression is folks didn’t bother to dig deeply into the paper because the existence of these two obvious blunders were severe enough to roundfile it without further review.

Dig away. Funny how those exact same considerations were not enough to "roundfile" Fall et al. But those results were far more pleasing, were they not?

If you folks have corrected these two obvious blunders (though your novel and non-standard treatment of TOBS changes raises the possibility of a repeated blunder), you can be sure that people will dig deeper to see where else you’ve gone wrong.

Indeed, we can be quite sure of that! (But I must stress that my method of dealing with TOBS is neither nonstandard nor controversial.)

But presumably you understand why your inability to understand something as simple as that NOAA page despite having read it several times

What I cannot understand is your inability to understand why failure to consider microsite would not affect adjusted data.

Especially as we provide direct comparisons of Class 1\2 station and Class 3\4\5 station raw and adjusted data.

might lead one to question whether or not you really are the next Galileo?

All truths are easy to understand once they are discovered; the point is to discover them.

B.) Urban (as defined by NASA) and airport sites also appear to have a higher Tmean trend than rural sites, especially for Class 1\2 stations.

So poorer sited sties have a higher trend than well-sited sites except for those that don’t?

Well, yes . . .

Or to put it another way:

A.) Rural Class 1\2 trends are lower than Rural class 3\4\5 trends.

B.) Urban Class 1\2 trends are lower than Urban Class 3\4\5 trends.

C.) Rural Class 1\2 trends are lower than Urban Class 1\2 trends.

D.) Rural Class 3\4\5 trends are lower than Urban Class 3\4\5 trends.

E.) Urban Class 1\2 trends are slightly lower than Rural Class 3\4\5 trends.

(The above is true even after MMTS adjustment, which has a much greater effect on rural stations than on urban.)

D.) When referring to the “rural, no airports” subset

There’s really no point in using a classification scheme at all if you’re just going to toss out the class 1\2 stations that give results you don’t care for …

That's funny!

I took the entire set stations. We dropped all stations with moves after 2002 (where we could not establish the pre-move location), all stations with major TOBS issues, and all stations with no NOAA data, and all unrated stations.

I have data for the following sets:

1.) All stations
2.) Airports Excluded
3.) Airports Only
4.) Rural Only
5.) Urban Only
6.) Rural, No Airports
7.) CRS (majority of period) Only
8.) MMTS (majority of period) Only
9.) "Pure" CRS (never converted to MMTS)
10.) Rural MMTS
11.) Dropped stations (for amusement value and to demonstrate there is no cherrypicking)

All is included. Nothing is tossed out.

Funny how when one bends over backward to include everything, one gets accused of "tossing out".

Seeing as how only a minuscule percentage of surface area is urban (yet 15% of Class 1\2 stations, and 10% overall are urban, with c. 5% airports, with some overlap), the set most representative of the actual climate would be #6, or if one wanted to avoid equipment inhomogeneity, #10.

FWIW, for our revised set of data #1 (All Stations) shows Class 1-5 NOAA-adjusted temperatures as over 60% higher than raw Class 1\2 even after MMTS adjustment.

In all slices, Class 1/2 stations have a lower Tmean trend than their Class 3\4\5 counterparts. I'd have to say that rates the "robust" word.

Even if true, this is *separate* from and *independent* of the offset due to the switch in sensors. Leroy’s work isn’t going to tell you anything about that, nor justify your reducing the amount of the adjustment made to account for sensor change.

No, Leroy, 1999 and 2010, is not about equipment inhomogeneity. It is about microsite.

We account for sensor change citing Menne (2010), p. 5, fig. 1. The MMTS aggregate adjustment works out to +0.11C/d Tmax and -0.07 C/d Tmin (therefore +0.02 Tmean) during the 1980 - 2010 period. 75% of stations are converted to MMTS during that period, which coincides with Menne (2009), which states that overall USHCN trends are up 0.0139C/d since MMTS introduction to 2009.

Is it also understood that MMTS units show far less warming than CRS even after MMTS offset adjustment? (Or am I the only one who could be bothered to make that measurement?)

[Unlikely -W]

Unlikely, yes.

Yet it would also seem unlikely that no one rated the USHCN using Leroy (2010) heat sink proximity parameters. It just sat there for over a year before I tackled the job.

For that matter, no one had even done that for the whole USHCN using Leroy 1999 until I did it (Anthony had previously done a first rough draft of 40% of USHCN). Even Muller didn't -- he used the ratings for BEST that I compiled for Fall et al.

And how unlikely is that?

Therefore, presumptions of likelihood are not necessarily given.

Mr. Evans, I assume you and Mr. Watts can demonstrate the station classifications have remained constant throughout the 30+ years in question. In other words, a station classified as 1 or 2 for instance based on pictures taken in the last 5 years can be demonstrated to have been that way for the entire period under study. I am sure you can understand why this would be necessary.

Also, since you dismissed above the need to consider the satellite data (which shows a warming trend that is in good accord with the homogenized instrumental data), has Mr. McIntyre (who didn't even know he was being listed as an author in the first draft btw) declined to be listed as a co-author on this revised paper? I say that because he too also says the satellite data for CONUS is in agreement with the homogenized instrumental data.

How about in the revised version of this piece of "paper" we actually get a list of station ID's and classification ID'S?

If you can't be bothered with this rather simple request, all discussion of this piece of "paper" is indeed a moot point, until such time as the list of station/classification ID's are made available.

Oh and error bars on classification itself, you know like in a double blind experiment, where two different observers using independent site information each do their own classification.

You do do statistical significance testing on potential systematic selection biases in siting classification?

Sans all your one time slice pictures of ponies and the other myriad anecdotal hearsay data.

And as to the Leroy classification system itself, do we all have actuals (meaning statistics from which Leroy supports the temperature errors in a quantitive way) on tenperature offfsets biases and potential temperature trendline bias errors?

To me I do find it rather odd to even begin to discuss this piece of "paper" without the list of station/classification ID''s.

Somehow all this rather verbose use of the english language that EJ uses over here (itself an odd use of postnormal blog science) leaves me wanting, justifying your biases on this website, sans the underlying datasets, in and of itself, means absolutely nothing.

As to substative peer review (say in GRL/ERL), I'd expect at least one reviewer to come from the NOAA camp (but perhaps two or even three can't be ruled out). So you should expect at least one reviewer, to get down into the "weeds" of your paper, meaning that they just won't read the darn thing and simply accept it at face value (something that you seem to be assuming in all your too many poses over here, ad infinitum, ad nauseam)..

If you wand a real discussion of your paper, then please submit a revised draft of said paper including the station/classification ID's.

I mean of the two choices, working on improvements to your paper versus discussions of an as yet unseen reviesed draft of said paper over here, which of those two options is the most productive to actually getting your paper published?

Seriously?

You've made 22 comments (so far) on over here in the past 4 days, its like you've come over here to "test the waters" of your current/revised paper, to rationalize and reinforce, certain "subjective" choices made in said paper.

At least one of us is on to your game, the "tell" as they say in poker, on your part, is way too obvious.

That should have been "Mr. Jones", not "Mr. Evans", above. <<<typing without his coffee.

I assume you and Mr. Watts can demonstrate the station classifications have remained constant throughout the 30+ years in question.

That is a fair question.

No, we can't (as I said earlier). And neither did Menne, Muller, or Fall, all of whose papers were peer reviewed and approved.

But I went as far as I reasonably could, a lot further than Menne or Muller -- or what we did in Fall.

As MMS records were brought up to speed, I found that I could discern station moves with reasonable reliability going back to around 2003. Some of the moves were recorded in the remarks section. Some station moves were not recorded as such, but by coordinating the "Change, Ingest user" notes and coordinate changes, I was able to pick up the rest. Unfortunately it was not until well after Y2K that reasonably precise coordinates were provided in MMS. (They also went back and forth between 4 and 5 decimal paces, finally settling on 4. In many cases they updated the coordinates (eg., from 41.500 to 41.5231, or whatever, when there were no station moves.

So I removed all stations with pre-survey recorded moves after 2002. It is not as much as I would have liked, but it was as good as could be done with any consistency, and certainly more than was done in the three previous studies.

Also, since you dismissed above the need to consider the satellite data (which shows a warming trend that is in good accord with the homogenized instrumental data), has Mr. McIntyre (who didn’t even know he was being listed as an author in the first draft btw) declined to be listed as a co-author on this revised paper? I say that because he too also says the satellite data for CONUS is in agreement with the homogenized instrumental data.

Well, Dr. Christy (a co-author in this paper and Fall et al.) has addressed this. He claims that LT temperature trends will inevitably be considerably higher than ST. We do address this issue in the first draft of Watt, et al (2012).

[I'm concerned that too many of your answers here are too glib. It looks to me as though you're too confident, and not checking things properly. That certainly showed up in the first draught of the paper. I'm not checking up on everything, or close to everything, but here I can tell you've not done your job properly. You don't "address" this issue in any meaningful fashion. You know this, because people have already pointed it out -W]

I didn't dismiss the satellite data, I just pointed out that LT should be a higher trend. If the LT trends and ST tends "match", that means that either LT is too low or ST is too high.

As for whether St. Mac. will remain as co-author is entirely up to him. (I don't handle that part, I just do research and analysis on the ST side.)

How about in the revised version of this piece of “paper” we actually get a list of station ID’s and classification ID’S?

That data and all related spreadsheets will definitely be archived and made available. Plus whatever notes we have on each station, and reasons for dropping any station that we drop. And the trends for each station derived for Fall, et al. (by Dr. Fall), will be included as well..

[But, you have (today) and you had (when the draft paper was put up) a spreadsheet with the station IDs in the draft paper. Extracting just the station IDs from that spreadsheet would be the work of minutes. That you won't do that raises suspicions in people's minds -W]

In addition, I put up hundreds of "Measurement Views" for the stations on surfacetations.org in order to facilitate the ratings via Leroy 2010, and all that will be available too, for independent review.

Oh and error bars on classification itself, you know like in a double blind experiment, where two different observers using independent site information each do their own classification.

Absolutely. I can't manage to tease Excel into making error bars that account for the toals that make up an averagfe for a single data point. All I can get is bars that cross the single graph, treating each average as a single data point.

We had the same problem with Fall, et al., but our co-authors on the stats side ran Monte Carlos on the averages and got the bars up and running.

In any case, be assured that there will be "full and complete" error bars in the paper before we submit.

And as to the Leroy classification system itself, do we all have actuals (meaning statistics from which Leroy supports the temperature errors in a quantitive way) on tenperature offfsets biases and potential temperature trendline bias errors?

Leroy does not address trends at all. He only addresses offset. Those organizations which have adopted Leroy 2010 are noted in the draft paper.

All we are doing is observing whether Leroy's heat sink proximity ratings show any difference in Tmean trend between good and poor sites. They do not for Leroy (1999). They very much do for Leroy (2010).

We used to have a joke about Leroy (1999) back when we were doing Fall, et al -- that all Class 4 stations were equal, but some Class 4 stations were more equal than others. At the time, John N-G suggested we might make our own distinctions in a followup paper.

But then Leroy (2010) made those distinctions himself, having recognized the pitfalls in his previous paper, and we (that is to say, I) classified the stations using his new rating system. I will say that I was very surprised at the difference in results!

Sans all your one time slice pictures of ponies and the other myriad anecdotal hearsay data.

I mean of the two choices, working on improvements to your paper versus discussions of an as yet unseen reviesed draft of said paper over here, which of those two options is the most productive to actually getting your paper published?

In the history biz they call that one "false dichotomy".

So you don't think I have been working on improvements?

Rly?

(I don't think he knows me very well!)

As to substative peer review (say in GRL/ERL), I’d expect at least one reviewer to come from the NOAA camp (but perhaps two or even three can’t be ruled out). So you should expect at least one reviewer, to get down into the “weeds” of your paper, meaning that they just won’t read the darn thing and simply accept it at face value (something that you seem to be assuming in all your too many poses over here, ad infinitum, ad nauseam)..

Been there, done that. The Journal of Geophysical Research threw us an NOAA peer reviewer last time (for Fall, et al.). I certainly wouldn't expect less this on this pass.

Only they won't be liking the results so much this time around the wheel, so you can bet the farm the sparks will be a-flyin'.

You think we don't know this?

. . .

At least one of us is on to your game, the “tell” as they say in poker, on your part, is way too obvious.

WhatEver.

Funny how all that was good enough when the results were different! But I don't suppose we'll be getting personal thanks for improving the science from Dr. Muller this time around -- like we did last time . . . *grin*

But if them's the best cards you got showing, this poker player sees and raises.

Funny how all that was good enough when the results were different!

It's not funny or strange at all. Extraordinary results require extraordinary evidence. Results that merely support previous results - not so much. People have calculated the US surface temp trends in a variety of ways - GISTemp and BEST are just two - using data that's been subjected to a variety of pre-processing, and come up with roughly the same trend.

Consistent with the satelite data which everyone, including Christy until very recently, expects. Christy's in a bit of a bind if he's proclaiming that the satellite (LT) trend should be considerably higher than the ST because if you guys fail to overturn all of these previous results, he's going to have to explain what's wrong with his satellite trend computations *again*. Or why his claim that LT should be considerably higher (consistent with a ST that's cut in 1/2) is bogus after all.

Now you guys come along and say *all* But I don’t suppose we’ll be getting personal thanks for improving the science from Dr. Muller this time around — like we did last time . . . *grin*

If you actually improve the science, you'll get congratulated.

Don't count your chickens until the hatch, Gali ...

"How about in the revised version of this piece of “paper” we actually get a list of station ID’s and classification ID’S?"

That data and all related spreadsheets will definitely be archived and made available. Plus whatever notes we have on each station, and reasons for dropping any station that we drop. And the trends for each station derived for Fall, et al. (by Dr. Fall), will be included as well..

In addition, I put up hundreds of “Measurement Views” for the stations on surfacetations.org in order to facilitate the ratings via Leroy 2010, and all that will be available too, for independent review.
_________________________________________________

Does that mean explicitly in the NEXT revised draft version to be "published" at WUWT (or however the next revision might be publicised)?

A direct yes or no answer, if you don't mind. Dodgy answers need not apply.
_________________________________________________

"And as to the Leroy classification system itself, do we all have actuals (meaning statistics from which Leroy supports the temperature errors in a quantitive way) on tenperature offfsets biases and potential temperature trendline bias errors?"

Leroy does not address trends at all. He only addresses offset. Those organizations which have adopted Leroy 2010 are noted in the draft paper.

All we are doing is observing whether Leroy’s heat sink proximity ratings show any difference in Tmean trend between good and poor sites. They do not for Leroy (1999). They very much do for Leroy (2010).

We used to have a joke about Leroy (1999) back when we were doing Fall, et al — that all Class 4 stations were equal, but some Class 4 stations were more equal than others. At the time, John N-G suggested we might make our own distinctions in a followup paper.

But then Leroy (2010) made those distinctions himself, having recognized the pitfalls in his previous paper, and we (that is to say, I) classified the stations using his new rating system. I will say that I was very surprised at the difference in results!
_________________________________________________

So no quantitative assessments of Leroy (2010) have ever been done (e. g. temperature offset and trend biases)?

Thank you! For NOT answering that question.

So there you have it folks, a classification system with no quantifiable results as to temperature biases.

What good is any classification system with no underlying quantitative data on the accuracy of the data itself?.

It does indeen look like confirmation bias and circular logic does apply here.

And yes, I do like my hand alot more than your hand at this point in time. Very much so.

Does that mean explicitly in the NEXT revised draft version to be “published” at WUWT (or however the next revision might be publicised)?

A direct yes or no answer, if you don’t mind. Dodgy answers need not apply.

Oy.

At the latest, my spreadsheets will be archived at the time of publication. I don't even know if there is going to be another pre-publication release. That's not up to me.

You can go to surfacestations.org for the stations pics and images right now.

You may have to wait for the rest, but it will be available and easily accessible.

So no quantitative assessments of Leroy (2010) have ever been done (e. g. temperature offset and trend biases)?

For offset, obviously. As far as I am aware, there is no controversy whatever regarding the effect on offset. It's WMO endorsed. The 1999 version is the standard for NOAA/CRN and MeteoFrance. All the 2010 version does is quantify areas rather than mere distance.

It has not been quantified for trend (until we came along). Leroy isn't doing trend, he's doing offset. We're the ones doing trend. In fact, we don't do offset at all.

And yes, I do like my hand alot more than your hand at this point in time. Very much so.

That's the best you can do? Infer we are going to conceal up data and impute that Leroy has no underlying quantifying data for offset?

Good luck with that.

(Robert Murphy's concerns, OTOH, are actually meaningful.)\

It’s not funny or strange at all. Extraordinary results require extraordinary evidence. Results that merely support previous results – not so much.

Menne and Fall didn't support any previous results. There weren't any previous results to support

But regardless, we do provide extraordinary evidence.

he’s going to have to explain what’s wrong with his satellite trend computations *again*. Or why his claim that LT should be considerably higher (consistent with a ST that’s cut in 1/2) is bogus after all.

But UAH CONUS shows 0.23 C/d for the study period. NOAA-Adjusted USHCN2 clocks in at over 0.3. Our Rural No AP trend is around 30% lower than UAH. Our All Station trend is well within 20%. LT is supposed to be between 10% and 40% higher than ST trend.

So if you ask me, it's NOAA that's needs to be doing the 'splainin', not Christy.

Evan Jones, I was hoping you might respond to a comment our host snuck into one of your posts that you may have missed:
[I'm surprised that you think you can so easily re-work your paper to include this major change, without (apparently) affecting the conclusions at all -W]
Or perhaps to that comment, rephrased as a question: why do you think that you can easily rework your paper to include the major TOBS change without affecting the conclusions?

It’s not funny or strange at all. Extraordinary results require extraordinary evidence. Results that merely support previous results – not so much.

Menne and Fall didn’t support any previous results. There weren’t any previous results to support

What bull. I'm not going to bother to go into detail, everyone but you reading this thread understands.

Your credibility continues to drop with such statements. Tch, tch.

Or perhaps to that comment, rephrased as a question: why do you think that you can easily rework your paper to include the major TOBS change without affecting the conclusions?

He's already said they're dealing with TOBS changes with a new technique which refutes the previous TOBS changes which he doesn't understand.

My guess is that this will be "interesting".

[I'm surprised that you think you can so easily re-work your paper to include this major change, without (apparently) affecting the conclusions at all -W]

Thanks, JBL. Yeah, I missed that. It's a fun story, too. (Fun for me, anyway; YMMV.)

To begin our tale, it is necessary to discuss prejudice. Not all prejudice, and not unduly condemning of prejudice, either. After all, many prejudices turn out to be true.

But not all of them do. That is, after all, the nature of prejudice, in accordance with its very etymology.

And we are here to discuss one prejudice in particular.

It is a prejudice concerning TOBS. Even Anthony was somewhat affected by it, as it turns out. And Steve McIntyre, as well, perhaps (or not). It is a prejudice that I shared, early on in this process, but one of which I was disabused two years ago, under the harsh light of empiricism.

It isn't a prejudice about whether TOBS is a valid concern. (It is.) And it isn't about how it's calculated, either. it's about how it's distributed.

It's not even a prejudice about TOBS. Rather, it is a prejudice that concerns TOBS. It's a natural prejudice and seems logical.

And that prejudice is that stations in urban areas are more poorly sited than stations in non-urban areas.

The reason this affects TOBS is that it is well known (correctly, as far as I can tell) that TOBS corrections apply mostly to non-urban areas.

Therefore, the better microsites (which our prejudice tells us are rural) will be disproportionately hammered by TOBS, and the trend differences between good and poor microsites will be washed away.

But it's not so. Not so! Urban areas, which are indeed miserable mesosites, have, on average, far better microsites than do non-urban areas. Under 20% of stations in non-urban areas are Class 1\2. But fully 30% of urban areas are class 1\2.

So it is not primarily the well sited stations that are falling victims to TOBS bias. In fact, poorly (micro)sited stations are proportiobnately more affected by TOBS. than good sites.

So instead of washing out the trend differences between well and poorly sited stations, the removal of TOBS-bias affected stations actually slightly increases them.

Yes, removing them from the sample. That way one is not required to spitball over the calculation of TOBS bias; one can simply bypass the problem as easily as Vaslievsky at Brobruisk in that fateful July of 1944.

Sometimes the best way to stay out of trouble is to stay away from it.

Well, okay, reworking the study was not so easy. It did take a number of hours of hard work. I also had to find the needle in the haystack. That needle was hiding under the Phenomena tab in NCDC/MMS. Having located said needle, I then had to review TOBS for nearly 800 stations, coding them "in" or "out".

Having done that, it was a simple matter of "Filter" and "Delete Row".

Now, removing TOBS does indeed narrow the gap somewhat between the Class 1\2 raw and Class 1-5 NOAA-Adjusted data. But only by under .04 per decade in the "All Stations" set, even with MMTS adjustment thrown in. And, as I have already said, the differences between well and poorly sites stations are slightly wider that they were before.

So the basic premise of the paper (good v. bad siting) remains rock-solid, and the secondary premise (the "adjustment gap") is narrowed somewhat, but still gapingly wide.

And it turns out that the reason one thought it mightn't turns out to be a mere prejudice . . . a prejudice concerning a bias . . .

[You should have read Peterson -W]

He’s already said they’re dealing with TOBS changes with a new technique which refutes the previous TOBS changes which he doesn’t understand.

My guess is that this will be “interesting”.

Interesting? Perhaps.

But I never said was refuting TOBS and I never said I was employing a new technique, either.

What I said was that VV had it right -- and he did.

What bull. I’m not going to bother to go into detail, everyone but you reading this thread understands.

Your credibility continues to drop with such statements. Tch, tch.

Funny how you elided my very next sentence. Which everyone (including you) will have understood. You were speaking of credibility?

There are many reasons to believe that ST trends will not be equal to LT trends.

Have you looked at http://climateaudit.org/2011/11/07/un-muddying-the-waters/

and in particular http://climateaudit.org/2011/11/07/un-muddying-the-waters/#comment-3092…

This seems to imply there won't be an amplification factor of trends in LT vs ST over land

[The draft paper is naive in the extreme in its treatment of the satellite record. That won't survive competent peer review -W]

Yes, removing them from the sample.

Ahhh ... there lies a cherrypick ... toss out the stations you don't like.

That's not exactly a novel way of handling the TOBS problem that will satisify everyone, as you've claimed you've done up above ...

Sure. What of it?

But when push comes to shove we are not primarily theorizing. We are observing.

The best reason to believe that CONUS LT trends won't equal to ST trends is that, well, CONUS LT trends don't match the trends of well sited surface stations. With TOBS and MMTS accounted for.

For that matter, they don't match the adjusted trend either -- NOAA-adjusted ST trend is over a third higher than LT trend.

It is demonstrated.

Ahhh … there lies a cherrypick … toss out the stations you don’t like.

That’s not exactly a novel way of handling the TOBS problem that will satisify everyone, as you’ve claimed you’ve done up above …

That's a cherrypick? GOOD Lord!

Besides, our hypotheses hold just fine with or without TOBS-biased stations removed.

(No wait. I get it. You are joking.)

[You should have read Peterson -W]

Which one? Peterson, Parker picked a peck of packed parking lots (2006)?

I read the abstract on that one but it was at least two years ago.

In any case, it was quite easy to observe for myself that urban areas have a higher proportion of well microsited stations than non-urban areas.

But most folks on either side of the debate do not appear to be aware of this. And it is counterintuitive.

I am obviously html-challenged. I need to be better about closing my italics.

[Fixed. But you must be aware of the Peterson one where he points out that many nominally "urban" sites have good exposure -W]

Fixed. But you must be aware of the Peterson one where he points out that many nominally "urban" sites have good exposure -W]

Thanks. I've read a bunch of abstracts over the last few years, but for the last two I've been pretty well immersed in directly working on the data myself. (That plus a 10-hour workday doesn't leave a lot of time left over.)

By the way, thanks for providing this forum. It gives me the opportunity to answer questions and criticism from the other side of the aisle (some quite legit., yours included), and to communicate what is actually going on with the research for this paper.

Our findings demonstrate that microsite is fully as important as TOBS on an individual station basis (~0.1C/d over the study period). Yet microsite is not accounted for when adjusting the data.

We clearly demonstrate that poorly sited stations trends receive little overall adjustment while well sited station trends are heavily adjusted upward and match (even exceed) the adjusted trends of the poorly sited stations.

I therefore think it is important to readdress the NOAA adjustment procedure.

That is secondary, however. The primary finding is that microsite matters. A lot. Not merely for offset (which is not an issue of controversy), but for trend. We do not believe this finding is counterintuitive. But it is clear that opinion varies widely as to that.

"2) The siting classification which was proposed in ET-AWS-5 and further expanded in ET-AWS-6 was endorsed by CIMO-XV in Helsinki, 2010. The Commission has requested that it be included in the CIMO Guide with the following clarifications in order to ensure its appropriate use: 1) the use of the siting classification of observing stations depends on the purposes of the observations, 2) the proposed classification is the first official version of the siting classification, and will be reviewed and updated as needed at the next CIMO. The classification was published in Annex IV of CIMO-XV (WMO No. 1064)."

http://www.wmo.int/pages/prog/www/OSY/Meetings/ET-AWS7-2012/ET-AWS-7.ht…

(Item 9)

So one wonders what (1) might mean for the purposes of climatology versus daily temperature values from AWS? AFAIK, Leroy has never been involved in the climatological aspects of surface temperature measurements.

Also (2) suggests future updates and no doubt further clarifications.

Finally Leroy (2010) states;

Class 1/2: No uncertainties in temperature defined.
Class 3: "(additional estimated uncertainty added by siting up to 1°C)"
Class 4 "(additional estimated uncertainty added by siting up to 2°C)"
Class 5: "(additional estimated uncertainty added by siting up to 5°C)"

Note that to date, there are no statistical datasets that actually quantify the stated uncertainties. Thus we can take these uncertainties as "ad hoc" conjectures.

Note also that these uncertainties as stated do NOT define temperature offset biases or temperture trend biases. In fact, until such datasets can quantify these "ad hoc" bounds, we don't even know in which direction (to the offsets and trendlines) those uncertainty biases apply.

Considering that the WMO has to date, had 7 conferences devoted strictly to homogenization of temperature data for the purposes of developing long term climatology, the last being;

"Seventh seminar for homogenization and quality control in climatological databases"

(24-28 October 2011, Budapest, Hungary

You know, kind of makes me think that the WMO will continue on in the homogenization and quality control in climatological databases reguardless of how many pictures of ponies that Tony can show.

The term, "additional estimated uncertainty added ", applies to offset.

Considering that the WMO has to date, had 7 conferences devoted strictly to homogenization of temperature data for the purposes of developing long term climatology, the last being;

All based on the premise that microsite does not affect trend.

I think they are going to need to hold an 8th conference.

And not only regarding microsite -- a very large percentage of GHCN station are located in airports. One of our findings is that airport stations show far higher Tmean trends than non-AP stations.

Until and unless they account for all that, they are just misidentifying outliers and smearing the error around.

Note that to date, there are no statistical datasets that actually quantify the stated uncertainties.

Yilmaz et al. (2008) measures the offset effects of asphalt, dirt, and grass surfaces. it's a limited study, however.

But that is not very relevant. Unlike Leroy, we are not addressing offset. We are addressing trend. And we definitely have a very nice statistical dataset to demonstrate comprehensively and unequivocally the large effect on Tmean trend.

How about you submit your paper to E&E already.

I mean it's highly unikely to stand up in any regard in the long run (given only the original draft and your rather weak defense of the revised draft over here to date).

Get on with it already, let the thorough shredding of a final published Watts, et. al. begin already.

Put up or shut up already. That's what people want now, what people don't want is your continuous spin on things.

[That is a touch harsh, but I do agree that there are too many words here to wade through. What is needed is a revised draft, or the version-to-be-submitted if you're that close -W]

The term, “additional estimated uncertainty added “, applies to offset.

No, that's not what that means. Sheesh. Increasing uncertainty will leave an offset of 0, i.e. the measured value, with the probable range.

No, that’s not what that means. Sheesh. Increasing uncertainty will leave an offset of 0, i.e. the measured value, with the probable range.

It means the offset will be off by up to that much. This is not obvious? Shade on the cool side, heat source on the warm side.

But, as we know, offset has little to do with trend. For example, airports often tend to be cooler than surrounding areas (hence the SHAP adjustments of the 1950s, acc, to NOAA), but tend to have higher trends than the average non-AP stations.

That is a touch harsh

I consider it to be praise by faint damnation.

What is needed is a revised draft, or the version-to-be-submitted if you're that close

Well, the data is there. I've finished up the regional grid, though I still need to run the new set though my grid box sheets. (With the reduced number of stations, I may have to increase the size of the boxes.)

As for the paper, I'm not letting Anthony submit until I have both proofread and edited it. He may or may not want to present before submitting; that will be up to him.

Sure. What of it?

But when push comes to shove we are not primarily theorizing. We are observing.

The best reason to believe that CONUS LT trends won’t equal to ST trends is that, well, CONUS LT trends don’t match the trends of well sited surface stations. With TOBS and MMTS accounted for.

AIUI in the original version of the paper I think you were claiming, once you consider the amplification factor predicted, there was reasonable agreement between the satellite LT record and the trend in ST that you found.

Your position now seems , given the non-amplification factor predicted over land, indicates either that prediction over land is wrong or the satellite record LT trend is too high.

It just means for you to be right, one or other of a independent line of evidence must be wrong,

there was reasonable agreement between the satellite LT record and the trend in ST that you found.

That is presuming Christy is correct in that LT is at a somewhat higher trend than ST.

The revised trends after removing TOBS-biased stations makes it a very strong agreement (~17% lower for "All Stations" and ~30% for the "Rural, No Airports" slice).

The above is for Class 1\2 stations, of course.

I'll be gone for the next few days.

If there is any further feedback, I will comment when I get back.

I'll be taking my precious spreadsheets with me, but I almost hope I will be in a non-Excel, non-email zone. I need the rest.

The revised trends after removing TOBS-biased stations makes it a very strong agreement (~17% lower for “All Stations” and ~30% for the “Rural, No Airports” slice).

OK - it's difficult without seeing the actual calculation but UAH for US48 is 0.22 Deg C /decade so 17% lower gives 0.183 and 30% lower gives 0.154 which gives amplification factors over land of 20% and 43% - that doesn't seem consistent with the non amplification expected over land

One other thing that has been mentioned earlier is the degree of subjectivity in rating stations (particularly if you have seen the temperature history and from the temperature history you expect a siting issue you might see what you expect !)
So I wonder if would be sensible to get someone like john n-g to rate the stations independently based on the Leroy 2010 criteria and see if he rated them similarly - also there was a good suggestion on the other thread from MMM I don't know if you saw it :

run the Watts et al. methodology using Leroy 2010, and then run it again using Leroy 1999. Compare the two graphs. This tells you how much the Leroy change actually matters. If that change is small, well, then, we know to scrutinize the rest of the methodology. If the change is big, then we know that the classification method used is actually important, and we can think about that...

OK – it’s difficult without seeing the actual calculation but UAH for US48 is 0.22 Deg C /decade so 17% lower gives 0.183 and 30% lower gives 0.154 which gives amplification factors over land of 20% and 43% – that doesn’t seem consistent with the non amplification expected over land

Yet it's dead on if the amplification is expected. Assuming Christy is correct that there is amplification.

I would consider these results as support of amplification.

And I also would point out that the 0.22 UAH figure is well below the 0.31 C/d adjusted trend of NOAA (or the 0.28 C/d raw trend, for that matter).

Therefore, I contend that it is NOAA that needs to do the 'splainin'.

One other thing that has been mentioned earlier is the degree of subjectivity in rating stations (particularly if you have seen the temperature history and from the temperature history you expect a siting issue you might see what you expect !)

Well, yes. That's why I made darn sure that when I did the re-ratings i did not have the station data in front of me. Doing it "blind" is essential for objective results.

I was and am acutely aware that there will be close scrutiny of the ratings themselves. As there should be.

To repeat from earlier, we are using ONLY the heat sink/source parameters from Leroy 2010, as we did for our previous paper (Fall et al., 2011, which had very different results).

And do bear in mind that I rated the stations for Fall et al., using Leroy (1999).

I found results that showed no difference between well and poorly sited stations for Tmean trend. And we went to publication even though the results were very different from the results of Watts, et al.

So I wonder if would be sensible to get someone like john n-g to rate the stations independently based on the Leroy 2010 criteria and see if he rated them similarly – also there was a good suggestion on the other thread from MMM I don’t know if you saw it :

Funny you should mention that. We had John N-G do a spot-check on my ratings couple of months (before the pre-pub release). Anthony needed to be sure that I was not 'way into Confirmation Bias Land! And so did I, for that matter.

As it turned out, we differed only on one station where he pointed out that rock formations I thought we natural we probably landscaping. So I made the change. (And I don't even know if this one wound up getting dropped for TOBS bias or not.)

The trouble is that even with the reduced station set (600 down from 1000), it still takes a lot of time. After the dust cleared on the QC, I had spent hundreds of hours at it. It won't be easy to get anyone who will spend that much (unpaid) time on it.

So while we can get spot checks, getting the whole thing done over will be difficult. I am sure there are one or two borderline cases, esp. where the imagery was less than stellar. I have had to re-look at some of the ratings as better Google Earth, Bing, and Street Level imagery has become available. Some of it has "helped" our results, some not. But I really, really have tried to do it straight and get it right.

I daresay many more man-hours (raw or adjusted) have gone into this paper than into most peer-reviewed papers of this sort.

run the Watts et al. methodology using Leroy 2010, and then run it again using Leroy 1999. Compare the two graphs. This tells you how much the Leroy change actually matters. If that change is small, well, then, we know to scrutinize the rest of the methodology. If the change is big, then we know that the classification method used is actually important, and we can think about that…

I ran a quickie on that using the updated set. Change was huge. The Leroy version change has a very large effect.

also there was a good suggestion on the other thread from MMM I don’t know if you saw it :

Can't find the ref. Could you spot for me?

Evan, allow me to point out that if you ever want to cite the supposed discrepancy between satellite trends and land-based trends, a "if Christy is right" reference will get reviewers to roll off their chairs laughing for multiple reasons. You'll have to do the work. Fortunately, it has already been done once, and I'll leave it to your co-author Steve McIntyre to find that work.

Just so you know, the 20% is the absolute largest amplification factor over land that you will find in the climate models. Some, however, have a 20% 'dampening'. Mean of the climate models around 0.98.

Hint: see the Klotzbach et al disaster.

I'll leave all that up to Dr. Christy. He does the microwaves. I do the surface stations. Whatever discrepancy there is (or is not), is his territory. As it stands, the NOAA-adjusted trend is ~25% higher than the UAH trend.

It is also unequivocally demonstrated that the well sited station trends are adjusted upwards to match the poorly sited stations (quite apart from TOBS and MMTS, which our revised paper addresses).

For that to be considered valid it needs to be demonstrated that well sited stations produce bad data in need of severe adjustment, while poorly sited stations produce good data in need of very small adjustment. I predict that that will be a tougher nut to crack than discrepancy between Watts (2012) and UAH.

Evan, there's already stuff in the literature that shows Christy is wrong. And that's the co-author you are depending on! At the very least you, and with you I mean the author group, cannot claim not to have known.

Perhaps.

Or perhaps our findings are empirical evidence that it is the stuff in the literature may be wrong. Direct observations -- if determined to be correct -- trump models.

As it stands today, UAH and NOAA-adjusted Tmean trend data for CONUS are not currently in agreement, in any case. So the applecart was upset long before Watts et al. (2012) came on the scene.

But in the end it is of tertiary importance to our paper. Our thesis does not hang on UAH or RSS.

What matters far more is our primary finding on Tmean trend (siting matters) and our secondary finding (the adjustments need adjusting). None of that directly involves satellite data.

At any rate, if Dr. Christy is wrong, then he will have to address the issue during peer review.

Evan, the claim that you (read: Christy) expect the land to warm faster than the lower troposphere comes from "models". That notion is wrong, but there it is. If we take the *correct* notion: land as fast, and maybe even faster than lower troposphere, your original analysis that lowered the trend by a factor two was very problematic, and you would have had to address that in the paper. Just handwaving "maybe the satellites have a warm bias, too!" won't cut it.

And last time I checked, Spencer and Christy kept on speeding up their applecart.

That notion is wrong

Is it? Or is the "correct notion" based on the comparison between official adjusted surface data and satellite data? That would indeed support the contention that you are right.

But we observe that the official adjusted ST trends are too high. Our reanalysis shows it is not as exaggerated we thought as in our original paper, but is still severely exaggerated.

Just handwaving “maybe the satellites have a warm bias, too!” won’t cut it.

The satellites are not measuring ST.

[That's not really a credible answer. I think you're going to need one, for the reviewers, although they might decide that they don't care, if you get lucky -W]

Evan,

Sadly I see you did not even look up the prior literature which I pointed out to you (again: see the Klotzbach et al disaster).

The "correct notion" is based on the climate models. You cannot claim you expect the surface to warm slower than the troposphere (as Christy did - it's the claim present in the first version of your paper) without a climate model. And you'll have to cherry pick climate runs or a specific model to make that claim work. Any reviewer worth his money will have something to say about that, and it won't be pretty.

Sadly I see you did not even look up the prior literature which I pointed out to you (again: see the Klotzbach et al disaster).

That is true. Instead, I worked for ten hours-plus and then I traveled home and then I spent three hours searching for more stations -- and found one. (Then I collapsed in exhaustion.)

You cannot claim you expect the surface to warm slower than the troposphere (as Christy did – it’s the claim present in the first version of your paper) without a climate model.

I don't expect it. I observe it.

I do not accept that observations require models to support them. Rather, I suggest that models require observations to support them.

Besides, our paper is not about creating models. We merely observe.

What we observe regarding ST vs. LT is:

A.) poorly sited stations and
NOAA-adjusted trends exceed LT trends.

B.) Well sited stations are exceeded by LT trends.

Perhaps someone would care to confirm or disprove our observations and then create a model that conforms therewith. That would be scientific method.

I contend that the models do not disprove my observations; it is my observations which call the models into very serious question.

For my observations to fail peer review, they must be found to be methodologically or tactically flawed. Failure to match models is not a valid criterion.

And, finally, I repeat that agreement or disagreement with the satellite record is entirely tertiary to our paper.

[That's not really a credible answer. I think you're going to need one, for the reviewers, although they might decide that they don't care, if you get lucky -W]

If the ST/LT trend comparisons start turning into trench warfare, we can merely drop them and the paper will lose next to nothing.

[I'm dubious about that - and I certainly would be, if I was a referee. The MSU provides a useful benchmark comparison, it would be odd not to use it -W]

I doubt we will have to, but we can, if necessary.

(Also, see the above comment.)

Evan, it's nice that you think you can drop the satellite versus surface trend difference, but remember that it was used in the first version as evidence that you were right. Now that it suddenly turns out to be contradictory, you just call it tertiary?

It really starts to sound like Klotzbach et al all over again...

BTW: does removing the satellite issue not equal throwing John Christy off the paper? :-)

"So the applecart was upset long before Watts et al. (2012) came on the scene."

I'm betting (2013) at the earliest, if ever.

[Um yes. Its all gone a bit quiet, hasn't it? Wasn't there supposed to be a work-in-progress page at WUWT? -W]

[I'm dubious about that - and I certainly would be, if I was a referee. The MSU provides a useful benchmark comparison, it would be odd not to use it -W]

But while LTT and ST are in the same park, they are not the same bench.

Evan, it’s nice that you think you can drop the satellite versus surface trend difference, but remember that it was used in the first version as evidence that you were right. Now that it suddenly turns out to be contradictory, you just call it tertiary?

It was never anything but tertiary. We find what we find. And can demonstrate it.

And what the evidence clearly demonstrates is that ST trend on land is lower than LT trend -- as supported by the data.

Citing models won't change that finding. Looks as if there'll have to be some new models. I doubt very much we'll drop it. No need. (Though we could.) Busted models without valid statistical backup doth not a refutation make.

[Um yes. Its all gone a bit quiet, hasn't it? Wasn't there supposed to be a work-in-progress page at WUWT? -W]

2013? No doubt. These things take time.

As for "work in progress", well, that would be me. Much work. And much progress! (And I don't do "quiet". Q.E.D.)

P.S., Thanks again for entertaining this discussion.

Evan Jones is defending Watts PBS Newhour statements over on the PBS blog:
http://www.pbs.org/newshour/rundown/2012/09/climate-change-from-differe…

The money quote:
"NOAA is going to have to readdress its entire USHCN dataset."

[EJ is, to my mind, over confident of his correctness. But then, to an absurd degree, so is AW. Also I think (somewhat in the Roy Spencer mold, http://scienceblogs.com/stoat/2011/03/05/dr-roy-spencer-is-sad-and-lone/) he needs to talk to more people, and not just in blog comments -W]

@Evan Jones
"2013? No doubt. These things take time."

But Anthony Watts claims that
"After going through our second round of review, I’m confident that our results will hold up"

Does it mean your paper is already submitted?

Ah, so they finally posted that!

Yes, I am very confident, especially since TOBS and MMTS has been dealt with and the results are still robust.

They hold no matter how you subdivide the data by type (All stations, Rural only, Urban only, MMTS only, CRs 0n;y, No Airports, Airports only, etc., etc.).

So I have every reason to be confident. Yes, NOAA adjusts step changes for station moves, but does not concede trend is affected by poor microsite over time (Menne, 2010).

So, yes, NOAA is going to have to readdress adjustment. And I'll go even further. It is not merely USHCN that has to be readdressed, it is GHCN. But rating the GHCN stations would be a much wormier bag of beans (for a number of reasons I could go into if anyone cared..)

Does it mean your paper is already submitted?

Not before I get a chance to correct the grammar and punctuation!

But, no, we are waiting for our stats guys to add the error bars and Dr. Chtristy to address the satellite issues.

@Evan Jones

"But, no, we are waiting for our stats guys to add the error bars and Dr. Chtristy to address the satellite issues."

If it's not submitted, then what kind of review Watts is talking about?

"Yes, I am very confident"

Are you confident the same way Anthony Watts was confident about Fall et al. results BEFORE the paper was published?

http://scienceblogs.com/deltoid/2011/05/13/anthony-watts-contradicted-b…

I am not sure whay you mean by that. Watts, et al., has not yet been submitted. It has only been subject to independent review, not peer review.

I personally did the ratings for Fall, et al (of which I was a co-author). We went to press despite findings that we did not expect.

I also personally made the ratings for Watts, et al.

I did not expect the results to be any different from Fall, et al. We can explain why those findings were different and why the previous findings were not correct. (Well, they were correct -- for the flawed Leroy, 1999, rating methodology.)

I am confident in the results because we have addressed the concerns expressed during independent review and they still hold up. At this stage, that would be a pretty good reason for confidence.

Sorry, the entire approach is sunk by the Dr. Who Effect

Already asked and answered previously. You need to go back and read a bit.

Actually, we go much further than Menne (or what we did in Fall, et al., for that matter) to exclude recently moved stations. We would have gone even further, but MMS does not provide reliable, uniform data earlier than around 2003.

Did you raise similar objections to Menne et al.? Or to Fall? (Just curious.)

"I am not sure whay you mean by that."

Maybe you should ask Anthony Watts, he's the one who's talking about the "second round of review".

"We went to press despite findings that we did not expect."

You didn't really have a choice since N-G joined your team. You could only decide how to spin these results and pretend it was all about the diurnal temperature range.

"I am confident in the results because we have addressed the concerns expressed during independent review and they still hold up."

Well, that's what Anthony Watts said:

"Dr. Pielke Sr. and I, plus others on the surfacestations data analysis teams (two independent analyses have been done) see an entirely different picture [than Menne et al], now that we have nearly 90% of USHCN surveyed. NCDC used data at 43%, and even though I told them they’d see little or nothing in the way of a signal then, they forged ahead anyway."
http://wattsupwiththat.com/2010/05/19/tom-karls-senate-dog-and-pony-sho…

Surprisingly, this "entirely different picture" turned out to be essentially a confirmation of Menne et al. results.

And even earlier,

"Around 1990, NOAA began weeding out more than three-quarters of the climate measuring stations around the world. They may have been working under the auspices of the World Meteorological Organization (WMO). It can be shown that they systematically and purposefully, country by country, removed higher-latitude, higher-altitude and rural locations, all of which had a tendency to be cooler."

"Of course there will be those who say “but it is not peer reviewed” as some scientific papers are. But the sections in it have been reviewed by thousands before being combined into this new document. We welcome constructive feedback on this compendium."

http://wattsupwiththat.com/2010/01/26/new-paper-on-surface-temperature-…

My point is, the surfacestations project people have a long history of making overconfident statements that, despite 'independent analyses' and 'review by thousands', turned out to be flatly wrong. That's why I take anything you say about being confident with a grain of salt.

You didn’t really have a choice since N-G joined your team.

We knew the results well before he joined the team. He came on towards the end of the process to do the Monte Carlo on the findings.

In fact he even suggested that Leroy (1999) was off the mark and that we should have a "plus" and "minus" category for Class 3 and 4 stations. If I had taken him up on that, our findings might have changed.

He also urged rerating the stations a la Leroy 2010, but we decided to wait until the followup.

Well, Watts (2012) is the followup.

As for Fall, we did show results different from Menne, esp. re. Tmin. But I told them at the time, though, that no one would give half a damn for anything by Tmean, and, boy, was I right or what.

Dr. Pielke emphasized all along that, whatever the findings were, they would be important and needed to be published. We were, and are, in it for the science.

At any rate, the analysis is done and the factors missing are now included. So it's not a matter of "expecting" a result. We now have the result. So I'll stand on my confidence.

"It has only been subject to independent review, not peer review."

Two questons on this:
1) Are you implying here that peer-review is not independent?
2) In what way is the review of this work independent?

Evan Jones: I don't understand why the surfacestations team, supposedly concerned about siting issues, wouldn't take the most obvious and straightforward path to show siting problems.

We have a "gold standard" now, so why not compare the well -sited and poorly-sited stations in grids with the USCRN station anomalies, and show the well-sited stations match the USCRN anomalies, and the poorly-sited station do not match.

Menne did this with the surfacestations previous classifications of stations, and showed no difference for matching USCRN between the two classes.

You also have a problem with your "we are only reporting observations" statement. Actually you are drawing conclusions from the observations, that may not be valid, and this extrapolating these conclusions to comparisons with satellite records etc. Come back down to the ground, and finish the analysis of the surface stations. If your team has calculated the anomalies correctly for the two sets of stations, you still don't know which is correct. You are jumping to a conclusion when you claim the Class 1/2 stations are correct. Matching the USCRN data from the last five years to each of the two sets of data, should tell you which set is correct.

Possible problem: What if over the last five years both sets of stations match the USCRN stations? If so, then you have a big problem. For the "siting issue makes significant difference in anomaly", to be believable, then there should be a significant difference between the poorly sited and the best sited stations compared to the gold standard USCRN stations over the the history of the USCRN. If not, it raises questions about lack of historical siting data for the stations.

My goodness, why all this pussyfooting around? If siting is an issue, just compare the poorly sited/good sited anomalies with the "gold standard" USCRN data, and show it. Its that simple.

Menne already did this with the previous siting classifications, although with less USCRN data. He found no significant difference in the two sets of stations with USCRN station data. That is a very powerful finding that you will need to overturn, before publishing.

Let me try saying it this way; surfacestations parsed the stations into two sets previously, lets call them set A1 and A2. We also have a set of stations from USCRN, which we can use over timeframe that USCRN has existed ( the set of stations are 5-10 years old), which we can call the USCRN set.

Menne showed that set A1 matched set A2 results, and both matched the published NOAA temperature reports. Menne also showed that A1 and A2 matched the USCRN set, over the timeframe covered by the USCRN.

Now surfacestations has re-parsed the stations into sets B1 and B2, and claims B1 doesn't match B2 (and that B2 matches the published temperature trend from NOAA). But if B2 matches the published NOAA trend, then B2 should match sets A1 and A2, and likely matches the USCRN set.

This leaves set B1 out in the cold (pun intended). The class 1/2 set that surfacestations now proposes, very likely doesn't match the USCRN set. Whoa… stop the presses. If set B1 conflicts with the USCRN data, then the surfacestations conclusion (that the new classified 1/2 stations correctly measures temperature anomalies) has a big problem.

The surfacestations Team needs to compare the new classifications with USCRN data, similar to the Menne paper.

Which, of course, was exactly what Atmoz did for one of Willard Tony's 10 worst stations. Guess what he found.

"We knew the results well before he joined the team."

Are you implying that Anthony Watts knew the results and was purposely lying when he said that

"the early arguments against this project said that all of these different biases are going to cancel themselves out and there would be cool biases as well as warm biases, but we discovered that that wasn’t the case. The vast majority of them are warm biases"
?

"But I told them at the time, though, that no one would give half a damn for anything by Tmean, and, boy, was I right or what."

I know, I know, actually it was pretty sad. You didn't even deserve to get a sticky post at WUWT, and the only person who seemed to care about the paper was Dr. Nielsen-Gammon.

Fortunetely, this time it's different.

Just a dumb (I mean a *really* dumb) -- when the USDA redrew its plant hardiness zones based on where various species of plants can make it through the winter, what USCHN temperature stations did those plants use to figure out where they can now grow? And did the plants use raw or adjusted temperature data?

Inquiring minds want to know!

That should be,

Just a dumb (I mean a *really* dumb) **question**...

Following up: If the longer-term warming trends computed from USHCN stations are fictional (due to hot bbq's nearby, etc.), then there should certainly be a disconnect between the new USDA hardiness zone map and where various species of garden plants can actually grow.

But it definitely appears that various species of garden plants are cooperating with the curators of the USHCN temperature record by deciding where they are now able to grow.

Exciting news (ahem): Don Tony yesterday replied to Victor Venema at/in my location "REPLY: Watts et al 2012 on the sidebar, soon to be updated to handle the TOBs (non)issue, ..."

["soon" eh? That would be interesting, at least a bit, but we'll have to wait and see. VV seems to be doing a good job over there -W]

Any news about Watts et al 2012 (soon to be 2013)? It takes awfully long time to handle the TOBs "non-issue"...

[Not a sausage, as far as I'm aware -W]

Why Watts's new paper is doomed to fail review

More like this

Last warning: mustelid.blogspot.com

Dynamics of Stoats

Gunz: constitutionalism and majoritarianism

That it is easier to agree on economics than morality

Morality and economics

Comments of the Week #2: From the Sun's death to the light elements

Nobody promotes antivaccine nonsense in my state...without receiving some Insolence

Allergies, hookworms, and genetic engineering