Now we know why UAH v6 is so late...

By stoat on April 26, 2015.

I said that AFAIK S+C's code for UAH isn't available but VV pointed me to Eli who pointed me to ncdc.noaa.gov/cdr/operationalcdrs.html which offered me RSS and UAH. UAH is the one we want to take the piss out of, so read it and weep, below.

First, though, as far as I can tell it doesn't even tell you what version of UAH this corresponds to (ah, but actually it's 5.4. You can tell this by reading things like "The program txx_1_5.4". Yay). Under "1.3 Document Maintenance" it does say: "When requested by NOAA, if there have been any changes in procedures required for the production of the products or if the description of procedures has inadvertent omissions or errors, we will update this"; and there certainly have been updates to UAH since 2011; but the doc is still the original.

I'm also a teensy bit unclear about what it's describing: on the face of it, it's missing rather a lot, because it says: "The deep layer temperature products described here come from measurements produced by Advanced Microwave Sounding Units (AMSU-As, hereafter “AMSU”) ... Before AMSU, the Microwave Sounding Units (MSUs) flew on the NOAA polar orbiters since late 1978. Processing of the older MSU data, except in the homogenization routines, is not addressed by this document." WTF?

[Update: by bizarre co-incidence, S+C+B have just released, or announced, v6. As they say: "Many procedures have been modified or entirely reworked, and most of the software has been rewritten from scratch." There's just a hint that they may have rushed this out: "After three years of work, we have (hopefully) finished our Version 6.0", but who knows.

Ha. Actually, there's rather more than a hint that this may be rushed: if you read to the end, they back off: "This should be considered a “beta” release of Version 6.0, and we await users’ comments to see whether there are any obvious remaining problems in the dataset."

Eli, never one to stand on ceremony, steps into the torrent of ignorant praise over at Roy's to ask: where's the code? But it's not available "yet".]

Anyway, onto the excuses (my bold):

The codes described here and provided to NOAA have not been optimized in a software engineering sense. Much of the programming structure originated over 20 years ago, starting around 1989, and was written by the authors who came from a generation of self-taught programmers and have little formal computer programming training. Much of the work was done with little funding support, so no professional programmers were utilized. In Christy’s code, there are numerous sections devoted to image creation through NCARgraphics for detection of problems, but which are not necessary for the production of the ASCII files desired by the users.

There is little use of subroutines in Spencer’s code, but more in Christy’s. Continuity of operational procedures has taken precedence over elegance or speed of execution.

As algorithm enhancements were tested, many were abandoned, but those portions of the code were simply commented out rather than deleted, i.e. they are vestigial in reality. While this is somewhat sloppy from a software design standpoint, the practical advantage of this is to provide a detailed reminder of what has been tried before.

In some cases, rather than having unused code commented out, there are sections which are never branched to in the operational running of the code because an initial adjustable parameter is always assigned a single value. A good example is diurnal adjustment of the AMSU data, for which much code is included, but has never been used operationally. In other cases, a particular ancillary analysis was needed for a publication, but not needed for production runs. These sections are usually commented out.

Most of the programs have array dimensioning and assignments which must be manually updated every month and year, since (at this writing) they only handle data through July 2011. Similarly, if a new satellite is added, then there are program changes which must be made to accommodate those new datasets.

The programs were originally developed on an SGI workstation or an IBM mainframe, and then later transitioned to Linux. As a result, all previous binary input and output files had a byte-ordering issue. We retained the SGI handling of binary files, so some of the programs must be run with a byte-swap option used on execute. This might not be an issue if NOAA re-generates all output files from scratch, but if our previous output files are used, there will be a problem.

Also, we have had problems processing of a month’s worth of global AMSU data causing some sort of memory size allocation exceedance during a single program execution, which leads to only a portion of the data being processed properly. This is also handled with a special option during execute.

Well, that looks like a perfect way of making sure that no-one at all ever reads your code. But it also looks like a way of ending up with a hideous heap of gunk that even you can't update.

Update: mmm

                ksat1 = 18 ! NOAA 15, 16, or 18
                ksat2 = 18

      klun1 = 165
                klun2 = 165


                istore=1

                diffdat=0

      do 1000  ksat=ksat1,ksat2  !...15 or 16 or 18

c....OPEN OUTPUT FILES ..................................................................
      if(ksat.eq.15)then
         OPEN(191,FILE='/rstor/spencer/amsu/grids/amsu_n15_monthly_2LT.grd',form='binary',access='append')
         OPEN(192,FILE='/rstor/spencer/amsu/grids/amsu_n15_monthly_2.grd',form='binary',access='append')
         OPEN(193,FILE='/rstor/spencer/amsu/grids/amsu_n15_monthly_4.grd',form='binary',access='append')
                end if

                if(ksat.eq.16)then
         OPEN(191,FILE='/rstor/spencer/amsu/grids/amsu_n16_monthly_2LT.grd',form='binary',access='append')
         OPEN(192,FILE='/rstor/spencer/amsu/grids/amsu_n16_monthly_2.grd',form='binary',access='append')
         OPEN(193,FILE='/rstor/spencer/amsu/grids/amsu_n16_monthly_4.grd',form='binary',access='append')
                end if

                if(ksat.eq.18)then
         OPEN(191,FILE='/rstor/spencer/amsu/grids/amsu_n18_monthly_2LT.grd',form='binary',access='append')
         OPEN(192,FILE='/rstor/spencer/amsu/grids/amsu_n18_monthly_2.grd',form='binary',access='append')
         OPEN(193,FILE='/rstor/spencer/amsu/grids/amsu_n18_monthly_4.grd',form='binary',access='append')
                end if

c.....input files.......
       if(ksat.eq.15)then
c          OPEN(11,FILE='/rstor/spencer/amsu/grids/amsu_n15_9808_newLC.grd',form='binary')
c          OPEN(12,FILE='/rstor/spencer/amsu/grids/amsu_n15_9809_newLC.grd',form='binary')
... 100+ similar lines still in the original removed...
c          OPEN(157,FILE='/rstor/spencer/amsu/grids/amsu_n15_1010_newLC.grd',form='binary')
c          OPEN(158,FILE='/rstor/spencer/amsu/grids/amsu_n15_1011_newLC.grd',form='binary')
c          OPEN(159,FILE='/rstor/spencer/amsu/grids/amsu_n15_1012_newLC.grd',form='binary')
c               OPEN(160,FILE='/rstor/spencer/amsu/grids/amsu_n15_1101_newLC.grd',form='binary')
c               OPEN(161,FILE='/rstor/spencer/amsu/grids/amsu_n15_1102_newLC.grd',form='binary')
c               OPEN(162,FILE='/rstor/spencer/amsu/grids/amsu_n15_1103_newLC.grd',form='binary')
c               OPEN(163,FILE='/rstor/spencer/amsu/grids/amsu_n15_1104_newLC.grd',form='binary')
c               OPEN(164,FILE='/rstor/spencer/amsu/grids/amsu_n15_1105_newLC.grd',form='binary')
                OPEN(165,FILE='/rstor/spencer/amsu/grids/amsu_n15_1106_newLC.grd',form='binary')
        end if
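
For comparison, here is a minimal sketch of how that wall of OPEN statements could collapse into a loop. It is not S+C's code. It is written in modern free-form Fortran and assumes, from the excerpt alone, that there is one input file per month named amsu_nNN_YYMM_newLC.grd, with unit 11 being August 1998 and unit 165 being June 2011. The program name, the constants and every variable except ksat are hypothetical. It only prints the unit/file pairs; the OPEN itself is left as a comment, with standard stream access standing in for the non-standard form='binary'.

```fortran
program list_input_files
  implicit none
  ! Hypothetical sketch: the naming pattern and unit numbering are inferred
  ! from the excerpt above, not taken from the actual UAH source.
  character(len=*), parameter :: griddir = '/rstor/spencer/amsu/grids/'
  integer, parameter :: first_year = 1998, first_month = 8   ! unit 11  = 9808
  integer, parameter :: last_year  = 2011, last_month  = 6   ! unit 165 = 1106
  integer, parameter :: first_lun  = 11
  integer :: ksat, iyear, imonth, k, nmonths, lun
  character(len=128) :: fname

  ksat = 15
  ! Number of monthly files, counting both end months.
  nmonths = (last_year - first_year)*12 + (last_month - first_month) + 1

  do k = 0, nmonths - 1
     ! Convert a month offset back into calendar year and month.
     iyear  = first_year + (first_month - 1 + k) / 12
     imonth = mod(first_month - 1 + k, 12) + 1
     lun    = first_lun + k
     ! Build the file name with a two-digit year and a two-digit month.
     write (fname, '(a,a,i2.2,a,i2.2,i2.2,a)') griddir, 'amsu_n', ksat, '_', &
          mod(iyear, 100), imonth, '_newLC.grd'
     ! A real program would open the file here, for example:
     !   open (lun, file=trim(fname), form='unformatted', access='stream', status='old')
     print '(i4,1x,a)', lun, trim(fname)
  end do
end program list_input_files
```

Under those assumptions the first line printed pairs unit 11 with amsu_n15_9808_newLC.grd and the last pairs unit 165 with amsu_n15_1106_newLC.grd, so adding a month means changing last_year and last_month rather than adding another OPEN line.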

Refs

* Does He or Don't He via Eli, and others: Luther the anger translator.

I'm shocked, shocked to find bad coding going on in here.

Looks like time for an audit.

Sounds like when S&C retire, the UAH dataset might decide to do the same

As a self-taught computer programmer from roughly the same era, I can appreciate their honesty. I also cringe with embarrassment when I think of some of my early programs. That said, there were reasons elegant, well-structured, well-commented code was rarely the norm. Chief among them were:
1) I didn't know any better,
2) I didn't have the time,
3) If it ain't broke, don't fix it

Now, most of my programs were simply automation of manual tasks - tasks that I and my co-workers performed sometimes 3 or 4 times a day and sometimes only 3 or 4 times a year. It didn't make any sense to spend more time on a program than the time that was going to be saved by using it - especially since this was never part of my job title, job description, or any major portion of my employment review. In fact, many of the early programs were written on my own time (and really for my own benefit) precisely because my employer wouldn't pay anyone to do it.

Eventually, programming did become part of my job - despite never having any formal education in computer programming. The gains in productivity and efficiency didn't go unnoticed, but only rarely have I revisited my early programs and made them more presentable. It's difficult to justify rewriting code that works -- even though it may be ugly, finicky, and/or unreadable -- when there are a hundred other programs that still need to be written.

Every now and then I'm able to incorporate an old program into a new one - usually the old code is completely ignored in writing the new program. Even though the task itself may not have changed one iota, coding standards and capabilities have changed so dramatically that little or nothing is gained from even attempting to read and understand the old code.

So I can actually sympathize with Christy and Spencer and the deplorable state of their code. Every defect they mention I've done at one time myself - even when I knew better and knew I'd kick myself later for writing it quick and dirty.

[I, too, have a good deal of sympathy for the predicament they seem to be in. I've written lots of crappy Fortran code; indeed, it's easy to write crappy code in any language. However... however, actually, I probably shouldn't push this too hard, as I haven't read their actual code yet, or RSS's -W]

This may be how software grows, especially when there is no funding to do it decently. Getting funding for a short temperature series, which has many problems with non-climatic changes and needs major highly uncertain adjustments, is probably hard. Funding is allocated based on scientific merit, not for importance in a political "debate".

Even if they find the same trends as the surface temperatures, this does not sound like code I would base policy on.

"Even if they find the same trends as the surface temperatures, this does not sound like code I would base policy on."

You should take a look at some of the GISS fortran code then.

[Happily, as Eli points out below, GISStemp has already been rewritten in clean Python. They even found a couple of trivial bugs in the process. More important, I think, is that we already have 3 or 4 independent surface records, that essentially agree. If one had been an outlier, people would have been crawling over its code -W]

GISSTEMP is being re-written by Nick Barnes and the Clear Climate Code project. ccc-gistemp is available:

http://climatecode.org/blog/2015/03/how-to-run-ccc-gistemp-with-isti/

Terry:

"You should take a look at some of the GISS fortran code then."

But it's not the mainstream science side arguing that programs are worthless because they were written by scientists who are amateurs at writing code.

It is your side.

My guess is that the UAH code, as sloppy as it is, and as awkward as it is to update for new years and new satellites, and despite the need to kludge around a memory allocation issue, actually probably works surprisingly well.

Just like GISTemp, and a bunch of other scientific code which might give us software professionals cause to chuckle.

Now, the question is: if you (or those on your side of the fence) have been arguing for years that GISS code can't be trusted because of "unprofessionalism", will you

1. throw out UAH
2. make excuses for UAH and claim the case is "different" (perhaps because Spencer and Christy both can claim "God is my co-programmer")?

How much time and money would it take to write a clean modern code for UAH as is being done for GISS? How does a cleaner code improve the accuracy of UAH and GISS output?

[As d says, below, rewriting GISS in Python threw up a couple of trivial bugs, but made no significant difference. But it does increase the confidence that the GISS code is correct. And as an added bonus it was free, as CCC did it in their spare time. My feeling is that reworking UAH would be much harder -W]

Paul Kelly:

"How much time and money would it take to write a clean modern code for UAH as is being done for GISS? How does a cleaner code improve the accuracy of UAH and GISS output?"

In the grand scheme of things, not much.

But you're missing the point (so surprised).

Oh my word. I work in software testing and this is shocking.

[P]ortions of the code were simply commented out rather than deleted...While this is somewhat sloppy from a software design standpoint, the practical advantage of this is to provide a detailed reminder of what has been tried before.

Have these people not heard of version control, archiving, or documenting properly?

[R]ather than having unused code commented out, there are sections which are never branched to

Unreachable code is a very bad coding practice, and something we're taught to catch.

Well, that looks like a perfect way of making sure that no-one at all ever reads your code. But it also looks like a way of ending up with a hideous heap of gunk that even you can’t update.

You nailed it Dr. Connolley. Code that's not maintainable or update-able is very poor code.

[My strong suspicion would be that they have, indeed, "never heard of version control", in the sense that they've heard vague rumours but it's scary stuff that they're not going to touch -W]

Ha. It's worse than we thought.

There is little use of subroutines in Spencer’s code

This is one of the things that has me boggled. Subroutines and functions (I consider C to be my native programming language) make your life a great deal easier, because you only have to update that particular module, not the entire code, and the danger of undetected typos is correspondingly reduced. I can understand most of the other poor choices: version control would have been mostly unknown in non-CS academic departments when they started the project (not that that excuses their failure to adopt it subsequently); commenting out code is something that may have been done as a quick-and-dirty fix which was left in; they are primarily paid to do things other than maintain this code, etc. Some of what they did was just poor design decisions made by people who didn't know better (and I can't say I would have avoided those mistakes myself). But you don't want to make it harder on yourself than you have to.

The other thing I have little sympathy for, as Julian mentioned above, is the sections of unreachable code. GOTO has been considered harmful since about the time I was born, and this is one of the reasons why. It's one thing to have sections of code labeled, "If we get here, something has gone horribly wrong." Or you put a bunch of code in a subroutine that never gets called. I have put in debugging statements that are included via #ifdef statements at compile time, and left them in the code (but not compiled) for the production version. But to have a block of code in your main routine that you intentionally skip over with GOTO statements is one of the worst forms of spaghetti coding out there.

The overhead structure at many Federal labs is such that it costs nearly as much to hire a technician or programmer as it does to hire a Ph.D. scientist. There's probably a lot of code out there that has never been seen by a trained programmer.

And I'm not convinced we're much the worse for that. My experience is that professional programmers can turn inelegant but understandable (to a scientist) code into an over-modularized, pointer-infested black box. That happened with one of the models that I work with.

Ray

"turn inelegant but understandable (to a scientist) code into an over-modularized, pointer-infested black box"

is something Eli is going to have Ms. Rabett stitch onto a sampler

Code is read by two distinct entities: a human and a machine.

The machine doesn't care what it looks like, only that it executes. Machines don't get confused as easily as we do.

Badly-organised code is tough for the humans trying to read it and work with it. I'd personally never go near a project which wasn't fully tested (in the agile sense) because it would drive me insane wondering what the code was supposed to do or even if it did what it was supposed to do. There's no way to know without a suite of acceptance/unit tests.

Still, these are issues at the human end. Even if the code hasn't been designed to be user-friendly it could still function perfectly well.

hmmm are all of those currently-ignored input files still lying around? I wonder what the differences are between the 100+ versions of what I assume is more or less derived from the same raw data?

"c.....input files.......
if(ksat.eq.15)then
c OPEN(11,FILE='/rstor/spencer/amsu/grids/amsu_n15_9808_newLC.grd',form='binary')
c OPEN(12,FILE='/rstor/spencer/amsu/grids/amsu_n15_9809_newLC.grd',form='binary')
... 100+ similar lines still in the original removed..."

Hmmm, OK, I'm thinking those files represent one month's data, since UAH updates monthly, and they manually edit the source code each month.

Which the commentary in the OP actually says, duh: "Most of the programs have array dimensioning and assignments which must be manually updated every month and year"

[I haven't got my head round the processing chain yet. I think there's a lot of intermediate files. Probably, they just have to run this anew for each new month. Which is fine, until you need to re-process all the months... -W]

"I haven’t got my head round the processing chain yet. I think there’s a lot of intermediate files."

Maybe I'll become curious enough to look. Some preprocessing of the data they get from the satellite folks before using it to build the temp reconstruction seems likely, though.

"Probably, they just have to run this anew for each new month"

That's what it looks like, and the code is providing the history of each run. The appearance of the word "monthly" in the output files would seem like a reasonable clue :) The commentary mentioned processing of data through July of 2011, while the file being opened (not commented out) contains "1106" which would seem to be June 2011, so perhaps they meant "through June". One might think that the file *1105* is May and *1104* is April etc.
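
That reading of the file names can be written down as a short sketch. It assumes the four digits are a two-digit year followed by a two-digit month, and that years of 70 or more belong to the 1900s. Both the pivot and the program itself are hypothetical, not anything taken from the UAH code.

```fortran
program decode_yymm
  implicit none
  ! Assumption: tags are YYMM, and two-digit years >= 70 mean 19xx
  ! (the satellite record starts in late 1978), otherwise 20xx.
  integer, parameter :: tags(3) = [9808, 1105, 1106]
  character(len=3), parameter :: names(12) = &
       ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', &
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
  integer :: i, yy, mm, year

  do i = 1, size(tags)
     yy = tags(i) / 100        ! leading two digits: year
     mm = mod(tags(i), 100)    ! trailing two digits: month
     if (yy >= 70) then
        year = 1900 + yy
     else
        year = 2000 + yy
     end if
     print '(i4.4,a,a,1x,i4)', tags(i), ' -> ', names(mm), year
  end do
end program decode_yymm
```

With that pivot, 9808 decodes to Aug 1998, 1105 to May 2011 and 1106 to Jun 2011, which matches the guess that the last uncommented file is June 2011.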

"Which is fine, until you need to re-process all the months"

Given the number of times errors in their algorithm have been found, you'd think they'd have gotten tired of having to edit, compile and run the program for each historical month every time they've fixed something! Guess they have too much time on their hands ...

Do you think Spencer read your post since he just revealed his 6.0 version?

[It's a bit of a bizarre co-incidence, isn't it ;-? -W]

And 6.0 seems to be updated enough for climate risk deniers to embrace it again, as it is now in line with interim favourite RSS: No warming for 220 months!!! Hurrah, long live the endless possibilities of sloppy code!

25 years after the rock-ribbed rhetoric of their 1989 Science Op-ed:

"What have been the warming effects, if any, of anthropogenic gases? The typical answer is 0.5°C.

But the answer depends on what time interval is chosen. There was substantial increase in temperature from 1880 to 1940. However, from 1940 until the 1960s, temperatures dropped so much as to lead to predictions of a coming ice age. New, precise satellite data raise further questions about warming. From 1979 to 1988 large temperature variability was recorded, but no obvious temperature trend was noted during the 10-year period."

--R.W. Spencer and J. R. Christy, "Precise Monitoring of Global Temperature Trends from Satellites," Science 247 (March 30, 1990): 1558

S&C are still afloat on a sea of fermenting spaghetti code.

>"S&C are still afloat on a sea of fermenting spaghetti code."

RS says
"Finally, much of the previous software has been a hodgepodge of code snippets written by different scientists, run in stepwise fashion during every monthly update, some of it over 25 years old, and we wanted a single programmer to write a unified, streamlined code (approx. 9,000 lines of FORTRAN) that could be run in one execution if possible."

so it is all well structured modern? FORTRAN code now?

[If he ever releases it, we might find out -W]

Neven,

One shouldn't attribute anything to "sloppy code" w/o having seen the code. According to Spencer in his blog, UAH and RSS are in much better agreement now. I am going to assume, until proven otherwise, that, with this UAH revision, RSS has been vindicated from the suspicion of having a cold bias and being an outlier.

The claims by fake skeptics that global warming supposedly stopped x years ago are based on cherry picking and ignorance of statistics. It is unfortunate if they can now abuse two satellite data sets for their unscientific claims, but this can't be any basis to assume that the data were wrong.

If it comes up somewhere, one should point out why those conclusions drawn about the "pause" for x years lack scientific basis, but not attack the data. (However, one can mention that the UAH version 6.0 isn't backed up by any peer-reviewed publication yet, even though major changes in the methodology seem to have been made.)

Regarding RSS, a useful fact is also that the 5-95% range of uncertainty in the global TLT trend due to measurement uncertainty and methodological decisions is rather large, between 0.075 and 0.190 K/decade (for the period 1979-2012).
(ftp://ftp.remss.com/msu/data/uncertainty/percentile_realizations/tlt/tl…)

This uncertainty will likely be even larger for shorter time periods or individual regions.

On the subject of code quality in science more generally, see my long-delayed kvetch here:

http://initforthegold.blogspot.com/2015/05/why-i-am-not-paid-scientist…

And for some substantive advice on the subject see here:

http://arxiv.org/pdf/1407.2905v1.pdf

In email, Steve Easterbrook argued that the elephant in the room is a lack of funding and career path for people who are professional software engineers in science.

True indeed.

Leaving aside questions of my own career, some of the tenure/promotion decisions I've heard of for the most productive scientist/coders have been inexcusable.

I agree with dhogaza, that this doesn't necessarily impact the validity of the results, but to say it doesn't affect their credibility seems an overstatement.

It also affects recruiting and retention. Working in this sludgy software environment appeals neither to the engineer nor the scientist part of the potential contributor's ambitions.

Engineering has moved on, and despite its importance much scientific software is a computational methods backwater and in some corners even a backwater in ordinary software competence. Climate science is a poster child for this problem.

I'm not surprised to find UAH code is a hideous mess. The community climate models at least have a lot of eyeballs on them, albeit focused on the model fidelity more than its usability.

But it's the nature of software - one-offs from labs are likely to be broken, as defensive coding practices are absent from the social milieu. The pressure is to publish something credible, not to publish something correct.

The smaller the user base the more likely the code is to be broken. Getting code right is expensive and there's little institutional motivation for it. Investigators are motivated to minimize/trivialize problems. The fewer the eyes on the code, the more likely errors are to persist. (Corollary to Linus's Law of Eyeballs: with few enough eyeballs, all bugs are deep.)

That Spencer knows what he is looking for only makes matters worse, of course, but in this he isn't as much an outlier as we might want to believe. Everyone wants their own intuitions confirmed. Much of scientific method is to avoid fooling oneself. But we haven't yet applied that in a systematic and serious way to complicated computations.

[I saw your post, and sympathise. It mirrors some of my own frustrations running the "ported" unified model on workstations and linux clusters. The models and more particularly the scripts, and the build scripts, that surround them, are very environment-sensitive. Or rather, 99% of the script isn't, but it's hard to track down and fix the 1%. I would regularly puzzle over the exact compiler options; or the exact version of xargs, that was needed. The irritation (in retrospect) is that much of this was voodoo - it was hard to distinguish between things that just needed a little tweak to make it work, and things that had to be exactly correct to have a chance of working.

Like you (I think) I found the porting to a new system quite fun the first time, but much less fun in subsequent times.

However, and this is something I'd struggle to put into words, I find the entire attitude towards software different in industry. As evidenced by the reactions to http://scienceblogs.com/stoat/2014/12/06/4082/, perhaps -W]

> software different in industry

I read years back one of the Contributors at RC mentioning that the petroleum industry employs competent climatologists who do paleo modeling -- to figure out where sediment accumulated that became petroleum, and where to look for it now -- but rarely publishes.

[I mis-wrote. It's not industry - I bet the petroleum folk have crappy Fortran too - it's being in the *software* industry. People just think differently about their code (or at least, a reasonable fraction do, and they're the people that matter) -W]

"Like you (I think) I found the porting to a new system quite fun the first time."

I never enjoyed the voodoo aspect of it, and never could find a decent explanation of Fortran compiler flags, supercomputer queuing scripts, the plethora of MPI libraries, etc. anywhere sufficient to take it out of the voodoo frame for me.

As I said in my plaint, it satisfies neither the programmer impulse nor the scientist impulse to keep making guesses until something seems to work. And as far as I could tell that was what I was expected to do in my recent position.

[It took me a while to realise that it was mostly necessary to find someone who understood the various compiler flags, because some mattered and some didn't; that was frustrating and annoying. And in the case of the UKMO code, it would fail to compile at some optimisation levels and fail to run correctly at others, but this wasn't too important as the higher optimisation didn't gain you much.

I think I could do it all much better now; I failed to realise at the time what the rules were -W]
