Open Source Dendrochronology

Dendrochronology is the study of tree-rings to determine when and where a tree has grown. Everybody knows that trees produce one ring every year. But the rings also vary in width according to each year's local weather conditions. If you've got enough rings in a wood sample, then their widths form a unique "bar code". Collect enough samples of various ages from buildings and bog wood, and you can join the bar codes up to a reference curve covering thousands of years.

Dendrochronology has a serious organisational problem that impedes its development as a scientific discipline and tends to compromise its results. This is the problem of proprietary data. When a person or organisation has made a reference curve, then in many cases they will not publish it. They will keep it as an in-house trade secret and offer their paid services as dendrochronologists. This means that dendrochronology becomes a black box into which customers stick samples, and out of which dates come, but only the owner of the black box can evaluate the process going on inside. This is of course a deeply unscientific state of things. And regardless of the scientific issue, I am one of those who feel that if dendro reference curves are produced with public funding, then they should be published on-line as a public resource.

But there is a resistance movement: amateur dendrochronologists such as my buddies Torbjörn Axelson and Åke Larsson. They practice open source data transparency on the net, which means that arguably amateur dendrochronology is at this time more scientific than the professional variety. Torbjörn recently published a fascinating study of a wooden building in Dalecarlia, showing that it was originally built in ~1240 and then refurbished in the 1490s, the 1570s and the 1830s. If you doubt his results, then just re-run the analyses. His data are all there in the report. I came into contact with Torbjörn through Aard and then had the pleasure of seeing his 2007 debate piece into print in Fornvännen, introducing many other professionals to the issue. (English translation here.)

Now the guys have started a wiki and dug into Alf Bråthen's work. Alf (born in 1924) is an old-school non-computerised dendrochronologist who hasn't generally published his reference data. He once famously dropped nine years from his pine curve for Gotland, causing Johan Rönnby to unwittingly publish an erroneous date for the Bulverket lake settlement in his 1995 PhD thesis. Torbjörn and Lars have re-analysed some data from Västergötland that Alf has released, and whaddaya know: there are errors there too.

The point here isn't to say that Alf Bråthen is an unusually clumsy dendrochronologist. He probably isn't. The point is that all scientists are fallible, and that science moves forward through discussion. We help spot each other's mistakes, and once an issue has been wrangled over and everybody's had a chance to check the details, we have more secure knowledge than can ever be attained by a few people tending their black box. But as long as data and methods stay in that box, the process is impeded.

Update 24 June: Here's something really scary. Says Torbjörn,

The most problematic issue from an archaeological perspective may be that the Hohenheim data (Germany), which forms a lot of the basis for radiocarbon calibration, is not open to inspection. Nobody outside the "circle of the initiated" has seen the measurement series nor the average chronologies. We know that European dendrochronologists have often used the Gleichläufigkeit method (GLK), which is not dependable, but which places great demands on the skill of the analyst. If you run GLK "blindly" you can end up anywhere. Anybody who says so in public, though, is likely to make few friends.

Holy crap. I've been doing archaeology for 35 years (since I was a babe) and I had no idea this was an issue. (Well, I've done only a tiny bit of work in any place where dendro is used.)

The tree rings are everybody's. Free the tree rings!!!!! Seriously.

i've never done archaeology at all, but i've done information sciences (after a fashion - computer programming) and open source (ditto). i'm flabbergasted that anybody ever tried to do closed-source dendro; that seems all wrong for the problem domain to me.

i'm assuming that dendrochronology is dependent on a number of different variables; i would imagine general region where the tree(s) grew, species of tree, maybe soil type? so you'd need a large number of samples just to match up currently living trees from a widely distributed region, am i right? then repeat as necessary to go back in time, i'd presume. lots and lots of data gathering needed, with geographical locations and tree species specified for each sample, if i haven't misunderstood the task entirely.

and then you'd have to correlate all that data with itself. reduce each sample to a series of relative ring thicknesses, possibly with a date anchor for one or more of the rings to fix it in time, and certainly with metadata regarding GIS and species; then compare these lists one against another (within a given region, correcting for differences between tree species, etcetera) to try and infer time fixes for rings belonging to much older samples. i assume, based on what little i've heard of dendrochronology. (don't chide me too hard for making all these uninformed assumptions about dendrochronology, please. i'm a programmer --- i build inferences about other people's professions all the time, that's how i write code to help them with their jobs.)

anyway, the point of it all is that this seems like it would be a big undertaking for any single person, and a large one for an organization. rooting out the inevitable bugs and errors in the data set (and the conclusions drawn from it) would be hellish work. gods help you if more than one person or organization tried to do it that way; they'd each have to duplicate all the others' work. but it's perfect for an open source effort; multiple data sources all sharing one another's efforts, many eyes making the bugs shallow. folks could become domain specialists in little parts of the data set, serving as caretakers of their favorite sub-problems, yet with open publishing still get the benefit of peer review and cooperative bug fixing.

in fact, it sounds a little like what i understand the geneticists and molecular biologists have with their gene sequence databases and open source software tools to search them. if anything, it smells to me like an easier problem to solve, software-wise.

and now you're telling me the "professionals" tried to do it with all proprietary data, closed-source analysis? gimme an effin' break.

By Nomen Nescio (not verified) on 23 Jun 2009

What you tend to do is construct a reference curve specified as to tree species and region, such as Bråthen's curve for pine from the Swedish island-province of Gotland. It isn't actually a superhuman amount of work to do a 2000-year curve once you have the samples. With the overlap between the samples, you need to measure about 3000 ring widths and tabulate them. Then you reformat the data from millimeters to percentage-of-preceding-ring (though Åke prefers the percentage of the average of the two preceding rings).
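The reformatting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are made up for this example, the input is a plain list of ring widths in millimeters ordered from oldest to youngest ring, and every width is assumed to be greater than zero (a missing ring would need separate handling).

```python
def percent_of_preceding(widths_mm):
    """Express each ring as a percentage of the ring just before it.

    The first ring has no predecessor, so the result is one item shorter.
    """
    return [100.0 * cur / prev for prev, cur in zip(widths_mm, widths_mm[1:])]


def percent_of_two_preceding(widths_mm):
    """Express each ring as a percentage of the mean of the two rings before it.

    The first two rings are dropped, so the result is two items shorter.
    """
    return [
        100.0 * widths_mm[i] / ((widths_mm[i - 1] + widths_mm[i - 2]) / 2.0)
        for i in range(2, len(widths_mm))
    ]


# Hypothetical measurements in millimeters, oldest ring first.
widths = [2.0, 1.0, 1.5, 3.0]
print(percent_of_preceding(widths))      # [50.0, 150.0, 200.0]
print(percent_of_two_preceding(widths))  # [100.0, 240.0]
```

Either way, the point of the conversion is that a fast-growing and a slow-growing tree from the same region end up on a comparable scale, since only the year-to-year change is kept.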

Well, for a working 2000-year chronology you may need at least 20,000 single ring width measurements, which gives a sample depth of 10. In practice such a chronology will more likely have a sample depth of 25-100 for at least some parts of the timespan, which means that you will probably have about 100,000 single ring width measurements from maybe 500-1000 wood samples behind such a chronology (not counting a lot of rejected ones). But that is still not a superhuman amount of work, if you have a computer and proper equipment. The problem will be to find the more than 1000-year-old samples to measure. That will not be possible for any single person. That is probably the main reason why those measurement data sets are locked inside the walls of many institutions. And therefore it is important that the wood samples from archaeological investigations are kept by the museums in the same way as other finds. Because as long as the samples are available, the measurements may be recreated and released openly, and independent, open and transparent chronologies may be created based on them.
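The arithmetic behind these figures is easy to check. The snippet below is just a back-of-the-envelope sketch; the helper names are invented for illustration, and it treats sample depth as a simple average over the whole timespan, which a real chronology would not have.

```python
def mean_sample_depth(total_rings, span_years):
    """Average number of samples covering each year of the chronology."""
    return total_rings / span_years


def mean_rings_per_sample(total_rings, n_samples):
    """Average number of measured rings in each wood sample."""
    return total_rings / n_samples


print(mean_sample_depth(20_000, 2000))      # 10.0
print(mean_sample_depth(100_000, 2000))     # 50.0
print(mean_rings_per_sample(100_000, 500))  # 200.0
print(mean_rings_per_sample(100_000, 1000)) # 100.0
```

So 100,000 measurements over 2000 years correspond to an average depth of 50, which sits inside the 25-100 range mentioned above, and to samples of roughly 100-200 rings each.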

By T. Axelson (not verified) on 24 Jun 2009

Hi Martin,

What you say may be true with respect to for-profit dendroarchaeological applications (I've heard similar things said before), but this is certainly not accurate for dendrochronology as a whole. Quite a large amount of tree-ring data from North America and elsewhere are available from the ITRDB at NOAA:

http://www.ncdc.noaa.gov/paleo/treering.html

There are also European data repositories.

Your description in #4 above also doesn't accurately convey the process or the quantity of raw data that go into building a master chronology. As Torbjörn indicates in #5, the number of individual rings in a master chronology can be tens or hundreds of thousands. And it is not simply a matter of measuring them. Rather, it is standard practice for dendrochronologists to go through a process of visual, graphical, and statistical pattern matching called cross-dating to ensure each ring in every sample is assigned to its correct year of formation.
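To give a rough feel for the statistical part of that pattern matching only, here is a minimal Python sketch. It is not how any particular laboratory or software package does cross-dating: the function names and the sample numbers are hypothetical, it uses a plain Pearson correlation on already-normalized series, it assumes the sample lies entirely within the reference, and it ignores the visual and graphical checks as well as missing and false rings.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long series (0.0 if either is flat)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def best_offset(reference, sample):
    """Slide the sample along the reference and return (offset, r) of the best fit.

    Returns None if the sample is longer than the reference.
    """
    best = None
    for offset in range(len(reference) - len(sample) + 1):
        window = reference[offset:offset + len(sample)]
        r = pearson(window, sample)
        if best is None or r > best[1]:
            best = (offset, r)
    return best


# Hypothetical reference curve and an undated sample that really belongs
# at position 3 (same pattern, but from a tree that grew twice as fast).
reference = [1.2, 0.8, 1.5, 0.6, 1.1, 1.9, 0.7, 1.3, 1.0, 0.5]
sample = [2 * w for w in reference[3:8]]

offset, r = best_offset(reference, sample)
print(offset, round(r, 3))
```

A high correlation at one offset is only a candidate date. With short samples, several offsets can score well by chance, which is one reason the visual and graphical checks matter.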

More information here at the Ultimate Tree-Ring Web Page:
http://web.utk.edu/~grissino/

cheers,
Kevin

Hi Kevin,
I do agree. And my point is that the people and laboratories dealing with data useful for dendroarchaeology in Europe ought to become as open with their data as those dealing with data useful mainly for e.g. dendropaleoclimatology often are. A search for European Quercus before AD 1000 at ITRDB will indeed not give very many hits...

By T. Axelson (not verified) on 04 Jul 2009

There is an enormous amount of data available, mainly raw ring width measurements (and in some cases earlywood, latewood, isotopes, etc.), from various sources, notably the International Tree-Ring Data Bank maintained by the National Oceanic and Atmospheric Administration (NOAA), including large amounts of European data. In a perfect world all data would be freely available. Some dendrochronologists, however, are not publicly funded. Their argument is, "this data is how I make my living, if I give it away I'll starve." Hard to argue with that.

Scientists who are publicly funded should be made to not only archive their data, but curate their samples. There are cases where that was not done and valuable scientific material has been lost forever. An example is the virgin teak samples used by Berlage. Modern sampling of younger teak has shown that he misdated the wood, but it is no longer available to make the necessary corrections. Nor are his ring width measurements available.

An example of European scientists, publicly funded, who have not made their data available are Baillie and Pilcher at Belfast. They have a very long oak chronology, but have never contributed their data to any repository. Some of the German dendrochronologists may also fall into this category.

Public funding agencies should get tough and demand that data be made publicly available, and that includes universities. Unfortunately, university administrators are usually clueless about such things.

By Malcolm Cleaveland (not verified) on 24 Feb 2011

There are some factual errors in your statement. First, I do not think that the Hohenheim data has been used for C-14 calibration. The main calibration dendro resources are the bristlecone pine and other species from the southwestern U.S., including Douglas-fir, foxtail pine, etc. In addition, varved sediments in the anoxic Cariaco Basin have been used. You ignore the importance of crossdating, saying it is known that trees form a new ring every year. A.E. Douglass proved that "what everyone knows" is not necessarily true, hence the use of crossdating to account for missing and false rings. In a project to date musical instruments I examined a number of chronologies from Italy, France and Spain. Some definitely had problems, mainly poor replication at critical points, but there were definitely excellent chronologies. There is an enormous amount of data in the ITRDB from N. and S. America, Europe and Asia. There is some from Africa, but not as much.

The tree-ring section of the IntCal09 calibration curve uses in principle the data sets also present in IntCal04. These are derived from Pacific Northwest Douglas fir, Californian sequoia, Alaskan sitka spruce, German oak and pine (Hohenheim) and Irish oak (Belfast). It's all available here: http://www.radiocarbon.org/IntCal09%20files/IntCal09_atm_rawdata.csv
The marine extension towards older times is based on corals and on foraminifera from varved sediments.
The Hohenheim data is unpublished, and so was the Belfast data until April 2010.

By Petra Ossowski… (not verified) on 02 Mar 2012

I've spent quite some time today analyzing the IntCal09 list of samples for tree rings
http://www.radiocarbon.org/IntCal09%20files/IntCal09_atm_rawdata.csv
for the period from roughly AD 500 to 1500 BC (BP 1500-3500).

First, there is no bristlecone pine in this list!
The American data come from recent and older trees, going back to BP 2089, with Californian sequoia in the oldest section.

The older section of IntCal09 (before BP 2089) contains ONLY European tree ring data - back to BP 12549!

The oldest section is based on German Pine with samples marked "Hd" (Heidelberg?).
From BP 10059 towards younger times there is a mix of German and Irish oaks. That's all!