Good Math, Bad Math

Finding the fun in good math; Shredding bad math and squashing the crackpots who espouse it.

Mark Chu-Carroll (aka MarkCC) is a PhD Computer Scientist, who works for Google as a Software Engineer. My professional interests center on programming languages and tools, and how to improve the languages and tools that are used for building complex software systems.

Basics: Significant Figures

Category: Basics, Numbers
Posted on: March 4, 2009 8:55 PM, by Mark C. Chu-Carroll

After my post the other day about rounding errors, I got a ton of requests to explain the idea of significant figures. That's actually a very interesting topic.

The idea of significant figures is that when you're doing experimental work, you're taking measurements - and measurements always have a limited precision. The fact that your measurements - the inputs to any calculation or analysis that you do - have limited precision, means that the results of your calculations likewise have limited precision. Significant figures (or significant digits, or just "sigfigs" for short) are a method of tracking measurement precision, in a way that allows you to propagate your precision limits throughout your calculation.

Before getting to the rules for sigfigs, it's helpful to show why they matter. Suppose that you're measuring the radius of a circle, in order to compute its area. You take a ruler, and eyeball it, and end up with the circle's radius as about 6.2 centimeters. Now you go to compute the area: π=3.141592653589793... So what's the area of the circle? If you do it the straightforward way, you'll end up with a result of 120.76282160399165 cm².

The problem is, your original measurement of the radius was far too crude to produce a result of that precision. The real area of the circle could easily be as high as 128, or as low as 113, assuming typical measurement errors. So claiming that your measurements produced an area calculated to 17 digits of precision is just ridiculous.
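To make that range concrete, here is a minimal Python sketch. It assumes the ruler reading could be off by about 0.2 cm in either direction; that figure is a hypothetical stand-in for "typical measurement errors", chosen because it reproduces the 113 to 128 range.

```python
import math

def circle_area(radius_cm):
    """Area of a circle in square centimeters."""
    return math.pi * radius_cm ** 2

measured = 6.2      # radius read off the ruler, in cm
uncertainty = 0.2   # assumed reading error in cm (hypothetical)

nominal = circle_area(measured)
low = circle_area(measured - uncertainty)
high = circle_area(measured + uncertainty)

print(f"nominal: {nominal:.2f} cm^2")
print(f"range:   {low:.1f} to {high:.1f} cm^2")
```

The spread between the low and high values is more than ten percent of the nominal area, which is why the digits after the first two or three tell you nothing.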

As I said, sigfigs are a way of describing the precision of a measurement. In that example, the measurement of the radius as 6.2 centimeters has two digits of precision - two significant digits. So nothing computed using that measurement can meaningfully have more than two significant digits - anything beyond that is in the range of roundoff errors - further digits are artifacts of the calculation, which shouldn't be treated as meaningful.
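As a rough illustration, the snippet below trims the computed area back to two significant digits. The helper name `round_sig` is made up for this sketch, and it relies on Python's built-in `round`, which accepts a negative number of decimal places.

```python
import math

def round_sig(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit: 2 for 120.76, -6 for 0.0000024.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)

area = math.pi * 6.2 ** 2
print(round_sig(area, 2))  # 120.0
```

Note that a plain float can't record how many of its digits are significant: the printed 120.0 still has to be read as "about 120, good to two digits".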

The rules for significant figures are pretty straightforward:

  1. Leading zeros are never significant digits. So in "0.0000024", only the "2" and the "4" could be significant; the leading zeros aren't.
  2. Trailing zeros are only significant if they're measured. So, for example, if we used the radius measurement above, but expressed it in micrometers, it would be 62,000 micrometers. I couldn't claim that as 5 significant figures, because I really only measured two. On the other hand, if I actually measured it as 6.20 centimeters, then I could claim three significant digits.
  3. Digits other than zero in a measurement are always significant digits.
  4. In multiplication and division, the number of significant figures in the result is the smallest of the numbers of significant figures in the inputs. So, for example, if you multiply 5 by 3.14, the result will have one significant digit; if you multiply 1.41421 by 1.732, the result will have four significant digits.
  5. In addition and subtraction, the result keeps only as many decimal places as the input with the fewest decimal places.

That last rule is tricky. The basic idea is, write the numbers with the decimal point lined up. The point where the last significant digit occurs first is the last digit that can be significant in the result. For example, let's look at 31.4159 plus 0.000254. There are 6 significant digits in 31.4159; and there are 3 significant digits in 0.000254. Let's line them up to add:

    31.4159
  +  0.000254
-------------
    31.4162

The "9" in 31.4159 is the significant digit occurring in the earliest decimal place - so it's the cutoff line. Nothing smaller than 0.0001 can be significant. So we round off 0.000254 to 0.0003; the result still has 6 significant figures.
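The two arithmetic rules can be sketched with Python's `decimal` module, which keeps the digits of a number exactly as written. The function names `add_measured` and `mul_measured` are invented for this example, and the sketch assumes every digit in the input strings was actually measured (so it would wrongly count the trailing zeros in something like "62000").

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

def add_measured(a, b):
    """Add two measurements given as strings, keeping only the decimal
    places that both inputs actually have."""
    x, y = Decimal(a), Decimal(b)
    # The exponent is -4 for "31.4159" and -6 for "0.000254";
    # the larger (coarser) exponent sets the cutoff.
    cutoff = max(x.as_tuple().exponent, y.as_tuple().exponent)
    step = Decimal(1).scaleb(cutoff)
    return (x + y).quantize(step, rounding=ROUND_HALF_EVEN)

def mul_measured(a, b):
    """Multiply two measurements, keeping as many significant digits
    as the input that has the fewest."""
    x, y = Decimal(a), Decimal(b)
    digits = min(len(x.as_tuple().digits), len(y.as_tuple().digits))
    return Context(prec=digits, rounding=ROUND_HALF_EVEN).multiply(x, y)

print(add_measured("31.4159", "0.000254"))  # 31.4162
print(mul_measured("1.41421", "1.732"))     # 2.449
print(mul_measured("5", "3.14"))            # 2E+1
```

The last line prints in scientific notation because that is the only way to write "20" with a single significant digit.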

Significant figures are a rather crude way of tracking precision. They're largely ad-hoc. There is mathematical reasoning behind these rules - so they do work pretty well most of the time. The "right" way of tracking precision is error bars: every measurement has an error range, and those error ranges propagate through your calculations, so that you have a precise error range for every calculated value. That's a much better way of measuring potential errors than significant digits. But most of the time, unless we're in a very careful, clean, laboratory environment, we don't really know the error bars for our measurements. Significant digits are basically a way of estimating error bars. (And in fact, the mathematical reasoning underlying these rules is based on how you handle error bars.)
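For comparison, here is a small sketch of what carrying an error bar through the circle-area calculation might look like. It uses the usual first-order approximation, where the error in the area is about 2πr times the error in the radius; the 0.2 cm radius error is an assumed value, not something measured.

```python
import math

def area_with_error(radius, radius_err):
    """Return (area, area_err) for a circle, using first-order
    error propagation: area_err = 2 * pi * radius * radius_err."""
    area = math.pi * radius ** 2
    area_err = 2 * math.pi * radius * radius_err
    return area, area_err

area, err = area_with_error(6.2, 0.2)  # the 0.2 cm error is an assumption
print(f"{area:.0f} +/- {err:.0f} cm^2")  # 121 +/- 8 cm^2
```

With an error bar of about 8, the result is trustworthy to roughly two digits, which is the same conclusion the sigfig rules reach with far less bookkeeping.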

The beauty of significant figures is that they're so incredibly easy to understand and to use. Just look at any computation or analysis result described anywhere, and you can easily see if the people describing it are full of shit or not. For example, you can see people claiming to earn 2.034523% on some bond; they're not, unless they've invested a million dollars, and then those last digits are pennies - and it's almost certain that the calculation that produced that figure of 2.034523% was done based on inputs which had a lot less than 7 significant digits.

The way that this affects the discussion of rounding is simple. The standard rules I stated for rounding are for rounding one significant digit. If you're doing a computation with three significant digits, and you get a result of 2.43532123311112, anything after the 5 is noise. It doesn't count. It's not really there. So you don't get to say "But it's more than 2.435, so you should round up to 2.44." It's not more: the stuff that's making you think it's more is just computational noise. In fact, the "true" value is probably somewhere within +/-0.005 of that - so it could be slightly more than 2.435, but it could also be slightly less. The computed digits past the last significant digit are insignificant - they're beyond the point at which you can say anything accurate. So 2.43532123311112 is the same as 2.4350000000000 if you're working with three significant digits - in both cases, you round off to 2.44 (assuming even preference). If you count the trailing digits past the one digit after the last significant one, you're just using noise in a way that's going to create a subtle upward bias in your computations.

On the other hand, if you've got a measured value of 2.42532, with six significant figures, and you need to round it to 3 significant figures, then you can use the trailing digits in your rounding. Those digits are real and significant. They're a meaningful, measured quantity - and so the correct rounding will take them into account. So even if you're working with even preference rounding, that number should be rounded to three sigfigs as 2.43.
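The difference between the two cases can be sketched in Python. The function names are invented for this example, and to keep it short the code is hard-wired to numbers of the form d.dd being rounded to hundredths rather than handling significant digits in general.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN

HUNDREDTHS = Decimal("0.01")

def round_computed(value):
    """Round a computed result that is only good to three significant
    digits: drop the noise after one extra digit, then round half-even."""
    guarded = Decimal(value).quantize(Decimal("0.001"), rounding=ROUND_DOWN)
    return guarded.quantize(HUNDREDTHS, rounding=ROUND_HALF_EVEN)

def round_measured(value):
    """Round a value whose digits were all measured: use every digit."""
    return Decimal(value).quantize(HUNDREDTHS, rounding=ROUND_HALF_EVEN)

print(round_computed("2.43532123311112"))  # 2.44
print(round_measured("2.42532"))           # 2.43
print(round_computed("2.42532"))           # 2.42
```

The last two lines feed the same digits through both functions: treated as measured, 2.42532 is above the halfway point and rounds up; treated as noise after the 5, it sits exactly on the halfway point and even preference sends it down.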

Comments

1

Thanks for this explanation. I guess my objection to your last post was that I was thinking of a situation like you described in your last paragraph here, when you were describing the rounding rules for a situation akin to that described in the penultimate paragraph here.

Posted by: The Science Pundit | March 4, 2009 9:44 PM

2
"If you're doing a computation with three significant digits, and you get a result of 2.43532123311112, anything after the 5 is noise. It doesn't count. It's not really there."

Why? I mean, isn't rounding to 2.44 more likely to yield an answer close to the true value? What makes 2.43 (assuming you favor odds in the last sig fig) a better approximation of what the error bars do than 2.44?

And for that matter, aren't error bars just an approximation for what we should "really" be using, which is probability distributions over all possible values?

Posted by: Ed | March 4, 2009 10:19 PM

3

Have you seen Chris Mulliss's work on this topic? He ran a bunch of Monte Carlo trials with different sig fig methods, and found that for mult/division and exponentials, the standard methods are less accurate than other methods. Specifically, for those operations, he recommends adding one digit onto the least significant argument of the input terms.


Here's his results for the standard method:
http://www.angelfire.com/oh/cmulliss/standard_rounding_rules_summary.htm

And for his "improved" method:
http://www.angelfire.com/oh/cmulliss/recommended_rounding_rules_summary.htm

Posted by: Juneappal | March 4, 2009 10:55 PM

4

Error bars, and by association significant digits, have always made me quite uneasy. Maybe you can shed some light, Mark. My perplexity can be summarized as follows: What well stated and correct math construct are they an approximation of?

It seems to me that the problem is a result of the fact that we are thinking at the interface between theoretical real numbers and real world measurements, along with the fact that most real numbers cannot precisely be represented in a finite amount of time and space. In a theoretical setting, when not using error bars or significant digits, the standard practice with real numbers seems to be to write down a bunch of digits after the decimal and then assume it represents the same thing as if we had followed it by an infinite number of zeros, an assumption which also makes me quite uneasy.

But what happens in the real world where we must manipulate numbers that don't have infinite precision? We can set intervals and follow the rules but what do these intervals represent? Is it a limiting bound that "true" values are assumed never to cross? Is it a measure of variance, an interval that signals that a certain proportion of the samples are known to be within? If so, can we assume a central tendency to the distribution of the samples? And if so, wouldn't it make sense to keep more digits to know where the center of the tendency should be? Otherwise aren't we throwing out information? But then how many digits should we keep?

Furthermore, the numbers used to represent the intervals, should they have a confidence interval too? Recursively?? How many digits should we write on each number in this example: 3.56 +-0.56+- 0.045 +-...? What mathematical principles govern all of this?

My layman's intuition is that there is an information theoretic explanation to it all. That it might have to do with the diminishing returns on information content of extra digits in the face of rough measurements. A kind of criterion meant to save our efforts which justifies not bothering with too many digits. I am not a mathematician but this has been quite the enigma for me and I really wish someone would point me towards some insight. I really feel like I am missing a fundamental part of mathematics that is key to understanding the relation between real numbers and the real reality. Does anyone have a clue?

Posted by: BenE | March 4, 2009 11:18 PM

5

Good Post! I was taught these rules at least 6 years ago, in my basic high school science courses. Now as an engineering student I am constantly amazed at how few of my classmates know them. They'll leave things with 5 or 6 sig figs when they were only given a measurement with 2, and if you ask them why they did they'll give you a blank stare. I don't think this really gets taught to kids in public school, at least in AZ (I was in a good private school in HS), and I know no one ever bothered to explain it to us as freshmen.

Posted by: Uncephalized | March 5, 2009 12:07 AM

6
"The real area of the circle could easily be as high as 128, or as low as 113, assuming typical measurement errors."

Of course these doesn't stop some atheists from criticizing 1KI 7:23, which describes the circumference of a circle measuring 10 cubits as being 30 cubits.

"They're largely ad-hoc. There is mathematical reasoning behind these rules - so they do work pretty well most of the time."

Again, I pretty much agree with you. However, you should take a look at sigma-delta modulators (SDMs). These devices can, for example, use a single bit resolution analog to digital converter to come up with a measurement that has many more bits of resolution, e.g., 12 bits. And the 1-bit A/D converter doesn't even have to be all that balanced to get good results. I have a basic grasp of SDMs, but still have difficulty explaining to others how taking a low-precision measurement repeatedly can generate high resolution estimates. I can convince myself from time to time, when I decide to look at it again, but I soon forget the details.

Posted by: William Wallace | March 5, 2009 12:11 AM

7

replace "these doesn't" with "these observations don't"
replace "measuring 10 cubits" with "having a 10 cubit diameter"

Posted by: William Wallace | March 5, 2009 12:15 AM

8

Typo: "will have on significant digit;" - should probably be "will have one significant digit;".

Posted by: Alex Besogonov | March 5, 2009 2:26 AM

9

I agree with #3 and that's what I was always taught to do. Include the first "insignificant" digits from the input in the calculation and then round the result to the correct number of significant ones.

I also agree with #5. I'm sick of seeing papers in peer reviewed medical journals with ridiculous degrees of spurious precision. It's something I pick up when I referee but clearly others don't.

Posted by: regordane | March 5, 2009 3:35 AM

10

What about the distinction between accuracy and precision? I was taught that these are two different things. The precision is the number of digits used, regardless of whether it's justified. In Mark's example, 2.034523% would have a precision of six decimal places, but likely an accuracy much less than that (and therefore the precision used is not justified).

Posted by: Kristian Z | March 5, 2009 3:39 AM

11

I agree with some of the commenters before me that significant digits (or error bars, of which "significant digits" are just one example) are a very weird approximation of the concept of uncertainty.

I would expect in most cases that the "real" value is well modeled by something like a Gaussian distribution around the measured value (with as much precision as possible, no reason to "drop digits" there). "Significant digits" or error bars suppose a uniform distribution within a given interval.

When you do your calculations using the distributions (assuming they are symmetric) your result is a distribution centered around the value you get when calculating using the centers of your initial distributions (if I'm not mistaken).

Dropping digits at any point of your calculation before obtaining the final result just arbitrarily shifts the centers of your distributions. Do you have any reason to believe that this would improve the model you're using?

If you obtain 2.43532123311112 as the center of your distribution, why would you possibly shift it to 2.435 and then suddenly claim that that's as close to 2.43 as to 2.44? It clearly is not. It may be (depending on the shape of your probability distribution) that 2.43 is almost as probable as 2.44, but arbitrarily shifting your distributions around is certainly not the best way to come to that conclusion.

Posted by: Jens | March 5, 2009 6:09 AM

12

On your point 4 about multiplying 5 by 3.14, I would disagree, as you are assuming that the 5 is a rounded value. It may however be exact.
Imagine for example changing a 5 dollar bill to some currency where you got 3.14 to the dollar. Then an answer of 15.70 would make perfect sense.
I know you've tried to make the post simple, but context is very important here, and there are other times when you've over-simplified.
Writing as a statistician, I'd say that if you found the mean of about 100 integer values, it would be quite reasonable to give the answer correct to 2 decimal places, which could well be a value with 2 or more significant figures beyond those of the original values. Similar results would occur in almost any statistical computation, from Standard Deviation onwards. I would agree that there are many people who give values to a completely spurious level of accuracy; I saw a case recently when something like £20000 in 1870 was said to be equivalent to 1234567.89 present day pounds.

Posted by: misterjohn | March 5, 2009 11:50 AM

13
"Imagine for example changing a 5 dollar bill to some currency where you got 3.14 to the dollar. Then an answer of 15.70 would make perfect sense."
It would also make perfect cents.

Posted by: pjb | March 5, 2009 6:12 PM

14

"On your point 4 about multiplying 5 by 3.14, I would disagree, as you are assuming that the 5 is a rounded value. It may however be exact. Imagine for example changing a 5 dollar bill to some currency where you got 3.14 to the dollar."

If the 5 is exact, then you aren't multiplying 5 by 3.14. You are multiplying 5.0000000...(with infinite significant zeroes) by 3.14.

If the exchange rate is exactly 3.14 (e.g. if the mystery currency divides into even hundredths and this is a real cash transaction, not an electronic exchange where you can have portions of the smallest unit of currency), then you are really multiplying 5.0000000... by 3.1400000000... (again with infinite sig figs). The result in this case has infinite sig figs, but would probably be listed as 15.70 because we don't need to know about fractions of the smallest unit of currency when counting out cash.

But the original statement is correct. When scientists are using sig figs, saying you multiply 5 by 3.14 means the 5 is measured to 1 sig fig.

Posted by: Todd P | March 6, 2009 12:18 PM

15

Those of us who grew up using slide rules to do calculations understood these principles fairly well. Another skill we learned was to estimate the magnitude of our calculation first, so that we didn't put the decimal in the wrong place. It's amazing how first calculators then computer spreadsheets led to such ridiculous claims of "accuracy". I can't remember how many times I had to review this with the talented young engineers from good schools that I managed, but there's obviously something missing in the way we teach simple mathematical concepts these days.

Posted by: MJM | March 7, 2009 2:42 PM

© 2006-2009 Seed Media Group LLC. ScienceBlogs is a registered trademark of Seed Media Group. All rights reserved.