Countdown to Y2K38

The Year 2038 problem could begin today. Similar to the Y2K problem, certain operating systems cannot handle dates after about 3 AM Universal Time on January 19th, 2038. If your bank is handling a 30 year mortgage starting today, funny things could happen starting now.

The Y2K problem occurred because the amount of space allocated in computer hardware and software to store the date was insufficient to handle a year greater than 1999. A huge amount of effort and funds were spent in preparation for Y2K. Arguments have been made that the problem was overblown (including the fact that across the globe, countries that spent less time and money on this problem did not have extra difficulties compared to countries where huge efforts and piles of money were spent). Arguments have also been made that the efforts were not overblown. The latter arguments mostly consist of things like “Well, we don’t know if we spent too much money on Y2K, but as a result, you got a shiny new computer, so that’s good, right?”

The Y2K38 problem is similar but different. This problem is caused by the fact that Unix-like systems (and other software written in the C programming language) tend to use an entity known as time_t to store dates. The dates are counted in seconds since January 1, 1970. Older dates are sometimes represented as negative numbers.

The number that is stored in this place (time_t) is in many cases a 32-bit signed integer. This means no decimal places (integer), but it can be positive or negative (signed). A 32-bit signed integer has a maximum positive value of 2,147,483,647. Counted as seconds since January 1, 1970, that maximum corresponds to 03:14:07 UTC on January 19, 2038.
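To make the arithmetic concrete, here is a small Python sketch (my own illustration, not code from any operating system) that converts the largest 32-bit signed value, counted in seconds from January 1, 1970, into a calendar date. The names EPOCH, INT32_MAX, INT32_MIN and to_utc are made up for the example, and it assumes a counter that wraps around to the most negative value when it overflows.

```python
from datetime import datetime, timedelta, timezone

# Illustrative names; not taken from any particular operating system.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1   # largest value a 32-bit signed integer can hold
INT32_MIN = -2**31      # what a wrapped counter would read one second later


def to_utc(seconds):
    """Convert a count of seconds since the epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_moment = to_utc(INT32_MAX)
after_wrap = to_utc(INT32_MIN)

print(INT32_MAX)      # 2147483647
print(last_moment)    # 2038-01-19 03:14:07+00:00
print(after_wrap)     # 1901-12-13 20:45:52+00:00
```

The last line is why the failure looks so odd in practice: a clock that wraps does not stop, it jumps back to December 1901.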

That is on 32-bit systems. How many bits a system has depends partly on hardware and partly on software. Generally, desktop PCs and certain larger computers use a 32-bit architecture, but increasingly, PCs are manufactured to use 64-bit architecture, and the software to run on them uses this architecture as well. Using time_t on a 64-bit machine results in a maximum date about 290 billion years into the future. So, even if all of the computers and operating systems switch over to 64 bits, this problem will plague us once again. Eventually.
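A rough check of that figure, as a Python sketch. It assumes an average year of 365.25 days, which is my simplification; the variable names are only for illustration.

```python
# Average (Julian) year in seconds; an approximation chosen for this sketch.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1

years_32 = INT32_MAX / SECONDS_PER_YEAR
years_64 = INT64_MAX / SECONDS_PER_YEAR

print(f"32-bit: about {years_32:.0f} years after 1970")               # about 68
print(f"64-bit: about {years_64 / 1e9:.0f} billion years after 1970")  # about 292
```

Sixty-eight years past 1970 lands in 2038, and the 64-bit figure comes out a little over 290 billion years.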

The Y2K38 problem has already surfaced, once, according to a Wikipedia article.

In May 2006, reports surfaced of an early Y2038 problem in the AOLserver software. The software would specify that a database request should “never” time out by specifying a timeout date one billion seconds in the future. One billion seconds (just over 31 years 251 days and 12 hours) after 21:27:28 on 12 May 2006 is beyond the 2038 cutoff date, so after this date, the timeout calculation overflowed and calculated a timeout date that was actually in the past, causing the software to crash.
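The overflow described in that passage can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not AOLserver's actual code: wrap_int32 and never_timeout are hypothetical helpers, and the wraparound behaviour is an assumption about how a 32-bit signed counter typically misbehaves.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1
ONE_BILLION = 1_000_000_000


def wrap_int32(value):
    """Mimic two's-complement wraparound of a 32-bit signed integer."""
    return (value + 2**31) % 2**32 - 2**31


def never_timeout(now):
    """Hypothetical 'never time out' deadline: now plus one billion seconds."""
    return wrap_int32(now + ONE_BILLION)


# The last second at which now + one billion still fits in 32 bits.
last_safe_now = INT32_MAX - ONE_BILLION
print(last_safe_now)                             # 1147483647
print(EPOCH + timedelta(seconds=last_safe_now))  # 2006-05-13 01:27:27+00:00

print(never_timeout(last_safe_now))      # 2147483647
print(never_timeout(last_safe_now + 1))  # -2147483648
```

One second later the deadline turns negative, which is a date in 1901 and therefore already in the past. The computed moment is 01:27:28 UTC on 13 May 2006, which lines up with the quoted 21:27:28 on 12 May if that time is read as four hours behind UTC (my assumption; the quote does not name a time zone).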

In my view, this problem is one of a larger category of problems that relate to the link between hardware, software, and real life for computers. Computers use binary numbers and these binary numbers are usually stored in places that have a fixed storage place linked to the hardware. So, for example, an 8 bit system stores everything in physical places that are either 8 or 16 bits wide (a single or a double space). Within this context, if a decimal place is used or if the number can be signed (positive or negative), some of the storage space for the number is used up.
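As a small illustration of how the sign "uses up" part of the storage, this Python sketch lists the range of values that fit in a fixed number of bits. It assumes two's-complement representation for signed numbers, which is the common case but not the only possible one; the function names are mine.

```python
def unsigned_range(bits):
    """Smallest and largest value of an unsigned integer of the given width."""
    return 0, 2**bits - 1


def signed_range(bits):
    """Smallest and largest value of a two's-complement signed integer."""
    return -(2**(bits - 1)), 2**(bits - 1) - 1


for bits in (8, 16, 32):
    print(bits, unsigned_range(bits), signed_range(bits))
# 8 (0, 255) (-128, 127)
# 16 (0, 65535) (-32768, 32767)
# 32 (0, 4294967295) (-2147483648, 2147483647)
```

Allowing a sign does not change how many distinct values fit; it moves half of them below zero, so the largest positive value is roughly halved.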

Binary is different from decimal when it comes to certain calculations. Division does not produce exactly the same results in binary and decimal systems. So, you are working in decimal with almost everything you do, but the computer you use may be translating back and forth between decimal and binary, doing the calculations in binary and giving you converted results. Since binary and decimal systems are different, you can get strange results. For instance, 9 divided by 100 using software that does “real” decimal division is, not surprisingly, 0.09. But software that does not emulate decimal calculations, but rather simply converts the number to binary, does the calculation, and converts the result back to decimal, may give you: 0.08999996. Ooops.
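Here is a short Python sketch of the difference, using the standard decimal module for the base-10 side and ordinary floats (binary floating point) for the other. The exact digits a binary calculation produces depend on the precision in use, so they need not match the 0.08999996 quoted above; the point is only that the stored value is not exactly 0.09.

```python
from decimal import Decimal

binary_quotient = 9 / 100                     # binary floating point
decimal_quotient = Decimal(9) / Decimal(100)  # base-10 arithmetic

print(decimal_quotient)          # 0.09
print(Decimal(binary_quotient))  # the exact value the binary float stores

# A classic symptom of the same mismatch:
print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```

Passing a float to Decimal shows every digit of the stored binary value, which makes the small discrepancy visible instead of hiding it behind rounding at print time.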

Most computer languages in use today (but not all!) emulate true decimal calculations one way or another. This has a few disadvantages. There is a loss of efficiency in storage space, a loss of speed, and the somewhat more esoteric problem that the solution is a kludge re-implemented by a range of different implementors. Thus, if you write a computer program in one language and “port” it to a different language, or in the same language to a different system, you can’t necessarily be sure that the kludge is working the same way.

It seems to me that the ability to have a number of arbitrary size, sign, and precision, and to be unambiguously correctly manipulated in a commonly used base system (like decimal) should be something that happens very close to the hardware level. That is actually true to some extent now because there are machine components that process the math this way if they are available and if they are used by the software (math processors). But the fact that this hardware may or may not exist and may or may not be used is just more of the same … a kludge.

Rather than fixing Y2K, or Y2K38, or rounding errors in calculations on an ad hoc basis, we need a decimal counting machine.

By the way, I set the scheduled posting time of this blog post at January 19th, 3:14:07 AM to see what would happen. But just to be sure, I used EST, not UTC. I don’t want to take any chances….


Sources

Year 2038 problem. (2008, January 16). In Wikipedia, The Free Encyclopedia. Retrieved 13:18, January 16, 2008, from http://en.wikipedia.org/w/index.php?title=Year_2038_problem&oldid=184617273

Decimal Arithmetic FAQ (IBM)

Comments

  1. #1 R N B
    January 19, 2008

    Very interesting.
    Rather than fixing rounding errors in calculations on an ad hoc basis, we need a decimal counting machine?
    Or we should start using a Hexadecimal system. Or even a base 8 system. At least start getting kids using it in school. Today they only touch upon it in higher maths. But if they started learning it in kindergarten then even basic arithmetic becomes much easier to learn. They’d still learn decimal but like a foreign language, like we learn Roman numerals, “look at that inefficient counting system they used to use”. But they would never want to go back. It makes sense. Seriously.

  2. #2 Greg Laden
    January 19, 2008

    But we have ten fingers and ten toes!?!?!!?

  3. #3 VladimirS
    January 19, 2008

    fingers are good to count numbers in range -512:511 or 0:1023
    just use binary system. :)

  4. #4 HP
    January 19, 2008

    Shouldn’t that be “Y2.038K”?

    For consistency’s sake, you understand.

  5. #5 Theo Bromine
    January 19, 2008

    I think it should be Y 2K038, i.e., consistent with the circuit schematic labeling standard (probably now deprecated) that marks a 4700 ohm resistor “4K7”.

  6. #6 chris y
    January 19, 2008

    Arguments have also been made that the efforts were not overblown.

    Working on legacy corporate systems which in no sense endangered life and limb, but held the potential to cause acute financial embarrassment, I personally identified and corrected about 80 instances of potential Y2K failure. (As far as I know I missed two – that is, there were two failures in early 2000 which were attributable to date miscalculation. Since the rest of the division’s systems were OK by then, I was able to identify and correct them in a few hours.) If you consider how much code there is in the world, I think it’s reasonable to conclude that the issue deserved some attention.

  7. #7 vjb
    January 19, 2008

    theo– the 4K7 type of notation for resistors is used in Europe and Asia, but never has caught on widely here. It doesn’t work so well for 1% resistors, like 2.87K. Should that properly be 2K87? Should a 287K resistor properly be called 287K0? Remember that the color band code is 2 or 3 bands for the value, plus a band for the decimal multiplier (plus a final one for tolerance). Looked at that way, the 47 x 10exp2 = either 4.7K or 4700 could be considered more rational, IMHO.

  8. #8 foo
    January 19, 2008

    The POSIX epoch is January 1, 1970.

    Regardless, it is rare for any system that actually cares about dates to use a 32bit time_t as its range is so limited.

  9. #9 R N B
    January 19, 2008

    I could talk a bit about Unix kernels here (honestly:) but back to the more interesting diversion …

    Yes we have ten toes. But by some measures we have eight fingers, the thumb is sufficiently distinct to warrant a separate moniker. Little children could easily be told to get up to their eight “main” fingers, then call that a group. But I think the many flaws in our biological anatomy do not necessarily justify a particular counting system anyway.

    Preschool kids would still count 1-8 on their fingers, but as with learning written language, a system that has wider applicability needs to be learned in school. And I think it would be easier for them, honestly, it is so much easier to do any arithmetic in your head if we think that way. And the irrational numbers are irrational anyway …

  10. #10 bigTom
    January 19, 2008

    We have the usual scope for confusion about hardware word sizes, HW/SW address space sizes, and data sizes.
    hardware word sizes: most current systems support 32bit and 64bit words. This applies to native load/store instructions on the machine. Software can construct data items that use only a part of a word, or several words.

    HW/SW address space limitations: pointers are things which address the memory in the system. Older OSes use 32bit pointers, 64bit OSes use 64bit pointers (although no memory systems have yet been built to accommodate this much memory). This is what is implied by 32bit/64bit OS.

    HW/SW data sizes: this is the size for individual data items. Usually int/long long choose between 32bit and 64bit integers, and float/double choose between 32/64 bit floating point formats. Data sizes are independent of the particular data size chosen for pointers. Most 32bit OSes will support 64bit integers, and floating point data. The data type for time_t is what matters for Y2K38 problems.

  11. #11 Chris Hanson
    January 19, 2008

    Damn, just when I thought it was safe to climb out of the bomb shelter. Well, see you in 2100 when this millennium is through with its beta testing.

  12. #12 Ktesibios
    January 19, 2008

    @vjb:

    The convention for component values that uses “k”, “M” or “R” as appropriate in place of the decimal point works just fine for 1% components. You simply put the letter where the decimal would be and use the correct number of significant figures.

    For example, 5% and 10% values are specified to 2 significant digits, so 1.2k=1k2, 2.7M=2M7, 0.33 ohms=R33 (or sometimes 0R33,) and so forth.

    1% values are specified to 3 significant figures, so 2.21k=2k21, 33.2k=33k2, 287k=287k (not 287k0 because there are only 3, not 4, significant figures in the value), 41.2 ohms=41R2 and so on.

    No problem!

    This method of notation has the great advantage that when you’re looking at a drawing that’s been through a few generations of photocopying you never have to wonder “is that a decimal point or a speck on the copy machine’s glass”.

    About the old Y2K hysteria:

    Back in ’99 I happened to do a repair on a Crown D150 Amp that belonged to a major local AM radio station. As I was putting it back together I noticed that the back cover bore a sticker certifying that it was Y2K compliant.

    This was a 100% analog piece of gear, designed around the time that the Intel 8080 hit the market. It not only didn’t have anything remotely resembling a CPU, it didn’t even contain a single gate or flip-flop.

    Some “consultant” got paid good money to slap those stickers on equipment for which the Y2K issue was utterly irrelevant. Of course, you couldn’t expect that the suits in the front office would know that. The engineering department would have, but the chances that the PHBs bothered to ask them are the same as the chances that the amp would suddenly stop working on 1/1/2000.

  13. #13 Theo Bromine
    January 19, 2008

    vjb: I did my EE degree in the US, but all my actual electronics work (both paid and hobbyist) has been in Canada, where we tend to be midway between EU and US practices for most things.

    IMO, using 2K87 for a 2.87K, and 287K (trailing “0” not needed) is fine. As I recall, among the justifications for this standard was the intent to align more with SI practices (which would explain why it would not have been popular in the US). But there was also the intent to reduce errors by using the “K” (or M or R, or m, n or p for capacitors) in place of the decimal point, to compensate for poor quality photocopies. If you write 4K7, you will know for sure it is 4700 ohms – 47K might be 47000 ohms, or it could be 4700ohms if it was originally 4.7K but the “.” got lost. (Ah for the lost art of reading circuit diagrams and resistor colour codes…at least I have done my best to pass it on to my sons.)

  15. #15 Anna-Jayne Metcalfe
    January 21, 2008

    FWIW Under Win32 time_t has been typedef’ed to a 64 bit type (__time64_t – good until 23:59:59 UTC on 31st December, 3000) since Visual C++ 7.0 in 2002.

    I’m sure GCC et al. have made similar changes by now, so provided the software is in maintenance, the developers are using a reasonably up to date compiler and aren’t hardcoding casts to 32 bit values (as I found in one client codebase a while ago) the problem should not arise.

    If in any doubt, fairly simple testing can establish whether there is a potential issue with a particular piece of software or not.

    (reposted to correct typos. Please delete the previous post)

  16. #16 Koos Dering
    January 21, 2008

    When I started in 1969 we were quite aware that floating-point units were meant for limited-precision purposes only and as such unsuitable for calculations requiring exact results.
    As our applications without exception were of a financial nature, our computer did not include a floating-point hardware unit (available at extra cost).
    Any arithmetic requiring evaluation of fractions (decimal or otherwise) was done as ordinary fractions, called this way because these (at least in the Netherlands) are taught to
    children before decimal fractions.
    ‘Real’ decimal arithmetic requires a representation like .{3}* for 1/3.
    Apparently the assumption is that numbers like these do not arise in financial applications.
    I remember specifications that amounts were to be rounded to a multiple of 12 cents – to ensure fixed monthly payments.
    There is another problem associated with the seconds-based representation of absolute time: if you need to derive both a relative time and an absolute date, one or the other needs a (yearly updated) table of leap seconds.

  17. #17 Stuart
    January 21, 2008

    I’ve already run into a similar problem with an Apollo system back in the mid 90’s. From memory, they used a 32bit counter counting from 1 jan 1970, but they counted 4 ticks to the second. I went to do some routine maintenance on the system and after found it would no longer reboot. It turned out that I had worked on the machine 4 days after D day. The only fix that we had was to set the clock back to a year where the dates matched the days as the system was obsolete and there was no more operating system support to fix the problem.

  18. #18 David Marjanović
    January 21, 2008

    But software that does not emulate decimal calculations, but rather simply converts the number to binary, does the calculation, and converts the result back to decimal, may give you: 0.08999996. Ooops.

    Oh, so that’s what Mesquite is doing…

  19. #19 OMEGA_RAZER
    January 21, 2008

    Has no one else noticed this…

    The Y2K28 problem

    seconds since January 1, 1970
    seconds since January 1, 1980

  20. #20 Ware
    January 21, 2008

    The other reason we may have an issue soon is due to rollovers that were used as one way to solve the Y2K problem.

    What happens in 2011 (or 2021) when they enter a 30 yr mortgage and the system uses 1941 (or 1951) maturity date when they enter 1/1/41 (or 1/1/51) since the rollover was set at 40 (or 50)?

  21. #21 BJ
    January 21, 2008

    Geeese!… you people make a simple thing…HARD!

    10 is a perfect number. Get over it.

    For those who can’t remember how to read resistor color codes…here’s how a Navy Wave instructor taught us…

    “Bad Boys Rape Our Young Girls Behind Victory Garden Walls…Get Some Now”

    Translation for the slow: Black, Brown, Red, Orange, Yellow, Green, Blue, Violet, Gray, White…. Gold, Silver, None

    Y2K was a be laugh! People chasing spooks for money. I never saw a single failure because of a clock error in the PC world. And, most heard of problems were caused by the programmer himself. He should be fired and replaced by a competent one.

  22. #22 RDW2
    January 21, 2008

    BJ,

    >Y2K was a be laugh! People chasing spooks for money.
    Far from it! I was working the Y2K remediation and there were some scary scenarios that were prevented by the work I did . . . and I wasn’t working on things that were nearly as critical as some.

    >I never saw a single failure because of a clock error in the PC world.
    Golly, I guess that means that the bigger boxes and more critical applications were all safe, too, huh? No . . . what that really means is that you had a newer computer and that you don’t do much of anything that is critical to the world’s economy.

    >And, most heard of problems were caused by the programmer himself. He should be fired and replaced by a competent one.

    As a matter of fact, it often wasn’t the programmer but the manager who made the decisions and, since many of the decisions were made in the 60’s, 70’s, and 80’s, many of the programmers and managers were retired! However, more directly to the point of your comment, it is also obvious that you don’t develop applications for a living or buy hardware for a company.

    Many of us who worked on Y2K remediations are all too aware of the fact that decisions were made that could soon come back to haunt us. There were “sliding windows” used to determine whether something was referencing a 20th century date or a 21st century date. Some of these, as indicated in the article, used cut-off years of 1930 or 1940; however, there were systems that I helped remediate where management made the decision that the cut-off date would be 1920, 1960, 1980, or even 1990. As was shown during Y2K, code lingers as long as it works and I wonder how much of the code with cut-off dates of 1940 will still be around when it starts impacting the businesses . . . 2 more years and mortgages will start crossing that boundary.

    Chris Y.,
    Good on ya’, Mate! If you had as much “fun” as I did, you worked some long hours! ;-)

  23. #23 Jim L
    January 21, 2008

    I was using Visual Studio 2008 last week and noticed that time_t is now defined as an __int64 by using __time64_t. Yes, this is even in 32 bit development. This makes me happy, but I know that the software I am writing today will not be in service 30 years from today. The market for the product that I create plans to move from C++ to C#. It will be good to clean up the sloppy code base when we do the refactor.

  24. #24 Trevortni
    January 21, 2008

    Well, we obviously can’t switch over to any kind of power of 2 counting system. Just think how embarrassing that would be for all those people that have been working so hard for so long to convince us that the metric system is inherently better because the math is easier. If we suddenly switched to octal or hexadecimal, they would have to find new reasons to justify a system that leaps from too small to too big for normal uses, instead of the inherently more intuitive English system that puts all the relevant measurements at exactly the right place to be usable!

  25. #25 Epistaxis
    January 21, 2008

    R N B, #1:

    Or we should start using a Hexadecimal system. Or even a base 8 system. At least start getting kids using it in school. Today they only touch upon it in higher maths. But if they started learning it in kindergarten then even basic arithmetic becomes much easier to learn. They’d still learn decimal but like a foreign language, like we learn Roman numerals, “look at that inefficient counting system they used to use”. But they would never want to go back. It makes sense. Seriously.

    We tried that 40 years ago (50 in octal). It was called “New Math,” it was supposed to defeat communism, and it was a flop. Maybe it didn’t have to be a flop, but my point is that not enough of those people are dead yet to try it again.

  26. #26 Trevortni
    January 21, 2008

    I thought New Math failed because it emphasized “getting the right idea” over “getting the right answer.”

    “Hooray for New Math, New-ew-ew Math.
    It won’t do you a bit of good to re- view math
    It’s so simple,
    So very simple,
    That only a child can do it!”

    – Chorus to “New Math,” Tom Lehrer
    (Contains a subtraction problem which is worked in both base 10 and base 8.
    Also does some subtraction wrong, “but that’s ok – the idea’s the important thing”)

  27. #27 Victor DeCurtis
    January 21, 2008

    If I’m still alive, 30 years from today, I’ll be 93, going on 94. I’ll probably be drooling, wetting myself, soiling myself and generally being a nasty pain in the AZS. Now, I have something else to look forward to. Thanks a lot!

  28. #28 Oban
    January 21, 2008

    Well, well, well. Too much noise around this little issue, while in 2036 some kind of big 25-ton meteor named Apophis will potentially crash on earth and wipe away any risk of computer date miscalculation two years later…
    How do we code April 13th 2036 in a 32 bit pointer ?

  29. #29 Despard
    January 21, 2008

    I don’t think Apophis is that big an issue any more.

    http://en.wikipedia.org/wiki/99942_Apophis

  30. #31 BJ
    January 21, 2008

    Sorry Chris,
    #1 I used the same boxes from 1998 – 2004. They had no problem. So did many dozens and dozens of my clients.

    #2 I happen to be a professional programmer working on systems for a major Wall Street Mortgage Investment Company. Guess that doesn’t fit your important critical economy application?

    #3 As I said I am a professional programmer and have been in the industry since 1975. Even today, I still work full time as a professional programmer using .NET technologies and SQL Server.

    #4 I owned my own computer business for 20 years where I built systems to spec and provided accounting software to some of the largest corporations. I believe dates counted to them.

    #5 Yes there were tricks we used to evade the Y2K fiasco. Smart programmers saw this and used their bag of tricks. Laissez-faire programmers caused a problem for their employer by not acting in the 90’s.

    #6 I’m with Victor.
