Troubled Time

Posted on May 30, 2018 by Michal Necasek

This is not an article about current affairs.

Over the last few weeks, I had several interesting run-ins with time, specifically how time is represented and processed by computers. Deep down it’s really all about a clash of human culture and history with physical reality.

At one extreme, there is local time, with noon exactly when the Sun is highest in the sky. Different depending on where you are, and exactly how humans worked with time throughout most of recorded history. That approach works very well as long as people and information can’t move much faster than about the speed of a horse. The 19th century introduced train travel and telegraph. If one sat on a train and started going eastwards or westwards, it didn’t take long for a pocket watch to get increasingly out of sync with local time. To solve that problem, and make it possible to maintain and publish usable schedules, time zones were introduced.

To solve a different problem, or perhaps cause more problems, the 20th century introduced daylight saving time. To cause maximum pain to computer scientists, daylight saving time is not observed universally and its rules are not constant. A real winner in this category is probably Egypt’s 2016 cancellation of daylight saving time three days before it was due to begin.

To communicate over longer distances, computers are forced to agree on a common definition of time. That is the other extreme: UTC, or Coordinated Universal Time, which conveniently doesn’t know any time zones or daylight saving and is the same everywhere on Earth (modulo relativity effects).

Sadly, computer users only care about local time, which means computers have to convert between local time and UTC all the time. That is merely complicated when that time is “now”, hideously difficult when the time is in the past, and impossible when the time is in the future.

A Bit of History

Computers are old enough that when they started out, only local time was used. The Motorola MC146818 real-time clock (RTC) chip used in the IBM PC/AT (1984) perhaps represents the pinnacle of the old thinking. The chip has a built-in calendar and automatically advances minutes, hours, days, months, and years as it counts seconds. Not only that, it can optionally adjust for daylight saving time on its own, starting on the last Sunday in April and ending on the last Sunday in October.

The Motorola RTC was designed around 1983, when the Uniform Time Act of 1966 had been in effect for more than 15 years. It’s too bad that in 1986, the daylight saving functionality of the chip was already rendered useless when the start of daylight saving time was moved to the first Sunday in April.
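To make the mismatch concrete, here is a minimal sketch in Python of the two start rules. The helper names `last_sunday` and `first_sunday` are made up for illustration, and nothing here models the actual chip registers; it is only the calendar arithmetic.

```python
import calendar
from datetime import date

def last_sunday(year, month):
    """Date of the last Sunday in the given month (the rule built into the RTC)."""
    last_day = calendar.monthrange(year, month)[1]
    d = date(year, month, last_day)
    # weekday(): Monday is 0, Sunday is 6; step back to the nearest Sunday.
    return d.replace(day=last_day - (d.weekday() + 1) % 7)

def first_sunday(year, month):
    """Date of the first Sunday in the given month (the later US start rule)."""
    d = date(year, month, 1)
    return d.replace(day=1 + (6 - d.weekday()) % 7)

chip_start = last_sunday(1987, 4)    # when the chip would switch to DST
real_start = first_sunday(1987, 4)   # when DST started under the new rule
days_wrong = (chip_start - real_start).days

print(chip_start, real_start, days_wrong)
```

With the chip’s rule, a clock left to its own devices in 1987 would have switched on April 26 rather than April 5, that is, three weeks late.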

At any rate, an IBM PC/AT running DOS used strictly local time. All dates and times were recorded in local time; moreover, there was no system-provided way to set the local time zone. As a consequence, the operating system had no chance to determine UTC.

In a completely unrelated development, wide area networking was becoming more and more common. With the arrival of technologies like e-mail and Usenet, local time was no longer adequate. How do you chronologically sort a set of e-mails that were sent around the same time but from different time zones, or with different daylight saving time? You can’t, not without knowing the corresponding time zones (and therefore, UTC offsets).
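A small sketch of the sorting problem, in Python. The senders, times, and offsets below are invented for illustration; the point is only that the order by local wall-clock time and the order by UTC can disagree completely.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical messages: (sender, local wall-clock time, UTC offset in hours).
messages = [
    ("alice", datetime(2018, 5, 30, 9, 15), -7),  # US Pacific, DST in effect
    ("bob",   datetime(2018, 5, 30, 18, 5), +2),  # Central Europe, DST in effect
    ("carol", datetime(2018, 5, 31, 1, 0),  +9),  # Japan, no DST
]

def to_utc(local, offset_hours):
    """Attach a fixed UTC offset to a naive local time and convert to UTC."""
    tz = timezone(timedelta(hours=offset_hours))
    return local.replace(tzinfo=tz).astimezone(timezone.utc)

# Sorting on the local time alone ignores the offsets.
by_local = [m[0] for m in sorted(messages, key=lambda m: m[1])]
# Sorting on the UTC instant gives the real chronological order.
by_utc = [m[0] for m in sorted(messages, key=lambda m: to_utc(m[1], m[2]))]

print(by_local)
print(by_utc)
```

Here the three messages were sent within fifteen minutes of each other, yet the local timestamps put them in exactly the reverse of the true order.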

Microsoft’s Bad Influence

In the world of PCs, the initial PC/AT architecture led to poor design choices which to some extent persist to this day. As mentioned above, the PC/AT RTC was meant to be run in local time. The problem is that PC-compatible firmware provides no way to record or indicate a) the local time zone, and b) daylight saving time.

That leads to minor but completely unsolvable problems related to daylight saving time. As mentioned above, the RTC cannot handle actual daylight saving time (DST) changes according to current rules, therefore the clock must be run with DST off. The unavoidable consequence is that when the system is powered up in the DST transition window (typically one hour in spring and one in the fall), the OS simply cannot determine what time it is because it has no way to tell if DST has already been applied to the RTC or not.
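The ambiguity can be shown with a few lines of Python. This is only a sketch: the date is the 2018 end of DST in Central Europe, the fixed-offset zones `CET` and `CEST` are defined by hand, and the RTC reading is a made-up value.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for Central European standard and summer time.
CET = timezone(timedelta(hours=1), "CET")
CEST = timezone(timedelta(hours=2), "CEST")

# Hypothetical reading from an RTC running in local time, taken at power-up
# on the night DST ended (clocks went from 03:00 CEST back to 02:00 CET).
rtc_reading = datetime(2018, 10, 28, 2, 30)

# Without a stored flag, both interpretations are equally plausible.
candidates = [
    rtc_reading.replace(tzinfo=tz).astimezone(timezone.utc)
    for tz in (CEST, CET)
]

for utc in candidates:
    print(utc.isoformat())
```

The same RTC value maps to two UTC instants one hour apart, and nothing in the reading itself says which one is meant.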

In theory that need not be an unsolvable problem. After all, the RTC’s non-volatile RAM could easily store this information, if only all firmware vendors could agree on how. But they never did, because it wasn’t their problem, it was the OS vendors’ problem.

The issue is exacerbated if multiple operating systems are installed on the machine, each correcting for DST, as is the default with Windows. Each OS keeps its own DST correction flag, and therefore each will want to adjust the clock, making the problem worse rather than better.

The obvious solution is to not run the RTC in local time but use UTC instead. That is what most Unix-style operating systems have been doing for many years. That is also something that Windows NT was designed to do from the beginning, but due to implementation bugs and the lack of a user interface, RTC in UTC has never been widely used with Windows.

The original behavior was justified in the early 1990s when multi-booting various operating systems was common and most of them only supported running the RTC in local time. It was counter-productive in the 2000s when a PC in almost all cases only had some Windows NT derivative on it, or could multi-boot to several NT versions and/or Linux.

There is no technical reason why the PC’s RTC can’t be run in UTC; it can. The only problem is Microsoft’s inertia. This might be one of the few areas where EFI actually helps, because EFI does have a concept of time zones and UTC offsets.

There is also no technical reason why the RTC couldn’t be run in the local time zone but always without DST adjustments, and let software apply DST or not. But because existing OS software expects the RTC to be in local time including DST, that was never done.

It is apparent that the PC/AT and compatibles are to some extent victims of poor timing. The PC/AT was designed with the idea that DST is predictable, and the RTC could take care of it. That was not a crazy thought in 1984 (at least considering the American market). But a few years later, poof, that assumption went out the window, and it was too late to correct the initial design.

What Time is it Again?

The historic computer behavior leads to annoyances that are known to computer archivists. An old FAT-formatted floppy or hard disk, but likely also an old ZIP or similar archive, contains timestamps in local time, with absolutely no information as to which time zone that local time was in.

What happens when such a file is copied onto a modern file system which does keep timestamps in UTC? Why of course, the OS just quietly makes something up. Given the utter lack of information the OS has, the alternative would be to ask the user what the time zone was when copying such files. Except the user almost certainly does not know either, so making something up is, on balance, really the least bad solution.

Which does not make it any less annoying. The conversion happens invisibly and suddenly one ends up with multiple copies of the same file with different timestamps, and then which one is the right one?

This is especially problematic in the fairly common case where the timestamps were synthetic and the hour:minute portion indicated the version number; for example, MS-DOS 6.22 files are timestamped 05-31-94 6:22a. That only makes sense if the time is shown the same everywhere, even if it is technically wrong.
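For illustration, here is a sketch in Python of decoding a FAT directory timestamp and then guessing a time zone for it. The bit layout is the standard FAT one (years counted from 1980, seconds stored in two-second units); the two packed values were computed by hand to encode the MS-DOS 6.22 timestamp, and the two candidate offsets are arbitrary guesses of the kind an OS has to make.

```python
from datetime import datetime, timedelta, timezone

def decode_fat_timestamp(fat_date, fat_time):
    """Unpack the 16-bit FAT date and time words into a naive local datetime."""
    year = 1980 + (fat_date >> 9)         # bits 15-9: years since 1980
    month = (fat_date >> 5) & 0x0F        # bits 8-5
    day = fat_date & 0x1F                 # bits 4-0
    hour = fat_time >> 11                 # bits 15-11
    minute = (fat_time >> 5) & 0x3F       # bits 10-5
    second = (fat_time & 0x1F) * 2        # bits 4-0, in 2-second units
    return datetime(year, month, day, hour, minute, second)

def as_utc(naive_local, offset_hours):
    """Interpret a naive local time under an assumed UTC offset."""
    tz = timezone(timedelta(hours=offset_hours))
    return naive_local.replace(tzinfo=tz).astimezone(timezone.utc)

# 05-31-94 6:22a packed by hand; no time zone is stored anywhere.
stamp = decode_fat_timestamp(0x1CBF, 0x32C0)

guess_pacific = as_utc(stamp, -7)  # if the disk was written on US Pacific time
guess_europe = as_utc(stamp, +2)   # if it was written in Central Europe

print(stamp, guess_pacific, guess_europe)
```

The two guesses differ by nine hours, and once either one is stored as UTC and displayed in another zone, the carefully chosen 6:22 is gone.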

A Rare Bug

And now for something slightly different, a bug that I recently had an opportunity to investigate. Out of the blue, software running on customers’ systems suddenly came up with the wrong date, after showing no issues for at least six months.

After much head scratching, it turned out that a routine determining the local time offset from UTC failed if it was run on the 2nd of a month, and moreover if it was run in a window between midnight and the local offset from UTC (e.g. between midnight and 2am in summer in most of Europe, but between midnight and 9am in Japan). That window does not exist in the time zones “behind” UTC, and in Europe and most of Asia it’s well outside of working hours.

The failure mode was determining the direction of UTC offset incorrectly in the problem window, when the local and UTC dates were different. As a result, the calculated date would end up being off by two days, although the time of the day was correct.
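The routine in question is not reproduced here, so the following Python sketch only illustrates the general class of problem. Both functions are hypothetical: `naive_offset_hours` shows one way an offset calculation can go wrong when the local and UTC dates differ, and `utc_offset` shows a calculation that takes the full date into account.

```python
from datetime import datetime, timedelta

def naive_offset_hours(local, utc):
    """Hypothetical flawed approach: compares only the hour of day."""
    return local.hour - utc.hour

def utc_offset(local, utc):
    """Subtract complete date-and-time values, so a date change is harmless."""
    return local - utc

# The same instant seen in Central European summer time and in UTC,
# shortly after local midnight on the 2nd of the month.
local = datetime(2018, 6, 2, 1, 30)
utc = datetime(2018, 6, 1, 23, 30)

print(naive_offset_hours(local, utc))  # -22, wrong sign and magnitude
print(utc_offset(local, utc))          # 2:00:00
```

Any approach that looks at the time-of-day fields in isolation has to handle the date rollover explicitly, and that special case only runs in the small window after local midnight, which is why such bugs can hide for years.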

The bug had been in place for slightly more than ten years before it was discovered.

A Rarer Bug

While working on the above problem, I tested the behavior of various C run-time library routines. Among others I used the Open Watcom compiler. I noticed that the localtime() results did not match the data produced by the OS X system compiler when the local time was just after the switch from DST to standard time (that is, within a one-hour window once a year).

After a bit of debugging, it turned out that the problem was surprisingly trivial. The Watcom C run-time works with DST start and end times specified in terms of standard time. When parsing the TZ variable, the run-time took that into account and adjusted the DST end time (which the TZ variable specifies in terms of local time, i.e. with DST in effect).

When there was no TZ variable, the Win32 flavor of the C library run-time took the information from the GetTimeZoneInformation API. Which also specifies the DST end in terms of local time. And in that case, the time wasn’t adjusted, which caused the run-time to get it wrong and produce incorrect results in the narrow window around the end of DST.

The bug has been in the Watcom C run-time library for quite some time, unnoticed or at least unreported. It has now been fixed by applying the DST-to-standard time adjustment. It was likely to fly under the radar because the results were not wildly off and only affected (typically) one very early morning hour within a year.
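The effect of the missing adjustment can be sketched in a few lines of Python. This is not the Watcom code; the function names and the one-hour DST bias are assumptions, and the dates are the 2018 Central European transitions.

```python
from datetime import datetime, timedelta

def dst_end_in_standard_time(end_local_dst, dst_bias=timedelta(hours=1)):
    """Convert a DST end given on the DST wall clock to standard time."""
    return end_local_dst - dst_bias

def is_dst(t_std, start_std, end_std):
    """True if a moment expressed in standard time lies inside the DST period."""
    return start_std <= t_std < end_std

start_std = datetime(2018, 3, 25, 2, 0)    # DST begins at 02:00 standard time
end_local = datetime(2018, 10, 28, 3, 0)   # DST ends at 03:00 on the DST wall clock
end_std = dst_end_in_standard_time(end_local)  # the same moment, 02:00 standard time

# 02:30 standard time: DST ended half an hour ago.
t = datetime(2018, 10, 28, 2, 30)

unadjusted = is_dst(t, start_std, end_local)  # compares against the raw local value
adjusted = is_dst(t, start_std, end_std)      # compares against the converted value

print(unadjusted, adjusted)
```

Without the adjustment, the hour right after the switch is still treated as DST, which matches the one-hour window in which the results were off.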

I am almost certain that the MSVC 7.1 run-time has a similar but different bug related to DST changes, but I have not investigated it in detail.

Conversion Trouble

Even with no bugs, DST causes undue stress to computer scientists. The problem is that it’s not constant. Just because year X has given DST start and end times doesn’t mean year Y has the same start or end, or that DST is applied at all.

To handle accurate worldwide local time to UTC conversion of past dates, it is possible to keep a database of all historical DST data for all locales in the world. That can get out of hand quickly. Fortunately this doesn’t matter too much, especially when going further than a few years into the past.
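As a rough sketch of what such a database amounts to, here is a tiny hand-entered table in Python covering two years of US transition dates and a lookup that uses it. The table name, the function, and the fixed Pacific offset are illustrative assumptions; a real database also has to record transition hours, offset changes, and every other region.

```python
from datetime import date, datetime, timedelta

# Hand-entered (start, end) dates of US daylight saving time, by year.
US_DST = {
    1986: (date(1986, 4, 27), date(1986, 10, 26)),  # last Sunday in April
    1987: (date(1987, 4, 5), date(1987, 10, 25)),   # first Sunday in April
}
PACIFIC_STD = timedelta(hours=-8)

def pacific_to_utc(local):
    """Convert a naive US Pacific local time to UTC using the table above."""
    start, end = US_DST[local.year]  # raises KeyError for years not in the table
    # Simplification: whole days only; the transition days themselves are
    # treated as standard time instead of switching at the exact hour.
    in_dst = start < local.date() < end
    offset = PACIFIC_STD + (timedelta(hours=1) if in_dst else timedelta(0))
    return local - offset

noon_1986 = pacific_to_utc(datetime(1986, 4, 10, 12, 0))
noon_1987 = pacific_to_utc(datetime(1987, 4, 10, 12, 0))

print(noon_1986, noon_1987)
```

Noon on April 10 maps to 20:00 UTC in 1986 but 19:00 UTC in 1987, so the same calendar date needs a different conversion depending on the year.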

Handling accurate worldwide local time to UTC conversion for future dates is, on the other hand, simply impossible. There is no way to predict future DST and time zone changes. Again, fortunately 100% precise conversion usually does not matter, especially going further into the future.

Moral of the Story

Time is hard. Because humans make it hard. Getting it 100% correct in software is quite difficult, if not impossible.

This entry was posted in Bugs, PC history, Random Thoughts.

18 Responses to Troubled Time

dosfan says:

May 30, 2018 at 7:25 pm

A good solution would be for everyone to get rid of stupid daylight saving time altogether.
Kendall Bennett says:

May 30, 2018 at 8:45 pm

Hey Michal! Steve sent me this article to read and I took the time to do so. Haha.

https://zachholman.com/talk/utc-is-enough-for-everyone-right

Time zone stuff sucks. We have had issues with our servers in that for some reason time.windows.com is not reliable, and the default is of course to use that for Windows. Now we (stupidly now it seems) run our web servers in local time, primarily because our web site stack has always operated in local time. At some point we will fix that, but for now we live with it. Alas the physical computers our VM’s run on are of course all in UTC for the RTC clocks in them, and if time.windows.com fucks up (or whatever the primary time server is) during the boot process for a VM, Windows will just use whatever came from the RTC, which of course is UTC, not local time! So if you happen to reboot your VM when the NTP server is dead, and you are running your VM in local time, poof, your times are all fucked! So we have to pay careful attention to the VM’s when we reboot them to make sure the time does not get messed up, but we changed to using time.nist.gov instead of time.windows.com and it generally does not screw up anymore.

We originally discovered that problem because by default our provider normally syncs to *their* servers for NTP, however their servers happened to be all jacked up one day when we rebooted our VM’s and shit went completely sideways until we figured out what happened.
Michal Necasek says:

May 30, 2018 at 9:00 pm

Hi Kendall! Thanks for reading 🙂

We have different but similar problems. Company firewalls block NTP, so basically any machine in default configuration can’t sync time (they have to be reconfigured to go to the local NTP server). With Windows, it’s also very easy to get the time zone wrong by not configuring it–not a problem for you because you’re on Pacific Time, but we’re 9 hours off from that.

The worst part is that if time is set wrong, things fail and fail in really non-obvious ways. For example, from experience I know that if the time zone is set wrong, Windows KMS activation won’t work. But it fails with completely impenetrable errors, such that it’s impossible to figure out from the errors that misconfigured time is the cause.

Maybe in another 20 years this will get sorted out, but probably not.
Kendall Bennett says:

May 30, 2018 at 9:32 pm

Yeah I have been having time sync problems with my new Windows VM installs under Parallels. It seems the time sync fails, and Windows does really stupid shit when the time sync fails. Just now I was wondering why my build was not working properly when I looked at the clock and it literally changed from 11:06am to 4:06am while I was looking at it! I checked my NTP settings and the sync had failed.

I think this problem is related to networking on my Mac, because for some reason when I come back from sleep at the moment the networking is dead for about 10 seconds or so, even with Wifi as my secondary network. Maybe I will turn off Wifi and see if that changes anything.
Richard Wells says:

May 30, 2018 at 11:22 pm

Local time was not the old computer default. No time keeping at all was standard. The DEC PDP-11 (early versions) would not keep track of the time unless the $250 KW-11L or the even more expensive programmable KW-11P was purchased. Dates, unless entered manually, would start with the default date for the OS version. The first PDP-11 with battery backed time and date was the PDP-11/93 introduced in 1990. Unix, for all its UTC pretenses, started on machines that could not tell time.
Michal Necasek says:

May 31, 2018 at 12:58 am

Yeah, we all know those DOS files dated January 1980… though I wasn’t aware the PDP-11 was so far behind the times. As far as I know, battery-backed clocks were one of the first popular PC add-ons, even before the PC/AT came out.
Michal Necasek says:

May 31, 2018 at 1:01 am

Yeah, the clock going back really screws up building. And you can either start from scratch or wait until enough time passes. When it goes back by six hours it’s probably faster to start from scratch…

On my home system I had a recurring problem with the Linux partition I occasionally boot to. It likes to set the RTC to UTC, and that then confuses Windows. For whatever reason, Windows does not always do time sync on boot, so it can take a while for the clock to fix itself. It’s just so much trouble for no good reason at all.
random lurker says:

May 31, 2018 at 8:24 am

@Kendall
You may have experienced the greatness that is Secure Time Seeding. Windows 10 takes the timestamps from TLS connections and tries to deduce whether the system time is valid or not, and reset it accordingly. Obviously this can result in the wrong time for a number of reasons. Fortunately it can be disabled via the registry. For more, see here: https://www.reddit.com/r/sysadmin/comments/61o8p0/system_time_jumping_back_on_windows_10_caused_by/

In addition, the last time Secure Time “worked” is recorded in the registry and at least with build 1511 Windows tended to fall back to that value as soon as NTP failed even once. This is supposedly fixed now though. For more, see: http://byronwright.blogspot.com/2016/03/windows-10-time-synchronization-and.html
techfury90 says:

May 31, 2018 at 8:50 am

The VAX had an RTC as standard, but the range was only 470 days or so, IIRC. VMS stored the year in the kernel image when you set the clock to make up for that. BSD’s warning about checking and resetting the date is a way of saying it’s not certain if the year is correct, in a similar vein. Holdover from the VAX “time of year” (TOY) clock as DEC called it.
Ben says:

May 31, 2018 at 10:54 am

I am sure you all know this, but you can set Windows to use the RTC set to UTC. I have been using this since Windows 7 and it works great. See the following for instructions:

https://wiki.archlinux.org/index.php/Time#UTC_in_Windows
Michal Necasek says:

May 31, 2018 at 11:44 am

Yes. As I wrote, the option has always been there, never exposed in the UI, and subject to bugs. I believe it works just fine in Windows 7. Do I trust Microsoft to get it 100% right in every Windows 10 update? Hmm, not really.
random lurker says:

May 31, 2018 at 12:29 pm

Actually, to expound a little more on the silliness that is Secure Time Seeding, it is certainly a little strange that Microsoft didn’t just create a web service (over https, but disregarding certificate validity period errors) that provides the OS the current time at something like a second or two’s accuracy and then use NTP from thereon (with its existing “don’t change the time too much” policy). There wouldn’t be any need to perform black magic with TLS timestamps and trying to assess which of them are trustworthy enough.

Granted, Secure Time Seeding does help to provide an initial starting point for fixing the system time in closed networks (as long as the client accesses some resource over TLS) where obviously you wouldn’t be able to access any Microsoft web service, but even in this scheme such devices could be provisioned with a certificate and server address that is trusted within that network to provide that initial fix.
Richard Wells says:

June 1, 2018 at 1:46 am

There is always the fun of SQL with something like a dozen different datetime formats between vendors, many specifically geared to work around challenges of implementations of the clocks on certain systems. Advice was to disable UTC if database would be used with or might be switched to an IBM z/Series (OS/390). Switching to UTC involves rediscovering all the special cases that 50 years of developing software around local time had ironed out.

Life was easier when the only time that mattered was set on the mainframe and hard wired connections kept the local terminals in sync.
Jason Stevens says:

June 1, 2018 at 9:15 am

Time & Date has always been a PITA. I don’t know if many people keep track of all the Java/Windows/UNIX updates that are simply timezone files being updated because of yet another shift somewhere, or when countries either expand or collapse timezones.

On the one hand it’s super great in China that they collapsed it all into one timezone, and I live in the zone where they selected the default from. Although I’d imagine it kind of sucks for people living in the far west. Russia went through collapsing zones as well a few years back. On the one hand I can’t see it being that big of a deal for most people that work in tireless indoor office jobs, but for people outdoors, yeah it’d be weird.
Michal Necasek says:

June 1, 2018 at 3:01 pm

I suppose there’s a moral for future emperors: Expand your empire North/South, not East/West!
ender says:

June 5, 2018 at 10:35 pm

Regarding UTC on Windows, wasn’t there a bug last year where the OS would freeze for an hour during the DST transition if the system was set to use UTC?
dosfan says:

June 6, 2018 at 4:43 am

One annoying aspect of this on Windows is the fact that file timestamps change with the DST shift. Why Microsoft thought this was a good idea is beyond me. Doubly annoying is if you use xcopy /d to backup files from NTFS to FAT (which only uses local time) like from a hard drive to a USB flash drive – when DST goes into effect all of the NTFS timestamps will be one hour later than the FAT drive so xcopy /d will copy everything instead of doing an incremental backup. DST sucks.
Richard Wells says:

June 7, 2018 at 6:14 am

One could lock up CICS on an IBM mainframe using z/OS and missing one part of the setup for Daylight Savings Time. http://www-01.ibm.com/support/docview.wss?uid=swg21109779 Potentially very expensive mistake.