Vnaik.com

Handling time when programming


Handling time when developing can be tricky. Time calculations depend on imprecise and variable physical features (eg. leap seconds to account for the rotation of the earth) and socio-political features (eg. time zones) and this means there could be several pitfalls between expectation and reality when working with time.

First, an introduction to certain time-related concepts.

UTC

UTC provides a uniform framework with which to look at time, abstracted away from the vagaries of local times. Interpreting UTC dates is mostly straightforward; issues only arise when converting UTC to local times. This is less of a problem for dates in the past (unless you look at historic dates, maybe older than a few decades or centuries i.e., before the epoch) and more of a problem for dates in the future.

Flavours of time: UT is a time standard based on the Earth's rotation; and UT1 is conceptually the solar time at 0° longitude but is calculated based on distant quasars because precise measurements of the sun are difficult. It is an accurate reflection (or as close as our technology allows) to the actual rotation of the earth. TAI is a time standard where every day is exactly 86000 seconds long, and is measured as the weighted average of about 400 atomic clocks around the world. GMT is mean solar time and is correlated to UT1, and may differ from UTC by up to 0.9 s. GPS time is TAI offset by 19 seconds and no leap second correction is applied. TAI is a reference timekeeping standard maintained by atomic clocks and is an abstract scale disconnected from physical reality. UT1 is solar time and is correlated to actual physical reality, to the rotation of the Earth which is variable from day to day. UT1 is correlated to reality but is inconvenient and irregular, while TAI is very convenient and regular - and thus UTC is based on TAI, but because TAI can go out of sync with UT1 from time to time, and because we cannot speed up the earth to make UT1 and UTC agree with each other, we instead use leap seconds to ensure UTC and UT1 are always within 0.9 seconds of each other. This means UTC is a discontinuous (because of leap seconds), abstract (because it is not perfectly correlated to the Earth's rotation), and reliable time-scale for almost all programming needs. Leap seconds are added about every year and a half on average.

Unix time

However Unix time is affected by leap seconds and it handles these by 'replaying' the previous second - thus, each day has 86400 seconds and time goes backwards for a second. UTC does not replay seconds and instead has an extra second on days with a leap second for 86401 seconds. Unix time is measured as the number of seconds since the epoch (January 1 1970 in UTC) and is not affected by time zones or daylight savings. The start of the epoch is very close to the start of modern UTC timekeeping and representing and using precise dates from before this time can be challenging. Luckily that is rarely a problem for most projects.

32 bit Unix time will overflow in 2038 so a 64 bit integer will need to be used to represent times beyond this.

Time zones and DST

Given the reliability of UTC, the most natural way of handling time when programming is to store times in UTC (either as a time stamp or an ISO-8601 string) and convert to local times applying current time zone rules as needed.

Just about the only thing that can go wrong is the time zone rules changing between when the time was recorded and the time the date is interpreted. In this context, time zone rule changing means that the government in question has either removed or introduced daylight savings time, or moved the country to another time zone altogether. Since time zones are a political definition and not a technological one, such changes can and will happen arbitrarily for reasons that have nothing to do with technology.[1][2][3]

This also applies to dates in the past, because interpreting old dates precisely may require the use of historic time zone rules instead of the current ones. However, this is a very niche problem and by and large current time zone rules will suffice. Thus, our time zone rules (as long as they are up-to-date) tell us how UTC maps to local time right now, but we don't know for sure today how UTC time will map to local time in the future because the time zone rules could change between now and then.

Fortunately, the most common use-case for time is also the easiest - recording times of events, usually as they happen (e.g., when an email is received, when a user logs in, etc.) - these dates are always in the past and are best stored as UTC. They can be converted into local time when read using the latest time zone rules.

Things are a bit more complicated when dealing with dates in the future, for example, to implement a scheduling or reminder system.

Handling Future Times

On the other hand, if you are interested in a specific universal instant in time in the future then UTC is a good fit. For example, if a task needs to run at 24 hour intervals irrespective of calendar time then use UTC. Future times are usually about human expectations. For example, "10:00 AM next Tuesday" is a human time, not UTC or a Unix time stamp. This future time can be converted to UTC right now for storage, but the converted value could be affected by:

Local time is about human convention. For example, "the same time next week" does not translate to a precise physical duration. It is usually 604,800 seconds, but sometimes it's 604,801 seconds (if there is a leap second), it could theoretically be 604,799 seconds (though this has never happened), sometimes it's 601,200 seconds or 608,400 seconds (due to daylight savings time changes), and if a locality decides to change which time zone it follows, it could be a completely different duration entirely. Computing the seconds between now and a future date is fraught with peril. In your time zone, 10:00AM may be 15:00 UTC or 14:00 UTC depending on whether Daylight Savings is in effect - but a meeting scheduled for 10:00 AM happens at that time irrespective of daylight savings. Since UTC does not account for this or for the other problems mentioned above, storing times in a 'human' format and converting to UTC when needed is the way to go. The more in the future the date is the less certain we can be ahead of time what the precise mapping of a human time (such as 10:00 AM next Tuesday) to UTC is because the underlying rules themselves can change in the meantime. If computed ahead of time, the UTC date could be incorrect by the time the actual time rolls around.

This might seem like nit-picking, and it might indeed be unlikely to cause problems during development, but over the course of a couple of years these errors are just about guaranteed occur, especially if you look at the number of countries and time zones in the world - and not all countries have only one time zone and sometimes even regions that share the same time zone implement daylight savings differently (for example, New South Wales and Queensland in Australia). And daylight savings has the potential to bite you right from the get go.

Therefore the best way to handle this is to store the time as a string, eg. "10:00" along with a date and the IANA time zone identifier (eg. 'Australia/Sydney) and let the system make the decision about the precise UTC time only when this value is read.

If future dates are stored in UTC then every time the time zone rules are updated all the dates will have to be recalculated, so the original 'human' time will need to be stored anyway. This also applies to leap seconds but can be ignored if your application can be off by a second.

Handling events scheduled to fall in the window of a daylight savings time switch over is trickier. For example, if Daylight Savings Time change happens at 1AM, then an event scheduled to run at 1:30AM every day will fail twice every year - not executing at all on the day of the first DST transition and executing twice on the day of the second DST transition. The best way to avoid this problem is to specifically check for this condition if necessary. Also different countries change to daylight savings time at different times of the day - some do it at midnight - 1AM others at 1AM - 2AM, and still others do it 2AM - 3AM.

Developing with these considerations can be tricky because things may seem to work fine during development and testing, and only fail months or years later when some unexpected time zone change occurs. Automated tests will potentially miss this kind of error unless tests were written with shifting time zones in mind. At the very least automatic tests should exercise the project through some daylight savings spring forwards and fall back.


Conclusions

If the application needs universal time and is not intended to be mapped back to local time then UTC is a very good fit. An example will be a task scheduler that schedules tasks every 24 hours irrespective of conventions such as holidays, time of day, etc. On the other hand, if the application needs to use local times (eg. for a calendar application, or to schedule future work) then it is best to store time strings and IANA time zone names and not UTC.

It is also worth remembering the following points, this is mostly a concern for applications where sub-second accuracy is important.

References