On solving the tzdata changes problem
Hello,

I think we can semi-automate the handling of a tzdata change in software, given the following:

1. Your software stores all dates as unix timestamps (seconds since the epoch, “ignoring” leap seconds).
2. Your system (or its users) chooses a local representation using an official timezone, like ‘America/Santiago’.
3. Your system stores somewhere which version of tzdata is currently in use, like ‘2018e’.

So, when a new version of tzdata is released (say ‘2018f’), the sysadmin updates their tzdata packages and then runs a tool that checks, for every timezone in use (‘America/Santiago’, for example), whether there are changes between 2018e and 2018f that could affect users. The typical example: my dentist appointment is not correct anymore. So your system may ask the user: “There have been changes in the timezone where you live. Would you like to review these 3 appointments and check if they are still correct?”

The problem is that there is no historical data of tzdata: when a sysadmin updates the system, it gets the new version 2018f, but there is no automatic way of knowing the differences. I didn’t find one, so I wrote it.

A web app that demos this concept is here: https://a0.github.io/a0-tzmigration/demo/

You can even share the URL. For example, here is the diff between America/Santiago 2016j and America/Punta_Arenas 2017a: https://a0.github.io/a0-tzmigration/demo/?ta=America%2FSantiago&va=2016j&tb=...

This can automate many of the issues we all have when updating tzdata. For the link above, a whole region of my country switched, so the new America/Punta_Arenas timezone was created. You can easily spot errors, too: America/Punta_Arenas should be equal to America/Santiago up to 2017, but there are differences in the years 1890 and 1946.

And you can do this in your own systems, using the same data, which is here: https://a0.github.io/a0-tzmigration-ruby/data/

It’s not perfect. It probably has errors. I generated it from the Ruby tzinfo-data gem, which has info only from 2013c onwards, and transitions up to ≈ 2068 (50 years from now). Use with caution; I’m happy to accept corrections.

I’m writing a Ruby gem and an npm package, and would love to see other language implementations:

https://github.com/a0/a0-tzmigration-ruby
https://github.com/a0/a0-tzmigration-js

The API is not really stable; I will check and release documentation next week. But the core idea is there: grab the JSON files from our GitHub page and get the changes that MAY have to be applied.

I’m writing a Medium blog post to cover all of this. I hope this helps all of you, and I’m happy to hear any suggestions or corrections.

Cheers,
-- Aldrin.
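The check described in this message can be sketched roughly as follows. This is a minimal Python illustration, not the a0-tzmigration API: it assumes each tzdata version of a zone has already been reduced to a sorted list of (utc_timestamp, utc_offset_seconds) pairs, and the function names and the sample numbers are invented for the example rather than taken from real tzdata.

```python
from bisect import bisect_right

# One zone in one tzdata version, reduced to sorted
# (utc_timestamp, utc_offset_seconds) pairs meaning
# "from this instant on, this offset applies". Hypothetical layout.
Transitions = list[tuple[int, int]]


def offset_at(transitions: Transitions, ts: int) -> int:
    """Return the UTC offset in effect at unix timestamp ts."""
    starts = [start for start, _ in transitions]
    idx = bisect_right(starts, ts) - 1
    # Before the first known transition, fall back to the first offset.
    return transitions[max(idx, 0)][1]


def changed_ranges(old: Transitions, new: Transitions, start: int, horizon: int):
    """List (start, end, delta) ranges in [start, horizon) where the offsets differ."""
    inner = {s for s, _ in old + new if start < s < horizon}
    cuts = sorted(inner | {start, horizon})
    ranges = []
    for lo, hi in zip(cuts, cuts[1:]):
        delta = offset_at(new, lo) - offset_at(old, lo)
        if delta == 0:
            continue
        if ranges and ranges[-1][1] == lo and ranges[-1][2] == delta:
            ranges[-1] = (ranges[-1][0], hi, delta)  # merge adjacent ranges
        else:
            ranges.append((lo, hi, delta))
    return ranges


def affected(timestamps, ranges):
    """Return the stored timestamps that fall inside any changed range."""
    return [t for t in timestamps if any(s <= t < e for s, e, _ in ranges)]


# Invented sample data: the new version moves one transition from 2000 to 3000.
old_version = [(0, -14400), (1000, -10800), (2000, -14400)]
new_version = [(0, -14400), (1000, -10800), (3000, -14400)]

ranges = changed_ranges(old_version, new_version, 0, 5000)
to_review = affected([500, 2500, 4000], ranges)
```

With this sample data only the timestamp 2500 falls in a changed range, so it is the only one a system would need to put in front of the user for review.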
    Date: Wed, 30 May 2018 23:29:15 -0400
    From: Aldrin Martoq Ahumada <aldrin.martoq@gmail.com>
    Message-ID: <D8D64C7C-CBF2-4F58-8146-1D4D3E3A322A@gmail.com>

| The problem is that there is no historical data of tzdata:

No, the problem is that the application model is broken. If the dentist appointment is at some local time in America/Santiago then it should be stored that way, not in UTC ("that way" can be either to simply represent the local time, or to store UTC along with the offset that applied when it was converted to UTC) - and in either case an indication that it is local time in America/Santiago which is being represented.

Handling time is much more complex than most people think, as once one has learned to read a clock, it is generally simply assumed that there's no more to it, and time is now "understood". That's just wrong.

Anything that deals with times needs an accompanying (explicit or implicit) timezone associated with it - something that defines in what specific zone the time is fixed. Sometimes that might be UTC, though it usually isn't. The dentist appointment time would be local time at the office (that's the way things are scheduled) and changing the offset between that local time and UTC would make no difference at all to the local time set for the appointment.

Any application to deal with this needs to be able to handle that kind of issue, and do it properly.

This doesn't mean that your tool for examining differences between one tzdata version and another isn't useful - it just isn't (or shouldn't be) useful for that kind of application.

kre
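The storage model argued for in this message (keep the local time plus the zone it is fixed in, and derive UTC only when needed) might look like the sketch below. It is an illustration under simplifying assumptions: the `Appointment` class, the `to_utc` helper and the two rule tables are hypothetical, and each zone is given a single fixed offset, whereas real code would consult the tz database (for example through Python's `zoneinfo` module).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Appointment:
    local_time: datetime  # naive wall-clock time at the office
    zone: str             # zone identifier, e.g. "America/Santiago"


def to_utc(appt: Appointment, rules: dict[str, timedelta]) -> datetime:
    """Resolve the appointment to a UTC instant using the given rule set."""
    offset = rules[appt.zone]  # simplification: one fixed offset per zone
    return appt.local_time.replace(tzinfo=timezone(offset)).astimezone(timezone.utc)


appt = Appointment(datetime(2018, 6, 15, 10, 0), "America/Santiago")

# Two made-up rule sets standing in for two tzdata releases.
rules_before = {"America/Santiago": timedelta(hours=-4)}
rules_after = {"America/Santiago": timedelta(hours=-3)}

utc_before = to_utc(appt, rules_before)  # 14:00 UTC under the old rules
utc_after = to_utc(appt, rules_after)    # 13:00 UTC under the new rules
```

The stored record never changes: the appointment stays at 10:00 local time, and only the derived UTC instant moves when the rules do.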
On 31.05.2018 at 07:55, Robert Elz <kre@munnari.OZ.AU> wrote:
Date: Wed, 30 May 2018 23:29:15 -0400 From: Aldrin Martoq Ahumada <aldrin.martoq@gmail.com> Message-ID: <D8D64C7C-CBF2-4F58-8146-1D4D3E3A322A@gmail.com>
| The problem is that there is no historical data of tzdata:
No, the problem is that the application model is broken. If the dentist appointment is at some local time in America/Santiago then it should be stored that way, not in UTC ("that way" can be either to simply represent the local time, or to store UTC along with the offset that applied when it was converted to UTC)
I'd suggest *not* converting to UTC but storing local time: https://andreas.heigl.org/2016/12/22/why-not-to-convert-a-datetime-to-timest...

Cheers
Andreas
On May 31, 2018, at 3:46 AM, Andreas Heigl <andreas@heigl.org> wrote:
On 31.05.2018 at 07:55, Robert Elz <kre@munnari.OZ.AU> wrote:
Date: Wed, 30 May 2018 23:29:15 -0400 From: Aldrin Martoq Ahumada <aldrin.martoq@gmail.com> Message-ID: <D8D64C7C-CBF2-4F58-8146-1D4D3E3A322A@gmail.com>
| The problem is that there is no historical data of tzdata:
No, the problem is that the application model is broken. If the dentist appointment is at some local time in America/Santiago then it should be stored that way, not in UTC ("that way" can be either to simply represent the local time, or to store UTC along with the offset that applied when it was converted to UTC)
I'd suggest *not* converting to UTC but storing local time: https://andreas.heigl.org/2016/12/22/why-not-to-convert-a-datetime-to-timest...

Hi Robert and Andreas,

Thank you both for your comments. I think I failed the humility test here: I assumed that everyone on this list knows time is really complicated, and I’m sorry, because I didn’t mean to suggest that my tool is bulletproof.

There are many ways to keep dates in systems: always using local time, storing dates with a UTC offset, storing them with a timezone like America/Santiago, or storing them with some form of location where you are working, like a city. None of them is right or wrong in itself; whether one is sufficient depends on the context. And if it works for you, that’s great!

My proposal is not perfect, and no single solution solves all the problems. Even if you implement what I’m proposing in your systems, there will be dates that can easily be migrated, some that must not be migrated, and some for which you may end up asking the user for confirmation. It’s really complicated.

In the case of the new timezone created for my country, most of the systems I know didn’t do anything to solve the issue and just let the users suffer the consequences. I think that we could and should do better about this.

First, we need to increase awareness of how complicated this can be. Just as you have done, I updated my old 2011 post here: https://medium.com/servicios-a0/about-time-and-computers-530dd3937582. And we have to do it constantly.

Second, we have to make clear that no single approach fits all systems. I, personally, think that we live in a global world, so we should try to make our systems somewhat resilient to these inevitable changes. I know this because my country has randomly changed its DST rules over the last 10 years. Even if your country is a decent one, you may have to deal with other countries that are not. I also think that ordinary people should be made aware of the issues; we should not try to hide them. Make them choose their timezone explicitly if their app is global. But again, it depends.

And third, we must improve our software to handle this. Having a tool that can help you update a timezone could be one of these things.

I think the upgrade process is actually too complicated for most software: many programs need a restart to pick up the changes, and it would be awesome if systems could take new definitions without one. Maybe we could create an authoritative server, like NTP, but for timezones.
— Aldrin.
On 31/05/18 06:55, Robert Elz wrote:
No, the problem is that the application model is broken. If the dentist appointment is at some local time in America/Santiago then it should be stored that way, not in UTC ("that way" can be either to simply represent the local time, or to store UTC along with the offset that applied when it was converted to UTC) - and in either case an indication that it is local time in America/Santiago which is being represented.
The application model has never been defined, but the starting point IS deciding on one's requirements. If you never need to worry about anything other than local time AND you have no DST changes to that time, then you can quite happily work with a clock which is shifted to UTC and ignore all this TZ stuff.

Introducing DST requires a little more care if one is required to log when events take place, since you have an overlap of time for a period each year. Usually an hour, but not exclusively. But you need to be able to identify cleanly the events before and after the transition, and using one or other time as a base time and flagging the DST offset ensures this is consistent. Normalising to UTC may have other advantages but is not essential.

Once one moves to a system which has clients worldwide, logging in and out at all times of the day and night, this logging NEEDS a stable base, which UTC provides. Many systems currently use 'server time', but managing changes in TWO lots of DST transitions does nothing for consistency, especially where one is not entirely sure both rule-sets are from the same version of data. Running everything in UTC also prevents problems where distributed services on other servers are in different timezones as well. A server time clock should simply be UTC ...

Once you have a stable clock, then one can manage the time which is displayed based on the relevant rule set. Be that 'local time' of the event's location or 'client time' of a remote user's location. It is essential though that a change to an event due to new changes to the rule set can be picked up and normalised data corrected if required. The local time may not have changed, but the offset may have, and it is that change which needs processing in the data side of things.
-- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
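The point in this message about the repeated hour at a DST transition can be illustrated with a small Python sketch. The offsets, dates and log layout are invented for the example: each log entry keeps a UTC instant as the stable base, plus the offset that was in force when the event was recorded.

```python
from datetime import datetime, timedelta, timezone

# Made-up offsets for one zone: daylight time before the transition, standard after.
DST = timezone(timedelta(hours=-3))
STD = timezone(timedelta(hours=-4))

# Each entry: (UTC instant, offset in force when the event was logged).
log = [
    (datetime(2018, 5, 13, 2, 30, tzinfo=timezone.utc), DST),  # before clocks go back
    (datetime(2018, 5, 13, 3, 30, tzinfo=timezone.utc), STD),  # after clocks go back
]


def wall_clock(entry):
    """Local time as it would have been shown on the wall at that moment."""
    instant, offset = entry
    return instant.astimezone(offset).replace(tzinfo=None)


local_times = [wall_clock(entry) for entry in log]

# Both events read 23:30 on the wall, so local time alone cannot order them.
same_wall_clock = local_times[0] == local_times[1]

# Ordering by the UTC instant is unambiguous.
ordered = sorted(log, key=lambda entry: entry[0])
```

Keeping the offset next to the UTC instant is one way to realise the "base time plus DST flag" idea: the display value can be rebuilt, and the two events remain distinguishable.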
Thanks for doing all that. This calls out for a link from the tzdb web pages. Proposed patch attached.
On May 31, 2018, at 6:45 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Thanks for doing all that. This calls out for a link from the tzdb web pages. Proposed patch attached. <0001-Mention-AO-TimeZone-Migration.patch>
Thank you, Paul! I think this is awesome.

I just published the Medium post, and I am polishing and adding documentation for the tools.

https://medium.com/servicios-a0/on-solving-the-tzdb-changes-problem-7b9fa8f9...
https://a0.github.io/a0-tzmigration/

Cheers,
-- Aldrin.
participants (5)
- Aldrin Martoq Ahumada
- Andreas Heigl
- Lester Caine
- Paul Eggert
- Robert Elz