Aug. 29, 2013
5:44 p.m.
On 29 August 2013 17:28, Paul Eggert <eggert@cs.ucla.edu> wrote:
> the tz database evolves

+1

> we hope for the better.

So why are you making it worse? That's what I cannot fathom.

The data that is being changed/deleted results in nonsense pre-1970. Until you have a comprehensive solution to the pre-1970 issue, you should revert the commit that has resulted in the nonsense.

As a reminder, "America/Atikokan" used to be this:

LMT -06:06:28
Transition[Gap at 1895-01-01T00:00-06:06:28 to -06:00]
Transition[Gap at 1918-04-14T02:00-06:00 to -05:00]
Transition[Overlap at 1918-10-27T02:00-05:00 to -06:00]
Transition[Gap at 1940-09-29T00:00-06:00 to -05:00]

but is now the same as "America/Panama":

LMT -05:18:08
Transition[Overlap at 1890-01-01T00:00-05:18:08 to -05:19:36]
Transition[Gap at 1908-04-22T00:00-05:19:36 to -05:00]

From this we can say (as fact, not opinion):
- the LMT value has changed (Panama is nowhere near Atikokan)
- the history of data before 1940 in Atikokan has changed
- the history previously showed Atikokan started defining zones in 1895
- the history now shows Atikokan started defining zones in 1908
- the history now shows Atikokan as never having had a -06:00 offset

The other IDs being altered have similar issues, but it's easier to focus on one.

You might argue that it is just pre-1970 data, which is inaccurate and should not be relied on. I simply argue that you've taken the data from unknown quality to definitely inaccurate, which is clearly worse. (And that pre-1970 data is very visible in the work I do.)

> I've compiled some "attic" data (appended to this
> email) which makes it clear that we have regularly replaced
> zones by links during tz maintenance. This practice hasn't
> caused hardships for users.

Links for spelling mistakes obviously cause no issues. Beyond that, they are clearly going to be losers unless the entire history of the two zones is exactly identical. By entire history, I mean LMT, pre-1970 and post-1970.

Bear in mind that the entire source data is visible in Java (LMT, pre-1970 and post-1970) and that we parse the source files directly. zic is a distraction.

Looking at the attic data, it's clear that the LMT has been treated as irrelevant in the past. Every time a zone with a unique ID is converted to a Link, its LMT is lost.

> Both filters could be implemented, and they could be applied
> in series.

This is all very well, but it ignores the fact that other applications, including Java, parse the source files. Those applications would need additional complex logic to fix up the data. There is too much focus on the C code developed here and on Unix, and not enough on other consumers of the data.

FWIW, I also think that filtering like this isn't really a good idea in practical terms for users. For example, say you filter most of central Europe after, say, 2010: you'll only get one zone, as everywhere uses the same time. Now, lots of people set up their machines to that one central zone. Then imagine the case where Greece leaves the EU and starts to set its own time-zone. Everyone in Greece will now need to reset their zone ID. Whereas, if everyone had just selected Europe/Athens up front, there would have been no problem. I.e. zones split as well as merge, and they will often do so on the historic boundaries that are already captured in the tzdb.

As I said above, you should start by reverting the controversial changes (see my other email). That takes the heat out of the immediate issue. Then, only make changes once you have a fully agreed strategy for handling pre-1970 data that is not destructive, and that gives enough notice to others to be able to adapt.

Stephen
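
For anyone wanting to check the Atikokan/Panama comparison earlier in this message against their own copy of the data, here is a minimal sketch. It assumes the java.time API (JSR-310) and whatever tzdb version ships with the runtime, so the output will differ between releases. The class name ZoneHistoryCompare and its helper methods are hypothetical names chosen for illustration, not part of any library.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.zone.ZoneOffsetTransition;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class ZoneHistoryCompare {

    // The contested data in this thread is everything before 1970.
    static final Instant CUTOFF = Instant.parse("1970-01-01T00:00:00Z");

    // Offset in force before the first listed transition (LMT for most
    // tzdb zones). Fixed-offset zones have no transitions, so fall back
    // to the offset at the epoch.
    static ZoneOffset initialOffset(ZoneId zone) {
        List<ZoneOffsetTransition> all = zone.getRules().getTransitions();
        return all.isEmpty()
                ? zone.getRules().getOffset(Instant.EPOCH)
                : all.get(0).getOffsetBefore();
    }

    // Explicitly listed transitions that occur strictly before the cutoff.
    static List<ZoneOffsetTransition> transitionsBefore(ZoneId zone, Instant cutoff) {
        List<ZoneOffsetTransition> result = new ArrayList<>();
        for (ZoneOffsetTransition t : zone.getRules().getTransitions()) {
            if (t.getInstant().isBefore(cutoff)) {
                result.add(t);
            }
        }
        return result;
    }

    // True only if both the initial offset and every pre-1970 transition match.
    static boolean samePre1970History(ZoneId a, ZoneId b) {
        return initialOffset(a).equals(initialOffset(b))
                && transitionsBefore(a, CUTOFF).equals(transitionsBefore(b, CUTOFF));
    }

    // Print the initial offset and the pre-1970 transitions of one zone.
    static void dump(ZoneId zone) {
        System.out.println(zone.getId());
        System.out.println("  Initial offset " + initialOffset(zone));
        for (ZoneOffsetTransition t : transitionsBefore(zone, CUTOFF)) {
            System.out.println("  " + t);
        }
    }

    public static void main(String[] args) {
        ZoneId atikokan = ZoneId.of("America/Atikokan");
        ZoneId panama = ZoneId.of("America/Panama");
        dump(atikokan);
        dump(panama);
        System.out.println("Identical pre-1970 history: "
                + samePre1970History(atikokan, panama));
    }
}
```

Note that getTransitions() returns only the explicitly listed transitions. Recurring rules (getTransitionRules()) are not compared here, which should be sufficient for a pre-1970 check but would not be for a full-history comparison.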