Getting current tzdb version in use

Hi,

I was looking for a reliable way to get the current tzdb version (something like 2018e), but it seems it doesn't exist yet. A couple of examples:

- In macOS 10.13.6, there is a /usr/share/zoneinfo/+VERSION that says 2018c.
- On one of my updated Ubuntu 16.04 servers, the only reliable way is to run `dpkg -s tzdata | grep ^Version` and guess the real version from the string "Version: 2017c-0ubuntu0.16.04".
- The "tzinfo" Ruby library can get the version only if it uses its own copy of tzdata; if it uses the OS files, there is no version string.

Knowing the current tzdb version is essential for automating timezone updates.

I think an appropriate place to put that string is in the binary TZif file itself, so tools like tzinfo can get the current version *of the current timezone in use*, instead of looking at some random file like +VERSION in a random directory.

Is there any chance of this happening? Or any other suggestion?

Aldrin.
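A minimal best-effort sketch in C++ of the status quo Aldrin describes, checking the two version files mentioned in this thread (macOS's +VERSION and tzdb's own "version" file). The paths and the helper name are assumptions for illustration; on most Linux distributions neither file exists, so the empty-string fallback is the common case:

    #include <fstream>
    #include <iostream>
    #include <string>

    // Best-effort lookup of the installed tzdb version. The paths are
    // platform-specific assumptions (macOS ships +VERSION; many Linux
    // distributions ship neither file), so an empty result is expected.
    std::string tzdb_version_guess()
    {
        for (const char* path : {"/usr/share/zoneinfo/+VERSION",
                                 "/usr/share/zoneinfo/version"})
        {
            std::ifstream in(path);
            std::string version;
            if (in >> version)
                return version;
        }
        return "";  // unknown: no version file installed
    }

    int main()
    {
        std::string v = tzdb_version_guess();
        std::cout << (v.empty() ? "tzdb version unknown" : v) << '\n';
    }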

On Jul 18, 2018, at 9:36 PM, Aldrin Martoq Ahumada <aldrin.martoq@gmail.com> wrote:
> I was looking for a reliable way to get the current tzdb version (something like 2018e), but it seems it doesn't exist yet. [...]
> Is there any chance of this happening? Or any other suggestion?
FWIW, in the draft C++20 spec, this is a legal program that prints out what you want:

    #include <chrono>
    #include <iostream>

    int main()
    {
        std::cout << std::chrono::get_tzdb().version << '\n';
    }

Howard

Where is it getting the version information? The tzdb files have the `make version` target, but that is not installed with `make install`. There's also the tzdata.zi file, which has the info in the header comment, but I don't think that is consistently deployed in system packages.

On July 19, 2018 1:45:14 AM UTC, Howard Hinnant <howard.hinnant@gmail.com> wrote:
> FWIW, in the draft C++20 spec, this is a legal program that prints out what you want:
>
>     int main() { std::cout << std::chrono::get_tzdb().version << '\n'; }
>
> Howard

The C++20 draft specification leaves the version unspecified, both in its content and its source. However, it is expected that the implementor will make a best attempt to track the IANA database version number.

The prototype implementation first tries the file "version". If that file doesn't exist, it tries to scrape the version number out of NEWS:

https://github.com/HowardHinnant/date/blob/master/src/tz.cpp#L3316-L3342

On Apple OSes, and if using the OS-supplied zic-compiled files, "+VERSION" is used:

https://github.com/HowardHinnant/date/blob/master/src/tz.cpp#L2618-L2632

std::lib implementors will be free to provide this information however they best see fit to serve their customers. Some implementors will probably initially provide an empty string as the version, and my hope is that they will be down-voted in market share. In time, I hope that C++ std::lib implementors will converge on supplying an accurate representation of the IANA version number, given the standard API for doing so. Customers will have this standard API in their toolbox, and will provide market pressure on their vendors for said API to supply quality results.

Howard
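A rough sketch of that NEWS fallback (the linked tz.cpp is the authoritative implementation). The "Release 2018e - <date>" heading format is taken from tzdb's NEWS file; the helper name is invented for illustration:

    #include <fstream>
    #include <sstream>
    #include <string>

    // Scrape the newest release id from a tzdb NEWS file, whose entries
    // begin with lines like "Release 2018e - 2018-05-01 23:42:51 -0700".
    // Returns an empty string if no such line is found.
    std::string version_from_news(const char* path)
    {
        std::ifstream in(path);
        std::string line;
        while (std::getline(in, line))
        {
            std::istringstream fields(line);
            std::string word, version;
            if (fields >> word >> version && word == "Release")
                return version;  // the first entry in NEWS is the newest
        }
        return "";
    }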
On Jul 18, 2018, at 9:49 PM, Paul G <paul@ganssle.io> wrote:
> Where is it getting the version information? The tzdb files have the `make version` target, but that is not installed with `make install`. There's also the tzdata.zi file, which has the info in the header comment, but I don't think that is consistently deployed in system packages.

Howard Hinnant wrote:
> The C++20 draft specification leaves the version unspecified, both in its content and its source. However it is expected that the implementor will make a best attempt to track the IANA database version number.
> [...]
So the *implementation* of this feature in the lib would also be easier if tzdb provided a standard way to get the version.

BTW, does your implementation check the tzdb version on each call, or only once after startup? In the latter case, an updated tzdb version would only be detected if a program (or the whole system) is restarted ...

Martin

On Jul 19, 2018, at 3:33 AM, Martin Burnicki <martin.burnicki@meinberg.de> wrote:
> So the *implementation* of this feature in the lib would also be easier if tzdb provided a standard way to get the version.
>
> BTW, does your implementation check the tzdb version on each call, or only once after startup? In the latter case an updated tzdb version would only be detected if a program (or the whole system) is restarted ...
On first access to any call that requires a tzdb lookup, the version is looked up on the local disk and cached. However, if the client calls reload_tzdb() (https://en.cppreference.com/w/cpp/chrono/tzdb_functions), the std::lib implementation may load a new version of the tzdb if available.

Howard

Howard Hinnant wrote:
> On first access to any call that requires a tzdb lookup, the version is looked up on the local disk and cached. However, if the client calls reload_tzdb() (https://en.cppreference.com/w/cpp/chrono/tzdb_functions) the std::lib implementation may load a new version of the tzdb if available.
Thanks for the pointer!

Martin

Martin Burnicki wrote:
> So the *implementation* of this feature in the lib would also be easier if tzdb provided a standard way to get the version.
/usr/share/zoneinfo/tzdata.zi contains a "# version" line. Though this is reasonably recent in tzdb, many distributions don't install that file, and it's not part of the stable API.

I have my doubts about the version info. Many distributions apply local changes to their tzdb data, and I expect that they're not updating the version line appropriately. So all you have is some sort of vague good-faith attempt at version info; it doesn't guarantee that you have a particular tzdb release. That is, the version info is not something that a portable program should rely on; all it's really good for is as a string that the program can report to the user so that the user can try to debug whatever goes wrong.
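To make those caveats concrete, a minimal sketch of reading that comment. The install path, the "# version <id>" layout seen in this thread, and the function name are assumptions; for the reasons Paul gives, the result is advisory at best:

    #include <fstream>
    #include <iostream>
    #include <sstream>
    #include <string>

    // Extract the "# version" comment from tzdata.zi, assuming the file
    // is installed at the usual location. Many distributions do not ship
    // this file, and local patches may make the value inaccurate.
    std::string tzdata_zi_version(const char* path = "/usr/share/zoneinfo/tzdata.zi")
    {
        std::ifstream in(path);
        std::string line;
        while (std::getline(in, line))
        {
            std::istringstream fields(line);
            std::string hash, key, value;
            if (fields >> hash >> key >> value && hash == "#" && key == "version")
                return value;   // e.g. "2018e", or "unknown"
        }
        return "";  // file missing or no version comment
    }

    int main()
    {
        std::cout << tzdata_zi_version() << '\n';
    }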

On 19 Jul 2018, at 15:51, Paul Eggert <eggert@cs.ucla.edu> wrote:
> Martin Burnicki wrote:
>> So the *implementation* of this feature in the lib would also be easier if tzdb provided a standard way to get the version.
> /usr/share/zoneinfo/tzdata.zi contains a "# version" line. [...] the version info is not something that a portable program should rely on; all it's really good for is as a string that the program can report to the user so that the user can try to debug whatever goes wrong.
Fedora, RHEL6/CentOS6/OL6 and RHEL7/CentOS7/OL7 have good news and bad news. The good news is that /usr/share/zoneinfo/tzdata.zi exists. The bad news is that it says "# version unknown".

The "tzdata.zi" file is, in fact, a recent addition; it seems to have been added in tzdata-2018e-3, so perhaps the version will get fixed in a future version. Even so, as you say, local changes may be reflected in the version recorded in tzdata.zi, so software relying on that version is going to have to guess whether or not it is up to date.

Perhaps the C++ tzdata-version thing is looking forward to a time when there's a standardised version from tzdist.

jch
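In the meantime, on those RPM-based systems the package manager is still the closest thing to an authoritative answer; the installed package release can be queried with:

    rpm -q tzdata

which reports the packaged release (something like tzdata-2018e-3.el7, to follow John's example), subject to the same downstream-patching caveats, and with a distribution suffix that has to be stripped, much like the dpkg example at the top of the thread.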

John Haxby wrote:
> Fedora, RHEL6/CentOS6/OL6 and RHEL7/CentOS7/OL7 have good news and bad news.
> The good news is that /usr/share/zoneinfo/tzdata.zi exists.
> The bad news is that it says "# version unknown".
The bad-news part appears to be a bug in RHEL. I reported it here: https://bugzilla.redhat.com/show_bug.cgi?id=1604030

On Jul 19, 2018, at 10:51 AM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> I have my doubts about the version info. Many distributions apply local changes to their tzdb data, and I expect that they're not updating the version line appropriately. [...] all it's really good for is as a string that the program can report to the user so that the user can try to debug whatever goes wrong.
One of the big use cases I foresee is comparing your current version with the "remote" (or new) version. A very-long-running application such as an airline reservation system can do this to see if it needs to update the tzdb. For this use case, the only requirement on the version string is that it keep changing with each new version of the database.

    // a singly linked list of tzdb
    std::chrono::tzdb_list& get_tzdb_list();

    // the front of the list
    const std::chrono::tzdb& get_tzdb();

    // The latest tzdb version, which may have been updated since application start
    std::string remote_version();

    // Your application's current version
    get_tzdb().version

    // Updates your application's current tzdb if get_tzdb().version != remote_version()
    const std::chrono::tzdb& reload_tzdb();

https://en.cppreference.com/w/cpp/chrono/tzdb_functions

If every version == "unknown", this strategy will spectacularly fail, and subsequently the std::lib vendor may as well.

Howard
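A minimal sketch of that check as a polling loop, assuming a C++20 library that implements the functions above; the daily interval, the function name, and the logging are illustrative only:

    #include <chrono>
    #include <iostream>
    #include <thread>

    // Periodically compare the in-memory tzdb version against the latest
    // available one, reloading when they differ. remote_version() and
    // reload_tzdb() are the standard C++20 functions quoted above; how
    // they discover a newer database is up to the vendor.
    void tzdb_update_loop()
    {
        using namespace std::chrono_literals;
        for (;;)
        {
            if (std::chrono::remote_version() != std::chrono::get_tzdb().version)
            {
                const auto& db = std::chrono::reload_tzdb();
                std::cout << "tzdb updated to " << db.version << '\n';
            }
            std::this_thread::sleep_for(24h);
        }
    }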

On 07/19/2018 08:39 AM, Howard Hinnant wrote:
> If every version == "unknown", this strategy will spectacularly fail, and subsequently the std::lib vendor may as well.
Presumably get_tzdb().version should have some way of failing if there is no version info, and it should fail if tzdata.zi says '# version unknown'. Although this wouldn't solve the other version problems we've seen on this thread, it should solve this one.

On Jul 19, 2018, at 12:58 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 07/19/2018 08:39 AM, Howard Hinnant wrote:
>> If every version == "unknown", this strategy will spectacularly fail, and subsequently the std::lib vendor may as well.
> Presumably get_tzdb().version should have some way of failing if there is no version info, and it should fail if tzdata.zi says '# version unknown'. Although this wouldn't solve the other version problems we've seen on this thread, it should solve this one.
The prototype implementation throws an exception, though that detail is not currently specified in the C++20 spec. I'll give it more thought.

Howard

On 19 July 2018 at 04:12, Howard Hinnant <howard.hinnant@gmail.com> wrote:
> std::lib implementors will be free to provide this information however they best see fit to serve their customers. Some implementors will probably initially provide an empty string as the version, and my hope is that they will be down-voted in market share. In time, I hope that C++ std::lib implementors will converge on supplying an accurate representation of the IANA version number, given the standard API for doing so. Customers will have this standard API in their toolbox, and provide market pressure to their vendors for said API to supply quality results.
On 19 July 2018 at 17:39, Howard Hinnant <howard.hinnant@gmail.com> wrote:
> If every version == "unknown", this strategy will spectacularly fail, and subsequently the std::lib vendor may as well.
That might work if people use their C++ standard library only for tz work, or if all implementations are identical in all respects but for the tz portion.

But if you have implementation A with a perfect tz implementation where for some reason std::list, std::set and std::map do not work at all, and implementation B where the template library works perfectly but the time zone version always returns "unknown", guess which one will win. I can't see how there can be market pressure for implementation B to improve its tz portion; it's not as if prospective customers will go over to implementation A just because its tz portion works perfectly, if there are glaring problems in other portions.

Cheers,
Philip

On Jul 20, 2018, at 7:03 AM, Philip Newton <philip.newton@gmail.com> wrote:
> But if you have implementation A with a perfect tz implementation but for some reason std::list, std::set and std::map do not work at all, and implementation B where the template library works perfectly but the time zone version always returns "unknown", guess which one will win. I can't see how there can be market pressure for implementation B to improve its tz portion [...]
You're completely right. I might as well quit trying and just go home and watch reruns. Thanks for saving me a ton of time.

Howard

Howard Hinnant wrote:
> I might as well quit trying and just go home and watch reruns.
Hogan's Heroes is my rerun of choice. Its main point is to show that the evil, corrupt, and powerful Germans were incompetent fools. The real world should be more like Hogan's Heroes.

I recently watched its episode "The Antique" (first aired 1969-12-12), in which prisoner-of-war Hogan bamboozles camp commandant Klink into shipping coded cuckoo clocks to Paris, Le Havre, Cherbourg, Brussels and Amsterdam. Unfortunately the episode doesn't mention that the clocks' time settings are already OK, because the Germans had imposed German time on all these areas. So I shouldn't link to it from tz-art.html, unfortunately.

http://hh.wikia.com/wiki/The_Antique

Paul Eggert wrote in <a7d0bc5e-54ff-e000-6431-b870658e803a@cs.ucla.edu>:
> Howard Hinnant wrote:
>> I might as well quit trying and just go home and watch reruns.
> Hogan's Heroes is my rerun of choice. Its main point is to show that the evil, corrupt, and powerful Germans were incompetent fools. The real world should be more like Hogan's Heroes.

Ach! Sigh!!!

The thing is, I am reading specific books in my launderette, something that can be gobbled in that place. The last was the nice Stevenson's Treasure Island, before that Dickens, but now I unfortunately turned to that Tory! Tory!! Tory!!! Mr. Defoe and his "The Life and Strange Surprising Adventures of Robinson Crusoe ... With An Account how he was at last as strangely deliver'd by Pyrates", published in Anno Domini 1719. And to me it is anything but appealing if a man who desires all the freedom for himself points a weapon at your head, and is willing to make use of it. All or nothing, take it or leave it, and that on page 30 of the German translation, and even after having been given a run around on the prior pages. And I know it will not get better. From a man who helped overthrow his king, and went bankrupt several times. Obviously being convinced of being justified, of being on the right track, of not having a need to reach out for a wider context, etc. etc. etc. That is from 1719.

Over a hundred years ago we, a.k.a. the humans, had a good time ("Belle Époque", and, of course, generally speaking), and many good things were developed or explored. For example Anthroposophy, which possibly for the first time could have led the white race to be on par with bushmen regarding holistic integrity and sustainability. Of course it did not; it is us, not them.

I am always stunned by the fact that parts of the research institutes that were collected under the name Kaiser-Wilhelm ("let there be more light!", maybe he was pink, but I would not say that in public) now, under the aegis of the empire, firm under the name Fritz Haber. Of course he won the Nobel Prize (at a time when this was worth more than it is today, imho) by giving fertilizer to our civilization. But he also endorsed chemical warfare, and supervised the very first, and further, such actions in WWI. And I admire and adore his wife, the beautiful Ms. Clara Immerwahr (which can be translated to "Clara AlwaysTrue") very much: she was the first woman who did a doctorate in Germany, and magna cum laude. She shot herself in the head in the garden of her house on the day after he returned from supervising the first use of chemical warfare, and what can be heard is that she did so because she wanted to use research to help the people. Such a dedication, and purity of heart and soul.

I am from Darmstadt, and the last tsarina was Alix von Hessen-Darmstadt. Exactly a hundred years (and five days) ago she and her entire family were murdered. They all have been canonised. Because it matters. Our Duke Ernst Ludwig of Hessen-Darmstadt is my rerun of choice; he built so many schools, waterworks, churches and minsters; there was art and beauty, heart and soul.

> I recently watched its episode "The Antique" (first aired 1969-12-12), in which prisoner-of-war Hogan bamboozles camp commandant Klink into shipping coded cuckoo clocks to Paris, Le Havre, Cherbourg, Brussels and Amsterdam. Unfortunately the episode doesn't mention that the clocks' time settings are already OK because the Germans had imposed German time on all these areas. So I shouldn't link to it from tz-art.html, unfortunately.
> http://hh.wikia.com/wiki/The_Antique

All this was very much off topic. Sorry for that.

[1] https://de.wikipedia.org/wiki/Elisabeth_von_Hessen-Darmstadt_(1895–1903)#/media/File:Rosenh%C3%B6he_Darmstadt_-_memorial_-_IMG_7053.JPG

--steffen

Date: Thu, 19 Jul 2018 07:51:28 -0700
From: Paul Eggert <eggert@cs.ucla.edu>
Message-ID: <b29fe5b7-1475-f583-dbaf-ca43782bebcf@cs.ucla.edu>

> I have my doubts about the version info.

I have more than doubts. I am convinced that the entire thing is misguided, and that (with the one exception of curiosity value, the ability of an application to tell people which version of the data it is using) there is no good use of this data in the way it is being mooted that should not be handled a different way, and that anything using the tzdata version for this (like using the results of that C++ interface that was mentioned) will cause more problems than it can ever solve.

To consider the one example that was presented here in the past day or so - the long-running airline application that wants to auto-update the tzdata files as needed. That's a reasonable aim. But putting version info in the data is neither needed, nor useful, for that. The assumption is that the app can look and see if a new version of the data has appeared, compare with the running version, and update as needed. Fine. But discovering what the new version is cannot be done using version info in the data, as that data does not exist yet, only the sources. The app must be using either the version info in the file names, or the version info in the version file. When it updates, it can store that "new" version info anywhere convenient to it, which, when the update is done, will become the "version in use" that it can use when looking for the next update to appear. What's more, this is really all part of the packaging system (whether that is something built into the application, or something separate) and not part of the mainstream of the application.

Further, depending upon how it is done, putting version info into the data files can be counterproductive, and result in lots of churn. E.g.: sometime (probably not too far away) we are going to get 2018f - in that, the zone file for Fiji is going to be changed, and anyone who updates to that version will get new data for Fiji. But there is (unless I have forgotten some other pending change) no need for any of the other zone files to alter, and a system might very well detect that, and only update the zones that actually contain different data, rather than all of them. That results in a much more stable system - but if those files contain version info, soon enough there will be many different installed versions. If the version info is to be somewhere else, then the system might just as well install the "version" file somewhere, and use that (assuming it can find a reliable use for it).

Longer ago, perhaps the last time this issue came up, or perhaps one of the times before that, there was some suggestion that software could automatically update times to account for changes to the zone data. Even assuming this is rational (which I doubt), version info is not needed (or useful) for that - what is wanted is the "old" data, and the "new" data (whatever they call themselves) and the ability to read both, and compare whatever needs comparing. For this, nothing cares whether the old data is 2018d or 2015c when you're updating to 2018e (nor that the new version is called that) - just that "we used to get this conversion, we now get this other one, which means we should ..."

As to the "assuming it is rational" - just go back to the airline example again. We know there is going to be a week when (in Fiji) the old and new tzdata (2018e and 2018f) give different results. But to believe from that that the flight times into or out of Suva need to simply be adjusted by an hour would be lunacy. For example, internal flights (within Fiji) will almost certainly continue to operate at the same local wallclock times, which will be an hour earlier (or later, I forget what the change was) in UTC-equivalent time for a week than what they would have been. On the other hand, international flights get far more complex - they can't just be moved by an hour, as that would upset the scheduling (including gate availability, congestion for takeoff/landing ...) at the other end of the flight, plus not arriving too late for connecting flights (on other airlines which do not fly to Fiji and have no reason to alter anything), and dealing with airport curfews, etc. It all gets horribly complex, and can't be handled trivially. Of course, the Fiji case is not all that hard, as the new schedule already exists; the only question is which day the switch is made from old to new ... well, almost; there's still a lot of coordination involved when the "new" means changes that affect operations at other places, and so everything needs to be coordinated.

For most other applications, if the correct data is stored, nothing will care if the offsets change (not that nothing will be affected, just that it will all take care of itself) - things only go wrong when apps attempt to short-cut handling of time data, rather than doing it properly.

Forget this. Leave version info alone; we have quite enough of it already, and do not need more, and certainly do not need a method for any random application that happens to call localtime() to find out which tzdata version was used.

kre

Just to be clear, and to head off any misunderstandings: I am satisfied with the current versioning and packaging at https://www.iana.org/time-zones. No changes are needed for the C++20 calendar and timezone extensions to chrono. Indeed, stability is what I would request, especially in the data formats of both the text and binary forms of the data. I am happy with the current version file in the text database. And vendors can easily add version files to their compiled binary databases (as Apple already does for macOS, iOS and watchOS).

Howard

On Jul 19, 2018, at 7:29 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> Howard Hinnant wrote:
>> vendors can easily add version files to their compiled binary databases (as Apple already does for macOS, iOS and watchOS).
> How does that work, exactly? I don't have easy access to any of those systems. Is it documented anywhere?
Deborah Goldsmith should chime in if I get anything wrong... I'm not aware of any documentation (I haven't looked), and I'm ignorant of the Apple deployment process.

On the most recent update to macOS, the file:

    /private/var/db/timezone/tz/2018c.1.0/zoneinfo/+VERSION

contains:

    2018c

And that's good enough for a std::lib vendor on macOS to read and fill out std::chrono::get_tzdb().version. There also exists a generic (version-free) symlink path to the same file:

    /usr/share/zoneinfo/+VERSION

(which is what my prototype uses when compiled in the mode of using the zic-compiled binary database).

At this time Apple does not do the following, but it is possible that in the future Apple could choose to update the zoneinfo database to a new directory and atomically update the /usr/share/zoneinfo/ symlink _without_ requiring a system reboot. This would enable long-running programs to query the OS, detect the version change of the tzdb, and update their internal RAM copies of the tzdb, without the program having to go through a shutdown/restart sequence. The C++20 API is anticipating (but not requiring) that capability.

Howard
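A sketch of what a long-running program could do if that hypothetical symlink-retargeting behavior ever ships; the path, polling interval, and overall approach are assumptions built on Howard's description, not current Apple behavior:

    #include <chrono>
    #include <filesystem>
    #include <iostream>
    #include <thread>

    // Detect a tzdb update by watching where the zoneinfo symlink points,
    // assuming the (currently hypothetical) atomic retargeting of
    // /usr/share/zoneinfo described above.
    int main()
    {
        namespace fs = std::filesystem;
        const fs::path link = "/usr/share/zoneinfo";
        auto target = fs::is_symlink(link) ? fs::read_symlink(link) : fs::path{};
        for (;;)
        {
            std::this_thread::sleep_for(std::chrono::hours(1));
            auto now = fs::is_symlink(link) ? fs::read_symlink(link) : fs::path{};
            if (now != target)
            {
                std::cout << "tzdb changed: " << now << '\n';
                target = now;   // a real program would call reload_tzdb() here
            }
        }
    }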

Robert Elz wrote:
> I have more than doubts, I am convinced that the entire thing is misguided, and that (with the one exception of curiosity value, the ability of an application to tell people which version data it is using) there is no good use of this data
Yes, that's the only good use I can think of as well. That is, the version string is more for diagnosing obsolete or misconfigured systems than for use in ordinary computation. That being said, configuration is of growing importance in software engineering, and I hope that the version comment in tzdata.zi is enough to satisfy needs in this area.

With TZDIST the version info is best done via ETags, and this should work with TZif files so that clients can easily see whether they're up-to-date and get a new version if not. See Internet RFC 7808 section 4.1.4 along with:

https://tools.ietf.org/html/draft-murchison-tzdist-tzif-09

Paul ... can you bounce my messages to the list ... this phone client has buggered up the user interface yet again ... why do developers think they know best what users need.

You guys know this thread was not sent to the main list? Was that on purpose?

I will say that I agree about the churn: that's a good reason not to include the version in every zone file. But even if the version is interpreted as "this is based on XXXX", it would be very useful, particularly since the common case is that the version is accurate, and any inaccurate version would be a bug in the deployment, *not* a bug in tzdb. You can't stop your downstream distributors from introducing bugs, but you also can't avoid features just because someone might introduce a bug in them.

I can tell you a few places I've found need for the version:

1. I have some tests in my test suite that check certain edge cases, one of which is the international date line switch in Kiribati in 1994. Old versions have this happening on the wrong day, so when that changed, my test suite broke. Ideally I would write my tests like this (pseudocode):

       if tzdata_version >= '2018d':
           transition_date = 1995-01-01
       else:
           transition_date = 1994-12-30

   I would be comfortable skipping this test on platforms that screw up the version information, since the version check has nothing to do with my real application.

2. It is useful to check the version to see if the system version is outdated when debugging, and in general as part of logging to provide maximum context when trying to find bugs later. If I'm digging through some old logs trying to figure out what happened and the log starts "tzdata version == 2014c; OS: Linux; Distribution: RHEL6" or something, I can see if there were any issues in tzdata version 2014c, and whether RHEL has made any weird modifications to the system-deployed tzdata.

3. It can be useful when you have the choice of more than one source for time zone data and want to try to ascertain which one is more recent and use that one. For example, my library dateutil ships an up-to-date version of the tz database (with version information), but will always choose the system-installed version if it exists. I would prefer it if instead I could check the system-installed tzdata's version and the version of the tzdata I shipped and use whichever one is more recent (or give users the choice). Without shipped version information, this is not possible.

In all of these cases, I believe that *some* version information would be better than none, even if not all the complexity of what it means to be a "new version" is captured.

At this point, of course, I think the battle in the tz project is won: tzdata.zi exists and, last I checked, `make install` installs it into `/usr/share/zoneinfo`, so now it's time to get system distributors to make sure they include it in their distributions, I guess.

Best,
Paul

On 07/19/2018 07:40 PM, Paul Eggert wrote:
> Robert Elz wrote:
>> I have more than doubts, I am convinced that the entire thing is misguided, and that (with the one exception of curiosity value, the ability of an application to tell people which version data it is using) there is no good use of this data
> Yes, that's the only good use I can think of as well. That is, the version string is more for diagnosing obsolete or misconfigured systems than for use in ordinary computation. That being said, configuration is of growing importance in software engineering, and I hope that the version comment in tzdata.zi is enough to satisfy needs in this area.
>
> With TZDIST the version info is best done via ETags, and this should work with TZif files so that clients can easily see whether they're up-to-date and get a new version if not. See Internet RFC 7808 section 4.1.4 along with:
> https://tools.ietf.org/html/draft-murchison-tzdist-tzif-09
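For what it's worth, upstream release identifiers like 2017c and 2018e sort correctly as plain strings (four-digit year, then a lowercase letter), so the "which is more recent" check in item 3 can start as a string comparison. A minimal sketch with an invented helper name; vendor suffixes such as -0ubuntu0.16.04 and values like "unknown" are deliberately ignored here and would need their own handling:

    #include <iostream>
    #include <string>

    // tzdb release identifiers ("2018e") compare correctly as plain
    // strings: four-digit year followed by a lowercase letter. Vendor
    // suffixes and "unknown" are not handled by this sketch.
    const std::string& newer_tzdb(const std::string& a, const std::string& b)
    {
        return a < b ? b : a;
    }

    int main()
    {
        std::cout << newer_tzdb("2017c", "2018e") << '\n';  // prints 2018e
    }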

Date: Fri, 20 Jul 2018 10:12:22 -0400
From: Paul G <paul@ganssle.io>
Message-ID: <76b9ed1b-b097-f483-2709-dddf02ad0e37@ganssle.io>

> You guys know this thread was not sent to the main list?

I just did a random check on messages in my copies, and they all look to have been sent to tz@iana.org - what makes you think otherwise? And what other list is there they could have been sent to?

> I can tell you a few places I've found need for the version:
>
> 1. I have some tests in my test suite that check certain edge cases, one of which is the international date line switch in Kiribati in 1994.

The right way to test that is to check the translation of 1994-12-31 in the Kiribati timezone, and see what UTC value you get. That way, regardless of who fixed what, or when, or in what random patch to what version, you'll always find out what you should use. To be even more flexible you could work through some times, in case some other random version had the transition listed at yet another point in time.

> 2. It is useful to check the version to see if the system version is outdated

This is a package manager issue - and yes, software installers need to keep track of which versions are installed, or everything...

> 3. It can be useful when you have the choice of more than one source for time zone data

Same thing.

> In all of these cases, I believe that *some* version information would be better than none, even if not all the complexity of what it means to be a "new version" is captured.

We have that now, and have for some time (and on NetBSD, before the version file appeared in tzdata, we had a TZDATA_VERSION which served the same purpose ... we still do...)

> At this point, of course, I think the battle in the tz project is won - tzdata.zi exists and last I checked `make install` installs it into `/usr/share/zoneinfo`, so now it's time to get system distributors to make sure they include it in their distributions, I guess.

Which we don't in NetBSD, that file adds no value for us.

kre

Robert Elz wrote:
> Which we don't in NetBSD, that file adds no value for us.
The main intent of the tzdata.zi file was to have a single file containing all the information of all the TZif files, so that a software upgrade procedure could get a copy of tzdata.zi and then run zic to generate the other files. tzdata.zi is considerably smaller than a tarball of the TZif files it represents: a factor of 4 smaller if you use gzip to compress both, and a factor of 3 smaller even if you use lzip, which is much better at compressing. It'd be cool to update all the TZif data by downloading a tzdata.zi.lz file that contains only 21556 bytes (the current size). This particular bit of micro-optimization hasn't caught on yet, but perhaps some day it will in IoT devices where those bytes still count.
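For instance, an updater that has fetched a new tzdata.zi could regenerate the compiled TZif tree with a single zic run; the output directory here is illustrative:

    zic -d ./zoneinfo tzdata.zi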

Paul Eggert <eggert@cs.ucla.edu> writes:
> Robert Elz wrote:
>> Which we don't in NetBSD, that file adds no value for us.
> The main intent of the tzdata.zi file was to have a single file containing all the information of all the TZif files, so that a software upgrade procedure could get a copy of tzdata.zi and then run zic to generate the other files. [...]
FWIW, Postgres has already switched over from shipping the raw tzdb text files in our tarball to shipping tzdata.zi, precisely because it's a lot smaller and you get the same output after running it through zic. So it adds value for us, whether or not any particular other platform wants to use it. And we like having some readable version indication in it, too.

I think the request to propagate that version info into the file tree installed by zic is equally reasonable. As noted, a lot of people have invented their own solutions for this omission, which proves that there's a need for it.

regards, tom lane

Tom Lane wrote:
> FWIW, Postgres has already switched over from shipping the raw tzdb text files in our tarball to shipping tzdata.zi, precisely because it's a lot smaller and you get the same output after running it through zic.
Thanks for letting me know. I have a version-related question about that.

zishrink.awk currently avoids some optimizations (and thus generates a longer file than it could) for compatibility with zic 2017b and earlier. I was thinking of enabling those optimizations in a few months, on the theory that tzdata.zi is a relatively new feature and anybody who's using it is also using a newer zic.

So my question is: does "running it through zic" mean running tzdata.zi through the system zic (which is not always present, and which may predate 2017b), or does it mean running tzdata.zi through your own copy of zic? If the former, then I suppose we should wait a few years before adding those optimizations to zishrink.awk. If the latter, then I hope we can shrink tzdata.zi a bit in a few months rather than waiting until 2022.

Paul Eggert <eggert@cs.ucla.edu> writes:
> Tom Lane wrote:
>> FWIW, Postgres has already switched over from shipping the raw tzdb text files in our tarball to shipping tzdata.zi, precisely because it's a lot smaller and you get the same output after running it through zic.
> So my question is: does "running it through zic" mean running tzdata.zi through the system zic (which is not always present, and which may predate 2017b), or does it mean running tzdata.zi through your own copy of zic?
We run it through our own copy of zic --- generally, that subsystem is meant to support platforms that don't have tzdb at all, so we couldn't really expect zic to be available either. (On well-maintained platforms, people typically configure Postgres to use the platform's copy of tzdb, not its own, whereupon updates are not our problem.)

We are somewhat more slothful about updating our copy of the tz code than the tz data, but all our currently-maintained release branches are currently synced to tzcode 2018e, so it'd be no problem for us if you do something that requires a current zic. In general, any time you guys do something to the data that requires tzcode >= NN, we have to be sure we're up to NN because of the possibility that platform copies of tzdb will get the update almost immediately, whether or not we update our copy of the data. So that tends to be a stronger forcing function than zic proper anyway.

Thanks for asking!

regards, tom lane

> I just did a random check on messages in my copies, and they all look to have been sent to tz@iana.org - what makes you think otherwise? And what other list is there they could have been sent to?
Sorry, that was my mail client: it was only showing me the first 3 or 4 recipients. Sometimes people end up replying to specific people and not the list, though it's probably implausible that a reply would grab 5 people and not the list as one of them. My fault.
>> I can tell you a few places I've found need for the version:
>>
>> 1. I have some tests in my test suite that check certain edge cases, one of which is the international date line switch in Kiribati in 1994.
> The right way to test that is to check the translation of 1994-12-31 in the Kiribati timezone, and see what UTC value you get. That way, regardless of who fixed what, or when, or in what random patch to what version, you'll always find out what you should use. To be even more flexible you could work through some times, in case some other random version had the transition listed at yet another point in time.
Yes, I heuristically do this, but in a test suite I prefer to minimize the amount of logic that relies on the thing I am testing. If I know that there was a change to a test case in a specific version, I can say, "Use the old test case if you see a version older than this, and use the new test case if you see a version newer than this", or "skip this test if the version is older than X".

Another case is the Europe/Dublin negative DST. I want to write a test that says, "Europe/Dublin should return negative DST in the winter", but that test only works with versions >= 2018e. There's no really good heuristic for that - if it doesn't return negative DST, it's either that my version is too old or my implementation is wrong. Having a version number really helps distinguish between these two.
>> 2. It is useful to check the version to see if the system version is outdated
> This is a package manager issue - and yes, software installers need to keep track of which versions are installed, or everything...
I want to write software and libraries independent of package managers. It's very common practice for software to have a method for querying the version using the conventions of the software itself for that very reason. System deployment details shouldn't be baked into my library, but "When I'm looking for time zone data, which of the many versions of the time zone data that is distributed do I have" is a very reasonable question to ask.
>> 3. It can be useful when you have the choice of more than one source for time zone data
> Same thing.

>> In all of these cases, I believe that *some* version information would be better than none, even if not all the complexity of what it means to be a "new version" is captured.
> We have that now, and have for some time (and on NetBSD, before the version file appeared in tzdata, we had a TZDATA_VERSION which served the same purpose ... we still do...)

>> At this point, of course, I think the battle in the tz project is won - tzdata.zi exists and last I checked `make install` installs it into `/usr/share/zoneinfo`, so now it's time to get system distributors to make sure they include it in their distributions, I guess.
> Which we don't in NetBSD, that file adds no value for us.
If there were *any* standard way to query the version of the data independent of platform (as there is for accessing *all* other data about the zone, mind you - I don't see why metadata should be different), I would be happy.

Given that those of us providing libraries accessing this data want the information, and the people consuming our libraries want the information, I would say that it doesn't add *no value* for NetBSD: using the standard scheme for marking the version means it's easier to write cross-platform software that includes NetBSD.

Paul G wrote:
> I want to write a test that says, "Europe/Dublin should return negative DST in the winter", but that test only works with versions >= 2018e. There's no really good heuristic for that - if it doesn't return negative DST it's either that my version is too old or my implementation is wrong. Having a version number really helps distinguish between these two.
That test should also work for versions 2018a and 2018b. If you're running that test only for versions 2018e or later it won't matter that it also works in some earlier versions. But if you expect the test to work one way in 2018e-or-later and the other way in 2018d-or-earlier, you'll find that this expectation is not true for 2018a and 2018b. This underscores the point that it's typically better to test for the feature you care about, instead of trying to infer features from version strings.
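In the C++20 chrono vocabulary used earlier in this thread, that feature-first advice might look like the following probe, which asks the tzdb itself how it models Dublin in winter rather than consulting a version string. A sketch, assuming a library that implements the C++20 time zone API:

    #include <chrono>
    #include <iostream>

    // Feature probe rather than a version check: does the installed tzdb
    // model Europe/Dublin with negative DST in winter? Per the discussion
    // above, this holds for 2018a, 2018b, and 2018e or later.
    int main()
    {
        using namespace std::chrono;
        const auto& zone = *locate_zone("Europe/Dublin");
        // Mid-January: Dublin is on GMT, which the negative-DST model
        // expresses as standard time plus a save of -1:00.
        sys_info info = zone.get_info(sys_days{January/15/2019});
        std::cout << (info.save < 0s ? "negative DST\n" : "classic model\n");
    }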

On Fri, Jul 20, 2018 at 2:45 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
> Paul G wrote:
>> I want to write a test that says, "Europe/Dublin should return negative DST in the winter", but that test only works with versions >= 2018e. [...]
> That test should also work for versions 2018a and 2018b. If you're running that test only for versions 2018e or later it won't matter that it also works in some earlier versions. But if you expect the test to work one way in 2018e-or-later and the other way in 2018d-or-earlier, you'll find that this expectation is not true for 2018a and 2018b.
So, the version test has to be more complicated. So be it.
> This underscores the point that it's typically better to test for the feature you care about, instead of trying to infer features from version strings.
The feature he is trying to test is that his implementation correctly delivers the underlying data, and that seems impossible without having some idea what the underlying data is. Perhaps the tests should only be run on version-locked data, but that limits their usefulness.

Paul G wrote:
> I can tell you a few places I've found need for the version:
Thanks for reporting those war stories; they're helpful, and tend to confirm my feeling that the version number can be useful in diagnosing configuration mistakes. One comment:
> 3. It can be useful when you have the choice of more than one source for time zone data and want to try and ascertain which one is more recent and use that one.
Unfortunately it's not that simple, because in some cases the sources can disagree but neither will be "more recent"; for example, both might be based on 2018e but with different local patches applied. This is partly why I'm skeptical about attempts to use version information for anything important other than "report this string to the users, and let them figure it out".
> At this point, of course, I think the battle in the tz project is won - tzdata.zi exists and last I checked `make install` installs it into `/usr/share/zoneinfo`, so now it's time to get system distributors to make sure they include it in their distributions, I guess.
Plus, distributors need to include something better than "# version unknown", which is what Red Hat is doing unfortunately. See: https://bugzilla.redhat.com/show_bug.cgi?id=1604030

On 07/20/2018 10:58 AM, Paul Eggert wrote:
> Unfortunately it's not that simple, because in some cases the sources can disagree but neither will be "more recent"; for example, both might be based on 2018e but with different local patches applied. This is partly why I'm skeptical about attempts to use version information for anything important other than "report this string to the users, and let them figure it out".
Yes, it's not perfect, and as you mentioned in the RHEL case downstream packagers can introduce bugs, but it's certainly better than nothing. The previous status quo with *no* accessible version information meant that essentially there was no way to decide which one even *claims* to be "more recent". With tzdata.zi, we can at least tell what the system claims the version is, and we can use that to check:

1. If the deployed version complies with the upstream versioning scheme (see the sketch after this message)
2. Whether the version differences are ambiguous (e.g. 2018e+dev23deafffa vs. 2018e+dev33471873 - which to choose?)

It's up to people who implement consumer libraries to not blindly believe the versions, of course - e.g. by default believe the versions but provide a mechanism for users to say, "Actually just always use the system version" or "Actually just always use this version" - but I think that once tzdata.zi is widely deployed, in 90%+ of cases you won't go wrong by trusting the system version.
>> At this point, of course, I think the battle in the tz project is won - tzdata.zi exists and last I checked `make install` installs it into `/usr/share/zoneinfo`, so now it's time to get system distributors to make sure they include it in their distributions, I guess.
> Plus, distributors need to include something better than "# version unknown", which is what Red Hat is doing unfortunately. See:
> https://bugzilla.redhat.com/show_bug.cgi?id=1604030
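A sketch of check 1, with a hypothetical helper name. The pattern encodes the modern upstream scheme (four-digit year plus a lowercase letter); vendor-patched strings like "2018e+dev...", "unknown", and the two-digit identifiers of very old releases would all fail it and need their own policy:

    #include <regex>
    #include <string>

    // Does a claimed version string follow the upstream tzdb release
    // scheme, e.g. "2018e"? Anything else (vendor suffixes, "unknown")
    // needs separate handling by the consuming library.
    bool is_upstream_release(const std::string& v)
    {
        static const std::regex scheme{R"(\d{4}[a-z])"};
        return std::regex_match(v, scheme);
    }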

Date: Fri, 20 Jul 2018 11:33:28 -0400 From: Paul G <paul@ganssle.io> Message-ID: <f81b3a20-7f57-a214-c254-e8c1b190c38c@ganssle.io> I think a lot of the disconnect here is summer up by this ... | but I think that once tzdata.zi is widely deployed, in 90%+ of cases | you won't go wrong by trusting the system version. To me, that's entirely backwards, and completely wrong. That is, if it gets to 90% it will be bad, if it makes it to 95% it wil be worse, and if it makes it to 99% it would be a disaster. Not that there's anything wrong with the tzdata.zi file itself (and Paul, the other Paul, I know what its purpose was - but saving a few bytes of download time in an environment when half the world are streaming movies (or porn) and the rest are trying to download kde or openoffice just isn't an objective that matters.) The issue is that unless you can guarantee 100% (and no-one is foolish enough to claim that) relying on something that "almost always works" or "works almost everywhere" (and then passing the buck when it doesn't work ... "*they* should have fixed that!" ) is what leads to hard to find and/or fix problems, things that fail so rarely that no-one cares enough to actually fix them. If it was just 30%, everyone would know that this was a likely cause of problems, and what to do to to deal with the issues. To go back to the (one real) specific example ... In a different message paul@ganssle.io said: | Yes, I heuristically do this, but in a test suite I prefer to minimize the | amount of logic that relies on the thing I am testing. If I know that there | was a change to a test case in a specific version, I can say, "Use the old | test case if you see a version older than this, and use the new test case if | you see a version newer than this", or "skip this test if the version is | older than X". The problem is that this kind of thing works, sometimes... In the example in question, if you're in Europe, or the Americas, and barely have any idea where Kiribati is, and don't much care what happens there (and quite likely have no idea how to pronounce it) there is a fair chance this strategy would work. But if you're actually in Kiribati (or nearby) and care about communication, and coordination, then you might find that quite a lot of tzdata users knew about the problem, and had been patching it locally to make the data be correct - in version after version of tzdata, until it was eventually fixed in the tzdata distribution (recently). If you make the distinction just by looking at versions, your test will fail. If you do it properly (and tests are really a place where things should always be done properly, a few seconds of performance really is irrelevant, but bad results matter) then you will always get the right result. Similarly ... in the other message paul@ganssle.io also said: | Another case is the Europe/Dublin negative DST. I want to write a test that | says, "Europe/Dublin should return negative DST in the winter", but that test | only works with versions >= 2018e. There's no really good heuristic for that | - if it doesn't return negative DST it's either that my version is too old or | my implementation is wrong. Having a version number really helps distinguish | between these two. Again, that's attempting to do it on the cheap, rather than do it properly. The first issue is to discover whether the software supports a zone with a time shift that goes backwards, rather than forwards (one that temporarily sets the clocks backwards from the local notion of standard time.) 
The (or a) way to do that is to configure a small "fake" timezone in exactly that way (write a very small zic input file), run zic on it (and make sure that doesn't fail first), and then use the result to test the software; a concrete sketch follows this message. That way you find out for sure whether or not the software works, not depending upon being able to guess which random piece of (possibly) installed standard data might happen to be a good test. If the software doesn't support it, then there is no real point attempting to perform further tests in this area; you have already detected a primary problem. If the software does all work correctly, then you can check zones where this should be able to be observed, and see if it is or not. But even if the "not" happens, you still don't know that the tzdata version is old - you may have run into one of the cases where the local tzdata packager is one of the people who believes that the "variant" period of the UTC offset for a zone must always be a step forward from the "standard" version, and has been patching the Dublin zone back to the old way "because everyone knows that is how it really should be" - and distributing that to their users.

Lastly, for now anyway ... again in the other message paul@ganssle.io said:

| I want to write software and libraries independent of package managers.

That's fine - mention of package managers was just because those are one thing that does care about version numbers - but note it is their own version numbers. Those might be derived from, or even identical to, those used by the source of the package, but to the package manager, it is its own version number, that it maintains, and updates as needed. The exact same version number in a different distribution could mean something totally different. I totally agree that a test suite is not that (or should not be), nor will most software or libraries be (the mythical long-running airline system, that updates itself, might be an exception).

The point is that for your purpose you should not be relying on (or even knowing about) version info. It is the wrong way. The quick and easy simple way perhaps - but not the way that really works. Just put in the work to do things correctly; long term, you will be happy that you did. Always test for what you need to know - exactly - never simply assume that if X is detected, then (an unrelated) Y will also exist, just because that is what usually happens.

kre
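A minimal sketch of the "fake zone" probe described above. The zone name, file names, and scratch paths are hypothetical; the rules mirror Europe/Dublin's post-2018e use of a negative SAVE (standard time +01:00 as IST, with a -1:00 "winter saving" giving GMT):

# Build a tiny zone whose "DST" offset is negative, then test against it.
cat > negsave.zi <<'EOF'
# Rule	NAME	FROM	TO	-	IN	ON	AT	SAVE	LETTER/S
Rule	NegT	2000	max	-	Oct	lastSun	2:00	-1:00	-
Rule	NegT	2000	max	-	Mar	lastSun	1:00u	0	-
# Zone	NAME	STDOFF	RULES	FORMAT
Zone	Test/NegSave	1:00	NegT	IST/GMT
EOF

# If zic itself rejects the negative SAVE, the toolchain is too old and
# further negative-DST tests are pointless.
zic -d "$PWD/test-zoneinfo" negsave.zi || { echo 'no negative-SAVE support'; exit 1; }

# Inspect the compiled zone directly (an absolute path bypasses the
# system zoneinfo directory).
zdump -v "$PWD/test-zoneinfo/Test/NegSave" | head -n 6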

Robert Elz wrote:
test for what you need to know - exactly - never simply assume that if X is detected, then (an unrelated) Y will also exist
This is good advice. It's a primary design philosophy of Autoconf: test for what you need, not for version numbers. Although you can't always do it, it's better to do it when you can.
saving a few bytes of download time in an environment when half the world are streaming movies (or porn) and the rest are trying to download kde or openoffice just isn't an objective that matters.
Admittedly it's not a big deal, but if we're upgrading tzdata in 5 billion mobile devices ten times per year and saving 50 kB per upgrade, that's 2.5 PB of mobile data and Internet traffic saved worldwide per year. Although this is only a tiny fraction of total use (Cisco predicts 200 EB for such traffic in 2018), even a 1 part-per-million decrease would be a (tiny) win. It'd be like Hogan's Heroes vs the German army; you get only small victories, but you do what you can and every bit helps. My source for traffic estimates is the last page of: https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-n...

On 2018-07-20 12:35, Paul Eggert wrote:
Robert Elz wrote:
test for what you need to know - exactly - never simply assume that if X is detected, then (an unrelated) Y will also exist This is good advice. It's a primary design philosophy of Autoconf: test for what you need, not for version numbers. Although you can't always do it, it's better to do it when you can.
Expecting distributors to change their packaging scripts, except when you make a change which breaks the process, won't happen: some distros package only the .tab files and the zic binaries, some include the changelog, Debian etc. still include leap-seconds.list, others include leapseconds, some include tzdata.zi, some include various selections of uppercase named and .html docs, none include version.

Only zic knows what build selections are made, and assuming that has not been customized, it could generate say a VERSION file on mass rebuilds. Of course that does not handle when distros revert name or abbreviation changes to maintain consistent backwards compatibility, or cherry pick zones to package in their own manner; and each DB vendor has their own unique approaches.

If their customers cared enough, they'd complain more, and be unsatisfied with workarounds of picking some other zone with the desired offset - oh - and don't forget to change back in a few months: missing the whole point of the project!

Perhaps we need a bug reporting campaign to distros, to complain about lack of project docs, other distributed files, and detailed build version info.
saving a few bytes of download time in an environment when half the world are streaming movies (or porn) and the rest are trying to download kde or openoffice just isn't an objective that matters.

Admittedly it's not a big deal, but if we're upgrading tzdata in 5 billion mobile devices ten times per year and saving 50 kB per upgrade, that's 2.5 PB of mobile data and Internet traffic saved worldwide per year. Although this is only a tiny fraction of total use (Cisco predicts 200 EB for such traffic in 2018), even a 1 part-per-million decrease would be a (tiny) win. It'd be like Hogan's Heroes vs the German army; you get only small victories, but you do what you can and every bit helps. My source for traffic estimates is the last page of: https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-n...
I certainly expect Android/iOS/BB and downstream mobile vendors to distribute only changed binaries (or just deltas) of a few KB, rather than the whole 1.5 MB of unique files each time, saving a lot more traffic per upgrade. Then again, given that the vendors are telcos, that expectation may be naive - why would they bother? Are mobile updates free anywhere?

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

Brian Inglis <Brian.Inglis@SystematicSw.ab.ca> writes:
Expecting distributors to change their packaging scripts, except when you make a change which breaks the process, won't happen: some distros package only the .tab files and the zic binaries, some include the changelog, Debian etc. still include leap-seconds.list, others include leapseconds, some include tzdata.zi, some include various selections of uppercase named and .html docs, none include version.
This seems unduly pessimistic to me. If zic writes an additional file into the target directory, that would probably automatically get included by most packagers --- it'd look no different from a new zone file. And I doubt many packagers are left who haven't set things up to just automatically include new zone files. As an ex-packager (not of tzdb though) I can tell you there are lots of better things to spend time on than micro-managing your upstream's file manifest. Especially for an upstream such as tzdb, where changes in the file list are everyday occurrences. regards, tom lane

On Sat, 21 Jul 2018, Tom Lane wrote:
Brian Inglis <Brian.Inglis@SystematicSw.ab.ca> writes:
Expecting distributors to change their packaging scripts, except when you make a change which breaks the process, won't happen: some distros package only the .tab files and the zic binaries, some include the changelog, Debian etc. still include leap-seconds.list, others include leapseconds, some include tzdata.zi, some include various selections of uppercase named and .html docs, none include version.
This seems unduly pessimistic to me. If zic writes an additional file into the target directory, that would probably automatically get included by most packagers --- it'd look no different from a new zone file. And I doubt many packagers are left who haven't set things up to just automatically include new zone files. ...
NetBSD would not automatically install an additional file. The build process is very explicit about which files get copied to the build's destination directory; it is also very selective in what files it includes in the .tgz "distribution set" files. Manual intervention is needed to update these steps.

+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+

Paul Goyette <paul@whooppee.com> writes:
On Sat, 21 Jul 2018, Tom Lane wrote:
This seems unduly pessimistic to me. If zic writes an additional file into the target directory, that would probably automatically get included by most packagers --- it'd look no different from a new zone file. And I doubt many packagers are left who haven't set things up to just automatically include new zone files. ...
NetBSD would not automatically install an additional file. The build process is very explicit about which files get copied to the build's destination directory; it is also very selective in what files it includes in the .tgz "distribution set" files. Manual intervention is needed to update these steps.
And your point is? Should I conclude that NetBSD users see a seriously lobotomized set of time zones, or does the packager just automatically update the distribution list every time a new file shows up there? In my former life as a packager for Red Hat, you'd list a specific set of files you expected your package to install into, say, /usr/bin -- either an unexpected addition or unexpected removal there should set off alarm bells. But the file list for tzdb's database would almost certainly just be "/usr/share/zoneinfo/*"; there is no value in expending packager brain cells on second-guessing the upstream there. Maybe NetBSD's packaging toolchain is too impoverished to do that, but I'd certainly call that a bug not a feature. regards, tom lane

On Sat, 21 Jul 2018, Tom Lane wrote:
NetBSD would not automatically install an additional file. The build process is very explicit about which files get copied to the build's destination directory; it is also very selective in what files it includes in the .tgz "distribution set" files. Manual intervention is needed to update these steps.
And your point is? Should I conclude that NetBSD users see a seriously lobotomized set of time zones, or does the packager just automatically update the distribution list every time a new file shows up there?
Whenever the contents of the destination don't match what is expected, manual intervention is required to evaluate and make whatever changes are needed.
In my former life as a packager for Red Hat, you'd list a specific set of files you expected your package to install into, say, /usr/bin -- either an unexpected addition or unexpected removal there should set off alarm bells. But the file list for tzdb's database would almost certainly just be "/usr/share/zoneinfo/*"; there is no value in expending packager brain cells on second-guessing the upstream there. Maybe NetBSD's packaging toolchain is too impoverished to do that, but I'd certainly call that a bug not a feature.
Perhaps. Call it what you want, but that is reality. Not worth further debate. The info was offered only as a counterexample to your doubt of the existence of packagers who don't just include whatever shows up.

+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+

Date: Sat, 21 Jul 2018 02:08:20 -0400
From: Tom Lane <tgl@sss.pgh.pa.us>
Message-ID: <8508.1532153300@sss.pgh.pa.us>

| And your point is? Should I conclude that NetBSD users see a seriously
| lobotomized set of time zones,

No.

| or does the packager just automatically
| update the distribution list every time a new file shows up there?

Not that either. I am the "packager" (currently) and new files only appear when they have some value. A new zone with new time translation data has that value. Other random files (perhaps) do not.

Further, we do not use the upstream (ie: Paul's) Makefiles at all. (Not for tzdata, and not for tzcode - the latter of which requires much more manual integration (not by me) as we do not ship the upstream code unaltered.) Paul's Makefiles might not even work with NetBSD's make (I haven't tried them to find out).

| In my former life as a packager for Red Hat, you'd list a specific set of
| files you expected your package to install into, say, /usr/bin -- either
| an unexpected addition or unexpected removal there should set off alarm
| bells.

Same here.

| But the file list for tzdb's database would almost certainly just
| be "/usr/share/zoneinfo/*";

We have no such concept. Every file is listed individually - which is needed, if for no other reason, than that we know when something is removed, so that when a user upgrades (from however ancient a previous version) we automatically remove the no-longer-needed file if it has not been changed at the user's site, and warn the local management if they have changed files which are no longer required (if any of those were found). If all we expected was '*' then we'd have no way of knowing that some file should no longer be there - and while time zones themselves are very unlikely to ever be removed, or renamed (without being replaced by something), these extra files that you're anticipating being automatically added have no such requirements.

With this, do remember that with NetBSD many (it was once "most", possibly not any more, as sad as that is) users upgrade from source, not from binary distributions. That is, the "build from source" system needs to handle any random crap it might find lying around the destination (which cannot be assumed to be empty, or just upgrading what has changed would not really work...)

| there is no value in expending packager brain
| cells on second-guessing the upstream there.

The few times a year it happens, and with knowing from the list what is expected to be changing, and what is not, the intellectual demands are not something I worry about.

| Maybe NetBSD's packaging toolchain is too impoverished to do that,
| but I'd certainly call that a bug not a feature.

You are welcome to your opinion. I do not share it.

kre

Date: Fri, 20 Jul 2018 11:35:59 -0700
From: Paul Eggert <eggert@cs.ucla.edu>
Message-ID: <84d13601-c08a-ede3-bf8a-22f290c8ae43@cs.ucla.edu>

| Admittedly it's not a big deal, but if we're upgrading tzdata in 5 billion
| mobile devices ten times per year and saving 50 kB per upgrade, [...]

Hopefully I did not say, or imply, that the compressed file wasn't useful. All I think I said, or all I intended to say, was that we do not use it in NetBSD, and have no current plans to do so. You don't need to defend it.

kre

On Jul 20, 2018, at 10:58, Paul Eggert <eggert@cs.ucla.edu> wrote:
Plus, distributors need to include something better than "# version unknown", which is what Red Hat is doing unfortunately. See:
https://bugzilla.redhat.com/show_bug.cgi?id=1604030

On any RedHat-ish system, one can get the TZ version very easily by doing:
*** snip snip ***
[fredg@elastigirl ~]$ rpm -q tzdata
tzdata-2018e-3.el7.noarch
*** snip snip ***

Or the equivalent programmatic call to the RPM database. This, of course, is horrendously platform-specific, but that illustrates the larger point: TZDB has historically been integrated in all sorts of ways on all sorts of platforms; to expect a new 'one true way to get the version' to get any kind of traction at this point is rather utopian.

Cheers!

|----------------------------------------------------------------------|
| Frederick F. Gleason, Jr. |             Chief Developer              |
|                           |             Paravel Systems              |
|----------------------------------------------------------------------|
| I looked out my window, and saw Kyle Petty's car upside down, then I |
| thought "one of us is in real trouble".                              |
|                  -- Davey Allison, on a 150 mph crash                |
|----------------------------------------------------------------------|
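For the scripted route, a sketch using the standard package-manager query formats (the package name "tzdata" is the common convention, but it varies by platform):

# Just the upstream version, e.g. "2018e", on an RPM-based system:
rpm -q --queryformat '%{VERSION}\n' tzdata

# Rough Debian/Ubuntu equivalent, trimming the packaging suffix
# from strings like "2017c-0ubuntu0.16.04":
dpkg-query -W -f='${Version}\n' tzdata | sed 's/-.*//'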

On 07/20/2018 11:34 AM, Fred Gleason wrote:
This, of course, is horrendously platform-specific, but that illustrates the larger point: TZDB has historically been integrated in all sorts of ways on all sorts of platforms; to expect a new ‘one true way to get the version’ to get any kind of traction at this point is rather utopian.
I don't think it will necessarily be *easy* to get this universally accepted, but I suspect there are a bunch of platform-specific ways to do this because there has, historically, never been a correct way to do it in the upstream project. If `make install` installs a version file, then only people whitelisting the deployed files or blacklisting that file will be missing it. Also, deploying `tzdata.zi` doesn't hurt. It's a very small file and nothing forces you to abandon your old platform-specific versioning. I suspect that as long as `make install` continues to install `tzdata.zi` by default, as more projects adopt this there will be more demand for platforms to ship that file and eventually it will be widely available.
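For what it's worth, once tzdata.zi is deployed the version comment is trivial to read back; a sketch, assuming the conventional install location:

# tzdata.zi begins with a comment line like "# version 2018e"
sed -n 's/^# version //p' /usr/share/zoneinfo/tzdata.zi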

OK I am back in the office now with a decent email client ;)

On 20/07/18 00:40, Paul Eggert wrote:
Robert Elz wrote:
I have more than doubts, I am convinced that the entire thing is misguided, and that (with the one exception of curiosity value, the ability of an application to tell people which version data it is using) there is no good use of this data
Yes, that's the only good use I can think of as well. That is, the version string is more for diagnosing obsolete or misconfigured systems than for use in ordinary computation. That being said, configuration is of growing importance in software engineering, and I hope that the version comment in tzdata.zi is enough to satisfy needs in this area.
The simple question that the provision of a version number allows one to answer is "what version of tz data was used to create the material being published". Since we can't rely on just which version a client machine IS using, there has to be some way to identify just what WAS used, even if that requires the site ALSO providing its own copy of the tz rules used!

Calendars for international meetings are produced perhaps two or three years in advance, so it is quite possible that in the intervening time some element of normalised data will change, and at the very least the organisers need to be aware of the problems created. THEY at least need to be able to 'un-normalise' using the original rules and then establish where cross-timezone events have now been 'broken'. Assuming the diary will still work with the data the OS is providing simply does not work.
With TZDIST the version info is best done via ETags, and this should work with TZif files so that clients can easily see whether they're up-to-date and get a new version if not. See Internet RFC 7808 section 4.1.4 along with:
ETags ONLY work for the person who originally normalised the data! Other users need to know what version of tz data to ask for to go with the diary they are reading, and that may not be the 'current' tz data; so if tzdist is to be of any use, it needs to provide the requested rule set - something that ETags does nothing to support. In practice the publisher of the data may not even be using TZ at all, so in addition to the version, the creator of the diary would probably have to publish the source of the tzdist feed they are using ...

Note ... none of the above even touches on pre-1970 variations in the raw data, but fixing the problem for current data will allow that data to be viewed in 30 or 50 years' time without having to worry about changes in the meantime.

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

One thing to note is that there's some possibility that distributions will be reluctant to deploy the (large) tzdata.zi file /just/ to have an official place where the version is deployed. I would really like it if a separate version.zi file were deployed alongside tzdata.zi that /just/ contains the version information.

One option is to have version.zi be a little richer than an empty file with a version string; it could be a key-value store like:

base_data_version: 2018c
tzcode_version: 2018c

Presumably a key for whether the files were generated from the rearguard file would also be useful.
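A consumer of such a file would then be a one-liner; a sketch against the proposed (hypothetical) key-value format:

# Extract one key from the proposed version.zi key-value file
awk -F': ' '$1 == "base_data_version" { print $2 }' version.zi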

Lester Caine wrote:
The simple question that the provision of a version number allows one to answer is "what version of tz data was used to create the material being published".
Unfortunately that's not how things work. In practice the version number often does not provide that information; it merely provides a string that is useless or (worse) wrong. An example of this is the version string in the latest version of Fedora, which is merely "unknown". In practice there is no guarantee that a version string will be mappable to the data that it represents. There's not even a guarantee that differing data will have differing version strings.
there has to be some way to identify just what WAS used
In practice the version string does not suffice for that. You can use 'zdump -i', though.
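For illustration, one way to compare installations by what the data actually says, rather than by label (a sketch; 'zdump -i' ships with recent tzcode):

# Fingerprint a zone's actual transitions; differing data gives a
# differing digest, whatever version string the packager shipped.
zdump -i -c 1970,2038 Europe/Dublin | sha256sum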
ETags ONLY work for the person who originally normalised the data!
Sure, just as version strings are distributor-specific. ETags are no worse than version strings in this respect.

On 25/07/18 23:07, Paul Eggert wrote:
Lester Caine wrote:
The simple question that the provision of a version number allows one to answer is "what version of tz data was used to create the material being published".
Unfortunately that's not how things work. In practice the version number often does not provide that information; it merely provides a string that is useless or (worse) wrong. An example of this is the version string in the latest version of Fedora, which is merely "unknown". In practice there is no guarantee that a version string will be mappable to the data that it represents. There's not even a guarantee that differing data will have differing version strings.
there has to be some way to identify just what WAS used
In practice the version string does not suffice for that. You can use 'zdump -i', though.
ETags ONLY work for the person who originally normalised the data!
Sure, just as version strings are distributor-specific. ETags are no worse than version strings in this respect.
But isn't this the whole crux of the problem? We NEED some way to identify whether the data we are currently working with is current. Even a 'recent' RFC like tzdist manages to miss the whole point: the way things NEED to work with this sort of changing data is to have SOME WAY to identify that one is working with the same progression of data changes that the data publisher has been using. Currently there is no way to ensure that everybody IS seeing the same information :(

I am thinking that the only safe way to manage ANY current cross-timezone diary of events is to also include the raw TZ data that was used to create it! This is the only way one can ensure that the SAME normalizations are displayed. tzdist SHOULD provide the same sequence of data changes to everybody who is accessing the same publishing source, but as yet we do not have any published sources of tzdist.

The fact that the system IS currently broken is no excuse, and "Unfortunately that's not how things work" has no place in the discussion. We need to be providing a fix of some sort, even if it is only "data unavailable for that period".

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

On 07/26/2018 03:25 AM, Lester Caine wrote:
We need to be providing a fix of some sort
The need for a fix within TZif itself is not great, and the "fixes" proposed so far have disadvantages of their own. As others have pointed out, this problem is better addressed as metadata, outside the scope of TZif itself.
tzdist SHOULD provide the same sequence of data changes to everybody who is accessing the same publishing source, but as yet we do not have any published sources of tzdist
If the TZif spec becomes standard, it might make sense for IANA to set up a tzdist server for it. Perhaps also a plain https server, as that's good enough for distributing TZif files.
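For example, over plain https a client could use the ordinary conditional-request machinery to stay current; a sketch with a hypothetical server URL and ETag value:

# Re-fetch a TZif file only if it changed since we last saw it;
# 304 means our cached copy is still current, 200 means new data.
curl -sS -o Europe-Dublin.tzif \
     -H 'If-None-Match: "abc123"' \
     -w '%{http_code}\n' \
     https://tzdist.example.org/tzif/Europe/Dublin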

We didn't add everything we wanted to in the RFC because it's a balance between getting it all or getting nothing. Our hope is that each timezone definition will have some sort of version - maybe the latest version for each of the rules. That would allow update of a single tz.

I understand that redistributors may want to add their own data or modify the data, but that just means they need to add their own version. And yes, they do need to be an increasing sequence. A simple sequence number might suffice for versioning.

And I agree - "it's broken now" is more of a reason to do something than a reason to do nothing.
participants (18)
- Aldrin Martoq Ahumada
- Bradley White
- Brian Inglis
- Fred Gleason
- Howard Hinnant
- John Haxby
- Lester Caine
- lester@lsces.co.uk
- Martin Burnicki
- Michael Douglass
- Paul Eggert
- Paul G
- Paul Ganssle
- Paul Goyette
- Philip Newton
- Robert Elz
- Steffen Nurpmeso
- Tom Lane