Dear Paul,

thanks for the elaboration - further comments inline.

On 19.02.19 17:34, Paul Eggert wrote:
Hans-Joerg Happel wrote:
So any information on the rationale for the above-quoted rule is appreciated.
When I introduced locations as tzdb identifiers long ago, I naively put in one Zone or Link per country. Some users got accustomed to the idea that Zone and/or Link status was akin to political recognition of countries or their boundaries (is Kosovo a country? is Jerusalem part of Palestine? that sort of thing) and so I eventually realized my mistake: the identifiers are now supposed to be single locations, decoupled from countries or national regions.
At one point I attempted to simplify the database by coalescing regions that crossed country boundaries, since there's no longer any need to conflate the two. Unfortunately this flustered some users who thought this meant disrespect to particular countries (which was not the intent). After some back and forth we ended up with the current wording, which I'm not happy with, as the current rules are still too political and we still end up with "There should be an entry for Hanoi/Beijing/etc.!" discussions.
If this continues to be a sore spot, I'm inclined to adjust the rules so that there need not be at least one Link or Zone per country. This should help simplify future maintenance if, say, the US splits into multiple countries (and stuff like that does happen).
I'd see two different kinds of "maintenance" here:

* Maintenance of tzdb (which you are referring to)
* Maintenance of systems/data by people using tzdb (directly or indirectly)

So in your example, if, say, Washington (the federal state) were to become a separate country, you argue for keeping maintenance of tzdb low by sticking to "America/Los_Angeles", whereas one could argue for a link from "America/Seattle" (or the like) according to the current ISO 3166-1 rule in [1]. On the other hand, if Washington users could make use of such an "America/Seattle" alias, that could reduce maintenance effort in the event of the new country of Washington deciding to deviate from "America/Los_Angeles" rules (which, as you say, is stuff that will happen one way or the other).

To be clear: I perfectly understand that the first kind of "maintenance" is the main concern of the tzdb community. However, I am trying to make the point that tzdb design decisions may have a further kind of impact (the second aspect of maintenance), and I wonder to what degree these concerns are shared.

Another point I made was that in my experience some applications "misuse" tzdb identifiers as actual time zone names shown to users. While I agree it would be better to solve this on a more general level such as geo-location<=>timezone mappings (and I know efforts in that direction exist), I'd still argue that one won't get around that "misuse" practice anytime soon. It's a matter of philosophy whether that counts as an argument for the ISO 3166-1 rule case.

Disclosure: I am involved in CalConnect's efforts to advise governments and practitioners on how to tackle time zone issues. Therefore I am trying to get a better understanding of how things work and appreciate your feedback, even if it might go slightly beyond the scope of pure tzdb maintenance.
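To make the alias argument concrete, here is a minimal sketch of how stored identifiers would behave before and after such a split. Everything in it is an assumption for illustration: "America/Seattle" does not exist in tzdb, the dictionaries only mimic the relationship between Zone and Link entries, and `canonical` is a hypothetical helper, not a tzdb or library function.

```python
# Hypothetical snapshot before the split: "America/Seattle" is only a Link
# (alias) that points at the Zone "America/Los_Angeles".
zones_before = {"America/Los_Angeles"}
links_before = {"America/Seattle": "America/Los_Angeles"}

# Hypothetical snapshot after the split: the former alias has been promoted
# to a Zone with its own rules, so it is no longer listed as a Link.
zones_after = {"America/Los_Angeles", "America/Seattle"}
links_after = {}


def canonical(tzid, zones, links):
    """Return the Zone whose rules apply to the given identifier."""
    if tzid in zones:
        return tzid
    if tzid in links:
        return links[tzid]
    raise KeyError(f"unknown identifier: {tzid}")


# Two Washington-based systems that persisted different identifiers.
stored = {
    "office_a": "America/Seattle",      # stored the per-country alias
    "office_b": "America/Los_Angeles",  # stored the underlying Zone
}

for name, tzid in stored.items():
    before = canonical(tzid, zones_before, links_before)
    after = canonical(tzid, zones_after, links_after)
    print(f"{name}: stored={tzid} before={before} after={after}")
```

In this sketch both offices resolve to "America/Los_Angeles" before the split. Afterwards, office_a follows the new "America/Seattle" rules without any change to its stored data, while office_b keeps following "America/Los_Angeles" until someone migrates its records by hand.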
I see there are recommendations for implementers regarding localization (e.g., reference to Unicode CLDR), but I could not find recommendations concerning the usage of tzdb names/ids/links/aliases - which, in my understanding, will be the String persisted by most applications in the end?
Although the "Names of timezones" section in <https://data.iana.org/time-zones/theory.html> talks about this, it sounds like you think something is missing. I'm not sure what sort of recommendations you're thinking about, though. Perhaps you could suggest a specific wording change to that section?
I'd perhaps first suggest talking about "Identifiers of timezones" instead of "Names of timezones". The distinction may sound picky, but I have the impression that some discussions (such as the "Hanoi" one) circle around the misunderstanding of "Asia/Bangkok" being some sort of semi-official "name", while it is supposed to be (as you pointed out above) just an identifier for a single location, decoupled from countries.

A second suggestion would be a short note that it is these identifiers which would typically be stored in a database / system configuration / appointment in order to denote the time zone of a "thing", and that software will typically pick up the additional rule data in tzdb in order to compute meaningful things based on these identifiers. This is sort of implicit in the given text, but might be pointed out more explicitly.

Note that my perspective here is rather from an end user / developer point of view, and I acknowledge that this is not the main audience of tzdb content. However, I'd argue that end users / developers might find some clarification on this useful when considering their use of identifiers which ultimately stem from and are maintained by tzdb.

I don't know the change policy of theory.html, but if there's some agreement that I have a point here, I'd be happy to propose more specific wording changes.
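As an illustration of that second suggestion, here is a minimal sketch of an application persisting only the identifier string and letting a library look up the tzdb rule data when a local time is needed. It assumes Python 3.9+ with its standard `zoneinfo` module and tz data available on the system; the record layout and the `local_start` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# What an application would typically persist: an instant in UTC plus the
# tzdb identifier as an opaque string (hypothetical record layout).
appointment = {
    "title": "Office meeting in Hanoi",
    "start_utc": datetime(2019, 2, 19, 9, 0, tzinfo=timezone.utc),
    "tzid": "Asia/Bangkok",
}


def local_start(record):
    """Convert the stored UTC instant to local time via tzdb rule data."""
    zone = ZoneInfo(record["tzid"])  # rule lookup happens here, by identifier
    return record["start_utc"].astimezone(zone)


start = local_start(appointment)
print(start.isoformat())
print(start.utcoffset())
```

The stored string is never shown to a user and carries no rules itself; it is only the key under which the rule data is found, which is why it behaves like an identifier rather than a name.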
setting "Vietnamese" data or systems to an "Asia/Bangkok" tzdb identifier would, to some degree, effectively tie that data to Thai legislation.
No, it ties that data to our best guess of what northern Vietnam etc. will do. It does not tie the data to any legislation, just as setting "Moroccan" data or systems to "Africa/Casablanca" ties that data to our best guess of what Morocco will do, regardless of existing legislation. In both cases our guesses may well be wrong and if so users will need to deal with the wrong guesses, which is just part of life when predicting the future. If we started letting political considerations affect how we record our guesses, that would lead to more controversy and more database churn and I'd rather not go that direction.
My assumption (also in the US/Washington case above) is that it is (in most cases) countries which make decisions about time zone rules. Therefore, both Vietnam and Thailand seem to be free to change time zone rules at any time.

Assume two databases, one with "Vietnamese data" on an office server in Hanoi and one with "Thai data" on an office server in Bangkok; both databases would currently use "Asia/Bangkok" as an identifier. If the Thai government decides to change time zone rules for Thailand (and Vietnam does not follow), then, if I am not wrong, this would most probably require tzdb to introduce an "Asia/Hanoi" identifier to accommodate the new situation. However, people maintaining the server with "Vietnamese data" might then need to convert that data from using "Asia/Bangkok" to "Asia/Hanoi" (my example may sound a little simple or arbitrary, but I also want to understand whether you share the idea that this may be a point).

So given that example, I'd argue that a Thai government could indeed influence Vietnamese data. Coming back to the "Hanoi" thread, I also think this might be a valid concern of Vietnamese people when being "forced" to use an "Asia/Bangkok" identifier.

However, I also see your point about additional controversy. On the other hand, I think the existing technique of "Links" seems to be a relatively simple way of resolving some of the points I made above, and that the current "Asia/Phnom_Penh" has its value with respect to this.

Best,
Hans-Joerg

[1] https://data.iana.org/time-zones/theory.html#naming