[Forwarded with permission.]

-------- Forwarded Message --------
Subject: TZDB Questions
Date: Mon, 27 Aug 2018 17:52:53 -0700
From: Mikey Schott <maschott@gmail.com>
To: eggert@cs.ucla.edu

Hello, Dr. Eggert.

First let me express my thanks to you for all of the time and energy you have put into the upkeep of the time zone database. I appreciate all of the thought and detail that goes into the decisions regarding each zone.

I've been working extensively with the time zone database over the past month to identify a list of currently unique time zones for use in a front-end web application. I'd be happy to share the details of my work if you are curious, but since its purpose is tangential to that of the time zone database, I'll leave them out of this email.

The reason I'm writing is because after working so closely with the data, I had a few questions I was hoping you could answer for me. I have more or less completed the work I was doing, so these questions come more from curiosity than necessity.

1. Has there been any consideration to move the source data into a more structured format like csv, xml, or json? I realize I'm probably in the minority here in terms of people who would find this useful.

2. The latest zone entry for some zones that no longer use dst still point to dst rules, e.g. Asia/Tokyo and the Japan rule set even though dst is no longer used in Japan. Is there a reason why some non-dst zones use this method where they point to an outdated rule set whereas most non-dst zones use no rule set (i.e. rule = '-')?

3. I've noticed that some zones just use a shortened version of their offset for their format, whereas timeanddate.com gives them a more descriptive format (e.g. Europe/Volgograd uses MSK vs. +03). Is there a reason for the discrepancy? Does the time zone database error on the side of caution here?

4. I believe that one of the primary purposes of the Link zones is to ensure that there is a zone that covers every country. Is there a specific list or source that the time zone database uses to decide the list of countries that are covered? I noticed that Bouvet Island (BV) and Heard and McDonald Islands (HM) are not included despite having ISO-3166-2 country codes. (Although both have a population of 0, so it hardly qualifies as an oversight.)

5. Do you have a favorite time zone? I've become quite fond of Antarctica/Troll both for its name and its unique offset.

Thanks again for all of your work on the database. I've signed up for the mailing list so I can keep up to date.

-Mikey
From: Mikey Schott
1. Has there been any consideration to move the source data into a more structured format like csv, xml, or json?
There are translators into various XML and JSON flavors; see <https://data.iana.org/time-zones/theory.html> for a few pointers. This is the first I've heard CSV suggested. Part of the fun of maintaining tzdb is that the source data format is simple and easily readable. I have my doubts whether hand-maintaining the source in these other formats would be worth the trouble.
2. The latest zone entry for some zones that no longer use dst still point to dst rules, e.g. Asia/Tokyo and the Japan rule set even though dst is no longer used in Japan . Is there a reason why some non-dst zones use this method where they point to an outdated rule set whereas most non-dst zones use no rule set (i.e. rule = '-')?
To some extent it's an accident. To some extent it attempts to use a ruleset that is likely to see the smallest change if the most-likely political events occur in the future. I don't try all that hard to normalize the source; the goal is ease of understanding and stability more than strict consistency.
3. I've noticed that some zones just use a shortened version of their offset for their format, whereas timeanddate.com gives them a more descriptive format (e.g. Europe/Volgograd uses MSK vs. +03). Is there a reason for the discrepancy? Does the time zone database error on the side of caution here?
Yes. Previously we erred on the side of incaution and saw more arguments over what the abbreviations should be. The abbreviations were kind of fanciful, and not that useful anyway (they're ambiguous). This is described in more detail in <https://data.iana.org/time-zones/theory.html#abbreviations>.
4. I believe that one of the primary purposes of the Link zones is to ensure that there is a zone that covers every country.
That has been the case, yes. In hindsight this was a mistake as it needlessly complicates tzdb and is more likely to lead to political bickering.
Is there a specific list or source that the time zone database uses to decide the list of countries that are covered?
More politics?! The list we use is in the file iso3166.tab. It is not authoritative nor, as the lawyers might say, is it intended to take or endorse any position on legal or territorial claims.
5. Do you have a favorite time zone?
My current favorite is America/Montevideo. "Apparently restaurateurs complained that DST caused people to go to the beach instead of out to dinner." Imagine, a government that listens to its country's cooks!
On Aug 27, 2018, at 8:46 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
1. Has there been any consideration to move the source data into a more structured format like csv, xml, or json?
Source code and data are meant to be edited and read by humans, so a less-structured but more readable format, as long as it *can* be parsed by software, is preferable.
I realize I'm probably in the minority here in terms of people who would find this useful.
There's "make the source file CSV/XML/JSON files" and there's "make CSV/XML/JSON files containing the time zone data available"; the latter can be done without doing the former, just as "make binary files containing the time zone data available" was done without the former when the project began - a program like zic, or zic itself, could read the source files and emit CSV/XML/JSON files, once an appropriate schema is devised. I suspect the latter would be as useful as the former.
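The "read the source files and emit JSON" idea can be sketched in a few lines. What follows is a minimal illustration, not zic and not any existing translator: the function names, the JSON layout, and the sample text are all assumptions made up for this example. The sample lines are written in the style of the tzdb source and are not an exact excerpt of any release. The parser assumes whitespace-separated fields, `#` comments, and a TYPE column in Rule lines.

```python
import json

# Illustrative input in the style of the tzdb source (hypothetical excerpt).
SAMPLE = """\
# Illustrative lines in the style of the tzdb source; not an exact excerpt.
Rule Japan 1948 only - May Sat>=1 24:00 1:00 D
Rule Japan 1948 1951 - Sep Sat>=8 25:00 0 S
Zone Asia/Tokyo 9:18:59 - LMT 1887 Dec 31 15:00u
            9:00 Japan J%sT
Link Asia/Tokyo Japan
"""


def zone_line(fields):
    """Turn the STDOFF RULES FORMAT [UNTIL] fields of a zone line into a dict."""
    return {
        "stdoff": fields[0],
        "rules": fields[1],
        "format": fields[2],
        "until": " ".join(fields[3:]) or None,  # None means "still in effect"
    }


def parse_tzdb(text):
    """Parse Rule, Zone, and Link lines into plain dicts and lists."""
    data = {"rules": [], "zones": {}, "links": []}
    zone = None  # name of the zone whose continuation lines we are reading
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # assumption: '#' always starts a comment
        if not line.strip():
            continue
        fields = line.split()
        if raw[0].isspace() and zone is not None:
            # Continuation line: belongs to the most recent Zone line.
            data["zones"][zone].append(zone_line(fields))
        elif fields[0] == "Zone":
            zone = fields[1]
            data["zones"][zone] = [zone_line(fields[2:])]
        elif fields[0] == "Rule":
            zone = None
            keys = ("name", "from", "to", "type", "in", "on", "at", "save", "letter")
            data["rules"].append(dict(zip(keys, fields[1:])))
        elif fields[0] == "Link":
            zone = None
            data["links"].append({"target": fields[1], "name": fields[2]})
    return data


if __name__ == "__main__":
    print(json.dumps(parse_tzdb(SAMPLE), indent=2))
```

The source files stay hand-editable, and the structured output is generated from them, which is the second of the two options described above.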
2. The latest zone entry for some zones that no longer use dst still point to dst rules, e.g. Asia/Tokyo and the Japan rule set even though dst is no longer used in Japan . Is there a reason why some non-dst zones use this method where they point to an outdated rule set whereas most non-dst zones use no rule set (i.e. rule = '-')?
Do any tzdb regions that used to observe DST but no longer do so implement that by adding an additional zone line, with no rule set, rather than by having the last zone line point to a rule set where the last rule ends when DST observation didn't end?
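One way to look into that question is to scan the source for zones whose final zone line names a rule set in which no rule is open-ended (TO = "max"). The sketch below is a rough heuristic under stated assumptions, not a description of how tzdb is maintained: the function name and the sample text are hypothetical, the sample lines are illustrative and not an exact excerpt, and the check assumes "max" is spelled out in full.

```python
# Illustrative input in the style of the tzdb source (hypothetical excerpt).
SAMPLE = """\
Rule Japan 1948 1951 - Sep Sat>=8 25:00 0 S
Rule US 2007 max - Mar Sun>=8 2:00 1:00 D
Rule US 2007 max - Nov Sun>=1 2:00 0 S
Zone Asia/Tokyo 9:18:59 - LMT 1887 Dec 31 15:00u
            9:00 Japan J%sT
Zone America/New_York -5:00 US E%sT
Zone Asia/Kolkata 5:30 - IST
"""


def zones_with_finished_rules(text):
    """Return {zone: rule set} for zones whose last line names a rule set
    that contains no open-ended ('max') rule."""
    known = set()       # every rule set name seen in a Rule line
    open_ended = set()  # rule sets with at least one rule whose TO is 'max'
    last_rules = {}     # zone name -> RULES field of its final zone line
    zone = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()
        if not line.strip():
            continue
        fields = line.split()
        if fields[0] == "Rule":
            zone = None
            known.add(fields[1])
            if fields[3] == "max":  # fields[3] is the TO column
                open_ended.add(fields[1])
        elif fields[0] == "Zone":
            zone = fields[1]
            last_rules[zone] = fields[3]
        elif raw[0].isspace() and zone is not None:
            last_rules[zone] = fields[1]  # continuation line overrides earlier ones
        else:
            zone = None
    # A RULES field of '-' or a fixed amount is not a rule set name, so it is skipped.
    return {z: r for z, r in last_rules.items() if r in known and r not in open_ended}


if __name__ == "__main__":
    print(zones_with_finished_rules(SAMPLE))
```

Running the same scan over the real data files, and then listing the zones whose final line has "-" in the RULES column after an earlier line with a rule set, would give both sides of the comparison.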
4. I believe that one of the primary purposes of the Link zones is to ensure that there is a zone that covers every country. Is there a specific list or source that the time zone database uses to decide the list of countries that are covered? I noticed that Bouvet Island (BV) and Heard and McDonald Islands (HM) are not included despite having ISO-3166-2 country codes. (Although both have a population of 0, so it hardly qualifies as an oversight.)
We may need to clarify that the first of these two items in the theory.html file takes precedence over the second, unless the use of "should" rather than "must" in the second is sufficient:

• Uninhabited regions like the North Pole and Bouvet Island do not need locations, since local time is not defined there.
• There should typically be at least one name for each ISO 3166-1 officially assigned two-letter code for an inhabited country or territory.
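Checking which codes have no zone is a simple set difference between the codes in iso3166.tab and the codes in the first column of a zone table. The sketch below assumes tab-separated files with `#` comment lines and a first column that may hold a comma-separated list of codes. The function names are hypothetical, and the sample lines are illustrative stand-ins for the real files, not exact excerpts.

```python
# Illustrative stand-ins for iso3166.tab and a zone table (hypothetical excerpts).
ISO3166_SAMPLE = """\
#country-code\tname
BV\tBouvet Island
NO\tNorway
UY\tUruguay
"""

ZONE_TAB_SAMPLE = """\
#country-code\tcoordinates\tTZ
NO\t+5955+01045\tEurope/Oslo
UY\t-345433-0561245\tAmerica/Montevideo
"""


def data_lines(text):
    """Yield non-empty lines that are not comments."""
    for line in text.splitlines():
        if line and not line.startswith("#"):
            yield line


def listed_codes(iso_text):
    """Codes listed in an iso3166.tab-style file (first tab-separated column)."""
    return {line.split("\t")[0] for line in data_lines(iso_text)}


def covered_codes(zone_text):
    """Codes that appear in the first column of a zone table.
    The column may hold several comma-separated codes."""
    codes = set()
    for line in data_lines(zone_text):
        codes.update(line.split("\t")[0].split(","))
    return codes


def uncovered(iso_text, zone_text):
    """Codes present in the country list but absent from the zone table."""
    return sorted(listed_codes(iso_text) - covered_codes(zone_text))


if __name__ == "__main__":
    print(uncovered(ISO3166_SAMPLE, ZONE_TAB_SAMPLE))
```

Whether an uncovered code is a gap or an intended omission is then a question for the two guidelines quoted above, not for the script.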
5. Do you have a favorite time zone? I've become quite fond of Antarctica/Troll both for its name and its unique offset.
US/Pacific-New, until we stopped waiting for it to happen and just linked it to America/Los_Angeles. :-)
On 2018-08-28 05:46:38 (+0200), Paul Eggert wrote:
5. Do you have a favorite time zone? I've become quite fond of Antarctica/Troll both for its name and its unique offset.
I have always liked Africa/Egypt and Africa/Morocco for their "interesting" DST rules. My favourite aspect of the tzdb as a whole is the wealth of historical trivia and thoughtful comments in the data files (like the LISP code to work out Morocco's DST rules, for instance).

Philip

-- 
Philip Paeps
Senior Reality Engineer
Ministry of Information
participants (3): Guy Harris, Paul Eggert, Philip Paeps