Proper (stable) way to list installed time zones?
As part of the implementation of IANA time zone support in the Python standard library (which is now accepted <https://www.python.org/dev/peps/pep-0615/> for inclusion in Python 3.9; many thanks to those who commented on the original proposal), I realized that a likely common feature request is to get the list of time zones installed on the system.

I have a candidate implementation <https://github.com/pganssle/zoneinfo/pull/60> for this, which basically walks each potential install directory (there's a "time zone search path" equivalent to PATH) and populates a list of every file installed there which starts with the `TZif` magic string.

The problem I'm running into is that tzcode installs `posix/` and `right/` folders, so for each entry in zoneinfo, I get three entries: "Africa/Abidjan", "posix/Africa/Abidjan" and "right/Africa/Abidjan". I'm also seeing a `posixrules` zone in the zoneinfo root.

My questions are:

1. Is there a better source for the list of installed time zones (leaving aside `tzdata.zi`, which I don't think I can count on to exist in all environments)?
2. Is there any stable and standard way to distinguish between proper zones and other things that are also zone files like posixrules and right/ and posix/?

In my own redistribution of the data, I populate a list of zones from `make zonenames`, which I note does /not/ include posix/, right/ or posixrules, but I don't think that information is included in standard distributions of zoneinfo. The closest I can find is `zone1970.tab` and `zone.tab` (which I think is not even always included), and that appears to be only a subset of the output of `make zonenames`.

Thanks,
Paul
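The directory walk described above can be sketched roughly as follows. This is a minimal illustration and not the code from the linked pull request; the function name `find_tzif_keys` and the `search_path` parameter are hypothetical.

```python
import os
from typing import Iterable, Set


def find_tzif_keys(search_path: Iterable[str]) -> Set[str]:
    """Collect the keys of all files that start with the TZif magic string.

    `search_path` is an iterable of zoneinfo root directories (hypothetical
    parameter; it plays the role of the "time zone search path").
    """
    keys: Set[str] = set()
    for root in search_path:
        if not os.path.isdir(root):
            continue  # skip search path entries that do not exist
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                full_path = os.path.join(dirpath, filename)
                try:
                    with open(full_path, "rb") as f:
                        magic = f.read(4)
                except OSError:
                    continue  # unreadable file: ignore it
                if magic == b"TZif":
                    # The key is the path relative to the root, with "/" separators.
                    rel = os.path.relpath(full_path, root)
                    keys.add(rel.replace(os.sep, "/"))
    return keys
```

On a tzcode-style install this returns the tripled entries mentioned above ("Africa/Abidjan", "posix/Africa/Abidjan", "right/Africa/Abidjan") plus "posixrules", since all of them are valid TZif files.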
On 5/13/20 11:19 AM, Paul Ganssle wrote:
The problem I'm running into is that tzcode installs `posix/` and `right/` folders, so for each entry in zoneinfo, I get three entries: "Africa/Abidjan", "posix/Africa/Abidjan" and "right/Africa/Abidjan". I'm also seeing a `posixrules` zone in the zoneinfo root.
The posix/* entries are duplicates of the main ones, and the right/* entries are with leap seconds (which aren't normally used). The posixrules is for when one uses a nonconforming TZ string like TZ='EST5EDT'.
1. Is there a better source for the list of installed time zones (leaving aside `tzdata.zi`, which I don't think I can count on to exist in all environments).
That depends on what you want the list for. If you want every value V such that TZ='V' works via reading files, you've got the right list. If you want every value V such that TZ='V' works, then you have just a subset since POSIX-style Vs are not taken from files. If you want a shorter list (which makes sense), you can omit the posix/* and right/* and posixrules entries.
2. Is there any stable and standard way to distinguish between proper zones and other things that are also zone files like posixrules and right/ and posix/?
Depends on what you mean by 'proper'. :-) 'make zonenames' should be equivalent to looking in tzdata.zi, because that's what 'make zonenames' does. And that should be equivalent to the "shorter list" mentioned above.
The closest I can find is `zone1970.tab` and `zone.tab` (which I think is not even always included), and that appears to be only a subset of the output of `make zonenames`.
Yup. The method I'd suggest is tzdata.zi if available, or the "shorter list" mentioned above, whichever's faster. This is because zone1970.tab and zone.tab don't list some of the duplicates, and some users like the duplicates. If you also want the leap-second entries, you'll need the right/* entries though.
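A rough sketch of that suggestion, preferring tzdata.zi and falling back to the "shorter list", might look like this. It assumes the compact tzdata.zi format in which zone lines start with `Z` and link lines with `L`; the function names and the `on_disk_keys` parameter are hypothetical.

```python
import os
from typing import Iterable, Set


def keys_from_tzdata_zi(lines: Iterable[str]) -> Set[str]:
    """Extract zone and link names from the lines of a tzdata.zi file.

    Assumption: zone lines look like "Z <name> ..." and link lines look
    like "L <target> <name>"; everything else is ignored.
    """
    keys: Set[str] = set()
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Z" and len(fields) >= 2:
            keys.add(fields[1])
        elif fields[0] == "L" and len(fields) >= 3:
            keys.add(fields[2])
    return keys


def drop_auxiliary(keys: Iterable[str]) -> Set[str]:
    """The "shorter list": omit posix/*, right/* and posixrules."""
    return {
        key
        for key in keys
        if key != "posixrules" and not key.startswith(("posix/", "right/"))
    }


def list_zone_keys(zoneinfo_root: str, on_disk_keys: Iterable[str]) -> Set[str]:
    """Use tzdata.zi when it is shipped, otherwise filter the on-disk keys."""
    zi_path = os.path.join(zoneinfo_root, "tzdata.zi")
    try:
        with open(zi_path, encoding="utf-8") as f:
            return keys_from_tzdata_zi(f)
    except OSError:
        # tzdata.zi is not available: fall back to the filtered file list.
        return drop_auxiliary(on_disk_keys)
```

This sketch always prefers tzdata.zi when it is readable rather than measuring which method is faster.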
On Wed, 2020-05-13 at 19:13 -0700, Paul Eggert wrote:
Yup. The method I'd suggest is tzdata.zi if available, or the "shorter list" mentioned above, whichever's faster. This is because zone1970.tab and zone.tab don't list some of the duplicates, and some users like the duplicates.
If you also want the leap-second entries, you'll need the right/* entries though.
I am thinking the best approach should be reading tzdata.zi when available, and potentially falling back to ignoring posix/* and right/* and posixrules. I am not sure about this last one though, for reasons outlined in the PR. What do you think Paul (G.)?

Filipe Laíns
This is very tough, because there are a lot of plausible subsets that someone could want, and there are not good names for them all (nor, it seems, do they all have a foolproof method of detection). I'd say one could easily want:

1. Every valid value for key that doesn't raise an error when you call ZoneInfo(key)
2. All the values found in tzdata.zi
3. The intersection of the values found in tzdata.zi and the keys that actually exist on disk
4. All the values from zone.tab and/or zone1970.tab
5. The intersection of #4 and all the keys that actually exist on disk
6. #4 or #5, but also including "UTC" (and possibly the fixed-offset zones like Etc/GMT+5 or whatever)

My inclination would be to compile a list of zones annotated with each entry's membership in each of these groups and let the end user filter as desired, but that's slightly complicated by the fact that almost all the indicators other than #1 may be manipulated by the install mechanism, and it becomes hard to distinguish between "not present in tzdata.zi" and "tzdata.zi not shipped".

That said, it sounds to me like we can probably safely say that it is unlikely that we will have to special-case any names of folders or files /other/ than posixrules, posix/ and right/ for the foreseeable future, and that if distros rename those files but still put them under the zoneinfo root, that's probably safely considered a bug in the distro.

I'm still mildly uncertain as to whether it's possible that tzdata.zi might have more zones than are actually installed (for example if a distro doesn't install anything in backward or antarctica for some reason).

With regards to this:
If you also want the leap-second entries, you'll need the right/* entries though.
Is it always the case that the right/ and posix/ trees are identical to the primary tree? If so, it's reasonable to leave right/ and posix/ out of the listings and users can know that if they want the right/* entries, they can just prepend "right/" to any given key.

Best,
Paul
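The "annotate each key with its group membership and let the user filter" idea from the numbered list above could be sketched like this. It is only an illustration: the function name, the source labels and the convention of passing `None` for a file that is not shipped are all assumptions.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def annotate_keys(
    on_disk: Iterable[str],
    tzdata_zi: Optional[Iterable[str]] = None,
    zone_tab: Optional[Iterable[str]] = None,
) -> Tuple[Dict[str, Set[str]], Set[str]]:
    """Map each key to the set of sources that mention it.

    Passing None for a source means "this file is not shipped", which is
    reported separately from "shipped, but does not list this key".
    Returns (annotated, shipped), where `shipped` names the sources that
    were actually available.
    """
    sources: Dict[str, Set[str]] = {"disk": set(on_disk)}
    if tzdata_zi is not None:
        sources["tzdata.zi"] = set(tzdata_zi)
    if zone_tab is not None:
        sources["zone.tab"] = set(zone_tab)

    annotated: Dict[str, Set[str]] = {}
    for label, keys in sources.items():
        for key in keys:
            annotated.setdefault(key, set()).add(label)
    return annotated, set(sources)


# Example: group #3 (listed in tzdata.zi and present on disk).
annotated, shipped = annotate_keys(
    on_disk=["UTC", "Africa/Abidjan", "posixrules"],
    tzdata_zi=["UTC", "Africa/Abidjan", "Africa/Bamako"],
)
group3 = {key for key, found in annotated.items() if {"disk", "tzdata.zi"} <= found}
```

Because `shipped` does not contain "zone.tab" in this example, a caller can tell that groups #4 through #6 are unavailable rather than empty.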
On 5/15/20 7:39 AM, Paul Ganssle wrote:
I'd say one could easily want:
1. Every valid value for key that doesn't raise an error when you call ZoneInfo(key)
Does ZoneInfo operate by looking for an actual file, or by setting TZ='key' and seeing whether that works? If the latter, there are many keys that are not listed anywhere because they're specified by POSIX. A simple example is TZ='MST7' which is a valid POSIX key and which is commonly used, but for which there is no file or tzdata.zi entry. I looked at PEP 615 and didn't see any mention of the issue of POSIX-specified TZ settings.
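One way to check this empirically, assuming Python 3.9 or later where the `zoneinfo` module from PEP 615 is importable, is to try constructing the key and see whether it is rejected. The helper name `is_loadable_key` is hypothetical.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def is_loadable_key(key: str) -> bool:
    """Return True if ZoneInfo(key) finds time zone data for this key."""
    try:
        ZoneInfo(key)
    except (ZoneInfoNotFoundError, ValueError, OSError):
        # Not found, malformed key, or the key names a directory.
        return False
    return True


# A POSIX-style string such as "MST7" is only loadable if a file with
# that name happens to be installed; the result depends on the system.
print(is_loadable_key("MST7"))
```

In the module as released, keys are resolved to files on the search path (or in the `tzdata` package) rather than by setting TZ, so a POSIX-only string is expected to be rejected on a typical installation.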
I'm still mildly uncertain as to whether it's possible that tzdata.zi might have more zones than are actually installed (for example if a distro doesn't install anything in backward or antarctica for some reason).
In the reference implementation, the zones are installed from tzdata.zi so the lists should be identical there.
Is it always the case that the right/ and posix/ trees are identical to the primary tree?
Yes, in the reference implementation.
On 2020-05-15 13:52, Paul Eggert wrote:
On 5/15/20 7:39 AM, Paul Ganssle wrote:
I'd say one could easily want:
1. Every valid value for key that doesn't raise an error when you call ZoneInfo(key)
Does ZoneInfo operate by looking for an actual file, or by setting TZ='key' and seeing whether that works? If the latter, there are many keys that are not listed anywhere because they're specified by POSIX. A simple example is TZ='MST7' which is a valid POSIX key and which is commonly used, but for which there is no file or tzdata.zi entry. I looked at PEP 615 and didn't see any mention of the issue of POSIX-specified TZ settings.
I'm still mildly uncertain as to whether it's possible that tzdata.zi might have more zones than are actually installed (for example if a distro doesn't install anything in backward or antarctica for some reason).
In the reference implementation, the zones are installed from tzdata.zi so the lists should be identical there.
Is it always the case that the right/ and posix/ trees are identical to the primary tree?
Yes, in the reference implementation.
For every zone in the sources, try the following from your tzdata directory:

    $ awk '
    /^Zone/ { print $2, "Zone", FILENAME }
    /^Link/ { print $3, "Link", FILENAME, $2 }
    /^[^#]/ && FILENAME ~ /zone[^.]*\.tab/ { print $3, "Tab", FILENAME, $1, $2 }
    ' africa antarctica asia australasia backward backzone etcetera \
      europe factory northamerica pacificnew solar8[7-9] systemv \
      usno1988 usno1989 usno1989a usno199[578] zone.tab zone1970.tab | \
      sort -dfuk1,1

You can drop the sort -u or the sort command to see the selected data, or drop the extra fields to just get the zone names: 586 unique entries are possible in total. If I missed some selection source or criterion please post on the list.

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
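For readers working in Python, the same selection can be approximated as below. This is a sketch and not an exact port of the awk and sort pipeline; the function names are hypothetical, and the caller is expected to supply the lines of each source file.

```python
import re
from typing import Iterable, List, Tuple

# Same file-name test as the awk script: zone.tab, zone1970.tab, ...
TAB_FILE = re.compile(r"zone[^.]*\.tab")


def names_from_source(filename: str, lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (name, kind, filename) rows for one tz source or .tab file."""
    is_tab = TAB_FILE.search(filename) is not None
    rows: List[Tuple[str, str, str]] = []
    for line in lines:
        fields = line.split()
        if not fields or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("Zone") and len(fields) >= 2:
            rows.append((fields[1], "Zone", filename))
        elif line.startswith("Link") and len(fields) >= 3:
            rows.append((fields[2], "Link", filename))
        elif is_tab and len(fields) >= 3:
            # In the .tab files the third column is the zone name.
            rows.append((fields[2], "Tab", filename))
    return rows


def unique_names(rows: Iterable[Tuple[str, str, str]]) -> List[str]:
    """Deduplicate the names and sort them case-insensitively."""
    return sorted({name for name, _kind, _filename in rows}, key=str.lower)
```

Unlike `sort -dfuk1,1`, the deduplication here is case-sensitive, so names that differ only by case would both be kept.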
participants (4): Brian Inglis, Filipe Laíns, Paul Eggert, Paul Ganssle