Re: [tz] Version in zoneinfo files?

On Oct 25, 2015, at 10:55 AM, Paul G <paul@ganssle.io> wrote:
I'm not suggesting that it be impossible to create a weird result when asking for the version, just that there's a standard place for the version to go, and preferably the standard default gives you something that you expected.
I.e., something that works most of the time is sufficient? So what exactly is the intended use of this version field?
As it is now, you're virtually guaranteed to have a lot of incompatibility because there's nothing you can point to that says, "This is the canonical way to store version information about zoneinfo files", and no standard has emerged on how to do this as is.
On 10/25/2015 01:51 PM, Guy Harris wrote:
On Oct 25, 2015, at 10:37 AM, Paul G <paul@ganssle.io> wrote:
On 10/25/2015 01:35 PM, Guy Harris wrote:
So where would zic get the version information to put into the binary files?
My understanding is that it's the VERSION= line of the Makefile in the tzdata tarballs.
I.e., you're proposing that the value of VERSION in the Makefile be passed as an argument to zic, and zic would put the value of that argument into a field in the files?
So, if the goal is to provide a "reliable way to get the version information in a cross-platform way", what would ensure that this field be set properly, for some value of "properly", everywhere that binary tzdata files are generated?
I.e., what is to guarantee that
1) nobody downstream change VERSION if they aren't changing the zic files from what's distributed
and
2) people change it "properly" if the files don't correspond to a particular version of files that were distributed (e.g., if some downstream supplier takes 2015q and adds some changes mentioned on the list to it before 2015r comes out with those changes plus other changes, they should call the resulting files something other than "2015q" or "2015r")?

On 10/25/2015 01:59 PM, Guy Harris wrote:
On Oct 25, 2015, at 10:55 AM, Paul G <paul@ganssle.io> wrote:
I'm not suggesting that it be impossible to create a weird result when asking for the version, just that there's a standard place for the version to go, and preferably the standard default gives you something that you expected.
I.e., something that works most of the time is sufficient?
So what exactly is the intended use of this version field?
Maybe I don't fully understand the concerns you are raising, but by way of example, in python-dateutil, we distribute zoneinfo binaries along with the library, and to allow users to query what version of tzdata they are using, we bundle our own metadata file in with zoneinfo (the feature request is here: https://github.com/dateutil/dateutil/issues/27). I gather that this is not uncommon though I'm not an end-user of this particular feature, so I can only speculate as to the various use cases.
Right now, we're expecting a zoneinfo file of our own construction, but there are organizations for whom rebuilding python-dateutil's (and pytz's, and a hundred other zic consumers') zoneinfo database on every tz release is not reasonable, so they would prefer to point dateutil at a shared binary zoneinfo. If each of these zic consumers expects its own bespoke way of storing version information, though, they will either return inaccurate information or no information in a case such as this.
So, as it is now, it's going to be spotty at best ANYWAY if you do anything other than the completely stock-standard way of handling the databases (unless you maintain your own patched versions of all your zic consumers), so having a way to do it that at least handles the "first order" case (people who deploy zoneinfo files in a standard, but centralized way) is a significant improvement over the status quo.
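To make the "bespoke metadata file" problem concrete, here is a minimal sketch of a consumer that looks for its own sidecar file next to the zone files. The file name METADATA and the key "tzversion" are assumptions chosen for illustration, not a documented interface of any library; the point is only that a shared system zoneinfo tree without that sidecar yields no version at all.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical sidecar name and key; each consumer currently picks its own.
METADATA_NAME = "METADATA"
VERSION_KEY = "tzversion"


def read_sidecar_version(zoneinfo_dir: Path) -> Optional[str]:
    """Return the tzdata version recorded next to the zone files, or None."""
    meta_path = zoneinfo_dir / METADATA_NAME
    try:
        with meta_path.open(encoding="utf-8") as fh:
            metadata = json.load(fh)
    except (OSError, ValueError):
        # No sidecar (e.g. a shared system zoneinfo tree) or unreadable JSON.
        return None
    if not isinstance(metadata, dict):
        return None
    version = metadata.get(VERSION_KEY)
    return version if isinstance(version, str) else None


def demo() -> Tuple[Optional[str], Optional[str]]:
    """Build two throwaway trees: one with a sidecar file, one without."""
    with tempfile.TemporaryDirectory() as tmp:
        bundled = Path(tmp) / "bundled"
        shared = Path(tmp) / "shared"
        bundled.mkdir()
        shared.mkdir()
        (bundled / METADATA_NAME).write_text(
            json.dumps({VERSION_KEY: "2015g"}), encoding="utf-8"
        )
        return read_sidecar_version(bundled), read_sidecar_version(shared)


print(demo())
```

The second value is None: a tree that was not built by this particular consumer carries no version the consumer knows how to find.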

On Oct 25, 2015, at 11:17 AM, Paul G <paul@ganssle.io> wrote:
On 10/25/2015 01:59 PM, Guy Harris wrote:
On Oct 25, 2015, at 10:55 AM, Paul G <paul@ganssle.io> wrote:
I'm not suggesting that it be impossible to create a weird result when asking for the version, just that there's a standard place for the version to go, and preferably the standard default gives you something that you expected.
I.e., something that works most of the time is sufficient?
So what exactly is the intended use of this version field?
Maybe I don't fully understand the concerns you are raising, but by way of example, in python-dateutil, we distribute zoneinfo binaries along with the library, and to allow users to query what version of tzdata they are using, we bundle our own metadata file in with zoneinfo (the feature request is here: https://github.com/dateutil/dateutil/issues/27). I gather that this is not uncommon though I'm not an end-user of this particular feature, so I can only speculate as to the various use cases.
And the person who filed that request didn't indicate what their use case was. I'm not sure I understand the concerns *they* are raising. What do they intend to *do* with the version string once they get it? Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Right now, we're expecting a zoneinfo file of our own construction, but there are organizations for whom rebuilding python-dateutil's (and pytz's, and a hundred other zic consumers') zoneinfo database on every tz release is not reasonable, so they would prefer to point dateutil at a shared binary zoneinfo. If each of these zic consumers expects its own bespoke way of storing version information, though, they will either return inaccurate information or no information in a case such as this.
So, as it is now, it's going to be spotty at best ANYWAY if you do anything other than the completely stock-standard way of handling the databases (unless you maintain your own patched versions of all your zic consumers), so having a way to do it that at least handles the "first order" case (people who deploy zoneinfo files in a standard, but centralized way)
Presumably meaning "deploy standard, unmodified zic output built from a standard tzdb release".
is a significant improvement over the status quo.

On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.) I would even propose an additional self-identification step: include the zone name in tzdb file. Then you could tell, for example, where /etc/localtime came from.

On Oct 26, 2015, at 2:50 PM, Bradley White <bww@acm.org> wrote:
On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.)
Which version they have, or which version of the IANA tzdb the version they have is based on? :-)
I would even propose an additional self-identification step: include the zone name in tzdb file.
Which zone name is "the" zone name? The zone name on the system on which zic was run?
Then you could tell, for example, where /etc/localtime came from.
Well, on my system, it came from a symlink() call:
$ ls -l /etc/localtime
lrwxr-xr-x 1 root wheel 39 Oct 10 21:20 /etc/localtime -> /usr/share/zoneinfo/America/Los_Angeles
although if I were traveling there would be more than one such symlink() call, one made whenever the machine changes what tzdb zone it's in (I checked "Set time zone automatically using current location" in the "Time Zone" subpane of the "Date & Time" pane of System Preferences). Should the code making those symlink() calls also write to each compiled tzdb file under /usr/share/zoneinfo, updating the zone name?

On Tue, Oct 27, 2015 at 7:40 AM, Guy Harris <guy@alum.mit.edu> wrote:
On Oct 26, 2015, at 2:50 PM, Bradley White <bww@acm.org> wrote:
On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.)
Which version they have, or which version of the IANA tzdb the version they have is based on? :-)
The version of the data file that was passed to zic to produce the zoneinfo file in question. (If someone downstream wants to modify data files and recompile them, then sub-versioning is up to them.)
I would even propose an additional self-identification step: include the zone name in tzdb file.
Which zone name is "the" zone name? The zone name on the system on which zic was run?
The zone name is the Zone field from the data file used to produce the zoneinfo file. (There would be a question about how to handle Links.)
Then you could tell, for example, where /etc/localtime came from.
Well, on my system, it came from a symlink() call:
$ ls -l /etc/localtime
lrwxr-xr-x 1 root wheel 39 Oct 10 21:20 /etc/localtime -> /usr/share/zoneinfo/America/Los_Angeles
And on others it is a copy. It seems preferable to have the data be self identifying rather than assuming anything about symlinks.
although if I were traveling there would be more than one such symlink() call, one made whenever the machine changes what tzdb zone it's in (I checked "Set time zone automatically using current location" in the "Time Zone" subpane of the "Date & Time" pane of System Preferences). Should the code making those symlink() calls also write to each compiled tzdb file under /usr/share/zoneinfo, updating the zone name?
No, the zone name in any zoneinfo file would be constant. As an example, perhaps we could then see:
$ file /etc/localtime
/etc/localtime: timezone data, *zone America/New_York*, *release 2015f-0ubuntu0.14.04*, version 2, 4 gmt time flags, 4 std time flags, no leap seconds, 235 transition times, 4 abbreviation chars

On Oct 27, 2015, at 8:19 AM, Bradley White <bww@acm.org> wrote:
On Tue, Oct 27, 2015 at 7:40 AM, Guy Harris <guy@alum.mit.edu> wrote:
On Oct 26, 2015, at 2:50 PM, Bradley White <bww@acm.org> wrote:
On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.)
Which version they have, or which version of the IANA tzdb the version they have is based on? :-)
The version of the data file that was passed to zic to produce the zoneinfo file in question.
"The version of the data file that was passed to zic to produce the zoneinfo file in question" presumably means, here, "The version string, which was passed as one of the arguments to zic", as the text files don't themselves include version numbers.
(If someone downstream wants to modify data files and recompile them, then sub-versioning is up to them.)
And they can pass any version string they choose.
Then you could tell, for example, where /etc/localtime came from.
Well, on my system, it came from a symlink() call:
$ ls -l /etc/localtime
lrwxr-xr-x 1 root wheel 39 Oct 10 21:20 /etc/localtime -> /usr/share/zoneinfo/America/Los_Angeles
And on others it is a copy. It seems preferable to have the data be self identifying rather than assuming anything about symlinks.
Presumably, then, by "where /etc/localtime came from" you meant "the tzdb name for the default time zone".

On Tue, Oct 27, 2015 at 2:54 PM, Guy Harris <guy@alum.mit.edu> wrote:
On Oct 27, 2015, at 8:19 AM, Bradley White <bww@acm.org> wrote:
On Tue, Oct 27, 2015 at 7:40 AM, Guy Harris <guy@alum.mit.edu> wrote:
On Oct 26, 2015, at 2:50 PM, Bradley White <bww@acm.org> wrote:
On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.)
Which version they have, or which version of the IANA tzdb the version they have is based on? :-)
The version of the data file that was passed to zic to produce the zoneinfo file in question.
"The version of the data file that was passed to zic to produce the zoneinfo file in question" presumably means, here, "The version string, which was passed as one of the arguments to zic", as the text files don't themselves include version numbers.
They don't explicitly at the moment, no. (They used to have RCS Ids.)
(If someone downstream wants to modify data files and recompile them, then sub-versioning is up to them.)
And they can pass any version string they choose.
Yes. Presumably it would be something useful.
Then you could tell, for example, where /etc/localtime came from.
Well, on my system, it came from a symlink() call:
$ ls -l /etc/localtime
lrwxr-xr-x 1 root wheel 39 Oct 10 21:20 /etc/localtime -> /usr/share/zoneinfo/America/Los_Angeles
And on others it is a copy. It seems preferable to have the data be self identifying rather than assuming anything about symlinks.
Presumably, then, by "where /etc/localtime came from" you meant "the tzdb name for the default time zone".
Yes. And the version.

Guy Harris wrote:
Which zone name is "the" zone name? The zone name on the system on which zic was run?
I assume it'd be the name in the Zone directive of the zic input. Presumably zic would also have a new option to specify the version, and Makefile would use the new option. The version and zone name could be appended to the current tzfile format, e.g.,
ZONE=America/Los_Angeles
VERSION=2015g
If you copy or link the file, that wouldn't change its ZONE line. We'd bump the tzfile version number from '3' to '4'. ZONE and VERSION strings couldn't contain newlines. Or perhaps we should use a different convention that allows arbitrary data in strings.
This is doable; I'm not sure it's worth the hassle, though. There are political objections to some of the zone names, so putting them in the data files might raise a few eyebrows. On the technical side, if you change one source file, will you remember to change VERSION too? and will you be happy that your one little change to America/Podunk updates the VERSION string in all the files that zic produces? That sort of thing.
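As a rough illustration of the ZONE=/VERSION= idea above, the sketch below reads trailing KEY=value lines from the end of a file image. It assumes the new lines would follow the newline-enclosed TZ string that already ends version 2 and later files; that placement, the function name, and the sample bytes are all assumptions made for this example, since no such format was ever specified in this thread.

```python
import re
from typing import Dict

# A trailer line is an upper-case key, '=', and a value without newlines.
_FIELD = re.compile(rb"([A-Z_]+)=([^\n]*)")


def parse_trailer(data: bytes) -> Dict[str, str]:
    """Collect KEY=value lines found at the very end of a file image."""
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # the image ends with a newline
    fields: Dict[str, str] = {}
    for line in reversed(lines):
        match = _FIELD.fullmatch(line)
        if match is None:
            break  # reached the TZ string footer (or binary data)
        key = match.group(1).decode("ascii")
        fields.setdefault(key, match.group(2).decode("utf-8", "replace"))
    return fields


# Synthetic stand-in for a compiled file: fake body, footer, proposed trailer.
BODY = b"TZif3" + b"\0" * 15 + b"\x00\x01fake-body"
FOOTER = b"\nPST8PDT,M3.2.0,M11.1.0\n"
TRAILER = b"ZONE=America/Los_Angeles\nVERSION=2015g\n"

print(parse_trailer(BODY + FOOTER + TRAILER))
print(parse_trailer(BODY + FOOTER))  # an unextended file yields {}
```

Copying or linking the file leaves these bytes untouched, which is the property the proposal relies on.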

On 2015-10-27 11:34, Paul Eggert wrote:
Guy Harris wrote:
Which zone name is "the" zone name? The zone name on the system on which zic was run?
I assume it'd be the name in the Zone directive of the zic input. Presumably zic would also have a new option to specify the version, and Makefile would use the new option. The version and zone name could be appended to the current tzfile format, e.g.,
ZONE=America/Los_Angeles
VERSION=2015g
If you copy or link the file, that wouldn't change its ZONE line. We'd bump the tzfile version number from '3' to '4'. ZONE and VERSION strings couldn't contain newlines. Or perhaps we should use a different convention that allows arbitrary data in strings.
This is doable; I'm not sure it's worth the hassle, though. There are political objections to some of the zone names, so putting them in the data files might raise a few eyebrows. On the technical side, if you change one source file, will you remember to change VERSION too? and will you be happy that your one little change to America/Podunk updates the VERSION string in all the files that zic produces? That sort of thing.
Perhaps a NUL terminated environment list file suffix like:
TIMEZONE \0 America/Los_Angeles \0
TZDATA_VERSION \0 2015g \0
ZIC_VERSION \0 2015g \0
DISTRIBUTION \0 iana.org \0
\0 \0
to which distributions may be allowed to add entries? This could be useful, for example, to distinguish your test releases, with a git hash for TZDATA_VERSION and possibly ZIC_VERSION, and DISTRIBUTION eggert or https://github.com/eggert/tz or whatever you choose.
--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
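The NUL-terminated list suggested above could be encoded and decoded roughly as follows. This is a sketch under assumptions: the spaces around \0 in the message are taken to be for readability only, an empty key is taken to end the list, and the function names are invented for the example.

```python
from typing import Dict


def encode_suffix(fields: Dict[str, str]) -> bytes:
    """Serialize key/value pairs as key NUL value NUL ... NUL NUL."""
    out = bytearray()
    for key, value in fields.items():
        if not key or "\0" in key or "\0" in value:
            raise ValueError("keys must be non-empty and NUL-free")
        out += key.encode("utf-8") + b"\0" + value.encode("utf-8") + b"\0"
    out += b"\0\0"  # empty key and empty value close the list
    return bytes(out)


def decode_suffix(blob: bytes) -> Dict[str, str]:
    """Parse pairs until an empty key is seen; unknown keys are kept."""
    parts = iter(blob.split(b"\0"))
    fields: Dict[str, str] = {}
    for key in parts:
        if key == b"":
            break  # terminator
        value = next(parts, None)
        if value is None:
            raise ValueError("key without a value")
        fields[key.decode("utf-8")] = value.decode("utf-8")
    return fields


SAMPLE = {
    "TIMEZONE": "America/Los_Angeles",
    "TZDATA_VERSION": "2015g",
    "ZIC_VERSION": "2015g",
    "DISTRIBUTION": "iana.org",
}

encoded = encode_suffix(SAMPLE)
print(encoded)
print(decode_suffix(encoded))
```

Because the decoder keeps keys it does not recognize, a distribution could add its own entries without breaking readers, which is the extensibility the message asks about.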

Thanks to Bradley White in <http://mm.icann.org/pipermail/tz/2017-February/024814.html> / <CAGb2eRP7oT3FpTvJCL=gZDX=VS3=5aiu0GTWT4xUpM-y-x5K8g@mail.gmail.com> for pointing me to this thread <http://mm.icann.org/pipermail/tz/2015-October/022807.html>
On 10/27/15 12:25 PM, Brian Inglis wrote:
On 2015-10-27 11:34, Paul Eggert wrote:
Guy Harris wrote:
Which zone name is "the" zone name? The zone name on the system on which zic was run?
I assume it'd be the name in the Zone directive of the zic input. Presumably zic would also have a new option to specify the version, and Makefile would use the new option. The version and zone name could be appended to the current tzfile format, e.g.,
ZONE=America/Los_Angeles
VERSION=2015g
If you copy or link the file, that wouldn't change its ZONE line. We'd bump the tzfile version number from '3' to '4'. ZONE and VERSION strings couldn't contain newlines. Or perhaps we should use a different convention that allows arbitrary data in strings.
This is doable; I'm not sure it's worth the hassle, though.
I outlined the benefits as a consumer of the tz data. Trying to locate the actual id of a zone file is itself a huge hassle, as John Layt also noted in http://mm.icann.org/pipermail/tz/2015-October/022838.html
There are political objections to some of the zone names, so putting them in the data files might raise a few eyebrows.
Wouldn't the names have been filenames elsewhere on disk, somewhere in the "zoneinfo" directory?
Perhaps a NUL terminated environment list file suffix like: TIMEZONE \0 America/Los_Angeles \0 TZDATA_VERSION \0 2015g \0 ZIC_VERSION \0 2015g \0 DISTRIBUTION \0 iana.org \0 \0 \0 to which distributions may be allowed to add entries? This could be useful, for example, to distinguish your test releases, with a git hash for TZDATA_VERSION and possibly ZIC_VERSION, and DISTRIBUTION eggert or https://github.com/eggert/tz or whatever you choose.
A git hash could be helpful in those situations. Actually, just add TZDATA_GITHASH as a separate field.
On 2/9/17 2:59 PM, Arthur David Olson wrote:
The tzid might be stored as a seemingly extraneous time zone abbreviation, presumably with magic characters ("TZID"?) prepended or appended for identification purposes. Other such information could be included using the same approach.
Would this [then not] require a format change? That could be a benefit to this approach, if so.
The fifteen "reserved for future use" bytes near the start of time zone binary files are too few for this purpose (take "America/New_York"--and cue Henny Youngman).
Sounds like *that* future isn't here, yet.
So there are about three approaches proposed to add this information. Would such a version bump (if needed) require consumers to upgrade?
Thanks,
-s

On 02/09/2017 04:44 PM, Steven R. Loomis wrote:
Trying to locate the actual id of a zone file is itself a huge hassle, as John Layt also noted in http://mm.icann.org/pipermail/tz/2015-October/022838.html
Yes, it is a real management hassle that is not trivial to solve. Unfortunately nobody has had the time to come up with a practical solution, as far as I know.
There are political objections to some of the zone names, so putting them in the data files might raise a few eyebrows.
Wouldn't the names have been filenames elsewhere on disk, somewhere in the "zoneinfo" directory?
Not in some installations. Android, I think, does not create filenames like "America/New_York". More important, these names are not part of the format now, and standardizing them in the format would increase the likelihood of causing political irritations. And still more important, downstream users are free to add to the list of names, and many do so; this lessens the utility of using a "standard" name, as these names are not as "standard" as one might want.
Would such a version bump (if needed) require consumers to upgrade?
It seems so -- at least under the proposals I've seen so far, as if consumers are running software derived from tzcode, they would need to upgrade, as otherwise the code would mishandle some timestamps. I haven't looked at non-tzcode-derived libraries but I expect many would be similar. We'd rather avoid this, of course.

On Thu, 9 Feb 2017, Paul Eggert wrote:
It seems so -- at least under the proposals I've seen so far, as if consumers are running software derived from tzcode, they would need to upgrade, as otherwise the code would mishandle some timestamps. I haven't looked at non-tzcode-derived libraries but I expect many would be similar. We'd rather avoid this, of course.
While avoiding a version bump is desirable, it's not always possible. It seems to me that the pain of a version bump could be ameliorated by having both
* new code able to read and process both old and new versions, and
* zic growing a new option to produce either old or new "compiled" versions of the zone files (default version TBD)
+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+

I'd like to put in a plug for having the tz string in the binary file preceded by @(#) so that it appears when the what(1) utility is invoked.
--jhawk@mit.edu John Hawkinson
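For readers unfamiliar with what(1), the following sketch imitates its scan: find each @(#) marker and report the bytes that follow, up to the first double quote, greater-than sign, newline, backslash, or NUL. The sample bytes are synthetic, and placing the marker after the abbreviation strings is only an assumption for the example.

```python
import re
from typing import List

# what(1) prints what follows "@(#)" up to the first ", >, newline, \ or NUL.
_WHAT = re.compile(rb'@\(#\)([^">\n\\\x00]*)')


def what_strings(data: bytes) -> List[str]:
    """Return every identification string embedded in a byte image."""
    return [m.group(1).decode("ascii", "replace") for m in _WHAT.finditer(data)]


# Synthetic image: header-like prefix, abbreviations, then the marked string.
IMAGE = b"TZif2" + b"\0" * 15 + b"PST\0PDT\0" + b"@(#)America/Los_Angeles\0"

print(what_strings(IMAGE))
```

The same marked string stays readable after the file is copied, renamed, or installed as /etc/localtime, since it travels with the bytes.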

On 2017-02-09 18:49, John Hawkinson wrote:
I'd like to put in a plug for having the tz string in the binary file preceded by @(#) so that it appears when the what(1) utility is invoked.
Does anyone still run systems with "what" or sccs on it? I stopped using "what" strings many years ago when "what" stopped being available on systems. There does appear to be a package available called cssc which is compatible with sccs.
--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

I'd like to put in a plug for having the tz string in the binary file preceded by @(#) so that it appears when the what(1) utility is invoked.
Does anyone still run systems with "what" or sccs on it?
Yes (to the former, not the latter), although that's beside the point. (It ships under MacOS and I have it on other systems, including Solaris and NetBSD). But the idea is to have the string displayable using a semi-standard tool. That doesn't preclude doing something snazzier. An argument could be made that ident(1) is a better choice, although I don't think it really is, in part since it would require assigning a pseudokeyword like $Tz: ... $.
I stopped using "what" strings many years ago when "what" stopped being available on systems. There does appear to be a package available called cssc which is compatible with sccs.
The BSD version is easily available.
--jhawk@mit.edu John Hawkinson

John Hawkinson wrote:
the idea is to have the string displayable using a semi-standard tool.
On GNU/Linux a common tool for that is the 'file' command. For example, on Fedora 25:
$ file /usr/share/zoneinfo/America/Los_Angeles
/usr/share/zoneinfo/America/Los_Angeles: timezone data, version 2, 5 gmt time flags, 5 std time flags, no leap seconds, 186 transition times, 5 abbreviation chars
Here the string "America/Los_Angeles" is already part of the output. Presumably the intended application for a change to the binary file format is when the binary file has been renamed or linked and someone wants to know the "originally-intended" or "canonical" name. tz follows the Unix tradition where files do not have canonical names - a design flaw to some, and a design feature to others.

Paul Eggert <eggert@cs.ucla.edu> wrote on Fri, 10 Feb 2017 at 00:18:22 -0800 in <2c1949bd-fed5-a848-900b-4ac45b6deada@cs.ucla.edu>:
On GNU/Linux a common tool for that is the 'file' command. For example, on Fedora 25:
I'm not quite sure what you're getting at, because I feel like I could replace the word 'file' with 'ls -l' and have your message convey the same information... :) Maybe -ld tho.
But yes, if we went with a prefixed string of '@(#)' and recorded the original zone name in the file, it would be nice to have file(1) display it as well; this is compatible with having the string displayed by what(1), so would be great to do both. I am not a magic(5) expert, but I believe the "search" operator lets that happen (and it would only be invoked on files matching the existing 'TZif' magic string, see magic/Magdir/timezone in the file source distribution).
In the other part of my thread, I fear Brian Inglis misunderstood my intent. It was not to suggest that what(1) ought to be the only mechanism to display this information (thus launching debates about how many people have the tool installed, or whether it is obscure, etc., etc.); it is instead to suggest adopting a mechanism compatible with it, so that it works for many people today. That doesn't preclude a tz-specific tool (perhaps part of file(1) or standalone tool).
--jhawk@mit.edu John Hawkinson
$ file /usr/share/zoneinfo/America/Los_Angeles
/usr/share/zoneinfo/America/Los_Angeles: timezone data, version 2, 5 gmt time flags, 5 std time flags, no leap seconds, 186 transition times, 5 abbreviation chars
Here the string "America/Los_Angeles" is already part of the output. Presumably the intended application for a change to the binary file format is when the binary file has been renamed or linked and someone wants to know the "originally-intended" or "canonical" name. tz follows the Unix tradition where files do not have canonical names - a design flaw to some, and a design feature to others.
$ ls -l /usr/share/zoneinfo/America/Los_Angeles
-rw-r--r-- 1 root root 2819 Dec 7 05:59 /usr/share/zoneinfo/America/Los_Angeles
Here the string "America/Los_Angeles" is already part of the output. Presumably the intended application for a change to the binary file format is when the binary file has been renamed or linked and someone wants to know the "originally-intended" or "canonical" name. tz follows the Unix tradition where files do not have canonical names - a design flaw to some, and a design feature to others.

On 2/10/17 12:18 AM, Paul Eggert wrote:
John Hawkinson wrote:
the idea is to have the string displayable using a semi-standard tool.
On GNU/Linux a common tool for that is the 'file' command. For example, on Fedora 25:
$ file /usr/share/zoneinfo/America/Los_Angeles
/usr/share/zoneinfo/America/Los_Angeles: timezone data, version 2, 5 gmt time flags, 5 std time flags, no leap seconds, 186 transition times, 5 abbreviation chars
Here the string "America/Los_Angeles" is already part of the output.
Yes, but consider:
$ file /usr/share/zoneinfo/America/Los_Angeles
/usr/share/zoneinfo/America/Los_Angeles: timezone data, old version, 4 gmt time flags, 4 std time flags, no leap seconds, 185 transition times, 4 abbreviation chars
srl@filfla:/tmp/_ $ file /etc/localtime
/etc/localtime: timezone data, old version, 4 gmt time flags, 4 std time flags, no leap seconds, 185 transition times, 4 abbreviation chars
Apparently 'file' didn't want to mess with readlink() either. I don't blame it.
Presumably the intended application for a change to the binary file format is when the binary file has been renamed or linked and someone wants to know the "originally-intended" or "canonical" name.
Just so.
tz follows the Unix tradition where files do not have canonical names - a design flaw to some, and a design feature to others.
but tz does have:
northamerica: Zone America/Los_Angeles
backward: Link America/Los_Angeles US/Pacific
So, formally, what I am looking for is "the Zone entry of the stanza which zic used to generate the tzfile output." I'm very happy to leave filenames out of the definitional picture.
Steven
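The definition above ("the Zone entry of the stanza which zic used") can be sketched from the source side: read Zone and Link lines and map every installed name back to its Zone. The sample text is an abridged, illustrative excerpt rather than an exact copy of any release, and the function name is invented for the example.

```python
from typing import Dict


def canonical_names(source: str) -> Dict[str, str]:
    """Map each Zone name to itself and each Link name to its Zone."""
    zones = set()
    links: Dict[str, str] = {}
    for raw in source.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) >= 2 and fields[0] == "Zone":
            zones.add(fields[1])
        elif len(fields) >= 3 and fields[0] == "Link":
            # Link TARGET LINK-NAME
            links[fields[2]] = fields[1]
    result = {name: name for name in zones}
    for alias, target in links.items():
        seen = set()
        # Follow links to links, guarding against accidental cycles.
        while target in links and target not in seen:
            seen.add(target)
            target = links[target]
        result[alias] = target
    return result


SAMPLE_SOURCE = """\
# northamerica (abridged)
Zone America/Los_Angeles -7:52:58 - LMT 1883 Nov 18 12:07:02
            -8:00 US P%sT
# backward
Link America/Los_Angeles US/Pacific
"""

print(canonical_names(SAMPLE_SOURCE))
```

With such a map, the compiled file for US/Pacific would record America/Los_Angeles regardless of the filename it is later installed under.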

Hmm... are the coordinates shown in the zone.tab file stable enough to use as IDs? If so, then a geohash of the relevant point would be a reasonable choice that would fit in the 15 future bytes. Suitable schemes that come to mind are Geohash-36, a base-64 variant of the base-32 geohash.org scheme, or (if we wish to be a bit less obfuscatory) Maidenhead locators. And some other provisions for non-geographic zones.
More detailed analysis to follow when I'm at my full-powered computer.
Sent from my iPhone
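To gauge whether such an ID would fit in the reserved bytes, here is a sketch that turns a zone.tab style coordinate (ISO 6709 sign-degrees-minutes-seconds) into a six-character Maidenhead locator. The function names are invented for the example, and the coordinate string is the one I believe zone.tab lists for America/Los_Angeles, so treat it as an assumption.

```python
import re
from typing import Tuple

_ISO6709 = re.compile(r"([+-])(\d{2})(\d{2})(\d{2})?([+-])(\d{3})(\d{2})(\d{2})?")


def parse_iso6709(text: str) -> Tuple[float, float]:
    """Convert +DDMM[SS]+DDDMM[SS] into decimal (latitude, longitude)."""
    match = _ISO6709.fullmatch(text)
    if match is None:
        raise ValueError("not an ISO 6709 coordinate: " + text)
    lat_sign, lat_d, lat_m, lat_s, lon_sign, lon_d, lon_m, lon_s = match.groups()
    lat = int(lat_d) + int(lat_m) / 60 + int(lat_s or 0) / 3600
    lon = int(lon_d) + int(lon_m) / 60 + int(lon_s or 0) / 3600
    return (-lat if lat_sign == "-" else lat, -lon if lon_sign == "-" else lon)


def maidenhead(lat: float, lon: float) -> str:
    """Return the 6-character locator: field, square, subsquare."""
    if not (-90 <= lat < 90 and -180 <= lon < 180):
        raise ValueError("coordinate out of range")
    lon_adj = lon + 180.0
    lat_adj = lat + 90.0
    # Fields are 20 x 10 degrees, squares 2 x 1 degrees, subsquares 1/12 x 1/24.
    field = chr(ord("A") + int(lon_adj // 20)) + chr(ord("A") + int(lat_adj // 10))
    square = str(int((lon_adj % 20) // 2)) + str(int(lat_adj % 10))
    sub = chr(ord("a") + int((lon_adj % 2) * 12)) + chr(ord("a") + int((lat_adj % 1) * 24))
    return field + square + sub


locator = maidenhead(*parse_iso6709("+340308-1181434"))
print(locator, len(locator))
```

Six ASCII characters fit easily in fifteen bytes; whether the underlying coordinates are stable enough to serve as identifiers is a separate question that the code cannot answer.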

On 02/10/2017 08:31 AM, J Andrew Lipscomb wrote:
are the coordinates shown in the zone.tab file stable enough to use as IDs?
I doubt it, as they are updated from time to time, and I would guess they're less stable than the English names. Most likely some of those coordinates are typos that should be fixable. Most of those coordinates came from Shanks, who (I suspect) rounded them to fit whatever crude approximation his old database format was using. Plus, people argue even about the coordinates! These can be just as political as the name.

FWIW CLDR uses LOCODE for tzids in locale ids, such as en-u-tz-uslax for America/Los_Angeles
• http://www.unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#Time_Zone_Ident...
• http://www.unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#UnicodeTimezone...
• http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/timezone.xml
This is a registry and not an algorithm, so CLDR is responsible for its stability, as the ietf bcp47 -u- extension owner.
On 2/10/17 8:31 AM, J Andrew Lipscomb wrote:
Hmm... are the coordinates shown in the zone.tab file stable enough to use as IDs? If so, then a geohash of the relevant point would be a reasonable choice that would fit in the 15 future bytes. Suitable schemes that come to mind are Geohash-36, a base-64 variant of the base-32 geohash.org scheme, or (if we wish to be a bit less obfuscatory) Maidenhead locators. And some other provisions for non-geographic zones.
More detailed analysis to follow when I'm at my full-powered computer.
Sent from my iPhone

On 02/10/2017 09:09 AM, Steven R. Loomis wrote:
FWIW CLDR uses LOCODE for tzids in locale ids
The LOCODE scheme suffers from the problem of politics. For example, the CLDR code for Europe/Simferopol is "uasip", which gives the appearance of taking a political position that Crimea is part of Ukraine as opposed to Russia. I suppose we could come up with a different scheme that uses less-emotionally-charged abbreviations. As I recall, someone suggested IATA airport codes a while ago; unfortunately they do not suffice for locations like Pacific/Pitcairn where the nearest IATA airport is not in the same time zone. Which reminds me: I still need to clean out invented time zone abbreviations like PNT from the "australasia" file.

On Fri, Feb 10, 2017 at 10:44 AM, Steven R. Loomis <srl@icu-project.org> wrote:
So, formally, what I am looking for is "the Zone entry of the stanza which zic used to generate the tzfile output." I'm very happy to leave filenames out of the definitional picture.
+1

On 2017-02-09 19:51, John Hawkinson wrote:
I'd like to put in a plug for having the tz string in the binary file preceded by @(#) so that it appears when the what(1) utility is invoked.
Does anyone still run systems with "what" or sccs on it?
Yes (to the former, not the latter), although that's beside the point. (It ships under MacOS and I have it on other systems, including Solaris and NetBSD).
But the idea is to have the string displayable using a semi-standard tool. That doesn't preclude doing something snazzier. An argument could be made that ident(1) is a better choice, although I don't think it really is, in part since it would require assigning a pseudokeyword like $Tz: ... $.
Another legacy source control string requiring an RCS utility that has not been installed on current systems for many years AFAIK, except maybe on BSD derived distros.
I stopped using "what" strings many years ago when "what" stopped being available on systems. There does appear to be a package available called cssc which is compatible with sccs.
The BSD version is easily available.
While "what" is a POSIX standard utility, it seems to be present only in the POSIX man pages on modern distros, so I don't think either utility could now be considered semi-standard. If either what or ident strings were used, the utility sources or a replacement script would have to be added to tzcode, or instructions for the curious or serious to install the appropriate legacy package. -- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

John Hawkinson <jhawk@MIT.EDU> wrote:
I'd like to put in a plug for having the tz string in the binary file preceded by @(#) so that it appears when the what(1) utility is invoked.
An argument could be made that ident(1) is a better choice, although I don't think it really is, in part since it would require assigning a pseudokeyword like $Tz: ... $.
It's easy to support both what and ident, e.g. I sometimes have @(#) $Version: blahblah $ with the output of `git describe` instead of blahblah. Tony. -- f.anthony.n.finch <dot@dotat.at> http://dotat.at/ - I xn--zr8h punycode Faeroes, Southeast Iceland: Southerly veering southwesterly 5 or 6. Very rough becoming rough. Rain or wintry showers. Good, occasionally poor.
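For readers without what(1) installed, the lookup it performs is easy to approximate. The following is a minimal Python sketch, assuming the POSIX-described behaviour of printing the text after each "@(#)" marker up to the first double quote, ">", newline, backslash, or NUL. The function name and the sample payload (including the "2017a" value) are hypothetical and are not taken from any tzcode release.

```python
import re

# Text after "@(#)" up to the first ", >, newline, backslash, or NUL byte.
WHAT_MARKER = re.compile(rb'@\(#\)([^"\x00>\n\\]*)')

def what_strings(data: bytes) -> list[str]:
    """Return the what-style identification strings found in a binary blob."""
    return [m.group(1).decode("ascii", "replace").strip()
            for m in WHAT_MARKER.finditer(data)]

# Hypothetical payload: a what-style tag wrapping an ident-style keyword,
# as in the "@(#) $Version: ... $" combination mentioned above.
blob = b"TZif2\x00\x00@(#) $Version: 2017a $\x00EST\x00EDT\x00"
print(what_strings(blob))
```

Because the tag ends at a NUL byte, a string stored this way could share space with other NUL-terminated strings without confusing the scanner.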

On Feb 9, 2017, at 8:42 PM, Brian Inglis <Brian.Inglis@systematicsw.ab.ca> wrote:
On 2017-02-09 18:49, John Hawkinson wrote:
I'd like to put in a plug for having the tz string in the binary file preceded by @(#) so that it appears when the what(1) utility is invoked.
Does anyone still run systems with "what" or sccs on it? I stopped using "what" strings many years ago when "what" stopped being available on systems. There does appear to be a package available called cssc which is compatible with sccs.
I have also re-implemented the “what” utility: https://github.com/314159/what.git -- Scott

Would such a version bump (if needed) require consumers to upgrade?
Storing an extra, otherwise unused time zone abbreviation a la "@(#) TZID America/New_York" wouldn't require a version bump; then again, it's far less clean than what could be done with a version bump. @dashdashado On Thu, Feb 9, 2017 at 8:37 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 02/09/2017 04:44 PM, Steven R. Loomis wrote:
Trying to locate the actual id of a zone file is itself a huge hassle, as John Layt also noted in http://mm.icann.org/pipermail/tz/2015-October/022838.html
Yes, it is a real management hassle that is not trivial to solve. Unfortunately nobody has had the time to come up with a practical solution, as far as I know.
There are political objections to some of the zone names, so putting them in the data files might raise a few eyebrows.
Wouldn't the names have been filenames elsewhere on disk, somewhere in the "zoneinfo" directory?
Not in some installations. Android, I think, does not create filenames like "America/New_York". More important, these names are not part of the format now, and standardizing them in the format would increase the likelihood of causing political irritations. And still more important, downstream users are free to add to the list of names, and many do so; this lessens the utility of using a "standard" name, as these names are not as "standard" as one might want.
Would such a version bump (if needed) require consumers to upgrade?
It seems so -- at least under the proposals I've seen so far, as if consumers are running software derived from tzcode, they would need to upgrade, as otherwise the code would mishandle some timestamps. I haven't looked at non-tzcode-derived libraries but I expect many would be similar. We'd rather avoid this, of course.
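To make the "version bump" question concrete, here is a rough Python sketch of how a reader might inspect the version byte and the abbreviation-table size in a zic output file. It assumes the header layout as I understand it from tzfile(5): a 4-byte "TZif" magic, one version byte, 15 reserved bytes, then six 32-bit big-endian counts. The function name and returned keys are illustrative only.

```python
import struct

# 4-byte magic, 1 version byte, 15 reserved bytes, six big-endian counts.
HEADER = struct.Struct(">4sc15x6I")

def read_tzif_header(data: bytes) -> dict:
    """Decode the fixed-size header at the start of a TZif file."""
    (magic, version, isutcnt, isstdcnt,
     leapcnt, timecnt, typecnt, charcnt) = HEADER.unpack_from(data)
    if magic != b"TZif":
        raise ValueError("not a TZif file")
    return {
        # A NUL version byte denotes the original (version 1) format.
        "version": 1 if version == b"\x00" else int(version.decode("ascii")),
        "timecnt": timecnt,
        "typecnt": typecnt,
        # Size in bytes of the abbreviation string table.
        "charcnt": charcnt,
    }
```

A reader that only understands older versions sees nothing but this byte to warn it, which is why a format change that alters how timestamps are interpreted would force consumers to upgrade, whereas an extra unused abbreviation only increases charcnt.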

On 10/02/2017 01:37, Paul Eggert wrote:
On 02/09/2017 04:44 PM, Steven R. Loomis wrote:
Trying to locate the actual id of a zone file is itself a huge hassle, as John Layt also noted in http://mm.icann.org/pipermail/tz/2015-October/022838.html
Yes, it is a real management hassle that is not trivial to solve. Unfortunately nobody has had the time to come up with a practical solution, as far as I know.
There are political objections to some of the zone names, so putting them in the data files might raise a few eyebrows. Wouldn't the names have been filenames elsewhere on disk, somewhere in the "zoneinfo" directory?
Not in some installations. Android, I think, does not create filenames like "America/New_York". More important, these names are not part of the format now, and standardizing them in the format would increase the likelihood of causing political irritations. And still more important, downstream users are free to add to the list of names, and many do so; this lessens the utility of using a "standard" name, as these names are not as "standard" as one might want.
The political irritations could be eliminated by assigning a numeric ID to each zone and storing that in the binary file (along with version information, if deemed necessary). This might require numeric IDs to be added to the source data files. They _could_ be generated from a hash of the name, but the name sometimes changes (e.g. Asia/Kolkata) and it would be nice to have a stable ID for a zone. -- -=( Ian Abbott @ MEV Ltd. E-mail: <abbotti@mev.co.uk> )=- -=( Web: http://www.mev.co.uk/ )=-

The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary. One can not assume that the current rule set actually correctly translates the current data if one set predates a later change. But my own data goes back to a time when the rules were more suspect, and reprocessing that data can result in confusion. We need to be able to assess if changes do affect either current or historic data and without a clean version number that is not possible. Removing backzone data also screws up historic data even only back to the second world war, so that needs to be part of any versioning tag.
Sent from my android device so quoting is crap ... need to kill these painful email clients!
-----Original Message----- From: Bradley White <bww@acm.org> To: "tz@iana.org mailing list" <tz@iana.org> Sent: Mon, 26 Oct 2015 22:50 Subject: Re: [tz] Version in zoneinfo files?
On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.) I would even propose an additional self-identification step: include the zone name in tzdb file. Then you could tell, for example, where /etc/localtime came from.

I'm not sure I understand why backzone data needs to be part of the version tag. Isn't it sufficient to know the release version of the tzdata (as these are archived and available)? On Oct 27, 2015 11:34 PM, <lester@lsces.co.uk> wrote:
The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary. One can not assume that the current rule set actually correctly translates the current data if one set predates a later change. But my own data goes back to a time when the rules were more suspect, and reprocessing that data can result in confusion. We need to be able to assess if changes do affect either current or historic data and without a clean version number that is not possible. Removing backzone data also screws up historic data even only back to the second world war, so that needs to be part of any versioning tag.
Sent from my android device so quoting is crap ... need to kill these painful email clients!
-----Original Message----- From: Bradley White <bww@acm.org> To: "tz@iana.org mailing list" <tz@iana.org> Sent: Mon, 26 Oct 2015 22:50 Subject: Re: [tz] Version in zoneinfo files?
On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.)
I would even propose an additional self-identification step: include the zone name in tzdb file. Then you could tell, for example, where /etc/localtime came from.

If the data being normalized NEEDS pre-1970 data to be correctly used, but the machine's tz service does not have it, we have no way of knowing there is actually a problem! Either we need to lose the two versions of data, or tag just which one was used. And that is before even adding version number ...
Sent from my android device so quoting is crap ... need to kill these painful email clients!
-----Original Message----- From: Paul Ganssle <paul@ganssle.io> To: lester@lsces.co.uk Cc: "tz@iana.org List" <tz@iana.org>, Bradley White <bww@acm.org> Sent: Wed, 28 Oct 2015 4:38 Subject: Re: [tz] Version in zoneinfo files?
I'm not sure I understand why backzone data needs to be part of the version tag. Isn't it sufficient to know the release version of the tzdata (as these are archived and available)?
On Oct 27, 2015 11:34 PM, <lester@lsces.co.uk> wrote:
The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary. One can not assume that the current rule set actually correctly translates the current data if one set predates a later change. But my own data goes back to a time when the rules were more suspect, and reprocessing that data can result in confusion. We need to be able to assess if changes do affect either current or historic data and without a clean version number that is not possible. Removing backzone data also screws up historic data even only back to the second world war, so that needs to be part of any versioning tag.
Sent from my android device so quoting is crap ... need to kill these painful email clients!
-----Original Message----- From: Bradley White <bww@acm.org> To: "tz@iana.org mailing list" <tz@iana.org> Sent: Mon, 26 Oct 2015 22:50 Subject: Re: [tz] Version in zoneinfo files?
On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.)
I would even propose an additional self-identification step: include the zone name in tzdb file. Then you could tell, for example, where /etc/localtime came from.

THIRD TRY ON THIS BLOODY PHONE! If the data being used has been normalized using pre-1970 information but the tz service has omitted it, then how do you identify the problem? More important, how do you provide the correct data? And that is before adding the version of tz data used to do the normalization. The version number is essential data even for current normalized data; backzone just adds to the problem of establishing which data was used. See discussion on tzdist for more detail on managing current timetables which may have changes while one is in flight mode ... and when syncing again differences are flagged via version change.
Sent from my android device so quoting is crap ... need to kill these painful email clients!
-----Original Message----- From: Paul Ganssle <paul@ganssle.io> To: lester@lsces.co.uk Cc: "tz@iana.org List" <tz@iana.org>, Bradley White <bww@acm.org> Sent: Wed, 28 Oct 2015 4:38 Subject: Re: [tz] Version in zoneinfo files?
I'm not sure I understand why backzone data needs to be part of the version tag. Isn't it sufficient to know the release version of the tzdata (as these are archived and available)?
On Oct 27, 2015 11:34 PM, <lester@lsces.co.uk> wrote:
The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary. One can not assume that the current rule set actually correctly translates the current data if one set predates a later change. But my own data goes back to a time when the rules were more suspect, and reprocessing that data can result in confusion. We need to be able to assess if changes do affect either current or historic data and without a clean version number that is not possible. Removing backzone data also screws up historic data even only back to the second world war, so that needs to be part of any versioning tag.
Sent from my android device so quoting is crap ... need to kill these painful email clients!
-----Original Message----- From: Bradley White <bww@acm.org> To: "tz@iana.org mailing list" <tz@iana.org> Sent: Mon, 26 Oct 2015 22:50 Subject: Re: [tz] Version in zoneinfo files?
On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.)
I would even propose an additional self-identification step: include the zone name in tzdb file. Then you could tell, for example, where /etc/localtime came from.

lester@lsces.co.uk wrote:
The version number is essential data
It's not essential to put the version number into zic's output, as the tz project has been operating successfully for decades without doing so. Although there are benefits to having a version number in the data, there are costs too, and it's not clear which is greater. If one needs a unique number to briefly identify a data file, one can use a hash function of the data. This should suffice for tzdist's needs, so that tzdist shouldn't require a version number in the data.
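The hash suggestion can be sketched in a few lines of Python. This assumes SHA-256 as the hash function, which is my choice for illustration; the message above does not name one. The function names and the sample byte strings are likewise hypothetical.

```python
import hashlib

def tzfile_digest(data: bytes) -> str:
    """Identify compiled zone data by a hash of its exact bytes."""
    return hashlib.sha256(data).hexdigest()

def tzfile_digest_from_path(path) -> str:
    """Convenience wrapper, e.g. for /etc/localtime or a zoneinfo file."""
    with open(path, "rb") as f:
        return tzfile_digest(f.read())

# Hypothetical contents: any byte-level difference yields a new identifier.
old = b"TZif2" + bytes(39) + b"rules-as-of-release-A"
new = b"TZif2" + bytes(39) + b"rules-as-of-release-B"
print(tzfile_digest(old) == tzfile_digest(new))  # False
```

Such a digest identifies the bytes but, unlike a version string, does not order releases or say which release a file came from; mapping digests back to releases would need a separately maintained table.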

On 2015-10-28 02:12, Paul Eggert wrote:
lester@lsces.co.uk wrote:
The version number is essential data
It's not essential to put the version number into zic's output, as the tz project has been operating successfully for decades without doing so. Although there are benefits to having a version number in the data, there are costs too, and it's not clear which is greater. If one needs a unique number to briefly identify a data file, one can use a hash function of the data. This should suffice for tzdist's needs, so that tzdist shouldn't require a version number in the data.
Seems essential to document the provenance of the data using the kind of output --version produces from some packages, which include explicit build libraries and versions, e.g. 2015g+zone1970.tab where zone.tab could replace zone1970.tab; +backzone, and +(NIST)leap-seconds.3629404800 could be added, with (IERS) a (singular but possibly more secure and authoritative) alternative to (NIST); and local commit hash added like +https://github.com/eggert/tz/commit/6bf2f29c6458f8aa245dd5780235a38e6142bbef. [Last time I checked, only 9 of 24 NIST servers, and 2 of 9 nist.gov servers {time,wwv} offered anonymous FTP leap seconds files.] -- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada

On Oct 27, 2015, at 12:20 PM, lester@lsces.co.uk wrote:
The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary. One can not assume that the current rule set actually correctly translates the current data if one set predates a later change.
I'm not sure I understood correctly what you intended. Are you talking about this case: I make a calendar entry for 9 am local time next year. The calendar system translates that to UTC and stores that time in its database. A few months later, the TZ rules change, and that same UTC time is now 10 am local. The difficulty with cases like this is that it is not obvious what the desired outcome is. It might be that I want this event to occur at 9 am local time, and I don't care what that is in UTC. But it is also possible that I want it to occur at whatever UTC corresponds to 9 am local -- for example, because this is a teleconference with people in several countries. It might even be true that the desired answer varies depending on which calendar entry you're looking at. A dentist appointment is probably strictly in local time; a call with a colleague in another continent is probably primarily in UTC. This doesn't feel like a zoneinfo issue; it's a user interface design question for calendar applications. paul
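The scenario described above can be reproduced with a small Python sketch. To stay self-contained it uses two fixed offsets as stand-ins for the zone's rules before and after a change; the offsets (+2 and +3 hours), the dates, and the function names are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical rule sets for one zone, before and after a rule change.
OLD_RULES = timezone(timedelta(hours=2))
NEW_RULES = timezone(timedelta(hours=3))

def wall_to_utc(wall: datetime, rules: timezone) -> datetime:
    """Interpret a naive wall-clock time under the given rules."""
    return wall.replace(tzinfo=rules).astimezone(timezone.utc)

def utc_to_wall(instant: datetime, rules: timezone) -> datetime:
    """Show a UTC instant as naive local wall-clock time."""
    return instant.astimezone(rules).replace(tzinfo=None)

# Dentist appointment: stored as local wall time, so 9 am stays 9 am
# and only the corresponding UTC instant moves.
dentist = datetime(2016, 6, 1, 9, 0)
print(wall_to_utc(dentist, OLD_RULES), wall_to_utc(dentist, NEW_RULES))

# Teleconference: stored as UTC, so the instant is fixed and the local
# display moves from 9 am to 10 am.
call = datetime(2016, 6, 1, 7, 0, tzinfo=timezone.utc)
print(utc_to_wall(call, OLD_RULES), utc_to_wall(call, NEW_RULES))
```

Which of the two stored forms is "right" depends on the entry, which is the user-interface point being made.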

Back on a computer with a real keyboard after floating around the Med and being subjected to one hour time changes almost every night. Some following the actual timezone and DST changes over the weekend, and others arbitrary to avoid a 2 hour jump. So ships time did not match the adjacent land time :) But at least the time was always quoted as an offset ... from GMT ... standard sea time apparently.
On Oct 27, 2015, at 12:20 PM, lester@lsces.co.uk wrote:
The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary. One can not assume that the current rule set actually correctly translates the current data if one set predates a later change.
I'm not sure I understood correctly what you intended. Are you talking about this case: I make a calendar entry for 9 am local time next year. The calendar system translates that to UTC and stores that time in its database. A few months later, the TZ rules change, and that same UTC time is now 10 am local.
That is the EXACT problem that caused my original investigation as to why TZ data was wrong when there were problems with an international meeting which involved a live video link ... no one had told the organisers that the DST change might not happen, due to I think Ramadan but certainly some religious event which used the astronomical observations rather than a calendar date. This was over ten years ago, and these events now make sure to avoid some periods of the year to prevent another occurrence, but that is just one example.
The difficulty with cases like this is that it is not obvious what the desired outcome is. It might be that I want this event to occur at 9 am local time, and I don't care what that is in UTC. But it is also possible that I want it to occur at whatever UTC corresponds to 9 am local -- for example, because this is a teleconference with people in several countries.
FLAGGING that there is a potential problem would at least be a help, but as has been flagged in other posts, printed material such as schedules and prayer sheets may well be wrong if they have been created using out-of-date facts. Deciding what is correct is not a matter for TZ, only that the facts in version a no longer match the facts in version b.
It might even be true that the desired answer varies depending on which calendar entry you're looking at. A dentist appointment is probably strictly in local time; a call with a colleague in another continent is probably primarily in UTC.
Up until quite recently most people probably did not even bother setting the timezone on their systems. Certainly the systems I was building 15 years ago only had a 24 hour clock and could not run over the DST change. Switching to a UTC based system eliminates all the problems that a multi timezone system creates, and one only needs the correct TZ data for a client location in order to display and add local time events. The problem that arises is when an event is planned for some future date, stored normalized, and then the TZ offset gets changed for some reason. DST changes are the obvious one, and knowing that the event was planned with version x of the tz data and that version y now applies a different offset, one can at least identify a problem. Whether the meeting stays at the UTC time or moves to a new one is outside of the remit of TZ, but the fact that TZ has a change is the important bit ... the version used has changed.
This doesn't feel like a zoneinfo issue; it's a user interface design question for calendar applications.
YES the user interface has to deal with the problem, but if it can't establish there is a problem because it can't identify that the stored version is different to the current version of offset then how can the user interface even warn you there is a problem. It needs to ask if the versions match ... and it may be that it is dealing with a diary from a third source so a centrally sourced reliable version of TZ is critical.
With genealogical data going back in time, if your copy of TZ offsets does not match the one the data was created with, how do you even read the data? Even storing the location data does not help if my copy of TZ has included backzone and yours does not, added to which we have identified a historical mistake which is fixed in the current version, but not in the 10+ year old version that was used to create the data set. We need something to identify the copy of data being used at the time and to be able to access that and check for potential changes. Be that a holy day that is a week late, or a previously scheduled DST change which has been cancelled.
-- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

Lester Caine <lester@lsces.co.uk> wrote on Wed, 28 Oct 2015 at 23:03:56 +0000 in <5631545C.8080704@lsces.co.uk>:
This doesn't feel like a zoneinfo issue; it's a user interface design question for calendar applications.
YES the user interface has to deal with the problem, but if it can't establish there is a problem because it can't identify that the stored version is different to the current version of offset then how can the user interface even warn you there is a problem. It needs to ask if the versions match ... and it may be that it is dealing with a diary from a third source so a centrally sourced reliable version of TZ is critical.
I still can't parse this (try again with smaller, shorter, sentences please?), but I think you may misunderstand the problem. The time and the zone must be stored, because whether the time shifts with DST is a function of the zone. As long as the calendar app stores some things like "9:00am US/Eastern" (for a dentist appointment) and "13:00 UTC" (for an international conference call), calendar apps should work well. --jhawk@mit.edu John Hawkinson

Agreed - what Lester seems to be describing sounds more complicated than necessary. What might be useful is if you, for whatever reason, want to notify someone that a meeting time specified in UTC has changed in the local time zone of one or more of the participants. That said, what is the point of maintaining per zone versioning? That seems needlessly complex when retrieving and processing two tzdata tarballs is incredibly quick (especially if you only care about a limited number of zones). Then you can just process the date of interest using the old version and the new version and see if they resolve to the same time in UTC or TAI. On Oct 28, 2015 8:02 PM, "John Hawkinson" <jhawk@mit.edu> wrote:
Lester Caine <lester@lsces.co.uk> wrote on Wed, 28 Oct 2015 at 23:03:56 +0000 in <5631545C.8080704@lsces.co.uk>:
This doesn't feel like a zoneinfo issue; it's a user interface design question for calendar applications.
YES the user interface has to deal with the problem, but if it can't establish there is a problem because it can't identify that the stored version is different to the current version of offset then how can the user interface even warn you there is a problem. It needs to ask if the versions match ... and it may be that it is dealing with a diary from a third source so a centrally sourced reliable version of TZ is critical.
I still can't parse this (try again with smaller, shorter, sentences please?), but I think you may misunderstand the problem.
The time and the zone must be stored, because whether the time shifts with DST is a function of the zone. As long as the calendar app stores some things like "9:00am US/Eastern" (for a dentist appointment) and "13:00 UTC" (for an international conference call), calendar apps should work well.
--jhawk@mit.edu John Hawkinson
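The "process the date with the old version and the new version" check suggested above might look like the following Python sketch. To keep it runnable without two tzdata trees, it uses a toy tzinfo class whose single transition date differs between the "old" and "new" rules; the class, the dates, and the event labels are all hypothetical. With real data, the two tzinfo objects could instead come from `zoneinfo.ZoneInfo.from_file` applied to the same zone file in two compiled releases.

```python
from datetime import datetime, timedelta, timezone, tzinfo

class SwitchZone(tzinfo):
    """Toy zone: UTC+2 before `switch` (wall time), UTC+3 from then on."""
    def __init__(self, switch: datetime):
        self.switch = switch
    def utcoffset(self, dt):
        after = dt.replace(tzinfo=None) >= self.switch
        return timedelta(hours=3 if after else 2)
    def dst(self, dt):
        return self.utcoffset(dt) - timedelta(hours=2)
    def tzname(self, dt):
        return "TOY"

def changed_events(events, old: tzinfo, new: tzinfo):
    """Return (label, old_utc, new_utc) for wall times whose UTC instant moved."""
    changed = []
    for label, wall in events:
        old_utc = wall.replace(tzinfo=old).astimezone(timezone.utc)
        new_utc = wall.replace(tzinfo=new).astimezone(timezone.utc)
        if old_utc != new_utc:
            changed.append((label, old_utc, new_utc))
    return changed

# Hypothetical short-notice change: the transition is postponed by a week.
old_rules = SwitchZone(datetime(2016, 3, 27, 2, 0))
new_rules = SwitchZone(datetime(2016, 4, 3, 2, 0))
events = [
    ("before both", datetime(2016, 3, 20, 9, 0)),
    ("in between", datetime(2016, 3, 30, 9, 0)),
    ("after both", datetime(2016, 4, 10, 9, 0)),
]
for label, old_utc, new_utc in changed_events(events, old_rules, new_rules):
    print(label, old_utc, new_utc)
```

Only the event falling between the two transition dates is reported, which is the notification a calendar application would need; no per-zone version field is required for this comparison.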

On 28/10/15 23:37, John Hawkinson wrote:
YES the user interface has to deal with the problem, but if it can't establish there is a problem because it can't identify that the stored version is different to the current version of offset then how can the user interface even warn you there is a problem. It needs to ask if the versions match ... and it may be that it is dealing with a diary from a third source so a centrally sourced reliable version of TZ is critical.
I still can't parse this (try again with smaller, shorter, sentences please?), but I think you may misunderstand the problem.
The time and the zone must be stored, because whether the time shifts with DST is a function of the zone. As long as the calendar app stores some things like "9:00am US/Eastern" (for a dentist appointment) and "13:00 UTC" (for an international conference call), calendar apps should work well.
SIMPLE things like that are not the problem. Where the problem arises is when the 9:00AM UTC Meeting in one location is advertised in several time zones and one of those time zones has a change of offset. The medical conference I quoted was across four time zones in Europe and the Middle East, but one of the satellite links started an hour before the delegates arrived because of the change in DST at short notice. Don't ask why nobody even thought about the problem ... the local programs had been printed only with local times and the local organisers simply did not think ... current calendars do not do multi timezone well? With the number of short notice changes TZ is currently having to cope with, many diaries may need changes if viewed from outside the affected locations and travel arrangements may need adjusting to match up with the change.
But my other problem is with the local tz identifiers which have different values in backzone pre-1970. In particular the UK zones, for which I have historic data that was UTC normalized for the second world war period, but gets mis-displayed if backzone offsets are omitted. Some European IDs have a similar problem. The original data from some 20 years ago was normalized using older versions of TZ, but the chronology did not work and when investigated, the missing elements of TZ data were identified, but have now been flushed to backzone ... how can it be ensured that the correct local times are displayed rather than the incorrect ones provided by a non-backzone TZ?
-- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

Lester Caine <lester@lsces.co.uk> wrote on Thu, 29 Oct 2015 at 00:17:50 +0000 in <563165AE.7000404@lsces.co.uk>:
SIMPLE things like that are not the problem. Where the problem arises is when the 9:00AM UTC Meeting in one location is advertised in several time zones and one of those time zones has a change of offset.
Again, this is a user interface problem. A 0900 UTC meeting is advertised as 0900 UTC, period. Clients convert to local time if appropriate. What is the problem that the tzdb should be addressing here? I do not see it.
But my other problem is with the local tz identifiers which have different values in backzone pre-1970. In particular the UK zones which I have historic data that was UTC normalized for the second world war period, but gets mis-displayed if backzone offsets are omitted. Some European id's have a similar problem. The original data from some 20 years ago was normalized using older versions of TZ, but the chronology did not work and when investigated, the missing elements of TZ data were identified, but have now been flushed to backzone ... how can it be ensured that the correct local times are displayed rather than the incorrect ones provided by a non-backzone TZ?
Again, it is extremely difficult to parse your speech here, but we do not really address pre-1970 issues. I think it is out of the tz project's scope. --jhawk@mit.edu John Hawkinson

On 29/10/15 00:40, John Hawkinson wrote:
But my other problem is with the local tz identifiers which have different values in backzone pre-1970. In particular the UK zones which
Again, it is extremely difficult to parse your speech here, but we do not really address pre-1970 issues. I think it is out of the tz project's scope.
In which case TZ should not include any pre-1970 data? Instead of providing pre-1970 data which has been proven wrong ...
-- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

Is there anything in the proposed versioning schemes that is actually incompatible with your aims? If not, I'm not sure that scope has much to do with it anyway, since this is just an example of a use-case where people WILL want to know the tzdata version. As much potential problem as there is with potentially misleading versioning in the zoneinfo data, I suspect there's an even higher possibility of confusion when there's NO standard way of getting version information, and every consumer of zic outputs is just rolling their own metadata storage on top of it. Is the human-incremented nature of the versioning still the main stumbling block to the addition of versioning to the files? On 10/28/2015 08:47 PM, Lester Caine wrote:
On 29/10/15 00:40, John Hawkinson wrote:
But my other problem is with the local tz identifiers which have different values in backzone pre-1970. In particular the UK zones which
Again, it is extremely difficult to parse your speech here, but we do not really address pre-1970 issues. I think it is out of the tz project's scope.
In which case TZ should not include any pre-1970 data? Instead of providing pre-1970 data which has been proven wrong ...

On 29/10/15 01:02, Paul Ganssle wrote:
Is there anything in the proposed versioning schemes that is actually incompatible with your aims? If not, I'm not sure that scope has much to do with it anyway, since this is just an example of a use-case where people WILL want to know the tzdata version.
As much potential problem as there is with potentially misleading versioning in the zoneinfo data, I suspect there's an even higher possibility of confusion when there's NO standard way of getting version information, and every consumer of zic outputs is just rolling their own metadata storage on top of it.
Is the human-incremented nature of the versioning still the main stumbling block to the addition of versioning to the files?
That about sums it up? Up until now one could not even guarantee that if a distribution actually provided a version number that the content would match the same data from another distribution. A lot of genealogical data was ditched and re-built simply because there was no way of knowing just what had been used to produce normalized data. The problem on tzdist was getting an agreement that one NEEDS a reliable version number in order to identify just what version of data has been used to create a particular archive ... and this results in being able to update archives reliably moving forwards. -- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk

On Oct 31, 2015, at 8:00 AM, Lester Caine <lester@lsces.co.uk> wrote:
Up until now one could not even guarantee that if a distribution actually provided a version number that the content would match the same data from another distribution.
You'll *never* be able to *guarantee* that, 100%, with anything short of a signature; you'll only be able to hope that the distribution provider doesn't do something stupid such as change the source files from what's in tzdata2017b without changing the version number to "2017b-frobozzlinux10.17".

On 31/10/15 15:21, Guy Harris wrote:
Up until now one could not even guarantee that if
a distribution actually provided a version number that the content would match the same data from another distribution. You'll *never* be able to *guarantee* that, 100%, with anything short of a signature; you'll only be able to hope that the distribution provider doesn't do something stupid such as change the source files from what's in tzdata2017b without changing the version number to "2017b-frobozzlinux10.17".
Which is why tzdist is so urgently required: a reliable publisher that provides a well-documented, untruncated data service, can be identified as the source of tz data, and will provide any version of tz data it has provided going forward. Simply providing a 'current' tz service does not even cover the problems of data changes from one week to the next?
-- Lester Caine

On 29/10/15 00:17, Lester Caine wrote:
The time and the zone must be stored, because whether the time shifts
with DST is a function of the zone. As long as the calendar app stores some things like "9:00am US/Eastern" (for a dentist appointment) and "13:00 UTC" (for an international conference call), calendar apps should work well. SIMPLE things like that are not the problem. Where the problem arises is when the 9:00AM UTC Meeting in one location is advertised in several time zones and one of those time zones has a change of offset. The medical conference I quoted was across four time zones in Europe and the middle east, but one of the satellite links started an hour before the delegates arrived because of the change in DST at short notice. Don't ask why nobody even thought about the problem ... the local programs had been printed only with local times and the local organisers simply did not think ... current calendars do not do multi timezone well?
All three of the ones I use handle timezones well. Meetings are typically scheduled by the organizer in the organizer's timezone and show up at the right (local) time in everyone's calendar (even in Turkey). It's not perfect, of course. In one case, the organiser was having a bad day, tried to set his machine to use GMT to organise the meeting, got it wrong, assumed that the calendar application was broken, and turned up an hour late (or early) to his own meeting. In another case, the maintainers of the calendar system hadn't updated the server's tzdata since 2009, which made it remarkably difficult to schedule a regular meeting for a consistent time of day from Moscow. I suspect that in the case of the satellite problem someone was trying too hard to be helpful, rather than the calendar being broken. (Unless they were using the same calendar application that hasn't updated its tzdata since 2009.) jch

Lester Caine <lester@lsces.co.uk> wrote:
On 28/10/15 23:37, John Hawkinson wrote:
The time and the zone must be stored, because whether the time shifts with DST is a function of the zone. As long as the calendar app stores some things like "9:00am US/Eastern" (for a dentist appointment) and "13:00 UTC" (for an international conference call), calendar apps should work well.
SIMPLE things like that are not the problem. Where the problem arises is when the 9:00AM UTC Meeting in one location is advertised in several time zones and one of those time zones has a change of offset. The medical conference I quoted was across four time zones in Europe and the middle east, but one of the satellite links started an hour before the delegates arrived because of the change in DST at short notice. Don't ask why nobody even thought about the problem ... the local programs had been printed only with local times and the local organisers simply did not think ... current calendars do not do multi timezone well?
Yes. There is a problem with the way iCalendar is often used: calendar apps do early binding to timezone rules - not just the tz names, but embedding the actual schedule of DST changes in the calendar appointment object. This is a massive pain if the rules change. The best way to avoid problems is to bind to a time zone as late as possible. Sadly iCalendar doesn't allow the binding to be late enough. http://fanf.livejournal.com/104586.html Tony. -- f.anthony.n.finch <dot@dotat.at> http://dotat.at/ Fair Isle, Faeroes, Southeast Iceland: Southeasterly, veering southwesterly for a time, 5 to 7, occasionally gale 8 at first. Rough or very rough, occasionally high later. Rain or showers. Good, occasionally poor.
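The difference between early and late binding can be sketched as follows. This is only an illustration under stated assumptions: the two rule tables are hypothetical stand-ins for an older and a newer tzdata release, the offsets in them are illustrative values, and `resolve_utc` is a made-up helper, not part of iCalendar or any calendar application.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-ins for two tzdata releases: each maps a zone name to
# the UTC offset in force on the meeting date. Real code would consult the
# installed tz database instead of these fixed tables.
RULES_OLD = {"Europe/Istanbul": timezone(timedelta(hours=2))}
RULES_NEW = {"Europe/Istanbul": timezone(timedelta(hours=3))}


def resolve_utc(local_naive, zone_name, rules):
    """Bind a stored (wall-clock time, zone name) pair to UTC using the given rules."""
    return local_naive.replace(tzinfo=rules[zone_name]).astimezone(timezone.utc)


# Late binding: store only the wall-clock time and the zone name.
stored = (datetime(2015, 11, 2, 9, 0), "Europe/Istanbul")

# Early binding: the UTC instant is computed once and frozen at creation time.
frozen_utc = resolve_utc(*stored, RULES_OLD)

# After a rule change, the late-bound entry follows the new rules,
# while the frozen instant still reflects the old ones.
late_utc = resolve_utc(*stored, RULES_NEW)
```

With these tables the frozen instant and the late-bound instant differ by one hour, which is the kind of mismatch described in the satellite-link example.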

On Oct 28, 2015, at 7:03 PM, Lester Caine <lester@lsces.co.uk> wrote:
...
This doesn't feel like a zoneinfo issue; it's a user interface design question for calendar applications.
YES the user interface has to deal with the problem, but if it can't establish there is a problem because it can't identify that the stored version is different to the current version of offset then how can the user interface even warn you there is a problem? It needs to ask if the versions match ... and it may be that it is dealing with a diary from a third source so a centrally sourced reliable version of TZ is critical.
Thanks, that helps. I thought the discussion in this thread was about zoneinfo release identifier tracking. That's pretty easy to do in principle. What you describe seems to require something different: information as to whether a particular timestamp may have changed meaning. To do that, you need two things: (a) the zone identifier of the timestamp, and (b) the revision history of THAT zone. I suppose that's doable. It would be possible, for every distinct zone in the tzdata repository, to maintain a sequence number that is incremented when (and only when) some rule for that particular zone changes. For example, in tzdata2015g, the sequence number for Turkey would be incremented, but the one for New York would not be. Then for any stored timestamp, you'd keep the zone id plus the update sequence number. If later on you get a new tzdata release, you could walk through the calendar and see if any entries have a changed sequence number for their zone. If yes, that entry MAY have changed meaning. It may not, depending on whether the particular time is affected by the change. It may have been safely in summer time in both versions, or the change may have been one for historic timestamps only. But that would be a good way to filter the data. The question is whether maintaining that sequence information is something that could be added. It would be an additional burden on the maintainer; hopefully an acceptably small one. paul
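The filtering step described above might look roughly like this. It is a sketch only: tzdata carries no per-zone sequence numbers today, so the `zone_seq` field, the sequence table, and the numbers in it are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    title: str
    zone: str      # tz identifier the entry was created in
    zone_seq: int  # hypothetical per-zone sequence number seen at creation


def possibly_changed(entries, current_seq):
    """Return entries whose zone's sequence number differs from the stored one."""
    return [e for e in entries if current_seq.get(e.zone) != e.zone_seq]


# Hypothetical calendar and sequence table after a new release in which
# only the Turkish rules changed.
entries = [
    Entry("Clinic", "Europe/Istanbul", 4),
    Entry("Standup", "America/New_York", 7),
]
after_update = {"Europe/Istanbul": 5, "America/New_York": 7}

flagged = possibly_changed(entries, after_update)
```

A flagged entry only MAY have changed meaning; deciding whether it did still requires comparing the old and new offsets for that particular time.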

On 28/10/15 23:51, Paul_Koning@Dell.com wrote:
I suppose that's doable. It would be possible, for every distinct zone in the tzdata repository, to maintain a sequence number that is incremented when (and only when) some rule for that particular zone changes. For example, in tzdata2015g, the sequence number for Turkey would be incremented, but the one for New York would not be.
tzdist has all the right hooks to provide just what is needed, and a 'perfect' publisher would provide all of the historic data where available, and empty responses where it is not. If you have a current cached copy of the tz data and then see a calendar which uses an older version you can enquire if any of the times you are looking at may have changed, or if you reconnect after flight mode, you can see if there has been a change to the data you are using whilst off line. All of this needs management of the version of data and the current version number system can handle it as long as hard coded copies can be mapped to the later dynamic changes? If distributions are using binary zoneinfo files, then matching that to live updates is important, but so is providing a means of identifying which version was used to create the data being viewed?
-- Lester Caine

On Oct 28, 2015, at 8:30 AM, Paul_Koning@dell.com wrote:
On Oct 27, 2015, at 12:20 PM, lester@lsces.co.uk wrote:
The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary. One can not assume that the current rule set actually correctly translates the current data if one set predates a later change.
I'm not sure I understood correctly what you intended. Are you talking about this case: I make a calendar entry for 9 am local time next year. The calendar system translates that to UTC and stores that time in its database. A few months later, the TZ rules change, and that same UTC time is now 10 am local.
The difficulty with cases like this is that it is not obvious what the desired outcome is. It might be that I want this event to occur at 9 am local time, and I don't care what that is in UTC. But it is also possible that I want it to occur at whatever UTC corresponds to 9 am local -- for example, because this is a teleconference with people in several countries.
It might even be true that the desired answer varies depending on which calendar entry you're looking at. A dentist appointment is probably strictly in local time; a call with a colleague in another continent is probably primarily in UTC.
This doesn't feel like a zoneinfo issue; it's a user interface design question for calendar applications.
...and an API/mechanism issue for the OSes on which those applications run, i.e. there needs to be a way for an application to be told "oops, a tz rule changed" or "oops, this particular tzdb zone's rules changed" or something such as that so that it knows that a meeting that's supposed to occur at "9 AM local time" isn't going to happen at the time it thought it was going to happen (if for no other reason than to ensure that a "you have a meeting in {5 minutes, an hour, ...}" message pops up at the appropriate time). (That mechanism could also be used on systems wherein "system time" ticks at one second per second, even during leap seconds, to be notified that a new leap second will be inserted in the future and therefore that events with either local *or* UTC times in the future will have to have their "time of event, represented as system time" values recalculated.) But that doesn't need a version stamp; all that the notification mechanism requires is a way to know that something changed, which could be delivered if the files were replaced at all (that might deliver a "false positive" if the file didn't actually change, but that might be rare enough not to care about), or delivered if anybody expressed interest in changes in a *particular* tzdb zone and its rules changed, and that should be checked by actually *comparing the data*, not by relying on a human-generated version number.
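One way to check for changes by comparing the data rather than a version string is to digest each compiled file and diff the digests. The sketch below is an assumption about how that could be done, not an existing OS mechanism; `zone_digests` and `changed_zones` are hypothetical helpers, and the demo uses throwaway directories in place of real zoneinfo trees.

```python
import hashlib
import tempfile
from pathlib import Path


def zone_digests(root):
    """Map each file under root (by relative path) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_zones(before, after):
    """Return names whose digest differs, or that exist on only one side."""
    return sorted(
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )


# Demo on throwaway directories standing in for two installs of the compiled files.
with tempfile.TemporaryDirectory() as old, tempfile.TemporaryDirectory() as new:
    for root, istanbul in ((old, b"rules-A"), (new, b"rules-B")):
        (Path(root) / "Europe").mkdir()
        (Path(root) / "Europe" / "Istanbul").write_bytes(istanbul)
        (Path(root) / "UTC").write_bytes(b"same")
    changed = changed_zones(zone_digests(old), zone_digests(new))
```

A byte-level comparison reports a change whenever the file contents differ at all, so it cannot be fooled by an unchanged or mislabelled version number, though it may also flag rebuilds that differ in bytes without differing in rules.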

On Oct 27, 2015, at 9:20 AM, lester@lsces.co.uk wrote:
The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary.
If by "historic data" you mean "pre-1970 data", I'm not sure why that would be needed for a current meeting diary, and it's not the major problem *all* the time, it's a problem for people using the tzdb to try to convert between UTC ("proleptic UTC"?) and local time for times prior to the Epoch, which might be a significant subset of the users of the tzdb, but it's still a (proper) subset of those users.

On 29/10/15 00:12, Guy Harris wrote:
The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary. If by "historic data" you mean "pre-1970 data", I'm not sure why that would be needed for a current meeting diary, and it's not the major problem *all* the time, it's a problem for people using the tzdb to try to convert between UTC ("proleptic UTC"?) and local time for times prior to the Epoch, which might be a significant subset of the users of the tzdb, but it's still a (proper) subset of those users.
See my other post ... If a user only has a post-1970 TZ set of data but then looks at some historic material they see the wrong times and possibly dates. Agreed it would not affect many people, but with the growing interest in family history it's a growing area of interest.
-- Lester Caine

On Oct 28, 2015, at 5:23 PM, Lester Caine <lester@lsces.co.uk> wrote:
On 29/10/15 00:12, Guy Harris wrote:
The major problem all the time is identifying just what rules were applied when processing historic data, which may actually only be a current meeting diary. If by "historic data" you mean "pre-1970 data", I'm not sure why that would be needed for a current meeting diary, and it's not the major problem *all* the time, it's a problem for people using the tzdb to try to convert between UTC ("proleptic UTC"?) and local time for times prior to the Epoch, which might be a significant subset of the users of the tzdb, but it's still a (proper) subset of those users.
See my other post ... If a user only has a post-1970 TZ set of data but then looks at some historic material they see the wrong times and possibly dates. Agreed it would not affect many people, but with the growing interest in family history it's a growing area of interest.
If a user has a *full* TZ set of data and then looks at some historic material they may see the wrong times and possibly dates, if the pre-1970 information is wrong. :-)

On 29/10/15 00:29, Guy Harris wrote:
If a user has a *full* TZ set of data and then looks at some historic material they may see the wrong times and possibly dates, if the pre-1970 information is wrong. :-)
We know that some pre-1970 material is suspect. I've no argument with that, but as much verified data currently exists in backzone as unverified! As long as the unverified data is consistent until such time as more accurate data is established then the version number is sufficient to identify updates to historic data as it does future changes? It's the use of proven incorrect pre-1970 data in the truncated TZ that is my problem :(
-- Lester Caine

On 26 October 2015 at 21:50, Bradley White <bww@acm.org> wrote:
On Sun, Oct 25, 2015 at 2:30 PM, Guy Harris <guy@alum.mit.edu> wrote:
Is this just something to let people know whether they have an up-to-date version of the tzdb files or not?
Yes, although I think of it simply as let people know what version of a tzdb file they have. (Whether that is the latest version is a separate question.)
I would even propose an additional self-identification step: include the zone name in tzdb file. Then you could tell, for example, where /etc/localtime came from.
I'd love to see both of these, they're something we wanted when I wrote the Qt support for time zones. Writing a cross-platform / cross-distro library you quickly learn how hard it can be to figure out what the system time zone is, e.g. for translations of names, or so a calendar app can embed it in an event invitation. We have workarounds for most distros, but having it in the file itself would make it so much easier. We also have users writing calendar apps who would like to know metadata about the tzdb the system uses, i.e. version, release date, etc. John.
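For illustration, one common workaround of the kind mentioned above is to resolve the /etc/localtime symlink and read the zone name off the target path. This is a sketch, not the Qt implementation; the helper names are hypothetical, and it assumes the target path contains a "zoneinfo/" directory component, which is not true on every system.

```python
import os


def zone_name_from_target(target, marker="zoneinfo/"):
    """Extract a tz identifier from a resolved path, or None if the marker is absent."""
    _, found, name = target.rpartition(marker)
    return name if found and name else None


def system_zone_name(localtime="/etc/localtime"):
    """Guess the system zone from the symlink target of /etc/localtime."""
    return zone_name_from_target(os.path.realpath(localtime))
```

When /etc/localtime is a plain copy of a zone file rather than a symlink, this returns None, which is exactly the case where a zone name stored inside the file would help.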