Lack of "initial" transitions for some zones
I suspect this is just a matter of how the data has been specified, but it's a little odd...
I'm looking at the output of zic for each zone, and almost all zones start with a "transition" for some timestamp in the long distant past - including fixed zones such as UTC and Etc/*.
However, the legacy "root" zones of CET, EET, PST8PDT etc don't have this initial "transition" in the file - unless I'm misinterpreting it, which is always a possibility.
Would it make sense to adjust either the data or zic to make all zic output (in the 8-byte-timestamp format, at least) cover all of time? I realize that this "extra" data (presumably an insertion of "standard time until the first transition") may well not be historically accurate - but I doubt that it would be any less accurate than other zones in a similar situation.
Jon
On 21/07/15 18:47, Jon Skeet wrote:
However, the legacy "root" zones of CET, EET, PST8PDT etc don't have this initial "transition" in the file - unless I'm misinterpreting it, which is always a possibility.
Since several areas merged to adopt these generic timezones, just which preceding time does one choose? The zone offset did not exist prior to its adoption, and assuming any one offset prior to that is always wrong for other areas.
Would it make sense to adjust either the data or zic to make all zic output (in the 8-byte-timestamp format, at least) cover all of time? I realize that this "extra" data (presumably an insertion of "standard time until the first transition") may well not be historically accurate - but I doubt that it would be any less accurate than other zones in a similar situation.
It will always be less accurate where several areas combined to adopt a new 'standard' time. The time prior to that may well be 'local mean time' and tagging THAT is accurate, but then selecting one LMT from several is always going to be wrong for the rest?
--
Lester Caine - G8HFL
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On 21 July 2015 at 19:01, Lester Caine <lester@lsces.co.uk> wrote:
On 21/07/15 18:47, Jon Skeet wrote:
However, the legacy "root" zones of CET, EET, PST8PDT etc don't have this initial "transition" in the file - unless I'm misinterpreting it, which is always a possibility.
Since several areas merged to adopt these generic timezones, just which preceding time does one choose? The zone offset did not exist prior to its adoption, and assuming any one offset prior to that is always wrong for other areas.
Well, zdump already assumes an offset in order to provide the local time just before the first transition... and indeed zic does too, in order to work out when the first transition occurs. So there's already some precedent for the assumption.
Would it make sense to adjust either the data or zic to make all zic output (in the 8-byte-timestamp format, at least) cover all of time? I realize that this "extra" data (presumably an insertion of "standard time until the first transition") may well not be historically accurate - but I doubt that it would be any less accurate than other zones in a similar situation.
It will always be less accurate where several areas combined to adopt a new 'standard' time. The time prior to that may well be 'local mean time' and tagging THAT is accurate, but then selecting one LMT from several is always going to be wrong for the rest?
The zones in question all have a standard time and a daylight time. The first transition is into daylight time. I would assume standard time before that. The zones in question are all just abbreviations, too - not actual locations as such.
To be less lazy than I was before, the complete list is:
CET, CST6CDT, EET, EST5EDT, MET, MST7MDT, PST8PDT, WET
The Theory file describes the US zones in that list as "legacy names" and the European ones as "old-fashioned names". The europe file describes them with: "These are for backward compatibility with older versions." I don't know if that changes things at all.
Given the existing caveats about "don't trust that values before 1970 are historically accurate" I think it's reasonable to simply assume standard time before then, isn't it? Having said that, WET and EET are somewhat interesting in that their first transitions are in 1977 - so after the "accuracy watershed" so to speak. That does complicate things somewhat.
Jon
On Jul 21, 2015, at 11:19 AM, Jon Skeet <skeet@pobox.com> wrote:
Given the existing caveats about "don't trust that values before 1970 are historically accurate" I think it's reasonable to simply assume standard time before then, isn't it? Having said that, WET and EET are somewhat interesting in that their first transitions are in 1977 - so after the "accuracy watershed" so to speak. That does complicate things somewhat.
I get:
WET
Initially: +00:00:00 standard WET
1977-04-03T01:00:00Z +01:00:00 daylight WEST
1977-09-25T01:00:00Z +00:00:00 standard WET
1978-04-02T01:00:00Z +01:00:00 daylight WEST
…
and:
EET
Initially: +02:00:00 standard EET
1977-04-03T01:00:00Z +03:00:00 daylight EEST
1977-09-25T01:00:00Z +02:00:00 standard EET
1978-04-02T01:00:00Z +03:00:00 daylight EEST
…
Howard
On 21/07/15 19:19, Jon Skeet wrote:
Given the existing caveats about "don't trust that values before 1970 are historically accurate" I think it's reasonable to simply assume standard time before then, isn't it? Having said that, WET and EET are somewhat interesting in that their first transitions are in 1977 - so after the "accuracy watershed" so to speak. That does complicate things somewhat.
And some of the others have additional historic data in the 'back' file so that while a truncated TZ file has a common start offset, switching to include all the available data broadens that selection.
--
Lester Caine - G8HFL
On Jul 21, 2015, at 10:47 AM, Jon Skeet <skeet@pobox.com> wrote:
I suspect this is just a matter of how the data has been specified, but it's a little odd...
I'm looking at the output of zic for each zone, and almost all zones start with a "transition" for some timestamp in the long distant past - including fixed zones such as UTC and Etc/*.
However, the legacy "root" zones of CET, EET, PST8PDT etc don't have this initial "transition" in the file - unless I'm misinterpreting it, which is always a possibility.
Would it make sense to adjust either the data or zic to make all zic output (in the 8-byte-timestamp format, at least) cover all of time? I realize that this "extra" data (presumably an insertion of "standard time until the first transition") may well not be historically accurate - but I doubt that it would be any less accurate than other zones in a similar situation.
Just zeroing in on PST8PDT, it looks to me like its initial state (from the beginning of time) is unambiguously:
-08:00:00 standard PST
I get this by using:
Zone PST8PDT -8:00 US P%sT
And “falling off the beginning” of the US Rule list (by finding the first Rule with 0 save):
Rule US 1918 1919 - Oct lastSun 2:00 0 S
Howard
On Jul 21, 2015, at 11:09 AM, Howard Hinnant <howard.hinnant@gmail.com> wrote:
On Jul 21, 2015, at 10:47 AM, Jon Skeet <skeet@pobox.com> wrote:
I suspect this is just a matter of how the data has been specified, but it's a little odd...
I'm looking at the output of zic for each zone, and almost all zones start with a "transition" for some timestamp in the long distant past - including fixed zones such as UTC and Etc/*.
However, the legacy "root" zones of CET, EET, PST8PDT etc don't have this initial "transition" in the file - unless I'm misinterpreting it, which is always a possibility.
Would it make sense to adjust either the data or zic to make all zic output (in the 8-byte-timestamp format, at least) cover all of time? I realize that this "extra" data (presumably an insertion of "standard time until the first transition") may well not be historically accurate - but I doubt that it would be any less accurate than other zones in a similar situation.
Just zeroing in on PST8PDT, it looks to me like its initial state (from the beginning of time) is unambiguously:
-08:00:00 standard PST
I get this by using:
Zone PST8PDT -8:00 US P%sT
And “falling off the beginning” of the US Rule list (by finding the first Rule with 0 save):
Rule US 1918 1919 - Oct lastSun 2:00 0 S
And subsequently its first transition is determined by this Rule:
Rule US 1918 1919 - Mar lastSun 2:00 1:00 D
PST8PDT
Initially: -08:00:00 standard PST
1918-03-31T10:00:00Z -07:00:00 daylight PDT
1918-10-27T09:00:00Z -08:00:00 standard PST
1919-03-30T10:00:00Z -07:00:00 daylight PDT
…
Howard
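As a rough illustration of the derivation above, here is a minimal Python sketch. It assumes the simplified reading described in this message (take the LETTER of the first Rule line whose SAVE is zero, and treat the AT time of the first daylight rule as local standard time). It is not how zic itself is implemented, and the names `US_RULES`, `initial_abbreviation` and `first_transition_utc` are made up for the example.

```python
import calendar
from datetime import datetime, timedelta, timezone

# The two Rule lines quoted in this thread, in tzdata source syntax.
US_RULES = """\
Rule US 1918 1919 - Mar lastSun 2:00 1:00 D
Rule US 1918 1919 - Oct lastSun 2:00 0 S
"""


def initial_abbreviation(rule_text, zone_format):
    """Fill the zone FORMAT with the LETTER of the first zero-SAVE rule."""
    for line in rule_text.splitlines():
        fields = line.split()
        # Columns: Rule NAME FROM TO - IN ON AT SAVE LETTER
        if len(fields) == 10 and fields[0] == "Rule" and fields[8] == "0":
            return zone_format.replace("%s", fields[9])
    raise ValueError("no rule with a zero SAVE found")


def first_transition_utc(year, month, at_hour, std_offset_hours):
    """UTC instant of 'lastSun' in a month, with AT read as local standard time."""
    # monthcalendar() pads weeks with zeros, so max() picks the last real Sunday.
    last_sunday = max(week[calendar.SUNDAY]
                      for week in calendar.monthcalendar(year, month))
    local = datetime(year, month, last_sunday, at_hour,
                     tzinfo=timezone(timedelta(hours=std_offset_hours)))
    return local.astimezone(timezone.utc)


print(initial_abbreviation(US_RULES, "P%sT"))
print(first_transition_utc(1918, 3, 2, -8).isoformat())
```

The helper only covers a transition out of standard time; the October rule's 2:00 is wall-clock (daylight) time, so it would need the daylight offset instead.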
Jon Skeet wrote:
I'm looking at the output of zic for each zone, and almost all zones start with a "transition" for some timestamp in the long distant past - including fixed zones such as UTC and Etc/*.
Can you give more details about the problem, for UTC and PST8PDT say? It surprises me that the long-ago timestamp is issued for some output files but not others; I'm not sure why that would be.
The zic.8 documentation does not uniquely specify the output file, given the input: multiple output files can correctly implement the input. So it's possible that you haven't found a bug, but still, it is curious.
Possibly this is due to the transition change introduced in 2014c (see commit 7fb077a9ff67dab22b9a23f64f65f85d59cf593e) and updated by the Big Bang change in 2014d. Are the long-ago transitions equal to -2**59 == 0xf800000000000000 == -576460752303423488? If so, that's most likely what's going on.
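The three spellings of the long-ago timestamp in the question above can be checked against each other with a few lines of Python. This is only a sanity check of the arithmetic, reading the hex pattern as a big-endian signed 64-bit integer; the names `BIG_BANG` and `as_signed_64` are illustrative and not taken from the zic sources.

```python
import struct

BIG_BANG = -2**59  # illustrative name for the long-ago timestamp


def as_signed_64(pattern):
    """Reinterpret an unsigned 64-bit bit pattern as a signed 64-bit integer."""
    return struct.unpack(">q", struct.pack(">Q", pattern))[0]


print(as_signed_64(0xf800000000000000) == BIG_BANG)
print(BIG_BANG)
```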
On 21 July 2015 at 19:49, Paul Eggert <eggert@cs.ucla.edu> wrote:
Jon Skeet wrote:
I'm looking at the output of zic for each zone, and almost all zones start with a "transition" for some timestamp in the long distant past - including fixed zones such as UTC and Etc/*.
Can you give more details about the problem, for UTC and PST8PDT say? It surprises me that the long-ago timestamp is issued for some output files but not others; I'm not sure why that would be.
Well, I'm not sure I'd go so far as to say "problem" - other than for my validation efforts, where other APIs (including Noda Time) *do* assume standard time before the first transition.
I'm currently using the zic code from 2015e, btw. (Note: I always look at the second set of values in the file, i.e. the 64-bit ones.)
When processing UTC the first and only transition is at -576460752303423488 - the big bang value you noted later.
When processing PST8PDT the first transition is -1633269600, i.e. in 1918.
The zic.8 documentation does not uniquely specify the output file, given the input: multiple output files can correctly implement the input. So it's possible that you haven't found a bug, but still, it is curious.
Possibly this is due to the transition change introduced in 2014c (see commit 7fb077a9ff67dab22b9a23f64f65f85d59cf593e) and updated by the Big Bang change in 2014d. Are the long-ago transitions equal to -2**59 == 0xf800000000000000 == -576460752303423488? If so, that's most likely what's going on.
In terms of *input*, I think it's reasonably understandable - most zones start with a zone line with a "-" for the rule name. These 8 zones don't - the *only* line for each of them is one which specifies a rule. The fixed-offset zones (e.g. EST, MST, HST) only have a single line each, but that line doesn't specify a rule - so it goes back to the start of time.
So *if* we wanted to change anything (and it's unclear whether we should) the simplest option would probably be to give an extra zone line for each of those zones, specifying standard time until the first transition.
Whatever we do, I'll need to think about what the tzvalidate format should say about it, if anything - because it won't change the retrospective data either way, and I'm not trying to suggest that implementations change their behaviour, either.
Jon
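For anyone wanting to repeat the check on the second, 64-bit set of values, a minimal Python sketch follows. It assumes the layout documented in tzfile(5): a 44-byte header, a version-1 data block with 4-byte timestamps, then a second header followed by 8-byte transition times. The function name `read_64bit_transitions` is invented for this example and error handling is kept to a minimum.

```python
import struct

# Magic, version byte, 15 reserved bytes, then six 4-byte counts:
# isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt.
HEADER = struct.Struct(">4sc15x6I")


def read_64bit_transitions(data):
    """Return the transition times stored in the 64-bit block of a TZif file."""
    magic, version, isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt = \
        HEADER.unpack_from(data, 0)
    if magic != b"TZif":
        raise ValueError("not a TZif file")
    if version == b"\x00":
        raise ValueError("version 1 file: no 64-bit block present")
    # Size of the version-1 data block that has to be skipped.
    v1_size = (timecnt * 4 + timecnt + typecnt * 6 + charcnt
               + leapcnt * 8 + isstdcnt + isutcnt)
    offset = HEADER.size + v1_size
    fields = HEADER.unpack_from(data, offset)
    if fields[0] != b"TZif":
        raise ValueError("second header not found")
    timecnt64 = fields[5]
    offset += HEADER.size
    return list(struct.unpack_from(">%dq" % timecnt64, data, offset))
```

Called on the bytes of a compiled zone file (for example `open(path, "rb").read()`, with `path` pointing at zic output), the first element of the returned list can then be compared with -2**59.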
Jon Skeet wrote:
So *if* we wanted to change anything (and it's unclear whether we should) the simplest option would probably be to give an extra zone line for each of those zones, specifying standard time until the first transition.
Thanks for looking into it. Since this doesn't seem to affect anybody (other than validation testing) I'm somewhat inclined to let sleeping dogs lie.
On 22 July 2015 at 09:11, Paul Eggert <eggert@cs.ucla.edu> wrote:
Jon Skeet wrote:
So *if* we wanted to change anything (and it's unclear whether we should) the simplest option would probably be to give an extra zone line for each of those zones, specifying standard time until the first transition.
Thanks for looking into it. Since this doesn't seem to affect anybody (other than validation testing) I'm somewhat inclined to let sleeping dogs lie.
Makes sense. I'll think about faking it for just these zones, with an appropriate documentation note :) I haven't tried it for old data yet, which may uncover problems with that approach. Jon
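One way the "faking it" idea might look, purely as a sketch: if a zone's transition list does not already start at the long-ago timestamp, prepend an entry that copies the offset and abbreviation of the earliest standard-time entry. The `Transition` type, the `BIG_BANG` constant and `with_initial_entry` are all hypothetical names, and the assumption that the earliest standard-time entry describes the initial state is exactly the one debated in this thread.

```python
from datetime import datetime, timezone
from typing import List, NamedTuple

BIG_BANG = -2**59  # the long-ago timestamp discussed earlier in the thread


class Transition(NamedTuple):
    when: int      # seconds since the Unix epoch, UTC
    utoff: int     # offset from UTC in seconds
    is_dst: bool
    abbrev: str


def with_initial_entry(transitions: List[Transition]) -> List[Transition]:
    """Prepend an assumed 'standard time from the start' entry if it is missing."""
    if not transitions or transitions[0].when == BIG_BANG:
        return list(transitions)
    # Assumption: the earliest standard-time entry describes the initial state.
    standard = next((t for t in transitions if not t.is_dst), None)
    if standard is None:
        return list(transitions)
    return [standard._replace(when=BIG_BANG)] + list(transitions)


def ts(*parts):
    """Unix timestamp for a UTC date given as year, month, day, hour."""
    return int(datetime(*parts, tzinfo=timezone.utc).timestamp())


# First two PST8PDT transitions as listed earlier in the thread.
PST8PDT = [
    Transition(ts(1918, 3, 31, 10), -7 * 3600, True, "PDT"),
    Transition(ts(1918, 10, 27, 9), -8 * 3600, False, "PST"),
]
patched = with_initial_entry(PST8PDT)
print(patched[0])
```

Applying the function twice leaves the list unchanged, so it can be run over every zone without special-casing the eight affected ones.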
participants (4)
- Howard Hinnant
- Jon Skeet
- Lester Caine
- Paul Eggert