RE: Question on abbreviations

Then I'm confused. At the end of 1942, that rule should expire because of the "only", leaving what rule in effect? Apparently none, leaving the time in 1943 undefined.
Where can I find an explanation of the tz text database format?
Thanks, ++PLS
-----Original Message-----
From: tz-request@elsie.nci.nih.gov [mailto:tz-request@elsie.nci.nih.gov] On Behalf Of Paul Eggert
Sent: Wednesday, September 27, 2006 9:29 AM
To: tz@lecserver.nci.nih.gov
Subject: Re: Question on abbreviations
"Mark Davis" <mark.davis@icu-project.org> writes:
According to the spec, TO = "only" is equivalent to saying that the TO value equals the FROM value.
Yes, that's what's happening here. The rule to switch to DST applies only in 1942. There is no rule to switch _out_ of DST in 1942, so DST continues until 1945, the next rule that applies.
Rule NAME FROM TO   TYPE IN  ON AT     SAVE LETTER/S
...
Rule US   1942 only -    Feb 9  2:00   1:00 W # War
Rule US   1945 only -    Aug 14 23:00u 1:00 P #

Rules don't expire. Each rule creates a number of transition points--an "only" rule creates exactly one, a ranged rule creates one for each year in the range, and a "max" rule yields an infinite number (in theory). The current DST rule is always the one put in effect by the most recent transition point.
J Andrew Lipscomb, CPA*ABV, ASA
Decosimo Corporate Finance
900 Tallan Building
2 Union Square
Chattanooga, TN 37402
423.756.7100
Fax 423.266.6671
www.dcf.decosimo.com
-----Original Message-----
From: Paul Schauble [mailto:Paul.Schauble@ticketmaster.com]
Sent: Wed 27 September 2006 16:28
To: tz@lecserver.nci.nih.gov; tz@lecserver.nci.nih.gov
Subject: RE: Question on abbreviations
Then I'm confused. At the end of 1942, that rule should expire because of the "only", leaving what rule in effect? Apparently none, leaving the time in 1943 undefined.
Where can I find an explanation of the tz text database format?
Thanks, ++PLS
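The transition model described in this message can be sketched roughly in Python. This is only an illustration, not how zic is implemented: the `Rule` class and the `transitions` and `in_effect` helpers are hypothetical names, the ON column is assumed to be a plain day number, the AT column is ignored, and "max" is cut off at an arbitrary horizon year. The data is limited to the two rule lines quoted in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    from_year: int
    to: str       # "only", "max", or a year written as text
    month: int
    day: int      # simplification: plain day numbers only
    save: str     # SAVE column, e.g. "1:00" or "0"
    letter: str   # LETTER/S column


def transitions(rules, horizon_year):
    """Expand rules into a sorted list of (year, month, day, save, letter)."""
    out = []
    for r in rules:
        if r.to == "only":
            last = r.from_year          # exactly one transition
        elif r.to == "max":
            last = horizon_year         # stand-in for "forever"
        else:
            last = int(r.to)            # one transition per year in the range
        for year in range(r.from_year, min(last, horizon_year) + 1):
            out.append((year, r.month, r.day, r.save, r.letter))
    return sorted(out)


def in_effect(trans, year, month, day):
    """Return the most recent transition on or before the date, or None."""
    current = None
    for t in trans:
        if t[:3] <= (year, month, day):
            current = t
        else:
            break
    return current


# The two rule lines quoted in the thread.
us_war = [
    Rule(1942, "only", 2, 9, "1:00", "W"),
    Rule(1945, "only", 8, 14, "1:00", "P"),
]
trans = transitions(us_war, horizon_year=1950)
print(in_effect(trans, 1943, 6, 1))
```

With only these two transitions, a date in mid-1943 is governed by the 1942 transition, so the time is defined and the letter is W. A date before February 1942 has no governing transition in this sketch and yields `None`.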

The documentation is in zic.8.txt.
I share your confusion. If Paul (Eggert's) description is right, then I have to ignore the TO field in some circumstances which are entirely unclear to me. I would much rather see the TO field corrected. That is, if TO=1942 is ignored, and 1945 is the real date, then the line should be corrected to TO=1945.
There are other failures in the parsing. My error messages are:
??? Format too short: '' under Europe/Amsterdam;
??? Format too short: 'S' under Europe/Amsterdam;
??? Format too short: '' under Europe/Moscow;
??? Format too short: 'S' under Europe/Moscow;
I looked into why this is happening, and found:
Zone Europe/Amsterdam 0:19:32 -    LMT 1835
                      0:19:32 Neth %s  1937 Jul 1
But the first LETTER/S defined by Neth is in 1916, so during the range from 1835 to 1916 this is undefined. If the LETTER/S are magically also defined *before* the first FROM, that should be described in the specification.
Mark
BTW, the documentation was at first a bit confusing to me, since it says that fields are delimited by spaces, and lists a single Zone UNTIL field. However, if you look carefully at the documentation, there are really 4 fields:
UNTIL_YEAR UNTIL_IN UNTIL_ON UNTIL_AT
which are optional [but only in "truncation" from the end: that is, it corresponds to the (Perl) regex (UNTIL_YEAR (UNTIL_IN (UNTIL_ON (UNTIL_AT)?)?)?)?].
I'm not the only one to have initially made this mistake: the proposed XML format for the TZ database makes the same mistake.
On 9/27/06, Paul Schauble <Paul.Schauble@ticketmaster.com> wrote:
Then I'm confused. At the end of 1942, that rule should expire, because of the "only" leaving what rule in effect? Apparently none, leaving the time in 1943 undefined.
Where can I find a explanation of the tz text database format?
Thanks, ++PLS

On Wed, Sep 27, 2006 at 02:37:58PM -0700, Mark Davis wrote:
I share your confusion. If Paul (Eggert's) description is right, then I have to ignore the TO field in some circumstances which are entirely unclear to me. I would much rather see the TO field corrected. That is, if TO=1942 is ignored, and 1945 is the real date, then the line should be corrected to TO=1945.
The key to understanding is that the rules describe a list of *transitions*. After a transition, the described effect on zone offset and abbreviation *remains* in effect until the next transition. The "TO" part of a rule is used to enable a shorthand for a _recurring_ transition, such as "first Tuesday of February", for all years within the range. If "to" is "only", then the *transition* being documented is a singleton, but the transitioned-into offset/abbreviation remains in effect until the _next_ transition, no matter how far in the future.
There are other failures in the parsing. My error messages are: ... I looked into why this is happening, and found:
Zone Europe/Amsterdam 0:19:32 -    LMT 1835
                      0:19:32 Neth %s  1937 Jul 1
But the first LETTER/S defined by Neth is in 1916, so during the range from 1835 to 1916 this is undefined. If the LETTER/S are magically also defined *before* the first FROM, that should be described in the specification.
Yes, this is a failure of the documentation. If a Zone refers to a time within a Rule that is before the first transition mentioned for that rule, then the _oldest_standard_time_ "Letter/s" is used. In this case, AMT.
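The fallback described here can be sketched as follows. The helper names `initial_letter` and `expand_format` are hypothetical, and the two Neth entries are placeholder values chosen so that the result matches the AMT mentioned above; they are not copied from the data files. Treating SAVE values of "0" or "0:00" as standard time is also an assumption of this sketch.

```python
def initial_letter(rules):
    """Pick the LETTER/S of the oldest standard-time rule.

    rules: iterable of (year, month, day, save, letter) tuples.
    Returns None when no rule has a zero SAVE.
    """
    standard = [r for r in rules if r[3] in ("0", "0:00")]
    if not standard:
        return None
    # Oldest means smallest (year, month, day).
    return min(standard, key=lambda r: r[:3])[4]


def expand_format(fmt, letter):
    """Substitute the letter into a Zone FORMAT such as "%s"."""
    return fmt.replace("%s", letter)


# Placeholder entries for illustration only (assumed, not from tzdata).
neth = [
    (1916, 5, 1, "1:00", "NST"),   # a summer-time entry
    (1916, 10, 1, "0", "AMT"),     # the oldest standard-time entry
]

# Abbreviation used before the first Neth transition:
print(expand_format("%s", initial_letter(neth)))
```

Under these assumptions a parser no longer sees an empty or too-short FORMAT for the period before the first transition of the rule set, because the letter is taken from the oldest standard-time entry rather than left undefined.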
BTW, the documentation was at first a bit confusing to me, since it says that fields are delimited by spaces, and lists a single Zone UNTIL field. However, if you look carefully at the documentation, there are really 4 fields:
UNTIL_YEAR UNTIL_IN UNTIL_ON UNTIL_AT
which are optional [but only in "truncation" from the end: that is, it corresponds to the (Perl) regex (UNTIL_YEAR (UNTIL_IN (UNTIL_ON (UNTIL_AT)?)?)?)?].
I'm not the only one to have initially made this mistake: the proposed XML format for the TZ database makes the same mistake.
Confusing: granted. Whether "Until" is one or multiple fields is a matter of interpretation. The _traditional_ understanding is that it is a *single* "timestamp field" which may happen to have spaces within it. BTW the subfields aren't "YEAR IN ON AT", but "YEAR MONTH DAY TIME".
In this regard, a recent addition to the tzcode tarball is zoneinfo2tdf.pl, which translates the more free-with-spaces zone tzdata into a form which strictly uses a single tab between fields. This may make life easier for some by simplifying their parser's requirements. (Or not.)
--Ken Pizzini

Thanks for the explanations, that helps. Could the explanation of transitions and the default for LETTER/S before the first year be added to zic.8.txt?
As to UNTIL, it isn't just a confusion. The documentation says:
Input lines are made up of fields. Fields are separated from one another by any number of white space characters. Leading and trailing white space on input lines is ignored.
Thus saying that UNTIL is a single field with interior spaces contradicts this (and is also clumsier to parse and harder to explain).
Mark

On Wed, Sep 27, 2006 at 03:26:59PM -0700, Mark Davis wrote:
As to UNTIL, it isn't just a confusion. The documentation says:
Input lines are made up of fields. Fields are separated from one another by any number of white space characters. Leading and trailing white space on input lines is ignored.
Thus saying that UNTIL is a single field with interior spaces contradicts this (and is also clumsier to parse and harder to explain).
It is still a matter of interpretation, ripe for confusion. The "classic" interpretation has been: *the* fields (e.g., the five or six in (Zone,NAME,GMTOFF,RULES/SAVE,FORMAT,[UNTIL])) are whitespace separated. The UNTIL field is *one* field, even if it does have spaces. The "wrong" interpretation you mention is a very reasonable reading, but not the one that was meant.
Could the explanation of transitions and the default for LETTER/S before the first year be added to zic.8.txt?
I'll take a stab at making zic.8 less ambiguous, if someone else doesn't beat me to it.
--Ken Pizzini
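Either reading of UNTIL leads to the same parse in practice, as the following sketch suggests: split the line on whitespace, take the fixed fields, and treat whatever trails them (zero to four tokens) as UNTIL. The function name `parse_zone_line` and the dictionary layout are hypothetical, comment stripping is simplistic, and any quoting the real format may allow is ignored.

```python
UNTIL_NAMES = ("year", "in", "on", "at")


def parse_zone_line(line):
    """Parse a Zone line or a Zone continuation line into a dict."""
    fields = line.split("#", 1)[0].split()   # drop comment, split on whitespace
    if fields and fields[0] == "Zone":
        name, fields = fields[1], fields[2:]
    else:
        name = None                          # continuation of the previous Zone
    if not 3 <= len(fields) <= 7:
        raise ValueError("expected GMTOFF RULES FORMAT [UNTIL]: %r" % line)
    gmtoff, rules, fmt = fields[:3]
    # UNTIL may be truncated from the end only, so zip() pairs the
    # tokens that are present with the leading subfield names.
    until = dict(zip(UNTIL_NAMES, fields[3:]))
    return {"name": name, "gmtoff": gmtoff, "rules": rules,
            "format": fmt, "until": until}


first = parse_zone_line("Zone Europe/Amsterdam 0:19:32 - LMT 1835")
second = parse_zone_line("\t\t0:19:32 Neth %s 1937 Jul 1")
print(first["until"], second["until"])
```

Seen this way, "one field with interior spaces" and "up to four optional trailing fields" describe the same token sequence; the difference is only in how the documentation names it.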

On Wed, Sep 27, 2006 at 03:14:14PM -0700, Ken Pizzini wrote:
UNTIL_YEAR UNTIL_IN UNTIL_ON UNTIL_AT
BTW the subfields aren't "YEAR IN ON AT", but "YEAR MONTH DAY TIME".
My bad: I trusted memory instead of checking docs --- you were right, the differences being that "ON" can be things like "lastSun", and "AT" can have w/s/u/g/z suffixes.
--Ken Pizzini
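A small sketch of why ON and AT are more than a plain day and time. The helpers `resolve_on` and `split_at` are hypothetical names. Only a plain day number, the "lastSun" form, and a "Sun>=8" form are handled, and matches that would spill into the next month are not; the suffix meanings assumed here are wall clock for "w" (and for no suffix), standard time for "s", and universal time for "u", "g", and "z".

```python
import calendar
import re

WEEKDAYS = {"Mon": 0, "Tue": 1, "Wed": 2, "Thu": 3, "Fri": 4, "Sat": 5, "Sun": 6}


def resolve_on(year, month, on):
    """Turn an ON value into a day of the month for a given year and month."""
    days_in_month = calendar.monthrange(year, month)[1]
    if on.isdigit():
        return int(on)
    if on.startswith("last"):
        target = WEEKDAYS[on[4:7]]
        for day in range(days_in_month, 0, -1):      # search backwards
            if calendar.weekday(year, month, day) == target:
                return day
    m = re.fullmatch(r"([A-Za-z]+)>=(\d+)", on)
    if m:
        target = WEEKDAYS[m.group(1)[:3]]
        for day in range(int(m.group(2)), days_in_month + 1):
            if calendar.weekday(year, month, day) == target:
                return day
    raise ValueError("unsupported ON value: %r" % on)


def split_at(at):
    """Split an AT value into (clock text, kind of time)."""
    kinds = {"w": "wall", "s": "standard",
             "u": "universal", "g": "universal", "z": "universal"}
    if at and at[-1] in kinds:
        return at[:-1], kinds[at[-1]]
    return at, "wall"                                # no suffix: wall clock


print(resolve_on(2006, 9, "lastSun"), split_at("23:00u"))
```

The 23:00u value is the AT column of the 1945 rule quoted earlier in the thread, which is why the suffix matters for deciding when that transition actually happens.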
participants (4)
- Andy Lipscomb
- Ken Pizzini
- Mark Davis
- Paul Schauble