On Wed, Sep 27, 2006 at 02:37:58PM -0700, Mark Davis wrote:
I share your confusion. If Paul Eggert's description is right, then I have to ignore the TO field in some circumstances which are entirely unclear to me. I would much rather see the TO field corrected. That is, if TO=1942 is ignored, and 1945 is the real date, then the line should be corrected to TO=1945.
The key to understanding is that the rules describe a list of *transitions*. After a transition, the described effect on zone offset and abbreviation *remains* in effect until the next transition. The "TO" part of a rule enables a shorthand for a _recurring_ transition, such as "first Tuesday of February", for all years within the FROM..TO range. If TO is "only", then the *transition* being documented is a singleton, but the transitioned-into offset/abbreviation remains in effect until the _next_ transition, no matter how far in the future.
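To make the "list of transitions" reading concrete, here is a minimal Python sketch. It is not how zic is implemented; the Rule class, the fixed day-of-month simplification, and the rule values below are all made up for illustration. It expands each rule over its FROM..TO years and then looks up whichever transition most recently took effect.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified rule line (real rules also allow 'lastSun' etc.)."""
    from_year: int
    to_year: int       # equal to from_year when TO is "only"
    month: int
    day: int           # simplification: a fixed day of the month
    save_minutes: int
    letter: str


def expand(rules):
    """Turn rules into a sorted list of (date, save_minutes, letter) transitions."""
    transitions = [
        (date(year, r.month, r.day), r.save_minutes, r.letter)
        for r in rules
        for year in range(r.from_year, r.to_year + 1)
    ]
    return sorted(transitions)


def in_effect(transitions, when):
    """Return the latest transition at or before `when`, or None if there is none."""
    dates = [t[0] for t in transitions]
    i = bisect_right(dates, when)
    return transitions[i - 1] if i else None


# Illustrative data only, not taken from tzdata.
RULES = [
    Rule(1940, 1942, 4, 1, 60, "S"),   # recurring: summer time starts
    Rule(1940, 1941, 10, 1, 0, ""),    # recurring: summer time ends
    Rule(1945, 1945, 9, 15, 0, ""),    # singleton ("only"): summer time ends
]
TRANSITIONS = expand(RULES)
```

With this data, in_effect(TRANSITIONS, date(1944, 1, 1)) returns the 1 April 1942 transition: TO=1942 only bounds when the transition recurs, while its effect persists until the 1945 transition.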
There are other failures in the parsing. My error messages are: ... I looked into why this is happening, and found:
Zone Europe/Amsterdam 0:19:32 - LMT 1835
                      0:19:32 Neth %s 1937 Jul 1
But the first LETTER/S defined by Neth is in 1916, so during the range from 1835 to 1916 this is undefined. If the LETTER/S are magically also defined *before* the first FROM, that should be described in the specification.
Yes, this is a failure of the documentation. If a Zone refers to a time within a Rule that is before the first transition mentioned for that rule, then the _oldest_standard_time_ "Letter/s" is used. In this case, AMT.
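As a rough sketch of that fallback (the function name and tuple layout are hypothetical, and the two Neth entries are illustrative values rather than a copy of the data file): pick the earliest transition whose SAVE is zero and use its letter.

```python
def letter_before_first_transition(transitions):
    """Pick the LETTER/S to use before a rule set's first transition.

    `transitions` is an iterable of (year, month, day, save_minutes, letter)
    tuples. Following the description above, the letter of the oldest
    standard-time (save == 0) transition is used.
    """
    standard = [t for t in transitions if t[3] == 0]
    if not standard:
        return None  # assumption: no standard-time entry means no answer
    # Tuples compare by year, then month, then day, so min() is the oldest.
    return min(standard)[4]


# Illustrative entries for the first year of the Neth rules.
NETH_1916 = [
    (1916, 5, 1, 60, "NST"),   # summer time begins (save 1:00)
    (1916, 10, 1, 0, "AMT"),   # back to standard time (save 0)
]
INITIAL_LETTER = letter_before_first_transition(NETH_1916)
```

Here INITIAL_LETTER is "AMT", even though the NST transition comes first chronologically, because only standard-time entries are considered.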
BTW, the documentation was at first a bit confusing to me, since it says that fields are delimited by spaces, and lists a single Zone UNTIL field. However, if you look carefully at the documentation, there are really 4 fields:
UNTIL_YEAR UNTIL_IN UNTIL_ON UNTIL_AT
which are optional [but only in "truncation" from the end: that is, it corresponds to the (Perl) regex (UNTIL_YEAR (UNTIL_IN (UNTIL_ON (UNTIL_AT)?)?)?)?].
I'm not the only one to have initially made this mistake: the proposed XML format for the TZ database makes the same mistake.
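For parser writers, the truncation-from-the-end shape can be handled without a regex. The following is only a sketch: parse_continuation and its dictionary keys are invented names, it handles a Zone continuation line only, it does not strip comments, and the defaults for omitted trailing parts (Jan, 1, 0:00) are my reading of "earliest possible value", so treat them as an assumption.

```python
UNTIL_NAMES = ["year", "month", "day", "time"]
# Assumed defaults for omitted trailing parts; the year has no default.
UNTIL_DEFAULTS = [None, "Jan", "1", "0:00"]


def parse_continuation(line):
    """Split a Zone continuation line into STDOFF, RULES, FORMAT and UNTIL.

    UNTIL is whatever follows the third field: zero to four
    whitespace-separated parts, omitted only from the right.
    """
    parts = line.split(None, 3)          # at most 3 splits: UNTIL stays together
    if len(parts) < 3:
        raise ValueError("expected at least STDOFF, RULES and FORMAT")
    stdoff, rules, fmt = parts[:3]
    until_parts = parts[3].split() if len(parts) > 3 else []
    if len(until_parts) > 4:
        raise ValueError("UNTIL has more than four parts")
    until = None
    if until_parts:
        # Fill in the omitted trailing parts with their defaults.
        filled = until_parts + UNTIL_DEFAULTS[len(until_parts):]
        until = dict(zip(UNTIL_NAMES, filled))
    return {"stdoff": stdoff, "rules": rules, "format": fmt, "until": until}


EXAMPLE = parse_continuation("0:19:32 Neth %s 1937 Jul 1")
```

For the Amsterdam line quoted earlier, EXAMPLE["until"] comes out as year 1937, month Jul, day 1, with the time filled in as 0:00; a line with no UNTIL at all yields None.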
Confusing: granted. Whether "Until" is one or multiple fields is a matter of interpretation. The _traditional_ understanding is that it is a *single* "timestamp field" which may happen to have spaces within it. BTW the subfields aren't "YEAR IN ON AT", but "YEAR MONTH DAY TIME". In this regard, a recent addition to the tzcode tarball is zoneinfo2tdf.pl, which translates the more free-with-spaces zone tzdata into a form which strictly uses a single tab between fields. This may make life easier for some by simplifying their parser's requirements. (Or not.)

--Ken Pizzini