There is also the original issue of %s being null in 1943. This seems to
be a bug, but maybe not: from the documents I’ve seen, I’m unable to
determine the intended behavior.

Nonetheless, it seems obvious to me that either both the time resulting
from the transition and the letter carry forward, or neither one does.
Having one but not the other carry forward seems nonsensical.
++PLS
From: tz-request@elsie.nci.nih.gov [mailto:tz-request@elsie.nci.nih.gov] On Behalf Of Mark Davis
Sent: Wednesday, September 27, 2006 4:14 PM
To: tz@lecserver.nci.nih.gov
Cc: tz@lecserver.nci.nih.gov
Subject: Re: Question on abbreviations
My reading of the specification (zic.8.txt) was that the first rule
mentioned was operative during the interval from 1942 to "only", that
is, during 1942 alone. This was by my reading of:

    TO      Gives the final year in which the rule applies. In
            addition to minimum and maximum (as above), the word
            only (or an abbreviation) may be used to repeat the
            value of the FROM field.
While it was explained to me what the actual code does, I don't think
this is reflected in the above text, or at least not at all clearly.
According to this text, if I saw the following:

Rule US 1942 1944 - Feb 9 2:00 1:00 W # War

the rule should not apply in 1945. So I request that the text be fixed,
because the rule clearly, according to the explanations given on this
thread, applies *afterwards* (and the circumstances in which it applies
need to be clearly specified). Is it until the next Rule that has the
same SAVE value as this Rule? Until the next Rule that has a SAVE
value? ...
mark
On 9/27/06, Paul Schauble <Paul.Schauble@ticketmaster.com> wrote:
So in this case:
Rule US 1942 only - Feb 9 2:00 1:00 W # War
Rule US 1945 only - Aug 14 23:00u 1:00 P # Peace
Why is %s undefined in 1943? This was the question that started the
thread. If the time setting carries forward, surely the letter should
also.
++PLS
-----Original Message-----
From: tz-request@elsie.nci.nih.gov [mailto:tz-request@elsie.nci.nih.gov] On Behalf Of Ken Pizzini
Sent: Wednesday, September 27, 2006 3:14 PM
To: tz@lecserver.nci.nih.gov
Subject: Re: Question on abbreviations
On Wed, Sep 27, 2006 at 02:37:58PM -0700, Mark Davis wrote:
> I share your confusion. If Paul (Eggert's) description is right, then I have
> to ignore the TO field in some circumstances which are entirely unclear to
> me. I would much rather see the TO field corrected. That is, if TO=1942 is
> ignored, and 1945 is the real date, then the line should be corrected to
> TO=1945.
The key to understanding is that the rules describe a list of
*transitions*. After a transition, the described effects on zone offset
and abbreviation *remain* in effect until the next transition. The "TO"
part of a rule is used to enable a shorthand for a _recurring_
transition, such as "first Tuesday of February", for all years within
the range. If "TO" is "only", then the *transition* being documented is
a singleton, but the transitioned-into offset/abbreviation remains in
effect until the _next_ transition, no matter how far in the future.
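
To make the "list of transitions" reading concrete, here is a minimal
Python sketch. It is not how zic itself is implemented; the names
TRANSITIONS and state_at are hypothetical, and the timestamps are naive
approximations of the two "only" rules quoted earlier in this thread
(wall-clock versus "u" times are deliberately not resolved).

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical, simplified transition list: (instant, SAVE minutes, LETTER/S).
# Assumption: naive timestamps stand in for the real transition instants.
TRANSITIONS = [
    (datetime(1942, 2, 9, 2, 0), 60, "W"),    # War
    (datetime(1945, 8, 14, 23, 0), 60, "P"),  # Peace
]


def state_at(when):
    """Return (save_minutes, letter) in effect at `when`.

    The most recent transition at or before `when` wins, however long ago
    it happened. Returns None if `when` precedes the first transition.
    """
    instants = [t[0] for t in TRANSITIONS]
    i = bisect_right(instants, when)
    if i == 0:
        return None
    _, save, letter = TRANSITIONS[i - 1]
    return save, letter
```

Under this reading, a lookup for any date in 1943 lands on the 1942
transition, so both the SAVE value and the letter "W" carry forward
together.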
> There are other failures in the parsing. My error messages are:
...
> I looked into why this is happening, and found:
>
> Zone Europe/Amsterdam 0:19:32 - LMT 1835
>                       0:19:32 Neth %s 1937 Jul 1
> But the first LETTER/S defined by Neth is in 1916, so during the range from
> 1835 to 1916 this is undefined. If the LETTER/S are magically also defined
> *before* the first FROM, that should be described in the specification.
Yes, this is a failure of the documentation. If a Zone refers to a time
within a Rule that is before the first transition mentioned for that
rule, then the _oldest_standard_time_ "Letter/s" is used. In this case,
AMT.
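
The fallback just described can be sketched in a few lines of Python.
This is an illustration only: initial_letter is a hypothetical helper,
and the two rule rows are illustrative stand-ins chosen to match the
thread (first transitions in 1916, standard-time letter AMT); they are
assumptions, not copied from tzdata. "Standard time" is taken here to
mean a SAVE of zero.

```python
# Hypothetical rule rows: (year, month, day, SAVE minutes, LETTER/S),
# assumed to be sorted chronologically.
NETH_RULES = [
    (1916, 5, 1, 60, "NST"),   # assumed first (summer-time) transition
    (1916, 10, 1, 0, "AMT"),   # assumed first standard-time transition
]


def initial_letter(rules):
    """Letter to use before the first transition of a rule set.

    Picks the LETTER/S of the oldest rule whose SAVE is zero, i.e. the
    oldest standard-time rule. Returns None if no such rule exists.
    """
    for _year, _month, _day, save, letter in rules:
        if save == 0:
            return letter
    return None
```

With these rows, initial_letter(NETH_RULES) gives the letter that would
be substituted for %s between 1835 and the first 1916 transition.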
> BTW, the documentation was at first a bit confusing to me, since it says that
> fields are delimited by spaces, and lists a single Zone UNTIL field.
> However, if you look carefully at the documentation, there are really 4
> fields:
>
> UNTIL_YEAR UNTIL_IN UNTIL_ON UNTIL_AT
>
> which are optional [but only in "truncation" from the end: that is, it
> corresponds to the (Perl) regex (UNTIL_YEAR (UNTIL_IN (UNTIL_ON
> (UNTIL_AT)?)?)?)?].
>
> I'm not the only one to have initially made this mistake: the proposed XML
> format for the TZ database makes the same mistake.
Confusing: granted. Whether "Until" is one or multiple fields is a
matter of interpretation. The _traditional_ understanding is that it is
a *single* "timestamp field" which may happen to have spaces within it.
BTW the subfields aren't "YEAR IN ON AT", but "YEAR MONTH DAY TIME".
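
Either interpretation leads to the same parsing strategy: take the four
leading fields, then treat whatever remains as the UNTIL timestamp. A
minimal Python sketch follows, assuming a full "Zone" line (continuation
lines are not handled) and a naive comment strip that ignores quoted "#"
characters; split_zone_line and the dictionary keys are hypothetical
names.

```python
UNTIL_LABELS = ("year", "month", "day", "time")


def split_zone_line(line):
    """Split a full 'Zone' line into its fields.

    UNTIL is returned as a dict holding between zero and four subfields,
    truncated from the end: YEAR [MONTH [DAY [TIME]]].
    """
    # Simplification: everything after the first '#' is treated as a comment.
    tokens = line.split("#", 1)[0].split()
    if len(tokens) < 5 or tokens[0] != "Zone":
        raise ValueError("not a full Zone line: %r" % line)
    name, stdoff, rules, fmt = tokens[1:5]
    until = tokens[5:]
    if len(until) > len(UNTIL_LABELS):
        raise ValueError("too many UNTIL subfields: %r" % line)
    return {
        "name": name,
        "stdoff": stdoff,
        "rules": rules,
        "format": fmt,
        "until": dict(zip(UNTIL_LABELS, until)),
    }
```

For the first Amsterdam line quoted above, the "until" entry holds only
a year; a line ending in "1937 Jul 1" would yield year, month, and day.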
In this regard, a recent addition to the tzcode tarball is
zoneinfo2tdf.pl, which translates the more free-with-spaces zone tzdata
into a form which strictly uses a single tab between fields. This may
make life easier for some by simplifying their parser's requirements.
(Or not.)
--Ken Pizzini