Why is "AEST" the abbreviation for Australia/Sydney in 1900?

Given this snippet from tzdata.zi:
    Z Australia/Sydney 10:4:52 - LMT 1895 F
    10 AU AE%sT 1971
    R AU 1917 o - Ja 1 2s 1 D
    R AU 1917 o - Mar lastSu 2s 0 S
    R AU 1942 o - Ja 1 2s 1 D
    R AU 1942 o - Mar lastSu 2s 0 S
    R AU 1942 o - S 27 2s 1 D
    R AU 1943 1944 - Mar lastSu 2s 0 S
    R AU 1943 o - O 3 2s 1 D
What should the abbreviation be for the zone in, say, 1900?
Using glibc I get AEST:
    TZ=Australia/Sydney date --date="1900-01-01"
    Mon 1 Jan 00:00:00 AEST 1900
But I don't see any way to get "AEST" from the tzdata and the spec in the zic(8) man page.
Prior to 1895 it is "LMT", and then from 1917-01-01 it's "AEDT", and from 1917-03-25 it's "AEST", but none of the Rule lines apply to dates before 1917, so what value of LETTER/S should be used for the AE%sT format in 1900?
Glibc solves this by iterating through all the transitions for the AU rule until it finds a valid name for a transition to DST and a valid name for a transition out of DST, and then remembers those two names, which in this case gives AEDT (from 1917-01-01 to 1917-03-25) and AEST (from 1917-03-25 to 1942-01-01). Then when it finds the info for a date in 1900 it looks at the isdst field of struct tm for that date, and picks one of the two remembered names, which gives us "AEST". That seems like the correct result, but nothing in the zic man page (or any other docs I can find) says that is what's supposed to happen.
My code for GCC's C++ library is an independent implementation based on the spec, and it has no way to produce "AEST" for 1900.
It seems to me that either the spec should do ... something ... so that a LETTER/S field for the Rule can be found, or there should be a Zone continuation line in the tzdata that explicitly says that Australia/Sydney is always on standard time for the period 1895-02-01 to 1917-01-01 and that the abbreviation is AEST (without any %s that needs to be expanded). So add the second line here:
    Z Australia/Sydney 10:4:52 - LMT 1895 F
    10 - AEST 1917
    10 AU AE%sT 1971
    R AU 1917 o - Ja 1 2s 1 D
    R AU 1917 o - Mar lastSu 2s 0 S
    R AU 1942 o - Ja 1 2s 1 D
    R AU 1942 o - Mar lastSu 2s 0 S
    R AU 1942 o - S 27 2s 1 D
    R AU 1943 1944 - Mar lastSu 2s 0 S
    R AU 1943 o - O 3 2s 1 D
I'm sure this question could also be asked of many other zones; Australia/Sydney just happens to be the one I was testing with when I noticed that my code couldn't expand the %s in "AE%sT".
Am I missing something that says to use "S" as the LETTER/S field for dates prior to the first DST transition in 1917? What logic do other C libraries use to get "AEST" here?
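For readers following along, here is a rough Python sketch of the two-name heuristic as it is described in the message above. It is not glibc's code: the names Transition, remembered_names and abbrev_for are hypothetical, and the transition list is hand-built from the dates quoted in the message rather than read from a TZif file.

```python
from typing import List, NamedTuple, Optional, Tuple


class Transition(NamedTuple):
    when: str     # ISO date of the transition (simplified; real data uses timestamps)
    is_dst: bool  # True if the transition switches to daylight saving time
    abbrev: str   # precomposed abbreviation stored with the transition


def remembered_names(transitions: List[Transition]) -> Tuple[Optional[str], Optional[str]]:
    """Scan transitions in order and keep the first standard and first DST name."""
    std = dst = None
    for t in transitions:
        if t.is_dst:
            if dst is None:
                dst = t.abbrev
        elif std is None:
            std = t.abbrev
        if std is not None and dst is not None:
            break
    return std, dst


def abbrev_for(is_dst: bool, transitions: List[Transition]) -> Optional[str]:
    """Pick one of the two remembered names based on the isdst flag."""
    std, dst = remembered_names(transitions)
    return dst if is_dst else std


# Hypothetical transition list built from the dates quoted in the message.
SYDNEY = [
    Transition("1917-01-01", True, "AEDT"),
    Transition("1917-03-25", False, "AEST"),
    Transition("1942-01-01", True, "AEDT"),
]

# A date in 1900 precedes every transition and is not DST.
print(abbrev_for(False, SYDNEY))  # AEST
```

Note that this picks a whole precomposed abbreviation and never looks at LETTER/S, which is a different mechanism from expanding %s in a Zone's FORMAT.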

On 2024-04-30 06:46, Jonathan Wakely via tz wrote:
Am I missing something that says to use "S" as the LETTER/S field for dates prior to the first DST transition in 1917? What logic do other C libraries use to get "AEST" here?
It is default standard time LMT, STD, ... until it is not? You do not want to need rules for all the tropical (or polar) zones where there are no summer time changes.
Read https://data.iana.org/time-zones/tz-how-to.html or /usr/share/doc/tzcode/tz-how-to.html then look at the zic code for defaulting time types, calculating tt_desigidx, and adding to tzh_charcnt as in tzfile(5).
--
Take care. Thanks, Brian Inglis
Calgary, Alberta, Canada
La perfection est atteinte                   Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter  not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer     but when there is no more to cut
                                -- Antoine de Saint-Exupéry

On Tue, 30 Apr 2024 at 19:57, Brian Inglis via tz <tz@iana.org> wrote:
It is default standard time LMT, STD, ... until it is not?
Yes, the question is not whether the zone uses STD or DST. I thought I was clear that I'm only talking about how to handle %s in a Zone's FORMAT. I'm not asking "why did Australia/Sydney use STD before it started using DST?" because that is self-explanatory. The question is about what value of LETTER/S to use for the format AE%sT in 1900. The %s should be replaced by the LETTER/S field of the rule in effect at the time, but there is no AU rule that is in effect prior to 1917. So how does AE%sT get expanded to AEST?
You do not want to need rules for all the tropical (or polar) zones where there are no summer time changes.
Those zones don't use %s so the question doesn't arise. They don't have transitions, so there's no variable part of their FORMAT fields.
Read https://data.iana.org/time-zones/tz-how-to.html or /usr/share/doc/tzcode/tz-how-to.html then look at the zic code for defaulting time types, calculating tt_desigidx, and adding to tzh_charcnt as in tzfile(5).
The America/Chicago zone in the how-to doc has the same problem. After 1883 the zone uses the "US" rules, and the format is C%sT. So what is the LETTER/S field for Chicago in 1900? Which of the "US" rules is in effect in 1900 and so provides the LETTER/S field?
The how-to document even says this explicitly!
    "The last two make sense only if there’s a named rule in effect."
Where "the last two" refers to using / or %s in a Zone's FORMAT field.
But that document also provides the answer:
    "One wrinkle, not fully explained in zic.8.txt, is what happens when switching to a named rule. To what values should the SAVE and LETTER data be initialized? [...] If switching to a named rule before any transition has happened, assume standard time (SAVE zero), and use the LETTER data from the earliest transition with a SAVE of zero."
And that matches what glibc does: it finds the earliest transition with SAVE of zero, and uses that LETTER for the period before the first transition.
Should this "wrinkle" be documented in the zic(8) man page? Is there some other source of truth documenting that behaviour, or is the how-to document the only place?
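The "wrinkle" quoted from the how-to document can be sketched roughly as follows. This is a simplified Python illustration, not zic's implementation: the Rule type and the helper names are hypothetical, rules are reduced to a pre-resolved start date plus SAVE and LETTER, and only a few of the AU transitions mentioned in this thread are listed.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    start: str   # ISO date the rule takes effect (already resolved, simplified)
    save: int    # SAVE in hours; 0 means standard time
    letter: str  # LETTER/S field; "-" means an empty string


def initial_letter(rules: List[Rule]) -> str:
    """LETTER of the earliest transition whose SAVE is zero."""
    for r in sorted(rules, key=lambda r: r.start):
        if r.save == 0:
            return r.letter
    raise ValueError("no standard-time rule, so %s cannot be expanded")


def expand(fmt: str, letter: str) -> str:
    """Substitute LETTER/S for %s in a Zone FORMAT."""
    return fmt.replace("%s", "" if letter == "-" else letter)


def abbreviation(fmt: str, rules: List[Rule], date: str) -> str:
    """Abbreviation on an ISO date, falling back to initial_letter()."""
    active = [r for r in rules if r.start <= date]
    if active:
        letter = max(active, key=lambda r: r.start).letter
    else:
        # Before any transition: assume standard time and borrow its LETTER.
        letter = initial_letter(rules)
    return expand(fmt, letter)


# A few AU transitions taken from the dates given in this thread.
AU = [
    Rule("1917-01-01", 1, "D"),
    Rule("1917-03-25", 0, "S"),
    Rule("1942-01-01", 1, "D"),
]

print(abbreviation("AE%sT", AU, "1900-01-01"))  # AEST
print(abbreviation("AE%sT", AU, "1917-02-01"))  # AEDT
```

The fallback branch is the only part the zic(8) man page did not spell out at the time of this message; everything else is the ordinary "rule in effect" lookup.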

* Jonathan Wakely via tz:
And that matches what glibc does: it finds the earliest transition with SAVE of zero, and uses that LETTER for the period before the first transition.
Do you mean zic instead of glibc? I don't think we have a .zi parser in glibc proper. We mostly bundle zic for historic reasons. Thanks, Florian

On Thu, 2 May 2024 at 08:18, Florian Weimer wrote:
* Jonathan Wakely via tz:
And that matches what glibc does: it finds the earliest transition with SAVE of zero, and uses that LETTER for the period before the first transition.
Do you mean zic instead of glibc? I don't think we have a .zi parser in glibc proper. We mostly bundle zic for historic reasons.
I mean glibc code such as: https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzfile.c;h=41475399643... The comment says it's finding the offsets, but it seems to be finding the abbreviations for the first two transitions, storing them in the __tzname array. There's similar logic here: https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzfile.c;h=41475399643... When I ran `TZ=Australia/Sydney date -d 1900-01-01` under gdb that seemed to be the code that decided on AEST, but maybe I got confused.

* Jonathan Wakely:
On Thu, 2 May 2024 at 08:18, Florian Weimer wrote:
* Jonathan Wakely via tz:
And that matches what glibc does: it finds the earliest transition with SAVE of zero, and uses that LETTER for the period before the first transition.
Do you mean zic instead of glibc? I don't think we have a .zi parser in glibc proper. We mostly bundle zic for historic reasons.
I mean glibc code such as: https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzfile.c;h=41475399643... The comment says it's finding the offsets, but it seems to be finding the abbreviations for the first two transitions, storing them in the __tzname array.
There's similar logic here: https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzfile.c;h=41475399643...
When I ran `TZ=Australia/Sydney date -d 1900-01-01` under gdb that seemed to be the code that decided on AEST, but maybe I got confused.
What I meant is that glibc isn't looking at LETTER, it's picking a precomposed time zone identifier. Thanks, Florian

On 2024-05-02 01:10, Jonathan Wakely via tz wrote:
I mean glibc code such as: https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzfile.c;h=41475399643... The comment says it's finding the offsets, but it seems to be finding the abbreviations for the first two transitions, storing them in the __tzname array.
That code is present for a different reason. It's trying to support obsolescent (but POSIX-required) variables like tzname, which in general make no sense when TZif files are used. (The next POSIX draft tries to fix this, but it's messed up, and I haven't had time to interact with the POSIX committee to clean things up.) These obsolescent compatibility variables are not related to the question of what is the proper abbreviation to use for timestamps in 1908.
There's similar logic here: https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzfile.c;h=41475399643...
This code is present for yet another different reason. In this area, glibc uses a heuristic in which it infers a time type before the first transition by looking for the first time type that uses standard time. This heuristic is related to (but differs from) a similar heuristic that was introduced to tzcode in 1986 and has evolved over the years to fix various issues with it.
Unfortunately the heuristic has been so problematic (e.g., tzcode differs from glibc which surely differs from other TZif readers) that Internet RFC 8536 prohibits the practice, and says that TZif readers should simply use time type 0 for timestamps before the first transition. The tzcode heuristic is present only for backwards compatibility to old (and now nonstandard) versions of zic.
Now is a good time to fix tzcode, so that it conforms to Internet RFC 8536 in this area. I installed the attached proposed patch. Glibc should also be fixed to conform to the RFC, but that is a separate matter.
Anyway to get back to your original email, as Florian mentioned this is not an issue about the TZif-reading code in glibc proper; it's an issue about zic, the TZif writer that glibc merely copies from tzcode. Arthur wrote some email about that to clarify things, and I plan to follow up to his email.
Thanks for bringing this up, as it reminded me the time is ripe for the reference implementation to conform to the RFC in this area. (And sorry for the belated reply; things have been hectic around UCLA lately....)
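To make the "time type 0" behaviour concrete, here is a small Python sketch of how a TZif reader might select a time type without any heuristic. It is an illustration under stated assumptions, not tzcode or glibc source: time_type_index is a hypothetical helper, and the transition times are toy integers rather than real timestamps.

```python
from bisect import bisect_right
from typing import List


def time_type_index(transition_times: List[int],
                    transition_types: List[int],
                    t: int) -> int:
    """Return the index of the local time type in effect at time t.

    transition_times must be sorted ascending; transition_types[i] is the
    time type that starts at transition_times[i].
    """
    i = bisect_right(transition_times, t)
    if i == 0:
        # Before the first transition: use time type 0, with no guessing
        # about which later type "looks like" standard time.
        return 0
    return transition_types[i - 1]


# Toy data (assumed for illustration): type 0 = LMT, 1 = AEST, 2 = AEDT.
ABBREVS = ["LMT", "AEST", "AEDT"]
TIMES = [100, 200, 300]
TYPES = [1, 2, 1]

print(ABBREVS[time_type_index(TIMES, TYPES, 50)])   # LMT
print(ABBREVS[time_type_index(TIMES, TYPES, 250)])  # AEDT
```

With this approach the burden of choosing the right pre-transition abbreviation falls entirely on the TZif writer (zic), which must emit it as time type 0.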

* Paul Eggert:
On 2024-05-02 01:10, Jonathan Wakely via tz wrote:
I mean glibc code such as: https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzfile.c;h=41475399643... The comment says it's finding the offsets, but it seems to be finding the abbreviations for the first two transitions, storing them in the __tzname array.
That code is present for a different reason. It's trying to support obsolescent (but POSIX-required) variables like tzname, which in general make no sense when TZif files are used. (The next POSIX draft tries to fix this, but it's messed up, and I haven't had time to interact with the POSIX committee to clean things up.)
These obsolescent compatibility variables are not related to the question of what is the proper abbreviation to use for timestamps in 1908.
Unfortunately, __tzname is also used internally to ferry around information, see __tz_compute and the way it uses that to set tm_zone. We should not be doing that, but it's the code we currently have. 8-( Thanks, Florian

On 2024-05-02 11:31, Paul Eggert via tz wrote:
On 2024-05-02 01:10, Jonathan Wakely via tz wrote:
I mean glibc code such as: https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzfile.c;h=41475399643... The comment says it's finding the offsets, but it seems to be finding the abbreviations for the first two transitions, storing them in the __tzname array.
That code is present for a different reason. It's trying to support obsolescent (but POSIX-required) variables like tzname, which in general make no sense when TZif files are used. (The next POSIX draft tries to fix this, but it's messed up, and I haven't had time to interact with the POSIX committee to clean things up.)
It looks like comments on drafts offered have been minimal, and voting appears to be imminent, so you may wish to respond soon, or defer to TC1.
--
Take care. Thanks, Brian Inglis

A hand-waving explanation of AEST.
For a given zone, each line describing the zone except for the last ends with an until time. The creation of the transition time entry for the until time is deferred until the following zone line has been completely processed. (It happens at the bottom of the giant for loop in outzone.) That deferral means that the time zone abbreviations in use have been computed, so the appropriate abbreviation can be applied to the until time.
@dashdashado

On 2024-04-30 16:21, Arthur David Olson via tz wrote:
For a given zone, each line describing the zone except for the last ends with an until time. The creation of the transition time entry for the until time is deferred until the following zone line has been completely processed. (It happens at the bottom of the giant for loop in outzone.) That deferral means that the time zone abbreviations in use have been computed, so the appropriate abbreviation can be applied to the until time.
OK, but this wouldn't address the issue of what happens with the last zone line, as that lacks an until time. For example:
    Rule Aus 1917 only - Jan 1 2:00 1:00 D
    Rule Aus 1917 only - Mar 1 2:00 0 S
    Zone Australia/Sydney 10:00 Aus AE%sT
zic generates a TZif file where time type 0 (the time type before 1917) uses the abbreviation "AEST" - but where did that "S" come from? The documentation doesn't say. I think this is Jonathan's main point.
When computing %s for timestamps that come before the earliest rule, zic uses the LETTER/S field of the earliest rule that specifies standard time. I installed the attached to try to document this.

On 2024-05-02 12:35, Paul Eggert via tz wrote:
On 2024-04-30 16:21, Arthur David Olson via tz wrote:
For a given zone, each line describing the zone except for the last ends with an until time. The creation of the transition time entry for the until time is deferred until the following zone line has been completely processed. (It happens at the bottom of the giant for loop in outzone.) That deferral means that the time zone abbreviations in use have been computed, so the appropriate abbreviation can be applied to the until time.
OK, but this wouldn't address the issue of what happens with the last zone line, as that lacks an until time. For example:
    Rule Aus 1917 only - Jan 1 2:00 1:00 D
    Rule Aus 1917 only - Mar 1 2:00 0 S
    Zone Australia/Sydney 10:00 Aus AE%sT
zic generates a TZif file where time type 0 (the time type before 1917) uses the abbreviation "AEST" - but where did that "S" come from? The documentation doesn't say. I think this is Jonathan's main point.
When computing %s for timestamps that come before the earliest rule, zic uses the LETTER/S field of the earliest rule that specifies standard time. I installed the attached to try to document this.
I would expect it to default to "S" to allow "%s" in zones without rules.
--
Take care. Thanks, Brian Inglis

On Thu, 2 May 2024 at 23:25, brian.inglis--- via tz <tz@iana.org> wrote:
On 2024-05-02 12:35, Paul Eggert via tz wrote:
On 2024-04-30 16:21, Arthur David Olson via tz wrote:
For a given zone, each line describing the zone except for the last ends with an until time. The creation of the transition time entry for the until time is deferred until the following zone line has been completely processed. (It happens at the bottom of the giant for loop in outzone.) That deferral means that the time zone abbreviations in use have been computed, so the appropriate abbreviation can be applied to the until time.
OK, but this wouldn't address the issue of what happens with the last zone line, as that lacks an until time. For example:
    Rule Aus 1917 only - Jan 1 2:00 1:00 D
    Rule Aus 1917 only - Mar 1 2:00 0 S
    Zone Australia/Sydney 10:00 Aus AE%sT
zic generates a TZif file where time type 0 (the time type before 1917) uses the abbreviation "AEST" - but where did that "S" come from? The documentation doesn't say. I think this is Jonathan's main point.
When computing %s for timestamps that come before the earliest rule, zic uses the LETTER/S field of the earliest rule that specifies standard time. I installed the attached to try to document this.
I would expect it to default to "S" to allow "%s" in zones without rules.
Wouldn't that be wrong for several European countries using CE%sT where the LETTER/S for standard time is "-" and for daylight savings time is "S"?
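A small Python sketch of this objection, for illustration only. The rule lists are simplified stand-ins (just SAVE and LETTER in chronological order, not copied from tzdata), and hardcoded_default and earliest_standard_letter are hypothetical names.

```python
from typing import List, Tuple

Rule = Tuple[int, str]  # (SAVE in hours, LETTER/S); "-" means an empty letter


def expand(fmt: str, letter: str) -> str:
    """Substitute LETTER/S for %s in a Zone FORMAT."""
    return fmt.replace("%s", "" if letter == "-" else letter)


def hardcoded_default(fmt: str) -> str:
    """The proposal under discussion: always assume the letter is "S"."""
    return expand(fmt, "S")


def earliest_standard_letter(rules: List[Rule]) -> str:
    """What zic is described as doing: LETTER of the earliest SAVE-zero rule."""
    for save, letter in rules:
        if save == 0:
            return letter
    raise ValueError("no standard-time rule")


# Simplified, illustrative rule sets in chronological order.
AU_RULES: List[Rule] = [(1, "D"), (0, "S")]  # standard time uses "S"
CE_RULES: List[Rule] = [(1, "S"), (0, "-")]  # summer time uses "S"

# For AE%sT both approaches agree.
print(hardcoded_default("AE%sT"), expand("AE%sT", earliest_standard_letter(AU_RULES)))
# For CE%sT the hard-coded "S" yields the summer-time abbreviation instead.
print(hardcoded_default("CE%sT"), expand("CE%sT", earliest_standard_letter(CE_RULES)))
```

The second line of output shows the mismatch: a hard-coded "S" produces "CEST" for a period that is on standard time, whereas the rule-derived letter produces "CET".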

On 2024-05-03 04:14, Jonathan Wakely wrote:
On Thu, 2 May 2024 at 23:25, brian.inglis--- via tz <tz@iana.org> wrote:
On 2024-05-02 12:35, Paul Eggert via tz wrote:
On 2024-04-30 16:21, Arthur David Olson via tz wrote:
For a given zone, each line describing the zone except for the last ends with an until time. The creation of the transition time entry for the until time is deferred until the following zone line has been completely processed. (It happens at the bottom of the giant for loop in outzone.) That deferral means that the time zone abbreviations in use have been computed, so the appropriate abbreviation can be applied to the until time.
OK, but this wouldn't address the issue of what happens with the last zone line, as that lacks an until time. For example:
    Rule Aus 1917 only - Jan 1 2:00 1:00 D
    Rule Aus 1917 only - Mar 1 2:00 0 S
    Zone Australia/Sydney 10:00 Aus AE%sT
zic generates a TZif file where time type 0 (the time type before 1917) uses the abbreviation "AEST" - but where did that "S" come from? The documentation doesn't say. I think this is Jonathan's main point.
When computing %s for timestamps that come before the earliest rule, zic uses the LETTER/S field of the earliest rule that specifies standard time. I installed the attached to try to document this.
I would expect it to default to "S" to allow "%s" in zones without rules.
Wouldn't that be wrong for several European countries using CE%sT where the LETTER/S for standard time is "-" and for daylight savings time is "S"?
There are no zones without rules in Europe data, and I never noticed that "S" is in alternate rather than standard time periods. In French Canada, it is HNE Normale rather than Hiver or HAE Avancée rather than Été.
As I said, the default is for zones without rules where you could assign letters, only standard time, about 1/3 of the total, depending on what you total, to allow "%s" in the abbreviation, which would have to default to "S".
Whereas in zones with rules you may have "D", sometimes "M" for Midsummer, and the whole abbreviation for UK and Russia: I wondered about MSK and in Cyrillic it appears to be MCK. As the project and data is in English and not localized, that has to be handled by downstreams like ICU or apps.
--
Take care. Thanks, Brian Inglis

On 2024-05-03 13:33, brian.inglis--- via tz wrote:
As I said, the default is for zones without rules where you could assign letters, only standard time, about 1/3 of the total, depending on what you total, to allow "%s" in the abbreviation, which would have to default to "S".
Unfortunately I also don't understand what you're saying. Are you suggesting that zic's behavior should change? or merely that zic's documentation should change? If the former, can you give an example of how zic's behavior should differ from what it's doing now? If the latter, what wording in the documentation should change?

On 2024-05-03 15:58, Paul Eggert wrote:
On 2024-05-03 13:33, brian.inglis--- via tz wrote:
As I said, the default is for zones without rules where you could assign letters, only standard time, about 1/3 of the total, depending on what you total, to allow "%s" in the abbreviation, which would have to default to "S".
Unfortunately I also don't understand what you're saying. Are you suggesting that zic's behavior should change? or merely that zic's documentation should change?
If the former, can you give an example of how zic's behavior should differ from what it's doing now? If the latter, what wording in the documentation should change?
I am suggesting that "%s" should be allowed in zone lines, even in zones where rules and letters are not used, so should default to "S" or "%z" or something.
--
Take care. Thanks, Brian Inglis

On Fri, 3 May 2024 at 23:23, brian.inglis--- via tz <tz@iana.org> wrote:
On 2024-05-03 15:58, Paul Eggert wrote:
On 2024-05-03 13:33, brian.inglis--- via tz wrote:
As I said, the default is for zones without rules where you could assign letters, only standard time, about 1/3 of the total, depending on what you total, to allow "%s" in the abbreviation, which would have to default to "S".
Unfortunately I also don't understand what you're saying. Are you suggesting that zic's behavior should change? or merely that zic's documentation should change?
If the former, can you give an example of how zic's behavior should differ from what it's doing now? If the latter, what wording in the documentation should change?
I am suggesting that "%s" should be allowed in zone lines, even in zones where rules and letters are not used, so should default to "S" or "%z" or something.
But why? In a zone with no rules, why would you want to use X%sX and have it expand to XSX, when XSX is shorter and simpler to use directly? Why would you want %s to default to %z when you could just use %z?

On 2024-05-03 15:22, brian.inglis--- via tz wrote:
I am suggesting that "%s" should be allowed in zone lines, even in zones where rules and letters are not used, so should default to "S" or "%z" or something.
I don't see how this change would help. It wouldn't shorten the existing data entries or make them easier to maintain, and it'd raise the probability of uncaught typos.

On 2024-05-03 16:41, Paul Eggert wrote:
On 2024-05-03 15:22, brian.inglis--- via tz wrote:
I am suggesting that "%s" should be allowed in zone lines, even in zones where rules and letters are not used, so should default to "S" or "%z" or something.
I don't see how this change would help. It wouldn't shorten the existing data entries or make them easier to maintain, and it'd raise the probability of uncaught typos.
Consistency/orthogonality, so you have a default, and do not have to check whether there are rules, or does zic disallow %s without a rule named, and should whatever is dis-/allowed be added to the docs?
--
Take care. Thanks, Brian Inglis

On 2024-05-03 18:09, brian.inglis--- via tz wrote:
does zic disallow %s without a rule named, and should whatever is dis-/allowed be added to the docs?
Yes it does disallow %s in that situation. I tried to capture that in the recent doc change <https://github.com/eggert/tz/commit/1e75b31fa9d110aeb2ed5fa13bed5f5c8ec633e8> which says that a rule must exist for %s one way or the other.

On Thu, 2 May 2024 at 19:35, Paul Eggert wrote:
On 2024-04-30 16:21, Arthur David Olson via tz wrote:
For a given zone, each line describing the zone except for the last ends with an until time. The creation of the transition time entry for the until time is deferred until the following zone line has been completely processed. (It happens at the bottom of the giant for loop in outzone.) That deferral means that the time zone abbreviations in use have been computed, so the appropriate abbreviation can be applied to the until time.
OK, but this wouldn't address the issue of what happens with the last zone line, as that lacks an until time. For example:
    Rule Aus 1917 only - Jan 1 2:00 1:00 D
    Rule Aus 1917 only - Mar 1 2:00 0 S
    Zone Australia/Sydney 10:00 Aus AE%sT
zic generates a TZif file where time type 0 (the time type before 1917) uses the abbreviation "AEST" - but where did that "S" come from? The documentation doesn't say. I think this is Jonathan's main point.
When computing %s for timestamps that come before the earliest rule, zic uses the LETTER/S field of the earliest rule that specifies standard time. I installed the attached to try to document this.
The patch looks great and resolves my concern, thanks!
participants (6)
- Arthur David Olson
- Brian Inglis
- brian.inglis@systematicsw.ab.ca
- Florian Weimer
- Jonathan Wakely
- Paul Eggert