    Date:        Thu, 2 Mar 2023 15:49:09 -0800
    From:        Paul Eggert via tz <tz@iana.org>
    Message-ID:  <555133d4-2fcc-0a25-e4a0-1ad0a569e661@cs.ucla.edu>

  | POSIX does not specify TZ strings like TZ='EST5EDT'; they are TZDB
  | extensions. If you want a TZ string whose meaning is specified, you need
  | something like TZ='EST5EDT,M3.2.0,M11.1.0'.

That's not correct, otherwise TZ=UTC0 would not be POSIX, and it certainly
is (there's obviously no way to specify summer time, or the times when it
begins and ends, for UTC).

  | You can see this by looking a few lines before the lines you quoted,
  | which say that a TZ string contents are "std offset dst offset, rule".

That's just a generic hint; the actual specification comes later (and is
unchanged in the most recent drafts, other than the reference to "for all
TZs whose..." which has been altered to allow tzdata type names also).
It includes:

    The expanded format (for all TZs whose value does not have a
    <colon> as the first character) is as follows:

        stdoffset[dst[offset][,start[/time],end[/time]]]

    Where:

    std and dst
        Indicate no less than three, nor more than {TZNAME_MAX},
        bytes that are the designation for the standard (std) or
        the alternative (dst--such as Daylight Savings Time)
        timezone. Only std is required; if dst is missing, then
        the alternative time does not apply in this locale.

That makes it quite clear that XXXn is all that is needed for a "POSIX" TZ
specification, though anything more that is included must meet the
required syntax.

The current standard doesn't say what to do when "dst" is given but the
rule (everything after the first comma) is not, making it "implicitly"
unspecified. The latest draft (currently available) of the forthcoming
standard is clearer; it adds:

    If the dst field is specified and the rule field is not, it is
    implementation-defined when the changes to and from Daylight
    Saving Time occur.

It shouldn't say "Daylight Saving Time" there; while the symbol in the
grammar is "dst", it is otherwise described as the "alternative timezone",
and should be there as well. That may have already been fixed; if not, it
will be before the new version of the standard gets published.

Also note that sometime in the future this "POSIX TZ" format will almost
certainly be deprecated, and then removed. I believe there is now general
acceptance that it is simply inadequate for real world timezones, other
than the simplest ones - and is never able to describe anything other than
a single shift to & from a single alternative timezone in one year (though
the "single shift" could be fixed by allowing more than one set of
start,end pairs - similarly, more XXXn fields could be added to specify
more different zone offsets than just the two currently possible, but I
very much doubt that anyone is going to work out how to specify that,
particularly not if no-one is stupid enough to try to implement some
version of this).

The current draft also contains this:

    Daylight Saving Time is in effect all year if it starts January 1
    at 00:00 and ends December 31 at 24:00 plus the difference between
    Daylight Saving Time and standard time, leaving no room for standard
    time in the calendar. For example, TZ='EST5EDT,0/0,J365/25'
    represents a time zone that observes Daylight Saving Time all year,
    being 4 hours west of UTC with abbreviation "EDT".

which suggests an intent to be able to support "permanent summer time",
though that complicated mess achieves little more than TZ=EDT4 except for
the value of tm_isdst, which as you mentioned in an earlier message really
has no effect on anything - it was more or less an index into the tzname[]
array, which it doesn't do well at: while that array has no defined upper
bound on its index, tm_isdst is only permitted to be 0 or 1. That's all
now largely obsoleted (though not yet in the standard) by tm_zone (and
tm_gmtoff) (which will be in the standard).
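
As a rough illustration of the expanded format quoted above (not taken
from the standard, and not how any particular libc does it), the fields
can be picked apart with a short Python sketch. The names split_posix_tz
and offset_seconds are made up for this example, only plain alphabetic
designations are handled, and the rule is carried along without being
checked.

```python
import re

# Alphabetic designation of three or more letters, then an offset of the
# form [+-]hh[:mm[:ss]].  This sketch only handles plain letters.
_NAME = r"[A-Za-z]{3,}"
_OFFSET = r"[+-]?\d{1,2}(?::\d{2}(?::\d{2})?)?"
_TZ = re.compile(
    rf"(?P<std>{_NAME})(?P<stdoff>{_OFFSET})"
    rf"(?:(?P<dst>{_NAME})(?P<dstoff>{_OFFSET})?)?"
    rf"(?:,(?P<rule>.+))?"
)


def offset_seconds(text):
    """Convert [+-]hh[:mm[:ss]] to seconds; positive means west of UTC."""
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("+-").split(":")]
    parts += [0] * (3 - len(parts))
    return sign * (parts[0] * 3600 + parts[1] * 60 + parts[2])


def split_posix_tz(value):
    """Split a TZ value into its fields, or return None if it does not fit."""
    m = _TZ.fullmatch(value)
    # A rule is only allowed when a dst designation is present.
    if m is None or (m["rule"] and not m["dst"]):
        return None
    std_west = offset_seconds(m["stdoff"])
    fields = {"std": m["std"], "std_west": std_west,
              "dst": m["dst"], "dst_west": None, "rule": m["rule"]}
    if m["dst"]:
        # With no explicit dst offset, dst is one hour ahead of std.
        fields["dst_west"] = (offset_seconds(m["dstoff"]) if m["dstoff"]
                              else std_west - 3600)
    return fields


if __name__ == "__main__":
    for tz in ("UTC0", "EST5EDT", "EST5EDT,M3.2.0,M11.1.0",
               "EST5EDT,0/0,J365/25", "EDT4"):
        print(tz, split_posix_tz(tz))
```

With this, "UTC0" yields std alone, "EST5EDT" yields a dst with no rule
(the case the draft calls implementation-defined), and both
"EST5EDT,0/0,J365/25" and "EDT4" end up 14400 seconds (4 hours) west of
UTC, the former as dst and the latter as std.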

On POSIX TZ being inadequate, note that the standard already says:

    Implementations are encouraged to use the time zone database
    maintained by IANA to determine when Daylight Saving Time changes
    occur and to handle TZ values that start with a <colon>. See RFC 6557.

That is, it has already been noted that POSIX TZ isn't really good enough.
In the next draft, that will be altered to say:

    Implementations are encouraged to incorporate the IANA timezone
    database into the timezone database used for TZ values specifying
    geographical and special timezones, and to provide a way to update
    it in accordance with RFC 6557.

POSIX TZ strings are on their way to oblivion, fortunately.

However, while they remain (which will be at least until the next (major)
version of the standard after the coming one - ie: at least another decade)
the specification is that if the TZ value can be interpreted as a valid
POSIX TZ string, then that is what it is. If that fails - which it will
for anything which does not start xxxxN (at least 3 chars in the xxxx
field), but can for many other reasons as well - then it is to be
interpreted (if possible) as a geographic/special TZ string (eg: as a
tzdata zone name).

And while I'm here, in an earlier message you said:

  | If common practice becomes "ET" we couldn't use that,
  | unfortunately, as POSIX requires at least three characters.

That's also incorrect. It is true that to use a POSIX TZ string in the
form normally seen in the wild, like TZ=UTC0 (as above), the "std" field
(and the "dst" field, if given) must be at least 3 chars. But that field
is allowed to be in what POSIX calls "quoted" format, where the first char
is '<' and the last is '>', and those two count in the required 3 chars
but are not part of the name created (the minimum is three so that in
quoted form there is at least one meaningful character remaining;
TZ='<>0' isn't valid). There is no problem with TZ='<Z>0' if you want to
set "zulu" time.
That has 3 chars of "std", but the quoting chars aren't part of the tzname
defined, leaving just "Z".

This 3 char rule also applies only to POSIX form TZ strings; the zone names
specified by tzdata format TZ specifications (or whatever other provider
of timezone data an implementation chooses to use) have no such
restriction. There's no reason at all tzdata could not use "ET" if it
wanted to (even now it really makes more sense to call what is currently
EST and EDT just "ET"; all anyone really cares about is that it is eastern
(US) time. USET would be better, as other places have an "east" too, and
some of them have timezones that apply in their eastern areas - and that
is > 3 chars...)

Even a POSIX TZ string can handle that: TZ='<ET>5<ET>4,whatever' should
work on any conforming implementation, right now (with a suitable value
filled in for "whatever" of course, or with it and its preceding comma
omitted - in which case the implementation is expected to supply the rule
for when the switch occurs, but nothing, anywhere, requires that rule to be
in any way consistent with any actual timezone on the planet, or to supply
any actual switch times at all).

kre
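
To make the quoted form concrete, here is a small Python sketch of
reading one designation from a TZ value. It follows the reading given
above, in which the '<' and '>' count toward the three-character minimum
but are not part of the resulting name; it is not a statement of what any
given implementation accepts. The function name read_designation is
invented for this example.

```python
import re

# Inside <...>, POSIX allows alphanumerics plus '+' and '-'.
_QUOTED = re.compile(r"<([A-Za-z0-9+-]*)>")
# The unquoted form is alphabetic characters only.
_PLAIN = re.compile(r"[A-Za-z]+")


def read_designation(value, pos=0):
    """Return (name, next_pos) for the designation at value[pos:], else None.

    The three-character minimum is applied to the field as written, so
    the angle brackets of the quoted form count toward it.
    """
    m = _QUOTED.match(value, pos) or _PLAIN.match(value, pos)
    if m is None or len(m.group(0)) < 3:
        return None
    # Strip the quoting characters from the name in the quoted form.
    name = m.group(1) if value[pos] == "<" else m.group(0)
    return name, m.end()


if __name__ == "__main__":
    for tz in ("UTC0", "<Z>0", "<>0", "ET5", "<ET>5<ET>4,whatever"):
        print(tz, read_designation(tz))
```

Here "<Z>0" gives the name "Z", while "<>0" and the unquoted "ET5" are
both rejected. For "<ET>5<ET>4,whatever" the first call returns "ET" and
a second call starting at position 5 (just past the "5") returns "ET"
again for the dst field.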