On 2023-03-03 14:47, Robert Elz wrote:
| That could lead to problems, as Internet RFC 8536 relies on POSIX TZ format,
If it relies upon it by reference, then it should probably start being updated to specify whatever it needs itself. Just in case.
That's not strictly necessary, as the RFC specifies the POSIX version, so even if POSIX comes out with a new version the RFC is still valid. When writing RFC 8536 I didn't want to duplicate the POSIX spec. I wanted to refer to an existing standard; that way, we could avoid errors that inevitably arise when duplicating, and readers could easily see that they can reuse their POSIX code to implement the spec. This sort of thing is common practice.
| and by lots of other downstream code.
What kind? I doubt that anything other than tzset() and related stuff ever parses a TZ string's contents, though I guess someone might have written a TZ string -> what it means converter, to help users get that right.
There's a partial list at <https://data.iana.org/time-zones/tz-link.html#TZif>. There's plenty of other code like that, both to parse TZif files and to deal with other uses of POSIX TZ strings. A quick search reports <https://github.com/Ryujinx/Ryujinx/blob/master/Ryujinx.HLE/HOS/Services/Time...> for example; this is part of a Nintendo Switch emulator written in C#.
As long as it remains in POSIX, users can keep insisting upon their right to use those strings, and implementations (even ones not based upon tzcode, which have no use for that nonsense at all) have to keep supporting it.
I see uses for these POSIX TZ strings, even with tzcode and tzdata. Here's a scenario: your government abruptly changed the DST rules and you don't have access to the network (or perhaps your distributor hasn't updated its copy of tzdata yet) and so you can't get the latest tzdata easily. You can work around the problem with one of these POSIX TZ strings. Even if your platform is built out of a bunch of different modules, all the code should still work because they all conform to this longstanding POSIX standard. I don't much like POSIX TZ strings either. However, now that we have them, they're useful on occasions like these, and removing them from POSIX would be a small benefit to implementers and a significant hassle for some use cases.
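As a minimal sketch of that workaround, assume a made-up zone 5 hours west of Greenwich whose government has just moved DST to run from the second Sunday of March to the first Sunday of November. The abbreviations XST and XDT and the variable name workaround_tz are invented for illustration; they are not from tzdata.

```shell
# Hypothetical rule, written by hand instead of waiting for a tzdata update:
#   std is "XST", 5 hours west of Greenwich;
#   dst is "XDT", one hour ahead of std (the default when no dst offset is given);
#   DST runs from the second Sunday of March (M3.2.0)
#   to the first Sunday of November (M11.1.0), switching at 02:00 (the default).
workaround_tz='XST5XDT,M3.2.0,M11.1.0'

# Apply it to a single command; nothing else on the system is changed.
TZ=$workaround_tz date '+%Y-%m-%d %H:%M %Z'
```

The abbreviation printed is XST or XDT depending on the current date. Exporting TZ instead would apply the same rule to every program started from that shell.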
| > There is no problem with
| >
| >   TZ='<Z>0'
|
| No there is a real problem, in current POSIX anyway, since POSIX says
| for this case "the std and dst fields in this case shall not include the
| quoting characters" ('<' and '>') and it also says that std must be at
| least three characters.
Yes, but you are misinterpreting what "std" is. That is not the abbreviation (or tzname, or whatever one wants to call it), it is the field of the TZ string in which that name is specified.
No, because POSIX says that for TZ strings "The std and dst fields in this case shall not include the quoting characters." In the TZ setting TZ='<+1245>-12:45<+1345>,M9.5.0/2:45,M4.1.0/3:45' (isn't that a *beauty* :-) the std field is simply "+1245", without the angle brackets. I realize your interpretation of that wording differs. However, my interpretation is more plausible and better reflects existing practice.
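To make that reading concrete, here is a rough sketch that pulls the std field out of a TZ string with the angle brackets excluded, then checks it against the three-character minimum. The function names std_field and std_ok are made up here; this is not how glibc or tzcode parse TZ, and it does not validate the rest of the string.

```shell
# std_field TZSTRING
# Print the std field of a POSIX-style TZ string. In the quoted form the
# angle brackets are treated as delimiters, not as part of the field.
std_field() {
    case $1 in
    '<'*'>'*)
        s=${1#\<}       # drop the leading '<'
        s=${s%%\>*}     # keep everything before the first '>'
        printf '%s\n' "$s"
        ;;
    *)
        # Unquoted form: std is the leading run of alphabetic characters.
        printf '%s\n' "$1" | sed 's/[^A-Za-z].*//'
        ;;
    esac
}

# std_ok TZSTRING
# Succeed only if the std field has at least three characters.
std_ok() {
    s=$(std_field "$1")
    [ "${#s}" -ge 3 ]
}

std_field '<+1245>-12:45<+1345>,M9.5.0/2:45,M4.1.0/3:45'   # prints +1245
std_field '<Z>0'                                           # prints Z
std_ok '<Z>0' || echo "std is shorter than 3 characters"
```

Under this reading '<Z>0' has a one-character std field, so it falls outside what POSIX specifies.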
I consider glibc broken in that case.
macOS behaves like glibc. That's an independent code base, but evidently both sets of developers read POSIX the way that I'm reading it, and it'd be a stretch to say we're all wrong. AIX and Solaris behave in yet a third way: they treat TZ='<Z>0' as if it were TZ='<Z  >0' (i.e., two spaces after the "Z"). All these behaviors conform to POSIX because POSIX doesn't specify the behavior when std has fewer than 3 characters.
What does a pure (as distributed) tzcode version do in this case?
It behaves like NetBSD, which isn't surprising as NetBSD is derived from tzcode.
I'd expect even more problems if the name doesn't appear at all. But Ubuntu seems to be surviving that
This is backwards. People don't use TZ='<Z>0' or TZ='<ET>4' because those usages are nonconforming and don't work in general. If TZ='America/New_York' started saying just 'ET', that would be more like what the situation was when TZDB put spaces in time zone abbreviations. But I'd be loath to do that.
What happens using glibc with TZ='<A>1' ?
    $ TZ='<A>1' date; TZ='<Z>0' date; date -u
    Sat Mar 4 00:35:26 2023
    Sat Mar 4 00:35:26 2023
    Sat Mar 4 00:35:26 UTC 2023

That is, both TZ settings are invalid, and in that case glibc uses UTC without any abbreviation (POSIX says %Z is empty when unknown). When NetBSD sees an invalid TZ setting it does something similar, except it uses the abbreviation "GMT" instead of "", and it extends POSIX in a different way so it has a different opinion about what is invalid. These behaviors all conform to POSIX since the TZ settings don't conform to POSIX.

Here's a more-outlandish example, run on NetBSD:

    $ TZ="$(awk 'BEGIN {for (i=0; i<512; i++) printf "A"; print "4"}')" date; date -u
    Sat Mar 4 00:28:09 GMT 2023
    Sat Mar 4 00:28:09 UTC 2023

Here NetBSD treats the TZ setting as invalid (a time zone abbreviation of 512 "A"s!) and silently substitutes GMT. Glibc treats this same example as specifying a 512-byte abbreviation for a time zone 4 hours west of Greenwich. Both behaviors conform to POSIX since the TZ string exceeds POSIX length limits.