Invalid POSIX TZ strings in generated TZif files
The format for POSIX TZ strings is defined here: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag...

I will note two things from this specification:

- The hour shall be between zero and 24
- The *time* has the same format as *offset* except that no leading sign ('-' or '+') is allowed

There are four time zones that violate these requirements.

- Asia/Gaza has DST transitions that occur 50 hours after midnight (both start and end).
- Likewise for Asia/Hebron.
- Asia/Jerusalem has a DST transition that begins 26 hours after midnight.
- America/Nuuk has a DST transition that begins -1 hours after midnight (not a typo).

Violating the first requirement is unspecified behavior, and violating the second is simply not allowed. I ran across these situations while testing a parser for TZif files that I wrote. While the desired behavior is clear, it is also the case that Gaza, Hebron, and Jerusalem rely on unspecified behavior and Nuuk should never parse successfully. Both of these are far from ideal, of course.

I'm not entirely certain what can be done here, as the POSIX format is incapable of handling situations like this at first glance.

Jacob Pratt
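To make the out-of-range values above concrete, here is a minimal sketch of what "N hours after midnight" works out to on the wall clock. The function name `transition_local_time` and the example date are assumptions chosen for illustration; they are not taken from tzcode or from any TZif file.

```python
from datetime import datetime, timedelta


def transition_local_time(rule_day: datetime, hours: int) -> datetime:
    """Return the local wall-clock time `hours` after midnight of `rule_day`.

    `hours` may exceed 24 or be negative, in which case the result
    falls on a later or earlier calendar day than `rule_day`.
    """
    midnight = rule_day.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(hours=hours)


# Hypothetical rule day, used only to show the arithmetic.
rule_day = datetime(2023, 3, 23)
for hours in (26, 50, -1):
    print(hours, transition_local_time(rule_day, hours))
```

So 26 hours lands at 02:00 on the following day, 50 hours at 02:00 two days later, and -1 hours at 23:00 on the previous day.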
Hi Jacob,

This is documented behavior. See https://datatracker.ietf.org/doc/html/rfc8536#section-3.3.1

On Thu, Oct 26, 2023, at 5:03 AM, Jacob Pratt via tz wrote:
There are four time zones that violate these requirements.

--
Kenneth Murchison
Senior Software Developer
Fastmail US LLC
murch@fastmailteam.com
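For anyone else writing a TZif parser, the difference between the strict POSIX rule and the extension described in RFC 8536 section 3.3.1 can be sketched as a check on the *time* field of a rule. In version 3 files the hours part may be signed and may range from -167 through 167. The function name `classify_time_field` and its return labels are hypothetical, and this is only a rough sketch of the range check, not a full TZ string parser.

```python
import re

# [+-]hh[:mm[:ss]] with up to three hour digits so that 167 can be matched.
_TIME_RE = re.compile(r"([+-]?)([0-9]{1,3})(?::([0-9]{2})(?::([0-9]{2}))?)?")


def classify_time_field(field: str) -> str:
    """Classify a TZ-string time field.

    Returns "posix" for an unsigned time with hour 0..24,
    "rfc8536-v3" for a time that is only valid under the
    version 3 extension (signed, hour magnitude up to 167),
    and "invalid" otherwise.
    """
    m = _TIME_RE.fullmatch(field)
    if m is None:
        return "invalid"
    sign, hh, mm, ss = m.groups()
    if int(mm or 0) > 59 or int(ss or 0) > 59:
        return "invalid"
    hour = int(hh)
    if sign == "" and hour <= 24:
        return "posix"
    if hour <= 167:
        return "rfc8536-v3"
    return "invalid"


for field in ("2", "26", "50", "-1", "168"):
    print(field, classify_time_field(field))
```

Under this sketch the 26 and 50 hour values and the -1 value from the original report all fall into the extension case rather than being rejected.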
Somehow I had missed that, even after reading through the RFC multiple times while implementing it. Thanks for the quick response!

Jacob Pratt

On Thu, Oct 26, 2023 at 5:44 AM Ken Murchison via tz <tz@iana.org> wrote:
This is documented behavior. See https://datatracker.ietf.org/doc/html/rfc8536#section-3.3.1
On 26/10/2023 10:03, Jacob Pratt via tz wrote:
I'm not entirely certain what can be done here, as the POSIX format is incapable of handling situations like this at first glance.
Not yet, anyway, but perhaps it has been changed in one of the drafts of the 202x revision of the standard. The relevant bug report by Paul was resolved as "Accepted As Marked": https://austingroupbugs.net/view.php?id=1252

--
-=( Ian Abbott <abbotti@mev.co.uk> || MEV Ltd. is a company )=-
-=( registered in England & Wales. Regd. number: 02862268. )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-