Extra transition for Europe/London with 2023d
Hi,

I have just updated the tzdb for PHP, and one of our tests started failing. It turned out to be due to an unexpected data change.

Previously, the following transitions existed:

…
1994-03-27 01:00:00 UT ( 764730000) = 1 [ 3600 1 4 'BST' (0,0)]
1994-10-23 01:00:00 UT ( 782874000) = 2 [ 0 0 8 'GMT' (0,0)]
1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]

POSIX string: GMT0BST,M3.5.0/1,M10.5.0
std: 2 [ 0 0 8 'GMT' (0,0)]
dst: 1 [ 3600 1 4 'BST' (0,0)]

But now they include an extra one for Jan 1st, 1996, with the March 31st one no longer being the last one:

…
1994-03-27 01:00:00 UT ( 764730000) = 1 [ 3600 1 4 'BST' (0,0)]
1994-10-23 01:00:00 UT ( 782874000) = 2 [ 0 0 8 'GMT' (0,0)]
1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]

POSIX string: GMT0BST,M3.5.0/1,M10.5.0
std: 2 [ 0 0 8 'GMT' (0,0)]
dst: 1 [ 3600 1 4 'BST' (0,0)]

I couldn't find anywhere in tzfile.5 or theory.html whether the last generated transition must match a transition as specified with the POSIX string (as it did with 2023c and earlier), but I vaguely remember having read such a thing when I implemented the POSIX string parsing logic.

As far as I know so far, the only effect it has on PHP users is that they will now see an extra transition when they enumerate them (the 1996-01-01 one is inserted).

I think I am mostly flagging this up because this was an unexpected change.

cheers,
Derick

-- 
https://derickrethans.nl | https://xdebug.org | https://dram.io
Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support
Host of PHP Internals News: https://phpinternals.news
mastodon: @derickr@phpc.social @xdebug@phpc.social
twitter: @derickr and @xdebug
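A minimal sketch of how a consumer that enumerates transitions might spot an entry like the 1996-01-01 one: it leaves the local time type unchanged (type 2 follows type 2). The function name and the list layout are hypothetical and are not taken from PHP or timelib; the sketch also assumes that an equal type index means an identical offset, DST flag and abbreviation, and it does not catch two distinct indexes that happen to describe the same local time.

```python
from typing import List, Tuple

# (UT timestamp, local time type index), copied from the 2023d listing above.
TRANSITIONS: List[Tuple[int, int]] = [
    (764730000, 1),  # 1994-03-27 01:00:00 UT, BST
    (782874000, 2),  # 1994-10-23 01:00:00 UT, GMT
    (796179600, 1),  # 1995-03-26 01:00:00 UT, BST
    (814323600, 2),  # 1995-10-22 01:00:00 UT, GMT
    (820454400, 2),  # 1996-01-01 00:00:00 UT, GMT
]


def split_noop_transitions(transitions):
    """Separate entries whose type index equals the preceding entry's index."""
    effective, noop = [], []
    previous_type = None
    for at, type_index in transitions:
        if type_index == previous_type:
            noop.append((at, type_index))
        else:
            effective.append((at, type_index))
        previous_type = type_index
    return effective, noop


effective, noop = split_noop_transitions(TRANSITIONS)
print("effective:", effective)
print("no-op:", noop)
```

Whether such entries should be hidden from users is a policy choice; the sketch only separates them.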
On 2024-01-02 04:29, Derick Rethans via tz wrote:
<snip>

Check your installed data or paths and conversion code! There was a leap second at that time, and regularly during the 1990s, so you seem to be using right/Europe/London:
$ zdump -Vc1994,1998 right/Europe/London
right/Europe/London Sun Mar 27 01:00:17 1994 UT = Sun Mar 27 00:59:59 1994 GMT isdst=0 gmtoff=0
right/Europe/London Sun Mar 27 01:00:18 1994 UT = Sun Mar 27 02:00:00 1994 BST isdst=1 gmtoff=3600
right/Europe/London Fri Jul 1 00:00:18 1994 UT = Fri Jul 1 00:59:60 1994 BST isdst=1 gmtoff=3600
right/Europe/London Fri Jul 1 00:00:19 1994 UT = Fri Jul 1 01:00:00 1994 BST isdst=1 gmtoff=3600
right/Europe/London Sun Oct 23 01:00:18 1994 UT = Sun Oct 23 01:59:59 1994 BST isdst=1 gmtoff=3600
right/Europe/London Sun Oct 23 01:00:19 1994 UT = Sun Oct 23 01:00:00 1994 GMT isdst=0 gmtoff=0
right/Europe/London Sun Mar 26 01:00:18 1995 UT = Sun Mar 26 00:59:59 1995 GMT isdst=0 gmtoff=0
right/Europe/London Sun Mar 26 01:00:19 1995 UT = Sun Mar 26 02:00:00 1995 BST isdst=1 gmtoff=3600
right/Europe/London Sun Oct 22 01:00:18 1995 UT = Sun Oct 22 01:59:59 1995 BST isdst=1 gmtoff=3600
right/Europe/London Sun Oct 22 01:00:19 1995 UT = Sun Oct 22 01:00:00 1995 GMT isdst=0 gmtoff=0
right/Europe/London Mon Jan 1 00:00:19 1996 UT = Sun Dec 31 23:59:60 1995 GMT isdst=0 gmtoff=0
right/Europe/London Mon Jan 1 00:00:20 1996 UT = Mon Jan 1 00:00:00 1996 GMT isdst=0 gmtoff=0
right/Europe/London Sun Mar 31 01:00:19 1996 UT = Sun Mar 31 00:59:59 1996 GMT isdst=0 gmtoff=0
right/Europe/London Sun Mar 31 01:00:20 1996 UT = Sun Mar 31 02:00:00 1996 BST isdst=1 gmtoff=3600
right/Europe/London Sun Oct 27 01:00:19 1996 UT = Sun Oct 27 01:59:59 1996 BST isdst=1 gmtoff=3600
right/Europe/London Sun Oct 27 01:00:20 1996 UT = Sun Oct 27 01:00:00 1996 GMT isdst=0 gmtoff=0
right/Europe/London Sun Mar 30 01:00:19 1997 UT = Sun Mar 30 00:59:59 1997 GMT isdst=0 gmtoff=0
right/Europe/London Sun Mar 30 01:00:20 1997 UT = Sun Mar 30 02:00:00 1997 BST isdst=1 gmtoff=3600
right/Europe/London Tue Jul 1 00:00:20 1997 UT = Tue Jul 1 00:59:60 1997 BST isdst=1 gmtoff=3600
right/Europe/London Tue Jul 1 00:00:21 1997 UT = Tue Jul 1 01:00:00 1997 BST isdst=1 gmtoff=3600
right/Europe/London Sun Oct 26 01:00:20 1997 UT = Sun Oct 26 01:59:59 1997 BST isdst=1 gmtoff=3600
right/Europe/London Sun Oct 26 01:00:21 1997 UT = Sun Oct 26 01:00:00 1997 GMT isdst=0 gmtoff=0

but there is no change visible with zdump on default or POSIX 2023d:

$ zdump -Vc1994,1998 Europe/London zdump -Vc1997,2025 -Vc1994,1998 Europe/London
Europe/London Sun Mar 27 00:59:59 1994 UT = Sun Mar 27 00:59:59 1994 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 27 01:00:00 1994 UT = Sun Mar 27 02:00:00 1994 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 23 00:59:59 1994 UT = Sun Oct 23 01:59:59 1994 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 23 01:00:00 1994 UT = Sun Oct 23 01:00:00 1994 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 26 00:59:59 1995 UT = Sun Mar 26 00:59:59 1995 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 26 01:00:00 1995 UT = Sun Mar 26 02:00:00 1995 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 22 00:59:59 1995 UT = Sun Oct 22 01:59:59 1995 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 22 01:00:00 1995 UT = Sun Oct 22 01:00:00 1995 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 31 00:59:59 1996 UT = Sun Mar 31 00:59:59 1996 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 31 01:00:00 1996 UT = Sun Mar 31 02:00:00 1996 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 27 00:59:59 1996 UT = Sun Oct 27 01:59:59 1996 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 27 01:00:00 1996 UT = Sun Oct 27 01:00:00 1996 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 30 00:59:59 1997 UT = Sun Mar 30 00:59:59 1997 GMT isdst=0 gmtoff=0
Europe/London Sun Mar 30 01:00:00 1997 UT = Sun Mar 30 02:00:00 1997 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 26 00:59:59 1997 UT = Sun Oct 26 01:59:59 1997 BST isdst=1 gmtoff=3600
Europe/London Sun Oct 26 01:00:00 1997 UT = Sun Oct 26 01:00:00 1997 GMT isdst=0 gmtoff=0

-- 
Take care. Thanks, Brian Inglis
Calgary, Alberta, Canada

La perfection est atteinte                    Perfection is achieved
non pas lorsqu'il n'y a plus rien à ajouter   not when there is no more to add
mais lorsqu'il n'y a plus rien à retirer      but when there is no more to cut
                                -- Antoine de Saint-Exupéry
On Tue, Jan 2, 2024 at 1:26 PM brian.inglis--- via tz <tz@iana.org> wrote:
there is no change visible with zdump
That's true, but zdump (contrary to its name) doesn't just dump the content of a TZif file. This is the 2023c/2023d difference I see in the *raw*, slim Europe/London data, just as Derick reports.

@@ -176,6 +176,6 @@
  782874000 [type=2] gmtoff= 0 is_dst=F isstd=0 isgmt=0 abbr="GMT"
  796179600 [type=1] gmtoff= 3600 is_dst=T isstd=0 isgmt=0 abbr="BST"
  814323600 [type=2] gmtoff= 0 is_dst=F isstd=0 isgmt=0 abbr="GMT"
- 828234000 [type=1] gmtoff= 3600 is_dst=T isstd=0 isgmt=0 abbr="BST"
+ 820454400 [type=2] gmtoff= 0 is_dst=F isstd=0 isgmt=0 abbr="GMT"

  Future Specification: GMT0BST,M3.5.0/1,M10.5.0
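For readers who want a raw view similar to the one above, here is a minimal sketch of a TZif reader. It assumes the file layout described in RFC 8536 (a version-1 block with 4-byte times, then for version 2 and later a second block with 8-byte times and a footer holding the TZ string). The names `parse_block`, `read_tzif` and `dump` are made up for this illustration; this is not the tool used in the thread, and error handling is minimal.

```python
import struct
from typing import List, NamedTuple, Optional, Tuple

# Magic, version byte, 15 reserved bytes, then six big-endian 32-bit counts.
HEADER = struct.Struct(">4sc15x6l")


class TZifBlock(NamedTuple):
    transitions: List[Tuple[int, int]]   # (UT timestamp, local time type index)
    types: List[Tuple[int, bool, str]]   # (utoff, is_dst, abbreviation)
    end: int                             # offset just past this block


def parse_block(data: bytes, offset: int, time_size: int) -> TZifBlock:
    """Parse one header plus data block; time_size is 4 (v1) or 8 (v2+)."""
    (magic, _version, isutcnt, isstdcnt, leapcnt,
     timecnt, typecnt, charcnt) = HEADER.unpack_from(data, offset)
    if magic != b"TZif":
        raise ValueError("not a TZif file")
    pos = offset + HEADER.size
    fmt = ">%d%s" % (timecnt, "q" if time_size == 8 else "l")
    times = struct.unpack_from(fmt, data, pos)
    pos += timecnt * time_size
    indexes = data[pos:pos + timecnt]          # one type index per transition
    pos += timecnt
    raw_types = [struct.unpack_from(">lBB", data, pos + 6 * i)
                 for i in range(typecnt)]
    pos += 6 * typecnt
    abbrevs = data[pos:pos + charcnt]          # NUL-terminated designations
    pos += charcnt
    # Skip leap-second records and the standard/wall and UT/local indicators.
    pos += leapcnt * (time_size + 4) + isstdcnt + isutcnt
    types = [(utoff, bool(dst),
              abbrevs[idx:abbrevs.index(b"\0", idx)].decode("ascii"))
             for utoff, dst, idx in raw_types]
    return TZifBlock(list(zip(times, indexes)), types, pos)


def read_tzif(data: bytes) -> Tuple[TZifBlock, Optional[str]]:
    """Return the most detailed data block and the footer TZ string, if any."""
    v1 = parse_block(data, 0, 4)
    if data[4:5] == b"\0":                     # version 1 file: no second block
        return v1, None
    v2 = parse_block(data, v1.end, 8)
    footer = data[v2.end:].strip(b"\n").decode("ascii")
    return v2, footer


def dump(path: str) -> None:
    """Print every stored transition, without applying the footer rules."""
    with open(path, "rb") as handle:
        block, footer = read_tzif(handle.read())
    for at, idx in block.transitions:
        utoff, is_dst, abbr = block.types[idx]
        print(f"{at:>12} [type={idx}] gmtoff={utoff} is_dst={is_dst} abbr={abbr!r}")
    print("Future specification:", footer)
```

Calling `dump()` on a zic output file, for example a Europe/London file generated with `-b slim`, lists only what is physically stored in the file, which is the level at which the difference discussed here shows up.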
On Tue, 2 Jan 2024, brian.inglis--- via tz wrote:
<snip>

Check your installed data or paths and conversion code!
I am not using any installed data, and both of these were created by zic, which is what I would consider the reference implementation.
There was a leap second at that time, and regularly during the 1990s, so you seem to be using right/Europe/London:
No, I am not. The new rule for 1996-01-01 says:
1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]
The first "2" is the "tzh_typecount" value. It is 2, just like in the previous entry for 1995-10-22 01:00:00 UT:
1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
You can also see that the offset stays "0" for both, and both have the abbreviation "GMT". In the 2023c data file, that entry correctly has typecount 1 (for 1996-03-31):
1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]
It is distinctly a change in data as output by zic, as my diff of the created binary also shows:

The change from (0x31, 0x5D, 0xD9, 0x10) 828234000 to (0x30, 0xE7, 0x24, 0x00) 820454400 is exactly what my tool (https://github.com/derickr/timelib/blob/master/docs/show-tzinfo.c) shows, the change from 1996-03-31 to 1996-01-01:

-0x00, 0x00, 0x00, 0x31, 0x5D, 0xD9, 0x10, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02,
+0x00, 0x00, 0x00, 0x30, 0xE7, 0x24, 0x00, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0x02,
…

And the other change changes the associated typecount from 0x01 to 0x02:

-0x02, 0x01, 0x02, 0x01, 0x02, 0x01, 0xFF, 0xFF, 0xFF, 0xB5, 0x00, 0x00, 0x00, 0x00, 0x0E, 0x10,
+0x02, 0x01, 0x02, 0x01, 0x02, 0x02, 0xFF, 0xFF, 0xFF, 0xB5, 0x00, 0x00, 0x00, 0x00, 0x0E, 0x10,

Nowhere do the leap seconds of 'right' come into play. If I turn these on by setting -L leapseconds when calling `zic`, then the output changes to the following, as expected (well, except for 1996-01-01 vs 1996-03-31):

1994-10-23 01:00:19 UT ( 782874019) = 2 [ 0 0 8 'GMT' (0,0)]
1995-03-26 01:00:19 UT ( 796179619) = 1 [ 3600 1 4 'BST' (0,0)]
1995-10-22 01:00:19 UT ( 814323619) = 2 [ 0 0 8 'GMT' (0,0)]
1996-01-01 00:00:20 UT ( 820454420) = 2 [ 0 0 8 'GMT' (0,0)]
…
1994-07-01 00:00:18 UT ( 773020818) = 19
1996-01-01 00:00:19 UT ( 820454419) = 20
1997-07-01 00:00:20 UT ( 867715220) = 21
…

<snip>
but there is no change visible with zdump on default or POSIX 2023d:
$ zdump -Vc1994,1998 Europe/London zdump -Vc1997,2025 -Vc1994,1998 Europe/London
zdump apparently doesn't show this behaviour. I'm fairly certain that the output of zic itself changed. If I replace the "zic.c" from 2023d with 2023c (and associated source files), the data that my tool shows indeed reverts back to the expected:

…
1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]

cheers,
Derick
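The byte values quoted in this message can be checked independently. The sketch below assumes the four bytes are the low-order part of a big-endian transition time, as the surrounding zero bytes in the diff suggest; the function name `decode_transition` is hypothetical.

```python
from datetime import datetime, timezone

OLD_BYTES = bytes([0x31, 0x5D, 0xD9, 0x10])  # from the 2023c zic output
NEW_BYTES = bytes([0x30, 0xE7, 0x24, 0x00])  # from the 2023d zic output


def decode_transition(raw: bytes):
    """Read big-endian bytes as a signed UT timestamp and format it as UT."""
    seconds = int.from_bytes(raw, "big", signed=True)
    when = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return seconds, when.strftime("%Y-%m-%d %H:%M:%S")


for label, raw in (("2023c", OLD_BYTES), ("2023d", NEW_BYTES)):
    seconds, text = decode_transition(raw)
    print(label, seconds, text, "UT")
```

The decoded values match the two timestamps in the listings: 828234000 is 1996-03-31 01:00:00 UT and 820454400 is 1996-01-01 00:00:00 UT.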
On 1/2/2024 2:29 PM, Derick Rethans via tz wrote:
<snip>
In my (as yet unpublished) work I've discovered that there are some missing transitions produced by zic.c. This case at Europe/London 1996 is but one example.

# Zone NAME           STDOFF   RULES   FORMAT  [UNTIL]
Zone Europe/London    -0:01:15 -       LMT     1847 Dec 1 0:00s
                      0:00     GB-Eire %s      1968 Oct 27
                      1:00     -       BST     1971 Oct 31 2:00u
                      0:00     GB-Eire %s      1996
                      0:00     EU      GMT/BST

There is certainly a transition at 1996-01-01 00:00:00, from RULES GB-Eire to EU and FORMAT from %s to GMT/BST:

                      0:00     GB-Eire %s      1996
                      0:00     EU      GMT/BST

zic.c misses this transition, and it's important to the work I'm doing because it's necessary to look up the *previous* transition in some circumstances, and if this transition is missing the returned previous transition is incorrect.

I've found it necessary to refine zic.c, in particular portions of outzone() and writezone(), including the code block commented as "** Optimize", such that this transition (and others like it) are included in the resulting TzIf files. With this my adapted version of zdump then produces:

[157] 814323599 1995-10-22 01:59:59 isdst 1 gmtoff 3600 stdoff 0 BST
      814323600 1995-10-22 01:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
[158] 820454399 1995-12-31 23:59:59 isdst 0 gmtoff 0 stdoff 0 GMT
      820454400 1996-01-01 00:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
[159] 828233999 1996-03-31 00:59:59 isdst 0 gmtoff 0 stdoff 0 GMT
      828234000 1996-03-31 02:00:00 isdst 1 gmtoff 3600 stdoff 0 BST

Note this 1996 transition does not produce a discontinuity in the YMDhms sequence, which rolls over normally: 1995-12-31 23:59:59 to 1996-01-01 00:00:00. But its "at" time (820454400) and metadata are important.

There are many examples of this throughout the TzDb source file, wherever a time zone "era" [UNTIL] is designated with only the year, like this London example, 0:00 GB-Eire %s 1996.

Now in this special case of London in winter time, that time-point falls immediately after a leap-second. So the YMDhms sequence must go:

1995-12-31 23:59:59
1995-12-31 23:59:60 << leap-second
1996-01-01 00:00:00

In this case London is at the same STDOFF as UTC (0:00), so this sequence is true for both methods of introducing leap-seconds: a) local simultaneous with UTC (like tzdb "right") and b) "rolling leap-seconds", both of which my work supports.

I suggest TzDb may want to have a look at this topic. I think if these improvements were made it would not alter the typical current behavior of localtime(); the YMDhms representations and sequences would remain the same. But the addition of these transitions is more complete and honest to the underlying TzDb source data, and this is important for some types of extended functionality I'm pursuing.

Thanks,
-Brooks

"Prediction is difficult, especially about the future."
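To illustrate why the presence of the 1996-01-01 entry matters for a "previous transition" lookup, here is a minimal sketch using a sorted list and binary search. It is not the implementation described in this message; the list contents are the three "at" times from the dump above, and `previous_transition` and the probe time are chosen only for this example.

```python
from bisect import bisect_right

# "at" times taken from the dump above.
WITHOUT_1996 = [814323600, 828234000]             # 1995-10-22, 1996-03-31
WITH_1996 = [814323600, 820454400, 828234000]     # also 1996-01-01


def previous_transition(transitions, t):
    """Return the latest transition time <= t, or None if t precedes them all."""
    i = bisect_right(transitions, t)
    return transitions[i - 1] if i else None


probe = 823132800  # 1996-02-01 00:00:00 UT, between the two 1996 entries

print(previous_transition(WITHOUT_1996, probe))  # the 1995-10-22 transition
print(previous_transition(WITH_1996, probe))     # the 1996-01-01 transition
```

For the same probe time the two tables give different answers, even though the local time computed for the probe is GMT in both cases.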
On 2024-01-02 12:29, Derick Rethans wrote:
<snip>

Check your installed data or paths and conversion code!
I am not using any installed data, and both of these were created by zic, which is what I would consider the reference implementation.
There was a leap second at that time, and regularly during the 1990s, so you seem to be using right/Europe/London:
No, I am not.
Generated data rather than installed, and what selections, options, and parameters are you using to generate that data, including those for make and zic?

I am using tzcode 2023d zdump and zic, and tzdata 2023d make and zic parameters:

make DATAFORM=rearguard PACKRATDATA=backzone PACKRATLIST=zone.tab \
     VERSION_DEPS= tzdata.zi
mkdir -p zoneinfo/ zoneinfo/posix/ zoneinfo/right/
zic -b fat -d zoneinfo -L /dev/null tzdata.zi
zic -b fat -d zoneinfo/posix -L /dev/null tzdata.zi
zic -b fat -d zoneinfo/right -L leapseconds tzdata.zi

Which data format version(s) are you reading and listing?
<snip>
It is possible with your make and zic selections, options, and parameters, that a bug generates unnecessary extra transition(s). What generates your 2023.4 data file?

It might be useful for zdump to support a rawer -d debug/dump format showing rawer data in useful radixes to diagnose these cases.

-- 
Take care. Thanks, Brian Inglis
On Wed, 3 Jan 2024, brian.inglis--- via tz wrote:
<snip>
Which data format version(s) are you reading and listing?
Just "-b slim", and none of the other environment vars (repro script below).
<snip>
It is possible with your make and zic selections, options, and parameters, that a bug generates unnecessary extra transition(s).
What generates your 2023.4 data file?
This is the repro set-up, no non-default arguments, or any of the environment variables. I use the 2023.4 data file, and the 2023c/2023d code releases.

---- >8 ---------- tzdata-repro.sh ------------------------------------------
mkdir /tmp/tzdata-repro
cd /tmp/tzdata-repro

# Download and extract code (2023c and 2023d):
wget https://data.iana.org/time-zones/releases/tzcode2023c.tar.gz
mkdir code-2023c && cd code-2023c && tar xvzf ../tzcode2023c.tar.gz && cd ..
wget https://data.iana.org/time-zones/releases/tzcode2023d.tar.gz
mkdir code-2023d && cd code-2023d && tar xvzf ../tzcode2023d.tar.gz && cd ..

# Download and extract data (2023d):
wget https://data.iana.org/time-zones/releases/tzdata2023d.tar.gz
cd code-2023c && tar xvzf ../tzdata2023d.tar.gz && cd ..
cd code-2023d && tar xvzf ../tzdata2023d.tar.gz && cd ..

# Build code
cd code-2023c && make zic && cd ..
cd code-2023d && make zic && cd ..

# Create data files for Europe
mkdir -p data-files/2023c data-files/2023d
./code-2023c/zic code-2023c/europe -d data-files/2023c -b slim
./code-2023d/zic code-2023d/europe -d data-files/2023d -b slim

# Show difference
diff <(xxd -g 1 data-files/2023c/Europe/London) <(xxd -g 1 data-files/2023d/Europe/London)

# Result:
# 86c86
# < 00000550: 00 00 00 31 5d d9 10 02 01 02 01 02 01 02 01 02 ...1]...........
# ---
# > 00000550: 00 00 00 30 e7 24 00 02 01 02 01 02 01 02 01 02 ...0.$..........
# 96c96
# < 000005f0: 02 01 02 01 02 01 ff ff ff b5 00 00 00 00 0e 10 ................
# ---
# > 000005f0: 02 01 02 01 02 02 ff ff ff b5 00 00 00 00 0e 10 ................

# And if you have timelib's show-tzinfo:
diff -u <(~/dev/derickr-timelib/docs/show-tzinfo Europe/London `pwd`/data-files/2023c) <(~/dev/derickr-timelib/docs/show-tzinfo Europe/London `pwd`/data-files/2023d)

# Result:
# --- /dev/fd/63 2024-01-04 10:47:50.296038022 +0000
# +++ /dev/fd/62 2024-01-04 10:47:50.296038022 +0000
# @@ -171,7 +171,7 @@
#  1994-10-23 01:00:00 UT ( 782874000) = 2 [ 0 0 8 'GMT' (0,0)]
#  1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
#  1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
# -1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]
# +1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]
#
#  POSIX string: GMT0BST,M3.5.0/1,M10.5.0
#  std: 2 [ 0 0 8 'GMT' (0,0)]
---- >8 ---------------------------------------------------------------------

This shows the difference in output between zic-2023c and zic-2023d, both using data-2023d. The difference is exactly as I described earlier: the transition time changes from (0x31, 0x5D, 0xD9, 0x10) 828234000 to (0x30, 0xE7, 0x24, 0x00) 820454400, and the type of the last entry changes from 01 to 02.
It might be useful for zdump to support a rawer -d debug/dump format showing rawer data in useful radixes to diagnose these cases.
Yes, that is why I had written my show-tzinfo when I wrote the tzdata reader for PHP first in 2005: https://github.com/php/php-src/commit/4fb4cac65c735a9253d7b77f17468a5768a7de... cheers, Derick
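For readers without show-tzinfo, a raw dump along these lines can be sketched in a few lines of Python. This is only an illustration of the TZif layout as I understand it from RFC 8536 (44-byte header, version-1 block, second header, 64-bit block, then a newline-delimited TZ string footer). It is not how show-tzinfo or zdump is implemented, and `parse_tzif` is a hypothetical name.

```python
import struct

def parse_tzif(data: bytes):
    """Return (transitions, tz_string) from the 64-bit block of a TZif v2+ file.

    Each transition is (unix_time, type_index, utoff, isdst, abbreviation).
    """
    def read_header(off):
        if data[off:off + 4] != b"TZif":
            raise ValueError("missing TZif magic")
        version = data[off + 4:off + 5]
        # Six unsigned 32-bit counts follow the 15 reserved bytes.
        counts = struct.unpack(">6L", data[off + 20:off + 44])
        return version, counts

    version, (isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt) = read_header(0)
    if version == b"\0":
        raise ValueError("version 1 file has no 64-bit block")

    # Skip the version-1 data block (4-byte times, 8-byte leap records).
    off = 44 + timecnt * 5 + typecnt * 6 + charcnt + leapcnt * 8 + isstdcnt + isutcnt

    _, (isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt) = read_header(off)
    off += 44
    times = struct.unpack(f">{timecnt}q", data[off:off + 8 * timecnt])
    off += 8 * timecnt
    indexes = data[off:off + timecnt]          # one type index per transition
    off += timecnt
    types = [struct.unpack(">lBB", data[off + 6 * i:off + 6 * i + 6])
             for i in range(typecnt)]          # (utoff, isdst, abbreviation index)
    off += 6 * typecnt
    abbrevs = data[off:off + charcnt]
    off += charcnt + leapcnt * 12 + isstdcnt + isutcnt

    # Footer: newline, TZ string, newline.
    tz_string = data[off:].strip(b"\n").decode("ascii")

    def abbr(i):
        return abbrevs[i:abbrevs.index(b"\0", i)].decode("ascii")

    transitions = [(t, idx, types[idx][0], types[idx][1], abbr(types[idx][2]))
                   for t, idx in zip(times, indexes)]
    return transitions, tz_string
```

Calling `parse_tzif(open(path, "rb").read())` on a file produced by zic and printing the last few tuples should show the same tail as the listings in this thread, including no-op entries.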
On Thu, Jan 4, 2024 at 6:00 AM Derick Rethans via tz <tz@iana.org> wrote:
This is the repro set-up, no non-default arguments, or any of the environment variables. I use the 2023.4 data file, and the 2023c/2023d code releases.
I can do a `git bisect` in the next day or three (prob over the weekend) if that will help narrow down where this was introduced. Should be pretty easy to adjust Derick's script to automate most of it. --Matthew Donadio (matt@mxd120.com)
On Thu, Jan 4, 2024 at 12:23 PM Matthew Donadio via tz <tz@iana.org> wrote:
I can do a `git bisect` in the next day or three (prob over the weekend) if that will help narrow down where this was introduced. Should be pretty easy to adjust Derick's script to automate most of it.
It was ...

commit 35c116b7536a36c43eb7cd36bff71ad0c5ecf071
Author: Paul Eggert <eggert@cs.ucla.edu>
Date:   Sun Oct 15 12:26:28 2023 -0700

    Fix zic bug with Palestine after 2075

    The bug can be observed when processing the following .zi data,
    adapted from the current ‘asia’ file:

      Rule Palestine 2075 max  - Mar Sat<=30 2:00 1:00 S
      Rule Palestine 2075 max  - Oct Sat<=30 2:00 0    -
      Rule Palestine 2076 only - Jul 25      2:00 0    -
      Rule Palestine 2076 only - Sep 5       2:00 1:00 S
      Zone Asia/Gaza 2:00 -         EET    2012
                     2:00 Palestine EE%sT

    Without the fix, zic generates an incorrect TZif file, in which
    the special-case 2076 transitions are omitted. This causes
    ‘zdump -ic 2076,2077 Asia/Gaza’ to mistakenly omit the lines:

      2076-07-25  01  +02  EET
      2076-09-05  03  +03  EEST  1

    * zic.c (outzone): Redo algorithm to work even when the effect of
    a Rule that never ends (TO="max") is interspersed with the effect
    of a one-shot rule (TO="only").
On 2024-01-04 11:07, Bradley White via tz wrote:
It was ...
commit 35c116b7536a36c43eb7cd36bff71ad0c5ecf071
Author: Paul Eggert <eggert@cs.ucla.edu>
Date:   Sun Oct 15 12:26:28 2023 -0700

    Fix zic bug with Palestine after 2075

    The bug can be observed when processing the following .zi data,
    adapted from the current ‘asia’ file:

      Rule Palestine 2075 max  - Mar Sat<=30 2:00 1:00 S
      Rule Palestine 2075 max  - Oct Sat<=30 2:00 0    -
      Rule Palestine 2076 only - Jul 25      2:00 0    -
      Rule Palestine 2076 only - Sep 5       2:00 1:00 S
      Zone Asia/Gaza 2:00 -         EET    2012
                     2:00 Palestine EE%sT

    Without the fix, zic generates an incorrect TZif file, in which
    the special-case 2076 transitions are omitted. This causes
    ‘zdump -ic 2076,2077 Asia/Gaza’ to mistakenly omit the lines:

      2076-07-25  01  +02  EET
      2076-09-05  03  +03  EEST  1

    * zic.c (outzone): Redo algorithm to work even when the effect of
    a Rule that never ends (TO="max") is interspersed with the effect
    of a one-shot rule (TO="only").
So the issue appears when running `2023d/zic -b slim -d ... 2023[cd]/europe`?

Has anyone tried using tzdata 2023[cd] `make tzdata.zi` then

    2023d/zic -b slim -d ... tzdata.zi

to confirm if the issue still appears in Europe/London?

--
Take care. Thanks, Brian Inglis
Calgary, Alberta, Canada

La perfection est atteinte
non pas lorsqu'il n'y a plus rien à ajouter
mais lorsqu'il n'y a plus rien à retirer

Perfection is achieved
not when there is no more to add
but when there is no more to cut
-- Antoine de Saint-Exupéry
On 2024-01-02 03:29, Derick Rethans via tz wrote:
Previously, the following transitions existed:
1994-03-27 01:00:00 UT ( 764730000) = 1 [ 3600 1 4 'BST' (0,0)]
1994-10-23 01:00:00 UT ( 782874000) = 2 [ 0 0 8 'GMT' (0,0)]
1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]

POSIX string: GMT0BST,M3.5.0/1,M10.5.0
std: 2 [ 0 0 8 'GMT' (0,0)]
dst: 1 [ 3600 1 4 'BST' (0,0)]
But now, they include an extra one for Jan 1st, 1996, with the March 31st one now not being the last one:
…
1994-03-27 01:00:00 UT ( 764730000) = 1 [ 3600 1 4 'BST' (0,0)]
1994-10-23 01:00:00 UT ( 782874000) = 2 [ 0 0 8 'GMT' (0,0)]
1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]

POSIX string: GMT0BST,M3.5.0/1,M10.5.0
std: 2 [ 0 0 8 'GMT' (0,0)]
dst: 1 [ 3600 1 4 'BST' (0,0)]
It seems to be more complicated than that. In 2023c there's an explicit transition at 1996-03-31 01:00. In 2023d this transition is instead at 1996-01-01 00:00.

As you mentioned, both transition sets induce the same set of timestamps from localtime, so in that sense they're both correct.

It seems to me, though, that neither transition is needed, and a TZif file lacking both transitions would also be correct. This is an optimization that could be added to tzcode someday.
As far as I know so-far, the only effect it has on PHP users is that they will now see an extra transition
It sounds like PHP is making an incorrect assumption, namely, that each entry in a TZif file is a transition that should be shown to users. This assumption is incorrect for current TZcode. For various reasons there can be an entry in a TZif file's transition table in which the timestamps before and after the entry have the same UTC offset, the same time zone abbreviation, and the same is_dst flags. This is not a user-visible transition and should not be shown to users.

In thinking about it a bit, it should be possible to change zic.c so that each entry in the transition table corresponds to a change in the observed timestamp behavior (either UTC offset or abbreviation or is_dst). This would require a bit of nontrivial hacking, though.
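A reader that wants to show only user-visible transitions could filter the table along these lines. This is a minimal Python sketch, not PHP's or tzcode's implementation; the `Entry` type and `visible_transitions` function are hypothetical names introduced for the illustration, and the sample data is the tail of the 2023d Europe/London listing from this thread.

```python
from typing import NamedTuple, Optional

class Entry(NamedTuple):
    time: int   # seconds since the Unix epoch, UT
    utoff: int  # offset from UT in seconds
    isdst: int  # 0 or 1
    abbr: str   # time zone abbreviation

def visible_transitions(entries, initial: Optional[Entry] = None):
    """Yield only entries that change utoff, isdst or abbr relative to the previous state."""
    prev = initial
    for e in entries:
        if prev is None or (e.utoff, e.isdst, e.abbr) != (prev.utoff, prev.isdst, prev.abbr):
            yield e
        prev = e

london_2023d_tail = [
    Entry(796179600, 3600, 1, "BST"),
    Entry(814323600, 0, 0, "GMT"),
    Entry(820454400, 0, 0, "GMT"),  # the 1996-01-01 placeholder entry
]
shown = list(visible_transitions(london_2023d_tail))
print([e.time for e in shown])
```

With this filter the 1996-01-01 entry is dropped, because nothing observable changes at that instant; the conversions themselves are unaffected.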
I couldn't find anywhere in tzfile.5 or theory.html whether the last generated transition must match a transition as specified with the POSIX string (as it did with 2023c and earlier), but I vaguely remember having read such a thing when I implemented the POSIX string parsing logic.
There is no such requirement. Perhaps you're thinking of the requirement that the last entry in a TZif file's explicit transition table must have a time type that is consistent with the following TZ string. Although this is a requirement, it's weaker than what you suggest.
On 2024-01-02 13:53, Brooks Harris via tz wrote:
I suggest TzDb may want to have a look at this topic. I think if these improvements were made it would not alter the typical current behavior of localtime(); the YMDhms representations and sequences would remain the same. But the addition of these transitions is more complete and honest to the underlying TzDb source data and this is important for some types of extended functionality I'm pursuing.
It's never been a goal of zic to pack as much as possible information about the input into the TZif binary output file. Instead, the goal has been to optimize the output file size, as long as optimization doesn't change localtime's behavior. (Look for the word "optimize" in zic.c for more about this.) It might be possible to add a flag to zic to tell it to output larger TZif files that contain transitions and other information that do not affect localtime but might aid other applications. However, I don't see how such a flag could preserve all the relevant information, without a change to the TZif file format. So unless we change the TZif format, users who want all the info in the .zi input files would need to look at the .zi files anyway.
On Fri, 5 Jan 2024 at 19:31, Paul Eggert via tz <tz@iana.org> wrote:
It might be possible to add a flag to zic to tell it to output larger TZif files that contain transitions and other information that do not affect localtime but might aid other applications. However, I don't see how such a flag could preserve all the relevant information, without a change to the TZif file format. So unless we change the TZif format, users who want all the info in the .zi input files would need to look at the .zi files anyway.
I suspect that most people write TZDB source parsers because they want access to more data than the binary format provides. The source files are a wealth of information, which cannot be obtained in any other way.

As such, it could be a useful direction for TZDB to provide an alternate output format - effectively a standardised version of the data in the source files. Logically this would be in JSON (or XML) format and well-documented. This would allow most external parsers to be refactored to use the new data format. For example, modern Java uses a list of historic transitions and encoded rules for future transitions. But some others prefer a list of transitions into the future (to some future year). I suspect the new format would supply both the rules and resolved transitions for future dates.

See https://github.com/jodastephen/tzdiff/blob/master/data/Europe-London.txt for the kind of data Java needs (transitions and rules).

Issues such as the negative daylight savings flag go away. The alternate format would simply supply both flags, e.g. for Europe/Dublin winter would have something like "dstLegal=true" and "dstSummer=false".

Note that it would basically need to expose all data in the source files (otherwise people will keep on parsing the source files). This therefore includes pre-1970 data for all regions - but that could be explicitly in a separate section of the format (i.e. all pre-1970 data from all countries would be separate from all post-1970 data, allowing data consumers to pick and choose what they want).

Ideally, the final TZif binary format would be derived from the new alternate format, thus the flow would be

    TZ source files (intended for internal TZDB use only) -> TZ JSON -> TZif binary

If there is interest, I could work on the JSON format needed.

Stephen
On 1/5/2024 2:31 PM, Paul Eggert wrote:
On 2024-01-02 13:53, Brooks Harris via tz wrote:
I suggest TzDb may want to have a look at this topic. I think if these improvements were made it would not alter the typical current behavior of localtime(); the YMDhms representations and sequences would remain the same. But the addition of these transitions is more complete and honest to the underlying TzDb source data and this is important for some types of extended functionality I'm pursuing.
Thanks for looking at this.
It's never been a goal of zic to pack as much as possible information about the input into the TZif binary output file. Instead, the goal has been to optimize the output file size, as long as optimization doesn't change localtime's behavior.

Yes, I understand, I think. But I think these "first of year" transitions should be included in TzIf. They are important to the objectives I'm pursuing because they contain the critical STDOFF shifts that are not included in the normal outzone() output. I'm including both STDOFF and DST transitions in my resulting timestamp formats.

(Look for the word "optimize" in zic.c for more about this.)

Yes, I'm familiar with that code block, and it too needs a bit of modification to retain these transitions.

It might be possible to add a flag to zic to tell it to output larger TZif files that contain transitions and other information that do not affect localtime but might aid other applications.

I think that's what I'm suggesting, although I'm not using the zic.c main() command params; I've got it more "hard coded". I suppose other users would like the option, although it would not change the behavior of localtime().

However, I don't see how such a flag could preserve all the relevant information, without a change to the TZif file format.

True, and I'm not suggesting TZif format should change, only that these transitions be retained.

There aren't actually very many, though I've not done a full tally. But there's only 1 in London, 5 in New York, 1 in Sydney, etc. I think this would be a very small increase in the TzIf files size.

So unless we change the TZif format, users who want all the info in the .zi input files would need to look at the .zi files anyway.

Yes, that's what I'm doing, actually looking at the output of infile(), that is, the struct zone and struct rule sets that outzone() is processing to pick up the additional data, mostly the STDOFF shifts.
If you want to investigate this further I'd be happy to go into more detail. Thanks, -Brooks
On Jan 5, 2024, at 2:16 PM, Stephen Colebourne via tz <tz@iana.org> wrote:
I suspect that most people write TZDB source parsers because they want access to more data than the binary format provides. The source files are a wealth of information, which cannot be obtained in any other way.
Which information is that (other than transition dates/times and rules, which are mentioned later in your message)?
For example, modern Java uses a list of historic transitions and encoded rules for future transitions. But some others prefer a list of transitions into the future (to some future year). I suspect the new format would supply both the rules and resolved transitions for future dates.
So what is the definition of a "transition" here? The binary file obviously allows code that reads it to get information of the form "at date/time DT, one or more of {the offset from UTC, whether tm_isdst should be zero or non-zero, the time zone abbreviation} changes". The tzcode doesn't happen to have APIs to *provide* that information, but that's a different matter. Is there software that needs to know about transitions that change none of those?
See https://github.com/jodastephen/tzdiff/blob/master/data/Europe-London.txt for the kind of data Java needs (transitions and rules).
So are there Java classes that read those files and use them? Or are they files produced by Java code that *uses* the data?

That file appears to list:

the first entry in Zone Europe/London ("LMT: -00:01:15");

a bunch of transitions, the first one being at 1847-12-01T00:00-00:01:15 and the last one being at 1997-10-26T02:00+01:00, with the date and time shown in ISO 8601 format (with 1997-10-26T02:00+01:00 meaning year 1997, month October, day 26, at 2:00 local time), "Gap" presumably meaning "the clock is turned forward" and "Overlap" presumably meaning "the clock is turned back", and with "to XXX" meaning "the offset from (proleptic?) UTC switches to XXX", with "Z" meaning "the offset from UTC is zero";

two rules that are, I guess, presumed to cover all times after 1997-10-26T02:00+01:00.

Most of those transitions are generated by rules for GB-Eire, but those rules are *not* in that file, even though they *are* in the europe source file. What is the criterion for when it switches from showing transitions to showing rules?
Issues such as the negative daylight savings flag go away. The alternate format would simply supply both flags. eg. for Europe/Dublin winter would have something like "dstLegal=true" and "dstSummer=false".
So what do the (presumed) Booleans "dstLegal" and "dstSummer" mean here?
Note that it would basically need to expose all data in the source files (otherwise people will keep on parsing the source files).
"Data" presumably meaning "not comments".
Ideally, the final TZif binary format would be derived from the new alternate format, thus the flow would be TZ source files (intended for internal TZDB use only) -> TZ JSON -> TZif binary
"Internal TZDB use" presumably meaning "kept in the TZDB repository and shipped as part of a TZDB 'source code release', but not installed on systems using the tzdb"?

They also serve as human-readable text; JSON is "human-readable" in that it's a textual format, but I'm not sure I'd call it "human-readable" in the sense of "every bit as easy for a human to read as zic source is". I'd be more inclined to treat the JSON format as an alternative compiled format.
On 2024-01-05 15:16, Stephen Colebourne via tz wrote:
I suspect that most people write TZDB source parsers because they want access to more data than the binary format provides. The source files are a wealth of information, which cannot be obtained in any other way.
As such, it could be a useful direction for TZDB to provide an alternate output format - effectively a standardised version of the data in the source files. Logically this would be in JSON (or XML) format and well-documented. This would allow most external parsers to be refactored to use the new data format. For example, modern Java uses a list of historic transitions and encoded rules for future transitions. But some others prefer a list of transitions into the future (to some future year). I suspect the new format would supply both the rules and resolved transitions for future dates.
See https://github.com/jodastephen/tzdiff/blob/master/data/Europe-London.txt for the kind of data Java needs (transitions and rules).
Issues such as the negative daylight savings flag go away. The alternate format would simply supply both flags. eg. for Europe/Dublin winter would have something like "dstLegal=true" and "dstSummer=false".
Note that it would basically need to expose all data in the source files (otherwise people will keep on parsing the source files). This therefore includes pre-1970 data for all regions - but that could be explicitly in a separate section of the format. (ie. all pre-1970 data from all countries would be separate from all post-1970 data, allowing data consumers to pick and choose what they want)
Ideally, the final TZif binary format would be derived from the new alternate format, thus the flow would be TZ source files (intended for internal TZDB use only) -> TZ JSON -> TZif binary
If there is interest, I could work on the JSON format needed.
Running `make ... tzdata.zi` in tzdata gets you that preprocessed alternate format, as well as {rearguard,main,vanguard}.zi in the same abbreviated format as tzdata.zi, which zic also understands. You should probably use one of these as your input if your project wants to generate JSON or XML for further processing.

I will be happy if the project sticks to plain text data formats and steers clear of barely readable and/or verbose hierarchical formats with arbitrary tags that require toolkits to make use of.

--
Take care. Thanks, Brian Inglis
On Jan 5, 2024, at 3:16 PM, Brooks Harris via tz <tz@iana.org> wrote:
On 1/5/2024 2:31 PM, Paul Eggert wrote:
It's never been a goal of zic to pack as much as possible information about the input into the TZif binary output file. Instead, the goal has been to optimize the output file size, as long as optimization doesn't change localtime's behavior.
Yes, I understand, I think. But I think these "first of year" transitions should be included in TzIf. They are important to the objectives I'm pursuing because they contain the critical STDOFF shifts that are not included in the normal outzone() output. I'm including both STDOFF and DST transitions in my resulting timestamp formats.
So by "STDOFF shifts" do you mean changes that represent a timezone changing which time zone it's in :-) rather than turning the clock forward in spring and backward in autumn?

I would expect them to be included unless they coincide with a clock forward/backward shift that cancels out the STDOFF shift. What are examples of STDOFF changes that don't show up in the TZif files?

Furthermore, at least as I read https://pubs.opengroup.org/onlinepubs/9699919799/functions/daylight.html they *do* affect the value of the timezone global variable, which is specified as "the difference, in seconds, between Coordinated Universal Time (UTC) and local standard time", and thus would need to be included in TZif files in order to support that variable (by changing its value to the value in effect as of the last time converted - greasy hack, but that's what you get when you, meaning "AT&T", design an API that assumes a location never changes what time zone it's in).
However, I don't see how such a flag could preserve all the relevant information, without a change to the TZif file format.
True, and I'm not suggesting TZif format should change, only that these transitions be retained.
There aren't actually very many, though I've not done a full tally. But there's only 1 in London, 5 in New York, 1 in Sydney, etc. I think this would be a very small increase in the TzIf files size.
OK, so which transition is that for Europe/London?

Europe/London has

    # Zone  NAME           STDOFF    RULES    FORMAT   [UNTIL]
    Zone    Europe/London  -0:01:15  -        LMT      1847 Dec  1
                            0:00     GB-Eire  %s       1968 Oct 27
                            1:00     -        BST      1971 Oct 31 2:00u
                            0:00     GB-Eire  %s       1996
                            0:00     EU       GMT/BST

The only transitions that affect STDOFF are the one on 1968-10-27 (entering British Standard Time, 1 hour ahead of GMT) and the one on 1971-10-31 at 2:00 UTC (going with GMT in winter and British *Summer* Time in the summer). Those appear in the Europe/London TZif file in macOS Ventura, as reported by zdump in macOS Ventura:

    Europe/London  Sun Feb 18 01:59:59 1968 UTC = Sun Feb 18 01:59:59 1968 GMT isdst=0
        From Greenwich Mean Time
    Europe/London  Sun Feb 18 02:00:00 1968 UTC = Sun Feb 18 03:00:00 1968 BST isdst=1
        to British Summer Time

    Europe/London  Sat Oct 26 22:59:59 1968 UTC = Sat Oct 26 23:59:59 1968 BST isdst=1
        From British Summer Time
    Europe/London  Sat Oct 26 23:00:00 1968 UTC = Sun Oct 27 00:00:00 1968 BST isdst=0
        to British Standard Time, with clocks not changed

    Europe/London  Sun Oct 31 01:59:59 1971 UTC = Sun Oct 31 02:59:59 1971 BST isdst=0
        From British Standard Time
    Europe/London  Sun Oct 31 02:00:00 1971 UTC = Sun Oct 31 02:00:00 1971 GMT isdst=0
        to Greenwich Mean Time, with clocks turned back an hour

because they involve a change to one or more of: the offset from UTC that's currently in effect; the isdst flag; the time zone abbreviation.

What are some examples of changes that affect STDOFF that do *not* result in transitions in a TZif file?
On 1/5/24 16:50, Guy Harris wrote:
What are some examples of changes that affect STDOFF that do *not* result in transitions in a TZif file?
I don't know of any in 2023d, other than transitions governed by the TZ string (which are OK).

As I understand it, this thread is about timestamps like 820454400 (1996-01-01 00:00:00 UTC) in 2023d's Europe/London. This is a no-op transition - "no-op" in the sense that tm_isdst, tm_zone, and tm_gmtoff do not change at that instant: they remain 0, "GMT", 0 respectively. In 2023d timestamps like these can appear as the last explicit timestamp in the TZif file, as a marker where the TZ string starts to govern. In 2023d's Europe/London, all timestamps from 820454400 on are governed by Europe/London's TZ string "GMT0BST,M3.5.0/1,M10.5.0".

I wrote earlier that this timestamp could be omitted but I now see that I was mistaken. The previous timestamp in Europe/London is 814323600 (1995-10-22 01:00:00 UTC), a transition from BST to GMT. If we simply omitted the 820454400 transition the TZif file would become incorrect, since the file's TZ string says that 814323600 is BST, and this would disagree with what the previous timestamp transitioned to.

However, the 2023d timestamp 820454400 could be replaced by any timestamp from 814928400 (1995-10-29 01:00:00 UTC) through 828233999 (1996-03-31 00:59:59 UTC) without affecting behavior visible to localtime etc. Better yet, it could be replaced by 828234000 (1996-03-31 01:00:00 UTC) so long as the corresponding transition is a real one, to BST, as opposed to being a no-op placeholder from GMT to GMT. This would be less confusing for TZif readers, because the TZif file would not contain these particular no-op transitions. (TZif files still could contain other no-op transitions, though; in some cases involving 'zic -r' they're unavoidable.)

This is what Derick noted in his original email. That is, he noted that 2023c Europe/London took the "Better yet" approach mentioned above, whereas 2023d Europe/London uses a more-confusing (though still correct) transition.
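The timestamp range above can be checked with a small Python sketch that evaluates only the specific string "GMT0BST,M3.5.0/1,M10.5.0". It is not a general POSIX TZ parser: the function names are hypothetical, and the rule is hard-coded from my reading of the string (DST starts on the last Sunday of March at 01:00 GMT and ends on the last Sunday of October at the default 02:00 BST, both of which are 01:00 UT).

```python
import calendar
from datetime import datetime, timezone

def last_sunday(year: int, month: int) -> int:
    """Day of month of the last Sunday (the 'Mm.5.0' form of a POSIX TZ rule)."""
    weeks = calendar.Calendar(firstweekday=calendar.MONDAY).monthdayscalendar(year, month)
    # With Monday-first weeks, Sunday is the last column; days outside the month are 0.
    return max(week[calendar.SUNDAY] for week in weeks)

def london_rule_transitions(year: int):
    """UT instants (start of BST, end of BST) that the TZ string gives for `year`."""
    start = datetime(year, 3, last_sunday(year, 3), 1, tzinfo=timezone.utc)
    end = datetime(year, 10, last_sunday(year, 10), 1, tzinfo=timezone.utc)
    return int(start.timestamp()), int(end.timestamp())

def rule_says_dst(ts: int) -> bool:
    """True if the TZ string alone would put timestamp `ts` in BST."""
    year = datetime.fromtimestamp(ts, tz=timezone.utc).year
    start, end = london_rule_transitions(year)
    return start <= ts < end

for ts in (814323600, 814928400, 820454400, 828233999, 828234000):
    print(ts, "BST" if rule_says_dst(ts) else "GMT")
```

Under these assumptions the string puts 814323600 in BST (so an explicit entry is still needed after it), everything from 814928400 through 828233999 in GMT, and 828234000 in BST, which is the range described above.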
On Jan 5, 2024, at 3:31 PM, Guy Harris via tz <tz@iana.org> wrote:
On Jan 5, 2024, at 2:16 PM, Stephen Colebourne via tz <tz@iana.org> wrote:
See https://github.com/jodastephen/tzdiff/blob/master/data/Europe-London.txt for the kind of data Java needs (transitions and rules).
So are there Java classes that read those files and use them?
Or are they files produced by Java code that *uses* the data?
That file appears to list:
the first entry in Zone Europe/London ("LMT: -00:01:15");
a bunch of transitions, the first one being at 1847-12-01T00:00-00:01:15 and the last one being at 1997-10-26T02:00+01:00, with the date and time shown in ISO 8601 format (with 1997-10-26T02:00+01:00 meaning year 1997, month October, day 26, at 2:00 local time), "Gap" presumably meaning "the clock is turned forward" and "Overlap" presumably meaning "the clock is turned back", and with "to XXX" meaning "the offset from (proleptic?) UTC switches to XXX, with "Z" meaning "the offset from UTC is zero");
two rules that are, I guess, presumed to cover all times after 1997-10-26T02:00+01:00.
Most of those transitions are generated by rules for GB-Eire, but those rules are *not* in that file, even though they *are* in the europe source file. What is the criterion for when it switches from showing transitions to showing rules?
It appears from the documentation of the java.time ZoneRules class:

    https://docs.oracle.com/javase/8/docs/api/java/time/zone/ZoneRules.html

that the "rules" are a collection of transitions and then a rule, in a fashion that seems similar to a TZif file's list of transitions and its POSIX-style TZ string.

Is there anything provided by a ZoneRules instance corresponding to a tzid that could not be derived from the raw contents of the TZif file for that tzid? ("Raw contents" so as to distinguish "what's in the file" from "what are in the file *and* are made available by the APIs in the current version of the tzcode"; the latter can be changed without changes to TZif files, we'd just need to provide APIs to expose them.)
On Fri, 5 Jan 2024, Paul Eggert via tz wrote:
It's never been a goal of zic to pack as much as possible information about the input into the TZif binary output file. Instead, the goal has been to optimize the output file size, as long as optimization doesn't change localtime's behavior. (Look for the word "optimize" in zic.c for more about this.)
But didn't this change to Europe/London do the opposite? It added a 1996-01-01 transition that wasn't needed (from GMT, to GMT):

1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]

FWIW, I don't think it is only Europe/London where this happened. Having a look at that now.

cheers,
Derick
On Thu, 4 Jan 2024, brian.inglis--- via tz wrote:
So the issue appears when running `2023d/zic -b slim -d ... 2023[cd]/europe`?
No, it does affect a lot more zones than just europe: https://gist.github.com/derickr/8e86f0c5a54702fb94a512719b1533c5 There are some expected changes, such as Gaza/Hebron, but most of the difference shouldn't have happened there. I wonder though, is there no test suite for this? To me these changes seem clearly not expected.
Has anyone tried using tzdata 2023[cd] `make tzdata.zi` then
2023d/zic -b slim -d ... tzdata.zi
to confirm if the issue still appears in Europe/London?
After running the following in my repo directory:

./code-2023c/zic -d data-files/2023c -b slim code-2023c/tzdata.zi
./code-2023d/zic -d data-files/2023d -b slim code-2023d/tzdata.zi

It still shows the same transition changes, as well as the other changes in the GIST (and a few more, as I had forgotten about pacific, antarctica, and pacific):

for i in data-files/2023c/*/*; do
  TZ=$(echo $i | sed 's@data-files/2023c/@@')
  diff -u <(~/dev/derickr-timelib/docs/show-tzinfo $TZ `pwd`/data-files/2023c) <(~/dev/derickr-timelib/docs/show-tzinfo $TZ `pwd`/data-files/2023d) > /tmp/result.txt
  if [ -s /tmp/result.txt ]; then
    echo $TZ
    cat /tmp/result.txt
    echo
  fi
done

https://gist.github.com/derickr/80c0a834211656bc9301507c4d3757d1

cheers,
Derick
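For anyone without show-tzinfo, a cruder way to get the list of affected zones is a byte-for-byte comparison of the two output trees. The Python sketch below does that; it is not equivalent to the loop above (it compares raw files rather than decoded transitions, and it also walks zones nested more than one directory deep), and `differing_zones` is a name invented for this example.

```python
import filecmp
import os

def differing_zones(old_root: str, new_root: str):
    """Return sorted zone names whose files differ byte-for-byte between two trees."""
    changed = []
    for dirpath, _dirs, files in os.walk(old_root):
        for name in files:
            old_path = os.path.join(dirpath, name)
            zone = os.path.relpath(old_path, old_root)
            new_path = os.path.join(new_root, zone)
            # A zone missing from the new tree counts as changed.
            if not os.path.isfile(new_path) or not filecmp.cmp(old_path, new_path, shallow=False):
                changed.append(zone.replace(os.sep, "/"))
    return sorted(changed)
```

With the directory layout from the repro script, `differing_zones("data-files/2023c", "data-files/2023d")` would return the zone names to inspect further.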
On Fri, 5 Jan 2024, Paul Eggert wrote:
On 2024-01-02 03:29, Derick Rethans via tz wrote:
Previously, the following transitions existed:
<snip>
It seems to be more complicated than that. In 2023c there's an explicit transition at 1996-03-31 01:00. In 2023d this transition is instead at 1996-01-01 00:00.
As you mentioned, both transition sets induce the same set of timestamps from localtime, so in that sense they're both correct.
It seems to me, though, that neither transition is needed, and a TZif file lacking both transitions would also be correct. This is an optimization that could be added to tzcode someday.
I thought that the explicit transition at 1996-03-31 01:00 was intended, as that is also the exact time at which the POSIX rule inserts a transition. My surprise was more that there is a "new" entry at 1996-01-01 for Europe/London, and several others for other timezones, as you can see in https://gist.github.com/derickr/80c0a834211656bc9301507c4d3757d1
As far as I know so-far, the only effect it has on PHP users is that they will now see an extra transition
It sounds like PHP is making an incorrect assumption, namely, that each entry in a TZif file is a transition that should be shown to users. This assumption is incorrect for current TZcode. For various reasons there can be an entry in a TZif file's transition table in which the timestamps before and after the entry have the same UTC offset, the same time zone abbreviation, and the same is_dst flags. This is not a user-visible transition and should not be shown to users.
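A reader that wants to list only user-visible changes can drop entries of this kind. The sketch below assumes each table entry has already been decoded into a (time, utoff, isdst, abbreviation) tuple. The function name and the tuple layout are illustrative choices, not PHP's or tzcode's API.

```python
def visible_transitions(entries, initial):
    """Drop table entries that change none of utoff, isdst, abbreviation.

    entries: list of (unix_time, utoff, isdst, abbr), sorted by time.
    initial: (utoff, isdst, abbr) in effect before the first entry.
    """
    visible = []
    current = initial
    for when, utoff, isdst, abbr in entries:
        state = (utoff, isdst, abbr)
        if state != current:
            visible.append((when, utoff, isdst, abbr))
        current = state
    return visible

# Tail of the 2023d Europe/London table quoted in this thread.
london = [
    (796179600, 3600, 1, "BST"),  # 1995-03-26 01:00:00 UT
    (814323600, 0, 0, "GMT"),     # 1995-10-22 01:00:00 UT
    (820454400, 0, 0, "GMT"),     # 1996-01-01 00:00:00 UT, GMT to GMT
]
shown = visible_transitions(london, initial=(0, 0, "GMT"))
```

With this input the 1996-01-01 entry is filtered out, because the state before and after it is identical. An abbreviation-only change would still be kept, since the abbreviation is part of the compared state.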
It's not really a problem, as that API mainly exists for debugging purposes. The actual conversions handle them just fine. And IMO, it being a debugging feature made this new problem show up, so I'd rather keep it :-)
I couldn't find anywhere in tzfile.5 or theory.html whether the last generated transition must match a transition as specified with the POSIX string (as it did with 2023c and earlier), but I vaguely remember having read such a thing when I implemented the POSIX string parsing logic.
There is no such requirement.
Perhaps you're thinking of the requirement that the last entry in a TZif file's explicit transition table must have a time type that is consistent with the following TZ string. Although this is a requirement, it's weaker than what you suggest.
Ah, right, that's what it was. Can you point to the source perhaps as I still cannot find that.

*However*, the Europe/London change then now violates this:

   1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
   1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
  -1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]
  +1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]

  POSIX string: GMT0BST,M3.5.0/1,M10.5.0
    std: 2 [ 0 0 8 'GMT' (0,0)]
    dst: 1 [ 3600 1 4 'BST' (0,0)]

The new explicit transition type (2) is no longer consistent with the next transition in the POSIX string (which would be type 1, to go to DST on 1997-03-30)?

cheers,
Derick
On 2024-01-05 23:31, Guy Harris via tz asked:
The binary file obviously allows code that reads it to get information of the form "at date/time DT, one or more of {the offset from UTC, whether tm_isdst should be zero or non-zero, the time zone abbreviation} changes". The tzcode doesn't happen to have APIs to *provide* that information, but that's a different matter.
Is there software that needs to know about transitions that change none of those?
... So are there Java classes that read those files and use them?
Or are they files produced by Java code that *uses* the data?
Yes, there is information on local civil times scales needed by many datetime software interfaces that is already present in the tzdb source files but that is missing from the TZif files since the very beginning:

• the SAVE value (or the numeric RULES value) applicable at an instant is not available via TZif files, and cannot in general be deduced from its contents (see eg the SAVE value +01 h for Europe/Dublin when UT = 1916-10-01 + 02:25:21.1). It has even become more difficult to guess these values from TZif files since they are allowed to be negative.

• the RULEs applicable at a specific instant (if any). They are available in TZif files (in versions 2 and 3) only for recent instants, and the start of their applicability is only given indirectly in TZif files, sometimes requiring a redundant transition (which is the topic of this thread).

I think it would be very useful to have an official output of the tzdb data compilation process (zic and associated tools) that makes these data available for datetime software, even though they are not needed by the POSIX datetime functions.

Michael Deckers.
On 2024-01-06 01:40, Derick Rethans wrote:
But didn't this change to Europe/London do the opposite? It added a 1996-01-01 transition that wasn't needed (from GMT, to GMT):
  1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
  1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
  1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]
That 1996-01-01 transition *is* needed, in the sense that if you omit it the resulting TZif file would have the wrong behavior: the TZ string "GMT0BST,M3.5.0/1,M10.5.0" would apply to all timestamps after 1995-10-22 01:00 and thus would incorrectly induce a transition from GMT to BST at that point, along with another transition from BST back to GMT at 1995-10-29 02:00.

Also, if you omit that transition the TZif file doesn't conform to RFC 8536 section 3.3 <https://datatracker.ietf.org/doc/html/rfc8536#section-3.3> because its last transition type would disagree with the TZ string evaluated at that transition. (This requirement was put into the RFC precisely to avoid this sort of confusion.)

So Europe/London needs *some* explicit transition after 1995-10-22 01:00:00 UT. It's just that zic is free to choose the one it chose in 2023c, or the one it chose in 2023d, or a host of other explicit transitions (they'd all work).

On 2024-01-06 02:03, Derick Rethans via tz wrote:
I wonder though, is there no test suite for this?
The test suite I run is 'make public'. It does a regression test on all the Zones, making sure that zdump reports exactly the changes expected, up through the year 2050. It's an end-to-end test: it doesn't care how the TZif file implements a user-visible transition.

It of course would be possible to add further tests of intermediate TZif forms. However, this is work that I hope someone else would do, as it's not the primary goal of tzcode to export timestamp information other than what's visible to localtime, or to commit to a particular TZif form when another would do.

On 2024-01-06 02:15, Derick Rethans wrote:
On Fri, 5 Jan 2024, Paul Eggert wrote:
Perhaps you're thinking of the requirement that the last entry in a TZif file's explicit transition table must have a time type that is consistent with the following TZ string. Although this is a requirement, it's weaker than what you suggest.
Ah, right, that's what it was. Can you point to the source perhaps as I still cannot find that.
tzfile.5 says: 'If nonempty, the POSIX-style TZ string must agree with the local time type after the last transition time if present in the eight-byte data; for example, given the string "WET0WEST,M3.5.0/1,M10.5.0" then if a last transition time is in July, the transition's local time type must specify a daylight-saving time abbreviated "WEST" that is one hour east of UT.' There is similar wording in Internet RFC 8536 section 3.3.
*However*, the Europe/London change then now violates this:
   1995-03-26 01:00:00 UT ( 796179600) = 1 [ 3600 1 4 'BST' (0,0)]
   1995-10-22 01:00:00 UT ( 814323600) = 2 [ 0 0 8 'GMT' (0,0)]
  -1996-03-31 01:00:00 UT ( 828234000) = 1 [ 3600 1 4 'BST' (0,0)]
  +1996-01-01 00:00:00 UT ( 820454400) = 2 [ 0 0 8 'GMT' (0,0)]

  POSIX string: GMT0BST,M3.5.0/1,M10.5.0
    std: 2 [ 0 0 8 'GMT' (0,0)]
    dst: 1 [ 3600 1 4 'BST' (0,0)]
The new explicit transition type (2) is no longer consistent with the next transition in the POSIX string (which would be type 1, to go to DST on 1997-03-30)?
But the new transition is consistent. It's not a question of the next transition implied by the POSIX string. It's a question of what the POSIX string says about the transition itself. The example in tzfile.5 should make this clear.
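The point can be worked through by evaluating the TZ string at each candidate last transition. This is a rough Python sketch that hard-codes the rule GMT0BST,M3.5.0/1,M10.5.0 rather than parsing TZ strings in general. The helper names are made up for illustration and do not come from tzcode.

```python
from datetime import datetime, timedelta, timezone

def last_sunday(year, month):
    """Midnight UT on the last Sunday of the month (POSIX 'Mm.5.0')."""
    # Start from the first day of the next month and step back one day.
    if month == 12:
        d = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    else:
        d = datetime(year, month + 1, 1, tzinfo=timezone.utc)
    d -= timedelta(days=1)
    # weekday(): Monday is 0 and Sunday is 6.
    return d - timedelta(days=(d.weekday() + 1) % 7)

def london_rule_is_dst(ut):
    """Evaluate GMT0BST,M3.5.0/1,M10.5.0 at an aware UT datetime."""
    # DST starts at 01:00 local standard time (GMT), which is 01:00 UT.
    start = last_sunday(ut.year, 3) + timedelta(hours=1)
    # DST ends at the default 02:00 local daylight time (BST), 01:00 UT.
    end = last_sunday(ut.year, 10) + timedelta(hours=2) - timedelta(hours=1)
    return start <= ut < end

utc = timezone.utc
# 2023d's last entry: type GMT at 1996-01-01 00:00 UT. The string says GMT too.
agrees_2023d = not london_rule_is_dst(datetime(1996, 1, 1, 0, 0, tzinfo=utc))
# 2023c's last entry: type BST at 1996-03-31 01:00 UT. The string says BST too.
agrees_2023c = london_rule_is_dst(datetime(1996, 3, 31, 1, 0, tzinfo=utc))
# Stopping at 1995-10-22 01:00 UT (type GMT) would not agree: the string
# keeps BST until the last Sunday of October, 1995-10-29.
agrees_1995 = not london_rule_is_dst(datetime(1995, 10, 22, 1, 0, tzinfo=utc))
```

Under these assumptions both the 2023c and the 2023d choice of last entry agree with the string, while a table ending at 1995-10-22 would not, which matches the explanation given earlier in the thread.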
On 1/6/2024 11:34 AM, Michael H Deckers via tz wrote:
On 2024-01-05 23:31, Guy Harris via tz asked:
The binary file obviously allows code that reads it to get information of the form "at date/time DT, one or more of {the offset from UTC, whether tm_isdst should be zero or non-zero, the time zone abbreviation} changes". The tzcode doesn't happen to have APIs to *provide* that information, but that's a different matter.
Is there software that needs to know about transitions that change none of those?
... So are there Java classes that read those files and use them?
Or are they files produced by Java code that *uses* the data?
Yes, there is information on local civil times scales needed by many datetime software interfaces that is already present in the tzdb source files but that is missing from the TZif files since the very beginning:
• the SAVE value (or the numeric RULES value) applicable at an instant is not available via TZif files, and cannot in general be deduced from its contents (see eg the SAVE value +01 h for Europe/Dublin when UT = 1916-10-01 + 02:25:21.1). It has even become more difficult to guess these values from TZif files since they are allowed to be negative.
• the RULEs applicable at a specific instant (if any). They are available in TZif files (in versions 2 and 3) only for recent instants, and the start of their applicability is only given indirectly in TZif files, sometimes requiring a redundant transition (which is the topic of this thread).
I think it would be very useful to have an official output of the tzdb data compilation process (zic and associated tools) that makes these data available for datetime software, even though they are not needed by the POSIX datetime functions.
Michael Deckers.
I agree, this would help.

I think the difficulty originates with Posix-time itself which has no method to signal STDOFF, the isdst flag is insufficient to signal so-called "double summertime" or negative DST shifts and, of course, leap-seconds.

This leaves TzDb responsible to 'trick' the Posix time (localtime() and mktime()) into yielding the Posix YMDhms representation (broken-down time) by manipulating gmtoff while holding stdoff to some fixed value. In common cases this works ok but in many situations, such as STDOFF shifts, STDOFF shifts simultaneous with DST shifts, or "double summertime" the TzIf values are essentially 'lying' to work around the Posix insufficiencies.

Until Posix-time is improved TzDb will be forever inventing work-around manipulations of the Posix-time. One might hope the Posix folks might recognize this and find ways to address it.

In the meantime I think including these kinds of "no-op" transitions would improve things. Zic does include many "no-op" transitions where only the abbreviation changes, such as New_York:

  -769395601 1945-08-14 18:59:59 isdst 1 gmtoff -14400 stdoff -18000 EWT
  -769395600 1945-08-14 19:00:00 isdst 1 gmtoff -14400 stdoff -18000 EPT

I see no harm to typical behavior of Posix-time by providing these other "no-op" transitions to assist applications that can make use of them.

-Brooks
On Jan 6, 2024, at 9:55 AM, Brooks Harris via tz <tz@iana.org> wrote:
I think the difficulty originates with Posix-time itself which has no method to signal STDOFF,
The page at

  https://pubs.opengroup.org/onlinepubs/9699919799/functions/daylight.html

says

  The tzset() function also shall set the external variable daylight to 0 if Daylight Savings Time conversions should never be applied for the timezone in use; otherwise, non-zero. The external variable timezone shall be set to the difference, in seconds, between Coordinated Universal Time (UTC) and local standard time.

That says "local standard time", not "local time".

And the page at

  https://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime.html

says

  The localtime() function shall convert the time in seconds since the Epoch pointed to by timer into a broken-down time, expressed as a local time. The function corrects for the timezone and any seasonal time adjustments. [CX] Local timezone information is used as though localtime() calls tzset().

which can be read as meaning that all of the things that the tzset() function is specified as doing could occur as a result of a localtime() call, so presumably a localtime() call could set the timezone variable.

The first of those pages doesn't say what "the difference, in seconds, between Coordinated Universal Time (UTC) and local standard time" is if its value is itself dependent on date and time, which is one of the problems with the POSIX time API.
the isdst flag is insufficient to signal so-called "double summertime" or negative DST shifts
Yes, it treats DST as an on-off thing, and doesn't specify what "daylight saving time" means.
and, of course, leap-seconds.
*That* is a problem with the specification of POSIX time in https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html which defines "seconds since the Epoch" in a way that always has 86400-second days. The POSIX *API* is capable of indicating leap seconds - the page at https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/time.h.html specifies (following the ISO C standard - thanks to Doug Gwyn for getting the original ANSI C standard to specify this!) that the maximum value of tm_sec is 60, not 59. However, a POSIX-compliant system will never set tm_sec to 60, given the way "seconds since the Epoch" is defined.
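The 86400-second-day property can be seen by writing out the "seconds since the Epoch" expression from the POSIX base definitions. The Python below is an illustrative transcription of that expression, with a function name chosen for this example. The integer divisions match the C expression for years from 1970 on.

```python
import calendar

def posix_seconds(tm_year, tm_yday, tm_hour, tm_min, tm_sec):
    """'Seconds since the Epoch' per the POSIX formula.

    tm_year is years since 1900 and tm_yday is days since January 1,
    as in struct tm.
    """
    return (tm_sec + tm_min * 60 + tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000
            + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)

# 2016-12-31 23:59:60 UTC was a leap second (tm_yday 365 in a leap year).
leap = posix_seconds(116, 365, 23, 59, 60)
# 2017-01-01 00:00:00 UTC, one second later on the UTC timeline.
after = posix_seconds(117, 0, 0, 0, 0)
# The same instant computed by the standard library, for comparison.
reference = calendar.timegm((2017, 1, 1, 0, 0, 0))
```

Both inputs map to the same count, so the leap second has no value of its own on this scale. That is why tm_sec can be 60 in the API while a conforming localtime() has no timestamp that would produce it.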
This leaves TzDb responsible to 'trick' the Posix time (localtime() and mktime()) into yielding the Posix YMDhms representation (broken-down time) by manipulating gmtoff while holding stdoff to some fixed value. In common cases this works ok but in many situations, such as STDOFF shifts, STDOFF shifts simultaneous with DST shifts, or "double summertime" the TzIf values are essentially 'lying' to work around the Posix insufficiencies. Until Posix-time is improved TzDb will be forever inventing work-around manipulations of the Posix-time. One might hope the Posix folks might recognize this and find ways to address it.
Just as with POSIX there's the specification of "POSIX time" ("seconds since the Epoch") and of the POSIX time API, with TZDB there's the format of the source files, the format of a TZif compiled file, and the tzdb code, which implements a superset of the POSIX time API.

Not all users of the TZDB data use the tzdb code. Some use the source files rather than the TZif files, and I think some use the TZif files but have their own code to read them (I think International Components for Unicode does so, at least from what I remember from Apple - Deborah?).

That code is *not* constrained by the limitations of the POSIX API, but code that reads TZif files is constrained by what information is stored in the files. In particular, nothing in the TZif file format splits the offset-from-UTC into "standard time offset from UTC" and "daylight saving time adjustment from standard time".

The TZif format is documented in RFC 8536:

  https://datatracker.ietf.org/doc/html/rfc8536

It defines "Daylight Saving Time" as

  The time according to a location's law or practice, when adjusted as necessary from standard time. The adjustment may be positive or negative, and the amount of adjustment may vary depending on the date and time; the TZif format even allows the adjustment to be zero, although this is not common practice.

It defines a "Time Change" as

  A change to civil timekeeping practice. It occurs when one or more of the following happen simultaneously:

  1. a change in UT offset
  2. a change in whether daylight saving time is in effect
  3. a change in time zone abbreviation
  4. a leap second (i.e., a change in LEAPCORR)

A change from DST to double DST would be a "Time Change", as it would result in a change in UT offset. It would not change whether DST is in effect.

A change in the standard-time offset of the timezone from UTC would probably be a "Time Change", as it would probably result in a change to UT offset.
I see no harm to typical behavior of Posix-time by providing these other "no-op" transitions to assist applications that can make use of them.
So what are examples of how an application would make use of this information?
On 2024-01-06 09:55, Brooks Harris via tz wrote:
in many situations, such as STDOFF shifts, STDOFF shifts simultaneous with DST shifts, or "double summertime" the TzIf values are essentially 'lying'
Surely this goes too far. Whether a particular UT offset is "standard time" or "daylight saving time" (or something else) does not concern ordinary users and is pretty much arbitrary - as witness the disagreement over whether Morocco observes "standard time" or "daylight saving time" for most of the year. To this end, the TZif values are not lying; they're merely limiting themselves to the info needed to display timestamps. What matters to users is "What time is it?". Questions like "Is daylight saving time observed now?", "Is daylight saving ever observed?" and "What is the standard time now, ignoring any DST observance?" are timekeeping nerds' means to that end, not the end itself, and are best left to tzcode's internals. This is why theory.html says "The tm_isdst member is almost never needed and most of its uses should be discouraged....". In hindsight there never should have been a tm_isdst (instead, there should have been tm_gmtoff and tm_zone) though obviously it's too late now to remove tm_isdst.
On Jan 6, 2024, at 8:34 AM, Michael H Deckers via tz <tz@iana.org> wrote:
Yes, there is information on local civil times scales needed by many datetime software interfaces that is already present in the tzdb source files but that is missing from the TZif files since the very beginning:
• the SAVE value (or the numeric RULES value) applicable at an instant is not available via TZif files, and cannot in general be deduced from its contents (see eg the SAVE value +01 h for Europe/Dublin when UT = 1916-10-01 + 02:25:21.1). It has even become more difficult to guess these values from TZif files since they are allowed to be negative.
I.e., TZif files:

  https://datatracker.ietf.org/doc/html/rfc8536

don't contain enough information to allow both STDOFF and SAVE to be derived for a given time value.

We could provide a version 5, with a version 5 data block that does contain that information.

(RFC 8536 doesn't describe version 4; the tzfile(5) man page describes it as follows.)

  For version-4-format TZif files, the first leap second record can have a correction that is neither +1 nor -1, to represent truncation of the TZif file at the start. Also, if two or more leap second transitions are present and the last entry's correction equals the previous one, the last entry denotes the expiration of the leap second table instead of a leap second; timestamps after this expiration are unreliable in that future releases will likely add leap second entries after the expiration, and the added leap seconds will change how post-expiration timestamps are treated.
• the RULEs applicable at a specific instant (if any). They are available in TZif files (in versions 2 and 3) only for recent instants, and the start of their applicability is only given indirectly in TZif files, sometimes requiring a redundant transition (which is the topic of this thread).
So how does software (other than zic, of course :-)) use the rules rather than just the transition times calculated from the rules?
I think it would be very useful to have an official output of the tzdb data compilation process (zic and associated tools) that makes these data available for datetime software, even though they are not needed by the POSIX datetime functions.
Yes, there is no inherent reason why the TZif format, for example, should limit itself to supporting only the date/time functions that POSIX and the C standard currently happen to provide.
On 2024-01-06 12:25, Guy Harris via tz wrote:
Not all users of the TZDB data use the tzdb code. Some use the source files rather than the TZif files, and I think some use the TZif files but have their own code to read them
This partly depends on what one means by "their own code". On my Fedora 39 system, for example, even /usr/bin/zdump uses the GNU C Library's TZif-reading code, not tzcode's. Although there might be a distant relationship between glibc's TZif reader and tzcode's, by now they are quite separate code bases. In contrast, the BSDs tend to use code derived from tzcode, though the code bases diverged years ago. Possibly these could be unified if someone had the time. Presumably that'd be a good thing....
On Jan 6, 2024, at 12:31 PM, Paul Eggert via tz <tz@iana.org> wrote:
On 2024-01-06 09:55, Brooks Harris via tz wrote:
in many situations, such as STDOFF shifts, STDOFF shifts simultaneous with DST shifts, or "double summertime" the TzIf values are essentially 'lying'
Surely this goes too far. Whether a particular UT offset is "standard time" or "daylight saving time" (or something else) does not concern ordinary users and is pretty much arbitrary - as witness the disagreement over whether Morocco observes "standard time" or "daylight saving time" for most of the year. To this end, the TZif values are not lying; they're merely limiting themselves to the info needed to display timestamps.
It appears, from some comments here (emails from Brooks Harris and Michael H Deckers), that the values of STDOFF and SAVE, and not just the current offset from UTC, matter to at least some users. It would help if they indicated the situations in which STDOFF and SAVE matter and gave examples of software that uses them.
On 2024-01-06 12:45, Guy Harris via tz wrote:
there is no inherent reason why the TZif format, for example, should limit itself to supporting only the date/time functions that POSIX and the C standard currently happen to provide
Although an API *could* support deeper interrogation of TZif details, I doubt whether that'd be a good idea. The distinction between stdoff and tm_gmtoff is not always clear, any extra information you'd extract from the data could well be dubious (certainly there are dubious corners in this part of tzdata now), and improving tzdata and maintaining the result would require unnecessary work. We saw an instance of this sort of thing in America/Nuuk last year. Before 2023d, summer 2023 was marked as daylight saving time, but 2023d changed that to standard time. Although nobody in the real world cared (because it didn't affect UTC offsets or even abbreviations), the change consumed nontrivial maintenance effort. And the only reason the change was needed was because of tm_isdst, a feature that should never have been made visible to users in the first place. I wouldn't want even more unnecessary maintenance make-work to follow from exposing even more unnecessary details to users.
On Jan 6, 2024, at 12:52 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 2024-01-06 12:25, Guy Harris via tz wrote:
Not all users of the TZDB data use the tzdb code. Some use the source files rather than the TZif files, and I think some use the TZif files but have their own code to read them
This partly depends on what one means by "their own code". On my Fedora 39 system, for example, even /usr/bin/zdump uses the GNU C Library's TZif-reading code, not tzcode's. Although there might be a distant relationship between glibc's TZif reader and tzcode's, by now they are quite separate code bases.
In contrast, the BSDs tend to use code derived from tzcode, though the code bases diverged years ago. Possibly these could be unified if someone had the time. Presumably that'd be a good thing....
I'm not referring to what the POSIX API implementations in Unix-like systems do, I'm referring to *other* users of the tzdb. For example, ICU has its own code to read TZif files: https://github.com/unicode-org/icu/blob/main/icu4c/source/tools/tzcode/tz2ic...
On 1/6/2024 3:31 PM, Paul Eggert wrote:
On 2024-01-06 09:55, Brooks Harris via tz wrote:
in many situations, such as STDOFF shifts, STDOFF shifts simultaneous with DST shifts, or "double summertime" the TzIf values are essentially 'lying'

Sorry. "Lying" might be the wrong word, didn't mean to insult TzDb. I just meant the values of gmtoff and stdoff are adjusted to satisfy Posix-time rather than reflect the values in the source files.
Surely this goes too far. Whether a particular UT offset is "standard time" or "daylight saving time" (or something else) does not concern ordinary users and is pretty much arbitrary - as witness the disagreement over whether Morocco observes "standard time" or "daylight saving time" for most of the year. To this end, the TZif values are not lying; they're merely limiting themselves to the info needed to display timestamps.
What matters to users is "What time is it?". Questions like "Is daylight saving time observed now?", "Is daylight saving ever observed?" and "What is the standard time now, ignoring any DST observance?" are timekeeping nerds' means to that end, not the end itself, and are best left to tzcode's internals.
This is why theory.html says "The tm_isdst member is almost never needed and most of its uses should be discouraged....". In hindsight there never should have been a tm_isdst (instead, there should have been tm_gmtoff and tm_zone) though obviously it's too late now to remove tm_isdst.
On Jan 6, 2024, at 1:03 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 2024-01-06 12:45, Guy Harris via tz wrote:
there is no inherent reason why the TZif format, for example, should limit itself to supporting only the date/time functions that POSIX and the C standard currently happen to provide
Although an API *could* support deeper interrogation of TZif details, I doubt whether that'd be a good idea. The distinction between stdoff and tm_gmtoff is not always clear, any extra information you'd extract from the data could well be dubious (certainly there are dubious corners in this part of tzdata now), and improving tzdata and maintaining the result would require unnecessary work.
We saw an instance of this sort of thing in America/Nuuk last year. Before 2023d, summer 2023 was marked as daylight saving time, but 2023d changed that to standard time. Although nobody in the real world cared (because it didn't affect UTC offsets or even abbreviations), the change consumed nontrivial maintenance effort. And the only reason the change was needed was because of tm_isdst, a feature that should never have been made visible to users in the first place.
I wouldn't want even more unnecessary maintenance make-work to follow from exposing even more unnecessary details to users.
Perhaps exposing the stuff that, say, the java.time people want exposed would not be ideal...

...but they already work around the lack of that information in TZif files by reading the source files, so it's not as if not providing that information in TZif files prevents people from using information from the source files in ways that are, perhaps, unwise.

Perhaps generating JSON files from the source is the way to go there; that would, at least, let the source file format and contents not be constrained by code that uses them directly.

However, I'd still like the people who use the source files to get STDOFF and SAVE information - and, for that matter, rule information - to give examples of how this is useful.
On Sat, 6 Jan 2024 at 21:52, Guy Harris via tz <tz@iana.org> wrote:
However, I'd still like the people who use the source files to get STDOFF and SAVE information - and, for that matter, rule information - to give examples of how this is useful.
The various Java APIs all expose both the standard offset and the actual offset, and say that daylight savings is when the two differ.

ZoneRules:
- getOffset(Instant) - gets the actual offset from UTC at an instant on the timeline
- getStandardOffset(Instant) - gets the standard ("winter") offset from UTC at an instant on the timeline
- getDaylightSavings(Instant) - gets the duration added to standard offset to get the actual offset
- isDaylightSavings(Instant) - derived from whether standard and actual are equal
- getTransitions() - the historic list of transitions, describing discontinuities in the local timeline
- getTransitionRules() - the list of transition rules, describing discontinuities in the local timeline in the future

https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/...

Code in https://github.com/openjdk/jdk/tree/master/src/java.base/share/classes/java/...
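The relationship between those accessors can be written down in a few lines. This is a Python sketch of the definitions as described in the list, not Java's implementation, and the offsets used are made-up illustrative values rather than data read from tzdb.

```python
from datetime import timedelta

def daylight_savings(standard, actual):
    """Amount added to the standard offset to get the actual offset."""
    return actual - standard

def is_daylight_savings(standard, actual):
    """True when the standard and actual offsets differ."""
    return standard != actual

# Illustrative values only: a zone with standard offset +01:00 whose
# clocks read +00:00 in winter (a negative SAVE in tzdb terms).
standard = timedelta(hours=1)
winter_actual = timedelta(0)
winter_save = daylight_savings(standard, winter_actual)
```

A TZif file records only the actual offset for each time type, so the standard offset fed into such a calculation has to come from somewhere else, which is the gap this part of the thread is about.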
So what do the (presumed) Booleans "dstLegal" and "dstSummer" mean here?

They allow the JSON file to express the difference between negative DST and "are the clocks > standard offset" DST.
For clarity, the idea implies that the JSON is an alternative output file of the project, just not a binary one.

On Sat, 6 Jan 2024 at 20:31, Paul Eggert via tz <tz@iana.org> wrote:
What matters to users is "What time is it?". Questions like "Is daylight saving time observed now?", "Is daylight saving ever observed?" and "What is the standard time now, ignoring any DST observance?" are timekeeping nerds' means to that end, not the end itself, and are best left to tzcode's internals.
I understand you wish this were true. But it hasn't been true ever since Java had a date-time API parsing the TZDB source files. Java does expose those things, and will continue to do so. The idea I'm putting forward is a better way to expose that data, which ensures that the maintainer and consumer can easily see the data and whether it has changed (deliberately or accidentally).

Stephen
On Jan 6, 2024, at 1:48 PM, Brooks Harris via tz <tz@iana.org> wrote:
On 1/6/2024 3:31 PM, Paul Eggert wrote:
On 2024-01-06 09:55, Brooks Harris via tz wrote:
in many situations, such as STDOFF shifts, STDOFF shifts simultaneous with DST shifts, or "double summertime" the TzIf values are essentially 'lying'
Sorry. "Lying" might be the wrong word, didn't mean to insult TzDb. I just meant the values of gmtoff and stdoff are adjusted to satisfy Posix-time rather than reflect the values in the source files.
POSIX's time API has no notion of "gmtoff" and "stdoff". What it has is a global variable named "timezone" which, as per my earlier mail, is described in the Single UNIX Specification thus:

  The external variable timezone shall be set to the difference, in seconds, between Coordinated Universal Time (UTC) and local standard time.

This came from System V; the System V Interface Definition, Issue 2, Volume 1:

  http://bitsavers.org/pdf/att/unix/SVID/System_V_Interface_Definition_Issue_2...

says, on page 162, that

  The external long variable timezone contains the difference, in seconds, between GMT and local standard time (in EST, timezone is 5*60*60); the external variable daylight is non-zero only if the standard USA Daylight Savings Time conversion should be applied.

What the System V Release 3.1 code

  https://archive.org/download/ATTUNIXSystemVRelease4Version2

(ignore the title, it has older versions) does is to set timezone to the offset as specified in the TZ environment variable, which means it's "the difference, in seconds, between GMT and local standard time", e.g., if TZ is set to EST5EDT, it's set to 5*60*60, and stays there, even if the system, or the time that was last converted, is currently in DST.

What our code does, in update_tzname_etc(), is

  #if USG_COMPAT
    if (!ttisp->tt_isdst)
      timezone = - ttisp->tt_utoff;
  #endif

which means that, if and only if the entry handed to it is for "non-DST" time, it sets timezone to the offset from UTC.

("USG_COMPAT" refers to the UNIX Support Group in AT&T, which was the group that produced the System {Roman Numeral} releases, before it became, as I remember, the UNIX System Development Laboratory. At the time, UN*Xes were generally called either "BSD" or "USG", depending on which particular flavor they were, although a lot of commercial UN*Xes picked up features, including APIs, from both.)
I'm not sure what "reflect the values in the source files" means, given that, for a given timezone, there could be *multiple* values of "the difference, in seconds, between Coordinated Universal Time (UTC) and local standard time". Either there's only one value for the timezone, in which case ttisp->tt_utoff, for all ttisp->tt_isdst, has the same value, and all "adjustments" just set timezone to the value it already has, or there's more than one value, in which case there's no indication of what either "[setting it] to the difference, in seconds, between Coordinated Universal Time (UTC) and local standard time" or "[reflecting] the values in the source files" would mean.
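To make the "more than one value" case concrete, here is a minimal Python sketch. It is not tzcode: the `TimeType` tuple and the `update_timezone()` helper are hypothetical names that only imitate the USG_COMPAT branch quoted above, and the order in which the types are fed in is an assumption made for illustration. The offsets are the London ones discussed in this thread.

```python
from typing import NamedTuple, Optional


class TimeType(NamedTuple):
    """Hypothetical stand-in for a local time type (only the fields used here)."""
    utoff: int    # seconds east of UTC
    isdst: bool   # value that tm_isdst would be set to
    abbr: str


def update_timezone(current: Optional[int], tt: TimeType) -> Optional[int]:
    """Imitate the USG_COMPAT branch: only non-DST types touch `timezone`."""
    if not tt.isdst:
        return -tt.utoff
    return current


# Illustrative order only; the real order depends on which types are processed.
london_types = [
    TimeType(0, False, "GMT"),
    TimeType(3600, True, "BST"),    # summer time: leaves `timezone` alone
    TimeType(3600, False, "BST"),   # British Standard Time, flagged as non-DST
]

timezone = None
history = []
for tt in london_types:
    timezone = update_timezone(timezone, tt)
    history.append(timezone)

print(history)
```

With two different non-DST offsets in the list, the final value is simply that of whichever non-DST type was handled last, which is the ambiguity described above.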
On Jan 6, 2024, at 2:30 PM, Stephen Colebourne via tz <tz@iana.org> wrote:
On Sat, 6 Jan 2024 at 21:52, Guy Harris via tz <tz@iana.org> wrote:
However, I'd still like the people who use the source files to get STDOFF and SAVE information - and, for that matter, rule information - to give examples of how this is useful.
The various Java APIs all expose both the standard offset and the actual offset, and say that daylight savings is when the two differ.
Not that the decision to do so would be easily reversible, if reversible at all, just as the decision by the Bell Labs Research folks to expose a "DST is in effect at the date and time in question" member of struct tm would not be easily reversible, if reversible at all, but what was the rationale for exposing them both? Was there a situation where a Java library or application needed that information?
So what do the (presumed) Booleans "dstLegal" and "dstSummer" mean here?
They allow the JSON file to express the difference between negative DST and "are the clocks > standard offset" DST.
(As one reason why tm_isdst was a mistake pops up again....) Was this done before, or after, the "Ireland does things differently" issue popped up?
On 2024-01-06 14:51, Guy Harris via tz wrote:
POSIX's time API has no notion of "gmtoff" and "stdoff".
Although that's true for current POSIX, POSIX 202x/D3 does have tm_gmtoff in struct tm, as a result of Austin Group Defect 1533 <https://austingroupbugs.net/view.php?id=1533> which saw steffen and kre as contributors.
On Sat, 6 Jan 2024 at 23:00, Guy Harris <gharris@sonic.net> wrote:
what was the rationale for exposing them both? Was there a situation where a Java library or application needed that information?
So people can use them? https://stackoverflow.com/questions/1060479/determine-whether-daylight-savin...
So what do the (presumed) Booleans "dstLegal" and "dstSummer" mean here?
They allow the JSON file to express the difference between negative DST and "are the clocks > standard offset" DST.
Was this done before, or after, the "Ireland does things differently" issue popped up?
isDaylightSavings() returns true in summer for Dublin in Java - always has done. Stephen
On 2024-01-06 14:30, Stephen Colebourne via tz wrote:
On Sat, 6 Jan 2024 at 20:31, Paul Eggert via tz<tz@iana.org> wrote:
What matters to users is "What time is it?". Questions like "Is daylight saving time observed now?", "Is daylight saving ever observed?" and "What is the standard time now, ignoring any DST observance?" are timekeeping nerds' means to that end, not the end itself, and are best left to tzcode's internals.
I understand you wish this were true. But it hasn't been true ever since Java had a date-time API parsing the TZDB source files. Java does expose those things, and will continue to do so.
It's true even with the Java API, in the sense that Java users by and large have the same needs as POSIX users: they need to know local time, not all the internal machinery that underlies calculations of local time. It was a design mistake for the POSIX API to expose some of that internal machinery, as exposing it causes more trouble in user code than it cures: it causes users to mistakenly think that they need to know about and use tm_isdst to get their work done. To the extent that the Java APIs inherit this POSIX misfeature, they have a similar problem. The example <https://stackoverflow.com/questions/1060479/determine-whether-daylight-savin...> that you recently gave is an instance of this similar problem. Of course this is all water under the bridge for the current APIs. However, it would be better if future protocols and APIs and formats did not have to repeat these mistakes of their predecessors.
On Jan 6, 2024, at 3:35 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 2024-01-06 14:51, Guy Harris via tz wrote:
POSIX's time API has no notion of "gmtoff" and "stdoff".
Although that's true for current POSIX, POSIX 202x/D3 does have tm_gmtoff in struct tm, as a result of Austin Group Defect 1533 <https://austingroupbugs.net/view.php?id=1533> which saw steffen and kre as contributors.
Yes, I know, I signed up for the list so that I could read the draft. But there's no tm_stdoff, so it wasn't obvious that "gmtoff" and "stdoff" were referring to values in struct tm, and Brooks Harris said
I just meant the values of gmtoff and stdoff are adjusted to satisfy Posix-time rather than reflect the values in the source files.
but tm_gmtoff is currently obviously not "adjusted to satisfy Posix-time" as, when tm_gmtoff was introduced, there was no tm_gmtoff in POSIX or, for that matter, in most UN*Xes (was it in *any* UN*Xes before tzcode put it in?), and it's also not "adjusted", it's just the current offset as calculated from the values in the source files. And, as there's no "tm_stdoff" value even now, there's nothing *to* adjust about stdoff.
On Jan 6, 2024, at 3:37 PM, Stephen Colebourne via tz <tz@iana.org> wrote:
On Sat, 6 Jan 2024 at 23:00, Guy Harris <gharris@sonic.net> wrote:
what was the rationale for exposing them both? Was there a situation where a Java library or application needed that information?
So people can use them? https://stackoverflow.com/questions/1060479/determine-whether-daylight-savin...
What they're doing is
    I have a Java class that takes in the latitude/longitude of a location and returns the GMT offset when daylight savings time is on and off. I am looking for an easy way to determine in Java if the current date is in daylight savings time so I can apply the correct offset.
and, in answer to an obvious next question, more of what they're doing is
    Currently I am retrieving the time zone information using the GeoTools library and a shapefile provided by the National Atlas of the United States (nationalatlas.gov/mld/timeznp.html). Fortunately this provides me with some additional information - primarily the time zone symbol which is 2 or 4 digits (i.e AL, EA, EAno, etc). Unfortunately this value doesn't correspond to those used by Java time zones although I could perform this mapping manually. Ideally I'd like a solution that would work if I replaced this file with a world time zone shapefile but that might be too ambitious.
The answers point out that you're going to need the "Java time zone" anyway - the top answer suggests using inDaylightTime(), and points out that "A server trying to figure this out for a client will need the client's time zone."
If you have the relevant time zone, however, converting the current date to what I'm presuming is local time for whatever time zone that is, you can just use getOffset() to get the offset and not waste time caring about DST.
I.e., their belief that they need to be able to determine whether DST is active may be based on ignorance of one or more of 1) the getOffset() method of a TimeZone or 2) the fact that they're going to need a TimeZone *anyway* to determine whether DST is active.
And the information I was asking about was the standard offset and the actual offset; they're trying to determine the actual offset, and, once you have that, the standard offset isn't useful.
The problem is that they're trying to determine the actual offset in a roundabout way, by getting the standard offset, somehow determining what the offset would be in DST (probably by making a not-necessarily-valid assumption about DST shifting time one hour ahead), and then determining whether DST is in effect or not. If the latter is done with inDaylightTime(), I infer that the java.time *implementation* of that method gets the standard and actual offsets and compares them, but that could equally well be done by using a TZif file that provides both a "what should tm_isdst be set to" value and a "what should inDaylightTime() return" value, without the standard offset being provided by the TZif file. So, are there any cases where code would need information such as is provided by getRawOffset() for reasons *other* than "I'm trying to implement something that I could just do by calling the appropriate java.time method rather than duplicating what it does"? And are there any cases where code would need information such as is provided by inDaylightTime() for reasons other than the aforementioned?
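A small Python sketch of the difference between asking for the offset directly and reconstructing it from "standard offset plus DST". The zone here is entirely hypothetical (`HalfHourDstZone`, with a 30-minute saving and a crude month-based rule), chosen only so that the one-hour assumption visibly fails; it is not a model of any Java API or of any real tzdb zone.

```python
from datetime import datetime, timedelta, tzinfo


class HalfHourDstZone(tzinfo):
    """Hypothetical zone: UTC+10:30 standard, 30-minute saving October through March."""
    STD = timedelta(hours=10, minutes=30)
    SAVE = timedelta(minutes=30)

    def dst(self, dt):
        # Crude month-based rule, for illustration only.
        return self.SAVE if dt.month in (10, 11, 12, 1, 2, 3) else timedelta(0)

    def utcoffset(self, dt):
        return self.STD + self.dst(dt)

    def tzname(self, dt):
        return "XDT" if self.dst(dt) else "XST"


zone = HalfHourDstZone()
when = datetime(2024, 1, 15, 12, 0, tzinfo=zone)

# Direct: ask for the offset in effect at that time.
direct = when.utcoffset()

# Roundabout: standard offset, plus an assumed one-hour shift if DST is on.
in_dst = bool(when.dst())
roundabout = HalfHourDstZone.STD + (timedelta(hours=1) if in_dst else timedelta(0))

print(direct, roundabout)
```

The direct call needs neither the standard offset nor the DST flag, while the roundabout version is off by half an hour in this made-up zone.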
isDaylightSavings() returns true in summer for Dublin in Java - always has done.
So I presume from that answer that isDaylightSavings(some time) is not always the same as the tm_isdst value in the structure returned by localtime(some time). If so, why was it decided to have it work differently?
...when tm_gmtoff was introduced, there was no tm_gmtoff in POSIX or, for that matter, in most UN*Xes (was it in *any* UN*Xes before tzcode put it in?)...
Some tm_gmtoff history.
One general note: the earliest time zone package work was influenced by what had been done for UNIX System V, BSD, and HP-UX.
A source for a UNIX System V manual:
    http://bitsavers.trailing-edge.com/pdf/att/3b1/999-801-312IS_ATT_UNIX_PC_Sys...
System V version 3.51 (1986) did not have tm_gmtoff.
A source for BSD-derived manuals:
    https://man.freebsd.org/cgi/man.cgi/help.html
Neither HP-UX 8.07 (1991) nor HP-UX 11.22 (2002) had tm_gmtoff, so it would not have been present before those dates.
The earliest 1986 version of the time zone package (distributed on request) did not have tm_gmtoff.
The 1987-02-28 version of the time zone package distributed via USENET's mod.sources did have tm_gmtoff.
    https://groups.google.com/g/mod.sources/c/2Jq1irYs0w4
tm_gmtoff and tm_zone were added at the same time; there's material in the IANA mailing list archive in late 1986 and early 1987 on making additions to struct tm.
And a month after the mod.sources posting...
    From ado Sat Mar 28 15:13:19 1987
    To: tz
    Subject: Berkeley variant of time zone stuff
    Status: RO
    ...
    Berkeley forged ahead and added the "tm_gmtoff" and "tm_zone" elements to the tm structure.
    ...
...meaning that it wasn't present in BSD before then.
@dashdashdo
On Sat, Jan 6, 2024 at 7:07 PM Guy Harris via tz <tz@iana.org> wrote:
On Jan 6, 2024, at 3:35 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 2024-01-06 14:51, Guy Harris via tz wrote:
POSIX's time API has no notion of "gmtoff" and "stdoff".
Although that's true for current POSIX, POSIX 202x/D3 does have tm_gmtoff in struct tm, as a result of Austin Group Defect 1533 < https://austingroupbugs.net/view.php?id=1533> which saw steffen and kre as contributors.
Yes, I know, I signed up for the list so that I could read the draft.
But there's no tm_stdoff, so it wasn't obvious that "gmtoff" and "stdoff" were referring to values in struct tm, and Brooks Harris said
I just meant the values of gmtoff and stdoff are adjusted to satisfy Posix-time rather than reflect the values in the source files.
but tm_gmtoff is currently obviously not "adjusted to satisfy Posix-time" as, when tm_gmtoff was introduced, there was no tm_gmtoff in POSIX or, for that matter, in most UN*Xes (was it in *any* UN*Xes before tzcode put it in?), and it's also not "adjusted", it's just the current offset as calculated from the values in the source files.
And, as there's no "tm_stdoff" value even now, there's nothing *to* adjust about stdoff.
On Sun, 7 Jan 2024 at 02:21, Guy Harris <gharris@sonic.net> wrote:
So, are there any cases where code would need information such as is provided by getRawOffset() for reasons *other* than "I'm trying to implement something that I could just do by calling the appropriate java.time method rather than duplicating what it does"? And are there any cases where code would need information such as is provided by inDaylightTime() for reasons other than the aforementioned?
It really doesn't matter if there are use cases or not - the API is available and will not be removed. Developers have been free to use them for 20+ years and will continue to do so, whether there are better choices they could be making or not.
isDaylightSavings() returns true in summer for Dublin in Java - always has done.
So I presume from that answer that isDaylightSavings(some time) is not always the same as the tm_isdst value in the structure returned by localtime(some time). If so, why was it decided to have it work differently?
In my opinion, it was TZDB that decided to change the meaning of the flag. Stephen
On 1/6/2024 7:06 PM, Guy Harris wrote:
On Jan 6, 2024, at 3:35 PM, Paul Eggert<eggert@cs.ucla.edu> wrote:
On 2024-01-06 14:51, Guy Harris via tz wrote:
POSIX's time API has no notion of "gmtoff" and "stdoff".
Although that's true for current POSIX, POSIX 202x/D3 does have tm_gmtoff in struct tm, as a result of Austin Group Defect 1533 <https://austingroupbugs.net/view.php?id=1533> which saw steffen and kre as contributors.
Yes, I know, I signed up for the list so that I could read the draft.
But there's no tm_stdoff, so it wasn't obvious that "gmtoff" and "stdoff" were referring to values in struct tm, and Brooks Harris said
I just meant the values of gmtoff and stdoff are adjusted to satisfy Posix-time rather than reflect the values in the source files.
but tm_gmtoff is currently obviously not "adjusted to satisfy Posix-time" as, when tm_gmtoff was introduced, there was no tm_gmtoff in POSIX or, for that matter, in most UN*Xes (was it in *any* UN*Xes before tzcode put it in?), and it's also not "adjusted", it's just the current offset as calculated from the values in the source files.
And, as there's no "tm_stdoff" value even now, there's nothing *to* adjust about stdoff.
Right, but I think there should be. Posix cannot distinguish an stdoff shift independent from a gmtoff shift.
For example, there is a STDOFF shift at 1971 Oct 31 2:00u from 1:00 to 0:00 (a West shift) in London:
    # Zone  NAME           STDOFF    RULES    FORMAT   [UNTIL]
    Zone    Europe/London  -0:01:15  -        LMT      1847 Dec 1 0:00s
                           0:00      GB-Eire  %s       1968 Oct 27
                           1:00      -        BST      1971 Oct 31 2:00u
                           0:00      GB-Eire  %s       1996
                           0:00      EU       GMT/BST
Zic and TzIf reflect this change as a shift in gmtoff, not stdoff:
    57722399 1971-10-31 02:59:59 isdst 0 gmtoff 3600 stdoff 0 BST
    57722400 1971-10-31 02:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
That's what I mean by "adjusted" for Posix sake. It gives the proper UTC offset, yes, but not for the right reason. The underlying reason was an STDOFF shift, presumably stated in the law behind it. TzDb source has been careful to try to honor the laws and customs behind the local time rules, and I strongly support that policy.
I would presume this custom originates with the USA laws that specifically define the time zones as offsets from UTC and the optional one-hour DST shifts. These are the familiar behaviors of time zones where DST is "observed". But many time zones do not follow these familiar patterns and we have to be careful about that. Many zones have shifted their STDOFF, not always by an hour, and made multiple DST shifts (so-called "double summertime") and not always by an hour, and sometimes simultaneously. This can get tricky.
---------------------
I'm involved with several standards projects at the Society of Motion Picture and Television Engineers (SMPTE). The television and broadcast industries have used "SMPTE Timecode" (defined primarily by a standard named ST12-1) for decades, since the mid 1970s. SMPTE Timecode is used in very many ways, from labeling of video frames to synchronization and essentially glues the timekeeping of the whole industry together. SMPTE Timecode was derived from IRIG and is quite similar to the formats used by WWV radio and others with addition of video-related metadata.
Timecode is used all over the world, where various video rates are used.
For example most of Europe uses 1/25 (25 frames-per-second). These systems have an exact relation to running time in seconds and are sometimes called "integral rate". But the USA, Canada, Japan and others use the 'strange' rate of 30000/1001, called a "non-integral rate". This has all sorts of algorithmic ramifications and implementations must be careful with their math. But this is well-known in the industry. Meantime there is the matter of audio synchronization of various frequencies with video. A typical audio frequency is 48000/1.
If you're interested in more detail about video and audio rates and SMPTE Timecode you might read:
    Conversion of Audio Samples to Video Frames
    Brooks Harris, May 8, 1998
    http://edlmax.com/AudioNTSC.htm
    Conversion between SMPTE hh:mm:ss:ff Time Code and Frames
    Brooks Harris, EdlMax, LLC. Version V4 2015-04-04
    http://edlmax.com/SMPTETimeCodeConversion.htm
In broadcast television it's important to have the hh:mm:ss portion of the timecode run as close as possible to local time. But SMPTE timecode and equipment that depends on it cannot tolerate a discontinuity in the hh:mm:ss:video-frame sequence. To work around this limitation the industry has typically adopted a procedure called "Daily Jam". Here, DST shifts, STDOFF shifts and leap-second shifts are deferred to some local time and the hh:mm:ss:video-frame values are reset ("jammed") to match the local wall-clock time. This is an imperfect solution which can produce a "glitch" in the hh:mm:ss:video-frame sequence, especially for systems running at "non-integer" rates. Therefore the "Daily Jam" is typically instituted at some local time least likely to disrupt normal operations. In the USA this time is typically 02:00:00 (2am). Daily Jam is imperfect but an effective work-around that has been used for decades.
In recent years SMPTE has developed several standards related to packet-based network "streaming" of video and audio.
They've elected to use IEEE Precision Time Protocol as the primary synchronization reference. The standards called ST2059-1 and ST2059-2 in particular define the relation of timecode to PTP time. These also codify the use of Daily Jam procedures.
User requirements call for use of local time together with video frame and audio-sample accuracy. And it is this set of challenges that have brought me (kicking and screaming :-) ) into the wider world of timekeeping and so to Tz Database.
In 2019 I was invited to make a presentation to ION/PTTI:
    Accurate Local Timestamps, Brooks Harris
    https://www.ion.org/publications/abstract.cfm?articleID=16763
In this I pointed out that typical timestamps were insufficient to represent truly unambiguous local time. If we have two ISO 8601 timestamps with date, time and UTC offset:
    2023-07-01 00:01:23 -7:00
    2023-07-01 00:04:56 -7:00
We can normalize to UTC and determine that the first precedes the second; precedence is established. Great! This, of course, is the first and most important aspect of timekeeping. This was the whole point to Posix-time to begin with; maintain event precedence within the system.
But this entirely misses the fact that the two timestamps may have come from two different time zones. If you add the time zone it becomes much clearer what the meaning of the two events is:
    2023-07-01 00:01:23 -7:00 America/Los_Angeles
    2023-07-01 00:04:56 -7:00 America/Phoenix
But this does not signal that DST was in effect in America/Los_Angeles and not observed in America/Phoenix. You could add an "isdst" flag:
    2023-07-01 00:01:23 -7:00 America/Los_Angeles DST
    2023-07-01 00:04:56 -7:00 America/Phoenix STD
That works (more-or-less) in these two typical and familiar cases. But when you get to other time zones that simple logic may not hold up.
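The two-timestamp example can be sketched in a few lines of Python. This is only an illustration: the zone names and the DST/STD labels are copied from the example as plain strings, and no tz database lookup is performed.

```python
from datetime import datetime, timedelta, timezone

minus7 = timezone(timedelta(hours=-7))
first = datetime(2023, 7, 1, 0, 1, 23, tzinfo=minus7)
second = datetime(2023, 7, 1, 0, 4, 56, tzinfo=minus7)

# Precedence: aware datetimes compare by their UTC instants.
first_precedes = first < second
print(first.astimezone(timezone.utc).isoformat())
print(second.astimezone(timezone.utc).isoformat())

# The offset alone cannot tell the two sources apart.
same_offset = first.utcoffset() == second.utcoffset()

# Zone and DST status have to travel alongside the timestamp (labels from the example).
labelled = [
    (first, "America/Los_Angeles", "DST"),
    (second, "America/Phoenix", "STD"),
]
for stamp, zone_name, flag in labelled:
    print(stamp.isoformat(sep=" "), zone_name, flag)
```

Ordering works from date, time and offset alone; everything else in the labelled list is extra information that the offset does not carry.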
My favorite "kryptonite" example is Europe/Moscow:
    # Rule  NAME    FROM  TO    TYPE  IN   ON       AT     SAVE  LETTER/S
    Rule    Russia  1917  only  -     Jul  1        23:00  1:00  MST   # Moscow Summer Time
    Rule    Russia  1917  only  -     Dec  28       0:00   0     MMT   # Moscow Mean Time
    Rule    Russia  1918  only  -     May  31       22:00  2:00  MDST  # Moscow Double Summer Time
    Rule    Russia  1918  only  -     Sep  16       1:00   1:00  MST
    Rule    Russia  1919  only  -     May  31       23:00  2:00  MDST
    Rule    Russia  1919  only  -     Jul  1        0:00u  1:00  MSD
    Rule    Russia  1919  only  -     Aug  16       0:00   0     MSK
    Rule    Russia  1921  only  -     Feb  14       23:00  1:00  MSD
    Rule    Russia  1921  only  -     Mar  20       23:00  2:00  +05
    Rule    Russia  1921  only  -     Sep  1        0:00   1:00  MSD
    Rule    Russia  1921  only  -     Oct  1        0:00   0     -
    Rule    Russia  1981  1984  -     Apr  1        0:00   1:00  S
    Rule    Russia  1981  1983  -     Oct  1        0:00   0     -
    Rule    Russia  1984  1995  -     Sep  lastSun  2:00s  0     -
    Rule    Russia  1985  2010  -     Mar  lastSun  2:00s  1:00  S
    Rule    Russia  1996  2010  -     Oct  lastSun  2:00s  0     -
    # Zone  NAME           STDOFF   RULES   FORMAT   [UNTIL]
    Zone    Europe/Moscow  2:30:17  -       LMT      1880
                           2:30:17  -       MMT      1916 Jul 3  # Moscow Mean Time
                           2:31:19  Russia  %s       1919 Jul 1 0:00u
                           3:00     Russia  %s       1921 Oct
                           3:00     Russia  MSK/MSD  1922 Oct
                           2:00     -       EET      1930 Jun 21
                           3:00     Russia  MSK/MSD  1991 Mar 31 2:00s
                           2:00     Russia  EE%sT    1992 Jan 19 2:00s
                           3:00     Russia  MSK/MSD  2011 Mar 27 2:00s
                           4:00     -       MSK      2014 Oct 26 2:00s
                           3:00     -       MSK
Whoa man! Take the transition at 1991-03-31 02:00:00
                           3:00     Russia  MSK/MSD  1991 Mar 31 2:00s
                           2:00     Russia  EE%sT    1992 Jan 19 2:00s
This is an STDOFF shift with a simultaneous offsetting DST shift. As Time And Date puts it:
    Mar 31 No change, Mar 31, 1991 - Daylight Saving Time Started
    DST started on Sunday, March 31, 1991, 2:00:00 am. However, clocks were not changed because Moscow switched time zones at the same time.
    https://www.timeanddate.com/time/change/russia/moscow?year=1991
This is a "no-op" transition (the YMDhms sequence is not interrupted) but several important facts changed; both STDOFF and DST shifted, and the Abbreviation changed.
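The arithmetic behind this "no-op" transition can be written out as a short Python sketch. The `Regime` record and `describe()` helper are hypothetical names used only for illustration; the STDOFF values come from the two Zone lines above and the SAVE values from the Russia rules in effect on either side of 1991-03-31.

```python
from datetime import timedelta
from typing import NamedTuple


class Regime(NamedTuple):
    """Hypothetical record: a Zone line's STDOFF plus the active Rule's SAVE."""
    stdoff: timedelta
    save: timedelta
    abbr: str

    @property
    def utoff(self) -> timedelta:
        # Total offset from UT is standard offset plus saving.
        return self.stdoff + self.save


before = Regime(timedelta(hours=3), timedelta(0), "MSK")
after = Regime(timedelta(hours=2), timedelta(hours=1), "EEST")


def describe(old: Regime, new: Regime) -> dict:
    """Report which components differ across a transition."""
    return {
        "stdoff_changed": old.stdoff != new.stdoff,
        "save_changed": old.save != new.save,
        "abbr_changed": old.abbr != new.abbr,
        "utoff_changed": old.utoff != new.utoff,
    }


change = describe(before, after)
print(change)
```

Three of the four components change while the total offset stays at three hours, which is why wall clocks did not move.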
Zic also sets the isdst flag:
    670373999 1991-03-31 01:59:59 isdst 0 gmtoff 10800 stdoff 10800 MSK
    670374000 1991-03-31 02:00:00 isdst 1 gmtoff 10800 stdoff 10800 EEST
There is another objective in my work. In video we have always been able to represent any time-point within the 24-hour range, important for editorial adjustments and duration calculation. To do this comprehensively with local time you need to know that this day includes a transition and what and when that transition occurs. Thus, my formats include information to signal "today is a transition day", "what the transition is", and "when it occurs". In this Moscow example both STDOFF and DST shift and these are independently indicated.
    D1991-03-30T23:59:59U+03Zeurope/moscowAmskV2021aL16MuX UTC 00670366815
    D1991-03-31T00:00:00U+03w01+02Zeurope/moscowAmskV2021aL16S00t01a02cMuX UTC 00670366816
    D1991-03-31T01:59:59U+03w01+02Zeurope/moscowAmskV2021aL16S00t01a02cMuX UTC 00670374015
    D1991-03-31T02:00:00U+03w01+02Zeurope/moscowAeestV2021aL16S01t01a02cMuX UTC 00670374016
    D1991-03-31T23:59:59U+03w01+02Zeurope/moscowAeestV2021aL16S01t01a02cMuX UTC 00670453215
    D1991-04-01T00:00:00U+03Zeurope/moscowAeestV2021aL16S01cMuX UTC 00670453216
There are many details to the formats and implementation I'm working on but I hope this gives an idea why I believe it is useful to include the STDOFF data in any timestamp format and to include such transitions in the output of zic. I also hope Posix might consider adding "tm_stdoff" to struct tm.
Thanks,
-Brooks
On 2024-01-07 12:01, Brooks Harris wrote:
Zic and TzIf reflect this change as a shift in gmtoff, not stdoff:
    57722399 1971-10-31 02:59:59 isdst 0 gmtoff 3600 stdoff 0 BST
    57722400 1971-10-31 02:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
That's what I mean by "adjusted" for Posix sake. It gives the proper UTC offset, yes, but not for the right reason. The underlying reason was an STDOFF shift, presumably stated in the law behind it.
I'm still not following this, unfortunately. The use cases you gave mostly involved comparing local-time timestamps when their UT offsets are known, and obviously gmtoff suffices for that. You mentioned one use case for which the POSIX API is ill-designed (namely, "find the next gmtoff transition"), but gmtoff suffices for that too: we don't need stdoff there either.
Also, it's often not the case that the underlying reason was stated in the law behind it. The laws often don't specify this information, or the laws simply aren't available (we're relying on secondary sources), and in these cases I just made up internal details like stdoff to make the POSIX-visible timestamp info correct. APIs should not be exposing jury-rigged internal details to the user, as many of these details are simply my invention and do not reflect known legislation.
On Jan 7, 2024, at 12:01 PM, Brooks Harris <brooks@edlmax.com> wrote:
Right, but I think there should be. Posix cannot distinguish an stdoff shift independent from a gmtoff shift.
So presumably:
    "stdoff shift" is short for "a shift in the offset between UTC and standard time", "standard time" being what is specified as such by law;
    "gmtoff shift" is short for "a shift in the offset between civil time and standard time".
Given that, not all shifts in the offset between UTC and civil time are "gmtoff shifts", which is a bit confusing, given that "gmtoff" sounds as if it's an offset from "GMT" or UTC. Not all "gmtoff shifts" constitute transitions to or from "daylight saving time", as Morocco, for example, has on some occasions introduced daylight saving time, but also shifts the clock during Ramadan.
For example, there is a STDOFF shift at 1971 Oct 31 2:00u from 1:00 to 0:00 (a West shift) in London:
    # Zone  NAME           STDOFF    RULES    FORMAT   [UNTIL]
    Zone    Europe/London  -0:01:15  -        LMT      1847 Dec 1
                           0:00      GB-Eire  %s       1968 Oct 27
                           1:00      -        BST      1971 Oct 31 2:00u
                           0:00      GB-Eire  %s       1996
                           0:00      EU       GMT/BST
Zic and TzIf reflect this change as a shift in gmtoff, not stdoff:
    57722399 1971-10-31 02:59:59 isdst 0 gmtoff 3600 stdoff 0 BST
    57722400 1971-10-31 02:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
RFC 8536:
    https://datatracker.ietf.org/doc/html/rfc8536
describes TZif format (to use the capitalization in the RFC) up to version 3 (version 4 is not documented in the RFC, but it's a small change to leap second handling, documented in the tzfile man page).
According to it, the local time type records contain a field named utoff (presumably changed from gmtoff), which contains "a four-octet signed integer specifying the number of seconds to be added to UT in order to determine local time" for that local time type, a Boolean field that indicates whether that local time type is to be considered DST or not, and a field that is an index into the table of time zone designation strings for the designation string to be used for that local time type.
Transitions are represented in the file by two parallel arrays (rather than an array of structures), one containing the UNIX time of the transition and one containing the local time type to which local time is transitioning. The local time type includes an offset from UTC and a "is this DST?" flag. Neither of those directly provides a "offset of standard time from UTC" value; at best, the "is this DST?" flag can be used to *infer* the standard time offset.
The transition in question is a transition from British Standard Time, which had a +1 hour offset from GMT, to GMT. The "before" local time type has a utoff of +1 hour, an "is this DST?" flag of 0 (because DST wasn't observed at that point), and a designation string index that points to "BST". The "after" local time type has a utoff of 0, an "is this DST?" flag of 0 (because it was in October, when DST was not in effect - DST wouldn't resume until next spring), and a designation string index that points to "GMT".
I don't know what program produced the output you're showing; how did it incorrectly infer that the offset between standard time and UTC before the transition was 0 rather than +1 hour (3600 seconds)?
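As a rough illustration of that layout, here is a Python sketch of the two parallel arrays and a lookup over them. It is a two-entry excerpt built from the values described above, not a TZif parser; `LocalTimeType` and `lookup()` are hypothetical names, the designation index is shown already resolved to its string, and treating times before the first transition as type 0 is a simplifying assumption.

```python
from bisect import bisect_right
from typing import NamedTuple


class LocalTimeType(NamedTuple):
    utoff: int    # seconds to add to UT to get local time
    isdst: bool   # the "is this DST?" flag
    desig: str    # designation string (index already resolved)


# Local time type table (excerpt, not the full Europe/London data).
types = [
    LocalTimeType(3600, False, "BST"),  # British Standard Time
    LocalTimeType(0, False, "GMT"),
]

# Two parallel arrays: transition times and the type each one switches to.
transition_times = [57722400]   # 1971-10-31 02:00:00 UT
transition_types = [1]


def lookup(unix_time: int) -> LocalTimeType:
    """Return the local time type in effect at unix_time."""
    i = bisect_right(transition_times, unix_time)
    if i == 0:
        # Before the first transition: this sketch simply assumes type 0.
        return types[0]
    return types[transition_types[i - 1]]


print(lookup(57722399))
print(lookup(57722400))
```

Nothing in either array carries a separate standard offset; a consumer that wants one has to infer it from `utoff` and `isdst`.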
That's what I mean by "adjusted" for Posix sake. It gives the proper UTC offset, yes, but not for the right reason. The underlying reason was an STDOFF shift, presumably stated in the law behind it.
I don't know what "it" is here; it's not the TZif file format, as it does not contain anything that specifies a reason for a transition; at most, it provides a value to which to set tm_isdst and a value to indicate what time zone designation strings to use. Reasons for the transitions can only be guessed at from those values, and code may well incorrectly guess the reason.
(The *correct* guess for the reasons for the transition at midnight on 1968-10-27 is "the offset between current local time and UTC changed from 0:00:00 to +1:00:00 and the designation string changed from GMT to BST", and the *correct* guess for the reasons for the transition at 2:00 UTC on 1971-10-31 is "the offset between current local time and UTC changed from +1:00:00 to 0:00:00 and the designation string changed from BST to GMT". Neither one is "the" reason; a change in the current local time, in whether tm_isdst should be set to 0 or 1, or in the designation string is sufficient to cause a transition.)
TzDb source has been careful to try to honor the laws and customs behind the local time rules, and I strongly support that policy.
What do you mean by "honor"?
Do you mean "give the correct results when, for example, localtime() is called", with the local time being the result of law and the designation string being a result of custom in some cases (in other cases, there doesn't appear to be a custom, and it's just given as an offset from UTC)?
Or do you mean "specifies the offset of local standard time from UTC in Zone lines and the offset of current local time from local standard time in Rule lines", in which case, as far as I know, that's more a case of notational convenience for people writing zic source files than a matter of honoring laws and customs; laws and customs affected that part of the syntax to the extent that the chosen syntax means you have to do less work to translate laws to zic file text.
I would presume this custom originates with the USA laws that specifically define the time zones as offsets from UTC and the optional one-hour DST shifts. These are the familiar behaviors of time zones where DST is "observed".
I'm not sure to which "custom" you're referring, but, if you mean "TZif files don't separately provide the local standard time and current local time offsets from UTC", that probably originated with "damn, the US DST rules changed *again* - thanks, Clorox! - and the way UNIXes handle converting to local time will have to be fixed, which involves changing a table in a library and recompiling the library and relinking all programs that use the library; can we come up with something better?", which was the origin of the tzdb project.
What 15 U.S. Code section 261 - Zones for standard time; interstate or foreign commerce:
    https://www.govtrack.us/congress/bills/89/s1404/text
states is that
    For the purpose of establishing the standard time of the United States, the territory of the United States shall be divided into nine zones in the manner provided in this section. Except as provided in section 260a(a) of this title, the standard time of the first zone shall be Coordinated Universal Time retarded by 4 hours; that of the second zone retarded by 5 hours; that of the third zone retarded by 6 hours; that of the fourth zone retarded by 7 hours; that of the fifth zone retarded [1] 8 hours; that of the sixth zone retarded by 9 hours; that of the seventh zone retarded by 10 hours; that of the eighth zone retarded by 11 hours; and that of the ninth zone shall be Coordinated Universal Time advanced by 10 hours.
which specifies the offsets from UTC for standard time in the time zones of the US, and what 15 U.S. Code § 260a - Advancement of time or changeover dates:
    https://www.law.cornell.edu/uscode/text/15/260a#a
states in subparagraph (a) is that
    During the period commencing at 2 o’clock antemeridian on the second Sunday of March of each year and ending at 2 o’clock antemeridian on the first Sunday of November of each year, the standard time of each zone established by sections 261 to 264 of this title, as modified by section 265 of this title, shall be advanced one hour and such time as so advanced shall for the purposes of such sections 261 to 264, as so modified, be the standard time of such zone during such period; however, (1) any State that lies entirely within one time zone may by law exempt itself from the provisions of this subsection providing for the advancement of time, but only if that law provides that the entire State (including all political subdivisions thereof) shall observe the standard time otherwise applicable during that period, and (2) any State with parts thereof in more than one time zone may by law exempt either the entire State as provided in (1) or may exempt the entire area of the State lying within any time zone.
which specifies that, unless some state says "no, thanks" to DST, the clocks get turned forward by 1 hour on the second Sunday of March at 2AM and turned back on the first Sunday of November at 2AM.
So section 261 "specifically [defines] the time zones as offsets from UTC" and section 260a(a) describes the "optional one-hour DST shifts". That's not what TZif files do; they specify, for local time zone types, the *current* offset from UTC for that local time zone type, including shifts for DST or Ramadan or any other such change.
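For what it's worth, the changeover dates in the quoted text of section 260a(a) are easy to compute. The following Python sketch does only that; `nth_sunday()` and `us_dst_period()` are hypothetical helper names, and the sketch ignores the 2 o'clock changeover time and any state exemptions.

```python
from datetime import date, timedelta


def nth_sunday(year: int, month: int, n: int) -> date:
    """Return the n-th Sunday (n >= 1) of the given month."""
    first = date(year, month, 1)
    # date.weekday(): Monday == 0 ... Sunday == 6
    days_to_sunday = (6 - first.weekday()) % 7
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))


def us_dst_period(year: int) -> tuple:
    """Start and end dates per the quoted rule: 2nd Sunday of March, 1st Sunday of November."""
    return nth_sunday(year, 3, 2), nth_sunday(year, 11, 1)


start, end = us_dst_period(2024)
print(start, end)
```

A rule like this says when the advanced time applies; it is the zone's data, not the rule text, that turns those dates into the per-type offsets a TZif file stores.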
I'm involved with several standards projects at the Society of Motion Picture and Television Engineers (SMPTE). The television and broadcast industries have used "SMPTE Timecode" (defined primarily by a standard named ST12-1) for decades, since the mid 1970s. SMPTE Timecode is used in very many ways, from labeling of video frames to synchronization and essentially glues the timekeeping of the whole industry together.
SMPTE Timecode was derived from IRIG and is quite similar to the formats used by WWV radio and others with addition of video-related metadata.
Timecode is used all over the world, where various video rates are used. For example, most of Europe uses 1/25 (25 frames per second). These systems have an exact relation to running time in seconds and are sometimes called "integral rate". But the USA, Canada, Japan and others use the 'strange' rate of 30000/1001, called a "non-integral rate". This has all sorts of algorithmic ramifications, and implementations must be careful with their math. But this is well known in the industry. Meantime there is the matter of audio synchronization of various frequencies with video. A typical audio frequency is 48000/1.
If you're interested in more detail about video and audio rates and SMPTE Timecode you might read:
Conversion of Audio Samples to Video Frames Brooks Harris, May 8, 1998 http://edlmax.com/AudioNTSC.htm
Conversion between SMPTE hh:mm:ss:ff Time Code and Frames Brooks Harris, EdlMax, LLC. Version V4 2015-04-04 http://edlmax.com/SMPTETimeCodeConversion.htm
In broadcast television it's important to have the hh:mm:ss portion of the timecode run as close as possible to local time. But SMPTE timecode and equipment that depends on it cannot tolerate a discontinuity in the hh:mm:ss:video-frame sequence. To work around this limitation the industry has typically adopted a procedure called "Daily Jam". Here, DST shifts, STDOFF shifts and leap-second shifts are deferred to some local time and the hh:mm:ss:video-frame values are reset ("jammed") to match the local wall-clock time. This is an imperfect solution which can produce a "glitch" in the hh:mm:ss:video-frame sequence, especially for systems running at "non-integer" rates. Therefore the "Daily Jam" is typically instituted at some local time least likely to disrupt normal operations. In the USA this time is typically 02:00:00 (2am). Daily Jam is imperfect but an effective work-around that has been used for decades.
Hey, I'm old enough to remember when TV stations signed off for the night. That would have worked fine back then.
In recent years SMPTE has developed several standards related to packet-based network "streaming" of video and audio. They've elected to use IEEE Precision Time Protocol as the primary synchronization reference. The standards called ST2059-1 and ST2059-2 in particular define the relation of timecode to PTP time. These also codify the use of Daily Jam procedures.
User requirements call for use of local time together with video frame and audio-sample accuracy. And it is this set of challenges that have brought me (kicking and screaming :-) ) into the wider world of timekeeping and so to Tz Database.
In 2019 I was invited to make a presentation to ION/PTTI: Accurate Local Timestamps, Brooks Harris https://www.ion.org/publications/abstract.cfm?articleID=16763
In this I pointed out that typical timestamps were insufficient to represent truly unambiguous local time. If we have two ISO 8601 timestamps with date, time and UTC offset:
2023-07-01 00:01:23 -7:00
2023-07-01 00:04:56 -7:00
We can normalize to UTC and determine that the first precedes the second; precedence is established. Great! This, of course, is the first and most important aspect of timekeeping. This was the whole point of Posix-time to begin with: maintain event precedence within the system.
Or, rather, to provide a monotonic scale for time-stamping events - the goals are 1) allow conversion to local time and 2) maintain monotonicity. (Leap seconds, and the POSIX choice to mandate 86400-second days, made monotonicity a bit tricky, but I digress....)
But this entirely misses the fact that the two timestamps may have come from two different time zones. If you add the time zone it becomes much clearer what the meaning of the two events are: 2023-07-01 00:01:23 -7:00 America/Los_Angeles 2023-07-01 00:04:56 -7:00 America/Phoenix
But this does not signal that DST was in effect in America/Los_Angeles and not observed in America/Phoenix. You could add an "isdst" flag: 2023-07-01 00:01:23 -7:00 America/Los_Angeles DST 2023-07-01 00:04:56 -7:00 America/Phoenix STD That works (more-or-less) in these two typical and familiar cases.
So presumably there's a requirement for those representations of local time other than "allow local time to be displayed" and "be convertible to UTC", as that's true of that time stamp format without either the timezone ID or a DST/STD indication. (Presumably "DST" means "not STD" rather than "daylight saving time", as 1) Morocco shifts its clocks for Ramadan but that's not DST and 2) Ireland's *summer* time is standard time.)
But when you get to other time zones that simple logic may not hold up. My favorite "kryptonite" example is Europe/Moscow:
...
Take the transition at 1991-03-31 02:00:00
    3:00 Russia MSK/MSD 1991 Mar 31 2:00s
    2:00 Russia EE%sT 1992 Jan 19 2:00s
This is an STDOFF shift with a simultaneous offsetting DST shift. As Time And Date puts it:
Mar 31 No change, Mar 31, 1991 - Daylight Saving Time Started DST started on Sunday, March 31, 1991, 2:00:00 am. However, clocks were not changed because Moscow switched time zones at the same time. https://www.timeanddate.com/time/change/russia/moscow?year=1991
This is a "no-op" transition (the YMDhms sequence is not interrupted) but several important facts changed; both STDOFF and DST shifted, and the Abbreviation changed.
The fact that the designation string (which, in this case, does happen to be an abbreviation, although that is not guaranteed) changed renders it *not* a "no-op" transition from zic's point of view.
Zic also sets the isdst flag:
670373999 1991-03-31 01:59:59 isdst 0 gmtoff 10800 stdoff 10800 MSK
670374000 1991-03-31 02:00:00 isdst 1 gmtoff 10800 stdoff 10800 EEST
...which means that the isdst flag also changed, further making it not a "no-op" transition from zic's point of view. However, it reports "EEST", not "EEDT", so *that's* probably a bug. tzname[] is set to { "MSK", "EEST" } after localtime() is called on 670374000, so the timezone code in macOS 13.6, at least, has this bug in it. It should probably have been set to { "EEST", "EEDT" }.
There is another objective in my work. In video we have always been able to represent any time-point within the 24-hour range, important for editorial adjustments and duration calculation. To do this comprehensively with local time you need to know that this day includes a transition and what and when that transition occurs.
Why? Why does the local time, plus the offset from UTC of local time at that instant, not suffice to represent any time?
On 2024-01-07 16:51, Guy Harris wrote:
https://datatracker.ietf.org/doc/html/rfc8536
describes TZif format (to use the capitalization in the RFC) up to version 3 (version 4 is not documented in the RFC, but it's a small change to leap second handling, documented in the tzfile man page).
According to it, the local time type records contain a field named utoff (presumably changed from gmtoff)
Yes, that's right.
Zic also sets the isdst flag:
670373999 1991-03-31 01:59:59 isdst 0 gmtoff 10800 stdoff 10800 MSK
670374000 1991-03-31 02:00:00 isdst 1 gmtoff 10800 stdoff 10800 EEST
...which means that the isdst flag also changed, further making it not a "no-op" transition from zic's point of view.
However, it reports "EEST", not "EEDT", so *that's* probably a bug.
The Rule line says "S" so I think "EEST" is correct there. It is part of an EET/EEST pair.
tzname[] is set to { "MSK", "EEST" } after localtime() is called on 670374000, so the timezone code in macOS 13.6, at least, has this bug in it. It should probably have been set to { "EEST", "EEDT" }. This is an area where current POSIX, though self-consistent, is a poor match for TZDB. POSIX assumes that there is just one standard time and at most one DST, so there are at most two abbreviations and they can be put into tzname[0] and tzname[1]. This assumption is incorrect for TZDB, where there can be many abbreviations for standard time (for different timestamps of course), and likewise for DST.
If memory serves, tzcode addresses this by putting the most recently-used standard-time abbreviation into tzname[0], and the most recently-used DST abbreviation into tzname[1]. That's consistent with the macOS behavior you reported. For timezones specified by POSIX TZ strings this conforms to POSIX. But for timezones like Europe/Moscow, this is a thread-unsafe mess -- which is partly why tm_zone was invented and it's also why theory.html advises against the use of tzname. I suspect that the draft next POSIX doesn't address this issue, though it should.
On Jan 7, 2024, at 5:14 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
The Rule line says "S" so I think "EEST" is correct there. It is part of an EET/EEST pair.
I.e., "S" for "summer" rather than for "standard".
If memory serves, tzcode addresses this by putting the most recently-used standard-time abbreviation into tzname[0], and the most recently-used DST abbreviation into tzname[1]. That's consistent with the macOS behavior you reported. For timezones specified by POSIX TZ strings this conforms to POSIX. But for timezones like Europe/Moscow, this is a thread-unsafe mess -- which is partly why tm_zone was invented and it's also why theory.html advises against the use of tzname.
I suspect that the draft next POSIX doesn't address this issue, though it should.
It has tm_zone, so I guess they address it in that fashion. tzname[] is still there, although they note that if another thread does tzset() or something that behaves as if tzset() were called, the behavior of accesses to tzname[] is undefined, so they address that part by saying "yup, it's thread-unsafe, Here Be Dragons". The only lack of thread safety for tm_zone with localtime_r() is "If the tm structure member tm_zone is accessed after the value of TZ is subsequently modified, the behaviour is undefined."
Brooks Harris via tz said:
For example, there is a STDOFF shift at 1971 Oct 31 2:00u from 1:00 to 0:00 (a West shift) in London:
# Zone NAME STDOFF RULES FORMAT [UNTIL]
Zone Europe/London -0:01:15 - LMT 1847 Dec 1 0:00s
    0:00 GB-Eire %s 1968 Oct 27
    1:00 - BST 1971 Oct 31 2:00u
    0:00 GB-Eire %s 1996
    0:00 EU GMT/BST
Zic and TzIf reflect this change as a shift in gmtoff, not stdoff:
57722399 1971-10-31 02:59:59 isdst 0 gmtoff 3600 stdoff 0 BST
57722400 1971-10-31 02:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
That's what I mean by "adjusted" for Posix sake. It gives the proper UTC offset, yes, but not for the right reason. The underlying reason was an STDOFF shift, presumably stated in the law behind it.
The reason in this case was that the British Standard Time Act 1968 c.45 contained a sunset (sorry) clause: 4(2) Sections 1 to 3 of the Act shall expire at two o'clock, Greenwich Mean Time, in the morning of 31st October 1971 unless made permanent under subsection (3) below; An attempt to make it permanent was defeated in the House of Commons on 1970-12-02 by 81 votes to 366. See Hansard HC Deb 02 December 1970 vol 807 cc1331-422. Therefore 4(2) came into effect and on that date the legal time changed from that specified in the British Standard Time Act (GMT+1) to that specified in the [1880 c.9 (43 & 44 Vict.).] Statutes (Definition of Time) Act 1880 (GMT) as modified by the Summer Time Acts 1922 to 1947. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On 1/7/24 18:08, Guy Harris wrote:
tzname[] is still there, although they note that if another thread does tzset() or something that behaves as if tzset() were called, the behavior of accesses to tzname[] is undefined
There's another problem with tzname that I forgot to mention. If you run localtime and it returns tm_isdst==0, then tzname[0] is well-defined but tzname[1] is not. This is true even in single-threaded apps. Similarly, tzname[0] is not well defined when localtime returns tm_isdst>0.
For example, suppose TZ="America/Los_Angeles", T is during December 1, 1945, and we run localtime(&T). tzname[0] must be "PST", but tzname[1] might plausibly be "PPT" (the most recently-observed DST), and it might plausibly be "PDT" (the nearest DST in the future). The TZif file doesn't tell you which is correct.
In this particular case the .zi file "northamerica" suggests that "PPT" is correct, but this detail is something I invented when I wrote that part of the .zi file. It doesn't have anything to do with US or California laws that I know about. This sort of invention is not something that I worried about when I invented it, because it didn't affect any info visible via the POSIX APIs, so I didn't bother to comment this sort of thing when it occurs in the data, which it does many times.
As I recall, tzcode doesn't update tzname[1] when tm_isdst==0, so tzname[1] simply has whatever value it had before localtime was called. Anyway, whatever value tzname[1] has, it's garbage if TZDB Zones are being used. It's a misfeature of the POSIX API (or any other API) to expose this information, because there is no "correct" value and nobody really needs a "correct" value anyway.
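To make the tzname point concrete, here is a minimal C sketch that asks for the abbreviation belonging to one specific timestamp instead of reading tzname[]. It uses strftime's "%Z" on the struct tm filled in by localtime_r(), which is the portable way to get at what tm_zone holds. The helper names use_tz(), abbrev_at() and abbrev_is() are made up for this illustration, and the TZ value in the usage note is the POSIX string quoted at the top of this thread, so no TZif files are needed to try it.

```c
#define _POSIX_C_SOURCE 200809L  /* for setenv, tzset, localtime_r */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Select a zone for the whole process. Like tzset() itself this is not
   thread-safe; call it before starting other threads. */
void use_tz(const char *tz)
{
    assert(tz != NULL);
    setenv("TZ", tz, 1);
    tzset();
}

/* Write the abbreviation in effect at instant t into buf and return
   tm_isdst for that instant, or -1 on failure. */
int abbrev_at(time_t t, char *buf, size_t buflen)
{
    struct tm tm;
    if (localtime_r(&t, &tm) == NULL)
        return -1;
    if (strftime(buf, buflen, "%Z", &tm) == 0)
        return -1;
    return tm.tm_isdst;
}

/* Convenience wrapper: does instant t carry the expected abbreviation? */
int abbrev_is(time_t t, const char *expected)
{
    char buf[32];
    return abbrev_at(t, buf, sizeof buf) >= 0 && strcmp(buf, expected) == 0;
}
```

With use_tz("GMT0BST,M3.5.0/1,M10.5.0"), abbrev_is(828233999, "GMT") and abbrev_is(828234000, "BST") should both hold, since 828234000 is the 1996-03-31 01:00:00 UT transition. The answer depends only on the timestamp passed in, not on whichever call happened to touch tzname[] last.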
On 1/7/2024 7:51 PM, Guy Harris wrote:
On Jan 7, 2024, at 12:01 PM, Brooks Harris<brooks@edlmax.com> wrote:
Right, but I think there should be. Posix cannot distinguish an stdoff shift independent from a gmtoff shift. So presumably:
"stdoff shift" is short for "a shift in the offset between UTC and standard time", "standard time" being what is specified as such by law;
"gmtoff shift" is short for "a shift in the offset between civil time and standard time".
Given that, not all "shifts in the offset between UTC and civil time" are "gmtoff shifts", which is a bit confusing, given that "gmtoff" sounds as if it's an offset from "GMT" or UTC.
stdoff is the "standard" offset from UTC or UT *without DST*. gmtoff is the offset from UT or UTC *with DST*. There is no "dstoff" to signal the DST value in effect, which is usually 1 hour but can be negative (Dublin), "double summertime" or possibly some other value. This is where the isdst flag is insufficient to cover those cases that are not 1 hour. The term "standard" is somewhat ambiguous in general but understood within the TzDb context as the "normal" or "base" offset from UT or UTC.
Not all "gmtoff shifts" constitute transitions to or from "daylight saving time", as Morocco, for example, has on some occasions introduced daylight saving time, but also shifts the clock during Ramadan.
Yes. Which brings up the terms "spring forward" and "fall back" which imply 1-hour shifts in the spring and fall, but this doesn't work for the four transitions in Morocco and elsewhere or for negative DST.
For example, there is a STDOFF shift at 1971 Oct 31 2:00u from 1:00 to 0:00 (a West shift) in London:
# Zone NAME STDOFF RULES FORMAT [UNTIL]
Zone Europe/London -0:01:15 - LMT 1847 Dec 1
    0:00 GB-Eire %s 1968 Oct 27
    1:00 - BST 1971 Oct 31 2:00u
    0:00 GB-Eire %s 1996
    0:00 EU GMT/BST
Zic and TzIf reflect this change as a shift in gmtoff, not stdoff:
57722399 1971-10-31 02:59:59 isdst 0 gmtoff 3600 stdoff 0 BST
57722400 1971-10-31 02:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
RFC 8536:
https://datatracker.ietf.org/doc/html/rfc8536
describes TZif format (to use the capitalization in the RFC) up to version 3 (version 4 is not documented in the RFC, but it's a small change to leap second handling, documented in the tzfile man page).
According to it, the local time type records contain a field named utoff (presumably changed from gmtoff), which contains "a four-octet signed integer specifying the number of seconds to be added to UT in order to determine local time" for that local time type, a Boolean field that indicates whether that local time type is to be considered DST or not, and a field that is an index into the table of time zone designation strings for the designation string to be used for that local time type.
Transitions are represented in the file by two parallel arrays (rather than an array of structures), one containing the UNIX time of the transition and one containing the local time type to which local time is transitioning.
The local time type includes an offset from UTC and an "is this DST?" flag. Neither of those directly provides an "offset of standard time from UTC" value; at best, the "is this DST?" flag can be used to *infer* the standard time offset.
The transition in question is a transition from British Standard Time, which had a +1 hour offset from GMT, to GMT. The "before" local time type has a utoff of +1 hour, an "is this DST?" flag of 0 (because DST wasn't observed at that point), and a designation string index that points to "BST". The "after" local time type has a utoff of 0, an "is this DST?" flag of 0 (because it was in October, when DST was not in effect - DST wouldn't resume until next spring), and a designation string index that points to "GMT".
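The record layout and the parallel arrays described above can be sketched in C as follows. This is only an illustration of the data shapes, not tzcode's reader: struct local_time_type, decode_local_time_type() and type_index_at() are names invented here, and the six-octet record (four-octet big-endian signed utoff, one octet for the DST flag, one octet for the designation index) follows the RFC 8536 description quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One local time type record: the three fields described above. */
struct local_time_type {
    int32_t utoff;      /* seconds to add to UT to get local time */
    uint8_t is_dst;     /* the "is this DST?" flag */
    uint8_t desig_idx;  /* index into the designation string table */
};

/* Decode a six-octet record: four-octet big-endian signed utoff, then
   one octet each for the DST flag and the designation index. */
struct local_time_type decode_local_time_type(const unsigned char rec[6])
{
    assert(rec != NULL);
    uint32_t u = ((uint32_t)rec[0] << 24) | ((uint32_t)rec[1] << 16)
               | ((uint32_t)rec[2] << 8)  |  (uint32_t)rec[3];
    struct local_time_type t;
    /* Two's-complement conversion without an implementation-defined cast. */
    t.utoff = (u & 0x80000000u) ? -(int32_t)(~u) - 1 : (int32_t)u;
    t.is_dst = rec[4];
    t.desig_idx = rec[5];
    return t;
}

/* Parallel arrays: times[i] is a transition instant (sorted ascending)
   and types[i] is the local time type that takes effect at it.
   Returns the type index in effect at t, or -1 if t precedes times[0]. */
int type_index_at(const int64_t *times, const uint8_t *types, size_t n, int64_t t)
{
    size_t lo = 0, hi = n;  /* binary search over the window [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (times[mid] <= t)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo == 0 ? -1 : (int)types[lo - 1];
}
```

Fed the Europe/London transitions quoted at the top of the thread (782874000, 796179600, 814323600, 828234000 with types 2, 1, 2, 1), type_index_at() returns 1 (BST) for 800000000. Note that nothing in the record says why a transition happened, and a real reader also has to consult the POSIX string in the footer for times after the last stored transition, which this sketch ignores.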
I don't know what program produced the output you're showing; how did it incorrectly infer that the offset between standard time and UTC before the transition was 0 rather than +1 hour (3600 seconds)?
This is output from my modified version of zdump reading the TZif output of my modified version of zic. These include a modified version of struct tm which I've renamed "struct tztm", to which I've added long int tm_stdoff, which is populated by the values of STDOFF from the TzDb source files. This tm_stdoff value is also added to the TZif file data.
struct tztm {
    int tm_sec;             /* Seconds. [0-60] (1 leap second) */
    int tm_min;             /* Minutes. [0-59] */
    int tm_hour;            /* Hours. [0-23] */
    int tm_mday;            /* Day. [1-31] */
    int tm_mon;             /* Month. [0-11] */
    int tm_year;            /* Year - 1900. */
    int tm_wday;            /* Day of week. [0-6] */
    int tm_yday;            /* Days in year.[0-365] */
    int tm_isdst;           /* DST. [-1/0/1]*/
    long int tm_gmtoff;     /* Seconds east of UTC. */
    const char *tm_zone;    /* Timezone abbreviation. */
    long int tm_stdoff;     // CCT addition <<<<<<<<<<<<<<<<<<<
};
So, in this example stdoff is 0:00 on both sides of the transition.
    0:00 GB-Eire %s 1996
    0:00 EU GMT/BST
That's what I mean by "adjusted" for Posix sake. It gives the proper UTC offset, yes, but not for the right reason. The underlying reason was an STDOFF shift, presumably stated in the law behind it.
I don't know what "it" is here; it's not the TZif file format, as it does not contain anything that specifies a reason for a transition; at most, it provides a value to which to set tm_isdst and a value to indicate what time zone designation strings to use. Reasons for the transitions can only be guessed at from those values, and code may well incorrectly guess the reason. (The *correct* guess for the reasons for the transition at midnight on 1968-10-27 is "the offset between current local time and UTC changed from 0:00:00 to +1:00:00 and the designation string changed from GMT to BST", and the *correct* guess for the reasons for the transition at 2:00 UTC on 1971-10-31 is "the offset between current local time and UTC changed from +1:00:00 to 0:00:00 and the designation string changed from BST to GMT". Neither one is "the" reason; a change in the current local time, in whether tm_isdst should be set to 0 or 1, or in the designation string is sufficient to cause a transition.)
TzDb source has been careful to try to honor the laws and customs behind the local time rules, and I strongly support that policy.
What do you mean by "honor"? Do you mean "give the correct results when, for example, localtime() is called", with the local time being the result of law and the designation string being a result of custom in some cases (in other cases, there doesn't appear to be a custom, and it's just given as an offset from UTC)? Or do you mean "specifies the offset of local standard time from UTC in Zone lines and the offset of current local time from local standard time in Rule lines", in which case, as far as I know, that's more a case of notational convenience for people writing zic source files than a matter of honoring laws and customs; laws and customs affected that part of the syntax to the extent that the chosen syntax means you have to do less work to translate laws to zic file text.
Right. As Paul said in another part of this thread, he "invents" things where necessary or convenient. Fine. But in general these seem to try to follow official laws if and where they exist and are clear.
I would presume this custom originates with the USA laws that specifically define the time zones as offsets from UTC and the optional one-hour DST shifts. These are the familiar behaviors of time zones where DST is "observed".
I'm not sure to which "custom" you're referring, but, if you mean "TZif files don't separately provide the local standard time and current local time offsets from UTC", that probably originated with "damn, the US DST rules changed *again* - thanks, Clorox! - and the way UNIXes handle converting to local time will have to be fixed, which involves changing a table in a library and recompiling the library and relinking all programs that use the library; can we come up with something better?", which was the origin of the tzdb project.
Right. Perhaps "custom" is a misleading word. I meant that Posix seems to support USA rules ok but is not complete for many other time zones that act differently than the rules we US-based people are most familiar with.
What 15 U.S. Code section 261 - Zones for standard time; interstate or foreign commerce:
https://www.govtrack.us/congress/bills/89/s1404/text
states is that
For the purpose of establishing the standard time of the United States, the territory of the United States shall be divided into nine zones in the manner provided in this section. Except as provided in section 260a(a) of this title, the standard time of the first zone shall be Coordinated Universal Time retarded by 4 hours; that of the second zone retarded by 5 hours; that of the third zone retarded by 6 hours; that of the fourth zone retarded by 7 hours; that of the fifth zone retarded [1] 8 hours; that of the sixth zone retarded by 9 hours; that of the seventh zone retarded by 10 hours; that of the eighth zone retarded by 11 hours; and that of the ninth zone shall be Coordinated Universal Time advanced by 10 hours.
which specifies the offsets from UTC for standard time in the time zones of the US, and what 15 U.S. Code § 260a - Advancement of time or changeover dates:
https://www.law.cornell.edu/uscode/text/15/260a#a
states in subparagraph (a) is that
During the period commencing at 2 o’clock antemeridian on the second Sunday of March of each year and ending at 2 o’clock antemeridian on the first Sunday of November of each year, the standard time of each zone established by sections 261 to 264 of this title, as modified by section 265 of this title, shall be advanced one hour and such time as so advanced shall for the purposes of such sections 261 to 264, as so modified, be the standard time of such zone during such period; however, (1) any State that lies entirely within one time zone may by law exempt itself from the provisions of this subsection providing for the advancement of time, but only if that law provides that the entire State (including all political subdivisions thereof) shall observe the standard time otherwise applicable during that period, and (2) any State with parts thereof in more than one time zone may by law exempt either the entire State as provided in (1) or may exempt the entire area of the State lying within any time zone.
which specifies that, unless some state says "no, thanks" to DST, the clocks get turned forward by 1 hour on the second Sunday of March at 2AM and turned back on the first Sunday of November at 2AM.
So section 261 "specifically [defines] the time zones as offsets from UTC" and section 260a(a) describes the "optional one-hour DST shifts".
That's not what TZif files do; they specify, for local time zone types, the *current* offset from UTC for that local time zone type, including shifts for DST or Ramadan or any other such change.
Right. gmtoff is the sum of STDOFF and the current DST value. Thus STDOFF and DST are confounded.
I would point out the ambiguity of the term "standard". Those laws say it is "standard time" with or without DST. " ...the standard time of each zone ... as modified by ... shall be advanced one hour and such time as so advanced shall ... be the standard time of such zone during such period" This is consistent with how ISO 8601 defines "standard". But the common way of referring to local time, at least in the USA, is to say, for example, "Eastern Standard Time" or "Eastern Daylight Time". This is more descriptive and familiar to many of us but it's actually in contradiction to the USA law as written.
I'm involved with several standards projects at the Society of Motion Picture and Television Engineers (SMPTE). The television and broadcast industries have used "SMPTE Timecode" (defined primarily by a standard named ST12-1) for decades, since the mid 1970s. SMPTE Timecode is used in very many ways, from labeling of video frames to synchronization and essentially glues the timekeeping of the whole industry together.
SMPTE Timecode was derived from IRIG and is quite similar to the formats used by WWV radio and others with addition of video-related metadata.
Timecode is used all over the world, where various video rates are used. For example, most of Europe uses 1/25 (25 frames per second). These systems have an exact relation to running time in seconds and are sometimes called "integral rate". But the USA, Canada, Japan and others use the 'strange' rate of 30000/1001, called a "non-integral rate". This has all sorts of algorithmic ramifications, and implementations must be careful with their math. But this is well known in the industry. Meantime there is the matter of audio synchronization of various frequencies with video. A typical audio frequency is 48000/1.
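As an illustration of why the math needs care, here is a small C sketch that keeps the rates as exact rationals and never touches floating point. The function name samples_at_frame() is invented for this example and is not taken from any SMPTE document. At 48000 Hz audio and 30000/1001 video, one frame spans 1601.6 samples, so frame boundaries and sample boundaries only line up every 5 frames (8008 samples).

```c
#include <assert.h>
#include <stdint.h>

/* Number of whole audio samples elapsed at the start of video frame
   `frame`, for a video rate of rate_num/rate_den frames per second:
   floor(frame * audio_hz * rate_den / rate_num), in integers only.
   The intermediate product fits in int64_t for any realistic frame
   count (roughly up to 10^11 frames at 48 kHz and 30000/1001). */
int64_t samples_at_frame(int64_t frame, int64_t rate_num, int64_t rate_den,
                         int64_t audio_hz)
{
    assert(frame >= 0 && rate_num > 0 && rate_den > 0 && audio_hz > 0);
    return frame * audio_hz * rate_den / rate_num;
}
```

For example, samples_at_frame(1, 25, 1, 48000) is exactly 1920, while samples_at_frame(1, 30000, 1001, 48000) truncates to 1601 and samples_at_frame(5, 30000, 1001, 48000) is exactly 8008. Computing the per-frame value once and multiplying would accumulate the 0.6-sample error, which is the kind of mistake the rational form avoids.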
If you're interested in more detail about video and audio rates and SMPTE Timecode you might read:
Conversion of Audio Samples to Video Frames Brooks Harris, May 8, 1998 http://edlmax.com/AudioNTSC.htm
Conversion between SMPTE hh:mm:ss:ff Time Code and Frames Brooks Harris, EdlMax, LLC. Version V4 2015-04-04 http://edlmax.com/SMPTETimeCodeConversion.htm
In broadcast television it's important to have the hh:mm:ss portion of the timecode run as close as possible to local time. But SMPTE timecode and equipment that depends on it cannot tolerate a discontinuity in the hh:mm:ss:video-frame sequence. To work around this limitation the industry has typically adopted a procedure called "Daily Jam". Here, DST shifts, STDOFF shifts and leap-second shifts are deferred to some local time and the hh:mm:ss:video-frame values are reset ("jammed") to match the local wall-clock time. This is an imperfect solution which can produce a "glitch" in the hh:mm:ss:video-frame sequence, especially for systems running at "non-integer" rates. Therefore the "Daily Jam" is typically instituted at some local time least likely to disrupt normal operations. In the USA this time is typically 02:00:00 (2am). Daily Jam is imperfect but an effective work-around that has been used for decades. Hey, I'm old enough to remember when TV stations signed off for the night. That would have worked fine back then.
It was used back then, and is still used today. In those old days they actually "jammed", interrupted, the vertical sync frequency itself together with the timecode values, and this put a vertical roll into the broadcast video. Today the frequency is maintained constant and only the timecode values are "jammed" (reset).
In recent years SMPTE has developed several standards related to packet-based network "streaming" of video and audio. They've elected to use IEEE Precision Time Protocol as the primary synchronization reference. The standards called ST2059-1 and ST2059-2 in particular define the relation of timecode to PTP time. These also codify the use of Daily Jam procedures.
User requirements call for use of local time together with video frame and audio-sample accuracy. And it is this set of challenges that have brought me (kicking and screaming :-) ) into the wider world of timekeeping and so to Tz Database.
In 2019 I was invited to make a presentation to ION/PTTI: Accurate Local Timestamps, Brooks Harris https://www.ion.org/publications/abstract.cfm?articleID=16763
In this I pointed out that typical timestamps were insufficient to represent truly unambiguous local time. If we have two ISO 8601 timestamps with date, time and UTC offset:
2023-07-01 00:01:23 -7:00
2023-07-01 00:04:56 -7:00
We can normalize to UTC and determine that the first precedes the second; precedence is established. Great! This, of course, is the first and most important aspect of timekeeping. This was the whole point of Posix-time to begin with: maintain event precedence within the system.
Or, rather, to provide a monotonic scale for time-stamping events - the goals are 1) allow conversion to local time and 2) maintain monotonicity.
(Leap seconds, and the POSIX choice to mandate 86400-second days, made monotonicity a bit tricky, but I digress....)
Quite. The leap-second is evil.
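The "normalize to UTC" step for the two example timestamps can be sketched in C without any time zone library, since the UTC offset is carried in the timestamp itself. This is a sketch under stated assumptions: local_to_unix() and days_from_civil() are names chosen here, the calendar is the proleptic Gregorian one, and leap seconds are ignored in the POSIX manner (every day is 86400 seconds).

```c
#include <assert.h>
#include <stdint.h>

/* Days since 1970-01-01 for a proleptic Gregorian date
   (m in 1..12, d in 1..31), using the usual era/day-of-era method. */
int64_t days_from_civil(int64_t y, int m, int d)
{
    assert(m >= 1 && m <= 12 && d >= 1 && d <= 31);
    y -= (m <= 2);                        /* treat Jan/Feb as end of previous year */
    int64_t era = (y >= 0 ? y : y - 399) / 400;
    int64_t yoe = y - era * 400;          /* year of era, 0..399 */
    int64_t doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;  /* March-based */
    int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Local date and time plus its UTC offset (seconds east of UTC)
   to POSIX time: subtract the offset to get back to UTC. */
int64_t local_to_unix(int64_t y, int mo, int d, int h, int mi, int s,
                      int32_t utoff)
{
    return days_from_civil(y, mo, d) * 86400 + h * 3600 + mi * 60 + s - utoff;
}
```

The two example timestamps come out 213 seconds apart, so precedence is established. The same function reproduces values that appear elsewhere in this thread: 1971-10-31 02:00:00 at offset 0 gives 57722400, and 1991-03-31 02:00:00 at offset 10800 gives 670374000. What it cannot tell you is anything beyond the instant, such as which zone or rule produced the -7:00.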
But this entirely misses the fact that the two timestamps may have come from two different time zones. If you add the time zone it becomes much clearer what the meaning of the two events are: 2023-07-01 00:01:23 -7:00 America/Los_Angeles 2023-07-01 00:04:56 -7:00 America/Phoenix
But this does not signal that DST was in effect in America/Los_Angeles and not observed in America/Phoenix. You could add an "isdst" flag:
2023-07-01 00:01:23 -7:00 America/Los_Angeles DST
2023-07-01 00:04:56 -7:00 America/Phoenix STD
That works (more-or-less) in these two typical and familiar cases.
So presumably there's a requirement for those representations of local time other than "allow local time to be displayed" and "be convertible to UTC", as that's true of that time stamp format without either the timezone ID or a DST/STD indication.
Yes. Imagine for example a news organization collecting news feeds from cameras in Los Angeles, Washington DC, Johannesburg, Taipei, or anywhere else. You really need to know the local time when and where an event happened, not just its UTC time.
(Presumably "DST" means "not STD" rather than "daylight saving time", as 1) Morocco shifts its clocks for Ramadan but that's not DST and 2) Ireland's *summer* time is standard time.)
Right. To my point that "standard" is somewhat ambiguous and not all time zones behave in the ways familiar to many of us, like in the USA. I've learned to be very careful not to impose these familiar biases on my implementations of local time. It just doesn't work the same way in many places.
But when you get to other time zones that simple logic may not hold up. My favorite "kryptonite" example is Europe/Moscow: ...
Take the transition at 1991-03-31 02:00:00
    3:00 Russia MSK/MSD 1991 Mar 31 2:00s
    2:00 Russia EE%sT 1992 Jan 19 2:00s
This is an STDOFF shift with a simultaneous offsetting DST shift. As Time And Date puts it: Mar 31 No change, Mar 31, 1991 - Daylight Saving Time Started DST started on Sunday, March 31, 1991, 2:00:00 am. However, clocks were not changed because Moscow switched time zones at the same time. https://www.timeanddate.com/time/change/russia/moscow?year=1991
This is a "no-op" transition (the YMDhms sequence is not interrupted) but several important facts changed; both STDOFF and DST shifted, and the Abbreviation changed. The fact that the designation string (which, in this case, does happen to be an abbreviation, although that is not guaranteed) changed renders it *not* a "no-op" transition from zic's point of view.
Zic also sets the isdst flag:
670373999 1991-03-31 01:59:59 isdst 0 gmtoff 10800 stdoff 10800 MSK
670374000 1991-03-31 02:00:00 isdst 1 gmtoff 10800 stdoff 10800 EEST
...which means that the isdst flag also changed, further making it not a "no-op" transition from zic's point of view.
However, it reports "EEST", not "EEDT", so *that's* probably a bug. tzname[] is set to { "MSK", "EEST" } after localtime() is called on 670374000, so the timezone code in macOS 13.6, at least, has this bug in it. It should probably have been set to { "EEST", "EEDT" }.
Maybe. But that's what comes out today, and that seems sufficient for the Posix purposes. It might not be "correct", I guess?
There is another objective in my work. In video we have always been able to represent any time-point within the 24-hour range, important for editorial adjustments and duration calculation. To do this comprehensively with local time you need to know that this day includes a transition and what and when that transition occurs.
Why? Why does the local time, plus the offset from UTC of local time at that instant, not suffice to represent any time?
With SMPTE timecode one can represent any time-point within the 24-hour range. SMPTE timecode includes flags for the "count mode" (non-drop-frame and drop-frame). There is no discontinuity in the hh:mm:ss:frames counting sequence. So with a single SMPTE timecode an application can "trim" forward or back and calculate durations from point to point along the 24-hour timeline. There is often no relation to actual local time-of-day, just a count from zero to 24 hours. This is very typical in many scenarios, especially post-production (editing). There is no need for access to any other metadata. I extend this idea to local time time-of-day representation. Some days have transitions, so my timestamp design carries sufficient information to signal "this day is a transition day", "it is a transition of x value (often DST shifts, sometimes STDOFF shifts)", and "the transition occurs at this time-of-day". Thus, an application that does not have access to any metadata (TZif or TzDb source file) can accurately represent any point during that day. Example - A "fall back" DST transition in America/New_York: D2022-11-06T00:00:00U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX UTC 01667707227 D2022-11-06T01:59:59U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX UTC 01667714426 D2022-11-06T01:00:00U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX UTC 01667714427 D2022-11-06T23:59:59U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX UTC 01667797226 ^^^^^^^^^^ DST transition metadata So with a single timestamp on a given day all points of that entire day can be represented without access to any additional metadata. Timekeeping is fun! Thanks, -Brooks
On 1/7/2024 5:48 PM, Paul Eggert wrote:
On 2024-01-07 12:01, Brooks Harris wrote:
Zic and TzIf reflect this change as a shift in gmtoff, not stdoff:
57722399 1971-10-31 02:59:59 isdst 0 gmtoff 3600 stdoff 0 BST
57722400 1971-10-31 02:00:00 isdst 0 gmtoff 0 stdoff 0 GMT
That's what I mean by "adjusted" for Posix sake. It gives the proper UTC offset, yes, but not for the right reason. The underlying reason was an STDOFF shift, presumably stated in the law behind it.
I'm still not following this, unfortunately. I'll try again.
The use cases you gave mostly involved comparing local-time timestamps when their UT offsets are known, and obviously gmtoff suffices for that. You mentioned one use case for which the POSIX API is ill-designed (namely, "find the next gmtoff transition"), but gmtoff suffices for that too: we don't need stdoff there either.
There are a couple of intermingled topics in the thread.
------------------- A) Regarding if zone eras with UNTIL times with only a year designation should be included in zic output. I think they should be. Take for example America/New_York: # Zone NAME STDOFF RULES FORMAT [UNTIL] Zone America/New_York -4:56:02 - LMT 1883 Nov 18 12:03:58 -5:00 US E%sT 1920 -5:00 NYC E%sT 1942 -5:00 US E%sT 1946 -5:00 NYC E%sT 1967 -5:00 US E%sT There are four transitions at the 1st-of-the-year, each changing the RULES field. Why is this important? In my implementations I find the date of a transition, the transition type (STDOFF, dstoff, or leap-seconds), the value of the transition shift, and the time-of-day of the shift. Thus an application that has no access to metadata (TZIf, Posix-time or anything else) can confidently represent any point within that day from metadata contained in the timestamps themselves. I gave this example in a response to Guy's similar question in this thread: Example - A "fall back" DST transition in America/New_York: D2022-11-06T00:00:00U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX UTC 01667707227 D2022-11-06T01:59:59U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX UTC 01667714426 D2022-11-06T01:00:00U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX UTC 01667714427 D2022-11-06T23:59:59U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX UTC 01667797226 ^^^^ ^^^^^^^^^^ "gmtoff" DST transition metadata To accomplish this the application must search the transitions backwards (previous transition) to discover STDOFF shifts and forward (next transition) to discover if and when a DST or leap-second transition is to occur. If these 1st-of-the-year transitions are not included in the transition list the searches will return the wrong transition and fail. I've modified my implementation of outzone() and **optimize of writezone() to retain these 1st-of-the-year transitions. 
This results in their being included in the TZIf files but this does not alter the normal behavior of localtime() because these are essentially "no-op" transitions.
-------------------
B) Regarding stdoff, gmtoff, and isdst
gmtoff is the sum of STDOFF and any DST values in effect. This gives the appropriate and important offset to normalize to UTC but it confounds the two factors.
I think the STDOFF value is critical. It essentially defines the (approximate) longitude of the idealized time zone independent of any DST shifts applied. And, regardless of longitude, it also defines the *base* offset from UTC in the time domain whether or not any DST shifts are in effect.
As noted elsewhere, the isdst flag is insufficient in cases of "double summertime". (And nothing says you couldn't have "triple summertime" or "quadruple summertime"). You'd really like a "dstoff" variable to carry this value, including negative DST (Dublin and such).
Then there are leap-seconds. This is a third offset factor indicating the leap-seconds value. Let's call it "lsoff".
So you'd have stdoff + dstoff + lsoff = gmtoff.
So in my view it's not only a matter of arriving at gmtoff, which is critical of course, but that the other factors, stdoff, dstoff, and lsoff, be included in the timestamp. To do this would significantly alter TZif, ideally would also be incorporated in Posix-time struct tm, and should probably also be added to ISO 8601 representations. I'm not sure if all this is feasible but I think if these topics are not addressed a truly unambiguous timestamp is not possible because the formats are incomplete.
By the way, the term "standard" is somewhat ambiguous where in some places it may refer to "wall clock time" whether DST is in effect or not, as noted in another thread on the list. It's ok within the context of TzDb because we are familiar with its meaning. But I think there really should be a better term, such as "normal time", or "primary offset".
Also, DST shifts might not actually refer to "daylight saving" shifts but might be used for some other reason. So, more generally, these might be seen as "secondary offsets" (secondary to the primary stdoff value).
Also, it's often not the case that the underlying reason was stated in the law behind it. The laws often don't specify this information, or the laws simply aren't available (we're relying on secondary sources), and in these cases I just made up internal details like stdoff to make the POSIX-visible timestamp info correct.
APIs should not be exposing juryrigged internal details to the user, as many of these details are simply my invention and do not reflect known legislation.
OK. From an interoperability point of view I see TzDb source data to be *the law*, whether backed by official documentation or invented for convenience or otherwise. Without your "inventions" a lot of this might not work. Thanks! Thanks, -Brooks
On 1/9/2024 4:45 PM, Doug Ewell wrote:
Brooks Harris wrote:
The leap-second is evil. Presumably in the same way that leap year is “evil,” and for the same reason.
-- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
Ah, no.
Ok, my comment was a bit smart aleck. The leap-second is "evil" for several reasons. Leap-seconds are introduced at irregular times to maintain approximation of observed solar time. They are not algorithmically predictable, requiring lookup of the metadata provided by IERS. Posix-time and many systems that have fixed 86400-second-days have no way to properly represent leap-seconds, positive or negative. You can't fit 86401 pegs in 86400 holes, nor fill all 86400 holes with 86399 pegs. This is the root of the great leap-second controversy, the incommensurability between UTC with leap-seconds and systems with fixed 86400-second-days, which has been raging since at least 1999. The leap year is algorithmically predictable and doesn't cause interoperability problems. -Brooks
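The "86401 pegs in 86400 holes" point can be seen with plain arithmetic. The sketch below assumes Python's standard `calendar.timegm`, which applies the fixed 86400-seconds-per-day formula without range-checking the seconds field; the helper name `posix_seconds` is made up for illustration.

```python
import calendar

def posix_seconds(y, mo, d, h, mi, s):
    """Seconds since the Epoch under the fixed 86400-second-day rule."""
    return calendar.timegm((y, mo, d, h, mi, s, 0, 0, 0))

# 2016-12-31 23:59:60 UTC was a real (positive) leap second.
leap = posix_seconds(2016, 12, 31, 23, 59, 60)
midnight = posix_seconds(2017, 1, 1, 0, 0, 0)

# Both labels collapse onto one count: the leap second has no slot of its own.
print(leap == midnight)  # True

# Leap years, by contrast, follow a fixed rule and need no lookup table.
print([y for y in range(1896, 1908) if calendar.isleap(y)])  # [1896, 1904]
```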
On 1/9/2024 5:00 PM, Derick Rethans wrote:
On 9 January 2024 21:45:03 GMT, Doug Ewell via tz <tz@iana.org> wrote:
Brooks Harris wrote:
The leap-second is evil. Presumably in the same way that leap year is “evil,” and for the same reason.
I'd add DST to the evil pile too 😂
cheers Derick
Hear! Hear! -Brooks
Derick Rethans wrote:
The leap-second is evil.
Presumably in the same way that leap year is “evil,” and for the same reason.
I'd add DST to the evil pile too 😂
I agree with this one, because — unlike the others — DST exists not to bring human-calculated time in line with the behavior of the earth, but rather to perpetuate an illusion of “longer days” when a more honest solution would be to adjust the civil time of events that depend on sunlight. -- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
Brooks Harris wrote:
Leap-seconds are introduced at irregular times to maintain approximation of observed solar time. They are not algorithmically predictable, requiring lookup of the metadata provided by IERS.
That’s because the earth doesn’t jiggle and joggle uniformly. If it had instead been decided to introduce leap seconds on a regular, predictable 18- or 24-month cycle, based on the data we had in the early 1970s, we would be faced with inaccuracy in the opposite direction.
Posix-time and many systems that have fixed 86400-second-days have no way to properly represent leap-seconds, positive or negative. You can't fit 86401 pegs in 86400 holes, nor fill all 86400 holes with 86399 pegs.
You also can’t fit an earth year into a perpetual, integral number of 7-day weeks, or into 12 or 13 months of uniform length. The fallacy is in trying to fit those pegs into those holes at all.
The leap year is algorithmically predictable and doesn't cause interoperability problems.
For the next several thousand years, anyway, which is not our problem or that of our children, any more than the inaccuracy of the Julian calendar was many hundreds of years removed from being Caesar’s problem. Apologies for snarkiness. -- Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org
On Jan 9, 2024, at 1:10 PM, Brooks Harris <brooks@edlmax.com> wrote:
On 1/7/2024 7:51 PM, Guy Harris wrote:
On Jan 7, 2024, at 12:01 PM, Brooks Harris <brooks@edlmax.com> wrote:
Right, but I think there should be. Posix cannot distinguish an stdoff shift independent from a gmtoff shift.
So presumably:
"stdoff shift" is short for "a shift in the offset between UTC and standard time", "standard time" being what is specified as such by law;
"gmtoff shift" is short for "a shift in the offset between civil time and standard time".
Given that, not all shifts in "the offset between UTC and civil time" are "gmtoff shifts", which is a bit confusing, given that "gmtoff" sounds as if it's an offset from "GMT" or UTC.
stdoff is the "standard" offset from UTC of UT *without DST*. gmtoff is the offset from UT or UTC *with DST*. There is no "dstoff" to signal the DST value in effect, which is usually 1-hour but can be negative (Dublin), "double summertime" or possibly some other value. This is where the isdst flag is insufficient to cover those not 1-hour cases.
In TZif files, there is neither stdoff nor dstoff, there's just gmtoff. In zic *source* files, there's stdoff (in the STDOFF column) and dstoff (in the SAVE column). There is no gmtoff; that's calculated by zic and put into the TZif file.
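As a rough illustration of zic deriving a single offset from the two source columns, the sketch below adds a STDOFF value to a SAVE value. The function names are hypothetical, the Moscow figures are read off the zone lines and zdump output quoted earlier in the thread, and the Dublin figures are an assumed example of negative SAVE; none of this is zic's actual code.

```python
def hms_to_seconds(text):
    """Convert a zic-style offset such as '-5:00' or '1:00' to seconds."""
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("-").split(":")]
    parts += [0] * (3 - len(parts))  # pad missing minutes/seconds
    h, m, s = parts
    return sign * (h * 3600 + m * 60 + s)

def gmtoff(stdoff, save):
    """Total UT offset: the STDOFF column plus the SAVE in effect."""
    return hms_to_seconds(stdoff) + hms_to_seconds(save)

# Europe/Moscow around 1991-03-31: both components change, the sum does not.
print(gmtoff("3:00", "0"))      # MSK  -> 10800
print(gmtoff("2:00", "1:00"))   # EEST -> 10800

# Negative SAVE (assumed Dublin-style winter entry): standard time is summer time.
print(gmtoff("1:00", "-1:00"))  # 0
```

Once only the sum is stored, the two Moscow rows are indistinguishable by offset alone, which is the loss of information being debated.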
The term "standard" is somewhat ambiguous in general but understood within the TzDb context as the "normal" or "base" offset from UT or UTC.
In particular, it's what the relevant laws deem to be "standard time"; that's why standard time is in effect in summer in Ireland.
Not all "gmtoff shifts" constitute transitions to or from "daylight saving time", as Morocco, for example, has on some occasions introduced daylight saving time, but also shifts the clock during Ramadan.
Yes. Which brings up the terms "spring forward" and "fall back" which imply 1-hour shifts in the spring and fall, but this doesn't work for the four transitions in Morocco and elsewhere or for negative DST.
So those terms should be avoided. (They don't imply 1-hour shifts; "spring forward an hour" and "fall back an hour" would.)
I don't know what program produced the output you're showing; how did it incorrectly infer that the offset between standard time and UTC before the transition was 0 rather than +1 hour (3600 seconds)?
This is output from my modified version of zdump reading the TZIf output of my modified version of zic. These include a modified version of struct tm which I've renamed "struct tztm", to which I've added long int tm_stdoff which is populated by the values of STDOFF from the TzDb source files. This tm_stdoff value is also added to the TZIf file data.
I.e., it's a modified version of zdump reading the *modified-TZif* output of your modified version of zic. That should have been stated up front.
Right. Perhaps "custom" is a misleading word. I meant that Posix seems to support USA rules ok
*Current* USA rules; for earlier USA rules, see below....
The POSIX API has no problems with converting "seconds since the Epoch" to year/month/day/hour/minute/second in local time, even for local time in, for example, Morocco; the only issue with Ireland is "what does tm_isdst mean" - does it mean "time is shifted from standard time" or does it mean "time is set ahead for the summer"?
The problems that the POSIX API has are with the time zone designation strings and time offsets. Those are *not* stored in the POSIX struct tm; they are, instead, stored in global variables. The tzdb code makes an attempt to handle that, by setting the global variables when a time is converted. It's also very much not thread-safe.
The current draft of the next revision of POSIX has tm_zone and tm_gmtoff, which addresses those problems (except that it says that, if a program calls localtime() or even localtime_r() in one thread, and another thread changes the value of the TZ environment variable between the time when localtime()/localtime_r() filled in the structure and when tm_zone is used in that structure, the result is undefined).
but is not complete for many other time zones that act differently than the rules we US-based people are most familiar with.
Which means "if the standard offset from UTC changes or the time zone designation string changes", *both* of which have happened in the US in the past. For example, the "US" rules in the northamerica file are
# Rule  NAME  FROM  TO    -  IN   ON       AT      SAVE  LETTER/S
Rule    US    1918  1919  -  Mar  lastSun  2:00    1:00  D
Rule    US    1918  1919  -  Oct  lastSun  2:00    0     S
Rule    US    1942  only  -  Feb  9        2:00    1:00  W # War
Rule    US    1945  only  -  Aug  14       23:00u  1:00  P # Peace
Rule    US    1945  only  -  Sep  30       2:00    0     S
Rule    US    1967  2006  -  Oct  lastSun  2:00    0     S
Rule    US    1967  1973  -  Apr  lastSun  2:00    1:00  D
Rule    US    1974  only  -  Jan  6        2:00    1:00  D
Rule    US    1975  only  -  Feb  lastSun  2:00    1:00  D
Rule    US    1976  1986  -  Apr  lastSun  2:00    1:00  D
Rule    US    1987  2006  -  Apr  Sun>=1   2:00    1:00  D
Rule    US    2007  max   -  Mar  Sun>=8   2:00    1:00  D
Rule    US    2007  max   -  Nov  Sun>=1   2:00    0     S
Note that the "LETTER/S" changed to something other than "S" or "D", so that the designation strings for US time zones changed from the usual EST/EDT, CST/CDT, ... PST/PDT pattern.
And as for America/Chicago, well:
# Zone  NAME             STDOFF    RULES    FORMAT  [UNTIL]
Zone    America/Chicago  -5:50:36  -        LMT     1883 Nov 18 18:00u
                         -6:00     US       C%sT    1920
                         -6:00     Chicago  C%sT    1936 Mar 1 2:00
                         -5:00     -        EST     1936 Nov 15 2:00
                         -6:00     Chicago  C%sT    1942
                         -6:00     US       C%sT    1946
                         -6:00     Chicago  C%sT    1967
                         -6:00     US       C%sT
We'll ignore the LMT entry, which merely serves to indicate when standard time began (the STDOFF for that entry is the offset from GMT at some particular place in Chicago).
That particular IANA timezone apparently shifted to *Eastern Standard* time - no DST - between 1936-03-01 at 2:00 local time and 1936-11-15 at 2:00 local time.
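To make the LETTER/S remark concrete, here is a minimal sketch of how a FORMAT such as E%sT or C%sT combines with the letters in the US rules above. The function name `designation` is hypothetical, and only the `%s` form is handled; other FORMAT forms that zic accepts are ignored here.

```python
def designation(fmt, letter):
    """Expand a zic FORMAT field; '%s' is replaced by the rule's LETTER/S."""
    return fmt.replace("%s", letter)

# S and D give the familiar pattern; W (War) and P (Peace) do not.
for letter in ("S", "D", "W", "P"):
    print(designation("E%sT", letter), designation("C%sT", letter))
# EST CST
# EDT CDT
# EWT CWT
# EPT CPT
```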
(Leap seconds, and the POSIX choice to mandate 86400-second days, made monotonicity a bit tricky, but I digress....) Quite. The leap-second is evil.
Note, BTW, that zic, the TZIf file format, and the TZDB code all handle leap seconds. Zic can be told to put leap second transitions, from the leapseconds file, into a TZif file and, if the TZDB code is pointed at a TZif file with the leap second information, it will treat a time_t value as being seconds that have elapsed since the Epoch rather than as "seconds since the Epoch", i.e. on a transition from 23:59:59 to 23:59:60, the time_t value increases by one. It will convert a time_t corresponding to a 23:59:60 time to have a tm_sec value of 60.
Yes. Imagine for example a news organization collecting news feeds from cameras in Los Angeles, Washington DC, Johannesburg, Taipei, or anywhere else. You really need to know the local time when and where an event happened, not just its UTC time.
"2023-07-01 00:01:23 -7:00" is sufficient to allow that - that's 2023-07-01 07:01:23 UTC. A tzid is not necessary for that, nor is representing anything other than the offset from UTC in effect at that point in time.
(Presumably "DST" means "not STD" rather than "daylight saving time", as 1) Morocco shifts its clocks for Ramadan but that's not DST and 2) Ireland's *summer* time is standard time.) Right. To my point that "standard" is somewhat ambiguous and not all time zones behave in the ways familiar to many of us, like in the USA. I've learned to be very careful not to impose these familiar biases on my implementations of local time. It just doesn't work the same way in many places.
So why is it necessary to indicate why the time was shifted, by some amount (not necessarily one hour) from "standard" time at that point in time? Is that due to timestamps possibly *not* matching local time due to a local time shift in the middle of a video segment, so that an SMPTE timestamp, at least, can't show the results of that shift?
However, it reports "EEST", not "EEDT", so *that's* probably a bug. tzname[] is set to { "MSK", "EEST" } after localtime() is called on 670374000, so the timezone code in macOS 13.6, at least, has this bug in it. It should probably have been set to { "EEST", "EEDT" }.
Maybe. But that's what comes out today, and that seems sufficient for the Posix purposes. It might not be "correct", I guess?
It turns out it wasn't a bug - "EEST" is "Eastern European *Summer* Time".
Why? Why does the local time, plus the offset from UTC of local time at that instant, not suffice to represent any time?
With SMPTE timecode one can represent any time-point within the 24-hour range. SMPTE timecode includes flags for the "count mode" (non-drop-frame and drop-frame). There is no discontinuity in the hh:mm:ss:frames counting sequence.
Unless you're using TAI for time codes, there will eventually be discontinuities, for leap seconds if nothing else, and, if the timecode is local time, for any shifts in local time. Presumably that's where the jamming comes in, so that...
So with a single SMPTE timecode an application can "trim" forward or back and calculate durations from point to point along the 24-hour timeline. There is often no relation to actual local time-of-day, just a count from zero to 24 hours. This is very typical in many scenarios, especially post-production (editing). There is no need for access to any other metadata.
...the time codes have no discontinuities within a given sequence of video. So is the issue that, as a result of discontinuities within local time but not within SMPTE timecodes, sometimes the timecode is out of sync with local time, and there needs to be some information to indicate the delta? If so, why is it not sufficient to provide that delta, rather than, for example, a "STD" vs. "DST" indication?
I extend this idea to local time time-of-day representation. Some days have transitions, so my timestamp design carries sufficient information to signal "this day is a transition day", "it is a transition of x value (often DST shifts, sometimes STDOFF shifts)", and "the transition occurs at this time-of-day". Thus, an application that does not have access to any metadata (TZif or TzDb source file) can accurately represent any point during that day.
Unless something changes twice within the day, which is neither excluded by the zic source file format nor by the TZif compiled format.
Example - A "fall back" DST transition in America/New_York:
D2022-11-06T00:00:00U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX UTC 01667707227
D2022-11-06T01:59:59U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX UTC 01667714426
D2022-11-06T01:00:00U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX UTC 01667714427
D2022-11-06T23:59:59U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX UTC 01667797226
^^^^^^^^^^ DST transition metadata
Is there any need to have anything other than the current dstoff there? A transition that affects the "standard time offset" would be indicated by the "offset from UTC" changing and the "additional offset to the standard time offset" not changing. (What does "UTC NNNNNNNNNNN" mean here? Those aren't POSIX-time values, as 01667707227 is 1977-11-28 02:27:35 UTC.)
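One reading that reproduces the 1977 date, offered as a guess rather than something stated in the thread, is that the leading zero caused the value to be taken as octal, as a C integer literal would be. The sketch below compares the two readings; treating the "L27" field as a 27-second leap-second count is likewise an assumption.

```python
from datetime import datetime, timezone

def utc(seconds):
    """Render a seconds-since-the-Epoch count as a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

text = "01667707227"
as_octal = int(text, 8)     # how a C literal with a leading 0 is read
as_decimal = int(text, 10)  # how the timestamp author may have meant it

print(utc(as_octal))         # 1977-11-28 02:27:35+00:00
print(utc(as_decimal))       # 2022-11-06 04:00:27+00:00
# Assumption: subtracting 27 (the "L27" field) lands on local midnight at -04:00.
print(utc(as_decimal - 27))  # 2022-11-06 04:00:00+00:00
```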
So with a single timestamp on a given day all points of that entire day can be represented without access to any additional metadata.
I.e., it's putting information from the America/New_York TZif file's transition entry that covers that particular time stamp, so if all that's to be done is to do conversions on *that particular timestamp*, the software doesn't need to look at that file. By the way, it would probably be best not to convert the tzid to all lower case, as not all file systems are case-insensitive. (They're also not all case-sensitive, so having two different tzids that differ only in case would be a mistake.)
On Jan 9, 2024, at 1:11 PM, Brooks Harris <brooks@edlmax.com> wrote:
A) Regarding if zone eras with UNTIL times with only a year designation should be included in zic output.
Where was it ever an issue about the handling of UNTIL times with only a year designation? UNTIL times with only a year designation have, I think, an implied date of the first day of January in that year; I'm not sure what the implied time is. As such, as far as I know, they are treated no differently from any other entry.
I think they should be. Take for example America/New_York:
# Zone  NAME              STDOFF    RULES  FORMAT  [UNTIL]
Zone    America/New_York  -4:56:02  -      LMT     1883 Nov 18 12:03:58
                          -5:00     US     E%sT    1920
                          -5:00     NYC    E%sT    1942
                          -5:00     US     E%sT    1946
                          -5:00     NYC    E%sT    1967
                          -5:00     US     E%sT
There are four transitions at the 1st-of-the-year, each changing the RULES field.
In 1919, the US repealed DST. In 1920, it appears that the America/New_York IANA timezone decided to stick with DST; that's why the "-5:00 US E%sT 1920" entry ends in 1920 - that timezone switched to the DST rules of New York City rather than the rules of the US as a whole. Apparently, in 1920, "New York and dozens of other cities adopted their own metropolitan daylight saving policies": https://www.smithsonianmag.com/history/100-years-later-madness-daylight-savi...
However, the NYC rules don't have a transition until March, so that, from the point of view of converting times (or adjusting clocks), no change occurred until the last Sunday in March 1920.
I infer from the JSTOR paper linked to by the Smithsonian article that New York State mandated advanced time in 1918, which was somewhat of a no-op at that time, as federally-mandated advanced time also went into effect in 1918; however, New York State may not have repealed it in 1919 when federally-mandated advanced time was repealed. 1920 is only relevant because it's the first year in which New York State and "default US" rules (or lack of same) differed.
This all means that it's not clear what the transition date should be for that change in America/New_York. As Paul noted, the same results would occur for all time conversions for many different UNTIL values - "until 2AM local time on the last Sunday of March 1920" would also work.
Why is this important? In my implementations I find the date of a transition, the transition type (STDOFF, dstoff, or leap-seconds), the value of the transition shift, and the time-of-day of the shift.
There was no shift on 1920-01-01. There was, in America/New_York, a transition on 1920-03-28 at 2:00 AM local time, advancing the clock by one hour. There wasn't even a shift in STDOFF on 1920-01-01; there was only a shift in rules, which would have occurred on that date if that was the effective date of the 1919 repeal of DST. The only shift was in the rules, and the rules only differed, in their effects, starting 1920-03-28 at 2:00 AM local time. You may be thinking of "no-op" transitions, which change neither the offset from UTC, nor the designation strings, nor the "is_dst" flag value. An entry with an UNTIL date that's just a year is neither guaranteed to correspond to a no-op transition nor to correspond to a transition that changes one or more of those values.
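The definition of a "no-op" transition given above can be written as a small check. This is a sketch only: the `LocalTimeType` record and `is_noop` function are hypothetical names, not part of zic or the TZif format, and the sample values are taken from the Moscow and New York cases discussed in this thread.

```python
from typing import NamedTuple

class LocalTimeType(NamedTuple):
    utoff: int    # offset from UTC in seconds
    isdst: bool   # the is_dst flag
    abbr: str     # designation string

def is_noop(before, after):
    """True when offset, is_dst flag and designation are all unchanged."""
    return before == after

# America/New_York at 1920-01-01: only the RULES column changes.
est = LocalTimeType(-18000, False, "EST")
print(is_noop(est, est))  # True

# Europe/Moscow at 1991-03-31: same offset, but flag and designation change.
msk = LocalTimeType(10800, False, "MSK")
eest = LocalTimeType(10800, True, "EEST")
print(is_noop(msk, eest))  # False
```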
B) Regarding stdoff, gmtoff, and Isdst
gmtoff is the sum of STDOFF and any DST values in effect. This gives the appropriate and important offset to normalize to UTC but it confounds the two factors.
I think the STDOFF value is critical. It essentially defines the (approximate) longitude of the idealized time zone independent of any DST shifts applied.
The longitude is *quite* approximate, given that 1) time zones can be rather wide and 2) time zone boundaries do not necessarily correspond to meridians. In what scenarios would that be critical?
And, regardless of longitude, it also defines the *base* offset from UTC in the time domain whether or not any DST shifts are in effect.
And in what scenarios would that be critical?
As noted elsewhere, the isdst flag is insufficient in cases of "double summertime". (And nothing says you couldn't have "triple summertime " or "quadruple summertime"). You'd really like a "dstoff" variable to carry this value, including negative DST (Dublin and such).
Is this required to support sending SMPTE local-time timestamps, which apparently do *not* shift at the instant of a time shift, so as not to introduce discontinuities in the middle of a video segment? I.e., SMPTE local-time timestamps are in a time scale that is *neither* local time *nor* UTC, so you need two offsets in order to convert the SMPTE timestamps to local time or to UTC.
Then there is leap-seconds. This is a third offset factor indicating the leap-seconds value. Lets call it "lsoff".
So you'd have stdoff + dstoff + lsoff = gmtoff.
Which is the case in TZif files produced when zic is handed a -L flag that specifies a list of past and expected future leap seconds.
OK. From an interoperability point of view I see TzDb source data to be *the law*, whether backed by official documentation or invented for convenience or otherwise.
"Otherwise" here meaning "well, that's what clocks in those regions are doing". The goal is to reflect how clocks behave in timezones; this follows "the law" because, in most cases, clocks follow the law. However, if there's a region where the law says one thing and a significant number of people do something else, we may well have two different - and partially or completely overlapping - IANA timezones.
Guy Harris via tz wrote in <500A8B25-FD29-4448-8F84-C388EC817A7C@sonic.net>:
|On Jan 9, 2024, at 2:12 PM, Brooks Harris <brooks@edlmax.com> wrote:
|> the incommensurability between UTC with leap-seconds
|
|TO what incommensurability are you referring? UTC's happy to go from 23:59:59 to 23:59:60.
Not only UTC. I have seen many shining eyes watching clocks doing that step, children, announcement in widely watched TV of public law, with all the thousands of years of culture and science that lead to this magic moment of understanding in the endless void of physical universe. It is a shame it is thrown away, and that mail of the female commissioner who had this "fruitful" meeting -- i refer to the official thing that was posted on the leapseconds list -- screams for repeating "O tempora, O mores!" Thank you.
--End of <500A8B25-FD29-4448-8F84-C388EC817A7C@sonic.net>
--steffen
|
|Der Kragenbaer, The moon bear,
|der holt sich munter he cheerfully and one by one
|einen nach dem anderen runter wa.ks himself off
|(By Robert Gernhardt)
On 1/9/2024 5:57 PM, Guy Harris wrote:
On Jan 9, 2024, at 1:10 PM, Brooks Harris<brooks@edlmax.com> wrote:
On 1/7/2024 7:51 PM, Guy Harris wrote:
On Jan 7, 2024, at 12:01 PM, Brooks Harris<brooks@edlmax.com> wrote:
Right, but I think there should be. Posix cannot distinguish an stdoff shift independent from a gmtoff shift.
So presumably:
"stdoff shift" is short for "a shift in the offset between UTC and standard time", "standard time" being what is specified as such by law;
"gmtoff shift" is short for "a shift in the offset between civil time and standard time".
Given that, not all shifts in "the offset between UTC and civil time" are "gmtoff shifts", which is a bit confusing, given that "gmtoff" sounds as if it's an offset from "GMT" or UTC.
stdoff is the "standard" offset from UTC of UT *without DST*. gmtoff is the offset from UT or UTC *with DST*. There is no "dstoff" to signal the DST value in effect, which is usually 1-hour but can be negative (Dublin), "double summertime" or possibly some other value. This is where the isdst flag is insufficient to cover those not 1-hour cases.
In TZif files, there is neither stdoff nor dstoff, there's just gmtoff.
Right, but I added stdoff. I can do this "blue sky" because I have no installed base.
In zic *source* files, there's stdoff (in the STDOFF column) and dstoff (in the SAVE column). There is no gmtoff; that's calculated by zic and put into the TZif file.
Yes. By RULES, etc.
The term "standard" is somewhat ambiguous in general but understood within the TzDb context as the "normal" or "base" offset from UT or UTC.
In particular, it's what the relevant laws deem to be "standard time"; that's why standard time is in effect in summer in Ireland.
Right. Gotta love those negative DST shifts, right?
Not all "gmtoff shifts" constitute transitions to or from "daylight saving time", as Morocco, for example, has on some occasions introduced daylight saving time, but also shifts the clock during Ramadan.
Yes. Which brings up the terms "spring forward" and "fall back" which imply 1-hour shifts in the spring and fall, but this doesn't work for the four transitions in Morocco and elsewhere or for negative DST.
So those terms should be avoided. (They don't imply 1-hour shifts; "spring forward an hour" and "fall back an hour" would.)
Right. But those terms are used in many places, so caution.
I don't know what program produced the output you're showing; how did it incorrectly infer that the offset between standard time and UTC before the transition was 0 rather than +1 hour (3600 seconds)?
This is output from my modified version of zdump reading the TZif output of my modified version of zic. These include a modified version of struct tm which I've renamed "struct tztm", to which I've added long int tm_stdoff which is populated by the values of STDOFF from the TzDb source files. This tm_stdoff value is also added to the TZif file data.

I.e., it's a modified version of zdump reading the *modified-TZif* output of your modified version of zic.
That should have been stated up front.
Yes indeed, sorry. Focused on the discussion I'd forgotten to mention how I'd modified zic and struct tm.
Right. Perhaps "custom" is a misleading word. I meant that Posix seems to support USA rules ok

*Current* USA rules; for earlier USA rules, see below....
The POSIX API has no problems with converting "seconds since the Epoch" to year/month/day/hour/minute/second in local time, even for local time in, for example, Morocco; the only issue with Ireland is "what does tm_isdst mean" - does it mean "time is shifted from standard time" or does it mean "time is set ahead for the summer"?
The problems that the POSIX API has are with the time zone designation strings and time offsets. Those are *not* stored in the POSIX struct tm; they are, instead, stored in global variables. The tzdb code makes an attempt to handle that, by setting the global variables when a time is converted.

Yes. That's what I meant to point out earlier. TzDb must "trick" Posix into giving the best YMDhms representation and gmtoff values.

It's also very much not thread-safe.

I've not addressed that in my current implementation. I anticipate it's a challenge for Posix or any other implementation. (I've done a lot of thread-safe, multi-thread, and RPC things in my video/audio work over the years and so avoided the complexity in my current work. The timekeeping is complicated enough.)

The current draft of the next revision of POSIX has tm_zone and tm_gmtoff, which addresses those problems (except that it says that, if a program calls localtime() or even localtime_r() in one thread, and another thread changes the value of the TZ environment variable between the time when localtime()/localtime_r() filled in the structure and when tm_zone is used in that structure, the result is undefined).

I'd hope they look into resolving that sort of issue.
but is not complete for many other time zones that act differently than the rules we US-based people are most familiar with.

Which means "if the standard offset from UTC changes or the time zone designation string changes", *both* of which have happened in the US in the past. For example, the "US" rules in the northamerica file are

# Rule  NAME  FROM  TO    -  IN   ON       AT      SAVE  LETTER/S
Rule    US    1918  1919  -  Mar  lastSun  2:00    1:00  D
Rule    US    1918  1919  -  Oct  lastSun  2:00    0     S
Rule    US    1942  only  -  Feb  9        2:00    1:00  W # War
Rule    US    1945  only  -  Aug  14       23:00u  1:00  P # Peace
Rule    US    1945  only  -  Sep  30       2:00    0     S
Rule    US    1967  2006  -  Oct  lastSun  2:00    0     S
Rule    US    1967  1973  -  Apr  lastSun  2:00    1:00  D
Rule    US    1974  only  -  Jan  6        2:00    1:00  D
Rule    US    1975  only  -  Feb  lastSun  2:00    1:00  D
Rule    US    1976  1986  -  Apr  lastSun  2:00    1:00  D
Rule    US    1987  2006  -  Apr  Sun>=1   2:00    1:00  D
Rule    US    2007  max   -  Mar  Sun>=8   2:00    1:00  D
Rule    US    2007  max   -  Nov  Sun>=1   2:00    0     S
Note that the "LETTER/S" changed to something other than "S" or "D", so that the designation strings for US time zones changed from the usual EST/EDT, CST/CDT, ... PST/PDT pattern.
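The ON column entries in the US rules above, such as lastSun and Sun>=8, name a day that has to be resolved separately for each year. A rough sketch of that resolution follows; the helper names are made up for illustration, this is not zic's code, and it ignores cases zic has to handle where a day like "Sun>=29" spills into the next month.

```python
import calendar
from datetime import date, timedelta

SUNDAY = calendar.SUNDAY  # 6 in Python's Monday=0 numbering


def last_weekday(year: int, month: int, weekday: int) -> date:
    """Resolve an ON value like 'lastSun' for one year and month."""
    last_day = calendar.monthrange(year, month)[1]
    d = date(year, month, last_day)
    # Step back from the end of the month to the wanted weekday.
    return d - timedelta(days=(d.weekday() - weekday) % 7)


def weekday_on_or_after(year: int, month: int, day: int, weekday: int) -> date:
    """Resolve an ON value like 'Sun>=8' for one year and month."""
    d = date(year, month, day)
    # Step forward from the given day to the wanted weekday.
    return d + timedelta(days=(weekday - d.weekday()) % 7)


# "Rule US 1967 2006 - Oct lastSun" evaluated for 2006
print(last_weekday(2006, 10, SUNDAY))
# "Rule US 2007 max - Mar Sun>=8" and "... Nov Sun>=1" evaluated for 2023
print(weekday_on_or_after(2023, 3, 8, SUNDAY))
print(weekday_on_or_after(2023, 11, 1, SUNDAY))
```

The AT column (2:00 local time here) would then be applied to the resolved date to get the transition instant.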
And as for America/Chicago, well:
# Zone  NAME             STDOFF    RULES    FORMAT  [UNTIL]
Zone    America/Chicago  -5:50:36  -        LMT     1883 Nov 18 18:00u
                         -6:00     US       C%sT    1920
                         -6:00     Chicago  C%sT    1936 Mar 1 2:00
                         -5:00     -        EST     1936 Nov 15 2:00
                         -6:00     Chicago  C%sT    1942
                         -6:00     US       C%sT    1946
                         -6:00     Chicago  C%sT    1967
                         -6:00     US       C%sT

Right.

We'll ignore the LMT entry, which merely serves to indicate when standard time began (the STDOFF for that entry is the offset from GMT at some particular place in Chicago).

I don't ignore the first entry. But it's a special case, the very beginning of each time zone's data (entry zero) and also its 'identity' ("America/Chicago").

The UNTIL values are very odd, having come from Shanks.

Good thing with respect to the old copyright issue, thanks Paul.

That particular IANA timezone apparently shifted to *Eastern Standard* time - no DST - between 1936-03-01 at 2:00 local time and 1936-11-15 at 2:00 local time.
(Leap seconds, and the POSIX choice to mandate 86400-second days, made monotonicity a bit tricky, but I digress....)

Quite. The leap-second is evil.

Note, BTW, that zic, the TZif file format, and the TZDB code all handle leap seconds. Zic can be told to put leap second transitions, from the leapseconds file, into a TZif file and, if the TZDB code is pointed at a TZif file with the leap second information, it will treat a time_t value as being seconds that have elapsed since the Epoch rather than as "seconds since the Epoch", i.e. on a transition from 23:59:59 to 23:59:60, the time_t value increases by one. It will convert a time_t corresponding to a 23:59:60 time to have a tm_sec value of 60.

I have not experimented with "right". I am imposing leap-seconds in the YMDhms representation after the fact of TzDb in normal configuration. I must also support "rolling leap-seconds" where TzDb explicitly side-steps that option. This is a topic for another thread.

Yes. Imagine for example a news organization collecting news feeds from cameras in Los Angeles, Washington DC, Johannesburg, Taipei, or anywhere else. You really need to know the local time when and where an event happened, not just its UTC time.

"2023-07-01 00:01:23 -7:00" is sufficient to allow that - that's 2023-07-01 07:01:23 UTC. A tzid is not necessary for that, nor is representing anything other than the offset from UTC in effect at that point in time.

I see the tzid as critical for human understanding. With YMDhms and gmtoff, like 2023-07-01 00:01:23 -7:00, all you've really done is encode UTC with an offset. Yes, presumably the YMDhms values were the local time in that time zone, but what time zone did it come from? You really need the time zone identity.

(Presumably "DST" means "not STD" rather than "daylight saving time", as 1) Morocco shifts its clocks for Ramadan but that's not DST and 2) Ireland's *summer* time is standard time.)

Right.
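The offset arithmetic quoted in the exchange above ("2023-07-01 00:01:23 -7:00" being 2023-07-01 07:01:23 UTC) can be checked with a few lines. This is only an illustration using Python's standard datetime module; it also shows what the other side of the exchange is pointing at, namely that the parsed value carries a fixed offset and nothing that identifies a tzid.

```python
from datetime import datetime, timedelta, timezone

# Local time plus the offset from UTC in effect at that instant.
local = datetime.fromisoformat("2023-07-01T00:01:23-07:00")

# The offset alone is enough to recover the UTC instant.
utc = local.astimezone(timezone.utc)
print(utc.isoformat())

# What survives is only a fixed offset, not a zone identity such as
# America/Los_Angeles or America/Phoenix.
print(local.utcoffset())
```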
To my point that "standard" is somewhat ambiguous and not all time zones behave in the ways familiar to many of us, like in the USA. I've learned to be very careful not to impose these familiar biases on my implementations of local time. It just doesn't work the same way in many places.

So why is it necessary to indicate why the time was shifted, by some amount (not necessarily one hour) from "standard" time at that point in time? Is that due to timestamps possibly *not* matching local time due to a local time shift in the middle of a video segment, so that an SMPTE timestamp, at least, can't show the results of that shift?
Yes, partly. A discontinuity in SMPTE timecode can potentially send some parts of some systems to *black*. Not acceptable. It depends on the way each application in the environment has implemented the timecode and possibly local time. You need to avoid that sort of failure even if the timecode hh:mm:ss is not exactly consistent with true local time-of-day (wall clock). More generally, in synchronization with PTP, the "PTP profile" must signal (announce) upcoming shifts ("jumps") (DST, leap-second) to the receiver. This is done with a special SMPTE PTP Profile that also includes video-related metadata such as frame rate and "Daily Jam" information. A timestamp that can fully represent those time points and "jump" announcements must include *all* the metadata. That's part of my objectives.
However, it reports "EEST", not "EEDT", so *that's* probably a bug. tzname[] is set to { "MSK", "EEST" } after localtime() is called on 670374000, so the timezone code in macOS 13.6, at least, has this bug in it. It should probably have been set to { "EEST", "EEDT" }. Maybe. But that's what comes out today, and that seems sufficient for the Posix purposes. It might not be "correct", I guess? It turns out it wasn't a bug - "EEST" is "Eastern European *Summer* Time".
Why? Why does the local time, plus the offset from UTC of local time at that instant, not suffice to represent any time? With SMPTE timecode one can represent any time-point within the 24-hour range. SMPTE timecode includes flags for the "count mode" (non-drop-frame and drop-frame). There is no discontinuity in the hh:mm:ss:frames counting sequence. Unless you're using TAI for time codes, there will eventually be discontinuities, for leap seconds if nothing else, and, if the timecode is local time, for any shifts in local time.
Presumably that's where the jamming comes in, so that...
So with a single SMPTE timecode an application can "trim" forward or back and calculate durations from point to point along the 24-hour timeline. There is often no relation to actual local time-of-day, just a count from zero to 24 hours. This is very typical in many scenarios, especially post-production (editing). There is no need for access to any other metadata.

...the time codes have no discontinuities within a given sequence of video.
So is the issue that, as a result of discontinuities within local time but not within SMPTE timecodes, sometimes the timecode is out of sync with local time,

Yes.

and there needs to be some information to indicate the delta?

There should be, but currently there is not in the SMPTE formats. Addressing that is part of the challenge.

If so, why is it not sufficient to provide that delta, rather than, for example, a "STD" vs. "DST" indication?
I extend this idea to local time time-of-day representation. Some days have transitions, so my timestamp design carries sufficient information to signal "this day is a transition day", "it is a transition of x value (often DST shifts, sometimes STDOFF shifts)", and "the transition occurs at this time-of-day". Thus, an application that does not have access to any metadata (TZif or TzDb source file) can accurately represent any point during that day.

Unless something changes twice within the day, which is neither excluded by the zic source file format nor by the TZif compiled format.

That's a question I have. As far as I can tell zic creates only one transition per day. There are examples in the source files where the STDOFF changes at YMD 00:00:00 and there is also a DST shift at YMD 02:00:00. That's two transitions in that day. But zic outputs a single transition at YMD 00:00:00. While the formats could presumably handle two transitions per day it appears zic combines them.
What is the TzDb policy here? Paul?
Example - A "fall back" DST transition in America/New_York:
D2022-11-06T00:00:00U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX UTC 01667707227
D2022-11-06T01:59:59U-04Zamerica/new_yorkAedtV2021aL27S01t-01a02cMuX UTC 01667714426
D2022-11-06T01:00:00U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX UTC 01667714427
D2022-11-06T23:59:59U-05Zamerica/new_yorkAestV2021aL27S00t-01a02cMuX UTC 01667797226
^^^^^^^^^^ DST transition metadata

Is there any need to have anything other than the current dstoff there? A transition that affects the "standard time offset" would be indicated by the "offset from UTC" changing and the "additional offset to the standard time offset" not changing.

There are examples where both STDOFF and DST shift. Both changes are indicated in my formats. This is Paris with an STDOFF 1-hour shift East and simultaneous minus 2-hour DST shift:
D1945-09-15T23:59:59U+02Zeurope/parisAwemtV2021aL00S02cMuX UTC -0766634401
D1945-09-16T00:00:00U+02e01+03Zeurope/parisAwemtV2021aL00S02t-02a03cMuX UTC -0766634400
D1945-09-16T02:59:59U+02e01+03Zeurope/parisAwemtV2021aL00S02t-02a03cMuX UTC -0766623601
current offset ^^^^ current DST ^^^ change East by 01 at 03 ^^^^^^ change DST -02 at 03 ^^^^^^^
D1945-09-16T02:00:00U+01e01+03Zeurope/parisAcetV2021aL00S00t-02a03cMuX UTC -0766623600
offset updated ^^^^ DST updated ^^^
D1945-09-16T23:59:59U+01e01+03Zeurope/parisAcetV2021aL00S00t-02a03cMuX UTC -0766544401
D1945-09-17T00:00:00U+01Zeurope/parisAcetV2021aL00MuX UTC -0766544400
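As a cross-check of the Paris listing above, the two UTC counts on either side of the transition can be converted to local time with the offsets shown in the timestamps. This sketch assumes the counts can be read as plain seconds relative to 1970-01-01 00:00:00 UTC (the leap-second field in the listing is L00), and it reads the listing as STDOFF 0 with 2 hours of DST before the transition and STDOFF +1 with no DST after; `local_time` is a helper invented for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def local_time(count: int, stdoff_hours: int, save_hours: int) -> datetime:
    """Convert a seconds count to local time using gmtoff = stdoff + save."""
    gmtoff = timezone(timedelta(hours=stdoff_hours + save_hours))
    return (EPOCH + timedelta(seconds=count)).astimezone(gmtoff)


# Last second before the transition: STDOFF 0, DST +2 (U+02, S02).
before = local_time(-766623601, stdoff_hours=0, save_hours=2)
# First second after it: STDOFF +1, DST 0 (U+01, S00).
after = local_time(-766623600, stdoff_hours=1, save_hours=0)

print(before.isoformat())
print(after.isoformat())

# Net change in the offset from UTC: +1 hour of STDOFF, -2 hours of DST.
net = after.utcoffset() - before.utcoffset()
print(net == timedelta(hours=-1))
```

One elapsed second separates the two instants, while the wall clock moves from 02:59:59 back to 02:00:00, matching the listing.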
(What does "UTC NNNNNNNNNNN" mean here? The seconds count since 1970 on the Etc/UTC time zone with leap-seconds. The UTC NNNNNNNNNNN is not part of the timestamp itself, just output to the listing from a call to an internal Get1970SecsUTC().
Those aren't POSIX-time values, as 01667707227 is 1977-11-28 02:27:35 UTC.)
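One reading that reproduces the 1977 date in the parenthetical above is to treat the leading-zero value 01667707227 as an octal literal, as C would. That reading is an assumption made here, not something stated in the thread. Read as decimal, and with the 27 seconds from the L27 field of the listing subtracted (also an assumption about what that field means), the same digits land on midnight at -04:00 on 2022-11-06.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_from_count(seconds: int) -> datetime:
    """Interpret a count as seconds since 1970-01-01 00:00:00 UTC."""
    return EPOCH + timedelta(seconds=seconds)


digits = "01667707227"

# Assumed reading 1: a C-style octal literal because of the leading zero.
as_octal = int(digits, 8)
print(as_octal, utc_from_count(as_octal).isoformat())

# Assumed reading 2: zero-padded decimal that includes 27 leap seconds.
as_decimal = int(digits, 10)
without_leaps = utc_from_count(as_decimal - 27)
print(as_decimal, without_leaps.isoformat())

# The same instant expressed at the -04:00 offset shown in the listing.
print(without_leaps.astimezone(timezone(timedelta(hours=-4))).isoformat())
```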
So with a single timestamp on a given day all points of that entire day can be represented without access to any additional metadata.

I.e., it's putting information from the America/New_York TZif file's transition entry that covers that particular time stamp, so if all that's to be done is to do conversions on *that particular timestamp*, the software doesn't need to look at that file.
By the way, it would probably be best not to convert the tzid to all lower case, as not all file systems are case-insensitive. (They're also not all case-sensitive, so having two different tzids that differ only in case would be a mistake.)

Yeah, I know. I need to address that. The format as written relies on the idea that main field delimiters are upper-case, so sub-fields must be lower case. Initially I believed I could do this, but later versions of TzDb and Posix abbreviations have disqualified that assumption. I need to add some sort of escape character to indicate upper vs. lower case.
I appreciate your expert eye on what I've said here. Your comments are helpful. Thanks, -Brooks
On 2024-01-13 08:58, Brooks Harris wrote:
That's a question I have. As far as I can tell zic creates only one transition per day. There are examples in the source files where the STDOFF changes at YMD 00:00:00 and there is also a DST shift at YMD 02:00:00. That's two transitions in that day. But zic outputs a single transition at YMD 00:00:00. While the formats could presumably handle two transitions per day it appears zic combines them.
What is the TzDb policy here? Paul?
zic coalesces two transitions into one if they occur at the same time. But it shouldn't coalesce a transition with another one that occurs two hours later. If it does that, it's a bug in zic (and probably also a bug in the data).
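The rule stated above (merge only transitions that occur at the same instant) can be modelled in a few lines. This is a toy model for illustration, not zic's implementation; the `Transition` tuple and `coalesce` function are hypothetical names, and the deltas are expressed in seconds.

```python
from typing import List, NamedTuple


class Transition(NamedTuple):
    at: int            # instant of the change, in seconds (hypothetical field)
    stdoff_delta: int  # change in the standard offset
    save_delta: int    # change in the DST amount


def coalesce(transitions: List[Transition]) -> List[Transition]:
    """Merge transitions that share an instant; leave all others alone."""
    merged: List[Transition] = []
    for t in sorted(transitions, key=lambda t: t.at):
        if merged and merged[-1].at == t.at:
            prev = merged[-1]
            merged[-1] = Transition(t.at,
                                    prev.stdoff_delta + t.stdoff_delta,
                                    prev.save_delta + t.save_delta)
        else:
            merged.append(t)
    return merged


# Same instant: a +1h STDOFF change and a -2h DST change become one entry.
same_instant = coalesce([Transition(1000, 3600, 0), Transition(1000, 0, -7200)])
print(same_instant)

# Two hours apart: both entries must survive.
two_hours_apart = coalesce([Transition(0, 3600, 0), Transition(7200, 0, 3600)])
print(two_hours_apart)
```

In this model a STDOFF change at 00:00:00 followed by a DST shift at 02:00:00 stays as two transitions, which is the behaviour the reply says zic should have.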
participants (16)
-
Arthur David Olson -
Bradley White -
Brian Inglis -
brian.inglis@systematicsw.ab.ca -
Brooks Harris -
Clive D.W. Feather -
Derick Rethans -
Derick Rethans -
Doug Ewell -
Florian Weimer -
Guy Harris -
Matthew Donadio -
Michael H Deckers -
Paul Eggert -
Steffen Nurpmeso -
Stephen Colebourne