Transitions after those included in zoneinfo binaries?

I recently noticed a discrepancy between the python-dateutil and pytz libraries when calculating offsets for dates after the last included transition in the zonefile binaries. What it boils down to is that when it runs out of transitions, python-dateutil selects the "standard" offset and pytz assumes that after the last transition the value "holds". In the southern hemisphere, this produces two different behaviors.
Obviously there's no "right" answer here, since these are predictions about what time zones will be applied 20 years in the future (and even if the rules stayed the same, the "end date" is entirely artificial). Still, I wonder if it might be worth a bit of discussion and possibly a note in the Theory file? Barring a representation of the data that exposes the rules directly (in which case no such artificial limitation would be necessary), I think either fallback is defensible.
Best,
Paul
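To make the difference concrete, here is a rough model of the two fallbacks described above. It is only a sketch: the transition table is made up (it is not data from any real zone), the function names are hypothetical, and neither function is taken from python-dateutil or pytz.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Made-up table for a southern-hemisphere zone: (UTC instant, UTC offset, is_dst).
# The last stored entry switches *into* DST, which is what exposes the difference.
TRANSITIONS = [
    (datetime(2036, 10, 4, 16), timedelta(hours=11), True),   # DST starts
    (datetime(2037, 4, 4, 16), timedelta(hours=10), False),   # DST ends
    (datetime(2037, 10, 3, 16), timedelta(hours=11), True),   # DST starts (last entry)
]


def offset_hold_last(when_utc):
    """'Hold' fallback: the offset set by the most recent transition stays in force."""
    idx = bisect_right([t[0] for t in TRANSITIONS], when_utc) - 1
    # Simplification: instants before the first entry reuse the first entry's offset.
    return TRANSITIONS[max(idx, 0)][1]


def offset_standard_after_end(when_utc):
    """'Standard' fallback: past the table, use the most recent non-DST offset."""
    if when_utc >= TRANSITIONS[-1][0]:
        return next(off for _, off, is_dst in reversed(TRANSITIONS) if not is_dst)
    return offset_hold_last(when_utc)


january_2040 = datetime(2040, 1, 15)
print(offset_hold_last(january_2040))           # 11:00:00
print(offset_standard_after_end(january_2040))  # 10:00:00
```

Inside the table both functions agree; they only diverge once the stored transitions run out, and only because the final entry happens to be a switch into DST.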

On Wed, Dec 13, 2017 at 3:26 PM, Paul G <paul@ganssle.io> wrote:
I recently noticed a discrepancy between the python-dateutil and pytz libraries when calculating offsets for dates after the last included transition in the zonefile binaries. What it boils down to is that when it runs out of transitions, python-dateutil selects the "standard" offset and pytz assumes that after the last transition the value "holds". In the southern hemisphere, this produces two different behaviors.
Obviously there's no "right" answer here, since these are predictions about what time zones will be applied 20 years in the future (and even if the rules stayed the same, the "end date" is entirely artificial). Still, I wonder if it might be worth a bit of discussion and possibly a note in the Theory file? Barring a representation of the data that exposes the rules directly (in which case no such artificial limitation would be necessary), ...
Does not the "POSIX-TZ-environment-variable-style string for use in handling instants after the last transition time stored in the file" specify exactly what to do?
I think either fallback is defensible.
Best,
Paul

Paul G wrote:
What it boils down to is that when it runs out of transitions, python-dateutil selects the "standard" offset and pytz assumes that after the last transition the value "holds".
Neither of these is correct. The correct answer is that the extended-POSIX-TZ string at the end of the file specifies the behaviour following the last explicit transition. If there is no such string, the behaviour is not specified, but it is implied that it's trickier than POSIX can represent, and it is unwise to assume anything.
In version 1 of the tzfile format, which didn't have the option of any POSIX-ish-TZ string, the type of local time specified by the last explicit transition applies from then until the end of time, namely 2038-01-19T03:14:08Z.
-zefram
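As an illustration of where that string lives, the sketch below pulls the trailing TZ string out of the raw bytes of a version 2 or later file. It assumes the published tzfile layout (a 44-byte header with six counts, a data block with 32-bit times, then a second header and a data block with 64-bit times, then the newline-enclosed string). The function names are hypothetical, and this is not how dateutil or pytz read the file.

```python
import struct

# Header: magic "TZif", one version byte, 15 reserved bytes, six 32-bit counts.
HEADER = struct.Struct(">4sc15x6L")


def _data_block_size(counts, time_size):
    """Size in bytes of the data block that follows a header."""
    isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt = counts
    return (timecnt * time_size        # transition times
            + timecnt                  # transition type indices
            + typecnt * 6              # local time type records
            + charcnt                  # abbreviation characters
            + leapcnt * (time_size + 4)  # leap-second records
            + isstdcnt + isutcnt)      # standard/wall and UT/local indicators


def read_tz_footer(data):
    """Return the trailing TZ string, or None if the file has none."""
    magic, version, *counts = HEADER.unpack_from(data, 0)
    if magic != b"TZif":
        raise ValueError("not a tzfile")
    if version == b"\x00":
        return None  # version 1 has no trailing TZ string
    # Skip the first header and its 32-bit data block.
    offset = HEADER.size + _data_block_size(counts, 4)
    magic2, _version2, *counts2 = HEADER.unpack_from(data, offset)
    if magic2 != b"TZif":
        raise ValueError("second header not found")
    # Skip the second header and its 64-bit data block.
    offset += HEADER.size + _data_block_size(counts2, 8)
    footer = data[offset:]
    if not (footer.startswith(b"\n") and footer.endswith(b"\n")):
        raise ValueError("malformed footer")
    return footer[1:-1].decode("ascii") or None
```

On a system that installs the compiled files under /usr/share/zoneinfo (an assumption; the location varies), `read_tz_footer(open("/usr/share/zoneinfo/Australia/Sydney", "rb").read())` would return the rule string for that zone. A None result for a version 2+ file corresponds to the "not specified" case above.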

Interesting, I wonder if these zonefile parsers were both written before version 2 came out. Neither of them supports the 64-bit files, either. Should be a simple matter to update dateutil accordingly.
On 12/13/2017 03:41 PM, Zefram wrote:
Paul G wrote:
What it boils down to is that when it runs out of transitions, python-dateutil selects the "standard" offset and pytz assumes that after the last transition the value "holds".
Neither of these is correct. The correct answer is that the extended-POSIX-TZ string at the end of the file specifies the behaviour following the last explicit transition. If there is no such string, the behaviour is not specified, but it is implied that it's trickier than POSIX can represent, and it is unwise to assume anything.
In version 1 of the tzfile format, which didn't have the option of any POSIX-ish-TZ string, the type of local time specified by the last explicit transition applies from then until the end of time, namely 2038-01-19T03:14:08Z.
-zefram

Well, what I meant by that was that they are probably both working from the version 1 standard. It was already on my to do list to fix dateutil's support for the newer versions.
Either way doesn't seem amazingly critical since these are just best guesses about timezone transitions 20 years in the future, so hopefully no one is relying on them being accurate (though I imagine at least some people are).
On 12/13/2017 03:51 PM, Zefram wrote:
Paul G wrote:
Neither of them supports the 64-bit files, either.
Well then, they can just use the actual explicit transition (as you say pytz does), and they'll be fine. Unless, by some bizarre chance, the world doesn't end by 2038.
-zefram

On 14 December 2017 at 07:55, Paul G <paul@ganssle.io> wrote:
Well, what I meant by that was that they are probably both working from the version 1 standard. It was already on my to do list to fix dateutil's support for the newer versions.
Either way doesn't seem amazingly critical since these are just best guesses about timezone transitions 20 years in the future, so hopefully no one is relying on them being accurate (though I imagine at least some people are).
Yes, pytz is working from the v1 standard and will fail around 2038. I won't be fixing this in pytz - Python now has the necessary hooks to include and correctly use a tzfile parser in Python core, so the work should happen there. I don't seem to have gotten around to it myself, so interested parties can join the Python datetime SIG at https://mail.python.org/mailman/listinfo/datetime-sig .
--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/

On Tue, Dec 19, 2017, at 06:55, Stuart Bishop wrote:
Yes, pytz is working from the v1 standard and will fail around 2038. I won't be fixing this in pytz - Python now has the necessary hooks to include and correctly use a tzfile parser in Python core, so the work should happen there. I don't seem to have gotten around to it myself, so interested parties can join the Python datetime SIG at https://mail.python.org/mailman/listinfo/datetime-sig .
Speaking of... while doing something unrelated, I noticed that the current version of glibc shipping in Ubuntu fails in 2038 - for all later dates it uses the offset of the last transition as of December 31, 2038 (which is not the 32-bit limit - is that the point that zic stops putting explicit transitions in the 64-bit data by default?)

On 12/20/2017 12:21 PM, Random832 wrote:
I noticed that the current version of glibc shipping in Ubuntu fails in 2038 - for all later dates it uses the offset of the last transition as of December 31, 2038 (which is not the 32-bit limit - is that the point that zic stops putting explicit transitions in the 64-bit data by default?)
No, it appears to be a bug in glibc's implementation of localtime and related functions. The zic-generated files are fine, but 64-bit glibc mishandles timestamps starting in the year 2039 (it handles the year 2038 correctly). Thanks for reporting the problem: I filed a glibc bug report here. https://sourceware.org/bugzilla/show_bug.cgi?id=22639

On 2017-12-20 13:21, Random832 wrote:
On Tue, Dec 19, 2017, at 06:55, Stuart Bishop wrote:
Yes, pytz is working from the v1 standard and will fail around 2038. I won't be fixing this in pytz - Python now has the necessary hooks to include and correctly use a tzfile parser in Python core, so the work should happen there. I don't seem to have gotten around to it myself, so interested parties can join the Python datetime SIG at https://mail.python.org/mailman/listinfo/datetime-sig .
Speaking of... while doing something unrelated, I noticed that the current version of glibc shipping in Ubuntu fails in 2038 - for all later dates it uses the offset of the last transition as of December 31, 2038 (which is not the 32-bit limit - is that the point that zic stops putting explicit transitions in the 64-bit data by default?)
The signed 32 bit time_t limit is 2038 Jan 19 Tue 03:14:07, so libraries with that limit would use the rules in effect at that time for all later times. Some 32 bit or smaller architecture libraries have that limitation.
--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
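For anyone who wants to check the arithmetic, the limit quoted above falls out of adding the largest signed 32-bit value to the Unix epoch. A small sketch (the variable names are just for illustration):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit time_t can hold

# Plain datetime arithmetic, so this does not depend on the platform's time_t.
last_representable = EPOCH + timedelta(seconds=INT32_MAX)
first_overflow = last_representable + timedelta(seconds=1)

print(last_representable.isoformat())  # 2038-01-19T03:14:07+00:00
print(first_overflow.isoformat())      # 2038-01-19T03:14:08+00:00
print(last_representable.weekday())    # 1 (Monday is 0, so a Tuesday)
```

The second value, 03:14:08, is the first instant that no longer fits, which matches the "end of time" figure given earlier in the thread for version 1 files.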

On 12/13/2017 12:41 PM, Zefram wrote:
the extended-POSIX-TZ string at the end of the file specifies the behaviour following the last explicit transition.
Yes, and the initial part of this behavior should agree with the info associated with the last transition. However, this latter constraint wasn't understood until last year, and was addressed only in July 2016: https://github.com/eggert/tz/commit/081c50f30308b589e7e296f485135d03cf046cb1 and older software doesn't respect it.
participants (7)
- Bradley White
- Brian Inglis
- Paul Eggert
- Paul G
- Random832
- Stuart Bishop
- Zefram