Wrong symlinks with BACKWARD=backward PACKRATDATA=backzone
Hi,

the Debian/Ubuntu package uses BACKWARD=backward PACKRATDATA=backzone. The resulting tzdata.zi has the timezone Africa/Asmara, but Africa/Asmera points to Africa/Nairobi instead of the existing Africa/Asmara:

```
$ make AWK=gawk BACKWARD="backward" PACKRATDATA=backzone PACKRATLIST=zone.tab VERSION_DEPS= tzdata.zi
gawk \
  -v DATAFORM=`expr main.zi : '\(.*\).zi'` \
  -v PACKRATDATA='backzone' \
  -v PACKRATLIST='zone.tab' \
  -f ziguard.awk \
  africa antarctica asia australasia europe northamerica southamerica etcetera factory backward backzone >main.zi.out
mv main.zi.out main.zi
{ (type git) >/dev/null 2>&1 && \
V=`git describe --match '[0-9][0-9][0-9][0-9][a-z]*' \
    --abbrev=7 --dirty` || \
  if test 'unknown' = unknown && V=`cat version`; then \
    case $V in *-dirty);; *) V=$V-dirty;; esac; \
  else \
    V='unknown'; \
  fi; } && \
printf '%s\n' "$V" >version.out
mv version.out version
version=`sed 1q version` && \
  LC_ALL=C gawk \
    -v dataform='main' \
    -v deps='ziguard.awk africa antarctica asia australasia europe northamerica southamerica etcetera factory backward backzone zone.tab zishrink.awk' \
    -v redo='posix_right' \
    -v version="$version" \
    -f zishrink.awk \
    main.zi >tzdata.zi.out
mv tzdata.zi.out tzdata.zi
$ grep Asmera tzdata.zi
L Africa/Nairobi Africa/Asmera
```

Can this be fixed/configured?

--
Benjamin Drung
Debian & Ubuntu Developer
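For anyone who wants to check other names the same way, here is a minimal sketch of a lookup helper. The function name `link_target` is made up for illustration and is not part of tzcode; it only assumes the compact tzdata.zi syntax seen in the grep output above, where a link line reads `L TARGET NAME`.

```shell
# Hypothetical helper, not part of tzcode: print the target of link NAME
# in a tzdata.zi-style file, where link lines look like "L TARGET NAME".
# Exits nonzero if no such link exists.
link_target() {
  # $1 = path to the .zi file, $2 = link name to look up
  awk -v name="$2" '
    $1 == "L" && $3 == name { print $2; found = 1 }
    END { exit !found }
  ' "$1"
}

# Example (run next to a generated tzdata.zi):
# target=$(link_target tzdata.zi Africa/Asmera)
# [ "$target" = Africa/Asmara ] || echo "Africa/Asmera -> ${target:-missing}"
```

The helper prints the target rather than comparing it, so the caller decides what the expected target is.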
On 3/6/24 13:29, Benjamin Drung via tz wrote:
the Debian/Ubuntu package uses BACKWARD=backward PACKRATDATA=backzone.
Why is that? In other words, what bug report prompted the use of PACKRATDATA? That may help figure out a reasonable fix, if any is needed.
The resulting tzdata.zi has the timezone Africa/Asmara, but Africa/Asmera points to Africa/Nairobi instead of the existing Africa/Asmara
'backzone' contains out-of-scope and often-wrong data dating back to an older way of doing things. The 'backzone' data entries are not maintained and are not necessarily compatible with the in-scope data in 'africa', 'asia', etc. There is tension between being bug-for-bug compatible with the older tzdata, and being compatible with in-scope data and/or with other platforms. The 'backzone' file is designed for the former case; this minimizes maintenance effort, since it's relatively easy to just keep the 'backzone' data the way it was. I wouldn't want to take on the maintenance burden for the latter case, as that'd be quite a mess.
On Wed, 2024-03-06 at 23:46 -0800, Paul Eggert wrote:
On 3/6/24 13:29, Benjamin Drung via tz wrote:
the Debian/Ubuntu package uses BACKWARD=backward PACKRATDATA=backzone.
Why is that? In other words, what bug report prompted the use of PACKRATDATA? That may help figure out a reasonable fix, if any is needed.
This is the bug report: https://bugs.launchpad.net/ubuntu/+source/tzdata/+bug/2003797 We have lots of users and some of them care about pre-1970 timestamps. We want to satisfy them as well.
The resulting tzdata.zi has the timezone Africa/Asmara, but Africa/Asmera points to Africa/Nairobi instead of the existing Africa/Asmara
'backzone' contains out-of-scope and often-wrong data dating back to an older way of doing things. The 'backzone' data entries are not maintained and are not necessarily compatible with the in-scope data in 'africa', 'asia', etc.
There is tension between being bug-for-bug compatible with the older tzdata, and being compatible with in-scope data and/or with other platforms. The 'backzone' file is designed for the former case; this minimizes maintenance effort, since it's relatively easy to just keep the 'backzone' data the way it was. I wouldn't want to take on the maintenance burden for the latter case, as that'd be quite a mess.
The goal is to get the pre-1970 data back, but keep it unchanged for post 1970.

--
Benjamin Drung
Debian & Ubuntu Developer
On 2024-03-07 01:35, Benjamin Drung via tz wrote:
On Wed, 2024-03-06 at 23:46 -0800, Paul Eggert wrote:
On 3/6/24 13:29, Benjamin Drung via tz wrote:
the Debian/Ubuntu package uses BACKWARD=backward PACKRATDATA=backzone.
Why is that? In other words, what bug report prompted the use of PACKRATDATA? That may help figure out a reasonable fix, if any is needed.
This is the bug report: https://bugs.launchpad.net/ubuntu/+source/tzdata/+bug/2003797
I'm afraid that URL doesn't help figure out a fix, as it's not an end user bug report. It's an announcement that you decided to use PACKRATDATA, inspired by the long discussion on the tz mailing list. This discussion reflected a philosophical disagreement that has little practical effect on end users.

If you'd like to take on the burden of maintaining a higher-effort (and more-political) approach based on an alternative maintenance philosophy, it'd be helpful to propose specific changes in "git format-patch" form. These should be independent of the main data, so presumably they would patch the "backzone" file. It might help to try these changes out on Ubuntu first, by patching there and seeing how well it works there.
On 3/8/24 10:13, Paul Eggert wrote:
it'd be helpful to propose specific changes in "git format-patch" form. These should be independent of the main data, so presumably they would patch the "backzone" file. It might help to try these changes out on Ubuntu first, by patching there and seeing how well it works there.
Another possibility that just occurred to me, is that you could use "make DATAFORM=vanguard" when building tzdata. The vanguard format treats links-to-links in a better way, and this should fix the particular symlink issue you mentioned. It entails several other changes too, though, so it'd be a good idea to review them too.
On Tue, 2024-03-12 at 17:14 -0700, Paul Eggert wrote:
On 3/8/24 10:13, Paul Eggert wrote:
it'd be helpful to propose specific changes in "git format-patch" form. These should be independent of the main data, so presumably they would patch the "backzone" file. It might help to try these changes out on Ubuntu first, by patching there and seeing how well it works there.
Another possibility that just occurred to me, is that you could use "make DATAFORM=vanguard" when building tzdata. The vanguard format treats links-to-links in a better way, and this should fix the particular symlink issue you mentioned. It entails several other changes too, though, so it'd be a good idea to review them too.
I finally tested "make DATAFORM=vanguard". The generated tzdata.zi does not work with zic from glibc 2.38:

```
$ /usr/sbin/zic -d tzdata/tzgen -L /dev/null tzdata.zi
"tzdata.zi", line 59: invalid ending year
"tzdata.zi", line 60: invalid ending year
"tzdata.zi", line 299: invalid ending year
[...]
```

Diffing the tzdata.zi generated with DATAFORM=main against the one generated with DATAFORM=vanguard shows that the lines zic complains about differ:

```
$ diff -u /usr/share/zoneinfo/tzdata.zi tzdata.zi
[...]
@@ -55,8 +56,8 @@
 R K 2014 o - Jun 26 24 0 -
 R K 2014 o - Jul 31 24 1 S
 R K 2014 o - S lastTh 24 0 -
-R K 2023 ma - Ap lastF 0 1 S
-R K 2023 ma - O lastTh 24 0 -
+R K 2023 m - Ap lastF 0 1 S
+R K 2023 m - O lastTh 24 0 -
 R L 1951 o - O 14 2 1 S
 R L 1952 o - Ja 1 0 0 -
 R L 1953 o - O 9 2 1 S
[...]
```

Attached is a patch to introduce DATAFORM=link, which applies just the link-related changes from "vanguard". An alternative would be to move the link changes into the "main" data form and the previous behaviour into "rearguard".

Let me know what you think about the patch.

--
Benjamin Drung
Debian & Ubuntu Developer
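To list the affected lines before feeding a file to an older zic, something like the following could be used. This is only a sketch based on the diff above: it assumes that the lines rejected with "invalid ending year" are exactly the Rule lines whose TO field is the shortened `m`, and the function name `find_short_max` is invented for this example.

```shell
# Hypothetical helper: list Rule lines ("R NAME FROM TO ...") whose TO
# field is the one-letter "m" that zic from glibc 2.38 rejected above.
find_short_max() {
  # $1 = path to a tzdata.zi-style file
  awk '$1 == "R" && $4 == "m" { print FILENAME ":" FNR ": " $0 }' "$1"
}

# Example:
# find_short_max tzdata.zi
```

Each hit is printed with its file name and line number, so the output can be compared directly with the line numbers in zic's error messages.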
On 4/2/24 09:55, Benjamin Drung via tz wrote:
An alternative would be to move the link changes into the "main" data form and the previous behaviour into "rearguard".
Another alternative would be for Ubuntu to upgrade to release TZDB 2024a's zic.c (also private.h and tzfile.h and zdump.c). This would avoid the need for a new configuration option, which would be simpler in the long run. Come to think of it, glibc should do that too. I'll add that to my list of things to do.
On Tue, 2024-04-02 at 13:18 -0700, Paul Eggert wrote:
On 4/2/24 09:55, Benjamin Drung via tz wrote:
An alternative would be to move the link changes into the "main" data form and the previous behaviour into "rearguard".
Another alternative would be for Ubuntu to upgrade to release TZDB 2024a's zic.c (also private.h and tzfile.h and zdump.c). This would avoid the need for a new configuration option, which would be simpler in the long run.
Switching to TZDB's zic would be an option for the development release of Ubuntu, but not for the stable releases: We ship tzdata.zi in /usr/share/zoneinfo and we do not know if users rely on being able to use zic from glibc on that file. Attached the new version of the patch to avoid adding a new data format.
Come to think of it, glibc should do that too. I'll add that to my list of things to do.
That would be great.

--
Benjamin Drung
Debian & Ubuntu Developer
On 2024-04-05 04:42, Benjamin Drung via tz wrote:
Switching to TZDB's zic would be an option for the development release of Ubuntu, but not for the stable releases: We ship tzdata.zi in /usr/share/zoneinfo and we do not know if users rely on being able to use zic from glibc on that file.
Makes sense.
Come to think of it, glibc should do that too. I'll add that to my list of things to do.
That would be great.
I committed that change to the glibc master repository and it should appear in the next glibc release. See <https://sourceware.org/git/?p=glibc.git;a=commit;h=1f94147a79fcb7211f1421b87...>.
Attached the new version of the patch to avoid adding a new data format.
That patch would be problematic, as main.zi should reflect what's in the main files. That is, "cat africa antarctica ..." should output a copy of main.zi.

Let's take a step back for a minute. Today I read the bug report <https://bugs.launchpad.net/ubuntu/+source/tzdata/+bug/2003797> that caused Ubuntu to start using PACKRATLIST=zone.tab recently. This bug report doesn't mention user requests; instead, it seems to have been prompted by your impression that some Ubuntu users might be bothered by tzdata 2022b's changes to pre-1970 Europe/Oslo.

As you can guess, I doubt whether Ubuntu users care much about pre-1970 computer-generated timestamps in Norway. If I'm wrong and some users do care, Ubuntu could instead move the packrat stuff to its tzdata-legacy package, or to a new tzdata-packrat package that contains the packrat info. This would simplify maintenance of the main tzdata package, and would lessen the number of trivial differences between default Ubuntu and elsewhere.

Even if for some reason it's necessary to keep backzone's Europe/Oslo in Ubuntu's tzdata package, that doesn't mean Ubuntu tzdata needs all of backzone's entries. Instead, it could keep just the backzone entries that Ubuntu users actually need, and omit troublemaking entries like Africa/Asmara where the data are low quality and are surely wrong anyway (does anybody really think that the inhabitants of Asmara adjusted their clocks by 12 seconds on January 1, 1890 at midnight? I don't).

Ubuntu could do that by creating a file packrat-ubuntu.tab containing lines like this:

    NO +5955+01045 Europe/Oslo

i.e., just the subset of zone.tab that refers to backzone entries containing high-quality pre-1970 data that users actually need, and then building with:

    make PACKRATDATA=backzone PACKRATLIST=packrat-ubuntu.tab

Surely, though, it would be simpler and more reliable to eliminate the packrat stuff from the tzdata package, possibly moving it to tzdata-legacy or to a new tzdata-packrat package.
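As a rough illustration of how such a subset file might be produced, the sketch below copies selected rows out of zone.tab. The helper name `packrat_subset` is hypothetical, and the code assumes zone.tab's usual layout of tab-separated columns with the TZ name in the third column; which zones to keep is left entirely to the caller.

```shell
# Hypothetical helper: copy from zone.tab only the rows whose TZ column
# (third tab-separated field) names one of the requested zones.
packrat_subset() {
  # $1 = path to zone.tab; remaining arguments = zone names to keep
  tab=$1
  shift
  for zone in "$@"; do
    # Skip comment lines; print rows whose third field equals the zone.
    awk -F '\t' -v zone="$zone" '!/^#/ && $3 == zone' "$tab"
  done
}

# Example (run in the tzdata source directory):
# packrat_subset zone.tab Europe/Oslo >packrat-ubuntu.tab
# make PACKRATDATA=backzone PACKRATLIST=packrat-ubuntu.tab
```

Rows come out in the order the zone names are given, not in zone.tab order; that seemed the simplest choice for a short list.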
On 2024-04-07 14:55, Paul Eggert wrote:
main.zi should reflect what's in the main files. That is, "cat africa antarctica ..." should output a copy of main.zi.
Come to think of it, 'make check' should test that, in the common case where PACKRATLIST is empty. I installed the attached patch to do this.
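The property being tested can be sketched in a few lines of shell. This is not the rule from the attached patch, just a hypothetical stand-alone check (the name `same_as_concat` is invented) of whether a file is byte-identical to the concatenation of a list of source files.

```shell
# Hypothetical check, not the actual 'make check' rule: succeed only if
# the concatenation of the listed source files is byte-identical to the
# candidate file.
same_as_concat() {
  # $1 = candidate file (e.g. main.zi); remaining arguments = source files
  main=$1
  shift
  cat "$@" | cmp -s - "$main"
}

# Example, with PACKRATLIST empty:
# same_as_concat main.zi africa antarctica asia australasia europe \
#   northamerica southamerica etcetera factory backward ||
#   echo "main.zi differs from the main files"
```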
On Wed, 6 Mar 2024 at 21:30, Benjamin Drung via tz <tz@iana.org> wrote:
the Debian/Ubuntu package uses BACKWARD=backward PACKRATDATA=backzone. The resulting tzdata.zi has the timezone Africa/Asmara, but Africa/Asmera points to Africa/Nairobi instead of the existing Africa/Asmara:
Can this be fixed/configured?
As can be seen, the PACKRATDATA approach is not suitable for full backwards compatibility. This is why I'm still maintaining global-tz even though I don't want to:
https://github.com/JodaOrg/global-tz

As can be seen, it correctly adjusts this example:
https://github.com/JodaOrg/global-tz/blob/global-tz/africa#L407
https://github.com/JodaOrg/global-tz/blob/global-tz/backward#L191

I'm happy if Canonical wants to switch to global-tz and/or help maintain it.

Stephen
Participants (3): Benjamin Drung, Paul Eggert, Stephen Colebourne