On 18:44 Tue 08 Jun, Stephen Colebourne via tz wrote:
On Tue, 8 Jun 2021 at 17:23, Paul Eggert via tz <tz@iana.org> wrote:
On 6/8/21 1:56 AM, Stephen Colebourne via tz wrote:
It is possible to automatically derive the data for group 1 from the data for group 2. The other way around is not possible.
That's OK, as I'm not proposing the other way around. I'm merely proposing a flag to generate either group-1-style or group-2-style data from what we have now. This should be doable relatively easily. (And if I'm wrong, we can revisit this technical issue of course.)
Are you proposing a makefile option to recreate the original source files using the data in backzone? That seems much harder than doing it the other way around, and prone to error. (Remember that downstream projects need the correct source files, not any other output file.)
One key advantage of the `idmerge` file is that it makes the merge process obvious, i.e. that TZDB favours Berlin over Stockholm/Oslo. I'm proposing that the historic data is present in its original location (africa to southamerica). And I propose that the flag and `idmerge` provide the necessary tools to reduce the file size for those that need it.
+1
The importance of the full dataset being the default can be seen here for example: https://www.oracle.com/java/technologies/javase/tzupdater-readme.html As can be seen, the docs expect users to download the latest tarball. It isn't reasonable for TZDB to demand all downstream users change their URL. Another variant of the same tool: https://docs.azul.com/core/timezone-updater
I work on OpenJDK for Red Hat, as one of three OpenJDK 8u upstream maintainers [0], and I also package various versions for Fedora, CentOS and RHEL. We have similar tooling (based on what is in OpenJDK itself) in Fedora, CentOS & RHEL to produce the Java data files from the TZDB sources. It serves the same purpose as these tools: it allows updated Java data files to be produced for the latest tzdb update, without the JDK itself having to be updated. I agree with Stephen that expecting every downstream use of this data to be updated is unreasonable when it can just be fixed at source. I don't see a strong case for the new structure to be the default and to expect every downstream user to alter their code. That in itself implies an amount of risk, as much downstream use is in very stable codebases.
Here is another example, where a downstream system has had to adapt to retain compatibility: http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/801874e394a7/make/gendata/Ge... https://github.com/openjdk/jdk/blob/master/make/data/tzdata/jdk11_backward
I think the better link may be: https://hg.openjdk.java.net/jdk8u/jdk8u/jdk/diff/7dcb74c3ffba/makefiles/Gend... which shows the change, and https://hg.openjdk.java.net/jdk8u/jdk8u/jdk/rev/7dcb74c3ffba/makefiles/Genda... which shows the entire changeset. We would have to bring any future tzdata update into the 8u repositories, which are stable repositories designed for security and bug fixes. We don't want to be doing major reworks of TZDB data handling there. On that note, 8u still uses rearguard format, because we decided the risk of switching to vanguard was not worth it. [1] [2]
I know CLDR has previously expressed significant concern about the compatibility issue too.
Stephen
[0] https://wiki.openjdk.java.net/display/jdk8u/Main
[1] https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-June/011931.html
[2] https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-October/012890.html

Thanks,
--
Andrew :)
Pronouns: he / him or they / them
Senior Free Java Software Engineer
OpenJDK Package Owner
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222