Undoing the effect of the new alike-since-1970 patch

There's been a clear need expressed to support tzdb users who would rather not deal with the effects of the recently-proposed alike-since-1970 patch. On the other hand there are also fairness and guideline-oriented reasons for the patch, which was originally discussed and installed with more than our usual care and review. Because of the extensive followup discussions I don't see how a single version could be appealing to both sides of this disagreement.
So I propose we add a Makefile or similar build-time option to let tzdb users have it either way. Set the flag one way, and it will be as if the recently-proposed changes did not occur. Set it the other way, and you'll get the changes.
This will be similar to our vanguard/rearguard support, which helped address a similar disagreement back in 2018. We could even control things via the existing vanguard/rearguard build-time option; that might be simpler. If you have ideas about what the option's specific effects should be, please let me know.
If there is support for this idea, I expect to be able to implement this option soon, in plenty of time before any urgent change due to a real-world rule change arrives. Like most compromises, I expect this approach to not be entirely satisfactory to anybody. However, I hope it helps us move forward together.

Paul Eggert via tz <tz@iana.org> writes:
So I propose we add a Makefile or similar build-time option to let tzdb users have it either way. Set the flag one way, and it will be as if the recently-proposed changes did not occur. Set it the other way, and you'll get the changes.
Works for me, as long as it's possible to get the formerly-default zone data (that is, without the old backzone additions).
This will be similar to our vanguard/rearguard support, which helped address a similar disagreement back in 2018. We could even control things via the existing vanguard/rearguard build-time option; that might be simpler. If you have ideas about what the option's specific effects should be, please let me know.
Please *do not* tie this to vanguard/rearguard. The fact that tzdb consumers might disagree with your revisions to the data classification doesn't mean that they want to get locked into obsolete file formats.
regards, tom lane

On Sat, 5 Jun 2021 at 04:14, Paul Eggert via tz <tz@iana.org> wrote:
So I propose we add a Makefile or similar build-time option to let tzdb users have it either way. Set the flag one way, and it will be as if the recently-proposed changes did not occur. Set it the other way, and you'll get the changes.
Unfortunately, this approach doesn't work for the downstream users I represent unless the default data in tzdb is as it was before the patch. This is because the downstream users consume the source code files without using the makefile. As has been discussed previously, this is a common way that tzdb is consumed, making the default data critical. And as discussed previously, there are a large number of independent users consuming the data who would not all know about the need for change. I have no problem with an option in the makefile that automatically merges zones where they are the same after 1970, but by default the data needs to be present in the source file as it was previously. My view is that the need for a more compact tzdb output is likely to be limited to space-constrained devices whose developers are more likely to be aware of such a flag.
This will be similar to our vanguard/rearguard support, which helped address a similar disagreement back in 2018.
It is important to understand that rearguard did not solve the problem in 2018. Downstream projects were still forced to change their code. Joda-Time has explicit (and flaky) code in place to revert the 2018 changes as it reads in the source files. This is because Joda-Time users do not use the makefile, so there is no direct way to access the rearguard data.
On the other hand there are also fairness and guideline-oriented reasons for the patch
FWIW, I entirely agree that it is wrong that there is a zone with full history for Norway and Sweden but not for Angola or Slovakia. The difference is that I think TZDB should be expanded to include the missing history, not shrunk.
Given the above, I think the right course of action is to revert the change first, and then work on agreeing what the target state of TZDB should be (which deserves a separate thread).
Stephen

On 6/6/21 1:07 AM, Stephen Colebourne via tz wrote:
rearguard did not solve the problem in 2018. Downstream projects were still forced to change their code.
They were not forced to. They could have used (and still can use) the Makefile to generate files in the format they prefer. This generation process doesn't need to be done in Java; it can be done offline on any POSIX platform, and the result can be redistributed and given to whatever Java code still requires rearguard format. If that process is too much trouble for downstream users, we can arrange to generate the alternate source upstream. For example, we could ship a tarball with an altered tzdata.zi file (this is what Tom suggested).

On Sun, 6 Jun 2021 at 19:59, Paul Eggert <eggert@cs.ucla.edu> wrote:
If that process is too much trouble for downstream users, we can arrange to generate the alternate source upstream. For example, we could ship a tarball with an altered tzdata.zi file (this is what Tom suggested).
As noted previously, there are two main groups in the debate - minimal and maximal.
1) The minimal group believe TZDB should only focus on data after 1970 and use the minimum set of IDs necessary to achieve that (considering the ID name to have no meaning or value pre-1970).
2) The maximal group believe TZDB should place greater value on pre-1970 data and the name of the ID (considering that governmental authorities are an important part of time-keeping).
It is possible to automatically derive the data for group 1 from the data for group 2. The other way around is not possible.
I'm suggesting that the default data in the repo is for group 2 with a way to derive group 1. This could result in two separate downloadable tarballs.
I imagine the tasks would be:
- take the relevant parts of the `backzone` data and put it back in the main files (Canada/Montreal would not be put back for example)
- add a new `idmerge` file identifying IDs that are the same post-1970 and which ID should be the one to have its history retained
- alter the makefile to have a "minimal" option to use the new `idmerge` file
An alternative approach that avoids the `idmerge` file would be for the "minimal" option to simply strip all pre-1970 data.
Stephen
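To make the `idmerge` idea concrete, here is a minimal sketch of how such a file might be read and applied. The thread does not define a file format, so the layout assumed here (one line per group, the retained ID first, the merged IDs after it, `#` for comments) and the function names `parse_idmerge` and `minimal_view` are purely hypothetical. The Berlin/Oslo/Stockholm grouping is only an illustration borrowed from the discussion.

```python
def parse_idmerge(text: str) -> dict[str, str]:
    """Map each merged ID to the ID whose history is retained.

    Assumed (hypothetical) format: "KEPT_ID MERGED_ID [MERGED_ID ...]" per line.
    """
    merged_to_kept: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        kept, *merged = line.split()
        for zone_id in merged:
            merged_to_kept[zone_id] = kept
    return merged_to_kept


def minimal_view(zone_ids: list[str], merged_to_kept: dict[str, str]):
    """Split the full ID list into zones to keep and Link lines to emit."""
    zones = [z for z in zone_ids if z not in merged_to_kept]
    # tzdb Link lines have the form: Link TARGET LINK-NAME
    links = [f"Link\t{merged_to_kept[z]}\t{z}" for z in zone_ids if z in merged_to_kept]
    return zones, links
```

With a design like this the full data stays the default, and the "minimal" build is a pure filter over it: zones named as merged become links, and everything else passes through unchanged.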

Nice and clear workable solution, as far as I can see. On 2021-06-08 04:56, Stephen Colebourne via tz wrote:
[...]

On 6/8/21 1:56 AM, Stephen Colebourne via tz wrote:
It is possible to automatically derive the data for group 1 from the data for group 2. The other way around is not possible.
That's OK, as I'm not proposing the other way around. I'm merely proposing a flag to generate either group-1-style or group-2-style data from what we have now. This should be doable relatively easily. (And if I'm wrong, we can revisit this technical issue of course.)

On Tue, 8 Jun 2021 at 17:23, Paul Eggert via tz <tz@iana.org> wrote:
On 6/8/21 1:56 AM, Stephen Colebourne via tz wrote:
It is possible to automatically derive the data for group 1 from the data for group 2. The other way around is not possible.
That's OK, as I'm not proposing the other way around. I'm merely proposing a flag to generate either group-1-style or group-2-style data from what we have now. This should be doable relatively easily. (And if I'm wrong, we can revisit this technical issue of course.)
Are you proposing a makefile option to recreate the original source files using the data in backzone? That seems much harder work than doing it the other way around, and prone to error. (Remember that downstream projects need the correct source files, not any other output file.)
One key advantage of the `idmerge` file is that it makes the merge process obvious, i.e. that TZDB favours Berlin over Stockholm/Oslo.
I'm proposing that the historic data is present in its original location (africa to southamerica). And I propose that the flag and `idmerge` provide the necessary tools to reduce the file size for those that need it.
The importance of the full dataset being the default can be seen here for example:
https://www.oracle.com/java/technologies/javase/tzupdater-readme.html
As can be seen, the docs expect users to download the latest tarball. It isn't reasonable for TZDB to demand all downstream users change their URL.
Another variant of the same tool:
https://docs.azul.com/core/timezone-updater
Here is another example, where a downstream system has had to adapt to retain compatibility:
http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/801874e394a7/make/gendata/Ge...
https://github.com/openjdk/jdk/blob/master/make/data/tzdata/jdk11_backward
I know CLDR has previously expressed significant concern about the compatibility issue too.
Stephen

On 6/8/21 10:44 AM, Stephen Colebourne via tz wrote:
Are you proposing a makefile option to recreate the original source files using the data in backzone?
Not byte-for-byte copies, no. Just a functional copy containing the equivalent of what the data would have looked like had the new alike-since-1970 patch not been installed. One way to do this would be to generate a file that contains what tzdata.zi does, except it omits the effect of the alike-since-1970 patch. This is basically what Tom Lane was asking for here: https://mm.icann.org/pipermail/tz/2021-June/030198.html and I expect it wouldn't be too hard to automate that.

On Tue, 8 Jun 2021 at 19:21, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 6/8/21 10:44 AM, Stephen Colebourne via tz wrote:
Are you proposing a makefile option to recreate the original source files using the data in backzone?
Not byte-for-byte copies, no. Just a functional copy containing the equivalent of what the data would have looked like had the new alike-since-1970 patch not been installed.
I'm not worried about the comments if that is what you are thinking of. I'm more concerned about exactly what the output you are proposing will contain - it is not clear to me.
At this point, I'm more interested in getting to a resolution that is suitable for the long term. I've yet to see a willingness to engage on backwards compatibility - to stop fiddling with the data. A key part of that is engaging on the topic of what constitutes a minimum set of IDs. Java, CLDR and probably more need that for their compatibility. (i.e. just reverting this one patch is insufficient to reach a stable resolution, because of the unfairness of Angola/Slovakia)
Look at it this way, your patch (and probably previous ones) are making a political statement of the kind you say you don't want. The file `europe` contains full commentary for Germany:
https://github.com/eggert/tz/blob/main/europe#L1397-L1452
yet Sweden and Norway get just 2 lines:
https://github.com/eggert/tz/blob/main/europe#L3383-L3384
with the actual commentary relegated to the semi-trash can of `backzone`. Where is the fairness in that?
I'd also point out that the file is structured by country, which makes a mockery of the idea that timezones are not connected to countries.
Stephen

On 2021-06-09 09:29, Stephen Colebourne via tz wrote:
Look at it this way, your patch (and probably previous ones) are making a political statement of the kind you say you don't want. The file `europe` contains full commentary for Germany: https://github.com/eggert/tz/blob/main/europe#L1397-L1452 yet Sweden and Norway get just 2 lines: https://github.com/eggert/tz/blob/main/europe#L3383-L3384 with the actual commentary relegated to the semi-trash can of `backzone`. Where is the fairness in that? I'd also point out that the file is structured by country, which makes a mockery of the idea that timezones are not connected to countries.
I think you treat the tzdb data a bit harshly. tzdb uses political countries only internally, as regions of the surface of the Earth. This is a matter of practicality, not of concept. No "political statements" are made, and none are contained in the TZif files for users.
And "backzone" should be considered as a repository for historical data that is (currently) not needed by the majority of tzdb users. Some of this data has been carefully researched by a multitude of people, and is available nowhere else. And there are a few users of this data!
Comments are located close to the "Zone" and "Rule" data they concern (except for the NOTEs in australasia) -- I find that fair.
Michael Deckers.

On 6/9/21 2:29 AM, Stephen Colebourne via tz wrote:
I'm not worried about the comments if that is what you are thinking of. I'm more concerned about exactly what the output you are proposing will contain - it is not clear to me.
Yes, I was definitely thinking of the comments. Also, the ordering of Zones (which is irrelevant). Stuff like that. The idea is that the generated TZif files should be identical to what they would have been without the alike-since-1970 patch. I would expect similar results in downstream systems that don't use TZif.
I've yet to see a willingness to engage on backwards compatibility - to stop fiddling with the data.
How about this: we could say that we won't do any sort of merging like this in the future. In other words, this is the last time we'll be merging legacy zones because they differ only before 1970. Would a statement like that help? We could put such a statement into the NEWS file, say.
your patch (and probably previous ones) are making a political statement of the kind you say you don't want.
Sure, but no matter what we do we'll be making a political statement. The statement I'd like to see is "let's avoid political data when we can". Admittedly not everyone agrees (which is also politics :-).
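The acceptance criterion stated above (the generated TZif files should be identical to what they would have been without the patch) can be checked mechanically. The following is only a sketch of such a check, not part of tzdb: it assumes two already-built output trees on disk, and the function names `tree_digests` and `differing_files` are made up for illustration.

```python
import hashlib
from pathlib import Path


def tree_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differing_files(before: Path, after: Path) -> list[str]:
    """Names present in only one tree, or whose contents differ."""
    a, b = tree_digests(before), tree_digests(after)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result would mean the two builds are byte-for-byte equivalent. Note that this sketch treats a link and a regular file with the same bytes as equal, which matches the "functional copy" goal rather than a source-level comparison.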

I have to say that the idea of a `make` flag for this purpose seems like compromise for the sake of compromise. Isn't reducing the maintenance burden the primary point of the zone merging and the move of historical data into backzone? By adding a flag that generates the old and new versions, you still have to maintain the historical data (in backzone), and /also/ you need to maintain a program that converts the new version into the old version.
My understanding of `vanguard` / `rearguard` was that these were supposed to be make targets where you use "rearguard" to buy yourself time to support deprecated features (or load-bearing bugs...) and you use "vanguard" to opt in to stuff that will soon be the default. Unlike those, this proposal seems to be /indefinitely/ introducing a schism.
The only reason I can see for doing it this way is if the concern were about the sizes of tzdata binaries, so that people who need extra-slim distributions could merge a bunch of zones. But honestly, for those rare use cases it seems like it would be easier to undo the patch in its entirety and add a utility that detects when a set of zones is identical after a specified date (doesn't have to be 1970!) and merges all the rest into links, for people making custom small distributions of this nature.
On 6/9/21 4:12 PM, Paul Eggert via tz wrote:
[...]
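The utility suggested in the message above, one that detects zones that are identical after a chosen date and turns the extras into links, might look roughly like this. It is a sketch under stated assumptions, not an existing tzdb tool: each zone is modelled as a sorted list of `(UTC timestamp, UTC offset in seconds, abbreviation)` entries whose first entry is the zone's initial state, the link target is arbitrarily the alphabetically first ID in each group, and the zone names in any usage are synthetic.

```python
from bisect import bisect_right
from collections import defaultdict

# (UTC timestamp, UTC offset in seconds, abbreviation); a simplified model
Transition = tuple[int, int, str]


def signature_since(transitions: list[Transition], cutoff: int):
    """State in effect at `cutoff` plus every transition after it."""
    times = [t[0] for t in transitions]
    i = bisect_right(times, cutoff)  # index of first transition strictly after cutoff
    in_effect = transitions[i - 1][1:] if i else None
    return (in_effect, tuple(transitions[i:]))


def merge_alike(zones: dict[str, list[Transition]], cutoff: int) -> dict[str, str]:
    """Return {link_name: target} for zones that agree from `cutoff` onward."""
    groups: dict[object, list[str]] = defaultdict(list)
    for name in sorted(zones):
        groups[signature_since(zones[name], cutoff)].append(name)
    # Arbitrary choice for this sketch: the first name in sorted order is the target.
    return {name: names[0] for names in groups.values() for name in names[1:]}
```

Because the cutoff is a parameter, the same tool would serve anyone who wants a threshold other than 1970, and the choice of which ID survives could be replaced by an explicit policy instead of alphabetical order.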

On Wed, 9 Jun 2021 at 21:13, Paul Eggert <eggert@cs.ucla.edu> wrote:
Yes, I was definitely thinking of the comments. Also, the ordering of Zones (which is irrelevant). Stuff like that.
Neither comments nor ordering would be significant.
I've yet to see a willingness to engage on backwards compatibility - to stop fiddling with the data.
How about this: we could say that we won't do any sort of merging like this in the future. In other words, this is the last time we'll be merging legacy zones because they differ only before 1970. Would a statement like that help? We could put such a statement into the NEWS file, say.
It would be clearer to place an explicit statement in the charter or theory file. That the TZDB co-ordinator will not merge timezones or perform other actions that remove data from the database.
your patch (and probably previous ones) are making a political statement of the kind you say you don't want.
Sure, but no matter what we do we'll be making a political statement. The statement I'd like to see is "let's avoid political data when we can".
Country-based politics can be avoided by outsourcing the decision to ISO-3166. When they recognise a country, a zone ID must also exist. It is an incredibly simple rule, and easy for any drive-by commenters to understand.
As Paul Ganssle says, the route you are trying to pursue really doesn't make any logical sense, because it is more work than putting all the data in the main files and having the makefile perform a simple filter by date. Once the premise is accepted that the backzone data is a meaningful part of the project, there is literally no point in retaining it in backzone (rather than europe or africa).
If you are willing to publish tarballs containing what is currently in backzone, does that mean that it is now accepted that the backzone is a valid and meaningful part of TZDB? That it can be enhanced if found to be wrong?
Like Paul Ganssle, I really think the line you are trying to draw between the main files and backzone doesn't make any sense. Once the premise that the data currently in backzone matters is accepted, it should be restored to its rightful place in the main files, and tooling used to merge zones (which would be consistent, as opposed to the piecemeal approach of the past few years).
Stephen

On 6/9/21 2:41 PM, Stephen Colebourne via tz wrote:
It would be clearer to place an explicit statement in the charter or theory file.
Sure, we could make this statement a guideline in the theory file (that's where the guidelines are) instead of just a NEWS entry. Would that do?
Country-based politics can be avoided by outsourcing the decision to ISO-3166.
That would help, but it would not be nearly enough. Country codes are not our biggest political issue, as witness our long discussions over how to spell certain entries, which city should be used in an area, when exactly some foreign power controlled some location, etc. The more unnecessary Zones we have, the more of these unnecessary discussions we'll have, particularly as the unnecessary Zones will be present purely for political reasons. (And ISO 3166 does have a country code for Kosovo, so even the country-code issue can and plausibly will be disputed.) Besides, we're better off the less we couple to the UN or to the ISO or whatever.
Once the premise is accepted that the backzone data is a meaningful part of the project
Nobody is saying backzone is meaningless. However, it doesn't logically follow that because backzone has meaning to some, it must be used by all; or even that backzone should be maintained to the same standard as the mainline data (which it's not). Being lower-priority isn't the same as being frozen: as I wrote last week <https://mm.icann.org/pipermail/tz/2021-June/030181.html> we can take good patches for 'backzone'. Goodness knows it needs them; it has too many errors and inconsistencies and unfairnesses. That being said, I urge potential contributors to focus on the mainline data instead, as we have trouble enough there and our collective but limited resources are best focused there.

On Thu, 10 Jun 2021 at 18:37, Paul Eggert via tz <tz@iana.org> wrote:
On 6/9/21 2:41 PM, Stephen Colebourne via tz wrote:
It would be clearer to place an explicit statement in the charter or theory file.
Sure, we could make this statement a guideline in the theory file (that's where the guidelines are) instead of just a NEWS entry. Would that do?
If a statement is to be made it should be in both news and theory.
(And ISO 3166 does have a country code for Kosovo, so even the country-code issue can and plausibly will be disputed.)
I can't find evidence to support that. Kosovo is not present in the official ISO browser tool. There is a code that is being used by some for Kosovo, but it isn't formally endorsed by ISO: "The code XK is being used by the European Commission,[25] the IMF, and SWIFT,[26] CLDR and other organizations as a temporary country code for Kosovo." https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2#RS
Once the premise is accepted that the backzone data is a meaningful part of the project
Nobody is saying backzone is meaningless. However, it doesn't logically follow that because backzone has meaning to some, it must be used by all; or even that backzone should be maintained to the same standard as the mainline data (which it's not).
Being lower-priority isn't the same as being frozen: as I wrote last week <https://mm.icann.org/pipermail/tz/2021-June/030181.html> we can take good patches for 'backzone'. Goodness knows it needs them; it has too many errors and inconsistencies and unfairnesses.
"maintained to the same standard" is a phrase that causes confusion. There is no guarantee that the history of a zone in the main files is correct. In fact, much of the data in backzone is more correct than that in the main files. If all the data in backzone had been left in the main files, no one would have batted an eyelid - it's just more data about timezones.
As a reminder, what I object to is for users of TZDB Europe/Oslo to get any data from Europe/Berlin. Or for users of Europe/Amsterdam to get any data from Europe/Brussels. Your proposed compromise is not acceptable on that basis. Only two solutions appear acceptable to me:
- no pre-1970 data in the main files (*all* pre-1970 data would be in separate files, and opt-in)
- revert the patch and any previous patches that merged zones across country borders
But these solutions should apply to all users (not just the ones I represent). The basic idea that you take an ID for one place and get the pre-1970 history for somewhere completely different (in a different country) isn't tenable for TZDB however much you might wish it was.
Given your unwillingness to accept that merging timezones across country boundaries is unacceptable, perhaps it might be more fruitful to explore whether the TZDB community would be willing to accept the other option - no pre-1970 data in the main files?
Stephen

On 6/10/21 4:28 PM, Stephen Colebourne via tz wrote:
If a statement is to be made it should be in both news and theory.
Sure, that could be done. Proposed draft attached. I have not installed this in the development sources.
- no pre-1970 data in the main files (*all* pre-1970 data would be in separate files, and opt-in)
It should be easy to get the effect of this, by running the command 'make ZFLAGS=-r@0', which omits pre-1970 data in the generated TZif files. A similar option could be added to the Java-based equivalent to zic. If that doesn't suffice for some reason, we could add a flag to generate a tarball with pre-1970 data removed from the source files; this would be like the already-existing rearguard-format tarball except it would be an independent option.
- revert the patch and any previous patches that merged zones across country borders
This alternative is less appealing, for reasons already discussed. I think we're better off with a technical compromise, such as the 'make' one-liner mentioned above, or something like the compromise I suggested at the start of this thread <https://mm.icann.org/pipermail/tz/2021-June/030220.html>.
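For readers unfamiliar with what a lower bound such as `-r@0` means for the generated data, the effect can be pictured with a deliberately simplified model. This is not zic's actual algorithm or output format: it assumes a zone is a sorted list of `(UTC timestamp, UTC offset in seconds, abbreviation)` entries, and the function name `truncate_before` is invented for this sketch.

```python
# Simplified model of one zone: (UTC timestamp, UTC offset in seconds, abbreviation)
Transition = tuple[int, int, str]


def truncate_before(transitions: list[Transition], lo: int) -> list[Transition]:
    """Drop history before `lo`, keeping the state that is in effect at `lo`."""
    earlier = [t for t in transitions if t[0] <= lo]
    later = [t for t in transitions if t[0] > lo]
    if not earlier:
        return later  # nothing known at or before `lo`
    _, offset, abbr = earlier[-1]
    # The last earlier state becomes the new starting state, re-stamped at `lo`.
    return [(lo, offset, abbr)] + later
```

With `lo = 0` (the start of 1970 UTC), two zones whose histories differ only before 1970 truncate to the same list, which is why a date filter of this kind gives the "no pre-1970 data" effect without touching the source files.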

On Mon, 14 Jun 2021 at 08:18, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 6/10/21 4:28 PM, Stephen Colebourne via tz wrote:
If a statement is to be made it should be in both news and theory.
Sure, that could be done. Proposed draft attached. I have not installed this in the development sources.
The patch is fine so far as it goes. But it makes a mockery of not reverting the merging patch under discussion.
- revert the patch and any previous patches that merged zones across country borders
This alternative is less appealing, for reasons already discussed. I think we're better off with a technical compromise, such as the 'make' one-liner mentioned above, or something like the compromise I suggested at the start of this thread <https://mm.icann.org/pipermail/tz/2021-June/030220.html>.
We seem to be at an impasse. I don't think there is any support from the mailing list for the merging patch to remain in the repo. You've had many requests to revert it, and no requests to retain it. There are technical solutions available to reduce the amount of data published to downstream users, but the starting point must be a fully populated database, not one that is logically broken. The next action must be to revert. Then we can agree on any technical measures necessary. Stephen

agreed On 2021-06-15 10:56, Stephen Colebourne via tz wrote:
[...]

agreed (2) On 15.06.21 17:12, David Patte via tz wrote:
agreed
On 2021-06-15 10:56, Stephen Colebourne via tz wrote:
[...]

Dear tz@iana.org, On Tuesday, 15 June 2021 19:00:37 CEST Alois Treindl via tz wrote: agreed (3)
agreed (2)
On 15.06.21 17:12, David Patte via tz wrote:
agreed
[...]
We seem to be at an impasse.
I don't think there is any support from the mailing list for the merging patch to remain in the repo. You've had many requests to revert it, and no requests to retain it.
There are technical solutions available to reduce the amount of data published to downstream users, but the starting point must be a fully populated database, not one that is logically broken. The next action must be to revert. Then we can agree on any technical measures necessary.
Stephen
Cheers, Jürgen -- Jürgen Appel Research Scientist Denmark's National Metrology Institute Dansk Fundamental Metrologi, DFM A/S (dfm.dk) Kogle Allé 5 DK-2970 Hørsholm Denmark Mobile: +45 25459049 Email: jap@dfm.dk VAT: DK-29217939

Agreed Mark (📱) On Wed, Jun 16, 2021, 01:30 Jürgen Appel via tz <tz@iana.org> wrote:
[...]

Agreed
Debbie
On Jun 16, 2021, at 7:08 AM, Mark Davis ☕️ via tz <tz@iana.org> wrote:
Agreed
Mark (📱)
On Wed, Jun 16, 2021, 01:30 Jürgen Appel via tz <tz@iana.org> wrote:
Dear tz@iana.org,
On Tuesday, 15 June 2021 19:00:37 CEST Alois Treindl via tz wrote:
agreed (3)
agreed (2)
On 15.06.21 17:12, David Patte via tz wrote:
agreed
[...]
We seem to be at an impasse.
I don't think there is any support from the mailing list for the merging patch to remain in the repo. You've had many requests to revert it, and no requests to retain it.
There are technical solutions available to reduce the amount of data published to downstream users, but the starting point must be a fully populated database, not one that is logically broken. The next action must be to revert. Then we can agree on any technical measures necessary.
Stephen
Cheers, Jürgen
-- Jürgen Appel Research Scientist Denmark's National Metrology Institute Dansk Fundamental Metrologi, DFM A/S (dfm.dk) Kogle Allé 5 DK-2970 Hørsholm Denmark
Mobile: +45 25459049 Email: jap@dfm.dk VAT: DK-29217939

There hasn’t been any info on the below in the last 10 days, is anything happening in the background which the list is not privy to? Or are we in a stalemate between the TZ coordinator and the opponents of this change?
From: tz <tz-bounces@iana.org> On Behalf Of Deborah Goldsmith via tz
Sent: 18 June 2021 00:09
To: Mark Davis <mark@macchiato.com>
Cc: Time Zone Mailing List <tz@iana.org>
Subject: Re: [tz] Undoing the effect of the new alike-since-1970 patch
Agreed
Debbie
On Jun 16, 2021, at 7:08 AM, Mark Davis ☕️ via tz <tz@iana.org> wrote:
Agreed
Mark (📱)
On Wed, Jun 16, 2021, 01:30 Jürgen Appel via tz <tz@iana.org> wrote:
Dear tz@iana.org,
On Tuesday, 15 June 2021 19:00:37 CEST Alois Treindl via tz wrote:
agreed (3)
agreed (2)
On 15.06.21 17:12, David Patte via tz wrote:
agreed
[...]
We seem to be at an impasse.
I don't think there is any support from the mailing list for the merging patch to remain in the repo. You've had many requests to revert it, and no requests to retain it.
There are technical solutions available to reduce the amount of data published to downstream users, but the starting point must be a fully populated database, not one that is logically broken. The next action must be to revert. Then we can agree on any technical measures necessary.
Stephen
Cheers, Jürgen
-- Jürgen Appel Research Scientist Denmark's National Metrology Institute Dansk Fundamental Metrologi, DFM A/S (dfm.dk) Kogle Allé 5 DK-2970 Hørsholm Denmark
Mobile: +45 25459049 Email: jap@dfm.dk VAT: DK-29217939

On Mon, Jun 28, 2021 at 5:36 AM Paw Boel Nielsen via tz <tz@iana.org> wrote:
There hasn’t been any info on the below in the last 10 days, is anything happening in the background which the list is not privy to? Or are we in a stalemate between the TZ coordinator and the opponents of this change?
Hi, I'm one of the two Area Directors for the Applications and Real-Time area at the IETF. This issue has been brought to our attention under BCP 175, Section 5. I am in contact with the appropriate people and will work with the community to develop a path forward from here. I can't give a promise yet on timing, but you can expect an update soon. -MSK

Hi Murray,
Is there any update? If 2021b comes out tomorrow, without the patch being reverted that would be a lot of work for Android (and other projects too, I believe). I do agree with Stephen.
On Mon, 28 Jun 2021 at 20:37, Murray S. Kucherawy via tz <tz@iana.org> wrote:
On Mon, Jun 28, 2021 at 5:36 AM Paw Boel Nielsen via tz <tz@iana.org> wrote:
There hasn’t been any info on the below in the last 10 days, is anything happening in the background which the list is not privy to? Or are we in a stalemate between the TZ coordinator and the opponents of this change?
Hi, I'm one of the two Area Directors for the Applications and Real-Time area at the IETF.
This issue has been brought to our attention under BCP 175, Section 5. I am in contact with the appropriate people and will work with the community to develop a path forward from here. I can't give a promise yet on timing, but you can expect an update soon.
-MSK

On 9/13/21 6:33 AM, Almaz Mingaleev via tz wrote:
If 2021b comes out tomorrow, without the patch being reverted that would be a lot of work for Android
We do expect some progress on this issue shortly. In the meantime please don't worry; 2021b is not coming out tomorrow. By the way, what work would be needed for Android? Can you point to Android code that would need changing, for example? My impression is that the patch has little effect on Android users; if I'm wrong, it'd be nice to know details.

Platform-wise the main issue is that we can no longer use zone.tab to validate our region <-> time zone mapping. In the Android time zone picker, the user selects a region first and is then shown a list of the time zones available for that region. The opposite mapping is needed to show the region on the same screen when the user has already selected a time zone.
Our biggest concern is consistency, not just within Android [1] but across everything. So far TZDB has been the source of truth for which time zones exist and how they map to ISO 3166 codes. With the patch applied, each platform or library might add its own solution to work around that (maybe even create its own fork). Over time they might diverge from TZDB.
Downstream consumers of TZDB time zone IDs (e.g. Android, ICU, OpenJDK, geonames, timezone-boundary-builder, etc.) and their users, rightly or wrongly, associate time zone IDs with countries. More precisely, it is usually not the IDs themselves that they associate with countries, but the metadata that other projects have attached to the IDs (e.g. exemplar locations, display names like "British Standard Time"), on the assumption that the IDs are country specific.
We think the current state of TZDB is likely to lead to the kind of problem that Stephen pointed out in https://mm.icann.org/pipermail/tz/2021-May/030142.html and we support something like his proposal there: "tzdb should provide is a full time-zone with history (not a Link) for each region identified by an ISO-3166-1 code". With TZDB abdicating responsibility for the mapping between time zone and ISO code, the problem doesn't go away; it just gets pushed to the many downstream projects, which have no choice but to work the way users have come to expect.
From an Android perspective (ignoring other platforms/libraries/etc) the patch just adds an extra maintenance burden and creates inconsistency risks.
[1] There are several time zone related APIs available on Android: java.util.TimeZone, the java.time package, ICU4C/ICU4J, and the C API provided by bionic (Android’s implementation of libc). bionic and java.util.TimeZone read TZif files, java.time is based on ICU4J, and ICU uses its own .dat file.
On Mon, 13 Sept 2021 at 22:30, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 9/13/21 6:33 AM, Almaz Mingaleev via tz wrote:
If 2021b comes out tomorrow, without the patch being reverted that would be a lot of work for Android
We do expect some progress on this issue shortly. In the meantime please don't worry; 2021b is not coming out tomorrow.
By the way, what work would be needed for Android? Can you point to Android code that would need changing, for example? My impression is that the patch has little effect on Android users; if I'm wrong, it'd be nice to know details.
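As an illustration of the region <-> time zone validation described in the message above, here is a minimal sketch of how a platform might build both mappings from zone.tab and check its own table against them. It assumes the usual zone.tab layout (tab-separated columns: country code, coordinates, TZ name, optional comment; lines starting with '#' are comments). The function names and the platform_map table are hypothetical and are not taken from Android or any other project.

```python
from collections import defaultdict


def parse_zone_tab(text):
    """Return (region -> [tzids], tzid -> region) built from zone.tab-style text."""
    region_to_zones = defaultdict(list)
    zone_to_region = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a data row
        code, _coords, tzid = fields[:3]
        region_to_zones[code].append(tzid)
        zone_to_region[tzid] = code
    return dict(region_to_zones), zone_to_region


def unknown_zones(platform_map, region_to_zones):
    """List (region, tzid) pairs in platform_map that zone.tab does not list."""
    problems = []
    for code, tzids in platform_map.items():
        listed = set(region_to_zones.get(code, []))
        problems.extend((code, tz) for tz in tzids if tz not in listed)
    return problems


# Sample rows copied from the zone.tab diff quoted later in this thread.
SAMPLE = (
    "# sample rows only\n"
    "KI\t+0125+17300\tPacific/Tarawa\tGilbert Islands\n"
    "KI\t-0247-17143\tPacific/Kanton\tPhoenix Islands\n"
    "KI\t+0152-15720\tPacific/Kiritimati\tLine Islands\n"
    "TO\t-210800-1751200\tPacific/Tongatapu\n"
)
regions, zones = parse_zone_tab(SAMPLE)

# Hypothetical platform table that still uses the pre-rename ID.
platform_map = {"KI": ["Pacific/Tarawa", "Pacific/Enderbury"],
                "TO": ["Pacific/Tongatapu"]}
stale = unknown_zones(platform_map, regions)
```

A check like this only works while each region's rows in zone.tab name the IDs the platform expects, which is the dependency the message above describes.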

On 9/14/21 11:41 AM, Almaz Mingaleev wrote:
Platform-wise the main issue is that we can no longer use zone.tab to validate our region <-> time zone mapping.
Thanks for mentioning this. I was unaware that the compatibility concern focused on zone.tab. Since zone.tab is present only for compatibility reasons, it makes sense to revert the changes to it. Proposed patch attached. I have not installed this patch into the development repository yet though, as it's a sensitive topic and I want people to have time for comments.
In the long run I suggest looking at zone1970.tab, as it has fewer entries and should be more convenient for users who simply want to set a time zone. It should even be possible to add a file 'zonenow.tab' that lists only zones that differ for current and future timestamps; that would be even smaller and so should be even simpler for users. None of this is urgent, though.

On 9/14/21 2:59 PM, Paul Eggert wrote:
Since zone.tab is present only for compatibility reasons, it makes sense to revert the changes to it. Proposed patch attached.
Almaz, have you had a chance to look at that patch? It's archived here:
https://mm.icann.org/pipermail/tz/2021-September/030388.html
I ask because it looks like we'll need a new tzdb release soon, due to Samoa's recent decision to discontinue DST:
https://mm.icann.org/pipermail/tz/2021-September/030398.html
and I'm hoping that the patch would ameliorate at least some of the potential compatibility issues mentioned in the alike-since-1970 area.

Thanks for the patch, Paul. I've applied it, but it is not what zone.tab was in 2021a - lots of countries are mapped to America/Port_of_Spain in your patch. Also HR is mapped to Europe/Belgrade. Please see full 2021a and 2021b diff attached.
Is it safe to revert zone.tab to what it was in 2021a <https://github.com/eggert/tz/blob/3831c591e188edc16d1a6855fb20ebee78c4e27b/z...> ?
On Mon, 20 Sept 2021 at 08:31, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 9/14/21 2:59 PM, Paul Eggert wrote:
Since zone.tab is present only for compatibility reasons, it makes sense to revert the changes to it. Proposed patch attached.
Almaz, have you had a chance to look at that patch? It's archived here:
https://mm.icann.org/pipermail/tz/2021-September/030388.html
I ask because it looks like we'll need a new tzdb release soon, due to Samoa's recent decision to discontinue DST:
https://mm.icann.org/pipermail/tz/2021-September/030398.html
and I'm hoping that the patch would ameliorate at least some of the potential compatibility issues mentioned in the alike-since-1970 area.

On 9/20/21 5:34 AM, Almaz Mingaleev wrote:
Thanks for the patch, Paul. I've applied it, but it is not what zone.tab was in 2021a
Thanks for checking. Unfortunately I overlooked one of the important commits since 2021a. I fixed that and came up with the attached patch; please try that instead.
Is it safe to revert zone.tab to what it was in 2021a <https://github.com/eggert/tz/blob/3831c591e188edc16d1a6855fb20ebee78c4e27b/z...>
Almost, but that misses the post-2021a replacement of Enderbury with Kanton. Here are the changes from 2021a to the zone.tab that results from applying the attached patch to the current main branch of the development repository:
diff --git a/zone.tab b/zone.tab
index 1f0128f..3d4a7f7 100644
--- a/zone.tab
+++ b/zone.tab
@@ -3,7 +3,7 @@
 # This file is in the public domain, so clarified as of
 # 2009-05-17 by Arthur David Olson.
 #
-# From Paul Eggert (2018-06-27):
+# From Paul Eggert (2021-09-20):
 # This file is intended as a backward-compatibility aid for older programs.
 # New programs should use zone1970.tab. This file is like zone1970.tab (see
 # zone1970.tab's comments), but with the following additional restrictions:
@@ -16,6 +16,9 @@
 # clocks have agreed since 1970; this is a narrower definition than
 # that of zone1970.tab.
 #
+# Unlike zone1970.tab, a row's third column can be a Link from
+# 'backward' instead of a Zone,
+#
 # This table is intended as an aid for users, to help them select timezones
 # appropriate for their practical needs. It is not intended to take or
 # endorse any position on legal or territorial claims.
@@ -228,7 +231,7 @@ KE -0117+03649 Africa/Nairobi
 KG +4254+07436 Asia/Bishkek
 KH +1133+10455 Asia/Phnom_Penh
 KI +0125+17300 Pacific/Tarawa Gilbert Islands
-KI -0308-17105 Pacific/Enderbury Phoenix Islands
+KI -0247-17143 Pacific/Kanton Phoenix Islands
 KI +0152-15720 Pacific/Kiritimati Line Islands
 KM -1141+04316 Indian/Comoro
 KN +1718-06243 America/St_Kitts
@@ -391,7 +394,7 @@ TK -0922-17114 Pacific/Fakaofo
 TL -0833+12535 Asia/Dili
 TM +3757+05823 Asia/Ashgabat
 TN +3648+01011 Africa/Tunis
-TO -2110-17510 Pacific/Tongatapu
+TO -210800-1751200 Pacific/Tongatapu
 TR +4101+02858 Europe/Istanbul
 TT +1039-06131 America/Port_of_Spain
 TV -0831+17913 Pacific/Funafuti
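The Tongatapu row in the diff above changes only the precision of the coordinates column, from the ±DDMM±DDDMM form to the ±DDMMSS±DDDMMSS form; zone.tab's header comments describe both as ISO 6709 sign-degrees-minutes-seconds formats. The following is a rough sketch of how a consumer might decode either form into decimal degrees. The function name parse_coords is made up for this illustration.

```python
import re

# Latitude: sign, 2-digit degrees, minutes, optional seconds.
# Longitude: sign, 3-digit degrees, minutes, optional seconds.
_COORD = re.compile(
    r"([+-])(\d{2})(\d{2})(\d{2})?([+-])(\d{3})(\d{2})(\d{2})?"
)


def parse_coords(field):
    """Convert a zone.tab coordinates field to (latitude, longitude) in degrees."""
    m = _COORD.fullmatch(field)
    if m is None:
        raise ValueError(f"unrecognized coordinates: {field!r}")
    lat_sign, lat_d, lat_m, lat_s, lon_sign, lon_d, lon_m, lon_s = m.groups()

    def to_degrees(sign, d, mins, secs):
        value = int(d) + int(mins) / 60 + int(secs or 0) / 3600
        return -value if sign == "-" else value

    return (to_degrees(lat_sign, lat_d, lat_m, lat_s),
            to_degrees(lon_sign, lon_d, lon_m, lon_s))


old_lat, old_lon = parse_coords("-2110-17510")      # 2021a Tongatapu row
new_lat, new_lon = parse_coords("-210800-1751200")  # row after the patch
```

A parser that accepts only the shorter form would reject the updated row, so code reading zone.tab should be prepared for both.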

Thanks, Paul. It works fine.
On Mon, 20 Sept 2021 at 15:50, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 9/20/21 5:34 AM, Almaz Mingaleev wrote:
Thanks for the patch, Paul. I've applied it, but it is not what zone.tab was in 2021a
Thanks for checking. Unfortunately I overlooked one of the important commits since 2021a. I fixed that and came up with the attached patch; please try that instead.
Is it safe to revert zone.tab to what it was in 2021a <https://github.com/eggert/tz/blob/3831c591e188edc16d1a6855fb20ebee78c4e27b/z...>
Almost, but that misses the post-2021a replacement of Enderbury with Kanton. Here are the changes from 2021a to the zone.tab that results from applying the attached patch to the current main branch of the development repository:
diff --git a/zone.tab b/zone.tab
index 1f0128f..3d4a7f7 100644
--- a/zone.tab
+++ b/zone.tab
@@ -3,7 +3,7 @@
 # This file is in the public domain, so clarified as of
 # 2009-05-17 by Arthur David Olson.
 #
-# From Paul Eggert (2018-06-27):
+# From Paul Eggert (2021-09-20):
 # This file is intended as a backward-compatibility aid for older programs.
 # New programs should use zone1970.tab. This file is like zone1970.tab (see
 # zone1970.tab's comments), but with the following additional restrictions:
@@ -16,6 +16,9 @@
 # clocks have agreed since 1970; this is a narrower definition than
 # that of zone1970.tab.
 #
+# Unlike zone1970.tab, a row's third column can be a Link from
+# 'backward' instead of a Zone,
+#
 # This table is intended as an aid for users, to help them select timezones
 # appropriate for their practical needs. It is not intended to take or
 # endorse any position on legal or territorial claims.
@@ -228,7 +231,7 @@ KE -0117+03649 Africa/Nairobi
 KG +4254+07436 Asia/Bishkek
 KH +1133+10455 Asia/Phnom_Penh
 KI +0125+17300 Pacific/Tarawa Gilbert Islands
-KI -0308-17105 Pacific/Enderbury Phoenix Islands
+KI -0247-17143 Pacific/Kanton Phoenix Islands
 KI +0152-15720 Pacific/Kiritimati Line Islands
 KM -1141+04316 Indian/Comoro
 KN +1718-06243 America/St_Kitts
@@ -391,7 +394,7 @@ TK -0922-17114 Pacific/Fakaofo
 TL -0833+12535 Asia/Dili
 TM +3757+05823 Asia/Ashgabat
 TN +3648+01011 Africa/Tunis
-TO -2110-17510 Pacific/Tongatapu
+TO -210800-1751200 Pacific/Tongatapu
 TR +4101+02858 Europe/Istanbul
 TT +1039-06131 America/Port_of_Spain
 TV -0831+17913 Pacific/Funafuti

On 9/20/21 8:15 AM, Almaz Mingaleev wrote:
Thanks, Paul. It works fine.
Thanks for checking. As the "Revert May patch to zone.tab" patch fixes your problems, and there doesn't seem to be any objection to it and I doubt there would be any, I installed it into the development database on GitHub.

Follow-up question: if the commit won't be reverted, should the backward file's header be updated? It now says "This file provides links between current names for timezones and their old names". I don't think "Europe/Berlin" is a new name for "Europe/Oslo" (and I believe the same applies to most of the changes from that commit).
On Mon, 20 Sept 2021 at 17:32, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 9/20/21 8:15 AM, Almaz Mingaleev wrote:
Thanks, Paul. It works fine.
Thanks for checking. As the "Revert May patch to zone.tab" patch fixes your problems, and there doesn't seem to be any objection to it and I doubt there would be any, I installed it into the development database on GitHub.
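Since the reverted zone.tab may again name IDs that are Links in 'backward' rather than Zones, a consumer that wants the underlying Zone has to follow those Links. Below is a minimal sketch, assuming the tzdb source syntax "Link TARGET LINK-NAME" with '#' starting a comment. The two sample Link lines are illustrative only, built from names mentioned in this thread, and are not copied from any particular release; parse_links and resolve are hypothetical helper names.

```python
def parse_links(text):
    """Return {link_name: target} for every Link line in tzdb source text."""
    links = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 3 and fields[0] == "Link":
            target, link_name = fields[1], fields[2]
            links[link_name] = target
    return links


def resolve(name, links):
    """Follow Links from name until reaching an ID that is not itself a Link."""
    seen = set()
    while name in links:
        if name in seen:
            raise ValueError(f"Link cycle involving {name!r}")
        seen.add(name)
        name = links[name]
    return name


# Illustrative sample only (assumed content, see the note above).
SAMPLE_BACKWARD = (
    "# Link  TARGET          LINK-NAME\n"
    "Link    Europe/Berlin   Europe/Oslo\n"
    "Link    Pacific/Kanton  Pacific/Enderbury  # renamed after 2021a\n"
)
links = parse_links(SAMPLE_BACKWARD)
oslo_zone = resolve("Europe/Oslo", links)
berlin_zone = resolve("Europe/Berlin", links)
```

Note that resolving a Link this way yields the Zone whose data is used, not a statement that one name replaces the other, which is the distinction the header question above is about.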

On 15 June 2021 15:56:45 BST, Stephen Colebourne via tz <tz@iana.org> wrote:
We seem to be at an impasse.
I don't think there is any support from the mailing list for the merging patch to remain in the repo. You've had many requests to revert it, and no requests to retain it.
There are technical solutions available to reduce the amount of data published to downstream users, but the starting point must be a fully populated database, not one that is logically broken. The next action must be to revert. Then we can agree on any technical measures necessary.
Hear hear.
cheers, Derick

On 18:44 Tue 08 Jun , Stephen Colebourne via tz wrote:
On Tue, 8 Jun 2021 at 17:23, Paul Eggert via tz <tz@iana.org> wrote:
On 6/8/21 1:56 AM, Stephen Colebourne via tz wrote:
It is possible to automatically derive the data for group 1 from the data for group 2. The other way around is not possible.
That's OK, as I'm not proposing the other way around. I'm merely proposing a flag to generate either group-1-style or group-2-style data from what we have now. This should be doable relatively easily. (And if I'm wrong, we can revisit this technical issue of course.)
Are you proposing a makefile option to recreate the original source files using the data in backzone? That seems much harder work than doing it the other way around and prone to error. (Remember that downstream projects need the correct source files, not any other output file)
One key advantage of the `idmerge` file is that it makes the merge process obvious. ie. that TZDB favours Berlin over Stockholm/Oslo. I'm proposing that the historic data is present in its original location (africa to southamerica). And I propose that the flag and `idmerge` provide the necessary tools to reduce the file size for those that need it.
+1
The importance of the full dataset being the default can be seen here for example:
https://www.oracle.com/java/technologies/javase/tzupdater-readme.html
As can be seen, the docs expect users to download the latest tarball. It isn't reasonable for TZDB to demand all downstream users change their URL. Another variant of the same tool:
https://docs.azul.com/core/timezone-updater
I work on OpenJDK for Red Hat, as one of three OpenJDK 8u upstream maintainers [0] and also packaging various versions for Fedora, CentOS and RHEL. We have similar tooling (based on what is in OpenJDK itself) in Fedora, CentOS & RHEL to produce the Java data files from the TZDB sources. This performs the same purpose as these tools; it allows updated Java data files to be produced for the latest tzdb update, without the JDK itself having to be updated. I agree with Stephen that expecting every downstream usage of this data to be updated is unreasonable, when it can just be fixed at source. I don't see a strong case for the new structure to be the default and to expect every downstream user to alter their code. That in itself implies an amount of risk, as much downstream use is in very stable codebases.
Here is another example, where a downstream system has had to adapt to retain compatibility:
http://hg.openjdk.java.net/jdk8u/jdk8u/jdk/file/801874e394a7/make/gendata/Ge...
https://github.com/openjdk/jdk/blob/master/make/data/tzdata/jdk11_backward
I think the better link may be:
https://hg.openjdk.java.net/jdk8u/jdk8u/jdk/diff/7dcb74c3ffba/makefiles/Gend...
which shows the change, and
https://hg.openjdk.java.net/jdk8u/jdk8u/jdk/rev/7dcb74c3ffba/makefiles/Genda...
for the entire changeset.
We would have to bring any future tzdata update into the 8u repositories, which are stable repositories designed for security and bug fixes. We don't want to be doing major reworks of TZDB data handling there. On that note, 8u still uses rearguard format, because we decided the risk to switch to vanguard was not worth it. [1] [2]
I know CLDR has previously expressed significant concern about the compatibility issue too.
Stephen
[0] https://wiki.openjdk.java.net/display/jdk8u/Main
[1] https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-June/011931.html
[2] https://mail.openjdk.java.net/pipermail/jdk8u-dev/2020-October/012890.html
Thanks,
-- Andrew :)
Pronouns: he / him or they / them
Senior Free Java Software Engineer
OpenJDK Package Owner
Red Hat, Inc. (http://www.redhat.com)
PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222

Hi Paul,
I welcome the direction that this is going in, but I've some observations:
On Fri, 4 Jun 2021, Paul Eggert via tz wrote:
There's been a clear need expressed to support tzdb users who would rather not deal with the effects of the recently-proposed alike-since-1970 patch. On the other hand there are also fairness and guideline-oriented reasons for the patch, which was originally discussed and installed with more than our usual care and review.
With respect to "fairness": although IMO it is sad to see that some tzids don't have pre-1970 data, I don't think you can call it "fair" to then remove/restrict/hide away this data for tzids which have this data. "If I can't have it, you can't have it" is fairly infantile, IMO.
From what I remember, the policy/guideline has always just been "don't split up zones for only pre-1970 data", without any mentions of hiding the data that we already have behind a flag.
Because of the extensive followup discussions I don't see how a single version could be appealing to both sides of this disagreement.
So I propose we add a Makefile or similar build-time option to let tzdb users have it either way. Set the flag one way, and it will be as if the recently-proposed changes did not occur. Set it the other way, and you'll get the changes.
Who was actively asking for the data to be restricted? I might have missed it when going through the whole thread, but I don't think there was anybody (besides yourself) actively asking for this to happen. If there was nobody actively asking for the pre-1970 data to be hidden for some zones, does it really make sense to add such a new flag, especially considering it will add more work for the TZ Coordinator?
If current users don't care about pre-1970 data, they can already invoke zic with "zic -r @0" as is documented in zic(8).
If there is support for this idea, I expect to be able to implement this option soon, in plenty of time before any urgent change due to a real-world rule change arrives.
As Stephen said, I would prefer the current state to be reversed first, and then we can discuss what we want in the other thread that Stephen started.
cheers, Derick
-- PHP 7.4 Release Manager
Host of PHP Internals News: https://phpinternals.news
Like Xdebug? Consider supporting me: https://xdebug.org/support
https://derickrethans.nl | https://xdebug.org | https://dram.io
twitter: @derickr and @xdebug

On 6/7/21 1:05 AM, Derick Rethans wrote:
Although IMO it is sad to see that some tzids don't have pre-1970 data, I don't think you can call it "fair" to then remove/restrict/hide away this data for tzids which have this data.
The patch doesn't remove or hide or restrict data. It merely changes the category that some data are in. That being said, there is a "fairness" question related to the category choice, and I'll discuss that below.
From what I remember, the policy/guideline has always just been "don't split up zones for only pre-1970 data"
Although "don't split up zones for only pre-1970 data" is certainly part of the guidelines, it's not the whole thing. We have merged many zones in the past, and these merges were following the guidelines.
Who was actively asking
This "fairness" question didn't come from a direct request by an end user to recategorize existing data. It came indirectly from a user request to add a zone for Kosovo, a request that devolved into a heated argument that other countries in similar situations have zones so Kosovo should have one too. Of course I could have responded that Kosovo isn't a "real" country because its ISO 3166 status is X but my experience is that such responses can raise hackles even further, and that these discussions are better served by appealing directly to timekeeping concerns. Given all the discussion in this thread I don't expect everyone to appreciate or agree with my experience, and to some extent I'm trying to answer your question rather than argue for the patch, as it looks like we'll be better off supporting both the keep-the-patch and the undo-the-patch sides of this disagreement.
If current users don't care about pre-1970 data, they can already invoke zic with "zic -r @0" as is documented in zic(8).
Yes, and better support for something along those lines has long been on my to-do list ("zic -r@0" is not enough because it results in duplicates). But that's a separate issue. I don't think anyone in the current thread has argued for removing all the pre-1970 data.
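To make the "duplicates" remark above concrete: once pre-1970 data is truncated, zones that differ only before 1970 can compile to identical output. One rough way to look for such cases is to group the files in a zic output directory by content hash, as in the sketch below. The directory layout and the find_duplicates helper are assumptions for illustration; the demo uses placeholder bytes in a temporary directory rather than real TZif files, and names that are hard links or symlinks to the same file would also be reported as identical.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under root whose bytes are identical; return groups of 2 or more."""
    root = Path(root)
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.relative_to(root).as_posix())
    return sorted(names for names in by_digest.values() if len(names) > 1)


# Demo with stand-in files; A, B and C are placeholder names, not real zones.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "Europe").mkdir()
    (base / "Europe" / "A").write_bytes(b"same-bytes")
    (base / "Europe" / "B").write_bytes(b"same-bytes")
    (base / "Europe" / "C").write_bytes(b"other-bytes")
    duplicate_groups = find_duplicates(base)
```

Running the same function over a directory produced with and without a range restriction would show which names collapse together only because of the truncation.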
participants (15)
- Almaz Mingaleev
- Alois Treindl
- Andrew Hughes
- David Patte
- Deborah Goldsmith
- Derick Rethans
- Jürgen Appel
- Mark Davis ☕️
- Michael H Deckers
- Murray S. Kucherawy
- Paul Eggert
- Paul Ganssle
- Paw Boel Nielsen
- Stephen Colebourne
- Tom Lane