The Unicode ICU team discussed the proposed changes in the TZDB in their meeting earlier this week and we are reporting the consensus here. This is an initial report, since time is short.
Members are very concerned about the downstream impact, and the inevitable compatibility mismatches between different implementations. While the pre-1970 data may not seem important to some people, the instability caused by its removal can be considerable, and can last for years to come. Even if the TZDB provides an option to produce data compatible with 2021a or before, this may introduce confusion. For example, an OS packager may pick a default data package with pre-1970 rules merged, while a library packager like ICU may pick a variant with pre-1970 data preserved. Previously, multiple implementations used a single data set, so there was general consistency. With the proposed plan, different implementations could give different results for dates before 1970, causing problems everywhere, e.g. between Linux and Java, or between ICU and Linux.
If the change is made, here are the probable steps that would happen in ICU, based on the two areas that would be affected.
1. Dropping zone IDs from the zone.tab.
The main impact here is that a lot of implementations rely on the mapping of zone IDs to ISO country codes. ICU already has an internal exception table that contains certain (zone ID, ISO code) mappings that retain information that used to be in zone.tab. We would extend that table to add all of the zones dropped by the proposed change. We would probably also move the data and the rest of zone.tab to CLDR, so that we have a public, structured set of data in XML and JSON. This would effectively clone the zone.tab data.
That way, implementations could use the zone.tab information to maintain the difference between Europe/Oslo and Europe/Berlin. That is, while the internal software might map Europe/Oslo to Europe/Berlin via a Link to get rules for evaluation, the library would still treat Europe/Oslo as a separate ID from Europe/Berlin.
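As a rough illustration of the separation described above, here is a minimal Python sketch. It is not ICU code: the table names (LINKS, ZONE_TO_COUNTRY) and the functions (rules_zone, describe) are hypothetical, and the two entries are sample data only. It assumes a link table used solely to find rules, and a separate (zone ID, ISO code) table that keeps each ID's own identity.

```python
# Hypothetical sample data for illustration; not ICU's actual tables.

# Link table: an ID whose rules are taken from another zone.
LINKS = {"Europe/Oslo": "Europe/Berlin"}

# Exception table: (zone ID -> ISO 3166 country code) mappings that are
# kept even if the ID is no longer listed in zone.tab.
ZONE_TO_COUNTRY = {
    "Europe/Oslo": "NO",
    "Europe/Berlin": "DE",
}


def rules_zone(zone_id: str) -> str:
    """Follow links to find the zone whose rules would be evaluated."""
    seen = set()
    while zone_id in LINKS:
        if zone_id in seen:
            raise ValueError(f"link cycle at {zone_id}")
        seen.add(zone_id)
        zone_id = LINKS[zone_id]
    return zone_id


def describe(zone_id: str) -> dict:
    """Report the ID as requested, its country, and where its rules come from."""
    return {
        "id": zone_id,  # the requested ID is preserved, not replaced
        "country": ZONE_TO_COUNTRY.get(zone_id),
        "rules_from": rules_zone(zone_id),
    }
```

In this sketch, describe("Europe/Oslo") keeps "Europe/Oslo" as the ID and "NO" as the country, while the rules are looked up under "Europe/Berlin".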
2. Removing the pre-1970 rules
Or rather, moving the pre-1970 data into a file that is mixed in with other data that is not currently used. ICU doesn't want to get into the business of maintaining a fork of the TZDB, but if another major industry player took that role on, then ICU would consider adopting it so that the data is maintained.

Mark Davis, Unicode Consortium President and Chair of CLDR
Yoshito Umaoka, Vice Chair of ICU