On Wed, Oct 6, 2021 at 10:09 AM Stephen Colebourne via tz <tz@iana.org> wrote:
Following on from the previous thread [1] I wanted to try and classify the IDs we have, which may or may not identify missing IDs.
Again, please avoid talking about pre-1970 data at this point.
But you can't talk about why the zones that exist exist without talking about the compatibility concerns, which deal with pre-1970 data and its treatment and how that's changed.
Non-region locations
--------------------
IDs for locations that are not region IDs. Each ID will have the same wall clock since 1970 as one of the region IDs.
Examples: Europe/Oslo, Europe/Amsterdam, Atlantic/Reykjavik
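To make the "same wall clock since 1970" criterion concrete, here is a minimal sketch of how two zones could be compared. The function name `same_offsets` and the hourly sampling step are illustrative assumptions, not anything tzdb itself defines, and sampling can miss disagreements shorter than the step.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def same_offsets(a: tzinfo, b: tzinfo, start: datetime, end: datetime,
                 step: timedelta = timedelta(hours=1)) -> bool:
    """Return True if zones a and b show the same UTC offset at every
    sampled instant in [start, end). start and end must be timezone-aware.

    Hypothetical helper: this is a sampling check, so it is only an
    approximation of "identical wall clock".
    """
    # Walk the range in UTC so the step is a fixed amount of elapsed time.
    t = start.astimezone(timezone.utc)
    stop = end.astimezone(timezone.utc)
    while t < stop:
        # Equal offsets at the same instant mean equal wall-clock readings.
        if t.astimezone(a).utcoffset() != t.astimezone(b).utcoffset():
            return False
        t += step
    return True
```

With Python 3.9+ and tz data installed, this could be called with `zoneinfo.ZoneInfo("Europe/Oslo")` and `zoneinfo.ZoneInfo("Europe/Berlin")` over a range starting at `datetime(1970, 1, 1, tzinfo=timezone.utc)`. The result depends on the tzdata release and how it was packaged.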
Consider: Can we write down a rule that covers which IDs are included here? And therefore when a new ID can be added to this set? If we can define a rule, then these can be split so rule-following IDs are in the main files and rule-breaking ones are in `backward` (although ideally they should be separate from the spelling changes). Obviously, we can say these IDs only exist for backwards compatibility, but that seems like a weak justification, and doesn't tackle the issue of when a new ID would be added to the list (which has been a point of tension).
Backwards compatibility is a very strong justification when systems are using identifiers to store data. Consumers of tzdata need to understand what is and is not backwards compatible.
As is well known, I think the obvious rule is that the IDs follow the ISO-3166-1 standard (rule: one ID per ISO code, additional IDs may be added where clocks have diverged since 1970). Using ISO-3166 can be justified by IANA domain policy [2]:
You need to justify this rule with respect to timezone data. The space of concerns is entirely different. Time zones are far more aligned with commercial relationships across borders than they are with political boundaries: that's why Idaho, Indiana, and Australia look the way they do. A lot of timezone splits coincide with changes to political boundaries, making the utility of country by country zones dubious: South Sudan, East Timor, etc.
As per the previous thread, these non-region location IDs are actively used in downstream business applications, and it is not OK that this only works because tzdb happens to have IDs for backwards compatibility. There needs to be a better justification than that - these non-region locations need to be fully supported, with a consistent rule used to define what is and is not fully supported.
Do you have actual applications that broke, or would have broken, because these are not zones? The current situation does not have your proposal. So how can you justify your proposal based on applications that need them to work: how do those applications work today?

To use your previous example, if a contract specifies delivery in Frankfurt at the locally observed time of 4 at some time in the future, why will that be the same as 4 in Berlin in the future? The right way to handle this representation is to introduce an extra step of indirection: from location to zone info. We don't have crystal balls; we can't predict what zones will be needed in the future.

Sincerely,
Watson Ladd

--
Astra mortemque praestare gradatim