Following on from the previous thread [1] I wanted to try and classify the IDs we have, which may or may not identify missing IDs. Again, please avoid talking about pre-1970 data at this point.

Obsolete
------------
IDs that are obsolete and should never be used. They date from many years ago when tzdb was just starting. Yet these still do appear in downstream UIs even today (of course UIs should not use the tzdb ID list, but in reality lots do).

Examples: Portugal, NZ-CHAT, Navajo, Libya.

Proposal: Provide 3-6 months notice, then move obsolete IDs to a new file "obsolete" which downstream projects are strongly encouraged not to include. (I would argue that the time has come to properly remove these IDs, which are very inconsistent in terms of which are provided and which not, eg Portugal, but not Spain)

Deprecated, same location
------------------------------------
IDs that have been deprecated with a single clear alternative ID being provided. Both IDs represent the same physical location/city.

Spelling changes: Asia/Katmandu (replaced by Asia/Kathmandu), Asia/Rangoon (replaced by Asia/Yangon)

ID structure changes: America/Louisville (replaced by America/Kentucky/Louisville)

Proposal: Ensure all of these are in `backward`

Consider: Is there any way to move these IDs to the obsolete file? Maybe after 5 years? Or do we just accept backwards compatibility restrictions on these?

Legally described mega-zones
-----------------------------------------
IDs for locations where a federal or supra-national body defines rules, eg the EU or US DOT.

Examples: US/Mountain, CET, WET

Consider: Can we write down a rule to identify when something like this should be included? Then move the matching IDs to the main files (eg. are the EU and US DOT the only two examples here?)

Regions
-----------
IDs for abstract regions that have had the same wall clock since 1970.

Examples: Europe/Berlin, America/New_York, Africa/Abidjan

Proposal: Ensure all of these are in the main files.

Consider: Should there be new IDs for each of these abstract regions to indicate they are a separate and distinct concept? eg. "Region/Berlin". (Maybe something to consider in future threads as it isn't clear what the benefit of doing so is without considering pre-1970 which I'm still trying to avoid)

Non-region locations
---------------------------
IDs for locations that are not region IDs. Each ID will have the same wall clock since 1970 as one of the region IDs.

Examples: Europe/Oslo, Europe/Amsterdam, Atlantic/Reykjavik

Consider: Can we write down a rule that covers which IDs are included here? And therefore when a new ID can be added to this set? If we can define a rule, then these can be split so rule-following IDs are in the main files and rule-breaking ones are in `backward` (although ideally they should be separate from the spelling changes).

Obviously, we can say these IDs only exist for backwards compatibility, but that seems like a weak justification, and doesn't tackle the issue of when a new ID would be added to the list (which has been a point of tension).

As is well known, I think the obvious rule is that the IDs follow the ISO-3166-1 standard (rule: one ID per ISO code, additional IDs may be added where clocks have diverged since 1970). Using ISO-3166 can be justified by IANA domain policy [2]:

"We are not in the business of deciding what is and what is not a country. Instead, we employ a neutral standard maintained by the ISO 3166 Maintenance Agency. Our policy is to create new country-code top-level domains when the country or territory is listed on the ISO 3166-1 standard."

As per the previous thread, these non-region location IDs are actively used in downstream business applications, and it is not OK that this only works because tzdb happens to have IDs for backwards compatibility. There needs to be a better justification than that - these non-region locations need to be fully supported, with a consistent rule used to define what is and is not fully supported.

Fixed/etc type rules
--------------------------
IDs with a fixed offset

Examples: GMT, UTC, Etc/GMT-9

Proposal: No change, retain in the main files unless a particular ID is considered obsolete or deprecated

Are there any more classifications I've missed?

Stephen

[1] https://mm.icann.org/pipermail/tz/2021-September/030857.html
[2] https://www.iana.org/help/eligible-tlds
On Wed, Oct 6, 2021 at 10:09 AM Stephen Colebourne via tz <tz@iana.org> wrote:
Following on from the previous thread [1] I wanted to try and classify the IDs we have, which may or may not identify missing IDs.
Again, please avoid talking about pre-1970 data at this point.
But you can't talk about why the zones that exist exist without talking about the compatibility concerns, which deal with pre-1970 data and its treatment and how that's changed.
Non-region locations --------------------------- IDs for locations that are not region IDs. Each ID will have the same wall clock since 1970 as one of the region IDs.
Examples: Europe/Oslo, Europe/Amsterdam, Atlantic/Reykjavik
Consider: Can we write down a rule that covers which IDs are included here? And therefore when a new ID can be added to this set? If we can define a rule, then these can be split so rule-following IDs are in the main files and rule-breaking ones are in `backward` (although ideally they should be separate from the spelling changes). Obviously, we can say these IDs only exist for backwards compatibility, but that seems like a weak justification, and doesn't tackle the issue of when a new ID would be added to the list (which has been a point of tension).
Backwards compatibility is a very strong justification when systems are using identifiers to store data. Consumers of tzdata need to understand what is and is not backwards compatible.
As is well known, I think the obvious rule is that the IDs follow the ISO-3166-1 standard (rule: one ID per ISO code, additional IDs may be added where clocks have diverged since 1970). Using ISO-3166 can be justified by IANA domain policy [2]:
You need to justify this rule with respect to timezone data. The space of concerns is entirely different. Time zones are far more aligned with commercial relationships across borders than they are with political boundaries: that's why Idaho, Indiana, and Australia look the way they do. A lot of timezone splits coincide with changes to political boundaries, making the utility of country by country zones dubious: South Sudan, East Timor, etc.
As per the previous thread, these non-region location IDs are actively used in downstream business applications, and it is not OK that this only works because tzdb happens to have IDs for backwards compatibility. There needs to be a better justification than that - these non-region locations need to be fully supported, with a consistent rule used to define what is and is not fully supported.
Do you have actual applications that broke, or would have broken, because these are not zones? The current situation does not have your proposal. So how can you justify your proposal based on applications that need them to work: how do those applications work today?

To use your previous example, if a contract specifies delivery in Frankfurt at 4 o'clock local time at some point in the future, why will that be the same as 4 o'clock in Berlin in the future? The right way to handle this representation is to introduce an extra step of indirection: from location to zone info. We don't have crystal balls, we can't predict what zones will be needed in the future.

Sincerely,
Watson Ladd

--
Astra mortemque praestare gradatim
On Thu, 7 Oct 2021 at 07:08, Watson Ladd via tz <tz@iana.org> wrote:
Backwards compatibility is a very strong justification when systems are using identifiers to store data. Consumers of tzdata need to understand what is and is not backwards compatible.
I agree with backwards compatibility. The primary concern here is whether an ID is considered deprecated or not.
Do you have actual applications that broke/ would have broken because these are not zones?
This looks like a misunderstanding. At no point in the OP did I talk about Zones vs Links. I only talked about IDs.

What I'm trying to do is:
- get agreement that tzdb has IDs beyond those needed for the abstract region system
- that those IDs are in widespread use
- that those IDs should be considered a fully supported aspect of tzdb because of their widespread use
- that there is a clear rule expressing which IDs are fully supported and which are deprecated.

I don't think the first two are seriously doubted, it is the latter two where the issues are.
You need to justify this rule with respect to timezone data. The space of concerns is entirely different. Time zones are far more aligned with commercial relationships across borders than they are with political boundaries: that's why Idaho, Indiana, and Australia look the way they do. A lot of timezone splits coincide with changes to political boundaries, making the utility of country by country zones dubious: South Sudan, East Timor, etc.
None of that is in dispute. IDs will continue to exist and be created driven by timezone needs, not by countries. The question is how do we justify the IDs that we *already have*, and that are in *widespread use*.

Think of it as overlapping concerns. There is a minimal set of IDs that are needed for timezone reasons (abstract regions like Europe/Berlin and Africa/Abidjan). And there is an overlapping set of IDs that have existed for a very long time, are very widely used, but are considered by some to be deprecated. A typical downstream user *cannot tell the difference* between the two types of ID - they both look and act exactly the same. What I'm trying to do is bring that downstream user point of view back to tzdb.

ISO countries happen to be the best fit to describe the set of IDs that I'm arguing should be fully supported. That is because it used to be the rule. The change to remove ISO countries happened as recently as February 2019:
https://github.com/eggert/tz/commit/6176aefe79e83ddb8f255849b85c149f34d46aba...
https://mm.icann.org/pipermail/tz/2019-February/027571.html

I believe the ISO rule should be reinstated (without threatening the primary rule that IDs are created based on timezone needs).

Stephen
Stephen Colebourne wrote in <CACzrW9Cau-q2e+yoBcg=7Rztu19zZN582Xca9ZMYbz=H_0iOpw@mail.gmail.com>:
|On Thu, 7 Oct 2021 at 07:08, Watson Ladd via tz <tz@iana.org> wrote:
...
|I agree with backwards compatibility. The primary concern here is
|whether an ID is considered deprecated or not.

IDs shall be stable and not change at all. At maximum they should be moved to backward, for example if a zone is renamed (which may happen for example for Ukraine in a not too distant future, if that is still necessary then, and it looks as if it would be). I would always have been all for making infinite stability of IDs a documented assertion. Even if not, your "six months" claim is nothing but an aggressive statement.

I think this entire thread is shadowboxing and noise for nothing. Looking at the interface of your Java framework I think getID() should simply return the correct ID, which may be a link if it is one, and just in case this is not what getID() already returns.

In my opinion there are only two problems with IANA TZ and how Paul Eggert manages it as a maintainer. That is the same Paul Eggert who has contributed to this software since 1995, an astonishing 26 years. Corroding the maintainership with a continuous stream of noise is disgusting.

The first is the combining of datasets into equal-post-1970 bundles, which has been going on for many years. But the data is there, it is in backzone, and everybody can easily install a complete TZ DB on one's own request. Yet no one did, even though many packagers are on this list. This is schizophrenic. But maybe it is only because black and yellow people do not matter as much as some blue-eyed white people from northern Europe, which does not truly come as a surprise given the audience who possibly reads this now, and given the fact that colonisation ended only about 55 years ago, and de facto was only turned from armed political to armed material oppression.

The second is that documentation did not keep up with the code improvements, as has been recently shown for tzselect(8) and the -t option. That tzselect(8) uses the administrative zone1970.tab and not the end-user-preferable zone.tab is a different thing.

If you want to enforce upon the maintainer of the TZ database that the pre-1970 data is joined back into the normal data, or that "backzone" is split into "backzone" and "unreliable", and/or that "backzone" is included by default, and/or that "ZFLAGS='-r @0'" is made default, then create a thread and try to gain enough hums, but please stop this subversion by spreading uncertainty, and that even upon topics which never were under discussion as far as I know, and I have also read this list for a decade now. It must anyway be said it was nicer when I only had the distribution and did not know about the list :)

Ciao from Germany,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
Stephen Colebourne via tz said:
What I'm trying to do is:
- get agreement that tzdb has IDs beyond those needed for the abstract region system
- that those IDs are in widespread use
- that those IDs should be considered a fully supported aspect of tzdb because of their widespread use
- that there is a clear rule expressing which IDs are fully supported and which are deprecated.
I don't see the word "deprecated" anywhere in the theory file. Who is saying that IDs should be deprecated and what do they mean by that term?
Think of it as overlapping concerns. There is a minimal set of IDs that are needed for timezone reasons (abstract regions like Europe/Berlin and Africa/Abidjan). And there is an overlapping set of IDs that have existed for a very long time, are very widely used, but are considered by some to be deprecated. A typical downstream user *cannot tell the difference* between the two types of ID - they both look and act exactly the same.
So long as they don't disappear, why does the downstream user care? So what you're saying is that we need to be clear when a name can disappear. Completely. From the database that the downstream user sees. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On Thu, 7 Oct 2021 at 16:29, Clive D.W. Feather <clive@davros.org> wrote:
I don't see the word "deprecated" anywhere in the theory file. Who is saying that IDs should be deprecated and what do they mean by that term?
When an ID is deprecated, the project is sending a message that users should not use that ID and should use something else instead. The concept of backwards compatibility is different - it simply keeps whatever IDs we have, without assigning any additional semantic meaning. This matters because recent events indicate that some IDs like Europe/Oslo or Atlantic/Reykjavik are viewed by some list members as being deprecated by tzdb. ie. that downstream users are only using tzdb correctly if they use the region IDs like Europe/Berlin or Africa/Abidjan. The distinction here may seem subtle at first glance, but it is very real. As an application developer I want to be relying on the fully supported public API of the underlying project, not the deprecated parts (even if there is no intention to remove them). I want to squash the negativity around IDs like Europe/Oslo or Atlantic/Reykjavik and have them fully embraced as part of the main supported API of tzdb. Stephen
Stephen Colebourne via tz said:
On Thu, 7 Oct 2021 at 16:29, Clive D.W. Feather <clive@davros.org> wrote:
I don't see the word "deprecated" anywhere in the theory file. Who is saying that IDs should be deprecated and what do they mean by that term?
When an ID is deprecated, the project is sending a message that users should not use that ID and should use something else instead.
You have completely missed my point. Where does "the project" talk about deprecating IDs at all? Not in the theory file. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On Fri, 8 Oct 2021 at 07:20, Clive D.W. Feather <clive@davros.org> wrote:
Stephen Colebourne via tz said:
On Thu, 7 Oct 2021 at 16:29, Clive D.W. Feather <clive@davros.org> wrote:
I don't see the word "deprecated" anywhere in the theory file. Who is saying that IDs should be deprecated and what do they mean by that term?
When an ID is deprecated, the project is sending a message that users should not use that ID and should use something else instead.
You have completely missed my point.
Where does "the project" talk about deprecating IDs at all? Not in the theory file.
There is a frequently expressed view that lots of IDs in tzdb only exist for backwards compatibility. It is pretty clear that means they are not considered to be part of the main API of the project. This has two effects - new IDs will not be added when they should be (eg when Kosovo gets an ISO code) and pre-1970 history gets merged (because some IDs are considered less important than others). The first step to resolving the issues tzdb has is to stop treating non-region location IDs as a historical mistake.

Stephen
Stephen Colebourne via tz said:
Where does "the project" talk about deprecating IDs at all? Not in the theory file.
There is a frequently expressed view that lots of IDs in tzdb only exist for backwards compatibility.
When ignoring pre-1970 data (as you asked to do earlier in this discussion), that's right. But that is not deprecating them. It is not saying that they will be removed at some point in the future.
It is pretty clear that means they are not considered to be part of the main API of the project.
When ignoring pre-1970 data. These are "if we were building this from scratch with the current rules, these wouldn't be needed" IDs. We only include them (when ignoring pre-1970 data) because of backwards compatibility. But I don't see anyone suggesting we get rid of them.
This has two effects - new IDs will not be added when they should be
No, you can't say "should". You're inventing a rule and then trying to use that to claim that others are breaking the rules. That's begging the question. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
Stephen Colebourne via tz <tz@iana.org> writes:
On Fri, 8 Oct 2021 at 07:20, Clive D.W. Feather <clive@davros.org> wrote:
Where does "the project" talk about deprecating IDs at all? Not in the theory file.
There is a frequently expressed view that lots of IDs in tzdb only exist for backwards compatibility.
Many things in computing exist for backwards compatibility but are not deprecated. Deprecated implies they will go away at some point. So far as I can tell, no such implication has ever been stated for, for example, link entries in `backward`.

--
Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>
Deprecated implies they will go away at some point
That is not necessarily the case. In Unicode we have deprecated certain characters, but we make it very clear that they will *never* be removed. Nor does it mean that implementations have to ignore them if they are passed to the implementation. What it does mean is that people are strongly discouraged from generating them in implementations. So they shouldn't be on keyboards, or in character pickers. Mark On Fri, Oct 8, 2021 at 9:19 AM Russ Allbery via tz <tz@iana.org> wrote:
Stephen Colebourne via tz <tz@iana.org> writes:
On Fri, 8 Oct 2021 at 07:20, Clive D.W. Feather <clive@davros.org> wrote:
Where does "the project" talk about deprecating IDs at all? Not in the theory file.
There is a frequently expressed view that lots of IDs in tzdb only exist for backwards compatibility.
Many things in computing exist for backwards compatibility but are not deprecated. Deprecated implies they will go away at some point. So far as I can tell, no such implication has ever been stated for, for example, link entries in `backward`.
-- Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>
Mark Davis ☕️ <mark@macchiato.com> writes:
Deprecated implies they will go away at some point
That is not necessarily the case. In Unicode we have deprecated certain characters, but we make it very clear that they will *never* be removed. Nor does it mean that implementations have to ignore them if they are passed to the implementation. What it does mean is that people are strongly discouraged from generating them in implementations. So they shouldn't be on keyboards, or in character pickers.
Perhaps the root of the problem, or at least a problem, is these varying definitions of deprecated. Stephen appears to have assumed that presence in the backward file implies a strong version of deprecated and he or someone else has encoded that assumption into Joda-Time, hence Joda-Time's "canonicalization" of links. But that belief appears to me to have been an assumption unsupported by any documentation or intent on the part of the tzdata maintainers, hence all the problems that approach has caused over the years.

I believe tzdata intends little more than "these two names point to the same data" and is not intending to imply that one of them should be avoided in new uses, let alone that either of them would ever go away. But that's just an assumption on my part. Spelling it out explicitly would be good.

--
Russ Allbery (eagle@eyrie.org) <https://www.eyrie.org/~eagle/>
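To make the two readings of a link concrete, here is a minimal sketch. It is not Joda-Time's code or any real library's API: the `Zone` type and both loader functions are hypothetical names invented for illustration, and the link table holds only the three renames quoted earlier in this thread. One loader treats a link as "these two names point to the same data" and keeps the caller's ID; the other rewrites the ID to the link target.

```python
from dataclasses import dataclass

# Old ID -> target ID, using only the renames mentioned in this thread.
LINKS = {
    "Asia/Katmandu": "Asia/Kathmandu",
    "Asia/Rangoon": "Asia/Yangon",
    "America/Louisville": "America/Kentucky/Louisville",
}


def resolve_data_id(zone_id, links=LINKS):
    """Follow links until reaching an ID that is not itself a link."""
    seen = set()
    while zone_id in links:
        if zone_id in seen:
            raise ValueError(f"link cycle at {zone_id}")
        seen.add(zone_id)
        zone_id = links[zone_id]
    return zone_id


@dataclass(frozen=True)
class Zone:
    id: str       # the ID reported back to the caller
    data_id: str  # the ID whose rules are actually loaded


def load_preserving(zone_id):
    """'Same data, two names': report the ID exactly as supplied."""
    return Zone(id=zone_id, data_id=resolve_data_id(zone_id))


def load_canonicalizing(zone_id):
    """'Strong deprecation': silently replace the ID with the link target."""
    target = resolve_data_id(zone_id)
    return Zone(id=target, data_id=target)


kept = load_preserving("Asia/Rangoon")
rewritten = load_canonicalizing("Asia/Rangoon")
```

Both loaders use the same rules; they differ only in which ID a round trip through storage would return, which is where the two definitions of "deprecated" become visible to downstream users.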
Stephen Colebourne via tz said:
Following on from the previous thread [1] I wanted to try and classify the IDs we have, which may or may not identify missing IDs.
Again, please avoid talking about pre-1970 data at this point.
Obsolete ------------ IDs that are obsolete and should never be used. They date from many years ago when tzdb was just starting. Yet these still do appear in downstream UIs even today (of course UIs should not use the tzdb ID list, but in reality lots do).
Examples: Portugal, NZ-CHAT, Navajo, Libya.
Proposal: Provide 3-6 months notice, then move obsolete IDs to a new file "obsolete" which downstream projects are strongly encouraged not to include. (I would argue that the time has come to properly remove these IDs, which are very inconsistent in terms of which are provided and which not, eg Portugal, but not Spain)
Counter-proposal. These should be treated as renamings. So Portugal -> Europe/Lisbon and treated like your next category. I presume that these are the same as some canonical zone since 1970. Pre-1970 data in these should be treated however we decide to treat pre-1970 data.
Deprecated, same location ------------------------------------ IDs that have been deprecated with a single clear alternative ID being provided. Both IDs represent the same physical location/city.
Spelling changes: Asia/Katmandu (replaced by Asia/Kathmandu), Asia/Rangoon (replaced by Asia/Yangon)
ID structure changes: America/Louisville (replaced by America/Kentucky/Louisville)
Proposal: Ensure all of these are in `backward` Consider: Is there any way to move these IDs to the obsolete file? Maybe after 5 years? Or do we just accept backwards compatibility restrictions on these?
Or make the information available (and possibly tools) to allow downstreams to decide their policy on these. For example, a file that said:

    Asia/Rangoon Asia/Yangon rename 2005-11-26

(or whatever the actual rename date was). The explicit "rename" there allows this file to show other things, such as merges of zones that only differ pre-1970:

    Europe/Oslo Europe/Berlin merge 2020-12-31

(or "merge-pre-1970").
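A minimal sketch of how a downstream might consume such a file, assuming a whitespace-separated layout of old ID, new ID, kind, and date. No such file exists in tzdb; the layout, the `AliasRecord` name, the `ids_to_hide` policy, and the five-year grace period are all illustrative assumptions, and the dates are the placeholder dates from the suggestion itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AliasRecord:
    old_id: str
    new_id: str
    kind: str    # "rename" or "merge" in this hypothetical format
    since: date


def parse_alias_file(text):
    """Parse lines of the form: OLD_ID NEW_ID KIND YYYY-MM-DD."""
    records = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        old_id, new_id, kind, since = line.split()
        records.append(AliasRecord(old_id, new_id, kind, date.fromisoformat(since)))
    return records


def ids_to_hide(records, today, rename_grace_years=5):
    """One possible downstream policy: stop offering renamed IDs in pickers
    after a grace period, but never hide IDs that were merely merged."""
    hidden = set()
    for rec in records:
        age_years = (today - rec.since).days / 365.25
        if rec.kind == "rename" and age_years >= rename_grace_years:
            hidden.add(rec.old_id)
    return hidden


SAMPLE = """\
# old-ID        new-ID         kind    date (placeholder dates)
Asia/Rangoon    Asia/Yangon    rename  2005-11-26
Europe/Oslo     Europe/Berlin  merge   2020-12-31
"""

records = parse_alias_file(SAMPLE)
hidden = ids_to_hide(records, today=date(2021, 10, 7))
```

The point of the explicit kind column is that the policy lives downstream: a different consumer could hide nothing at all, or treat merges and renames identically, without tzdb having to decide for them.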
Legally described mega-zones ----------------------------------------- IDs for locations where a federal or supra-national body defines rules, eg the EU or US DOT.
Examples: US/Mountain, CET, WET
Consider: Can we write down a rule to identify when something like this should be included? Then move the matching IDs to the main files (eg. are the EU and US DOT the only two examples here?)
The EU doesn't define "CET" or "WET", or even specify the names. The EU specifies constraints on the rule for the zones that cover the places that follow EU rules. So "CET" is not a zone; it's a collection of zones that have been in step since early 1983 or whatever later date they joined the collection. "WET", incidentally, starts from 1998-03-29. Nothing in EU law stops a country moving from "CET" to "WET" or "EET". And these names do not appear anywhere I can find in the legislation (it was "Member States belonging to the zero time zone and the other Member States" and later became "the Member States apart from Ireland and the United Kingdom, on the one hand, and Ireland and the United Kingdom, on the other", which is not the same division; this was probably when Portugal joined). So, since these don't describe "places keeping the same time since 1970", what exactly are they and why do we have them? (I suspect that US/Mountain has a similar problem in that not everywhere in Mountain time observes DST.)
Regions ----------- IDs for abstract regions that have had the same wall clock since 1970.
Examples: Europe/Berlin, America/New_York, Africa/Abidjan
Proposal: Ensure all of these are in the main files. Consider: Should there be new IDs for each of these abstract regions to indicate they are a separate and distinct concept? eg. "Region/Berlin". (Maybe something to consider in future threads as it isn't clear what the benefit of doing so is without considering pre-1970 which I'm still trying to avoid)
Non-region locations --------------------------- IDs for locations that are not region IDs. Each ID will have the same wall clock since 1970 as one of the region IDs.
Examples: Europe/Oslo, Europe/Amsterdam, Atlantic/Reykjavik
Consider: Can we write down a rule that covers which IDs are included here?
If I'm understanding correctly what Paul's been doing, these are "IDs that refer to regions that have the same time history since 1970 as another region but a different time history before that and are not the region that uses the ID that would be chosen using our standard conventions (basically 'largest town')". Or, put another way, partition the set of all zones into subsets, each of which have the same history since 1970. In each subset, one is what you've called an abstract region and the rest are non-region locations. The choice of the first is made based on our normal naming rules.
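The partitioning described above can be sketched as follows. This is an illustration only: the "histories" are opaque placeholder tokens rather than real tzdb transitions, the groupings follow the examples given in this thread (Europe/Oslo with Europe/Berlin, Atlantic/Reykjavik with Africa/Abidjan), and `NAMING_PRIORITY` is an invented stand-in for the project's naming conventions.

```python
from collections import defaultdict

# Placeholder fingerprints of each ID's post-1970 history. Two IDs with equal
# fingerprints are assumed to have kept identical clocks since 1970.
POST_1970_HISTORY = {
    "Europe/Berlin": ("history-A",),
    "Europe/Oslo": ("history-A",),
    "Africa/Abidjan": ("history-B",),
    "Atlantic/Reykjavik": ("history-B",),
    "America/New_York": ("history-C",),
}

# Invented stand-in for the naming rules: lower number wins within a subset.
NAMING_PRIORITY = {
    "Europe/Berlin": 0,
    "Europe/Oslo": 1,
    "Africa/Abidjan": 0,
    "Atlantic/Reykjavik": 1,
    "America/New_York": 0,
}


def partition_by_history(histories):
    """Group IDs into subsets that share the same post-1970 history."""
    groups = defaultdict(list)
    for zone_id, history in histories.items():
        groups[history].append(zone_id)
    return [sorted(ids) for ids in groups.values()]


def classify(histories, priority):
    """Map each chosen region ID to its subset's non-region location IDs."""
    result = {}
    for ids in partition_by_history(histories):
        region = min(ids, key=lambda zone_id: priority[zone_id])
        result[region] = [zone_id for zone_id in ids if zone_id != region]
    return result


classification = classify(POST_1970_HISTORY, NAMING_PRIORITY)
```

Under this reading, whether an ID is a "region" is not a property of the ID itself but of the tie-break within its subset, which is why the two kinds look identical to a downstream user.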
And therefore when a new ID can be added to this set? If we can define a rule, then these can be split so rule-following IDs are in the main files and rule-breaking ones are in `backward` (although ideally they should be separate from the spelling changes).
Hang on, why should we ever add a new ID at all? My view is that we should *not* be adding new IDs. So long as we're talking about a post-1970 database, that is. In other words, the rule is "they stay for backwards compatibility reasons and no other". For someone only building with 1970-onwards data, these would be equivalent to aliases, so are treated as equivalent to renames - see above.
Obviously, we can say these IDs only exist for backwards compatibility, but that seems like a weak justification,
Why? If we were starting a new TZDB from scratch, we could ignore it because there wouldn't be any backwards to be compatible with. But there is, so we need to think about it.
and doesn't tackle the issue of when a new ID would be added to the list (which has been a point of tension).
Why not "never"? Well, apart from following a bug fix.
As is well known, I think the obvious rule is that the IDs follow the ISO-3166-1 standard (rule: one ID per ISO code, additional IDs may be added where clocks have diverged since 1970). Using ISO-3166 can be justified by IANA domain policy [2]:
That's not a justification, since IANA were handing rights over these names to those ISO bodies. And IANA have long given up on that policy, which is why there are .gg and .scot.
As per the previous thread, these non-region location IDs are actively used in downstream business applications, and it is not OK that this only works because tzdb happens to have IDs for backwards compatibility. There needs to be a better justification than that
Sorry, but that is *exactly* the definition of "backwards compatibility". If someone starts a new application that only uses post-Paul-merges names because that's all they see, they will *not* be using these names nor care in the slightest about them. It's *ALL* about backwards compatibility.
Fixed/etc type rules -------------------------- IDs with a fixed offset
Examples: GMT, UTC, Etc/GMT-9
Proposal: No change, retain in the main files unless a particular ID is considered obsolete or deprecated
The easiest way to treat these is to deem that there are certain virtual places with their own time history (e.g. "international waters near the 30 degrees east meridian") which deserve their own zone on that basis (this one being Etc/GMT-1). But you've left out the mammoth in the room, which is pre-1970 data. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
On Thu, 7 Oct 2021 at 16:23, Clive D.W. Feather <clive@davros.org> wrote:
Proposal: Ensure all of these are in `backward` Consider: Is there any way to move these IDs to the obsolete file? Maybe after 5 years? Or do we just accept backwards compatibility restrictions on these?
Or make the information available (and possibly tools) to allow downstreams to decide their policy on these.
For example, a file that said:
Asia/Rangoon Asia/Yangon rename 2005-11-26
Seems like a Good Idea. This is another way to handle obsolete IDs:

    Europe/Lisbon Portugal rename 2005-11-26 obsolete
Legally described mega-zones ----------------------------------------- The EU doesn't define "CET" or "WET", or even specify the names. So, since these don't describe "places keeping the same time since 1970", what exactly are they and why do we have them?
If done properly, they would only exist from the date that the rule was first established. As per the first of my threads, they are needed because CET does not have the same semantic meaning as Europe/Berlin.
Regions Non-region locations --------------------------- IDs for locations that are not region IDs. Each ID will have the same wall clock since 1970 as one of the region IDs.
Examples: Europe/Oslo, Europe/Amsterdam, Atlantic/Reykjavik
Consider: Can we write down a rule that covers which IDs are included here?
Or, put another way, partition the set of all zones into subsets, each of which have the same history since 1970. In each subset, one is what you've called an abstract region and the rest are non-region locations. The choice of the first is made based on our normal naming rules.
Kind of. The difficulty is that the ID of the abstract region is the *same* as the ID of the non-region location. In a theoretically pure solution, these two ID schemes are completely separate. ie. you would have a Region/Berlin region ID that represents time across the whole region with Europe/Berlin and Europe/Oslo location IDs following along (potentially with pre-1970 history). However what we actually have is a single merged ID scheme, where IDs for abstract regions, and IDs of locations where time is the same since 1970 exist together and are indistinguishable. I actually think the tzdb approach is just fine, so long as all the IDs (region and non-region locations) are equally supported. ie. it is not OK for region IDs to get things that non-region location IDs don't get.
And therefore when a new ID can be added to this set? If we can define a rule, then these can be split so rule-following IDs are in the main files and rule-breaking ones are in `backward` (although ideally they should be separate from the spelling changes).
Hang on, why should we ever add a new ID at all? My view is that we should *not* be adding new IDs. So long as we're talking about a post-1970 database, that is. In other words, the rule is "they stay for backwards compatibility reasons and no other".
Imagine Shetland broke away from the UK. Imagine this was undisputed and it joined the UN with everyone around the world happy to recognise it, thus it gets its own ISO code. But as a self-governing country it chooses to continue to follow the same timezone as London. tzdb would then be in the position that it has an ID representing a location in every European country except Shetland. The only justification being presented for this is that Shetland wasn't around 10 years ago when tzdb did have a one location per ISO code policy, thus it can't benefit from backwards compatibility as a justification for inclusion. I contend that this is an untenable position for tzdb to place itself in. (Substitute Kosovo for Shetland, and you get a scenario that is likely to happen in the next few years).
And IANA have long given up on that policy, which is why there are .gg and .scot.
The IANA rule is still working just fine. GG is an ISO code. And .scot is a generic tld not a country-code one. See column 2: https://www.iana.org/domains/root/db Stephen
Stephen Colebourne via tz said:
Legally described mega-zones ----------------------------------------- The EU doesn't define "CET" or "WET", or even specify the names. So, since these don't describe "places keeping the same time since 1970", what exactly are they and why do we have them?
If done properly, they would only exist from the date that the rule was first established. As per the first of my threads, they are needed because CET does not have the same semantic meaning as Europe/Berlin.
And what is the semantic meaning of "CET"? As I said, I can't find any legal definition of it.
Hang on, why should we ever add a new ID at all? My view is that we should *not* be adding new IDs. So long as we're talking about a post-1970 database, that is. In other words, the rule is "they stay for backwards compatibility reasons and no other".
Imagine Shetland broke away from the UK. Imagine this was undisputed and it joined the UN with everyone around the world happy to recognise it, thus it gets its own ISO code. But as a self-governing country it chooses to continue to follow the same timezone as London.
tzdb would then be in the position that it has an ID representing a location in every European country except Shetland.
Not true. What about Sealand and Liberland, to name two others?
I contend that this is an untenable position for tzdb to place itself in.
I disagree.
And IANA have long given up on that policy, which is why there are .gg and .scot. The IANA rule is still working just fine. GG is an ISO code.
Not for the first 10 years that .gg, .je and .im existed. And what about .ac and .eu? IANA are *not* following ISO-3166.
And .scot is a generic tld not a country-code one.
It still got approved for Scotland, though.
--
Clive D.W. Feather          | If you lie to the compiler,
Email: clive@davros.org     | it will get its revenge.
Web: http://www.davros.org  |   - Henry Spencer
Mobile: +44 7973 377646
On 10/6/21 10:08 AM, Stephen Colebourne via tz wrote:
As far as the categorization of IDs goes, I think I'd categorize them somewhat differently:
* The names defined by the PRIMARY_YDATA files.
* The names defined by 'backward'. (Though this category now tends to blur into the previous one, as the use of 'backward' is more popular nowadays.)
* The names defined by 'etcetera'.
* The names defined by 'backzone' but not by the other files.
This categorization uses maintainer lingo. But it determines what names end users see so it's a valid categorization for end users too.
There's another way to categorize names, which might be better in the long run than what we have now. We could categorize them as follows:
* The names currently defined by 'etcetera'.
* Names for each set of clocks that are planned to agree in the future. This is useful for applications like planning calendars, setting thermostats, etc.
* Names for each set of clocks that have agreed since 1970 and are planned to agree in the future. (This category includes the previous category.) Current tzdb Zones approximate this set (though we still have 20-odd Zones too many).
* Backward-compatibility aliases for the above.
* Other names (outside the scope of tzdb, so 'backzone' stuff).
OK, getting back to your classification:
Examples: Portugal, NZ-CHAT, Navajo, Libya.
Proposal: Provide 3-6 months notice, then move obsolete IDs to a new file "obsolete" which downstream projects are strongly encouraged not to include. (I would argue that the time has come to properly remove these IDs, which are very inconsistent in terms of which are provided and which not, eg Portugal, but not Spain)
Inconsistent they definitely are. And it might make sense to remove old names that are rarely used, and that are so inconsistent that they cause more problems (via confusion) than they cure (by supporting old TZ settings). However, I would think we'd need more than a few months' notice.
Deprecated, same location ------------------------------------ IDs that have been deprecated with a single clear alternative ID being provided. Both IDs represent the same physical location/city.
Spelling changes: Asia/Katmandu (replaced by Asia/Kathmandu), Asia/Rangoon (replaced by Asia/Yangon)
ID structure changes: America/Louisville (replaced by America/Kentucky/Louisville)
Proposal: Ensure all of these are in `backward` Consider: Is there any way to move these IDs to the obsolete file? Maybe after 5 years? Or do we just accept backwards compatibility restrictions on these?
I'd say that there's less of an argument for removing these names, as the confusion is surely less.
Legally described mega-zones ----------------------------------------- IDs for locations where a federal or supra-national body defines rules, eg the EU or US DOT.
Examples: US/Mountain, CET, WET
Consider: Can we write down a rule to identify when something like this should be included? Then move the matching IDs to the main files (eg. are the EU and US DOT the only two examples here?)
Although there are some other examples like this in the world, I think we're better off not pursuing this as it would duplicate existing functionality (thus causing confusion), it'd be a pain to nail down exactly what a "mega-zone" would be (or not be), and it'd be more opportunity for real-world politics to strike.
Examples: Europe/Berlin, America/New_York, Africa/Abidjan Proposal: Ensure all of these are in the main files. Consider: Should there be new IDs for each of these abstract regions to indicate they are a separate and distinct concept? eg. "Region/Berlin".
I'd rather avoid having yet more names for the same thing. We already have so many aliases that there's some confusion.
IDs for locations that are not region IDs. Each ID will have the same wall clock since 1970 as one of the region IDs.
Examples: Europe/Oslo, Europe/Amsterdam, Atlantic/Reykjavik
Consider: Can we write down a rule that covers which IDs are included here?
Yes: a backward-compatibility rule. If we had the name in previous releases, we should keep the name. This rule is simple and clear, and helps avoid name proliferation.
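The backward-compatibility rule stated above ("if we had the name in previous releases, we should keep the name") amounts to a simple set check between releases. A minimal sketch, assuming each release's names are available as a plain collection of strings; `dropped_names` and `satisfies_backward_compat` are hypothetical helpers, not part of any tzdb tooling.

```python
from typing import Iterable, Set


def dropped_names(previous: Iterable[str], current: Iterable[str]) -> Set[str]:
    """Return the names present in the previous release but missing from the current one."""
    return set(previous) - set(current)


def satisfies_backward_compat(previous: Iterable[str], current: Iterable[str]) -> bool:
    """True when every previously published name is still available."""
    return not dropped_names(previous, current)
```

The rule says nothing about when a name may be added, only that none may disappear; adding names always passes this check.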
these non-region location IDs are actively used in downstream business applications
That's fine, as these business applications should continue to work because the old IDs will continue to be maintained. If a new ISO country code is established but no ID is created (because there's no timekeeping need for one), the applications can continue to use the same IDs they were using before.
On Fri, 8 Oct 2021 at 08:33, Paul Eggert via tz <tz@iana.org> wrote:
these non-region location IDs are actively used in downstream business applications
That's fine, as these business applications should continue to work because the old IDs will continue to be maintained. If a new ISO country code is established but no ID is created (because there's no timekeeping need for one), the applications can continue to use the same IDs they were using before.
Do you believe it is equitable/fair/just to discriminate against that country? Because if tzdb has at least one ID for a city within every other ISO country as per the previous rule, that most certainly is what you are doing. Stephen
Stephen Colebourne via tz said:
That's fine, as these business applications should continue to work because the old IDs will continue to be maintained. If a new ISO country code is established but no ID is created (because there's no timekeeping need for one), the applications can continue to use the same IDs they were using before.
Do you believe it is equitable/fair/just to discriminate against that country?
We're no more discriminating against that country than against Yorkshire or Cornwall.
Because if tzdb has at least one ID for a city within every other ISO country as per the previous rule,
But it doesn't. And there is no such rule at present.
--
Clive D.W. Feather          | If you lie to the compiler,
Email: clive@davros.org     | it will get its revenge.
Web: http://www.davros.org  |   - Henry Spencer
Mobile: +44 7973 377646
participants (7)
- Clive D.W. Feather
- Mark Davis ☕️
- Paul Eggert
- Russ Allbery
- Steffen Nurpmeso
- Stephen Colebourne
- Watson Ladd