On Sat, 2 Oct 2021 at 03:24, Paul Eggert via tz <tz@iana.org> wrote:
* An abstract region ID would naturally have no pre-1970 data, as it doesn't define data in a single city/location.
Sorry, I don't know what "naturally" refers to here. Aren't abstract IDs orthogonal to eliminating pre-1970 data? After all, we could introduce abstract IDs without eliminating pre-1970 data, and vice versa.
An abstract region represents a part of the Earth that has had the same clock rules since 1970. At some point prior to 1970 different locations within the abstract region had different clock rules, even if that was just LMT. There is "naturally" no pre-1970 data because the definition of the abstract region is all about post-1970 data, with pre-1970 data diverging. Imagine there was an abstract region called "Abstract/SameAsGmt", which is what Iceland and Ivory Coast follow. That region *as a whole* does not have *one* set of pre-1970 data, it has *many* sets. i.e. pre-1970 data only realistically works when viewed at a city level, not at an abstract region level.
3) "An event will happen at this time in the future" If an end-user wants to store an event in the future, e.g. a one hour meeting next month, the correct approach is to store the date, time and zone ID.
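A rough sketch of that grouping idea, for illustration only. The offset histories below are simplified, hand-written toy data (LMT offsets rounded, DST periods omitted), not read from TZDB, and all function names are hypothetical. Zones are grouped by their rules from 1970 onward, and the resulting group still carries more than one pre-1970 history.

```python
from collections import defaultdict

CUTOFF = 1970

# Toy data (assumption): (year the offset took effect, UTC offset in seconds),
# sorted by year. Simplified; real TZDB data has far more detail.
HISTORY = {
    "Atlantic/Reykjavik": [(1800, -5280), (1908, -3600), (1968, 0)],
    "Africa/Abidjan": [(1800, -968), (1912, 0)],
}


def offset_at(history, year):
    """Return the offset in force at the start of `year`."""
    current = history[0][1]
    for start, offset in history:
        if start <= year:
            current = offset
    return current


def post_cutoff_rules(history):
    """Offset in force at CUTOFF plus any later changes: the region's 'fingerprint'."""
    later = tuple((y, o) for y, o in history if y > CUTOFF)
    return (offset_at(history, CUTOFF), later)


def abstract_regions(histories):
    """Group zone IDs whose rules agree from CUTOFF onward."""
    regions = defaultdict(list)
    for zone_id, history in histories.items():
        regions[post_cutoff_rules(history)].append(zone_id)
    return list(regions.values())


def pre_cutoff_histories(histories, zone_ids):
    """Distinct pre-CUTOFF histories found among the given zones."""
    return {tuple(e for e in histories[z] if e[0] < CUTOFF) for z in zone_ids}


regions = abstract_regions(HISTORY)
distinct_old = pre_cutoff_histories(HISTORY, regions[0])
```

Here both IDs land in a single region, yet `distinct_old` holds two different histories, which is the sense in which the region as a whole has no single set of pre-1970 data.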
That doesn't suffice, as it doesn't work if the Zone splits in the meantime.
Of course that is true. But this hasn't been a problem in practice for business applications AFAICT. The point above is documenting how things *are*. Business applications and calendaring systems really do store the local date/time and zone ID from TZDB for future events. This works precisely because TZDB provides enough IDs.

If TZDB had only ever offered an ID for each abstract region, then *some other* system/project would have needed to be invented to define a set of IDs suitable for business applications. And moreover, that other system/project would need to provide the mapping from their ID to an underlying abstract region. (This is kind of Russ' point - that the IDs needed for business applications could in theory be separated from the IDs that provide data for abstract regions.)

But TZDB doesn't just provide IDs for abstract regions, it has always provided more than that. And by providing separate IDs like Europe/Oslo and Atlantic/Reykjavik for many years, TZDB has provided the basic tools necessary for business applications to function in the way described above.

While I can understand the conceptual desire to have TZDB only provide abstract regions, it seems completely impractical to do so at this point given how embedded the data actually is across the industry. i.e. it is too late to get rid of IDs like Europe/Oslo and Atlantic/Reykjavik. But there may yet be other ways to allow TZDB to focus on the abstract regions (a discussion for a later thread). (Note that this comment only discusses the IDs, not the associated data.)

I hope this explains more clearly the "one ID per country" conclusion in the OP. TZDB *already* effectively provides one ID per country, and applications *already* rely on that fact to meet use case #3.
What is really needed IMO is an RFC/theory update that recognises the vital nature of IDs like Europe/Oslo and Atlantic/Reykjavik to business applications. (Again, the comment here is just about IDs, not the associated data.)
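To make use case #3 concrete, here is a minimal sketch of how an application might store a future event as local date/time plus zone ID and only compute the UTC instant when needed. The class and function names are hypothetical, and the two rule lookups are fixed-offset stand-ins for "TZDB as of today" and "TZDB after a hypothetical rule change"; a real application would more likely pass something like `zoneinfo.ZoneInfo` as the lookup.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Callable


@dataclass(frozen=True)
class StoredEvent:
    local_start: datetime  # naive wall-clock time, exactly as the user entered it
    zone_id: str           # TZDB ID, e.g. "Europe/Oslo"
    duration: timedelta


def resolve_start_utc(event: StoredEvent, load_zone: Callable[[str], tzinfo]) -> datetime:
    """Compute the UTC instant using whatever rules `load_zone` currently provides."""
    zone = load_zone(event.zone_id)
    return event.local_start.replace(tzinfo=zone).astimezone(timezone.utc)


# Stand-in rule sets (assumptions): summer offset today, and a hypothetical
# future in which the zone stays on +01:00 all year.
def rules_today(zone_id: str) -> tzinfo:
    return {"Europe/Oslo": timezone(timedelta(hours=2))}[zone_id]


def rules_after_change(zone_id: str) -> tzinfo:
    return {"Europe/Oslo": timezone(timedelta(hours=1))}[zone_id]


meeting = StoredEvent(datetime(2030, 6, 3, 10, 0), "Europe/Oslo", timedelta(hours=1))
utc_before = resolve_start_utc(meeting, rules_today)
utc_after = resolve_start_utc(meeting, rules_after_change)
```

The stored record never changes; only the resolved UTC instant moves when the rules for the ID are updated, so the meeting stays at 10:00 local time.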
4) "An event will happen at this time in the future relative to a shared/common definition"
A TV show might be defined to air at 8pm Mountain Time in one month's time. It will air at that time regardless of whether any of the states that are currently on Mountain Time change to Pacific Time.
I don't think we can realistically predict what will happen for future events of this sort, which means this sort of thing should not be a design constraint. tzdb needs flexibility, not a straitjacket, to deal with unknown future events.
Suppose, for example, that in mid-2022 the US east coast switches from -05 (-04 with DST) to -04 all year, something that's under serious consideration. A TV show that was formerly planned for 2022-12-01 19:00 -05 will likely be rescheduled for 2022-12-01 19:00 -04, i.e., still 7pm in New York, Philadelphia, Miami, etc. but with a different UTC. If such a show had been scheduled with TZ='EST5EDT' then that TZ setting would be wrong after the change; whereas if it had been scheduled with TZ='America/New_York' it would be OK.
Not sure if the point was missed here. My point is really just an observation of timezone reality IMO - that US Mountain Time (as defined by the US DOT) is not the same as Denver Time (the time experienced in Denver). It is a coincidence that the two happen to be the same, but they are not legally the same. (A legal contract could be written to refer to Denver Time or US Mountain Time, and if the two diverge, that distinction would matter.) IMO, it is important to ensure that IDs exist for such shared zones. But luckily enough we already have "US/Mountain" and "CET", so the ID part is already sorted (in the US and EU). Logically, the ID US/Mountain should provide no time zone data prior to the US DOT first defining the rule, but that subtlety probably isn't really necessary.
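The distinction can be sketched as follows. This is purely illustrative: the rule tables are hypothetical whole-hour winter offsets, the "Denver moves to Pacific Time" scenario is invented, and the sketch assumes the two IDs carry independent rules.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local: datetime, zone_id: str, rules: dict) -> datetime:
    """Resolve a naive local time using a table of zone ID -> UTC offset in hours."""
    offset = timezone(timedelta(hours=rules[zone_id]))
    return local.replace(tzinfo=offset).astimezone(timezone.utc)


# Hypothetical rule tables (assumptions, winter offsets only).
RULES_TODAY = {"US/Mountain": -7, "America/Denver": -7}
RULES_IF_DENVER_MOVES = {"US/Mountain": -7, "America/Denver": -8}

show_local = datetime(2030, 1, 15, 20, 0)  # "8pm Mountain Time"

# Today the two IDs coincide.
same_today = (
    to_utc(show_local, "US/Mountain", RULES_TODAY)
    == to_utc(show_local, "America/Denver", RULES_TODAY)
)

# After the hypothetical change they resolve to different instants.
mountain_utc = to_utc(show_local, "US/Mountain", RULES_IF_DENVER_MOVES)
denver_utc = to_utc(show_local, "America/Denver", RULES_IF_DENVER_MOVES)
```

An event keyed to the shared definition keeps its instant, while one keyed to the city follows the city, which is why the choice of ID matters for use case #4.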
* It is hard at present to identify IDs that are deprecated because of proper aliasing, such as spelling changes, vs those where the ID is a real location but its TZDB status has been reduced.
Good point, and it'd be good to document somehow why Link entries exist. Among other things, I suspect that there are more than just the two reasons that you mention.
Defining the set of IDs the project should provide (still ignoring pre-1970 issues) is IMO the next step in the process which would address the Link/backward point. I'll start a new thread for that soon, unless there is much more to say on this thread.

thanks
Stephen