Re: [tz] TZDB use cases

Oct. 2, 2021

On Sat, 2 Oct 2021 at 03:24, Paul Eggert via tz <tz@iana.org> wrote:
...
...
* An abstract region ID would naturally have no pre-1970 data, as it
doesn't define data in a single city/location.
Sorry, I don't know what "naturally" refers to here. Aren't abstract IDs
orthogonal to eliminating pre-1970 data? After all, we could introduce
abstract IDs without eliminating pre-1970 data, and vice versa.
An abstract region represents a part of the Earth that has had the
same clock rules since 1970. At some point prior to 1970 different
locations within the abstract region had different clock rules, even
if that was just LMT. There is "naturally" no pre-1970 data because
the definition of the abstract region is all about post-1970 data,
with pre-1970 data diverging.

Imagine there was an abstract region called "Abstract/SameAsGmt" which
is what Iceland and Ivory Coast follow. That region *as a whole* does
not have *one* set of pre-1970 data, it has *many* sets. That is, pre-1970
data only realistically works when viewed at a city level, not at an
abstract region level.
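
A minimal sketch of why that is, using LMT as the simplest kind of
pre-1970 divergence. The longitudes are rough illustrative values, not
TZDB data, and "Africa/Abidjan" is assumed here as the Ivory Coast ID;
the member table and helper names are hypothetical.

```python
from datetime import timedelta

# Approximate longitudes in degrees east (negative = west). Illustrative
# values only, not taken from TZDB.
LONGITUDE = {
    "Atlantic/Reykjavik": -21.9,
    "Africa/Abidjan": -4.0,
}

# The hypothetical abstract region from the text and two of its members.
ABSTRACT_MEMBERS = {
    "Abstract/SameAsGmt": ["Atlantic/Reykjavik", "Africa/Abidjan"],
}


def lmt_offset(longitude_deg):
    """Local mean time offset from UTC: 4 minutes per degree of longitude."""
    return timedelta(minutes=4 * longitude_deg)


def lmt_offsets(region):
    """Collect the distinct LMT offsets of a region's member locations."""
    return {lmt_offset(LONGITUDE[m]) for m in ABSTRACT_MEMBERS[region]}


offsets = lmt_offsets("Abstract/SameAsGmt")
# More than one distinct offset means the region as a whole has no single
# pre-standard-time history, only per-location ones.
print(len(offsets) > 1)
```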
...
...
3) "An event will happen at this time in the future"
If an end-user wants to store an event in the future, e.g. a one hour
meeting next month, the correct approach is to store the date, time
and zone ID
That doesn't suffice, as it doesn't work if the Zone splits in the meantime.
Of course that is true. But this hasn't been a problem in practice for
business applications AFAICT.

The point above is documenting how things *are*. Business applications
and calendaring systems really do store the local date/time and zone
ID from TZDB for future events. This works precisely because there are
enough IDs provided by TZDB to make this work. If TZDB had only ever
offered an ID for each abstract region, then *some other*
system/project would have needed to be invented to define a set of IDs
suitable for business applications. And moreover, that other
system/project would need to provide the mapping from their ID to an
underlying abstract region. (This is kind of Russ' point - that the
IDs needed for business applications could in theory be separated from
the IDs that provide data for abstract regions.)
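
As a rough sketch of that two-layer arrangement (not how TZDB or any
real library is structured): a stored event keeps the local date/time
and a location-style ID, and a separate table maps each ID to an
abstract region. Apart from "Abstract/SameAsGmt", the region names are
hypothetical, the offsets are fixed winter values with DST ignored, and
the "split" is invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class StoredEvent:
    local: datetime  # naive wall-clock time, as entered by the user
    zone_id: str     # location-style ID, e.g. "Europe/Oslo"


# Layer 1 (hypothetical): business-facing ID -> abstract region.
ID_TO_REGION = {
    "Europe/Oslo": "Abstract/CentralEurope",
    "Atlantic/Reykjavik": "Abstract/SameAsGmt",
}

# Layer 2 (hypothetical): abstract region -> UTC offset, DST ignored.
REGION_OFFSET = {
    "Abstract/CentralEurope": timedelta(hours=1),
    "Abstract/SameAsGmt": timedelta(0),
}


def resolve_utc(event, id_to_region, region_offset):
    """Resolve a stored event to UTC using whatever tables are current."""
    offset = region_offset[id_to_region[event.zone_id]]
    return (event.local - offset).replace(tzinfo=timezone.utc)


meeting = StoredEvent(datetime(2021, 11, 15, 10, 0), "Europe/Oslo")
before = resolve_utc(meeting, ID_TO_REGION, REGION_OFFSET)

# Invented split: Oslo leaves its old region. Only the tables change;
# the stored event is untouched and still means "10:00 in Oslo".
split_ids = {**ID_TO_REGION, "Europe/Oslo": "Abstract/Hypothetical"}
split_offsets = {**REGION_OFFSET, "Abstract/Hypothetical": timedelta(hours=2)}
after = resolve_utc(meeting, split_ids, split_offsets)

print(before.isoformat(), after.isoformat())
```

Had the event been stored against the abstract region ID instead, the
split would have left it pointing at the wrong rules.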

But TZDB doesn't just provide IDs for abstract regions, it has always
provided more than that. And by providing separate IDs like
Europe/Oslo and Atlantic/Reykjavik for many years, TZDB has provided
the basic tools necessary for business applications to function in the
way described above.

While I can understand the conceptual desire to have TZDB only provide
abstract regions, it seems completely impractical to do so at this
point given how embedded the data actually is across the industry, i.e.
it is too late to get rid of IDs like Europe/Oslo and
Atlantic/Reykjavik. But there may yet be other ways to allow TZDB to
focus on the abstract regions (a discussion for a later thread). (Note
that this comment only discusses the IDs, not the associated data).

I hope this explains more clearly the "one ID per country" conclusion
in the OP. TZDB *already* effectively provides one ID per country, and
applications *already* rely on that fact to meet use case #3. What is
really needed IMO is an RFC/theory update to recognise the vital
nature of IDs like Europe/Oslo and Atlantic/Reykjavik to business
applications. (Again, the comment here is just about IDs, not the
associated data.)
...
...
4) "An event will happen at this time in the future relative to a
shared/common definition"
...
A TV show might be defined to air at 8pm Mountain Time in one month's
time. It will air at that time regardless of whether any of the states
that are currently on Mountain Time change to Pacific Time.
I don't think we can realistically predict what will happen for future
events of this sort, which means this sort of thing should not be a
design constraint. tzdb needs flexibility, not a straightjacket, to deal
with unknown future events.
Suppose, for example, that in mid-2022 the US east coast switches from
-05 (-04 with DST) to -04 all year, something that's under serious
consideration. A TV show that was formerly planned for 2022-12-01 19:00
-05 will likely be rescheduled for 2022-12-01 19:00 -04, i.e., still 7pm
in New York, Philadelphia, Miami, etc. but with a different UTC. If such
a show had been scheduled with TZ='EST5EDT' then that TZ setting would
be wrong after the change; whereas if it had been scheduled with
TZ='America/New_York' it would be OK.
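
The arithmetic in that example can be checked with fixed offsets alone.
This is only a sketch of the two readings of "19:00 on 2022-12-01"; it
uses no TZDB data, and the variable names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# The show as originally planned: 19:00 at UTC-05.
planned = datetime(2022, 12, 1, 19, 0, tzinfo=timezone(timedelta(hours=-5)))

# The show after the supposed change: still 19:00 local, now at UTC-04.
rescheduled = datetime(2022, 12, 1, 19, 0, tzinfo=timezone(timedelta(hours=-4)))

planned_utc = planned.astimezone(timezone.utc)
rescheduled_utc = rescheduled.astimezone(timezone.utc)

# Same wall-clock time, different instants.
gap = planned_utc - rescheduled_utc
print(planned_utc.isoformat(), rescheduled_utc.isoformat(), gap)
```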
Not sure if the point was missed here. My point is really just an
observation of timezone reality IMO - that US Mountain Time (as
defined by the US DOT) is not the same as Denver Time (the time
experienced in Denver). It is a coincidence that the two happen to be
the same, but they are not legally the same. (A legal contract could
be written to refer to Denver Time or US Mountain Time, and if the two
diverge, that distinction would matter.)
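
A small sketch of that distinction, assuming two independent rule
entries keyed by "US/Mountain" and "America/Denver". The rule tables
are hypothetical: in TZDB as published, US/Mountain is a Link to
America/Denver, so the two cannot diverge there without a data change.
Offsets are winter values with DST ignored.

```python
from datetime import datetime, timedelta, timezone

# Today both definitions give the same offset (hypothetical table).
RULES_TODAY = {
    "US/Mountain": timedelta(hours=-7),
    "America/Denver": timedelta(hours=-7),
}

# Hypothetical future: Denver moves to Pacific Time, while the shared
# Mountain Time definition stays as it is.
RULES_IF_DENVER_MOVES = {**RULES_TODAY, "America/Denver": timedelta(hours=-8)}


def to_utc(local, zone_id, rules):
    """Interpret a naive local time under the given rule table."""
    return local.replace(tzinfo=timezone(rules[zone_id])).astimezone(timezone.utc)


air_time = datetime(2021, 11, 15, 20, 0)  # "8pm", zone supplied separately

same_today = (to_utc(air_time, "US/Mountain", RULES_TODAY)
              == to_utc(air_time, "America/Denver", RULES_TODAY))
mountain_later = to_utc(air_time, "US/Mountain", RULES_IF_DENVER_MOVES)
denver_later = to_utc(air_time, "America/Denver", RULES_IF_DENVER_MOVES)

print(same_today, mountain_later.isoformat(), denver_later.isoformat())
```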

IMO, it is important to ensure that IDs exist for such shared zones.
But luckily enough we already have "US/Mountain" and "CET", so the ID
part is already sorted (in the US and EU). Logically, the ID
US/Mountain should provide no time zone data prior to the US DOT first
defining the rule, but that subtlety probably isn't really necessary.
...
...
* It is hard at present to identify IDs that are deprecated because of
proper aliasing, such as spelling changes, vs those where the ID is a
real location but its TZDB status has been reduced.
Good point, and it'd be good to document somehow why Link entries exist.
Among other things, I suspect that there are more than just the two
reasons that you mention.
Defining the set of IDs the project should provide (still ignoring
pre-1970 issues) is IMO the next step in the process which would
address the Link/backward point. I'll start a new thread for that
soon, unless there is much more to say on this thread.

thanks
Stephen

Stephen Colebourne