On Thu, Nov 4, 2021 at 10:11 PM Philip Paeps <philip@trouble.is> wrote:
On 2021-11-05 12:17:34 (+0800), Brian Park via tz wrote:
I get the impression that this debate is caused by the existence of 2 different schools of thought: [...]
I want to suggest that it may be possible for these 2 views to coexist.
They de facto coexist right now. The overwhelming majority of the data are descriptive. Only recent efforts have made some of the post-1970 data appear more prescriptive.
They coexist in an ad hoc manner right now, and that seems to be one of the causes for the contention. I am suggesting that we formalize the separation, so that both groups are happier.
We could create a new file, e.g. call it 'countryzone', which contains a set of Links organized in a hierarchical tree by country, pointing to the Core zones.
I strongly believe we should continue to carefully avoid attempting to group data by country. [I would even avoid using the word "country" wherever possible.]
Can you explain why? Because it will cause arguments about disputed places? I think only a small minority of places around the world are disputed. By separating these ISO-country timezones into a 'countryzone' file, perhaps we can confine the debate into a smaller section of the TZDB. We could create duplicate entries (i.e. Country1/City, Country2/City), or create a pseudo-country called "Disputed" (i.e. Disputed/City). The point is, we can create policies that govern these disputed regions.

Could we move 'countryzone' into a separate project? Probably, but some amount of initial coordination and refactoring would be required to resolve conflicting zone identifiers.

Overall, I feel like the TZDB data should lean a bit more towards matching how end-users think about timezones in the real world (Prescriptive), and lean slightly less on treating timezones as a clustering problem (Descriptive). But I can see pros and cons of both approaches, which is why I am suggesting ways to make the 2 approaches interoperate better.
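
To make the 'countryzone' idea concrete, here is a minimal sketch of how a downstream library might consume such a file. It assumes the file reuses the existing "Link TARGET LINK-NAME" line syntax; the file contents, the France/Paris entry, and the helper names parse_links and resolve are hypothetical illustrations, not anything that exists in the TZDB today.

```python
# Hypothetical contents of a 'countryzone' file. The entries are
# illustrative only; no such file exists in the TZDB.
COUNTRYZONE = """\
# Link  TARGET            LINK-NAME
Link    America/Toronto   Canada/Toronto
Link    Europe/Paris      France/Paris
"""


def parse_links(text):
    """Return {link_name: target} from 'Link TARGET LINK-NAME' lines."""
    links = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "Link" and len(fields) >= 3:
            links[fields[2]] = fields[1]
    return links


def resolve(name, links):
    """Follow Links until a name that is not itself a Link is reached."""
    seen = set()
    while name in links:
        if name in seen:
            raise ValueError(f"Link cycle detected at {name!r}")
        seen.add(name)
        name = links[name]
    return name


links = parse_links(COUNTRYZONE)
print(resolve("Canada/Toronto", links))  # America/Toronto
```

Because the country-style names are plain Links layered on top of the Core zones, a library that does not want them can simply skip the file, and policies for disputed regions only touch this one layer.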
For the pre-1970 data, it is my understanding that the 'backzone' file contains Zone records which should replace ONLY the LinkMerged records found in the other files. I propose that all LinkMerged records be extracted into a separate file (let's call it 'mergedzone') so that there is a clear symmetry between 'backzone' and 'mergedzone', which allows them to be substituted for each other. The dependency diagram looks something like this:
As I've suggested before in another thread, I think we should consider undoing the split into backzone. I really liked Stephen's phrasing earlier in this thread: acceptably accurate, not outrageously wrong. We started moving data to backzone to limit the scope of 'active' maintenance to post-1970 data. That artificial split led us towards a more prescriptive worldview. It seems clear that prescriptive simply does not work for a real world with people on it.
I think Paul Eggert has made it clear that he does not want to maintain this data. My proposed refactoring of this info into the 'backzone' / 'mergedzone' pair makes it easy for downstream libraries to add back the 'backzone' data if they want. The 'make PACKRATDATA=backzone' hack does not help downstream libraries which do not use TZif or the Makefile.
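
The substitution I have in mind could look roughly like the sketch below for a library that parses the source files directly. It assumes the proposed symmetry holds, i.e. that 'mergedzone' and 'backzone' define exactly the same set of names; the zone names, the opaque rule strings, and the function build_zone_table are all hypothetical placeholders.

```python
# Hypothetical, already-parsed inputs. Zone rules are opaque strings here.
CORE = {"Region/CityA": "rules for CityA"}
MERGEDZONE = {"Region/CityB": "Region/CityA"}      # LinkMerged: name -> target
BACKZONE = {"Region/CityB": "pre-1970 rules for CityB"}  # full Zone records


def build_zone_table(core, mergedzone, backzone, use_backzone=False):
    """Combine the Core zones with exactly one of mergedzone or backzone."""
    # Assumed invariant from the proposal: both files cover the same names,
    # so either one can be dropped in without leaving gaps.
    if set(mergedzone) != set(backzone):
        raise ValueError("mergedzone and backzone must define the same names")

    table = {name: ("zone", rules) for name, rules in core.items()}
    if use_backzone:
        # Each merged Link is replaced by its full historical Zone record.
        for name, rules in backzone.items():
            table[name] = ("zone", rules)
    else:
        # Default: merged names stay as Links to Core zones.
        for name, target in mergedzone.items():
            table[name] = ("link", target)
    return table


default_table = build_zone_table(CORE, MERGEDZONE, BACKZONE)
full_table = build_zone_table(CORE, MERGEDZONE, BACKZONE, use_backzone=True)
```

The choice becomes a single flag in the downstream library, with no dependency on the Makefile or on TZif output.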
If there is any chance that this will result in being able to type "Canada/Toronto" instead of "America/Toronto", that would resolve an annoyance that has lasted some 30-35 years.
In this context, America refers to the landmass, not to the political entity occupying a large chunk of it. [Canada/Eastern etc moved to backward around 1993, as far as I can tell.]
Virtually no one in the world thinks of "America" as referring to all of "North America" and "South America".

Brian