What data should TZDB offer?

TZDB has seen recent difficulties due to conflicting desires and expectations of the dataset. This is an attempt to capture some of these:

1) LMT

LMT is confusing for many downstream users because they don't understand the concept. Recent threads have noted queries from Postgres users, and I can attest to confusion in various Java libraries. In fact, earlier versions of Java removed the LMT concept. I think the time is right to properly consider an alternative to LMT. I believe we can define an offset for each region that the region has most typically been associated with post-1970. For example, Europe/Paris is most associated with +01:00 since 1970. This provides the "normal looking" offset that most users desire for the LMT period.

2) Negative DST

Negative DST in the source files continues to be a problem, but we should address other issues first.

3) Links

At present, it is not possible to identify which region names are deprecated (such as spelling changes) and which represent important data. Having such a distinction would allow permanent deprecations to be removed from some downstream systems, and would also allow downstream systems to provide functionality to normalize IDs from old to new in a correct and consistent way.

4) Pre-1970 history

There is general, though not complete, agreement that TZDB's main focus is on post-1970 data. However, I and others have an expectation that pre-1970 data is retained and not removed. I don't want to get into a position where pre-1970 history is basically completely unreliable relative to the name of the ID. Getting Germany's pre-1970 rules when you have an ID for Oslo or Sweden is not acceptable. (We already have some of that, but the recent proposal would have greatly increased the issue.) I feel that there is the potential to achieve an agreeable solution, but it does require acceptance that full pre-1970 history is needed for some places that have the same zone history post-1970.

5) Compatibility guarantee

Changes need to be made with more consideration of the implications of compatibility on downstream users that do not use the makefile.

Proposal
------------

That TZDB shall adopt the principle that the main geographic files (africa to southamerica) shall contain data with full history for locations where zone history has differed since 1970, subject to the minimum requirement that there is at least one full zone with history defined for each independent country as defined by ISO-3166-1. Dependent territories in ISO-3166-1 that are within 1/24th of the earth circumference of another dependent territory or parent country with the same sovereignty shall be combined if their post-1970 history is identical.

That TZDB shall replace LMT with the offset that best represents standard time for the location during the period 1970 to 2021.

That TZDB shall define a non-makefile mechanism, which may involve a new file, to identify permanently deprecated IDs, such as "Turkey" or "W-SU".

That TZDB shall offer a command line makefile flag that filters the data to reduce the binary output where data is the same post-1970. That consideration is given to whether this flag should erase pre-1970 history as part of its truncation process.

That these rules shall be encoded in the theory file along with an explicit statement of backwards compatibility.

-----

It is my belief that this proposal meets the issues expressed above while also respecting the concerns of fairness, guidelines and politics expressed by others. For example, TZDB would not include a full zone with history for Kosovo until ISO-3166-1 includes it. This provides a straightforward defence against the worst issues of politics.

The dependent territory rules are designed to allow locations that are close to each other in distance and sovereignty to be combined, such as Jersey and London. I have not analyzed how many zones of full history can be saved by this mechanism.

I acknowledge that the above is a significant change to TZDB, but it does more fully align TZDB with the Governmental authorities that actually define time zones. I also believe it more closely aligns TZDB with the expectations of downstream users.

Stephen
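As a rough illustration of the "1/24th of the earth circumference" test in the proposal above, the following sketch computes a great-circle distance with the haversine formula and compares it with one twenty-fourth of the circumference. The proposal does not say how distance should be measured, so the mean Earth radius of 6371 km, the approximate coordinates for St Helier and London, and the names `great_circle_km` and `may_combine` are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; the proposal names no value
LIMIT_KM = 2 * math.pi * EARTH_RADIUS_KM / 24  # 1/24th of the circumference

# Approximate coordinates (latitude, longitude) in degrees; illustrative only.
ST_HELIER = (49.18, -2.11)
LONDON = (51.51, -0.13)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def may_combine(place_a, place_b, same_post_1970_history):
    """Hypothetical reading of the rule: close enough AND identical since 1970."""
    close_enough = great_circle_km(*place_a, *place_b) <= LIMIT_KM
    return same_post_1970_history and close_enough


if __name__ == "__main__":
    print(round(LIMIT_KM), "km limit")
    print(round(great_circle_km(*ST_HELIER, *LONDON)), "km Jersey to London")
    print(may_combine(ST_HELIER, LONDON, True))
```

Under these assumptions the limit is a little under 1700 km, so Jersey and London fall well inside it. The sovereignty condition is not modelled here.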

Stephen Colebourne via tz <tz@iana.org> writes:
[ assorted proposals ]
One other issue that I think deserves more attention than it has gotten lately is that tzdb has become a de facto standard and users rely on its stability. I would like to see some sort of principle adopted that minimizes changes in historical data. In particular, I think it's past time to prohibit data changes adopted for essentially-administrative reasons (as opposed to new findings of historical fact). I'd put the recent reorganization under the heading of things that would be forbidden by this principle, and also the changes a few years ago that removed "made up" zone abbreviations. Whatever the justification for those abbreviations originally, some people had come to depend on them, and little was to be gained by removing them.

Looking at Stephen's list with this in mind, one thing I'd vote against is redefining the LMT offsets. Yeah, I suggested that myself a few days ago, but that was in the context of what seemed to be a fait accompli that would largely destroy tzdb's backward compatibility anyway. If we're going to reverse that choice and try to preserve the existing pre-1970 data, then preserving the existing LMT data goes along with that.

The idea of having at least one zone per ISO-3166-1 country does seem like a good one, though. Aside from possibly deflecting politically-based complaints, this seems basically like future proofing: even if two countries have shared clocks since 1970, they could diverge at any time. Being prepared with an appropriate zone name should minimize the pain to users. Also notice that splitting an existing zone creates no compatibility problems, since no one is obligated to switch to the new zone name immediately.

regards, tom lane

A partly-baked idea: would it make sense to allow something like a link in a Zone record?

Let's say that, if the first character in the GMTOFF column is not a decimal digit or a '-', then GMTOFF is the name of another Zone. The only other column in such a record would be [UNTIL].

As an example, let's say that America/Shiprock becomes a Zone, and that the Navajo people followed Arizona time until that state opted out of DST. That Zone might look something like (I'm just making this up):

    # Zone  NAME              GMTOFF           RULES  FORMAT  [UNTIL]
    Zone    America/Shiprock  -6:59:56         -      LMT     1883 Nov 18 12:00:04
                              America/Phoenix                 1967
                              America/Denver

That would be easy to parse. The "interesting" bit would be figuring out which row in the linked Zone we should start with. It would be the row following the one whose [UNTIL] is the greatest lower bound of the date of interest.

Downstream systems that use the source files directly would need access to files with the older format for a while to give them time to change their parsers. (Maybe the first column in a new Zone record could be something other than "Zone".)

--Bill Seymour

On Sun, Jun 6, 2021 at 10:03 AM Tom Lane via tz <tz@iana.org> wrote:
Stephen Colebourne via tz <tz@iana.org> writes:
[ assorted proposals ]
One other issue that I think deserves more attention than it has gotten lately is that tzdb has become a de facto standard and users rely on its stability. I would like to see some sort of principle adopted that minimizes changes in historical data. In particular, I think it's past time to prohibit data changes adopted for essentially-administrative reasons (as opposed to new findings of historical fact). I'd put the recent reorganization under the heading of things that would be forbidden by this principle, and also the changes a few years ago that removed "made up" zone abbreviations. Whatever the justification for those abbreviations originally, some people had come to depend on them, and little was to be gained by removing them.
Looking at Stephen's list with this in mind, one thing I'd vote against is redefining the LMT offsets. Yeah, I suggested that myself a few days ago, but that was in the context of what seemed to be a fait accompli that would largely destroy tzdb's backward compatibility anyway. If we're going to reverse that choice and try to preserve the existing pre-1970 data, then preserving the existing LMT data goes along with that.
The idea of having at least one zone per ISO-3166-1 country does seem like a good one, though. Aside from possibly deflecting politically-based complaints, this seems basically like future proofing: even if two countries have shared clocks since 1970, they could diverge at any time. Being prepared with an appropriate zone name should minimize the pain to users. Also notice that splitting an existing zone creates no compatibility problems, since no one is obligated to switch to the new zone name immediately.
regards, tom lane
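To make Bill Seymour's partly-baked idea above a little more concrete, here is a minimal sketch of how a parser might tell an ordinary Zone row from a "linked" row, using only the rule he states: a GMTOFF field that does not start with a decimal digit or '-' is taken to be the name of another Zone, followed only by an optional [UNTIL]. The `Row` type and the function names are hypothetical, this is not how zic works, and the sketch does not attempt the "interesting" part of choosing the starting row in the linked Zone.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Row:
    kind: str                 # "offset" for a normal row, "link" for a linked row
    value: str                # GMTOFF text, or the name of the linked Zone
    rules: Optional[str]      # RULES column (None for linked rows)
    fmt: Optional[str]        # FORMAT column (None for linked rows)
    until: Optional[str]      # [UNTIL] text, or None when the row is open-ended


def parse_row(fields: List[str]) -> Row:
    """Classify one row; `fields` starts at the GMTOFF column."""
    first = fields[0]
    if first[0] in "0123456789-":
        # Ordinary row: GMTOFF RULES FORMAT [UNTIL]
        return Row("offset", first, fields[1], fields[2],
                   " ".join(fields[3:]) or None)
    # Proposed linked row: ZONE-NAME [UNTIL]
    return Row("link", first, None, None, " ".join(fields[1:]) or None)


def parse_zone(text: str) -> Tuple[Optional[str], List[Row]]:
    """Parse a single Zone record with its continuation lines."""
    name, rows = None, []
    for line in text.splitlines():
        line = line.split("#", 1)[0].rstrip()   # drop comments
        if not line.strip():
            continue
        fields = line.split()
        if fields[0] == "Zone":
            name, fields = fields[1], fields[2:]
        rows.append(parse_row(fields))
    return name, rows


EXAMPLE = """\
# Zone NAME GMTOFF RULES FORMAT [UNTIL]
Zone America/Shiprock -6:59:56 - LMT 1883 Nov 18 12:00:04
     America/Phoenix 1967
     America/Denver
"""

if __name__ == "__main__":
    zone_name, zone_rows = parse_zone(EXAMPLE)
    print(zone_name)
    for row in zone_rows:
        print(row)
```

Run on the made-up America/Shiprock record, this yields one offset row followed by two link rows, the last of which has no [UNTIL].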

On Jun 6, 2021, at 11:02, Tom Lane via tz <tz@iana.org> wrote:
One other issue that I think deserves more attention than it has gotten lately is that tzdb has become a de facto standard and users rely on its stability. I would like to see some sort of principle adopted that minimizes changes in historical data. In particular, I think it's past time to prohibit data changes adopted for essentially-administrative reasons (as opposed to new findings of historical fact). I'd put the recent reorganization under the heading of things that would be forbidden by this principle, and also the changes a few years ago that removed "made up" zone abbreviations. Whatever the justification for those abbreviations originally, some people had come to depend on them, and little was to be gained by removing them.
[…]
The idea of having at least one zone per ISO-3166-1 country does seem like a good one, though. Aside from possibly deflecting politically-based complaints, this seems basically like future proofing: even if two countries have shared clocks since 1970, they could diverge at any time. Being prepared with an appropriate zone name should minimize the pain to users. Also notice that splitting an existing zone creates no compatibility problems, since no one is obligated to switch to the new zone name immediately.
These two points in particular are synergistic; stability of historical data is just a Good Thing, but no one here wants to see the TZDB maintainer receiving ‘vitriolic’ e-mail, nor getting sued [again]. Hence, politically ‘future-proofing’ the DB seems a prudent move.

Cheers!

|---------------------------------------------------------------------|
| Frederick F. Gleason, Jr. |             Chief Developer             |
|                           |             Paravel Systems             |
|---------------------------------------------------------------------|
|         A room without books is like a body without a soul.         |
|                                                                     |
|                                                         -- Cicero   |
|---------------------------------------------------------------------|

Fred Gleason via tz <tz@iana.org> writes:
On Jun 6, 2021, at 11:02, Tom Lane via tz <tz@iana.org> wrote:
The idea of having at least one zone per ISO-3166-1 country does seem like a good one, though.
These two points in particular are synergistic; stability of historical data is just a Good Thing, but no one here wants to see the TZDB maintainer receiving ‘vitriolic’ e-mail, nor getting sued [again]. Hence, politically ‘future-proofing’ the DB seems a prudent move.
I had a further thought about this: if we want to have both of these principles (zone-per-country and stability of old data), then it would make sense to insist that we don't create new per-country zones until someone has done the research to fill in plausible old data back to the LMT era for the proposed zone name. If that initial data later proves wrong, well, fixing it falls within longstanding tzdb practice. But we shouldn't start out a new zone with known-bogus old data.

As an example that relates to one of the current complaints, if Sweden had been part of the Europe/Berlin zone all along, we'd not split out Europe/Stockholm without first reconstructing plausible historical data for Sweden.

The advantage of this rule is that it would encourage an incremental approach to getting to zone-per-country. If someone comes along and whines that $wherever should have its own zone, they can be told "Sure. Come back when you've done the research." There's no reason that Paul and Tim should be expected to make that happen on their own.

regards, tom lane

Tom Lane via tz <tz@iana.org> wrote on Sun, 6 Jun 2021 at 13:31:06 EDT in <655997.1623000666@sss.pgh.pa.us>:
I had a further thought about this: if we want to have both of these principles (zone-per-country and stability of old data), then it would make sense to insist that we don't create new per-country zones until someone has done the research to fill in plausible old data back to the LMT era for the proposed zone name.
I don't think this is correct or fair.

It's not correct because if there is a newly established zone (per-country or otherwise), early adopters of that zone can tolerate some flux in the data. It's not as if we're coming into an established zone with a lot of dependencies and expectations of stability. If the new zone goes in today and the historical data comes a year later, that's probably OK. Not ideal, but OK. For sure we'd want to minimize adding it in fits and starts.

I don't think it's fair because the tz database needs to turn on a dime to reflect political changes that happen on rapid timescales. If a zone splits into two countries (whether by one of two previously-aligned countries changing their time zone rules, or by other political or even military mechanism) overnight, we need to push a new release out ASAP; we can't say, "oh, sorry, you can't have working time on your computers, we have to research the history, just suffer along for 6 months."

--
jhawk@alum.mit.edu
John Hawkinson

John Hawkinson <jhawk@alum.mit.edu> writes:
Tom Lane via tz <tz@iana.org> wrote on Sun, 6 Jun 2021 at 13:31:06 EDT in <655997.1623000666@sss.pgh.pa.us>:
I had a further thought about this: if we want to have both of these principles (zone-per-country and stability of old data), then it would make sense to insist that we don't create new per-country zones until someone has done the research to fill in plausible old data back to the LMT era for the proposed zone name.
I don't think it's fair because the tz database needs to turn on a dime to reflect political changes that happen on rapid timescales. If a zone splits into two countries (whether by one of two previously-aligned countries changing their time zone rules, or by other political or even military mechanism) overnight, we need to push a new release out ASAP; we can't say, "oh, sorry, you can't have working time on your computers, we have to research the history, just suffer along for 6 months."
You're misunderstanding the context, I think. If country X actually changes their DST rules with minimal notice, then yeah, we'd have to create a new zone and worry about correcting its old data later.

What I'm thinking about is how to handle the situation where X should have its own zone according to the newly formulated zone-per-country rule, but there is no post-1970 data divergence that would make it necessary to have a separate zone according to other rules. I do not think that there need be any urgency about making that new zone come into existence, especially not if that would certainly lead to the need to change its pre-1970 data later.

So I don't buy that there's any "fairness" argument. What's unfair about asking somebody who wants a quick change to do the legwork to support it?

regards, tom lane

I agree that all historical data is murky, and gets murkier (?) the further you go back. This is to be expected, and users should be able to use the data as a 'best guess' until it is improved.

Until we all go back to local solar time, there will always be some inaccuracies, but tz should be the place to capture the current 'best guesses', and be the destination for capturing improvements.

On 2021-06-06 13:56, John Hawkinson via tz wrote:
Tom Lane via tz <tz@iana.org> wrote on Sun, 6 Jun 2021 at 13:31:06 EDT in <655997.1623000666@sss.pgh.pa.us>:
I had a further thought about this: if we want to have both of these principles (zone-per-country and stability of old data), then it would make sense to insist that we don't create new per-country zones until someone has done the research to fill in plausible old data back to the LMT era for the proposed zone name.
I don't think this is correct or fair.
It's not correct because if there is a newly established zone (per-country or otherwise), early adopters of that zone can tolerate some flux in the data. It's not as if we're coming into an established zone with a lot of dependencies and expectations of stability. If the new zone goes in today and the historical data comes a year later, that's probably OK. Not ideal, but OK. For sure we'd want to minimize adding it in fits and starts.
I don't think it's fair because the tz database needs to turn on a dime to reflect political changes that happen on rapid timescales. If a zone splits into two countries (whether by one of two previously-aligned countries changing their time zone rules, or by other political or even military mechanism) overnight, we need to push a new release out ASAP; we can't say, "oh, sorry, you can't have working time on your computers, we have to research the history, just suffer along for 6 months."
-- jhawk@alum.mit.edu John Hawkinson

I would agree to both of these.

I also think it would be preferable to restore (with corrections if necessary) all the distinct zones per ISO country that have varied since 1970. Russia and Canada have numerous timezones, just as the USA has, and it would simplify access, as well as providing more historical data per country.

Overall my preference is to make historical data available for as many cities as possible per country, in one place; or at least for all zones per country that have differed since 1970. If the links are to generic 'noname' zones which cross political boundaries, then I could also be happy with that, as long as no historical data is lost.

In the long run, app developers will also want historical data for separate zones that merged before 1970, and they could exist in a secondary table, managed by tz, or even managed by someone else as a separate project.

On 2021-06-06 12:44, Fred Gleason via tz wrote:
On Jun 6, 2021, at 11:02, Tom Lane via tz <tz@iana.org> wrote:
One other issue that I think deserves more attention than it has gotten lately is that tzdb has become a de facto standard and users rely on its stability. I would like to see some sort of principle adopted that minimizes changes in historical data. In particular, I think it's past time to prohibit data changes adopted for essentially-administrative reasons (as opposed to new findings of historical fact). I'd put the recent reorganization under the heading of things that would be forbidden by this principle, and also the changes a few years ago that removed "made up" zone abbreviations. Whatever the justification for those abbreviations originally, some people had come to depend on them, and little was to be gained by removing them.
[…]
The idea of having at least one zone per ISO-3166-1 country does seem like a good one, though. Aside from possibly deflecting politically-based complaints, this seems basically like future proofing: even if two countries have shared clocks since 1970, they could diverge at any time. Being prepared with an appropriate zone name should minimize the pain to users. Also notice that splitting an existing zone creates no compatibility problems, since no one is obligated to switch to the new zone name immediately.
These two points in particular are synergistic; stability of historical data is just a Good Thing, but no one here wants to see the TZDB maintainer receiving ‘vitriolic’ e-mail, nor getting sued [again]. Hence, politically ‘future-proofing’ the DB seems a prudent move.
Cheers!

On Sun, 6 Jun 2021, Tom Lane via tz wrote:
Stephen Colebourne via tz <tz@iana.org> writes:
[ assorted proposals ]
<snip>
I'd put the recent reorganization under the heading of things that would be forbidden by this principle, and also the changes a few years ago that removed "made up" zone abbreviations. Whatever the justification for those abbreviations originally, some people had come to depend on them, and little was to be gained by removing them.
The PHP project got a fair amount of bug reports about the "made up" abbreviation removal (in favour of <+03>). So I agree with that.

cheers,
Derick

--
PHP 7.4 Release Manager
Host of PHP Internals News: https://phpinternals.news
Like Xdebug? Consider supporting me: https://xdebug.org/support
https://derickrethans.nl | https://xdebug.org | https://dram.io
twitter: @derickr and @xdebug

Tom Lane via tz said:
One other issue that I think deserves more attention than it has gotten lately is that tzdb has become a de facto standard and users rely on its stability. I would like to see some sort of principle adopted that minimizes changes in historical data. [...] and also the changes a few years ago that removed "made up" zone abbreviations. Whatever the justification for those abbreviations originally, some people had come to depend on them, and little was to be gained by removing them.
But I'd like another principle to be "truth". We should not be making up data - that goes both for lying about LMT offsets and lying about the abbreviations people use. Remember that the basic principle was "what's used on the ground".
The idea of having at least one zone per ISO-3166-1 country does seem like a good one, though. Aside from possibly deflecting politically-based complaints,
And adding other ones.
this seems basically like future proofing: even if two countries have shared clocks since 1970, they could diverge at any time.
So can a country and a dependent territory. Or even a non-dependent one: to pick a fanciful example, as part of the upcoming EU changes, Italy could decide to stick to UTC+2 but France to UTC+1, then Corsica decide to go to UTC+2.
Being prepared with an appropriate zone name should minimize the pain to users.
Why not wait until it's needed? We're likely to get more notice than we get for some countries' Ramadan changes.

--
Clive D.W. Feather          | If you lie to the compiler,
Email: clive@davros.org     | it will get its revenge.
Web: http://www.davros.org  |   - Henry Spencer
Mobile: +44 7973 377646

On Sun, 6 Jun 2021, Stephen Colebourne via tz wrote:
TZDB has seen recent difficulties due to conflicting desires and expectations of the dataset. This is an attempt to capture some of these:
1) LMT LMT is confusing for many downstream users because they don't understand the concept. Recent threads have noted queries from Postgres users, I can attest to confusion in various Java libraries. In fact, earlier versions of Java removed the LMT concept. I think the time is right to properly consider an alternative to LMT. I believe we can define an offset for each region that the region has most typically been associated with post 1970. For example, Europe/Paris is most associated with +01:00 since 1970. This provides the "normal looking" offset that most users desire for the LMT period.
I'm not sure whether this is a solution that you can universally use. For example Amsterdam's rules in the 1920s specifically used UTC+1172/+4772 (offsets in seconds, that is +00:19:32 and +01:19:32). Having the now standard time "+3600" before that makes less sense than continuing to keep LMT (which also happens to be +1172).
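The figures in this reply are offsets in seconds. As a small worked example, the sketch below converts them to the signed hh:mm:ss form used in the tzdb source files; `format_offset` is a hypothetical helper name, not part of any library mentioned in the thread.

```python
def format_offset(seconds: int) -> str:
    """Render a UTC offset given in seconds as a signed hh:mm:ss string."""
    sign = "-" if seconds < 0 else "+"
    hours, remainder = divmod(abs(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    for value in (1172, 4772, 3600):
        print(value, "->", format_offset(value))
    # 1172 -> +00:19:32
    # 4772 -> +01:19:32
    # 3600 -> +01:00:00
```

So replacing Amsterdam's +1172 with +3600 would shift pre-standard-time answers by a little over 40 minutes.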
2) Negative DST Negative DST in the source files continues to be a problem, but we should address other issues first.
When I first highlighted this Irish case (https://mm.icann.org/pipermail/tz/2017-December/025621.html), it was never my intention that the tz data be changed to negative DST - I was originally only pointing out that IST isn't only Iran Standard Time, but also Irish Standard Time. I don't mind how it is recorded, as long as it produces the right output (in the form of transitions).
3) Links At present, it is not possible to identify which region names are deprecated (such as spelling changes) and which represent important data. Having such a distinction would allow permanent deprecations to be removed from some downstream systems, and would also allow downstream systems to provide functionality to normalize IDs from old to new in a correct and consistent way.
PHP's implementation *does* know which ones are deprecated, by marking the tzids that are not listed in code/zone.tab as such. It does not however offer what the replacement is. <snip>
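The heuristic described here (treat any ID that is not listed in zone.tab as deprecated) can be sketched as follows. This is an illustration of the idea only, not PHP's actual code. It assumes zone.tab's tab-separated layout with the TZ name in the third column, and the two sample rows and the sample ID list are a tiny hand-made excerpt rather than the real file.

```python
def zone_tab_ids(zone_tab_text: str) -> set:
    """Collect the TZ identifiers (third tab-separated column) from zone.tab text."""
    ids = set()
    for line in zone_tab_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        columns = line.split("\t")
        if len(columns) >= 3:
            ids.add(columns[2])
    return ids


def deprecated_ids(all_ids, zone_tab_text: str) -> list:
    """Return the IDs that zone.tab does not list, in sorted order."""
    listed = zone_tab_ids(zone_tab_text)
    return sorted(tzid for tzid in all_ids if tzid not in listed)


# Hand-made sample input, for illustration only.
SAMPLE_ZONE_TAB = (
    "# country-code\tcoordinates\tTZ\tcomments\n"
    "TR\t+4101+02858\tEurope/Istanbul\n"
    "RU\t+554521+0373704\tEurope/Moscow\tMSK+00 - Moscow area\n"
)
SAMPLE_ALL_IDS = ["Europe/Istanbul", "Europe/Moscow", "Turkey", "W-SU"]

if __name__ == "__main__":
    print(deprecated_ids(SAMPLE_ALL_IDS, SAMPLE_ZONE_TAB))
    # ['Turkey', 'W-SU']
```

As noted in the reply, this says nothing about what the replacement ID is. A check this simple would also flag any other ID that zone.tab omits, whether or not it is really deprecated.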
Proposal ------------ That TZDB shall adopt the principle that the main geographic files (africa to southamerica) shall contain data with full history for locations where zone history has differed since 1970 subject to the minimum requirement that there is at least one full zone with history defined for each independent country as defined by ISO-3166-1. Dependent territories in ISO-3166-1 that are within 1/24th of the earth circumference of another dependent territory or parent country with the same sovereignty shall be combined if their post-1970 history is identical.
That TZDB shall replace LMT with the offset that best represents standard time for the location during the period 1970 to 2021.
That TZDB shall define a non-makefile mechanism, which may involve a new file, to identify permanently deprecated IDs, such as "Turkey" or "W-SU".
That TZDB shall offer a command line makefile flag that filters the data to reduce the binary output where data is the same post-1970. That consideration is given to whether this flag should erase pre-1970 history as part of its truncation process.
That these rules shall be encoded in the theory file along with an explicit statement of backwards compatibility.
----- It is my belief that this proposal meets the issues expressed above while also respecting the concerns of fairness, guidelines and politics expressed by others. For example, TZDB would not include a full zone with history for Kosovo until ISO-3166-1 includes it. This provides a straightforward defence against the worst issues of politics.
The dependent territory rules are designed to allow locations that are close to each other in distance and sovereignty to be combined, such as Jersey and London. I have not analyzed how many zones of full history can be saved by this mechanism.
I acknowledge that the above is a significant change to TZDB, but it does more fully align TZDB with the Governmental authorities that actually define time zones. I also believe it more closely aligns TZDB with the expectations of downstream users.
This proposal seems very sensible to me, with perhaps the exception of what to pick for the "before LMT" offset.

cheers,
Derick

Derick Rethans via tz said:
2) Negative DST Negative DST in the source files continues to be a problem, but we should address other issues first.
When I first highlighted this Irish case (https://mm.icann.org/pipermail/tz/2017-December/025621.html), it was never my intention that the tz data be changed to negative DST - I was originally only pointing out that IST isn't only Iran Standard Time, but also Irish Standard Time. I don't mind how it is recorded, as long as it produces the right output (in the form of transitions).
If negative DST represents what actually happens in the real world, then that's what we should be using. If implementations can't cope with negative DST, then they have bugs in them and/or don't match the real world. They need to change; we should not be lying to everyone just because some people won't face reality.

(I don't object to a mechanism for people to do a build with the lies in, provided they have to explicitly accept it. Lies shouldn't be the default.)

The other thing I think we need to discuss in this debate is the intent of pre-1970 data. It should be clear to people what this data is meant to cover.

I was under the impression that pre-1970 data represents the time in at least some part of the zone. So, at present, the Berlin zone has pre-1970 data that represents at least part of Germany and the Stockholm zone has pre-1970 data that represents at least part of Sweden. If these are merged, the resulting zone has two names but both only contain pre-1970 data for part of Germany. In this case, it needs to be made very clear that the area covered by Stockholm is "Germany and Sweden", not "Sweden".

The problem appears, to me, to be two zones that have the same contents but different descriptions. That would need to be fixed.

(Personally, I'd like all the pre-1970 data that we used to have to still be available. But I understand that Paul doesn't want to have to change dozens of zones that last differed in the 19th century just because of a change in the 21st. There ought to be a solution to this.)

--
Clive D.W. Feather

On Jun 7, 2021, at 1:25 AM, Derick Rethans via tz <tz@iana.org> wrote:
I'm not sure whether this is a solution that you can universally use. For example Amsterdam's rules in the 1920s specifically used UTC+1172/+4772. Having the now standard time "+3600" before that makes less sense than continuing to keep LMT (which also happens to be +1172).
Hopefully nobody assumes that, prior to 1835, Europe/Amsterdam, in the backzone version of the database, refers to any part of the Europe/Amsterdam region other than Amsterdam itself (or some subregion thereof). (The same applies for all other tzdb regions that begin with an LMT value.)

Note that theory.html says "don't do that":

In short, many, perhaps most, of the tz database's pre-1970 and future timestamps are either wrong or misleading. Any attempt to pass the tz database off as the definition of time should be unacceptable to anybody who cares about the facts. In particular, the tz database's LMT offsets should not be considered meaningful, and should not prompt creation of timezones merely because two locations differ in LMT or transitioned to standard time at different dates.
I was originally only pointing out that IST isn't only Iran Standard Time, but also Irish Standard Time.
Hopefully nobody using any APIs that return time zone abbreviations is assuming that those abbreviations can, in the general case, be used to determine what time zone you're in - i.e., display them to the user, but don't attempt to process them to infer any information unless you know that they come from a limited set of tzdb regions in which no two regions ever have the same abbreviations.

On Mon, 7 Jun 2021 at 10:28, Guy Harris via tz <tz@iana.org> wrote:
Note that theory.html says "don't do that":
In short, many, perhaps most, of the tz database's pre-1970 and future timestamps are either wrong or misleading. Any attempt to pass the tz database off as the definition of time should be unacceptable to anybody who cares about the facts. In particular, the tz database's LMT offsets should not be considered meaningful, and should not prompt creation of timezones merely because two locations differ in LMT or transitioned to standard time at different dates.
If you ask a Java library "what was the offset on 1st Jan 1600 at 12:36 pm" they will all provide an answer. My understanding is that the same is true of libraries in many other environments. Most systems will provide LMT as the answer. Some (e.g. the older Java API) will provide a non-LMT offset. Those of us on this list know that the question being asked is pretty stupid, but it is asked by end users. And when it is asked, people are surprised by the LMT answer. It is worthwhile considering whether LMT is the best solution to the far-past problem.

The second point is not relevant - the proposal is to determine which zones get an LMT and history based on post-1970 data with an ISO country minimum. There is no proposal to create timezones solely for LMT.

Stephen
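The behaviour described above can be shown with a toy lookup: a zone is a sorted list of periods, the first of which (LMT) has no start, so any far-past question lands on it. The sketch also shows the alternative under discussion, where the LMT period is answered with a substitute offset instead. The zone data is entirely hypothetical (an LMT of 561 seconds until 1900, then +01:00), and `offset_at` with its `lmt_replacement` parameter is an invented name, not an API of any library mentioned in the thread.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical zone: start of each period, and (offset in seconds, abbreviation).
# The first period has no real start, so datetime.min stands in for "forever ago".
STARTS = [datetime.min, datetime(1900, 1, 1)]
PERIODS = [(561, "LMT"), (3600, "+01")]


def offset_at(when: datetime, lmt_replacement=None) -> int:
    """Return the offset in seconds in force at local time `when`.

    If `lmt_replacement` is given, it is returned instead of the LMT offset,
    mimicking the idea of substituting a 'normal looking' offset for LMT.
    """
    index = bisect_right(STARTS, when) - 1   # last period starting at or before `when`
    seconds, abbreviation = PERIODS[index]
    if abbreviation == "LMT" and lmt_replacement is not None:
        return lmt_replacement
    return seconds


if __name__ == "__main__":
    question = datetime(1600, 1, 1, 12, 36)
    print(offset_at(question))                        # LMT answer
    print(offset_at(question, lmt_replacement=3600))  # substituted answer
```

With this toy data the first call returns 561 and the second 3600; later dates are unaffected by the substitution.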

In the 1600s, pretty much everyone used local solar time / apparent time / sundial time, except astronomers. Use of local mean time only really caught on for the public in the early 1800s. Capturing the date of application of LMT for the primary city of a zone is very useful, but determining the exact date it occurred is likely quite difficult. History is always murky. In my own code I use 1800 as the transition date between LST and LMT when it's not specified elsewhere.

On 6/7/21 7:57 AM, David Patte via tz wrote:
In my own code I use 1800 as the transition date between LST and LMT when it's not specified elsewhere.
1800 is surely a bit early, for all but the most technologically-advanced locations. In most of the world, solar time maintained its supremacy over local mean time well after 1800.
The "when it's not specified elsewhere" intrigues me, though. What sort of specification do you have elsewhere? One can imagine a tzdb extension containing when local mean time came into effect at each location. Unfortunately if we added something along these lines to tzdb, I expect we'd have to invent nearly every data item. It'd be like a good chunk of the pre-1970 data we already have, only worse.
Part of the problem is that people in the early 19th century didn't much care whether they were using local solar time or local mean time, and many towns actually observed a mean-time approximation to solar time. Here's a quote from page 15 of Francis Abbott's book "A Treatise on the Management of Public Clocks", 3rd ed. (1839):
"Nothing is more common than to place the management and regulating of church clocks in the hands of the sexton, without keeping any check upon him or allowing him a salary to stimulate him in this important duty; the consequence is, that the village clocks throughout the country are kept by chance, and generally speaking vary from one quarter to three quarters of an hour from mean time."
Abbott also wrote (page 16) that when a sexton periodically checked and set a church clock's time, "the usual mode of ascertaining time is by the sundial", i.e., the clock was considered to be a good-enough approximation to solar time rather than to local mean time.
This state of affairs didn't change until the telegraph made it feasible for timestamps to be communicated more accurately, and railroads needed more-accurate time.

Paul Eggert via tz said:
In my own code I use 1800 as the transition date between LST and LMT when it's not specified elsewhere.
1800 is surely a bit early, for all but the most technologically-advanced locations. In most of the world, solar time maintained its supremacy over local mean time well after 1800. [...] Part of the problem is that people in the early 19th century didn't much care whether they were using local solar time or local mean time, and many towns actually observed a mean-time approximation to solar time. [...] This state of affairs didn't change until the telegraph made it feasible for timestamps to be communicated more accurately, and railroads needed more-accurate time.
I was wandering round the Fitzwilliam Museum in Cambridge the other day and noticed a Thomas Tompion (1639-1713) clock. It had two hands and a dial with Roman numerals I to XII twice going round it. Outside that, in Arabic numerals, were 5, 10, ..., 60, also twice. At the top it said "EQVAL 60 TIME". Immediately outside that was a ring which could clearly rotate separately. This also had 5, 10, ..., 60 on it twice, with "APPARET 60 TIME" (tilde over the first E) at the near top - it was rotated so that the 60 was aligned with 13.6 on the main dial. (Clearly the staff don't know how to set it correctly, since the equation of time was 3.22 that day.)
So 1800 is perhaps a bit late, at least for some places.
--
Clive D.W. Feather          | If you lie to the compiler,
Email: clive@davros.org     | it will get its revenge.
Web: http://www.davros.org  |   - Henry Spencer
Mobile: +44 7973 377646

On Mon 2021-06-07T20:14:00+0100 Clive D.W. Feather via tz hath writ:
I was wandering round the Fitzwilliam Museum in Cambridge the other day and noticed a Thomas Tompion (1639-1713) clock. It had two hands and a dial with Roman numerals I to XII twice going round it. Outside that, in Arabic numerals, were 5, 10, ..., 60, also twice. At the top it said "EQVAL 60 TIME".
So 1800 is perhaps a bit late, at least for some places.
A clock of this quality would be sold as a "regulator" and priced accordingly. Not even most observatories had clocks accurate to one second until around 1800. If one second is the standard then 1800 is a good guess at the year.
--
Steve Allen <sla@ucolick.org>                              WGS-84 (GPS)
UCO/Lick Observatory--ISB 260  Natural Sciences II, Room 165  Lat +36.99855
1156 High Street               Voice: +1 831 459 3046         Lng -122.06015
Santa Cruz, CA 95064           https://www.ucolick.org/~sla/  Hgt +250 m

On Mon, Jun 07, 2021 at 02:28:01AM -0700, Guy Harris via tz wrote:
Hopefully nobody using any APIs that return time zone abbreviations is assuming that those abbreviations can, in the general case, be used to determine what time zone you're in - i.e., display them to the user, but don't attempt to process them to infer any information unless you know that they come from a limited set of tzdb regions in which no two regions ever have the same abbreviations.
PHP used to do this in the past: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=628079#19

Stephen Colebourne via tz said:
1) LMT LMT is confusing for many downstream users because they don't understand the concept.
In that case, explain it better. But just because people don't understand how the real world worked is not a good reason to lie to them.
Recent threads have noted queries from Postgres users, I can attest to confusion in various Java libraries. In fact, earlier versions of Java removed the LMT concept. I think the time is right to properly consider an alternative to LMT. I believe we can define an offset for each region that the region has most typically been associated with post 1970. For example, Europe/Paris is most associated with +01:00 since 1970. This provides the "normal looking" offset that most users desire for the LMT period.
Derick has pointed out that this looks ridiculous for Amsterdam. I'll also point out Dublin, where the official time was Dublin Mean Time for many years. Making TZDB say that Dublin used GMT before it used DMT is total and absolute nonsense. I thought you were the one objecting violently to the idea that a zone might contain fake data; wasn't that the whole point of your Stockholm argument? So why do you want to create fake data now? If LMT is what happened, then LMT is what the database should say. If people don't want to see LMT in their systems, delete everything before 1970 in your local copy.
Proposal ------------ That TZDB shall adopt the principle that the main geographic files (africa to southamerica) shall contain data with full history for locations where zone history has differed since 1970 subject to the minimum requirement that there is at least one full zone with history defined for each independent country as defined by ISO-3166-1.
I disagree with this. There is no need to create zones just to have one per country.
Dependent territories in ISO-3166-1 that are within 1/24th of the earth circumference of another dependent territory or parent country with the same sovereignty shall be combined if their post-1970 history is identical.
I also disagree with this. If it's justified to have separate zones for countries, why not for dependent territories? And why should the distance matter? Oh, and why on earth "1/24th" instead of "15 degrees" like everyone is used to? And which circumference? The earth has more than one. Why not just give a distance in km? And why treat a territory 10 degrees due south differently to one 20 degrees due south? Are you measuring from the nearest points of the two territories and, if so, are you working from high tide, low tide, 3 mile limit, 15 mile limit, 50 mile limit, or claimed waters? Or are you measuring from a capital city or other administrative centre? What if the dependent territory is not north or south of the administrative centre of the country but some other part of it? Oh, please define "dependent". This is a useless definition as written. Why are you happy for Taiwan to be excluded under these rules but not Sweden? Answer: politics, which is what we are trying to avoid. For the record, I OBJECT to this proposal.
That TZDB shall replace LMT with the offset that best represents standard time for the location during the period 1970 to 2021.
For the record, I OBJECT to this proposal.
It is my belief that this proposal meets the issues expressed above while also respecting the concerns of fairness, guidelines and politics expressed by others. For example, TZDB would not include a full zone with history for Kosovo until ISO-3166-1 includes it. This provides a straightforward defence against the worst issues of politics.
Better would be to ignore politics entirely and say that TZDB would not include a zone for Kosovo until its time differs from wherever is used now.
The dependent territory rules are designed to allow locations that are close to each other in distance and sovereignty to be combined, such as Jersey and London.
Jersey is not part of the UK - it is a Crown Dependency. Not the same thing at all.

On Mon, 7 Jun 2021 at 10:08, Clive D.W. Feather via tz <tz@iana.org> wrote:
Stephen Colebourne via tz said:
1) LMT I thought you were the one objecting violently to the idea that a zone might contain fake data; wasn't that the whole point of your Stockholm argument? So why do you want to create fake data now?
I'm recording the issues that I've seen users have with tzdb data, and proposing a possible solution. LMT as currently defined causes issues and I believe the proposal would be less surprising to non-expert users.
That TZDB shall adopt the principle that the main geographic files (africa to southamerica) shall contain data with full history for locations where zone history has differed since 1970 subject to the minimum requirement that there is at least one full zone with history defined for each independent country as defined by ISO-3166-1.
I disagree with this. There is no need to create zones just to have one per country.
TZDB does not live in an abstract idealised world. The vast majority of the world's population associates strongly with the country they are in.
I also disagree with this. If it's justified to have separate zones for countries, why not for dependent territories? And why should the distance matter? Oh, and why on earth "1/24th" instead of "15 degrees" like everyone is used to?
Meh, of course 15 degrees is a better way to put it. The distance is a general guidance (LMT location to LMT location) to separate "local" from "far away". eg. Aruba is far away from the Netherlands, but close to Bonaire - that is the distinction that I tried to capture. Dependent territories are of course entitled to their own zone providing that data has differed since 1970. The only reason for including the dependent territories part at all is to permit rare cases of merging where the locations are local and aligned by sovereignty. If there is a general view that the dependent territories part of the proposal is overly complex, it can be dropped at the expense of creating more zones.
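The "15 degrees between LMT locations" guidance could be checked along these lines. This is only a sketch under assumptions: the coordinates are approximate values for the three places named above, the great-circle (haversine) angle is one possible reading of "distance", and `central_angle_deg` and `is_local` are hypothetical helper names.

```python
import math

LIMIT_DEG = 360 / 24  # 1/24th of a full circle, i.e. 15 degrees


def central_angle_deg(lat1, lon1, lat2, lon2):
    """Great-circle angle in degrees between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return math.degrees(2 * math.asin(math.sqrt(a)))


def is_local(a, b, limit_deg=LIMIT_DEG):
    """True if two (lat, lon) locations are within the angular limit."""
    return central_angle_deg(a[0], a[1], b[0], b[1]) <= limit_deg


# Approximate (latitude, longitude) pairs, assumed for illustration.
ARUBA = (12.5, -69.97)
BONAIRE = (12.15, -68.28)
AMSTERDAM = (52.37, 4.9)

print("Aruba/Bonaire local:", is_local(ARUBA, BONAIRE))
print("Aruba/Amsterdam local:", is_local(ARUBA, AMSTERDAM))
```

Under these assumptions Aruba and Bonaire fall inside the limit while Aruba and Amsterdam do not, which is the "local" versus "far away" distinction being described. Measuring along a great circle rather than by longitude alone is a choice made here, not something the proposal specifies.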
Oh, please define "dependent".
Listed at Wikipedia, based on 3 sources. The official ISO data indicates whether the code is "independent" which is also useful. https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes https://www.iso.org/obp/ui/#iso:code:3166:BQ
Why are you happy for Taiwan to be excluded under these rules but not Sweden? Answer: politics, which is what we are trying to avoid. Better would be to ignore politics entirely and say that TZDB would not include a zone for Kosovo until its time differs from wherever is used now.
Taiwan has an ISO-3166-1 code, just like Sweden. The only case of interest that I can find is Kosovo, which does not have either an ISO or TZDB code. It is *very* easy to say "TZDB will add a zone representing time in Kosovo as soon as ISO-3166-1 includes it". End of politics.
For the record, I OBJECT to this proposal.
For the record, I OBJECT to the decimation of TZDB data over the last few years. If you object then feel free to provide a counter proposal (one that seeks to address the issues at hand).
As a reminder, the ISO-3166-1 rule is a *minimum standard*. Nothing would change about creating or merging time zones within a country.
Stephen

Hi Stephen, Without getting into any of the substance of any of the proposals you mentioned, a hopefully minor aside: On 06.06.21 10:08, Stephen Colebourne via tz wrote:
I acknowledge that the above is a significant change to TZDB, but it does more fully align TZDB with the Governmental authorities that actually define time zones.
The goal of this database has never been to align to governmental authorities, but to express what people in a region think local time is. Let's please not lose sight of that. Eliot

On Mon, 7 Jun 2021 at 15:24, Eliot Lear via tz <tz@iana.org> wrote:
Without getting into any of the substance of any of the proposals you mentioned, a hopefully minor aside:
On 06.06.21 10:08, Stephen Colebourne via tz wrote:
I acknowledge that the above is a significant change to TZDB, but it does more fully align TZDB with the Governmental authorities that actually define time zones.
The goal of this database has never been to align to governmental authorities, but to express what people in a region think local time is. Let's please not lose sight of that.
And yet there are (older) TZ region names for many countries, including Eire, Egypt, Turkey, GB, Israel, Hongkong, PRC, NZ... There are also modern names with countries/states in them, e.g. America/Argentina/Cordoba and America/Indiana/Indianapolis. Given this, I think "never" is a significant overstretch.
"what people in a region think" is essentially unknowable. What governmental authorities define is generally well recorded. Where we can accurately find data to indicate that a region is not following the lead of the governmental authorities then I agree we need to make sure that those people's experience can be expressed in some form by TZDB. But I think that governmental authorities so dominate the field of timezones, and what our downstream users perceive of timezones, that we need to reflect it. Putting our heads in the sand and pretending that governmental authorities don't exist is not helpful.
Stephen

On Jun 7, 2021, at 10:53, Stephen Colebourne via tz <tz@iana.org> wrote:
"what people in a region think" is essentially unknowable. What governmental authorities define is generally well recorded. Where we can accurately find data to indicate that a region is not following the lead of the governmental authorities then I agree we need to make sure that those people's experience can be expressed in some form by TZDB. But I think that governmental authorities so dominate the field of timezones, and what our downstream users perceive of timezones, that we need to reflect it. Putting our heads in the sand and pretending that governmental authorities don't exist is not helpful.
Other than Asia/Urumqi, I’m hard pressed to call to mind a region where a significant section of the population has basically decided to defy all governmental authority. However, there are areas where *which* governmental authority is relevant is a very live question (the Crimea comes to mind immediately). The “what people in a region think” standard allows TZDB to avoid taking sides about which political entity is the “legitimate” one.
Cheers!
|---------------------------------------------------------------------|
| Frederick F. Gleason, Jr. |             Chief Developer             |
|                           |             Paravel Systems             |
|---------------------------------------------------------------------|
|         A room without books is like a body without a soul.         |
|                                                                     |
|                                                         -- Cicero   |
|---------------------------------------------------------------------|

On Mon, 7 Jun 2021 at 16:12, Fred Gleason via tz <tz@iana.org> wrote:
On Jun 7, 2021, at 10:53, Stephen Colebourne via tz <tz@iana.org> wrote:
"what people in a region think" is essentially unknowable. What governmental authorities define is generally well recorded. Where we can accurately find data to indicate that a region is not following the lead of the governmental authorities then I agree we need to make sure that those people's experience can be expressed in some form by TZDB. But I think that governmental authorities so dominate the field of timezones, and what our downstream users perceive of timezones, that we need to reflect it. Putting our heads in the sand and pretending that governmental authorities don't exist is not helpful.
Other than Asia/Urumqi, I’m hard pressed to call to mind a region where a significant section of the population has basically decided to defy all governmental authority. However, there are areas where *which* governmental authority is relevant is a very live question (the Crimea comes to mind immediately). The “what people in a region think” standard allows TZDB to avoid taking sides about which political entity is the “legitimate” one.
The proposal does not take sides about which political entity is the “legitimate” one. I wouldn't want tzdb to do that.
Imagine a case where country A invades country B and takes control of the city that TZDB references, but not the whole country, and then imposes its own timezone in that city. With the current rules, TZDB would update the current timezone rule for the city to that of the invader's new rule, because that is what people in that city experience. But the rest of the country, which was not invaded, would still need a region in TZDB to record its current timezone. Thus a new region would need to be created based on the largest city in the remnants of the invaded country.
With the new rules, the same process would occur. I contend that the new rule only really affects pre-1970 data, not how TZDB approaches changes in geopolitics.
If the TZDB IDs were of the form Country/City, e.g. Germany/Berlin, I would agree with the concern, but fortunately they are not.
Stephen
participants (13)
- Bill Seymour
- Clive D.W. Feather
- David Patte
- Derick Rethans
- Eliot Lear
- Fred Gleason
- Guy Harris
- John Hawkinson
- Paul Eggert
- Stepan Golosunov
- Stephen Colebourne
- Steve Allen
- Tom Lane