Before 1970 - proposal to change LMT
The TZDB has few issues after 1970 as far as I can see, but before that date there are debates about what should be included and what should not. Options include:
- nothing at all before 1970 (but the format demands something...)
- only data that has some reasonable source before 1970
- include data before 1970 even if its source is less than ideal

Ultimately, the problem revolves around the data format (which cannot realistically be changed) and LMT. In the past there have been various discussions around LMT, with the conclusion that it is a bit of a silly concept, because it refers to a single location, not a region. But we are in the position where the format requires it, and it is widely used (even if that use is accidental rather than deliberate).

Broadly, my position is that where data can be verified pre-1970, such as in the UK, or my recent Angola email, it should be retained. Commentary should be added to the data files to justify the data that is present. Where the data cannot be verified, I am willing to take a judgement call, provided that the judgement call is not just "we don't know so let's link it elsewhere".

What I find objectionable is to have a named entity (zone or link) where the LMT of the named location is replaced by the LMT of some other location. I understand that others may not care, but it seems flat out wrong to me (and is key to solving the pre-1970 problem).

Therefore, I'd like to propose a simple alteration to LMT. I propose that all LMT values in the database are replaced by a new value, representing what could be described as "averaged/smoothed regional far past time".

In real terms, this means changing all the LMT values to an appropriate fixed value for that location. Once the fixed value is chosen, there should be no reason not to retain it forever. The intention is that the fixed offset value is the most common standard offset used by the location, favouring an offset close to the LMT if ambiguous.
The offset chosen would typically be a multiple of 15 minutes, and not contain seconds at all.

To help frame the debate, I produced a list of zones and all their *standard* transitions (excluding DST) for 2013h:
https://gist.github.com/jodastephen/03487c61782e612db85d

Some examples:

Europe/London -00:01:15 LMT Z 1847-12-01
My proposal is to change the LMT value from "-00:01:15" to "Z".

America/New_York -04:56:02 -05:00 1883-11-18
My proposal is to change the LMT value from "-04:56:02" to "-05:00".

America/Panama -05:18:08 -05:19:36 1890-01-01 -05:00 1908-04-22
My proposal is to change the LMT value from "-05:18:08" to "-05:00". If justification for the "-05:19:36" value cannot be found, then that should be removed.

The net effect of this would be to provide a much more regularized value for the far past (pre 1850 or later). In most cases, the new LMT offset would be the same as the first non-LMT offset, resulting in the removal of a messy transition, leaving only a name change (e.g. LMT to GMT).
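A minimal Python sketch of the selection rule described above may help make it concrete. It is an illustration only, not part of the proposal or of any tzdb tooling. The function names are hypothetical, "most common" is assumed to mean "appears most often in the list of standard offsets", and candidates are assumed to be restricted to multiples of 15 minutes whenever at least one such candidate exists.

```python
from collections import Counter


def parse_offset(text):
    """Convert 'Z' or '[+-]HH:MM[:SS]' into signed seconds east of Greenwich."""
    if text == "Z":
        return 0
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("+-").split(":")]
    parts += [0] * (3 - len(parts))  # pad missing minutes/seconds with zero
    hours, minutes, seconds = parts
    return sign * (hours * 3600 + minutes * 60 + seconds)


def smoothed_offset(lmt, standard_offsets):
    """Pick a fixed far-past offset (in seconds) for one location.

    Assumed reading of the rule: take the most frequent standard offset,
    preferring multiples of 15 minutes, and break ties by choosing the
    candidate closest to the existing LMT value.
    """
    lmt_seconds = parse_offset(lmt)
    counts = Counter(parse_offset(o) for o in standard_offsets)

    # Prefer offsets with no odd minutes or seconds, if any are available.
    quarter_hour = {off: n for off, n in counts.items() if off % 900 == 0}
    candidates = quarter_hour or counts

    best = max(candidates.values())
    tied = [off for off, n in candidates.items() if n == best]
    return min(tied, key=lambda off: abs(off - lmt_seconds))
```

Under these assumptions, the three examples above give 0 seconds for Europe/London, -18000 seconds (-05:00) for America/New_York, and -18000 seconds (-05:00) for America/Panama, matching the values proposed in the message.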
Advantages:
- regularized far past (my experience shows that developers find the LMT seconds offset confusing when they see it)
- increased stability, once adopted there is no reason for the LMT value to change
- minimizes churn in the database due to "more accurate" longitude/latitude
- allows the same offset to be used even if the largest city changes in a region
- typically aligns the far past with the first "real" zone definition line
- allows multiple locations to be linked (as per recent changes) without debate about changing any timestamps in most cases
- when multiple locations are linked, only the start of regular time is then significant data (where known it should be retained)

Disadvantages:
- changes the meaning of LMT (the name could possibly be changed to FPT - far past time)
- loses existing LMT data (it could still be made available in a separate file)
- requires a one-off change

I hope that this proposal can be considered, as it would seem to address the majority of the issues in a simple and neat way that benefits most downstream consumers with extra regularity and stability.

Stephen
Stephen Colebourne wrote:
What I find objectionable is to have a named entity (zone or link) where the LMT of the named location is replaced by the LMT of some other location.
This objection seems to be based on a misunderstanding of what the LMT entries were always supposed to mean. They never meant anything like "the time at this location was exactly 7 hours, 52 minutes and 58 seconds before GMT" (to use America/Los_Angeles as an example). Instead, they meant that the time was not closely specified: different people at that location would not have cared about minor differences in GMT offsets, or (if pushed to be specific and if knowledgeable about the topic) would even have disagreed about what the GMT offsets should be.

In hindsight the tz database format should have had a specific notation for this. But it doesn't, so we use LMT entries as stand-ins. Their exact GMT offsets do not matter. It's similar to the zzz notation that we use for places while uninhabited. I suppose one could argue that the LMT and zzz notations are both abuses of the format, but we needed *some* way to say that the local time was not closely specified or was undefined, and that's what we came up with.
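For context, a seconds-level figure such as 7:52:58 follows from plain longitude arithmetic: the Earth turns 15 degrees per hour, so each degree of longitude corresponds to 4 minutes (240 seconds) of local mean time. The sketch below is an illustration only. The function name is hypothetical, the longitude used for Los Angeles is an approximate value assumed for the example, and rounding to the nearest second is also an assumption.

```python
def lmt_from_longitude(longitude_deg):
    """Return local mean time for a longitude as a '[+-]HH:MM:SS' string.

    Positive longitudes are east of Greenwich. One degree of longitude
    corresponds to 240 seconds of mean solar time.
    """
    total = round(longitude_deg * 240)  # assumed: round to the nearest second
    sign = "-" if total < 0 else "+"
    hours, remainder = divmod(abs(total), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{seconds:02d}"


# Approximate longitude of Los Angeles (about 118 deg 14' 34" W), assumed here.
LOS_ANGELES_LONGITUDE = -118.242778
```

With that assumed longitude, `lmt_from_longitude(LOS_ANGELES_LONGITUDE)` returns "-07:52:58". A slightly different choice of reference point within the same city shifts the result by a few seconds, which is consistent with the point that the exact offsets are not meaningful.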
I propose that all LMT values in the database are replaced by an new value, representing what could be described as "averaged/smoothed regional far past time".
This substitutes one notation for another. Why change notations now?
The net effect of this would be to provide a much more regularized value for the far past (pre 1850 or later).
This makes it sound like the proposed notation would be more misleading than the current one, as it would suggest an even-more-regularized past than the current one does. We shouldn't give users the incorrect impression that long-ago timekeeping was tidy.
On 11 August 2014 00:59, Paul Eggert <eggert@cs.ucla.edu> wrote:
What I find objectionable is to have a named entity (zone or link) where the LMT of the named location is replaced by the LMT of some other location.
This objection seems to be based on a misunderstanding of what the LMT entries were always supposed to mean. They never meant anything like "the time at this location was exactly 7 hours, 52 minutes and 58 seconds before GMT" (to use America/Los_Angeles as an example). Instead, they meant that the time was not closely specified: different people at that location would not have cared about minor differences in GMT offsets, or (if pushed to be specific and if knowledgeable about the topic) would even have disagreed about what the GMT offsets should be.
In hindsight the tz database format should have had a specific notation for this. But it doesn't, so we use LMT entries as stand-ins. Their exact GMT offsets do not matter. It's similar to the zzz notation that we use for places while uninhabited. I suppose one could argue that the LMT and zzz notations are both abuses of the format, but we needed *some* way to say that the local time was not closely specified or was undefined, and that's what we came up with.
This seems to be missing the point (again). It does not matter what the intention of LMT in the data format was. What does matter is what it has actually been used for! In reality, the LMT value is widely available as an actual, visible, usable value in applications in many different programming environments. By all means say that we should not be in this mess, but we are, and the proposal is an incredibly simple way to reduce the effects.

The concept that LMT somehow means "undefined" has been totally lost at this point. It is also an unhelpful answer, as the programming environments need *some* answer for time-zones in the far past, and right now LMT is the obvious one to pick up. Simply changing the notation from LMT to undefined would not stop the need that programming environments have for a far past value.

(By the way, if a programming environment wanted to choose a value for the far past that was not LMT, the chances are that they would choose a smoothed value, exactly as proposed, because that is more in line with the expectations of normal developers.)
The net effect of this would be to provide a much more regularized value for the far past (pre 1850 or later).
This makes it sound like the proposed notation would be more misleading than the current one, as it would suggest an even-more-regularized past than the current one does. We shouldn't give users the incorrect impression that long-ago timekeeping was tidy.
All values for the far past are misleading to some degree, "more misleading" is subjective. The proposal simply chooses a value that would be beneficial to the long term maintenance of the tzdb data, by replacing an unstable value by a stable value. That way, most of the link discussion goes away, for example because all of West Africa would then share the same smoothed LMT value.

Finally, I will note that for those users who already understand that LMT data means "undefined", the proposal makes no difference. Those users can carry on interpreting LMT in that way. However, for those users that do interpret LMT as an actual real offset to be used for the far past (the vast majority of programming environments) the proposal provides a simpler, clearer answer.

Stephen
Stephen Colebourne wrote:
All values for the far past are misleading to some degree, "more misleading" is subjective.
Sure, but all the same it would be weird for a user who does not know that LMT means "approximate placeholder" to see a transition like this:

    Zone Europe/Paris 0:15:00 - LMT 1891 Mar 15 0:01
                      0:09:21 - PMT 1911 Mar 11 0:01

for France's institution of a standard time zone in 1891. That would appear to be a transition from a GMT-based time zone to a let's-thumb-our-nose-at-the-British time zone, which is not at all what happened. In contrast, the (non)-transition that's currently in the database feels closer to what actually happened.

Yes, this is subjective, but the proposed change feels considerably more artificial and forced than what we've got.
On 11 August 2014 15:58, Paul Eggert <eggert@cs.ucla.edu> wrote:
All values for the far past are misleading to some degree, "more misleading" is subjective.
Sure, but all the same it would be weird for a user who does not know that LMT means "approximate placeholder" to see a transition like this:
    Zone Europe/Paris 0:15:00 - LMT 1891 Mar 15 0:01
                      0:09:21 - PMT 1911 Mar 11 0:01
for France's institution of a standard time zone in 1891. That would appear to be a transition from a GMT-based time zone to a let's-thumb-our-nose-at-the-British time zone, which is not at all what happened. In contrast, the (non)-transition that's currently in the database feels closer to what actually happened.
Yes, this is subjective, but the proposed change feels considerably more artificial and forced than what we've got.
The question is which is least bad: a smoothed regional value, or an exact lat/long one.

As I've said before, I don't overly mind keeping lat/long LMT, but if it is kept it needs to (a) be maintained and (b) have some reasonable meaning. That means an LMT value of some actual relevance to each significant zone in the database (where significant is at least one zone per ISO region). Pointing an LMT value across region boundaries is not OK. Ultimately, the effect of that is to require at least one real zone (not a link) for each ISO region, something others have requested on multiple occasions (exceptions could be made for nearly uninhabited places like Antarctica, but not for the Caribbean or Africa).

Smoothed LMT cross-region or lat/long LMT per region? I don't overly mind (although smoothed would help many downstream users when they do request time in the 1700s). However, LMT values that cross ISO regions through excessive linking are a conceptual mess.

Stephen
Stephen Colebourne wrote:
That means an LMT value of some actual relevance to each significant zone in the database (where significant is at least one zone per ISO region).
That's not common practice; we've long had links that cross national boundaries. The classic example is Europe/Vatican, but there are several others. Europe/Bratislava, for example, has been a link to Europe/Prague ever since it was introduced in 1993, and it's Slovakia's only representative. There isn't a real need for a separate LMT placeholder per country. They're just placeholders, after all.
Participants (2): Paul Eggert, Stephen Colebourne