Though it has been handy to have some pre-1970 data within the tz database, I don't see that it is the best solution for historical tz data in the long run. It is clear that, going back far enough in time, there is a different LMT for every 15 seconds of longitude. My own preference would be that historical (and perhaps all) tz data be given a numeric identifier, not an 'America/Someplace' name, for the populated areas in question. A good source of geographic identifiers already available is the geonames database. The tz data for Montreal could be identified by the geonames number for Montreal, and the tz data for Toronto associated with the geonames number for Toronto. Then, to build regions, since all geonames records already have a field for the tz region, these could reflect the numeric identifier of the tz recommended region for each location. For example, Ottawa in the geonames database would refer to the numeric identifier of Toronto instead of 'America/Toronto', until someone decides to add historical data for Ottawa's own tz history, at which point it would adopt the identifier for Ottawa.
David Patte ₯ wrote:
Though it has been handy to have some pre-1970 data within the tz database, I don't see that it is the best solution for historical tz data in the long run. It is clear that, going back far enough in time, there is a different LMT for every 15 seconds of longitude.
Only since 1884 ;) But even then, as Paul has said, time zones were not actually agreed on; only where the base UTC time was defined. The 'local time' was still synchronised to the sun overhead at noon, hence there was no 'zone'.

I think I need to elaborate on why some of this history IS important. Up until comparatively recently much historic material has been 'timestamped' by local time + location. This makes 'normalising' times for a genealogical comparison difficult, as one then has to look up information as to the time offset from UTC. Sound familiar? PHP has now added a nice 'timezone' facility which provides this lookup, and we can NOW store data as UTC timestamp + location, and it will provide a time display as either local time, UTC, or any other location for comparison with local events worldwide.

Since the PHP offsets are currently generated from TZ, the next question would be "how does one fix the currently missing data and flag where data IS speculation?" Tinkering with 'history' when that has already been used for storing normalized data is my own concern. I can probably pass the buck to Derick, who currently maintains the Date/Time classes, but he simply 'updates' from the provided data. Moving forward, is it going to be necessary to maintain a separate 'history' just to keep data from being corrupted? Adding a version stamp to a timezone so we know what was used?

I presume Stephen is doing the same thing in Java, so I am perhaps a little surprised if he is now back-pedalling on getting a common historic base here!

-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
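[The store-as-UTC-plus-location workflow described above can be sketched with Python's standard-library zoneinfo module, which is backed by the same tz data; this is an illustrative analogue of the PHP facility, with arbitrary example dates and zones:]

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # tzdb-backed zone lookup, Python 3.9+

# Store the event as a UTC timestamp plus a location (a zone ID).
event_utc = datetime(1920, 6, 15, 12, 0, tzinfo=timezone.utc)
location = "Europe/London"

# Render the same instant as local time at the stored location,
# or in any other zone for comparison with events elsewhere.
local = event_utc.astimezone(ZoneInfo(location))
elsewhere = event_utc.astimezone(ZoneInfo("America/Toronto"))

# All three objects denote the same instant; only the wall-clock
# representation differs.
print(event_utc.isoformat(), local.isoformat(), elsewhere.isoformat())
```

[Note that because the offsets come from the tz data, any revision of the pre-1970 entries changes what `local` displays for historical dates, which is exactly the data-corruption concern raised here.]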
On Aug 30, 2013, at 1:41 AM, Lester Caine <lester@lsces.co.uk> wrote:
David Patte ₯ wrote:
Though it has been handy to have some pre-1970 data within the tz database, I don't see that it is the best solution for historical tz data in the long run. It is clear that, going back far enough in time, there is a different LMT for every 15 seconds of longitude.
Only since 1884 ;) But even then, as Paul has said, time zones were not actually agreed on; only where the base UTC time was defined. The 'local time' was still synchronised to the sun overhead at noon, hence there was no 'zone'.
...and therefore entries prior to the establishment of time zones don't belong in the tzdb, given what the "z" in "tzdb" stands for. If we're obliged to leave them in the tzdb for backwards compatibility purposes, we should:

- accompany them with a disclaimer that they're not actually meaningful, for all the reasons discussed here (not all locations within a time zone necessarily had the same LMT value before the establishment of the zone; different people *in the same city* might have had clocks different from each other by significant amounts; etc.), explicitly stating that supporting conversion of times prior to the establishment of a standard time zone in a locale is out of scope for the tzdb;

- freeze them and devote no effort to updating them;

- not create any new tzdb zones if the only reason for the new zone is "before standard time was established, these two locations had different LMT".

Historical rules *subsequent to* time zone establishment, however, are arguably worth keeping and perhaps even updating, albeit perhaps with a disclaimer saying we can't guarantee historical accuracy and/or that they are subject to change due to additional historical information being found.
On 30 August 2013 21:10, Guy Harris <guy@alum.mit.edu> wrote:
If we're obliged to leave them in the tzdb for backwards compatibility purposes, we should:
accompany them with a disclaimer that they're not actually meaningful, for all the reasons discussed here (not all locations within a time zone necessarily had the same LMT value before the establishment of the zone; different people *in the same city* might have had clocks different from each other by significant amounts; etc.), explicitly stating that supporting conversion of times prior to the establishment of a standard time zone in a locale is out of scope for the tzdb;
+1
freeze them and devote no effort to updating them;
-1. So long as we have them, there should be a best efforts attempt to maintain them. Realistically, that only means that any new zone should have an appropriate LMT set. No more.
not create any new tzdb zones if the only reason for the new zone is "before standard time was established, these two locations had different LMT".
+1 (just to note that Rome vs Vatican was to fulfil the "full history in an ISO code" requirement I proposed, not to create unnecessary LMTs)
Historical rules *subsequent to* time zone establishment, however, are arguably worth keeping and perhaps even updating, albeit perhaps with a disclaimer saying we can't guarantee historical accuracy and/or that they are subject to change due to additional historical information being found.
+1. This is the far more important issue.

FWIW, I've started the process of removing LMT from Java JSR-310. Unfortunately, throwing an error for far-past values (back to the age of the universe) is not an option. I do understand the weirdness of these far-past times, but in the context of the API it is simply not acceptable to have date-times in the 1700s, for example, without any notion of offset from GMT.

https://github.com/ThreeTen/threeten/issues/332

Stephen
On Aug 30, 2013, at 2:07 PM, Stephen Colebourne <scolebourne@joda.org> wrote:
On 30 August 2013 21:10, Guy Harris <guy@alum.mit.edu> wrote:
If we're obliged to leave them in the tzdb for backwards compatibility purposes, we should:
...
freeze them and devote no effort to updating them;
-1. So long as we have them, there should be a best efforts attempt to maintain them. Realistically, that only means that any new zone should have an appropriate LMT set. No more.
-1. *Maybe* we should keep the existing LMT values, but I emphatically do *NOT* believe that we should provide LMT for new tzids; we should not do *anything* that encourages people to think Gondwanaland/Argo_City somehow corresponds specifically to Argo City, rather than to a zone within a particular ISO 3166-named entity the currently-most-populated (undisputed) city of which happens to be Argo City.
FWIW, I've started the process of removing LMT from Java JSR-310. Unfortunately, the option of throwing an error for far past values (back to the age of the universe) is not an option. I do understand the weirdness of these far past times, but in the context of the API it is simply not acceptable to have date-times in the 1700s for example without any notion of offset from GMT.
Then do it within the Java runtime library, perhaps using the tzdb for standard times and using Something Else prior to the availability of standard time at the location (perhaps just pretending that, prior to the establishment of standard time, the initial standard time offset was in place).
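[One way to read Guy's suggestion: when a timestamp precedes a zone's first recorded transition, return the earliest standard-time offset instead of an LMT value. A minimal sketch over a hypothetical transition table; the timestamps and offsets below are made up for illustration, not real tzdb data:]

```python
import bisect

# Hypothetical transition table for one zone: (UTC timestamp, UTC offset
# in seconds) pairs, sorted by time.  Values are illustrative only.
TRANSITIONS = [
    (-1693706400, 3600),   # first standard-time offset
    (-1680483600, 7200),
    (323830800, 7200),
]

def utc_offset(ts: int) -> int:
    """Return the UTC offset in effect at Unix time ts.

    For times before the first recorded transition, extrapolate the
    earliest standard-time offset backwards instead of reporting LMT.
    """
    i = bisect.bisect_right([t for t, _ in TRANSITIONS], ts)
    if i == 0:
        return TRANSITIONS[0][1]   # pre-standard-time: extrapolate back
    return TRANSITIONS[i - 1][1]
```

[The branch at `i == 0` is the whole policy decision: a runtime could equally raise an error there, or compute LMT from coordinates if it has them.]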
Guy Harris wrote:
not create any new tzdb zones if the only reason for the new zone is "before standard time was established, these two locations had different LMT".

I have no problem with that. I see no reason for LMT being returned when looking for a TZ offset, but there is well-documented TZ data prior to 1970 which should be retained and returned. The various APIs simply need to switch to an alternative when a date is prior to 'known timezone time', but we are probably going to argue over whether that date is 1884 or something later? Certainly some of the links provided have shown the politics involved, so nothing has changed there :)
Historical rules *subsequent to* time zone establishment, however, are arguably worth keeping and perhaps even updating, albeit perhaps with a disclaimer saying we can't guarantee historical accuracy and/or that they are subject to change due to additional historical information being found.

Exactly. There are notes as to the accuracy of some data, and that is all one can do. It's that material that we were fighting to maintain the independence of recently?
Personally I can see a future link-up here with the historic mapping fork on OpenStreetMap. The various changes over time to both the timezone bands and the daylight saving areas need to be documented and linked to the chronology, so that one can track the changes at any location. This is where I would expect LMT to be generated, since location is accurately defined, and while the Greenwich Meridian only came into being in 1884, there is perfect sense in 'mapping' times prior to that based on the offset of midday from it? This is also an historically correct fact once clocks were being used, and synchronising that with the later zero of UTC is just logical? Now WHEN was the sundial invented :)
Guy Harris wrote:
If we're obliged to leave them in the tzdb for backwards compatibility purposes, we should:
accompany them with a disclaimer that they're not actually meaningful
That sounds reasonable. Here's a draft of a disclaimer, along with a pointer to a discussion of how little we know even about solar time if we go back far enough, and if it weren't for those amazing Babylonian astronomers we'd know even less. I've pushed this into the experimental repository.
From 5e8489b16dfe4cf7493ad7a3578d90656236d310 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 30 Aug 2013 16:43:57 -0700
Subject: [PATCH] * Theory: Describe LMT better.

Following a suggestion by Guy Harris in
<http://mm.icann.org/pipermail/tz/2013-August/019706.html>.
---
 Theory | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/Theory b/Theory
index b4bd4c2..580b548 100644
--- a/Theory
+++ b/Theory
@@ -224,7 +224,23 @@ could misbehave if data were omitted for pre-1970 transitions.
 However, the database is not designed for and does not suffice for
 applications requiring accurate handling of all past times everywhere,
 as it would take far too much effort and guesswork to record all
-details of pre-1970 civil timekeeping.
+details of pre-1970 civil timekeeping.  The pre-1970 data in this
+database covers only a tiny sliver of how clocks actually behaved;
+the vast majority of the necessary information was lost or never
+recorded, and much of what little remains is fabricated.
+
+Local mean time (LMT) offsets are recorded in the database only
+because the format requires an offset.  They should not be considered
+meaningful, and should not prompt creation of zones merely because two
+locations differ in LMT.  Historically, not only did different
+locations in the same zone typically use different LMT offsets, often
+different people in the same location maintained mean-time clocks that
+differed significantly, and many people used solar or some other time
+instead of mean time.  As for leap seconds, we don't know the history
+of earth's rotation accurately enough to map SI seconds to historical
+solar time to more than about one-hour accuracy; see Stephenson FR
+(2003), Historical eclipses and Earth's rotation, A&G 44: 2.22-2.27
+<http://dx.doi.org/10.1046/j.1468-4004.2003.44222.x>.

 As noted in the README file, the tz database is not authoritative
 (particularly not for pre-1970 time stamps), and it surely has errors.
@@ -384,8 +400,11 @@ in decreasing order of importance:
 	name identifying each zone and append 'T', 'ST', etc. as before;
 	e.g. 'VLAST' for VLAdivostok Summer Time.

-	Use UTC (with time zone abbreviation "zzz") for locations while
-	uninhabited.  The "zzz" mnemonic is that these locations are,
+	Use 'LMT' for local mean time of locations before the
+	introduction of standard time; see "Scope of the tz database".
+
+	Use UTC (with time zone abbreviation 'zzz') for locations while
+	uninhabited.  The 'zzz' mnemonic is that these locations are,
 	in some sense, asleep.

 Application writers should note that these abbreviations are ambiguous
-- 
1.8.1.2
<<On Fri, 30 Aug 2013 17:36:37 -0700, Paul Eggert <eggert@cs.ucla.edu> said:
That sounds reasonable. Here's a draft of a disclaimer, along with a pointer to a discussion of how little we know even about solar time if we go back far enough, and if it weren't for those amazing Babylonian astronomers we'd know even less. I've pushed this into the experimental repository.
I like this change. I would also note that anyone who wants LMT for a particular zone can calculate it from the coordinates given in zone.tab. -GAWollman
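[As a sketch of that calculation: zone.tab's coordinate column uses ISO 6709 form (+DDMM+DDDMM or +DDMMSS+DDDMMSS), and mean solar time shifts one hour per 15 degrees of longitude, i.e. 240 seconds per degree. The parsing below is an illustrative assumption based on that documented layout:]

```python
import re

def lmt_offset_seconds(coords: str) -> float:
    """Compute an LMT offset (seconds east of Greenwich) from a
    zone.tab coordinate field in ISO 6709 form, e.g. '+4339-07923'
    or '+514030-0000731'.  One hour of LMT per 15 degrees east."""
    m = re.fullmatch(r'([+-]\d{4}|[+-]\d{6})([+-]\d{5}|[+-]\d{7})', coords)
    if not m:
        raise ValueError(f"bad coordinate field: {coords!r}")
    lon = m.group(2)
    sign = 1 if lon[0] == '+' else -1
    digits = lon[1:]
    if len(digits) == 5:                 # DDDMM
        deg, mins, secs = int(digits[:3]), int(digits[3:]), 0
    else:                                # DDDMMSS
        deg, mins, secs = int(digits[:3]), int(digits[3:5]), int(digits[5:])
    degrees = sign * (deg + mins / 60 + secs / 3600)
    return degrees * 240.0               # 86400 s / 360 deg = 240 s/deg

# A coordinate like Toronto's ('+4339-07923', i.e. 79 deg 23 min W)
# gives about -5h 17m 32s.
print(round(lmt_offset_seconds('+4339-07923')))
```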
Paul Eggert wrote:
Guy Harris wrote:
If we're obliged to leave them in the tzdb for backwards compatibility purposes, we should: accompany them with a disclaimer that they're not actually meaningful
That sounds reasonable. Here's a draft of a disclaimer, along with a pointer to a discussion of how little we know even about solar time if we go back far enough, and if it weren't for those amazing Babylonian astronomers we'd know even less. I've pushed this into the experimental repository.
Paul - I still object to using 1970 as some magic point at which data came good! Just because some legacy systems can't cope with 'negative times' is not a good basis. Yes, some material needs to be tagged with a level of uncertainty, and with genealogical timestamps that is normally the case, but much of the pre-1970 TIME ZONE information IS accurate? Even after 1970 there is a level of inaccuracy - mainly political? - so it's almost a case of providing a flag when something is disputed?

I do like this particular change, but it only needs to be an agreed standard for taking location information and creating a time offset? The 'equation' can then be used with any location just by feeding the coordinates in, so there is no need for a 'database'. In fact, for any existing 'location' in the database we should be able to access the TZ offset and the LMT offset, as both have a purpose. When someone says 'an hour after midday', the two times may well provide insight into some discrepancy!
On Aug 31, 2013, at 12:45 AM, Lester Caine <lester@lsces.co.uk> wrote:
I do like this particular change but in does only need to be an agreed standard for taking location information and creating a time offset?
To which particular change are you referring?
The 'equation' can then be used with any location just by feeding the coordinates in, so there is no need for a 'database'.
I assume you're not asserting here that a simple mathematical formula, with latitude and longitude as variables, that results in the offset of "standard time" from UTC for that location exists, given that no such formula can exist (note the word "simple" there - if it were a simple problem, the tz database wouldn't have been created in the first place).
In fact, for any existing 'location' in the database we should be able to access the TZ offset and the LMT offset, as both have a purpose.
What do you mean by "LMT offset"?
Guy Harris wrote:
On Aug 31, 2013, at 12:45 AM, Lester Caine <lester@lsces.co.uk> wrote:
I do like this particular change, but it only needs to be an agreed standard for taking location information and creating a time offset?

To which particular change are you referring?
Paul's inclusion of LMT offsets in the main database.
The 'equation' can then be used with any location just by feeding the coordinates in, so there is no need for a 'database'.

I assume you're not asserting here that a simple mathematical formula, with latitude and longitude as variables, that results in the offset of "standard time" from UTC for that location exists, given that no such formula can exist (note the word "simple" there - if it were a simple problem, the tz database wouldn't have been created in the first place).
The simple calculation is the difference between the time when the sun is directly overhead at Greenwich and at the location being looked at. An approximation only requires the longitude, but relating it to apparent (sundial) time needs a correction based on the date, or rather the time of year. http://en.wikipedia.org/wiki/Local_mean_time is nice and simple, and a nice set of calculations is on http://en.wikipedia.org/wiki/Equation_of_time. Any equation is an approximation, but as long as we all work from the same one then we will get the same result.
In fact, for any existing 'location' in the database we should be able to access the TZ offset and the LMT offset, as both have a purpose.

What do you mean by "LMT offset"?
As has been described. The LMT offset is based on longitude and varies smoothly around the planet, while time zones set a fixed offset for a large slice of the planet, whose boundaries may also vary based on latitude. Add in daylight saving, and we have the whole reason that the tz database exists, but LMT is totally independent of that.

Paul is simply using a 'snapshot' of the LMT values for zones where there is no alternative data, and the value stored in the database may not actually match that created by the longitude of the describing city, but rather a longitude related to the whole zone. And I am assuming that he is not bothering about the annual cycle in the calculation.
LMT at any longitude is a simple offset calculation from GMT. LMT = GMT + Longitude (as HMS) But Local Apparent Time (Sundial Time) is a far more complex formula based on the date in question and other factors.
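[Both calculations can be sketched in Python. The first function is the LMT = GMT + longitude formula above; the equation-of-time function uses a common textbook approximation (9.87 sin 2B - 7.53 cos B - 1.5 sin B, good to roughly a minute), which is an illustrative choice, not anything from the tz database:]

```python
import math

def lmt_from_gmt(gmt_hours: float, longitude_deg: float) -> float:
    """LMT = GMT + longitude/15: local mean time in hours,
    given GMT in hours and east longitude in degrees."""
    return gmt_hours + longitude_deg / 15.0

def equation_of_time_minutes(day_of_year: int) -> float:
    """Approximate equation of time (apparent minus mean solar time)
    in minutes, using a common rough formula good to about a minute."""
    b = math.radians(360.0 / 365.0 * (day_of_year - 81))
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)

def apparent_solar_time(gmt_hours: float, longitude_deg: float,
                        day_of_year: int) -> float:
    """Sundial time = local mean time + equation of time."""
    return lmt_from_gmt(gmt_hours, longitude_deg) \
        + equation_of_time_minutes(day_of_year) / 60.0
```

[The date dependence lives entirely in the equation-of-time term, which is why LMT itself needs no date, while sundial time does.]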
David Patte ₯ wrote:
LMT at any longitude is a simple offset calculation from GMT. LMT = GMT + Longitude (as HMS)
Yes, the base offset is a simple calculation. I got a bit carried away with Guy's assertion that there couldn't be one ;)
But Local Apparent Time (Sundial Time) is a far more complex formula based on the date in question and other factors.
Again, cart before horse. I'd been looking recently at time material based on sundial midday and looking to incorporate that into my own time management stuff. I was getting around to the point of asking if anybody has a preferred estimation they use, but decided this was not the right venue?
On Sat 2013-08-31T16:29:54 +0100, Lester Caine hath writ:
Again, cart before horse, I'd been looking recently at time material based on sundial mid day and looking to incorporate that into my own time management stuff. I was getting around to the point of asking if anybody has a preferred estimation they use but decided this was not the right venue?
Even if this were the right venue, choosing a boundary for the transition between local apparent time and local mean time is just as arbitrary as a boundary for adoption of standard time.

As a result of the widespread use of chronometers, the UK Admiralty dictated that 1833 was the boundary for the Nautical Almanac; but see the footnote added to the Future of UTC meeting 2 years ago
http://www.cacr.caltech.edu/futureofutc/2011/preprints/21_AAS_11-669_discuss...
At Mystic Seaport library, Frank Reed found logbooks with worked lunars where navigators got aberrant longitudes because they had not adapted to the change in the Almanac.

For non-navigational clocks of lesser quality we should expect that resetting them according to some noon mark was the practice until the local advent of telegraph and rail modified civic practice.

--
Steve Allen <sla@ucolick.org>                 WGS-84 (GPS)
UCO/Lick Observatory--ISB   Natural Sciences II, Room 165   Lat +36.99855
1156 High Street            Voice: +1 831 459 3046          Lng -122.06015
Santa Cruz, CA 95064        http://www.ucolick.org/~sla/    Hgt +250 m
Steve Allen wrote:
Even if this were the right venue, choosing a boundary for the transition between local apparent time and local mean time is just as arbitrary as a boundary for adoption of standard time.
LMT is just as valid today as in the past - there is no date to be considered, and 'solar midday' has not changed. Just as we have a few useful computer-oriented scripts for co-ordinate conversion, something similar for LMT would be nice :)

Guy - others seem to have understood exactly where I was coming from, and the reference to LMT offset should have been enough to clarify that we were talking about that element of the latest update. The point was that LMT is still valid in parallel with a TZ offset ... it's not a matter of choosing between them. It just depends on the context, which is why I was a little surprised Paul included it, but it does provide a valid backdrop.
On Aug 31, 2013, at 8:29 AM, Lester Caine <lester@lsces.co.uk> wrote:
David Patte ₯ wrote:
LMT at any longitude is a simple offset calculation from GMT. LMT = GMT + Longitude (as HMS)
Yes the base offset is a simple calculation. I got a bit carried away with Guy's assertion that there couldn't be one ;)
What I said was that there couldn't be a simple calculation for the offset of *standard time* from UTC:
I assume you're not asserting here that a simple mathematical formula, with latitude and longitude as variables, that results in the offset of "standard time" from UTC for that location exists, given that no such formula can exist (note the word "simple" there - if it were a simple problem, the tz database wouldn't have been created in the first place).
In your message, you didn't mention LMT until *after* you spoke of 1970 (which is a year chosen as a marker for the point in time at which the database maintainers say "after this point, we'll put more significant effort into trying to keep the data accurate", for reasons given by Paul) and spoke of "this particular change", which was a change that did more than just address the LMT issue. It might've been less confusing had you said "I do like *your LMT-related changes*", as that would have made it clearer that you were, in the second paragraph of your message, referring to LMT. (Your *first* paragraph wasn't discussing the LMT-related parts of Paul's change, as 1970 isn't a cutoff date of any sort for LMT.)

So, yes, I was aware that LMT had a simple formula; given that, I'd personally have preferred that it not be in the database, with dates prior to the "start time" in the database handled by whatever means the particular implementor chooses, whether it's "extrapolate the earliest standard-time offset infinitely backwards into the past" or "fail" or, if the particular function takes a longitude and latitude rather than a tzid as an argument, "calculate LMT".
Garrett Wollman wrote:
I would also note that anyone who wants LMT for a particular zone can calculate it from the coordinates given in zone.tab.
Thanks, the following patch tries to address that.

Lester Caine wrote:
Just because some legacy systems can't cope with 'negative times' is not a good basis.
True, but it's more than just that. It would be a huge amount of work to change the cutoff to be even (say) 1950, if you want the data to be reliable. It'd be a lot of work even if you don't care all that much about reliability. I doubt whether anybody's going to do this work any time soon. This is probably worth mentioning in the Theory file, though.

Thanks for everybody's comments; I've added the following to the draft.
From 28cf16f57596fe8285e087e40f79d855dcfad67d Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 31 Aug 2013 04:20:31 -0700
Subject: [PATCH] * Theory: Describe LMT, cutoffs, zone.tab, and time.tab better.

Following discussion by Lester Caine in
<http://mm.icann.org/pipermail/tz/2013-August/019716.html>
and Garrett Wollman in
<http://mm.icann.org/pipermail/tz/2013-August/019715.html>.
---
 Theory | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/Theory b/Theory
index 580b548..4081b24 100644
--- a/Theory
+++ b/Theory
@@ -228,6 +228,10 @@ details of pre-1970 civil timekeeping.  The pre-1970 data in this
 database covers only a tiny sliver of how clocks actually behaved;
 the vast majority of the necessary information was lost or never
 recorded, and much of what little remains is fabricated.
+Although 1970 is a somewhat-arbitrary cutoff, there are significant
+challenges to moving the cutoff back even by a decade or two, due to
+the wide variety of local practices before computer timekeeping
+became prevalent.

 Local mean time (LMT) offsets are recorded in the database only
 because the format requires an offset.  They should not be considered
@@ -235,8 +239,9 @@ meaningful, and should not prompt creation of zones merely because two
 locations differ in LMT.  Historically, not only did different
 locations in the same zone typically use different LMT offsets, often
 different people in the same location maintained mean-time clocks that
-differed significantly, and many people used solar or some other time
-instead of mean time.  As for leap seconds, we don't know the history
+differed significantly, many people used solar or some other time
+instead of mean time, and standard time often replaced LMT only gradually
+at each location.  As for leap seconds, we don't know the history
 of earth's rotation accurately enough to map SI seconds to historical
 solar time to more than about one-hour accuracy; see Stephenson FR
 (2003), Historical eclipses and Earth's rotation, A&G 44: 2.22-2.27
@@ -336,9 +341,16 @@ in decreasing order of importance:
 	If a name is changed, put its old spelling in the 'backward' file.
 	This means old spellings will continue to work.

-The file 'zone.tab' lists the geographical locations used to name
-time zone rule files.  It is intended to be an exhaustive list
-of names for geographic regions as described above.
+The file 'zone.tab' lists geographical locations used to name time
+zone rule files.  It is intended to be an exhaustive list of names
+for geographic regions as described above; this is a subset of the
+Zone entries in the data.  The file 'time.tab' is a simplified
+version of 'zone.tab', the intent being that entries are coalesced
+if their time stamps agree after 1970, which means the entries are
+distinct in 'zone.tab' only because of the abovementioned political
+constraints.  Although a 'zone.tab' location's longitude corresponds
+to its LMT offset with one hour for every 15 degrees east longitude,
+this relationship is not exact and is not true for 'time.tab'.

 Older versions of this package used a different naming scheme,
 and these older names are still supported.
-- 
1.8.1.2
On 31 August 2013 01:36, Paul Eggert <eggert@cs.ucla.edu> wrote:
Guy Harris wrote:
If we're obliged to leave them in the tzdb for backwards compatibility purposes, we should: accompany them with a disclaimer that they're not actually meaningful
That sounds reasonable. Here's a draft of a disclaimer, along with a pointer to a discussion of how little we know even about solar time if we go back far enough, and if it weren't for those amazing Babylonian astronomers we'd know even less. I've pushed this into the experimental repository.
+1. The change on LMT looks fine to me. Given its new status, I'm happy for LMT to not correspond to the city in the zone ID where a Link occurs. Stephen
My only input to this discussion: The TZ database is perhaps the most comprehensive source of historical timezone data in the world, and we (a) shouldn't lose data, and (b) should encourage updating of the database with historical data based on credible historical information. Whether we do this for computers is a fair question. But this leads me to ask a question: what problem are we trying to solve at this moment? What is broken that needs fixing? I seem to have lost the plot on that.

Eliot
Eliot Lear wrote:
what problem are we trying to solve at this moment? What is broken that needs fixing? I seem to have lost the plot on that.
As for the LMT issue, the problem there seems to be one of documenting better what 'LMT' means in the database, which I've been trying to do. This stuff has all been in my head for years, but it's not obvious to newcomers.
As for the changes that discarded some pre-1970 data, the underlying problem is the unnecessary proliferation of entries due to political constraints. These entries unnecessarily complicate users' lives, add to our maintenance burden, and prompt endless political disputes. We can't remove politically-motivated entries due to backward-compatibility concerns, but we should not proliferate them further, as this project should be about timekeeping, not politics.
P.S. I should mention that the politics are in the guidelines only because of my own mistake when I wrote them way back when. I intended to keep politics out as much as possible (which is partly why the Zone naming convention is as successful as it is), but I slipped up when documenting zone.tab.
Paul Eggert wrote:
As for the changes that discarded some pre-1970 data, the underlying problem is the unnecessary proliferation of entries due to political constraints.
While the politics are irritating, having a single repository for ALL available data should be a goal? We are still discussing two databases here, one with pre-1970 data and one without. As more people are involved nowadays, filling in gaps in the history may actually be possible, even if time-consuming; but if someone does find a new piece of information, how can that now be recorded? Is there going to be a bar on adding data simply because it's pre-1970? Do we need a second list where historic 'patches' can be logged and discussed, leaving this list for the much more continuous turnover of ongoing changes in new data?
I probably need to point out that, as far as I am concerned, the UK data is correct back to the creation of daylight saving in 1916, so I've always ignored the 1970 'limit' anyway, and I've not found any problems with other countries, which is why I was rather surprised at the suggestion there were. My own next area of interest is 'railway time', and adding an LMT base creates an interesting hook to compare that information. But a single entry for the whole of the UK is of little use, and you certainly will not want to add 'UK town offsets' :)
-- 
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Lester Caine wrote:
having a single repository for ALL available data should be a goal?
Sure, it's a worthy goal, even if pre-1970 timestamps are out of scope for the current database. We could collect all the data that we can for an extended database that contains new zones that differ from existing ones only for pre-1970 timestamps. We could then derive the current database by applying a filter to the extended database, along the lines that Zefram suggested. This filtering could be done automatically and at the source level, so existing tz source file readers would not need to be changed, and we wouldn't have to maintain two copies of the database.
As I understand it, Stephen wouldn't oppose the existence of a filter per se, but is uneasy about having the default filter set to 1970. But I'm afraid the filtering approach won't work unless we continue to filter at 1970 as we have regularly done in the past. Too much existing practice is based on the 1970 cutoff, and (as now explained in the Theory file) the 1970 cutoff is not really that arbitrary -- rather, it corresponds roughly with the advent of computerized timekeeping and of a greater need for standardized civil time.
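The source-level filtering described here can be sketched in miniature. The code below is a hypothetical simplification of my own, not the zic source format or any proposed tool: a zone's history is modeled as a list of (offset, until_year) periods, and the filter drops every period that ended on or before the cutoff, leaving the period already in effect at the cutoff as the zone's new starting point.

```python
# Hypothetical sketch of a cutoff filter over a zone's history.
# A zone's history is an ordered list of (utc_offset, until_year) periods;
# the final period has until_year=None, meaning it runs to the present.

def filter_zone(periods, cutoff=1970):
    """Drop periods that ended on or before the cutoff year.

    A period with until_year > cutoff (or None) was in effect at or after
    the cutoff, so it survives; everything earlier is filtered out.
    """
    kept = []
    for offset, until in periods:
        if until is not None and until <= cutoff:
            continue  # period lies entirely before the cutoff: drop it
        kept.append((offset, until))
    return kept

# Toy history: LMT until 1880, standard time until 1968, then a new offset.
history = [("-0:01:15", 1880), ("+0:00:00", 1968), ("+1:00:00", None)]
print(filter_zone(history))  # only the period in effect since 1968 survives
```

A real filter over zic source would also have to rewrite the first surviving line's starting point and handle Rule references, but the shape of the operation is the same.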
I probably need to point out that as far as I am concerned, the UK data is correct back to the creation of daylight saving back in 1916
I think you're right, but the UK is quite a special case: the tz database relies on years of first-class work done by Joseph Myers and others. Other countries are not done nearly as well. And even the UK entries, which are the best we have, don't cover all the history of standard time in the UK, much less pre-standard time. For more about this, please see Myers's nice summary <http://www.polyomino.org.uk/british-time/>.
Paul Eggert wrote:
And even the UK entries, which are the best we have, don't cover all the history of standard time in the UK, much less pre-standard time. For more about this, please see Myers's nice summary <http://www.polyomino.org.uk/british-time/>.
That includes the fine detail on why the Isle of Man is very slightly different to London time prior to 1968, but does highlight some material that has still to be verified. However it ALSO highlights that things are progressing nicely while documents like this lie dormant! For example, the first missing document is now online, along with a number of additional IOM documents. http://www.legislation.gov.im/cms/images/LEGISLATION/PRINCIPAL/1883/1883-000... a year ahead of the world game but still behind developments in the UK :)
While I accept that a lot of historic material has been lost, the digitization of many government archives is finding a lot of information that WAS thought to be lost. Until the question is asked we can't definitively say material is no longer available, and this forum is the ideal location to consolidate specifically time-related material?
On 31 August 2013 22:06, Paul Eggert <eggert@cs.ucla.edu> wrote:
Lester Caine wrote:
having a single repository for ALL available data should be a goal?
Sure, it's a worthy goal, even if pre-1970 timestamps are out of scope for the current database. We could collect all the data that we can for an extended database that contains new zones that differ from existing ones only for pre-1970 timestamps. We could then derive the current database by applying a filter to the extended database, along the lines that Zefram suggested. This filtering could be done automatically and at the source level, so existing tz source file readers would not need to be changed, and we wouldn't have to maintain two copies of the database.
I can see no reason why an optional additional file "extended" could not exist for new zone IDs that only exist to record history before 1970. Such additional zones would not appear in zone.tab. Most people would just ignore the "extended" file. However, I would argue that any zone ID that already exists (or is newly created) should have its full pre-1970 history retained and enhanced within the main tzdb files, so all current consumers simply pickup the enhancements.
As I understand it, Stephen wouldn't oppose the existence of a filter per se, but is uneasy about having the default filter set to 1970. But I'm afraid the filtering approach won't work unless we continue to filter at 1970 as we have regularly done in the past. Too much existing practice is based on the 1970 cutoff, and (as now explained in the Theory file) the 1970 cutoff is not really that arbitrary -- rather, it corresponds roughly with the advent of computerized timekeeping and of a greater need for standardized civil time.
Filtering data applies to database consumers via zic, not the database itself. The database itself should not be limited to storing data after 1970. If someone makes a contribution to an existing ID before 1970, that data should be included in the main files. Whether that contribution causes a Link to become a full Zone should never be a relevant factor.
It might well be that this is slightly more broad than previously, but it is not overly onerous. There are relatively few Links in the tzdb that exist to share data (as opposed to simple renames). Only if someone researches the history of one of those locations is there a need to convert the Link to a Zone.
i.e. the notion that "it's before 1970 so we don't care" needs to be toned down a little. Still a useful guide, but if real data is available pre-1970, it is only responsible to retain it.
Final note. LMT can now be safely ignored in this discussion.
thanks
Stephen
On Aug 31, 2013, at 4:39 PM, Stephen Colebourne <scolebourne@joda.org> wrote:
On 31 August 2013 22:06, Paul Eggert <eggert@cs.ucla.edu> wrote:
Lester Caine wrote:
having a single repository for ALL available data should be a goal?
Sure, it's a worthy goal, even if pre-1970 timestamps are out of scope for the current database. We could collect all the data that we can for an extended database that contains new zones that differ from existing ones only for pre-1970 timestamps. We could then derive the current database by applying a filter to the extended database, along the lines that Zefram suggested. This filtering could be done automatically and at the source level, so existing tz source file readers would not need to be changed, and we wouldn't have to maintain two copies of the database.
I can see no reason why an optional additional file "extended" could not exist for new zone IDs that only exist to record history before 1970. Such additional zones would not appear in zone.tab. Most people would just ignore the "extended" file.
However, I would argue that any zone ID that already exists (or is newly created) should have its full pre-1970 history retained and enhanced within the main tzdb files, so all current consumers simply pick up the enhancements.
I.e., grandfather in *existing* tzids that exist only to record history before 1970, leaving them in the non-"extended" part of the database, but don't add *new* such tzids to the non-"extended" part of the database?
Filtering data applies to database consumers via zic, not the database itself. The database itself should not be limited to storing data after 1970. If someone makes a contribution to an existing ID before 1970, that data should be included in the main files. Whether that contribution causes a Link to become a full Zone should never be a relevant factor.
I can see at least two ways of handling this:
1) have the database include the full collection of tzids wherein, if two regions had different standardized time at any point, they get separate tzids; have a winnowing process that can be instructed to winnow out regions that had the same standardized time subsequent to some cutoff date (e.g., January 1, 1970); and allow consumers of the database to do whatever winnowing they choose, including none;
2) have an "extended" database for any *new* cases where we split a region due to pre-1970 differences, preserve the existing such splits, and not have a winnowing process, just a choice, on the part of consumers of the database, to use the "extended" file or not.
("Consumer" here doesn't mean "end user"; I suspect most end users don't know enough to care and may not even want to care - they're busy dealing with other problems. "Consumer" here refers to software that packages the database and uses it.)
Both of those preserve all the historical information we have, and leave it up to the consumer to decide whether it cares at all about pre-1970 differences or not. The first doesn't let the consumer (again, as defined above) choose to include some but not all zone splits due to pre-1970 differences (in particular, it doesn't let the consumer preserve the existing splits); the latter doesn't let the consumer choose to include none of them.
The first suggestion is what I think Zefram was suggesting; the second suggestion sounds like what you're suggesting. Either suggestion would allow full support for pre-1970 standardized times (to the extent that any given release of the database has information about those times) to be provided by a consumer.
The first one allows a consumer to winnow out all tzids that are not of interest if they don't view supporting conversion of pre-1970 times as something they want to put significant effort into supporting (I suspect that a number of UN*Xes might choose that option), reducing the number of options to present to whoever's configuring the system to use a particular tzid. The second one doesn't require choosing between tossing out all splits due to pre-1970 differences and tossing out none of them.
(What reasons, if any, exist for removing tzids that exist due to differences in pre-1970 standardized time? Is that intended to reduce the number of zones to present to someone configuring the system if your system doesn't make a vigorous effort to support those times, and to reduce the effort needed to maintain them? Is it intended to be consistent with a rule forbidding adding *new* tzids due to differences in pre-1970 standardized time? Is it intended to reduce political disputes over tzids - presumably by reducing the number of tzids and thus the number of possible points of complaint, as there's nothing particularly politically interesting about January 1, 1970, as far as I can tell? I went back through the archive and, if the controversial changes are the ones from the "Move links to 'backward' if they exist only because of country codes." message, I'm probably missing something, as that doesn't appear to be discarding existing pre-1970 splits, at least from the description.)
Guy Harris wrote:
The first suggestion is what I think Zefram was suggesting; the second suggestion sounds like what you're suggesting.
Yes, that sounds about right.
Is that intended to reduce the number of zones to present to someone configuring the system if your system doesn't make a vigorous effort to support those times, and to reduce the effort needed to maintain them?
Yes to both. The extra zones are a burden to users and to maintainers, and for most users their cost exceeds their benefit.
Is it intended to be consistent with a rule forbidding adding *new* tzids due to differences in pre-1970 standardized time?
Yes. That rule's from "Theory".
Is it intended to reduce political disputes over tzids - presumably by reducing the number of tzids and thus the number of possible points of complaint,
Yes, that's part of it. But more important, it reduces the number of zones put in only because a location is politically "special" in some way.
I went back through the archive and, if the controversial changes are the ones from the "Move links to 'backward' if they exist only because of country codes." message, I'm probably missing something, as that doesn't appear to be discarding existing pre-1970 splits, at least from the description.)
You're correct, those changes don't discard existing pre-1970 splits.
On 1 September 2013 02:18, Guy Harris <guy@alum.mit.edu> wrote:
I.e., grandfather in *existing* tzids that exist only to record history before 1970, leaving them in the non-"extended" part of the database, but don't add *new* such tzids to the non-"extended" part of the database?
Yes. Currently, no one is working on such extended data, so there is no issue. The "extended" file is a simple option to keep any additional data separate. It would also be possible for such data to be in a completely separate project, sharing the same file format, and that may be more acceptable to some.
Note that I do not care about pre-1970 data for locations that are not currently in the tzdb (before the recent destruction). I do care about the pre-1970 data of existing IDs, but only to the extent that they are maintained or enhanced and not deleted. Others may care differently, but I believe my needs are in line with Theory and long-standing practice.
I went back through the archive and, if the controversial changes are the ones from the "Move links to 'backward' if they exist only because of country codes." message, I'm probably missing something, as that doesn't appear to be discarding existing pre-1970 splits, at least from the description.)
There are two real issues in the threads:
(1) what to do about pre-1970 data in general (where I object to its deletion as part of merging IDs)
(2) the removal of the 16-year-old rule wrt one zone per ISO-3166 code
The "Move links" stuff is about (2), not (1).
thanks
Stephen
Stephen Colebourne wrote:
There are two real issues in the threads - (1) what to do about pre-1970 data in general (where I object to its deletion as part of merging IDs)
We appear to have consensus that this data should not be deleted, so this no longer appears to be an active issue.
(2) the removal of the 16 year old rule wrt one zone per ISO-3166 code The "Move links" stuff is about (2), not (1).
The "Move links" stuff did not change any zones, so it would not be affected by a "one zone per ISO-3166 code" guideline. And as I mentioned in a previous email, the older guideline was not "one zone per ISO-3166 code".
Stephen Colebourne wrote:
ie. the notion that "its before 1970 so we don't care" needs to be toned down a little. Still a useful guide, but if real data is available pre-1970, it is only responsible to keep and retain it.
That is probably all I am asking for as well ... But exactly how new pre-1970 material is managed is still a question mark?
Final note. LMT can now be safely ignored in this discussion.
Agreed, although I would make a request for some support in code rather than the database. Again, if we have a common set of calculations we can at least be sure we are using the same figures?
On Sep 1, 2013, at 1:09 AM, Lester Caine <lester@lsces.co.uk> wrote:
Stephen Colebourne wrote:
Final note. LMT can now be safely ignored in this discussion.
Agreed, although I would make a request for some support in code rather than the database.
So you'd like the tz code to include a routine that takes, as arguments, a longitude value and a seconds-since-the-Epoch time value, and converts the time to LMT (independent, of course, of the current time zone), so that people who want LMT rather than local standardized time for a particular location on Earth would call that routine, supplying it the longitude of that location, rather than calling localtime()? And, similarly, supply a routine that takes a "struct tm" and a longitude value and returns a seconds-since-the-Epoch value, so that people who want to convert an LMT date/time value for a particular location on Earth, rather than a local standardized time date/time value for a particular tzdb zone, to seconds-since-the-Epoch would call that routine, supplying it the longitude for that location, rather than calling mktime()?
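For reference, such a pair of routines is a thin wrapper over the existing UTC conversions: mean solar time leads UTC by four minutes of time per degree of east longitude, i.e. 240 seconds per degree. A minimal Python sketch follows; the function names are mine, not any proposed tzcode API, and the offset is rounded to a whole second as the tzdata Zone lines do.

```python
import calendar
import time

SECONDS_PER_DEGREE = 240  # 24 h / 360 deg = 4 min of time per degree

def lmt_time(epoch_seconds, longitude_deg):
    """localtime() analogue: broken-down local mean time at a longitude.

    East longitudes are positive, so LMT runs ahead of UTC there.
    """
    return time.gmtime(epoch_seconds + round(longitude_deg * SECONDS_PER_DEGREE))

def lmt_mktime(tm, longitude_deg):
    """mktime() analogue: LMT broken-down time back to epoch seconds."""
    return calendar.timegm(tm) - round(longitude_deg * SECONDS_PER_DEGREE)

# 15 degrees east is exactly one hour ahead of UTC in mean solar time.
print(lmt_time(0, 15.0).tm_hour)  # prints 1
```

The two functions are inverses by construction, since both apply the same rounded offset.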
On 1 September 2013 09:32, Guy Harris <guy@alum.mit.edu> wrote:
So you'd like the tz code to include a routine that takes, as an argument, a longitude value and a seconds-since-the-Epoch time value, and converts it to LMT (independent, of course, of the current time zone), so that people who want LMT rather than local standardized time for a particular location on Earth would call that routine, supplying it the longitude of that location, rather than calling localtime(), and, similarly, supply a routine that takes a "struct tm" and a longitude value and returns a seconds-since-the-Epoch value, so that people who want to convert an LMT date/time value for a particular location on Earth, rather than a local standardized time date/time value for a particular tzdb zone, to seconds-since-the-Epoch would call that routine, supplying it the longitude for that location, rather than calling mktime()?
Such a routine would be useful to some, though no longer to me because I've decided to eliminate LMT for the data I use. thanks Stephen
Guy Harris wrote:
Stephen Colebourne wrote:
Final note. LMT can now be safely ignored in this discussion.
Agreed, although I would make a request for some support in code rather than the database.
So you'd like the tz code to include a routine that takes, as an argument, a longitude value and a seconds-since-the-Epoch time value, and converts it to LMT (independent, of course, of the current time zone), so that people who want LMT rather than local standardized time for a particular location on Earth would call that routine, supplying it the longitude of that location, rather than calling localtime(), and, similarly, supply a routine that takes a "struct tm" and a longitude value and returns a seconds-since-the-Epoch value, so that people who want to convert an LMT date/time value for a particular location on Earth, rather than a local standardized time date/time value for a particular tzdb zone, to seconds-since-the-Epoch would call that routine, supplying it the longitude for that location, rather than calling mktime()?
That is the simple bit: longitude to an offset from UTC, with an option for a 15-degree-zone version, and yes, I know it's pig simple, but a function just completes the jigsaw. If a search on the database does not return a 'hit', we default to an LMT zone (15-degree slot), or a simple LMT offset.
However, the one I was really thinking about was the 'equation of time' calculation in a format we can use as a 'standard'. Not sure Paul will see the necessity, but it's one of those 'useful functions' that is very much time related. The main problem here is that the accuracy of results is going to depend on the compiler/language used, so there is not going to be a 'definitive' answer?
These are more related to pre-1970 activity, so many people are probably not interested, but I am sure there is a 'demand' for a reference set of material for this aspect of time?
Lester Caine wrote:
That is the simple bit, Longitude to an offset from UTC
Perl script attached for your convenience.
However the one I was really thinking about was the 'equation of time' calculation in a format we can use as a 'standard'.
For what purpose do you expect to use *apparent* solar time? If you really want it, look at astronomical sources. That's where the demand exists for formulae for this and many other phenomena that are tangentially related to time. -zefram
Zefram wrote:
That is the simple bit, Longitude to an offset from UTC
Perl script attached for your convenience.
My own use is with PHP timestamps :) ... but you have already made assumptions in that script, which is the point here. Silly little things like: is 7.5 degrees included in the first or second zone of a simple grid? Academic, but I've seen too many cases where 'that will not be a problem' comes up in software, only to find later that one system is working one way and another system a different way. 'Rounding' is the bane of many of these things, and often it's that very rounding that leads to the problems. Uncertainty needs to be managed.
However the one I was really thinking about was the 'equation of time' calculation in a format we can use as a 'standard'.
For what purpose do you expect to use *apparent* solar time? If you really want it, look at astronomical sources. That's where the demand exists for formulae for this and many other phenomena that are tangentially related to time.
As I said - this only really relates to historic material. The astronomical community has methods of working which perhaps we have to live with, but a simpler view of things would suffice for historical comparisons, when converting documents that specifically use the sun as their reference. One can probably say 'Sod it, xxx will do', but given the fine-tooth comb that is now being applied to this data, a standard consistent with the other time standards we are using is essential. This is why the current statement that the tz database is 'invalid prior to 1970' causes such a problem!
Lester Caine wrote:
but you have already made assumptions in that script which is the point here. Silly little things like is 7.5 degrees included in the first or second zone of a simple grid.
That's not local mean time, as we understand it. If you want the strict hour-interval time zones (which is what gives you 7.5 degree edge cases), round the LMT offset to the nearest hour. -zefram
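The arbitrariness of the decision is easy to demonstrate. At longitude 7.5°E the LMT offset is exactly half an hour, and "round to the nearest hour" is then ambiguous: Python's built-in round() uses round-half-to-even, while the naive floor(x + 0.5) rounds halves up, so the two conventions assign the 7.5° meridian to different hour zones. (This is an illustration of my own, not anything in tzcode.)

```python
import math

def offset_hours(longitude_deg):
    """Mean-solar-time offset from UTC, in hours: 15 degrees per hour."""
    return longitude_deg / 15.0

h = offset_hours(7.5)  # exactly 0.5 hours: the ambiguous edge case

round_half_even = round(h)           # Python 3 rounds 0.5 to the even int: 0
round_half_up = math.floor(h + 0.5)  # naive half-up rounding: 1

print(round_half_even, round_half_up)  # prints: 0 1
```

Same input, two "correct" answers: exactly the kind of silent divergence between systems that the discussion above is worried about.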
Zefram wrote:
but you have already made
assumptions in that script which is the point here. Silly little things like is 7.5 degrees included in the first or second zone of a simple grid.
That's not local mean time, as we understand it. If you want the strict hour-interval time zones (which is what gives you 7.5 degree edge cases), round the LMT offset to the nearest hour.
That is just a different edge case ;) Is the half hour rounded up or down? I've been working on cross-database compatibility for a long time; I presented papers on that back in the early '80s. You put exactly the same data into two different databases and apply the same SQL scripts, and they can produce vastly different results: conversions reducing resolution in data, such as the longitude fine detail, and so on. Producing formulas that give a consistent result is not quite so simple as you are assuming. Even following the available standards still produces inconsistent results.
Lester Caine wrote:
Is the half hour rounded up or down.
You need to make an arbitrary decision, and sure, for some applications you need to make sure that all parties make that decision the same way. But that's totally out of scope for the tz database; it's not our place to specify a canonical choice of rounding mode. -zefram
Zefram wrote:
Is the half hour rounded up or down.
You need to make an arbitrary decision, and sure, for some applications you need to make sure that all parties make that decision the same way. But that's totally out of scope for the tz database; it's not our place to specify a canonical choice of rounding mode.
The database - yes - but we are talking about the backup material that is used when the database fails. Pre-1970 for example ;)
On Sep 1, 2013, at 6:40 AM, Lester Caine <lester@lsces.co.uk> wrote:
Zefram wrote:
Is the half hour rounded up or down.
You need to make an arbitrary decision, and sure, for some applications you need to make sure that all parties make that decision the same way. But that's totally out of scope for the tz database; it's not our place to specify a canonical choice of rounding mode.
The database - yes - but we are talking about the backup material that is used when the database fails. Pre-1970 for example ;)
What do you mean by "when the database fails"? You presumably don't mean "when the database contains no data", as that is *NOT* uniformly true of pre-1970 times. "The database" is the text files, in zic format, in the tzdata collections, so the database *does* contain data for pre-1970 times; however, we make weaker claims about its accuracy and completeness.
One thing we definitely do not do, and should not do, in the tzcode reference implementation is map all times prior to 1970 to LMT, so it's not as if "pre-1970" means "LMT".
If you mean "when the time being converted is prior to the introduction of standardized time", the only thing we provide is the first "Zone" line, which currently has what is, I guess, an LMT value for some location within the city used in the tzid for the zone in question - and that's "LMT rounded to a *one-second* boundary", e.g.:
Zone America/New_York -4:56:02 - LMT 1883 Nov 18 12:03:58
As far as *I'm* concerned, anything having to do with non-standardized time, such as LMT and local apparent time, is, and should always be, out of scope. People who need LMT, or local apparent time, can, and must, calculate it themselves.
Guy Harris wrote:
On Sep 1, 2013, at 6:40 AM, Lester Caine<lester@lsces.co.uk> wrote:
Zefram wrote:
Is the half hour rounded up or down.
You need to make an arbitrary decision, and sure, for some applications you need to make sure that all parties make that decision the same way. But that's totally out of scope for the tz database; it's not our place to specify a canonical choice of rounding mode.
The database - yes - but we are talking about the backup material that is used when the database fails. Pre-1970 for example ;)
What do you mean by "when the database fails"? You presumably don't mean "when the database contains no data", as that is *NOT* uniformly true of pre-1970 times. "The database" is the text files, in zic format, in the tzdata collections, so the database *does* contain data for pre-1970 times; however, we make weaker claims about its accuracy and completeness.
One thing we definitely do not do, and should not do, in the tzcode reference implementation is map all times prior to 1970 to LMT, so it's not as if "pre-1970" means "LMT".
If you mean "when the time being converted is prior to the introduction of standardized time", the only thing we provide is the first "Zone" line, which currently has what is, I guess, an LMT value for some location within the city used in the tzid for the zone in question - and that's "LMT rounded to a *one-second* boundary", e.g.:
Zone America/New_York -4:56:02 - LMT 1883 Nov 18 12:03:58
As far as *I'm* concerned, anything having to do with non-standardized time, such as LMT and local apparent time, is, and should always be, out of scope. People who need LMT, or local apparent time, can, and must, calculate it themselves.
My problem with that statement is ensuring that the 'calculations' match those used to create the values IN the database. 'Guessing' how the initial time was generated is not an acceptable way of going, since you can't then compare accurately between a 'guessed' time and a calculated one. As a minimum, where a value has come from needs to be documented, and essentially that uses the equation I'm looking for, and a defined location. Following on from that, indicating that some amendment from this base is in place and can be trusted would be useful.
I'm more than happy that the data returned for the UK is all correct, and I can adjust the Isle of Man record if I need to, but how do I assess the accuracy of other historic data? The only thing I am told is correct is post-1970 information.
Lester Caine wrote:
how do I assess the accuracy of other historic data?
This can only be done case-by-case, by examining the extensive commentary in the source files. It would be nice to have formalised quality metadata that could be used programmatically, but it would be an awfully big job to mark up the existing data to get that system going. Much easier to keep it up to date thereafter, though. Standardising the rounding mode that we use to compute LMT offsets won't help with this. -zefram
Zefram wrote:
This can only be done case-by-case, by examining the extensive commentary in the source files.
... and for pre-1970 data most of the commentary will boil down to "see Shanks", which has been demonstrated to be unreliable when we've checked it. There is little reason to trust that data. Shanks gives no sources -- zero -- for any of his data. As a consequence, not only are the LMT offsets unreliable, for most current entries the transition dates from LMT to standard time are unreliable. Even if we were to create an extended file that distinguishes between zones based on pre-1970 data, we shouldn't create an extended-file zone merely because it transitions away from LMT at a differing date from an existing zone, because those transition dates are typically no more reliable than the LMT offset is.
On Sep 1, 2013, at 12:27 PM, Lester Caine <lester@lsces.co.uk> wrote:
Guy Harris wrote:
As far as *I'm* concerned, anything having to do with non-standardized time, such as LMT and local apparent time, is, and should always be, out of scope. People who need LMT, or local apparent time, can, and must, calculate it themselves.
My problem with that statement is ensuring that the 'calculations' match with those used to create the values IN the database.
I assume "the values in the database" are the values in Zone lines such as

    Zone America/New_York -4:56:02 - LMT 1883 Nov 18 12:03:58

as those are the *only* values that can be calculated with a simple formula. All the other values come from records of what governing bodies have decided to specify as standardized time; there's no formula to calculate or predict *that*. :-) If we're going to continue to even *have* lines giving local mean solar time in the database, and want to document where the values used come from, what we would do is put in the Theory file the formula used to calculate LMT, which is almost certainly going to be the one given by David Patte - LMT at any longitude is a simple offset calculation from GMT:

    LMT = GMT + Longitude (as HMS)

which, of course, does not at all involve the equation of time, as it's *mean* solar time, not *apparent* solar time; and possibly put in the comments for each Zone entry with an LMT line an indication of the longitude used to calculate the LMT offset for that Zone entry, although *that* can easily be calculated from the offset from GMT.
As a minimum where a value has come from needs to be documented, and essentially that uses the equation I'm looking for,
Look in David Patte's e-mail.
and a defined location.
Convert the offset in clock hours/minutes/seconds to an offset in degrees/minutes/seconds (15 degrees of longitude = 1 hour).
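The arithmetic described above can be sketched in a few lines, in both directions: 15 degrees of longitude per hour of offset, i.e. 240 seconds of clock time per degree. This is only an illustration; the function names are mine, and the example longitude (roughly New York City's) is an assumption chosen to reproduce the -4:56:02 offset in the America/New_York Zone line.

```python
# Hedged sketch of the longitude <-> LMT-offset conversion:
# 15 degrees of longitude = 1 hour, so 240 seconds of time per degree.

def lmt_offset_seconds(longitude_deg: float) -> int:
    """LMT minus GMT, in seconds (east of Greenwich positive)."""
    return round(longitude_deg * 240)

def offset_to_longitude(offset_seconds: int) -> float:
    """Degrees of longitude implied by an LMT offset (east positive)."""
    return offset_seconds / 240.0

def format_offset(seconds: int) -> str:
    """Render an offset the way a Zone line does, e.g. -4:56:02."""
    sign = "-" if seconds < 0 else ""
    s = abs(seconds)
    return f"{sign}{s // 3600}:{(s % 3600) // 60:02d}:{s % 60:02d}"

# Roughly New York City's longitude (assumed for illustration):
print(format_offset(lmt_offset_seconds(-74.00833333)))           # -4:56:02
# And back again, from the Zone line's -4:56:02 offset:
print(round(offset_to_longitude(-(4 * 3600 + 56 * 60 + 2)), 4))  # -74.0083
```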
Following on from that, indicating that some amendment from this base is in place and can be trusted would be useful.
To what sort of amendments are you referring? LMT is LMT, i.e. "1 hour for every 15 degrees of longitude from the prime meridian, and similar treatment of minutes and seconds of longitude". There's nothing to amend there.
I'm more than happy that the data returned for the UK is all correct, and I can adjust the Isle of Man record if I need to, but how do I assess the accuracy of other historic data?
LMT isn't historic data, it's calculated data. (No, we're not going to take into account pre-1970 changes of day length, etc..) To assess the accuracy of the tzdb's historic data about *standardized* times, you're going to have to, err, umm, dig through historical records and see whether they record what time zones were established, what offsets from GMT/UTC were established for them, what changes (if any) were made to those offsets over time, what daylight savings time rules were in effect at what points in time, and whether that agrees with what's in the tzdb.
The only thing I am told is correct is post-1970 information.
We don't provide an absolutely certain guarantee of *that* - we might get informed at some point that East Erewhon briefly introduced Daylight Savings Time in 1974, and have to update the Zone entry for its zone to include lines for that - but we're more confident of the post-1970 information. What the top-of-trunk Theory file says, in the "scope of the tz database" section, right now is:

    The tz database attempts to record the history and predicted future of all computer-based clocks that track civil time. To represent this data, the world is partitioned into regions whose clocks all agree about time stamps that occur after the somewhat-arbitrary cutoff point of the POSIX Epoch (1970-01-01 00:00:00 UTC). For each such region, the database records all known clock transitions, and labels the region with a notable location.

    Clock transitions before 1970 are recorded for each such location, because most POSIX-compatible systems support negative time stamps and could misbehave if data were omitted for pre-1970 transitions. However, the database is not designed for and does not suffice for applications requiring accurate handling of all past times everywhere, as it would take far too much effort and guesswork to record all details of pre-1970 civil timekeeping. The pre-1970 data in this database covers only a tiny sliver of how clocks actually behaved; the vast majority of the necessary information was lost or never recorded, and much of what little remains is fabricated. Although 1970 is a somewhat-arbitrary cutoff, there are significant challenges to moving the cutoff back even by a decade or two, due to the wide variety of local practices before computer timekeeping became prevalent.

    Local mean time (LMT) offsets are recorded in the database only because the format requires an offset. They should not be considered meaningful, and should not prompt creation of zones merely because two locations differ in LMT.
    Historically, not only did different locations in the same zone typically use different LMT offsets, often different people in the same location maintained mean-time clocks that differed significantly, many people used solar or some other time instead of mean time, and standard time often replaced LMT only gradually at each location.

    As for leap seconds, we don't know the history of earth's rotation accurately enough to map SI seconds to historical solar time to more than about one-hour accuracy; see Stephenson FR (2003), Historical eclipses and Earth's rotation, A&G 44: 2.22-2.27 <http://dx.doi.org/10.1046/j.1468-4004.2003.44222.x>.

    As noted in the README file, the tz database is not authoritative (particularly not for pre-1970 time stamps), and it surely has errors. Corrections are welcome and encouraged. Users requiring authoritative data should consult national standards bodies and the references cited in the database's comments.

What it said as of about three years ago was:

    The tz database attempts to record the history and predicted future of all computer-based clocks that track civil time. To represent this data, the world is partitioned into regions whose clocks all agree about time stamps that occur after the somewhat-arbitrary cutoff point of the POSIX Epoch (1970-01-01 00:00:00 UTC). For each such region, the database records all known clock transitions, and labels the region with a notable location.

    Clock transitions before 1970 are recorded for each such location, because most POSIX-compatible systems support negative time stamps and could misbehave if data were omitted for pre-1970 transitions. However, the database is not designed for and does not suffice for applications requiring accurate handling of all past times everywhere, as it would take far too much effort and guesswork to record all details of pre-1970 civil timekeeping.
    As noted in the README file, the tz database is not authoritative (particularly not for pre-1970 time stamps), and it surely has errors. Corrections are welcome and encouraged. Users requiring authoritative data should consult national standards bodies and the references cited in the database's comments.

The key thing to note in both versions, when it comes to 1970, is

    Clock transitions before 1970 are recorded for each such location, because most POSIX-compatible systems support negative time stamps and could misbehave if data were omitted for pre-1970 transitions. However, the database is not designed for and does not suffice for applications requiring accurate handling of all past times everywhere, as it would take far too much effort and guesswork to record all details of pre-1970 civil timekeeping.

The second sentence in that paragraph describes what the difference is between "pre-1970" and "post-1970". The last paragraph:

    As noted in the README file, the tz database is not authoritative (particularly not for pre-1970 time stamps), and it surely has errors. Corrections are welcome and encouraged. Users requiring authoritative data should consult national standards bodies and the references cited in the database's comments.

is also important. Note that it says "*particularly not* for pre-1970 time stamps", not "*only* for pre-1970 time stamps"; the difference between pre-1970 and post-1970 isn't "unreliable and incomplete vs. reliable and complete", it's "less reliable and complete vs. more reliable and complete". (It's also not, as the "Clock transitions before 1970..." paragraph indicates, "completely absent vs. partially or completely present".)
Guy Harris wrote:
Following on from that, indicating that some amendment from this base is in place and can be trusted would be useful.

To what sort of amendments are you referring? LMT is LMT, i.e. "1 hour for every 15 degrees of longitude from the prime meridian, and similar treatment of minutes and seconds of longitude". There's nothing to amend there.
The amendments are the rest of the content of the tz database other than the base LMT time, whether correct or otherwise. But based on Paul's last comment it would seem that it's time simply to start gathering all of the facts again and assume nothing in the database is right? :) The original question was 'do we need to start a second database with pre-1970 history'; my feeling currently is that this should be undertaken, simply because there is obviously no interest in managing the material within the current framework. Since nobody seems to have any confidence in what is being served up prior to 1970, then should any transitions be included at all? A single LMT-based offset would at least make the data stable, and if we want historic information then basically we are already on our own anyway. ALL I am trying to establish is if there is any demand to rectify the situation that you cut and pasted into your post en masse. What the current situation is has been quite clearly stated. That people do not want to LOSE what history IS currently displayed has ALSO been established. So to MY mind, the next logical step is to re-assess the evidence and as a minimum give some level of confidence to that data. If that has to be outside of the existing database then so be it, and as a starting point, establishing how confident the LMT times are is essential - at least for the last century. From your quoting ...
However, the database is not designed for and does not suffice for applications requiring accurate handling of all past times everywhere, as it would take far too much effort and guesswork to record all details of pre-1970 civil timekeeping.
Unfortunately that effort IS required for some applications of the data so when Paul says
... and for pre-1970 data most of the commentary will boil down to "see Shanks", which has been demonstrated to be unreliable when we've checked it. There is little reason to trust that data. Shanks gives no sources -- zero -- for any of his data.
Then those of us who are looking to the past have to go back to basics and re-assess - something which I have been doing myself in the light of recent developments, and I now have the facts but nowhere to archive them :( Does the existing TZ database want to make provision to handle the results if people are prepared to put in the work needed?
Lester Caine wrote:
Does the existing TZ database want to make provision to handle the results if people are prepared to put in the work needed?
We are certainly interested in corrections to the existing pre-1970 data. We've accepted those in the past and it shouldn't be too much work to maintain that data. If we're talking about adding the thousands of new zones that would be needed to support pre-1970 standard and daylight saving time, I'm not sure this list would be a good place to discuss each and every datum individually, though we could well be interested in incorporating the resulting batch of data as an optional extension to what we have now. We're currently thinking about possible ways to support such an extension, with Zefram's code being the most recent proposal.
Paul Eggert wrote:
Does the existing TZ database want to make provision to handle the results if people are prepared to put in the work needed?

We are certainly interested in corrections to the existing pre-1970 data. We've accepted those in the past and it shouldn't be too much work to maintain that data.
If we're talking about adding the thousands of new zones that would be needed to support pre-1970 standard and daylight saving time, I'm not sure this list would be a good place to discuss each and every datum individually, though we could well be interested in incorporating the resulting batch of data as an optional extension to what we have now. We're currently thinking about possible ways to support such an extension, with Zefram's code being the most recent proposal.
To be honest, I would just be looking to address the history of the existing entries. Since many of the reasons for the database are modern inventions such as 'daylight saving' (which in itself is a fallacy - 'summer time' is much more accurate) I would not be expecting many, if any, additional 'zones'. The reason I've been 'hammering' the LMT aspect is because we know that at some point that will become prevalent, and there is certainly UK documentation for it, but I am quite happy that when the tz returned is 'LMT' then a 'local time offset' is required - based on longitude rather than a fixed 'zone' value. This would pre-date an 'LMTZ' offset where the zones of common time started to appear? Does that make sense of all the waffle I've been writing? It's starting to marry with the documents I've been gathering on top of the link you published to the UK research. But I don't know if it makes such sense world-wide?
On Sep 1, 2013, at 3:35 PM, Lester Caine <lester@lsces.co.uk> wrote:
The reason I've been 'hammering' the LMT aspect is because we know that at some point that will become prevalent,
Presumably by "at some point" you mean "at some point in the past". The problem is that if you're dealing with times far enough back that you need to worry about local mean solar time, you might need to worry about local mean solar time at some particular location, in which case you don't have a tzid for a zone, you have a longitude value, and can just plug that into the formula.
but I am quite happy that when the tz returned is 'LMT'
Presumably by "tz" you mean "time zone abbreviation", i.e. if somebody looks up a time in the America/New_York zone prior to 1883-11-18 17:00:00 GMT (well, UTC, really), that will be a case where the tz returned LMT?
then a 'local time offset' is required - based on longitude rather than a fixed 'zone' value.
...which means that you won't get, for example, the right conversion for times if the longitude is sufficiently far from whatever longitude was used to generate the "LMT" entry for that zone. So if "LMT" is the time zone abbreviation for a given date/time value, whoever's doing the conversion needs to do their own calculation based on the longitude, rather than using any of the data from the tzdb. Is that what you're saying?
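The do-it-yourself conversion described here amounts to very little code. Below is a hedged sketch, assuming a caller-supplied longitude; the helper name is mine, and the longitude used is an assumed approximation of New York City's (chosen to match the -4:56:02 Zone-line offset), not anything read from the tzdb.

```python
# Sketch: for a timestamp predating standardized time, bypass the tzdb
# and apply a longitude-derived LMT offset directly to the UTC instant.

from datetime import datetime, timedelta, timezone

def lmt_datetime(epoch_seconds: float, longitude_deg: float) -> datetime:
    """Naive wall-clock LMT at the given longitude (east positive),
    at one hour of offset per 15 degrees of longitude."""
    offset = timedelta(seconds=round(longitude_deg * 240))
    utc = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return (utc + offset).replace(tzinfo=None)

# 1883-11-18 17:00 GMT is the cutover instant in the America/New_York
# Zone line; at an assumed NYC longitude this lands on the line's 12:03:58.
t = datetime(1883, 11, 18, 17, 0, tzinfo=timezone.utc).timestamp()
print(lmt_datetime(t, -74.00833333))  # 1883-11-18 12:03:58
```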
This would pre-date an 'LMTZ' offset where the zones of common time started to appear?
At least for the US, if "'LMTZ' offset" refers to the offsets for zones given as local time from a certain specified meridian in the 1918 Standard Time Act (and perhaps also in whatever railroad rules and, perhaps, US state laws), that's the case.
Guy Harris wrote:
So if "LMT" is the time zone abbreviation for a given date/time value, whoever's doing the conversion needs to do their own calculation based on the longitude, rather than using any of the data from the tzdb.
Is that what you're saying?
Spot on ... And as I had said before - I was surprised at Paul suggesting adding LMT to the database, because it's a reasonably accurate calculation rather than some guess at a random fixed value ... No need for hundreds of new 'zones' because we are back to more linear time. But in the absence of a more accurate guide, using the LMT from the centre of a 15-degree zone is the next 'phase', which I've tagged in my own mind as LMTZ. That I think was what Paul was contemplating rather than LMT itself as a default value?
On Sep 1, 2013, at 4:38 PM, Lester Caine <lester@lsces.co.uk> wrote:
But in the absence of a more accurate guide, using the LMT from the centre of a 15 degree zone is the next 'phase', which I've tagged in my own mind as LMTZ. That I think was what Paul was contemplating
Given that, at least in the US, the offsets in the tzdb are, in fact, the LMT from a multiple-of-15 longitude, and have been that since the tzdb was created (at least as far back as 1918; I'd have to see state laws and railroad rules to see whether that's what was used prior to that, but I strongly suspect that's what was used), I'm not sure what there is to contemplate here. Perhaps *replacing* the current "LMT" lines for zones with a line extrapolating the standardized time offsets indefinitely back into the past might be something worth contemplating.
Guy Harris wrote:
Perhaps *replacing* the current "LMT" lines for zones with a line extrapolating the standardized time offsets indefinitely back into the past might be something worth contemplating.
If I understand this proposal correctly, it would replace tz's current LMT values with values that are less-precise, since they'd be rounded to the nearest hour somehow. For (say) Paris, this would result in less-accurate data, since it would change Paris's pre-standard-time offset from 0:09:21 to zero. In practice, LMT in Paris before 1891 was, I expect, closer to 0:09:21 than to 0, and I don't see how changing 0:09:21 to 0 would improve the quality of the database entry for Paris. While I'm on the subject of Paris, local time in France between 1891 and 1911 was set by law to an offset of 0:09:21 outside train stations, and 0:04:21 inside train stations. Therefore, any comprehensive attempt to deal with historical civil time in Paris would need at least two zones. The tz database currently uses the outside-of-train-station value. My source: French time set back. New York Times, 1911-03-12, page 15. http://query.nytimes.com/mem/archive-free/pdf?res=9A02E6D71331E233A25751C1A9...
On Sep 1, 2013, at 10:47 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
If I understand this proposal correctly, it would replace tz's current LMT values with values that are less-precise,
It would replace them with values that are less close to the value for some particular location in the city whose name is used to form the tzid, less close to some other locations in the zone corresponding to the tzid, and *closer* to yet other locations in the tzid. As far as I'm concerned, that's a feature, not a bug, as it shoves a cream pie in the face of developers and users who think that Europe/Paris corresponds to the city of Paris rather than to a region of Europe in which Paris is or was, at some particular point in time, the most populous city.
I don't see how changing 0:09:21 to 0 would improve the quality of the database entry for Paris.
It's a database entry for a region of Europe, not for Paris. If it's an entry for Paris, rather than for the French time zone, why don't we have entries for Lyon and Lille and Toulouse and Marseilles and Nice and Cannes and Grenoble and...?
Guy Harris wrote:
If it's an entry for Paris, rather than for the French time zone, why don't we have entries for Lyon and Lille and ...
It's because of the 1970 cutoff. We would have entries such as you describe, if we changed the cutoff to (say) 1940 rather than 1970. Europe/Paris would split into 28 zones (if we accept the Shanks data), each with its own timestamp history. But the entry for Europe/Paris itself would not need to change. So, in the sense of surviving zone splits better, Europe/Paris is more an entry for Paris than it is an entry for all of France. I agree that the LMT offsets are notional, but even so, requiring them to be multiples of an hour feels ahistorical. Before standard time was observed people simply didn't set their clocks to hour-multiple offsets, and it would be odd to pretend that they did.
On Sep 1, 2013, at 11:56 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Guy Harris wrote:
If it's an entry for Paris, rather than for the French time zone, why don't we have entries for Lyon and Lille and ...
It's because of the 1970 cutoff. We would have entries such as you describe, if we changed the cutoff to (say) 1940 rather than 1970. Europe/Paris would split into 28 zones (if we accept the Shanks data), each with its own timestamp history.
But you're probably *still* going to get multiple cities within the same zone even if you change the cutoff to 1911 or even 1891: http://vanessafrance.wordpress.com/2012/03/25/a-brief-history-of-french-time... so the question I'd ask is "why should certain cities be blessed by the tzdb by virtue of having their particular local solar mean time in the database"?
I agree that the LMT offsets are notional, but even so, requiring them to be multiples of an hour feels ahistorical. Before standard time was observed people simply didn't set their clocks to hour-multiple offsets, and it would be odd to pretend that they did.
Then maybe we should just have localtime() return NULL for times prior to the adoption of standard time (as it doesn't take a longitude argument, we can't figure out local time - and even if it *did*, as Steve Allen noted, there might be more than one local time value for a given longitude, unless the longitude indicates which of the two Kansas City jewelers you asked for the time: http://news.google.com/newspapers?nid=1499&dat=19451005&id=Sh4aAAAAIBAJ&sjid... ) and have mktime() return -1 for those times.
Guy Harris wrote:
you're probably *still* going to get multiple cities within the same zone even if you change the cutoff to 1911 or even 1891:
Absolutely. The further back you move the cutoff, the more zone splits you get, but if we're only talking about standard time there is a limit to the total number of zones. I'm guessing it's in the thousands, though it's just a guess.
"why should certain cities be blessed by the tzdb by virtue of having their particular local solar mean time in the database"?
The answer to that is the same as the answer to the question "why should certain cities be blessed by having their names in the tz database at all"? They're the largest cities in their respective zones, that's all.
maybe we should just have localtime() return NULL for times prior to the adoption of standard time
That might make sense. And it would conform to POSIX in practice, since standard time was introduced everywhere before 1970, and POSIX doesn't define behavior before 1970. Still, it would be a little weird for localtime() to stop working for dates before 1847 when TZ='Europe/London', simply because the time was different in London than it was in (say) Oxford.
On Sep 2, 2013, at 12:37 AM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Guy Harris wrote:
you're probably *still* going to get multiple cities within the same zone even if you change the cutoff to 1911 or even 1891:
Absolutely. The further back you move the cutoff, the more zone splits you get, but if we're only talking about standard time there is a limit to the total number of zones.
And if we're talking about LMT, we're not talking about standard time.
"why should certain cities be blessed by the tzdb by virtue of having their particular local solar mean time in the database"?
The answer to that is the same as the answer to the question "why should certain cities be blessed by having their names in the tz database at all"? They're the largest cities in their respective zones, that's all.
(An answer that has caused some issues on the list, e.g. "butbutbut Mumbai has a *lot* more people than Kolkata" or "why not Beijing" or "why is Zagreb only there for backwards compatibility"?)
maybe we should just have localtime() return NULL for times prior to the adoption of standard time
That might make sense. And it would conform to POSIX in practice, since standard time was introduced everywhere before 1970, and POSIX doesn't define behavior before 1970. Still, it would be a little weird for localtime() to stop working for dates before 1847 when TZ='Europe/London', simply because the time was different in London than it was in (say) Oxford.
If somebody happens to have a seconds-before-the-Epoch value from before 1847, if they want to know what it corresponds to as year/month/day/hour/minute/second in local time, they'd better say what the longitude was of the locality in question; if they didn't, the correct answer is "mu", and NULL/-1 are popular ways of expressing "mu" in C.
Guy Harris wrote:
If somebody happens to have a seconds-before-the-Epoch value from before 1847, if they want to know what it corresponds to as year/month/day/hour/minute/second in local time, they'd better say what the longitude was of the locality in question; if they didn't, the correct answer is "mu"
Sure, but if we're talking Europe/London, the same is also true for timestamps from 1847 through 1880, because Europe/London stands for a region where clocks didn't agree until August 2, 1880. So, if we want localtime() to return NULL when the answer is indeterminate, then we need to find the earliest time in each zone such that the zone's clocks have all agreed since then. This value is not deducible from what's in the tz database now, unfortunately.
On Sep 2, 2013, at 1:09 AM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Guy Harris wrote:
If somebody happens to have a seconds-before-the-Epoch value from before 1847, if they want to know what it corresponds to as year/month/day/hour/minute/second in local time, they'd better say what the longitude was of the locality in question; if they didn't, the correct answer is "mu"
Sure, but if we're talking Europe/London, the same is also true for timestamps from 1847 through 1880, because Europe/London stands for a region where clocks didn't agree until August 2, 1880.
Yup.
So, if we want localtime() to return NULL when the answer is indeterminate, then we need to find the earliest time in each zone such that the zone's clocks have all agreed since then.
If we're not going to create zones for regions whose time offsets differed prior to 1970, then, yes, perhaps we *should* fail for times before then. (Alternatively, we could pick a "most of the region is on standard time, and that's good enough" date and use that, e.g. perhaps 1855 for the UK: http://web.archive.org/web/20080828054933/http://www.greycat.org/papers/time... .)
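The "fail for times before then" behaviour could look like the following sketch, written in Python rather than C purely to show the semantics of localtime() returning NULL. The 1880-08-02 threshold is the Europe/London date mentioned earlier in the thread; using it as a single hard cutover here is my own simplifying assumption.

```python
# Sketch: conversion fails outright for timestamps from before the
# region's clocks agreed, instead of reporting a notional LMT.

from datetime import datetime, timezone
from typing import Optional

# Assumed cutover: clocks across the UK agreed from 2 August 1880.
UK_CLOCKS_AGREE_FROM = datetime(1880, 8, 2, tzinfo=timezone.utc)

def localtime_or_none(epoch_seconds: float) -> Optional[datetime]:
    """Analogue of localtime() returning NULL for indeterminate times."""
    when = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    if when < UK_CLOCKS_AGREE_FROM:
        return None  # "mu": no single local time existed in the region
    return when  # UK standard time is GMT, so the offset is zero

print(localtime_or_none(0) is not None)  # True  (1970 is post-cutover)
print(localtime_or_none(-4e9) is None)   # True  (~1843 predates it)
```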
Paul Eggert wrote:
Perhaps *replacing* the current "LMT" lines for zones with a line extrapolating the standardized time offsets indefinitely back into the past might be something worth contemplating. If I understand this proposal correctly, it would replace tz's current LMT values with values that are less-precise, since they'd be rounded to the nearest hour somehow. For (say) Paris, this would result in less-accurate data, since it would change Paris's pre-standard-time offset from 0:09:21 to zero. In practice, LMT in Paris before 1891 was, I expect, closer to 0:09:21 than to 0, and I don't see how changing 0:09:21 to 0 would improve the quality of the database entry for Paris.
Skipping the follow-on for the moment ... Paul - MY proposal is that if the database returns LMT then it is a flag that we are working with 'pre standard time' dates, and that the actual local time is calculated using a longitude value. For the UK we would have Cardiff time, Oxford time and London time simply by providing the correct location. We would still be using the same single zone, and the 'default' would be defined by the location used to tag the zone. Once we move into a time where there is some standardisation, then generally the zone gets an LMTZ tag indicating that it is a time applied across a whole zone. We may still have to handle the problem of 'train time' and 'government time', but this is an historical fact that needs handling anyway, and while two zones are required it avoids a proliferation of zones in the 'historic' database. If information turns up that provides a more accurate offset to 'local time' than the simple location returns, then it's a bridge to cross when it happens, but for now I think this provides a convenient base for consolidating what material we do have available? A simplification may be that the pre-1970 sub-set just returns 'LMT' so that there is a valid time, but I get the feeling people would prefer what data is available to a cut-off set?
Lester Caine wrote:
MY proposal is that if the database returns LMT then it is a flag that we are working with 'pre standard time' dates
OK, although that's a different timestamp than the one Guy Harris was talking about. Guy Harris suggested a timestamp such that all clocks in the named region agree after that timestamp, and some disagree earlier. For Europe/Paris that would be some timestamp in 1945, I expect; that information is not in the database now, and although a number could be deduced from the Shanks & Pottenger data it'd be almost certainly incorrect. You want a timestamp such that time stamps before that date are LMT. For Europe/Paris that's a timestamp in 1891, which *is* deducible from the current database, and which is relatively reliable, at least for Paris. You should be able to implement the flag that you want, by consulting the current database.
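Such a flag could indeed be derived from the abbreviation strings the database already serves up. Here is a hedged sketch against a tiny hand-coded transition table; the 1891 Mar 16 date and the 'PMT' abbreviation are my assumptions about the Europe/Paris entry, and a real implementation would read the tz database rather than hard-coding rows like these.

```python
# Sketch of Lester's flag: 'LMT' as the abbreviation marks the era
# where a longitude-based calculation is needed instead of a zone offset.

from datetime import datetime, timezone

# (start of validity in UTC, UTC offset in seconds, abbreviation)
PARIS_TRANSITIONS = [
    (datetime.min.replace(tzinfo=timezone.utc), 561, "LMT"),   # +0:09:21
    (datetime(1891, 3, 16, tzinfo=timezone.utc), 561, "PMT"),  # assumed
]

def abbreviation_for(when: datetime) -> str:
    """Abbreviation in force at 'when' (an aware UTC datetime)."""
    abbr = PARIS_TRANSITIONS[0][2]
    for start, _offset, name in PARIS_TRANSITIONS:
        if when >= start:
            abbr = name
    return abbr

def is_pre_standard_time(when: datetime) -> bool:
    """The proposed flag, consulted before trusting a zone offset."""
    return abbreviation_for(when) == "LMT"

print(is_pre_standard_time(datetime(1850, 1, 1, tzinfo=timezone.utc)))  # True
print(is_pre_standard_time(datetime(1900, 1, 1, tzinfo=timezone.utc)))  # False
```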
Paul Eggert wrote:
Lester Caine wrote:
MY proposal is that if the database returns LMT then it is a flag that we are working with 'pre standard time' dates

OK, although that's a different timestamp than the one Guy Harris was talking about. Guy Harris suggested a timestamp such that all clocks in the named region agree after that timestamp, and some disagree earlier. For Europe/Paris that would be some timestamp in 1945, I expect; that information is not in the database now, and although a number could be deduced from the Shanks & Pottenger data it'd be almost certainly incorrect. You want a timestamp such that time stamps before that date are LMT. For Europe/Paris that's a timestamp in 1891, which *is* deducible from the current database, and which is relatively reliable, at least for Paris. You should be able to implement the flag that you want, by consulting the current database.
And moving things forward it makes things logical? Prior to the adoption of a standard time, an individually calculated LMT is a more correct answer? The key fact here is simply when a standard time was adopted.

Since you bring up Paris (and I'll admit to not looking in the database here ...) is French Railway time covered? From 1891 to 1911 INTERNAL station clocks were required to be set 5 minutes slow by law - so French. But I don't think we need a timezone per station :) And it always tickles me that they 'set French Standard Time back 9 minutes 21 seconds' in 1911 rather than admitting to adopting GMT ...

Now that is interesting ... I knew about the railway time, but not that shipping time did not change for another six months ... these newspaper archives make interesting reading.
On Sep 2, 2013, at 2:40 AM, Lester Caine <lester@lsces.co.uk> wrote:
Paul - MY proposal is that if the database returns LMT then it is a flag that we are working with 'pre standard time' dates, and that the actual local time is calculated using a longitude value.
In at least some code that uses the tzdb, including the tz code distributed by the tzdb maintainers, that would require that the system either be able to determine its current longitude or that somebody be obliged to specify it as part of system or user configuration. I do not expect the latter of those two to happen on most UN*X systems, and only expect the former to happen on those UN*X systems that a user can conveniently carry; that is, however, a large number of systems (the system on which I'm typing this is one such system, and not only will it attempt to determine an approximation of the current longitude and latitude if asked, it will also attempt to set the system tzid based on that).

I do not, however, expect many UN*X systems to bother to make localtime() and mktime() attempt to somehow get the current longitude of the system and use that to convert seconds-before-the-Epoch values that predate the adoption of standardized time at the current location, so I don't think it's a worthwhile effort to have the tz code do so. Code that *does* do so needs no help from the tzdb or the tz code, and thus needs no work on the part of the tzdb/tz code developers, other than being able to know that the time they're trying to convert antedates the adoption of standardized time, so that they should not use the tzdb.
Once we move into a time where there is some standardisation, then generally the zone gets a LMTZ tag
I would simply say "standardized time"; the adoption of standardized time means you *aren't* using local mean time, you're using the mean time of some specified longitude that might not be your longitude.
indicating that it is a time applied across a whole zone. We may still have to handle the problem of 'train time' and 'government time'
*Somebody* may need to handle that; I'm not convinced that the tzdb needs to handle that.
Lester Caine wrote:
I was surprised at Paul suggesting adding LMT to the database because it's a reasonably accurate calculation rather than some guess at a random fixed value ... No need for hundreds of new 'zones'
Sorry, I was unclear. I wasn't suggesting that we add LMT to the database, any more than is already there as a placeholder for "before standard time". My "thousands of new zones" was referring only to the number of zones I expect we'd need to cover standard time comprehensively. There is a daunting complexity to standard time before 1970.
On Sep 1, 2013, at 2:43 PM, Lester Caine <lester@lsces.co.uk> wrote:
Guy Harris wrote:
Following on from that, indicating that some amendment from this base is in place and can be trusted would be useful.

To what sort of amendments are you referring? LMT is LMT, i.e. "1 hour for every 15 degrees of longitude from the prime meridian, and similar treatment of minutes and seconds of longitude". There's nothing to amend there.
The amendments are the rest of the content of the tz database other than the base LMT time.
What do you mean by "base LMT time"? The LMT time is not a base from which the subsequent entries are derived; the subsequent entries are derived, ultimately, from what historical records we can find of *standardized* time in various regions. In, for example, the Zone entry for America/New_York:

	Zone America/New_York	-4:56:02 -	LMT	1883 Nov 18 12:03:58
				-5:00	US	E%sT	1920
				-5:00	NYC	E%sT	1942
				-5:00	US	E%sT	1946
				-5:00	NYC	E%sT	1967
				-5:00	US	E%sT

the lines that begin "-5:00" are not "based on LMT" in some sense that makes the formula used to calculate LMT an issue for them; they're based on the US railroads having adopted a proposal for standard time. Yes, the standardized offsets for the various time zones were based on LMT for some location within the zone, but the offsets were in one-hour units, so it's not as if the precise longitude of that location, or details of how the calculation was done (or the programming language you do it in or the compiler or interpreter you use for that language), are in any way relevant.
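[To make concrete how the offset column in such an entry reads, here is a toy sketch of my own for converting the GMTOFF field to seconds; it handles only this one column - the real zic(8) parser deals with far more of the format:]

```python
def parse_offset(field: str) -> int:
    """Convert a tzdb GMTOFF field like '-4:56:02' or '-5:00'
    into signed seconds east of Greenwich."""
    sign = -1 if field.startswith("-") else 1
    parts = [int(p) for p in field.lstrip("+-").split(":")]
    parts += [0] * (3 - len(parts))  # pad to hours, minutes, seconds
    h, m, s = parts
    return sign * (h * 3600 + m * 60 + s)

# The LMT line and the standard-time lines from the entry above:
print(parse_offset("-4:56:02"))  # -17762 (New York's local mean time)
print(parse_offset("-5:00"))     # -18000 (the standardized Eastern offset)
```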
Whether correct or otherwise. But based on Paul's last comment it would seem that it's time simply to start again gathering all of the facts again and assume nothing in the database is right? :)
It certainly doesn't seem that way to *me*. For the US, for example, 40 Stat. 450:

	http://www.webexhibits.org/daylightsaving/usstat.html

and subsequent laws establishing standardized time in the US are pretty clear. That statute says, for example:

	The standard time of the first zone shall be based on the mean astronomical time of the seventy-fifth degree of longitude west from Greenwich

and 75 degrees of longitude west of Greenwich, at 1 hour for every 15 degrees, is 75/15 = 5 hours.
The original question was 'do we need to start a second database with pre 1970 history'
Given that the current database has some pre-1970 history, I sincerely hope the original question wasn't phrased in exactly that fashion; if it was phrased that way, it's an invalid question that cannot be answered with "yes" or "no", just as, for example, asking a person who has never married "are you still married?" is asking them an invalid question that cannot be answered with "yes" or "no". Perhaps "do we need to start a second database for which we're willing to split a tzdb zone if we discover that, prior to 1970, its history had different parts of the zone having different *standardized time* offsets from GMT/UTC?" is a better question.
my feeling currently is that this should be undertaken simply because there is obviously no interest in managing the material within the current framework? Since nobody seems to have any confidence in what is being served up prior to 1970,
People have considerably less confidence in what's being served up prior to 1970 than they have in what's served up starting in 1970; the confidence decreases the further back you go in time. (We serve up nothing useful prior to the establishment of standardized time in a given location.)
then should any transitions be included at all?
Yes, if we have a solid historical reference (e.g., the 1918 Standard Time Act; the Act for the repeal of the daylight-saving law, http://books.google.com/books?id=by8PAAAAYAAJ&pg=PA280#v=onepage&q&f=false, which explains the entries that start in 1920; and various subsequent laws turning daylight savings time on and off for the US). If we have only "Shanks says", maybe not; as Paul explained, sometimes his claims have been proven wrong by research.
A single LMT based offset just to make the data stable
If, for America/New_York, "LMT-based" means "based on the mean astronomical time of the seventy-fifth degree of longitude west from Greenwich", yes. If it means anything based on LMT at any other longitude, no; -5:00 is better than anything not a multiple of one hour, unless a solid historical reference can be found to support it.
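[The arithmetic behind that is easy to check with a few lines (a sketch of my own, not anything from the tz code): LMT is 240 seconds of time per degree of longitude, so the seventy-fifth meridian gives exactly -5:00, while the -4:56:02 LMT in the America/New_York entry works back to roughly 74.008 degrees west - close to New York City's actual longitude:]

```python
def lmt_offset_seconds(longitude_deg: float) -> int:
    """Local mean time offset from Greenwich: 1 hour per 15 degrees
    of longitude, i.e. 240 seconds per degree (east positive)."""
    return round(longitude_deg * 240)

# The seventy-fifth meridian west gives exactly -5 hours:
print(lmt_offset_seconds(-75.0))     # -18000
# New York City at about 74.0083 W gives America/New_York's LMT:
print(lmt_offset_seconds(-74.0083))  # -17762, i.e. -4:56:02
# Paris Observatory at about 2.3371 E gives the historic +0:09:21:
print(lmt_offset_seconds(2.3371))    # 561
```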
and if we want historic information then basically we are already on our own anyway.
You've *always been* on your own for sufficiently old historical information.
ALL I am trying to establish is if there is any demand to rectify the situation that you cut and pasted into your post en masse. What the current situation is has been quite clearly stated. That people do not want to LOSE what history IS currently displayed has ALSO been established. So to MY mind, the next logical step is to re-assess the evidence and as a minimum give some level of confidence to that data. If that has to be outside of the existing database then so be it, and as a starting point, establishing how confident the LMT times are is essential.
If somebody has a lack of confidence about the LMT time of, for example, the seventy-fifth degree of longitude west of Greenwich, I'd *really* like to hear their reasons for it. If they have no such lack of confidence, the *only* questions that remain about, for example, America/New_York are:

1) Did its standard-time offset from GMT/UTC ever change to a value other than -5:00?

2) What happened to daylight savings time in various locations within that zone?

The answer to the first question is almost certainly "no", and nobody's put forth any citations to indicate that there should be any lack of confidence about that. The *second* question is the important one for most of the world, and *that's* where the bulk of the lack of confidence in the database is.
From your quoting ...
However, the database is not designed for and does not suffice for applications requiring accurate handling of all past times everywhere, as it would take far too much effort and guesswork to record all details of pre-1970 civil timekeeping.
Unfortunately that effort IS required for some applications of the data
Then perhaps people who really need that level of accuracy should maintain their own database.
so when Paul says
... and for pre-1970 data most of the commentary will boil down to "see Shanks", which has been demonstrated to be unreliable when we've checked it. There is little reason to trust that data. Shanks gives no sources -- zero -- for any of his data.
Then those of us who are looking to the past have to go back to basics and re-assess, something which I have been doing myself in the light of recent developments, and I now have the facts but nowhere to archive them :(
I think that changes to the pre-1970 entries that don't cause new tzids to be created (and for which there's more direct evidence than just a citation of Shanks) would be accepted in the database. I suspect that changes to the pre-1970 entries that *do* cause new tzids to be created might *not* be accepted into the database; people who really need that information should perhaps, as per the above, maintain their own database.
On Sep 1, 2013, at 3:53 PM, Guy Harris <guy@alum.mit.edu> wrote:
It certainly doesn't seem that way to *me*. For the US, for example, 40 Stat. 450:
http://www.webexhibits.org/daylightsaving/usstat.html
and subsequent laws establishing standardized time in the US are pretty clear. That statute says, for example:
The standard time of the first zone shall be based on the mean astronomical time of the seventy-fifth degree of longitude west from Greenwich
and 75 degrees of longitude west of Greenwich, at 1 hour for every 15 degrees, is 75/15 = 5 hours.
That meridian, BTW, does *not* pass through New York City: https://maps.google.com/maps?ll=44.983333,-75&q=loc:44.983333,-75&hl=en&t=m&... (New York *state*, yes, but not New York *City*).
It seems to me we could resolve the LMT v. Standard mess by always picking a city that lies on or near the standard meridian. For example, the New York, Chicago, and Los Angeles zones could become Philadelphia, Memphis, and SouthLakeTahoe (Denver itself happens to be on the meridian); CET might for instance be Catania. (And yes, I'm kidding here!)
On 2013-09-01 6:35, Zefram wrote:
However the one I was really thinking about was the 'equation of
time' calculation in a format we can use as a 'standard'.
Equation of time calculations are non-trivial, requiring knowledge of the variations of the orbit of the earth (to calculate sun position), as well as variations in the earth's rotation speed (to calculate DeltaT). For accuracy, astronomical software often uses large tables and series of thousands of polynomial terms to estimate local apparent time in the past. This is clearly outside the scope of a tz database. There are simplifications to these formulas, but they are only accurate for short timeframes.
David Patte ₯ wrote:
On 2013-09-01 6:35, Zefram wrote:
However the one I was really thinking about was the 'equation of
time' calculation in a format we can use as a 'standard'.
Equation of time calculations are non-trivial, requiring knowledge of the variations of the orbit of the earth (to calculate sun position), as well as variations in the earth's rotation speed (to calculate DeltaT). For accuracy, astronomical software often uses large tables and series of thousands of polynomial terms to estimate local apparent time in the past. This is clearly outside the scope of a tz database.
There are simplifications to these formulas, but they are only accurate for short timeframes.
David, I appreciate your insight here. My own naive view on this was that in essence we only needed an equation over the year, with a fiddle factor which my current crude scribblings suggest would be OK in 10-year steps and probably longer than that? So what is your definition of 'short'? I'm only looking back to the 1600s myself, and it's unlikely I'll get that far in my lifetime, although discussions on the time machine are progressing well on one of the other lists I frequent :)
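[For reference, the kind of "simplification" David mentions looks like the commonly used three-term approximation below (a sketch of my own; the coefficients are from the standard simplified formula, good to roughly a minute near the present epoch - which is exactly the 'short timeframe' caveat above, and why it is useless for the 1600s):]

```python
import math

def equation_of_time_minutes(day_of_year: int) -> float:
    """Approximate equation of time (apparent solar time minus mean
    solar time) in minutes, via the common simplified three-term
    formula. Accurate to about a minute, and only near the present
    epoch; it ignores DeltaT and long-term orbital change entirely."""
    b = math.radians(360.0 / 365.0 * (day_of_year - 81))
    return 9.87 * math.sin(2 * b) - 7.53 * math.cos(b) - 1.5 * math.sin(b)

# Early November (around day 306) is near the annual maximum of
# roughly +16 minutes; mid-February is near the minimum of about -14.
print(round(equation_of_time_minutes(306), 1))
print(round(equation_of_time_minutes(45), 1))
```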
On 30 August 2013 03:44, David Patte ₯ <dpatte@relativedata.com> wrote:
My own preference would be that historical (and perhaps all tz data) be given a numeric identifier, not an 'America/Someplace' name, for the populated areas in question. A good source of geographic identifiers already available is the geonames database. The tz data for Montreal could be identified by the geonames number for Montreal, and the tz data for Toronto associated with the geonames number for Toronto.
Then, to build regions, since all geonames records already have a field for the tz region, these could reflect the numeric identifier of the tz recommended region for each location. For example, Ottawa in the geonames database would refer to the numeric identifier of Toronto instead of 'America/Toronto', until someone decides to add historical data for Ottawa's own tz history, at which point it would adopt the identifier for Ottawa.
For the record, such a numeric system would be unacceptable to my user-base of developers, who need to include time-zone strings in configuration systems, initialization code and test cases. The IDs we have are relatively stable and well-known to developers, and the wider set of computer-aware people.

If there were an ID change (which I don't want) I would argue for ISO3166/BiggestCity, such as GB/London. Note that such an approach allows for two IDs for the same city, useful in the Middle East.

Stephen
On 2013-08-30 5:43, Stephen Colebourne wrote:
On 30 August 2013 03:44, David Patte ₯ <dpatte@relativedata.com> wrote:
My own preference would be that historical (and perhaps all tz data) be given a numeric identifier, not an 'America/Someplace' name, for the populated areas in question. A good source of geographic identifiers already available is the geonames database. The tz data for Montreal could be identified by the geonames number for Montreal, and the tz data for Toronto associated with the geonames number for Toronto.
Then, to build regions, since all geonames records already have a field for the tz region, these could reflect the numeric identifier of the tz recommended region for each location. For example, Ottawa in the geonames database would refer to the numeric identifier of Toronto instead of 'America/Toronto', until someone decides to add historical data for Ottawa's own tz history, at which point it would adopt the identifier for Ottawa.

For the record, such a numeric system would be unacceptable to my user-base of developers, who need to include time-zone strings in configuration systems, initialization code and test cases. The IDs we have are relatively stable and well-known to developers, and the wider set of computer-aware people.
If there were an ID change (which I don't want) I would argue for ISO3166/BiggestCity, such as GB/London. Note that such an approach allows for two IDs for the same city, useful in the middle east.
Stephen
I agree with you that user-identifiable zone names are easier for users, and in that case I also prefer ISO3166/BiggestCity - such as CA/Ottawa. The LMT doesn't need to be in the database, since that's easily discernible from the longitude available from various sources, but the advantage of using geonames ids is that the database supports not only point locations (with latlng), but also larger regions, allowing the association of tz data with a province or country if it is valid for the complete region.
participants (10)
- Andy Lipscomb
- David Patte ₯
- Eliot Lear
- Garrett Wollman
- Guy Harris
- Lester Caine
- Paul Eggert
- Stephen Colebourne
- Steve Allen
- Zefram