[PATCH 2/3] Replace some zones with links when that doesn't lose non-LMT info.
*Paul Eggert* eggert at cs.ucla.edu
Tue Sep 3 17:05:49 UTC 2013
Meno Hochschild wrote:
Lester Caine wrote:
> can we at least agree that the quality of the material is heading in the right direction?

Yes, that's the idea.
While I personally have no problem with tidying up obviously wrong data, and also agree that correcting mistakes is more important than stability (which has never been absolutely guaranteed), I think end-users of tzdb should get a better opportunity to estimate how reliable the data are and how much trust they can put in their time zone calculations.
Offsets in LMT lines can easily be qualified as UNKNOWN. As for the proposed date of 1970 as a general separation point between UNSAFE and (apparently) RELIABLE, I am not so sure. Would it not be helpful for end-users if there were an additional year-type attribute per zone that tells users since when the data can be confirmed with a high probability of correctness? This would require 400+ attributes (OK, with 1970 as the default). I am sure that, for example, in the UK or in Germany the data are correct for quite a while before 1970. Flexibility would be a good thing, wouldn't it? External APIs could give users this extra quality information per offset and make them better aware that historical tz data are not set in stone.
Just giving a thought...
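A minimal sketch of how such a per-zone attribute might look to an API consumer, assuming a simple lookup table with 1970 as the default. The names `RELIABLE_SINCE`, `DEFAULT_RELIABLE_SINCE` and `reliability` are hypothetical, and the override year is a placeholder for illustration, not a researched value; nothing like this exists in the tz database itself.

```python
# Default cutoff year, as suggested in the message above.
DEFAULT_RELIABLE_SINCE = 1970

# Hypothetical per-zone overrides. The year below is a placeholder
# for illustration only, not a researched value.
RELIABLE_SINCE = {
    "Europe/London": 1880,
}


def reliability(zone: str, year: int, is_lmt: bool = False) -> str:
    """Classify an offset as UNKNOWN, UNSAFE or RELIABLE.

    is_lmt marks offsets that come from an LMT line, which the
    message proposes to qualify as UNKNOWN regardless of the year.
    """
    if is_lmt:
        return "UNKNOWN"
    cutoff = RELIABLE_SINCE.get(zone, DEFAULT_RELIABLE_SINCE)
    return "RELIABLE" if year >= cutoff else "UNSAFE"
```

With this table, `reliability("Europe/London", 1900)` is classified as RELIABLE, while a zone without an override falls back to the 1970 default.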
1970 as a nominal date is more of a hindrance than a help ;) Totally agree that there should be some flag if the validity of an entry does not have a documented basis. As I said already, UK/Isle of Man are good back to the 1880s, and I see no reason we should not have the same confidence with most of Europe? So some indication of what IS 'woolly' would be of more help in then addressing the holes.

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On 09/03/2013 01:10 PM, Lester Caine wrote:
> I see no reason we should not have the same confidence with most of Europe?
Wars are one reason. For example, the current TZ database "works" for Mulhouse, France only because it excludes zones that differ only before 1970, which means we currently don't care that many pre-1970 timestamps are wrong for Mulhouse. If we moved the fencepost back to 1900 we could add a zone for Mulhouse, but we shouldn't use the Shanks data for Mulhouse, because it's obviously wrong (I just checked). Maybe someone could do the legwork to research Mulhouse thoroughly, as we've already researched Paris. But maybe not; and even if Mulhouse is doable, there must be hundreds of other locations within France that would need the same level of research, where the Shanks data are wrong or are almost surely wrong.

Doing all this properly would be a lot of work, and I expect it would be a mistake if we simply tried to shoehorn the result into the current tz database format. I don't have a solution here; I'm merely noting the problem.
> So some indication of what IS 'woolly'
That indication is currently in the comments. I suppose it could be automated more, but that also sounds like a lot of work, work that I hope someone else does and not me. You could start by assuming that the pre-1970 data are unreliable unless supported by a source in the comments, where Shanks and Whitman do not count as reliable sources.
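As a rough sketch of that starting point, the check below treats a zone's pre-1970 data as supported only if its comments cite at least one source other than Shanks or Whitman. It assumes, purely for illustration, that sources appear in comment lines of the form `# From <name> ...`; the function names and the sample comment lines are hypothetical, and real comments would need more careful parsing.

```python
import re

# Sources that do not count as reliable, per the rule of thumb above.
UNRELIABLE_SOURCES = ("shanks", "whitman")

# Assumption: a source is cited on a comment line like "# From <name> (date):".
SOURCE_LINE = re.compile(r"^#\s*From\s+(.+?)\s*(?:\(|:|$)", re.IGNORECASE)


def cited_sources(comment_lines):
    """Return the source names found in the given comment lines."""
    sources = []
    for line in comment_lines:
        match = SOURCE_LINE.match(line.strip())
        if match:
            sources.append(match.group(1))
    return sources


def pre_1970_supported(comment_lines):
    """True if at least one cited source is neither Shanks nor Whitman."""
    return any(
        not any(bad in source.lower() for bad in UNRELIABLE_SOURCES)
        for source in cited_sources(comment_lines)
    )
```

For example, a zone whose only comment is `# From Shanks & Pottenger:` would be flagged as unsupported, while one that also has a line such as `# From Jane Example (2013-09-03):` would pass. Whether a cited source really backs the pre-1970 data still needs a human to read the comment.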
participants (3)
- Lester Caine
- Meno Hochschild
- Paul Eggert