Large scale changes proposed in 2014f
I believe that the changes currently proposed are far, far too widespread and damaging.

I have taken the latest in the GitHub repo and expanded them out into actual resolved results. The differences between 2014e and 2014f-proposed can be seen here:
https://github.com/jodastephen/tzdiff/commit/c812da9e12bd6f8aa52fa2dd758e236...
(other version differences available here: https://github.com/jodastephen/tzdiff/commits/proposed-2014f)

As can be seen from the first link, the changes proposed are very widespread. They include:
- Africa/Accra gains DST between 1920 and 1935
- Africa/Freetown loses its entire history of DST
- Africa/Kampala has very different DST history
- Africa/Dar_es_Salaam has entirely different history before 1961
- Africa/Djibouti gains history before 1960
- Africa/Addis_Ababa has completely different history before 1960
- Africa/Asmara has completely different history before 1960
- Africa/Conakry loses history before 1960
- Africa/Banjul loses history before 1964
- Africa/Maseru gains DST history
- Africa/Mbabane gains DST history
- Africa/Mogadishu gains completely different DST history
and many others with more minor changes.

I'd also note that Ho_Chi_Minh, Kashgar, Saigon, Vientiane and Phnom_Penh lose their entire history. The changes to Asia/Qatar and Asia/Kuwait look suspect. Asia/Urumqi loses its entire history. Indian/Antananarivo, Comoro and Mayotte get entirely different DST history. Pacific/Midway, Pago_Pago, Saipan, Samoa get entirely different DST history.

The above is an attempt to summarise some but not all of the changes. I am far from convinced that the scale of the proposed changes (far bigger than even 2013e) has been anywhere near justified.

Is there consensus on the list that all of these changes are better than what went before?

Stephen

On 16 July 2014 06:40, Paul Eggert <eggert@cs.ucla.edu> wrote:
Paul Eggert wrote:
Alan Barrett wrote:
To the best of my knowledge, most other African countries that share the same UTC offset do not use the abbreviation "SAST"
Thanks for catching this; this was indeed an error in the merge, and I'll prepare a patch shortly that reverts to the previous behavior here as well.
By "most other African countries" I expect you meant the same set of countries that use "CAT" rather than "SAST" in Release 2014e.
Anyway, the revised patch turned into more research than I expected, but I finally came up with the attached proposal. This patch fixes the problem you noted, along with the other specific problems noted, though it does still turn some zones into links when there's no good evidence for the zones' differences.
I hope I correctly puzzled out Google Translate's mangling of Hungarian for the fixes to Hungarian daylight saving time rules in 1918-1945. Oross's work has facsimiles of the original material! If only our other sources were as meticulous....
On Fri, Jul 25, 2014 at 4:12 PM, Stephen Colebourne <scolebourne@joda.org> wrote: ...
Is there consensus on the list that all of these changes are better than what went before? Not if I am included.
-- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com
On 25 July 2014 10:12, Stephen Colebourne <scolebourne@joda.org> wrote:
I have taken the latest in the GitHub repo and expanded them out into actual resolved results.
Thanks for this. On 25 July 2014 10:12, Stephen Colebourne <scolebourne@joda.org> wrote:
Is there consensus on the list that all of these changes are better than what went before?
On the contrary, I don't recall seeing anyone but Paul voicing support for the changes in the Pacific <https://github.com/eggert/tz/commit/662016b64ae64bc8d2680f349d1a7813dc1ee536> and Africa <https://github.com/eggert/tz/commit/4b4e789d5c5ee79366b4606d139cbb9eb1d5a28d>, whether before or since the partial rollback <https://github.com/eggert/tz/commit/f1ddf32f059c17fa5a1ec24f549d70db36dc5fa9>. (Referring to the threads starting here <http://mm.icann.org/pipermail/tz/2014-July/thread.html#21170> and here <http://mm.icann.org/pipermail/tz/2014-July/thread.html#21177>.)

I also think that, if we want to replace zone.tab with something better, we should take the time now to make sure its replacement doesn't unintentionally retain any major shortcomings of the previous version, so we don't find ourselves needing to supersede it again too soon. This is the proper time to do a bit of forward-thinking: BEFORE releasing a new file or format into the wild.

-- Tim Parenti
Tim Parenti wrote:
if we want to replace zone.tab with something better, we should take the time now to make sure its replacement doesn't unintentionally retain any major shortcomings of the previous version, so we don't find ourselves needing to supersede it again too soon.
I took the time to consider several possible replacements, and this one was the best: it is simple, easy to understand, upward-compatible, and addresses the problem. Quite possibly we'll want to make further changes in the future, but no such change plausibly springs to mind right now, and one step at a time.
On Fri, 25 Jul 2014, Stephen Colebourne wrote:
Is there consensus on the list that all of these changes are better than what went before?
I think that these changes are a bad idea.

As a general rule, I think that existing information should never be replaced by other information with the same or worse quality, but should only be replaced by information of greater quality. In other words, rules should be changed only when the new rules are clearly superior to the old rules. Even if the old rules are known to be inaccurate, they should be retained for the sake of stability until such time as more accurate information is known.

I also prefer retaining at least one first class zone for each ISO3166 country, for ease of menu or map based selection, and to allow users to see that the selected zone name contains the name of a familiar city in their country. I think that some users will be upset to see a time zone name that contains the name of a city in another country, especially if the two countries have a history of conflict.

The maintenance burden of making similar changes many times when many zones share modern history could be eased by new syntax in the input files, such as what was proposed by Andy Lipscomb in the "Extended data format" thread in September 2013; see <https://mm.icann.org/pipermail/tz/2013-September/019952.html> and other messages in the same thread.

--apb (Alan Barrett)
I totally agree with Alan, and others, on this. On 2014-07-26 3:30, Alan Barrett wrote:
I also prefer retaining at least one first class zone for each ISO3166 country, for ease of menu or map based selection, and to allow users to see that the selected zone name contains the name of a familiar city in their country. I think that some users will be upset to see a time zone name that contains the name of a city in another country, especially if the two countries have a history of conflict.
--
On Sat, Jul 26, 2014 at 9:30 AM, Alan Barrett <apb@cequrux.com> wrote:
I also prefer retaining at least one first class zone for each ISO3166 country,
To clarify: Not every ISO 3166 country has a zone. -- Tobias Conradi Rheinsberger Str. 18 10115 Berlin Germany http://tobiasconradi.com
Alan Barrett said:
As a general rule, I think that existing information should never be replaced by other information with the same or worse quality, but should only be replaced by information of greater quality. In other words, rules should be changed only when the new rules are clearly superior to the old rules. Even if the old rules are known to be inaccurate, they should be retained for the sake of stability until such time as more accurate information is known.
I don't agree. If we know or have good reason to believe that the data is completely bogus, then there's no point in retaining it. And having different data for two zones produces a subconscious belief that both are higher quality than if we just have the same. So where the data is completely wrong, I vote for these merges.
I also prefer retaining at least one first class zone for each ISO3166 country, for ease of menu or map based selection, and to allow users to see that the selected zone name contains the name of a familiar city in their country. I think that some users will be upset to see a time zone name that contains the name of a city in another country, especially if the two countries have a history of conflict.
I completely agree with this part.

--
Clive D.W. Feather          | If you lie to the compiler,
Email: clive@davros.org     | it will get its revenge.
Web: http://www.davros.org  |   - Henry Spencer
Mobile: +44 7973 377646
On 27 July 2014 08:25, Clive D.W. Feather <clive@davros.org> wrote:
If we know or have good reason to believe that the data is completely bogus, then there's no point in retaining it.
There are two distinct steps here:
1) knowing it is wrong
2) knowing what is right

It is important not to conflate the two.

What many (most?) of the list want is for the change to occur only when both steps are true. What is proposed is for the change to happen only when the first step occurs (or is claimed - few of us can actually judge whether it is wrong).

The reason for requiring both steps to happen before change is that many (most?) on the list place a huge value on stability. We want there to be the absolute minimum changes necessary to the data, as that stability is hugely valuable to end-user applications. i.e., stability is the "point in retaining it" you describe.

As an example, consider Sierra Leone
https://github.com/eggert/tz/commit/4b4e789d5c5ee79366b4606d139cbb9eb1d5a28d...
https://github.com/jodastephen/tzdiff/commit/c812da9e12bd6f8aa52fa2dd758e236...
where two separate sets of DST affecting 13 years and dates as recent as 1960 have been obliterated. Where is the discussion of this specific (huge) change? Where is the justification?

And there are many other examples.

Stephen
I think throwing out data because it is wrong is a mistake. There is a matter of degree here and since tz info is not authoritative anyway, little is to be gained. Do we think that we can approach anything like 99.99% accuracy by getting rid of suspect data and therefore make the claim that it is more "authoritative"? I think not.

Certainty about what time was showing on most clocks in a given area will always be approximate until there is a global control mechanism for setting time zones legally and technically which probably won't happen until there is a one world government (don't hold your breath). There are approximately 60 Amish farms in the Pulaski, NY area all of which observe standard time year round and since we do business with them we have to remember to keep 2 clocks. This was not even an issue a decade ago.

I really wish that Shanks et al had not "guessed" at transition times, but instead included reports about what transitions were certain and then showed instead the date ranges where there was uncertainty. They probably interpolated between two reports that they could find in their original research, but since they didn't publish their procedures this is a guess on my part.

Since there are wide uses for this data I have to agree with Stephen Colebourne here.

On Sun, Jul 27, 2014 at 5:37 AM, Stephen Colebourne <scolebourne@joda.org> wrote:
On 27 July 2014 08:25, Clive D.W. Feather <clive@davros.org> wrote:
If we know or have good reason to believe that the data is completely bogus, then there's no point in retaining it.
There are two distinct steps here; 1) knowing it is wrong 2) knowing what is right
It is important not to conflate the two.
What many (most?) of the list want is for the change to occur only when both steps are true. What is proposed is for the change to happen only when the first step occurs (or is claimed - few of us can actually judge whether it is wrong).
The reason for requiring both steps to happen before change is that many (most?) on the list place a huge value on stability. We want there to be the absolute minimum changes necessary to the data, as that stability is hugely valuable to end-user applications. i.e., stability is the "point in retaining it" you describe.
As an example, consider Sierra Leone
https://github.com/eggert/tz/commit/4b4e789d5c5ee79366b4606d139cbb9eb1d5a28d...
https://github.com/jodastephen/tzdiff/commit/c812da9e12bd6f8aa52fa2dd758e236... where two separate sets of DST affecting 13 years and dates as recent as 1960 have been obliterated. Where is the discussion of this specific (huge) change? Where is the justification?
And there are many other examples.
Stephen
Remember that if we remove historical transitions that are problematic, what we are actually doing is replacing them with a lack of transition. The database will still give an answer for these historical date/timezone combinations. It may well give an answer that is wrong more often than it does now. Before, it might have had a transition date that is wrong; now it will not have any transitions at all.

In the absolute, wouldn't that make more timezones with more dates that are wrong than is currently the case? If that's so, then isn't this a step backwards?

On 28/07/2014 2:29 AM, Zoidsoft wrote:
I think throwing out data because it is wrong is a mistake. There is a matter of degree here and since tz info is not authoritative anyway, little is to be gained. Do we think that we can approach anything like 99.99% accuracy by getting rid of suspect data and therefore make the claim that it is more "authoritative"? I think not.
-- Patrice Scattolin | Principal Member Technical Staff | 514.905.8744 Oracle WebCenter Mobile applications 600 Blvd de Maisonneuve West Suite 1900 Montreal, Quebec
On 28/07/14 16:40, Patrice Scattolin wrote:
Remember that if we remove historical transitions that are problematic, what we are actually doing is replacing them with a lack of transition. The database will still give an answer for these historical date/timezone combinations. It may well give an answer that is wrong more often than it does now. Before, it might have had a transition date that is wrong; now it will not have any transitions at all. In the absolute, wouldn't that make more timezones with more dates that are wrong than is currently the case? If that's so, then isn't this a step backwards?
The problem currently is that prior to 1970 we have no idea if the data we were working with 5 years ago was changed because it was a new guess or changed because more accurate data is available. If content has been tagged using what is proven to be incorrect assumptions, then it can be reviewed and corrected with more accurate data. Simply changing assumptions without having real facts to work against is never a good way of working. LEAVE the data as it has been used until such time as a fact comes along that proves a change is necessary.

But what is actually needed is a much more robust way of identifying just which historic facts are being used.

-- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Patrice Scattolin wrote:
Before, it might have had a transition date that is wrong; now it will not have any transitions at all. In the absolute, wouldn't that make more timezones with more dates that are wrong than is currently the case?
Maybe, maybe not. It's hard to tell, without knowing the actual facts, which we don't know and in many cases are not likely to ever know. We are talking about differences that are typically on the order of an hour or less, in areas and eras in which local timekeeping was neither accurate nor precise, so it's likely a question that doesn't have all that much meaning.
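The point that a lookup still returns an answer once transitions are removed can be illustrated with a minimal Python sketch. The zone, its dates and its offsets below are hypothetical and chosen only for illustration; `offset_at` is not part of any tz tool, and the list-of-transitions model is a simplification of what compiled tz data actually holds.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def offset_at(base_offset, transitions, instant):
    """Return the UTC offset in force at `instant`.

    `transitions` is a list of (utc_instant, new_offset) pairs sorted by
    instant. When no transition applies, the base offset is returned, so
    the lookup always yields some answer.
    """
    instants = [t for t, _ in transitions]
    i = bisect_right(instants, instant)
    return base_offset if i == 0 else transitions[i - 1][1]


# Hypothetical zone: standard time is UTC+0, with one summer at UTC+1.
base = timedelta(0)
with_dst = [
    (datetime(1950, 5, 1, tzinfo=UTC), timedelta(hours=1)),
    (datetime(1950, 9, 1, tzinfo=UTC), timedelta(0)),
]
without_dst = []  # the same zone after the doubtful transitions are removed

july = datetime(1950, 7, 1, tzinfo=UTC)
print(offset_at(base, with_dst, july))     # 1:00:00
print(offset_at(base, without_dst, july))  # 0:00:00
```

Both calls return a definite offset; which of the two is closer to what clocks actually showed depends on facts the sketch cannot supply.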
On 27/07/14 10:37, Stephen Colebourne wrote:
As an example, consider Sierra Leone https://github.com/eggert/tz/commit/4b4e789d5c5ee79366b4606d139cbb9eb1d5a28d... https://github.com/jodastephen/tzdiff/commit/c812da9e12bd6f8aa52fa2dd758e236... where two separate sets of DST affecting 13 years and dates as recent as 1960 have been obliterated. Where is the discussion of this specific (huge) change? Where is the justification?
That there is very good data for a lot of pre-1970 timezones is a fact, but the blocking of updating that information has been flagged before. We NEED an historically correct set of data, which I had thought was now accepted, but it seems not? If tz is not going to provide a usable table of pre-1970 data then once again we need to provide an alternative ...

That some current material's provenance is suspect does not remove the need for something stable which flags that timezone/daylight saving was in place.

-- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Lester Caine wrote:
We NEED an historically correct set of data, which I had thought was now accepted, but it seems not?
We've talked in the past about adding hooks for extra pre-1970 data, presumably mainly for astrological applications. As long as they don't significantly get in the way of the main database (whose scope is areas that have agreed since 1970), I don't think people would have a problem with that. It would be a nontrivial task, though.
On Sun, Jul 27, 2014 at 08:25:55AM +0100, "Clive D.W. Feather" <clive@davros.org> wrote:
I don't agree. If we know or have good reason to believe that the data is completely bogus, then there's no point in retaining it.
Clive, you seem to be not following the discussions. You can argue that _you think_ it is pointless, but you cannot just ignore the perfectly good points others have made, foremost that of the value of stability itself, quality being equal. These points have been made, so claiming there's no point is simply incorrect: the points may be weak, or wrong, or you might disagree with them, but they do exist. At the very least, you should acknowledge that. If you can prove the stability argument to be wrong or can show that it isn't a real concern, I would be happy to hear about it. Until then, you have to accept that there is a point on retaining the data, even if it might be unconvincing or even wrong - but we have yet to see that.
different data for two zones produces a subconscious belief that both are higher quality than if we just have the same.
That may well be, but this is far from proven. In addition, I am not sure what the subconscious and wrong beliefs of some unspecified people about tzdata are worth.

-- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / schmorp@schmorp.de -=====/_/_//_/\_,_/ /_/\_\
Marc Lehmann <schmorp@schmorp.de> writes:
"Clive D.W. Feather" <clive@davros.org> wrote:
I don't agree. If we know or have good reason to believe that the data is completely bogus, then there's no point in retaining it.
Clive, you seem to be not following the discussions. You can argue that _you think_ it is pointless, but you cannot just ignore the perfectly good points others have made, foremost that of the value of stability itself, quality being equal.
He isn't ignoring anything. He just doesn't agree with you, and I suspect doesn't think that your arguments are as good as you think they are. For what it's worth, I agree with Clive as well. I don't see any point in retaining data that is either known to be wrong or that is highly likely to be wrong, and I think retaining it is harmful (if in a fairly minor way). There are a few vocal people here who get very excited and frantic every time Paul tries to change something. I am unconvinced that their position is as important as they think it is, or that the issues are anywhere near as significant as they think they are. Regardless, the fact that they're very persistent and vocal about their positions doesn't make them right, nor does it mean that everyone agrees with them. Some of us are just tired of repeating the same positions that we've stated in the past and don't bother to do so every time there's another flurry of OMG THE TZ DATABASE IS DESTROYED FOREVAR AND EVAR WHY DO YOU HATE THE KITTENZ???!?!?! -- Russ Allbery (eagle@eyrie.org) <http://www.eyrie.org/~eagle/>
On 29/07/14 05:28, Russ Allbery wrote:
There are a few vocal people here who get very excited and frantic every time Paul tries to change something. I am unconvinced that their position is as important as they think it is, or that the issues are anywhere near as significant as they think they are. Regardless, the fact that they're very persistent and vocal about their positions doesn't make them right, nor does it mean that everyone agrees with them.
There are a number of problems with the current situation, and no apparent wish to resolve it. Where data has already been normalised using the current tz information, that data gives a value that ideally will not change in the future. If the facts relating to that normalization are proven to be wrong, then the data relating to it needs to be reviewed. If the processing now returns a different value, such as simply switching off DST prior to 1970, or just dropping data, then there is nothing to verify against. This is the sort of stability people are asking for, since having to revalidate data against a different 'guess' makes little sense.

-- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On Mon, Jul 28, 2014 at 09:28:13PM -0700, Russ Allbery <eagle@eyrie.org> wrote:
_you think_ it is pointless, but you cannot just ignore the perfectly good points others have made, foremost that of the value of stability itself, quality being equal.
He isn't ignoring anything. He just doesn't agree with you, and I suspect doesn't think that your arguments are as good as you think they are.
Not agreeing with me is fine (although he wasn't arguing with me at the time, so how do you know he actually disagrees...), and thinking arguments are bad or unconvincing is just fine as well. However, pretending there aren't any points to it when some have been stated is not "disagreeing" it is simply "ignoring". That is all I pointed out, and I stand by it. It's like me claiming there are no points in favour of the change: That would both be wrong, and would ignore the arguments that have been made. It also wouldn't add anything to the discussion.
For what it's worth, I agree with Clive as well. I don't see any point in retaining data that is either known to be wrong or that is highly likely to be wrong, and I think retaining it is harmful (if in a fairly minor way).
This is a straw man argument and a false dichotomy; it isn't what is happening. Instead, such data is replaced by other such data. The choice is not between bad data and no data as you frame it, it's between bogus data and bogus data, or rather, bogus timestamps and other bogus timestamps. And as has been pointed out, this new data has a good chance of being even more wrong, so you'd argue for replacing bogus data by even more bogus data. I don't know if it is true, but I bet you don't either. More research would clearly be needed.
Some of us are just tired of repeating the same positions that we've stated in the past and don't bother to do so every time there's another
If I take your mail as an example then the reason for this is that your position doesn't seem to even apply to the issue and you should do more research to show why it would apply, _given the arguments already made_. Stating a position is absolutely pointless unless there are arguments to back it up. There needs to be evidence or at least convincing arguments to change data, and this has been consistently lacking (as has been pointed out as well). Simply stating positions is arguing by assertion, and we all know how unreasonable that is.
OMG THE TZ DATABASE IS DESTROYED FOREVAR AND EVAR WHY DO YOU HATE THE KITTENZ???!?!?!
That's just childish - we only heard this silly argument from you. Ridiculing other people or their arguments is also not an argument in itself. Well, at least not a reasonable one.

What this discussion needs is more reasonable arguments in favour of the change. So far, there has been a lot of arguing beside the point, and very few arguments in favour of the change that even apply to the case. That is, there might have been convincing arguments, but it's unclear whether they apply - for example, if the new data would indeed be better, then replacing very bogus data by less bogus data might well make sense, but I haven't seen anybody argue the new data is better.

The only argument that trivially stands in favour is that it decreases maintenance burden (there are a few variations). Anything else is usually conditional on the assertion that the new data is better, or that changing the data doesn't cause disruption etc., with no attempts to even rationalise these assumptions.

-- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / schmorp@schmorp.de -=====/_/_//_/\_,_/ /_/\_\
Marc Lehmann wrote:
I haven't seen anybody argue the new data is better.
It appears you overlooked some arguments in that direction; see <http://mm.icann.org/pipermail/tz/2014-August/021283.html> for example.
The only argument that trivially stands in favour is that it decreases maintenance burden
Decreasing maintenance burden is not trivial. As the principal maintainer I feel the burden more than most, and it's a burden I'd rather lighten before handing it off to my successor. I believe the data entries in question were mostly invented, perhaps to give customers the warm feeling that there are specific answers for everything (even when there aren't). Unless one has investigated the matter it may be hard to fully appreciate how misleading these data entries are, or how pointless and demoralizing it is to curate bogus data. We can't avoid the invented-data problem entirely (as the format requires *something* there), but it's good to lessen it when we can.
On 27 July 2014 03:25, Clive D.W. Feather <clive@davros.org> wrote:
If we know or have good reason to believe that the data is completely bogus, then there's no point in retaining it.
Even accepting your premise, which I don't, for all the other reasons discussed... we simply don't know: On 11 July 2014 00:58, Paul Eggert <eggert@cs.ucla.edu> wrote:
I can't prove it, but I have the strong impression that most of the entries that we haven't already checked were simply invented.
(from http://mm.icann.org/pipermail/tz/2014-July/021194.html) And so the only comment I've seen remotely resembling support for discarding data doesn't even seem to apply in this instance. -- Tim Parenti On 27 July 2014 03:25, Clive D.W. Feather <clive@davros.org> wrote:
Alan Barrett said:
As a general rule, I think that existing information should never be replaced by other information with the same or worse quality, but should only be replaced by information of greater quality. In other words, rules should be changed only when the new rules are clearly superior to the old rules. Even if the old rules are known to be inaccurate, they should be retained for the sake of stability until such time as more accurate information is known.
I don't agree. If we know or have good reason to believe that the data is completely bogus, then there's no point in retaining it. And having different data for two zones produces a subconscious belief that both are higher quality than if we just have the same. So where the data is completely wrong, I vote for these merges.
I also prefer retaining at least one first class zone for each ISO3166 country, for ease of menu or map based selection, and to allow users to see that the selected zone name contains the name of a familiar city in their country. I think that some users will be upset to see a time zone name that contains the name of a city in another country, especially if the two countries have a history of conflict.
I completely agree with this part.
--
Clive D.W. Feather          | If you lie to the compiler,
Email: clive@davros.org     | it will get its revenge.
Web: http://www.davros.org  |   - Henry Spencer
Mobile: +44 7973 377646
Alan Barrett wrote:
The maintenance burden of making similar changes many times when many zones share modern history could be eased by new syntax in the input files
It'd be nice to extend the zic input syntax along those lines, but that's not something we can plausibly do before the next release. And even if we extended zic, we couldn't use the new syntax in the data for several years, because it'd take that long for the zic changes to propagate out to the field. Finally, it isn't much of a maintenance burden to keep the bogus data in the current format; the main problem with the bogus data is that it's bogus, not that it needs formatting.
On 1 August 2014 01:53, Paul Eggert <eggert@cs.ucla.edu> wrote:
Finally, it isn't much of a maintenance burden to keep the bogus data in the current format; the main problem with the bogus data is that it's bogus, not that it needs formatting.
This is what confuses me. The minimum effort is to not touch the data at all, which pleases those of us wanting stability.

Could we use an approach where those lines considered bogus are tagged with an end-of-line comment? For example:

Zone Africa/Nouakchott -1:03:48 - LMT 1912
                        0:00    - GMT 1934 Feb 26 # BOGUS
                       -1:00    - WAT 1960 Nov 28 # BOGUS
                        0:00    - GMT

Such an approach applied generally would preserve the data until it is definitively replaced with something better and allow it to clearly be tagged as of low quality. Applications could choose to parse a specific comment format and eliminate the data, but they would be under no obligation to do so.

Stephen
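As a rough illustration of how an application might act on such tags, here is a minimal Python sketch that drops tagged lines from zic-style source text before compilation. The `# BOGUS` spelling follows the example above; the function name `strip_bogus` and the regular expression are assumptions made for this sketch, not an agreed format or an existing tool.

```python
import re

# Assumed tag format: an end-of-line comment starting with "BOGUS".
BOGUS_TAG = re.compile(r"#\s*BOGUS\b")

SAMPLE = """\
Zone Africa/Nouakchott -1:03:48 - LMT 1912
                        0:00    - GMT 1934 Feb 26 # BOGUS
                       -1:00    - WAT 1960 Nov 28 # BOGUS
                        0:00    - GMT
"""


def strip_bogus(zone_source):
    """Return `zone_source` without the lines tagged as BOGUS."""
    kept = [line for line in zone_source.splitlines()
            if not BOGUS_TAG.search(line)]
    return "\n".join(kept)


print(strip_bogus(SAMPLE))
```

In this example the remaining lines still form a zone that switches from LMT to GMT in 1912. That is not guaranteed in general: dropping a continuation line changes which period each remaining line covers, and dropping a tagged first `Zone` line would leave the entry malformed, so a real filter would need more care than this sketch shows.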
Something that's been proposed several times is to have another file containing out-of-scope/less-plausible/bogus/etc. data; people who want this lower-quality data could simply zic the file. That would be easier to maintain than complicating the existing format with #BOGUS comments. Any move in this direction should wait until after the next release, though.
On Fri, 25 Jul 2014, Stephen Colebourne wrote:
The above is an attempt to summarise some but not all of the changes. I am far from convinced that the scale of the proposed changes (far bigger than even 2013e) have been anywhere near justified.
Is there consensus on the list that all of these changes are better than what went before?
No, I don't agree that the proposed changes are better. cheers, Derick
Stephen Colebourne wrote:
The differences between 2014e and 2014f-proposed can be seen here: https://github.com/jodastephen/tzdiff/commit/c812da9e12bd6f8aa52fa2dd758e236...
You can also see changes directly from the experimental version, using URLs like this:
http://github.com/eggert/tz/compare/2014e...master

I don't normally use a web-based interface, though; I use 'git' directly. For example, on my development host the shell command 'git diff 2014e' outputs a text file containing all the changes since 2014e.
- Africa/Accra gains DST between 1920 and 1935
Yes, this is a result of integrating a better source, one that appears to be more reliable than Shanks & Pottenger, namely Scott Keltie & Epstein 1920. The new source is more contemporaneous and is from a reasonably respectable publisher, and is more trustworthy than Shanks & Pottenger.
- Africa/Freetown loses its entire history of DST
The old entry was taken from Shanks & Pottenger, but it's so dubious that it cannot be right. Not only does it disagree with Whitman, it also disagrees with The International Hydrographic Bulletin, 1932-33, p 63. One possible explanation is that Shanks entered some data with a reversed sign (thus confusing a DST of 20 minutes with a DST of 40 minutes from the preceding time zone), which would mean that a reasonably-sized chunk of 2014e is an hour off. Without further information we're better off omitting this dubious data entirely.
Asia/Urumqi loses its entire history.
Not all the history, just the post-1980 transitions to Beijing time, as Urumqi now uses Xinjiang time. This is due to reports by Luther Ma and from Guo Qingshen (via Alois Triendl) that have been discussed on the list. The data from Shanks & Pottenger are obviously wrong for Xinjiang time, and the new data are closer to being correct. I write "closer" because we do not know what happened in Ürümqi during the warlord rule in the 20th century -- but that part of the history has been retained. For the remaining changes you mentioned, where we lack data, your summary greatly exaggerates the extent of the changes. For example, it's simply not true that Ho Chi Minh City loses its "entire history": all its time stamps since 1931 are unchanged, and even some of the time stamps before then are the same as before. All that's removed are two or three questionable transitions before 1932, transitions for which we have no reliable source.
On 1 August 2014 01:20, Paul Eggert <eggert@cs.ucla.edu> wrote:
Stephen Colebourne wrote:
The differences between 2014e and 2014f-proposed can be seen here: https://github.com/jodastephen/tzdiff/commit/c812da9e12bd6f8aa52fa2dd758e236... I don't normally use a web-based interface, though; I use 'git' directly. For example, on my development host the shell command 'git diff 2014e' outputs a text file containing all the changes since 2014e.
The key reason for generating the tzdiff link is to be able to check what the *end result* of the changes is, i.e. it is sometimes a lot easier to see the effect of the changes once they are resolved into actual transitions.
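As a rough illustration of comparing resolved results rather than source lines, here is a minimal sketch. The tuple layout (local date, UTC offset, abbreviation) and the function name diff_transitions are assumptions for illustration and are not how tzdiff itself works; the sample data reuses the Africa/Nouakchott lines quoted earlier in this thread, with the "after" list showing only what dropping the two tagged lines would leave.

```python
def diff_transitions(old, new):
    """Return (removed, added) between two lists of resolved transitions.

    Each transition is a hypothetical (date, utc_offset, abbreviation)
    tuple; a real tool would resolve these from compiled zone data.
    """
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


# Illustrative data taken from the Nouakchott example in this thread.
before = [
    ("1912", "0:00", "GMT"),
    ("1934-02-26", "-1:00", "WAT"),
    ("1960-11-28", "0:00", "GMT"),
]
after = [
    ("1912", "0:00", "GMT"),
]

removed, added = diff_transitions(before, after)
for t in removed:
    print("-", *t)
for t in added:
    print("+", *t)
```

Comparing at this level shows which transitions actually disappear or appear, independent of how the source lines were rearranged.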
- Africa/Accra gains DST between 1920 and 1935
Based on your explanation, this looks like a good change with a good justification. Can any more detail be added to NEWS?
- Africa/Freetown loses its entire history of DST
Thank you for the explanation. This level of detail helps. However, while this indicates that the current data is wrong, your explanation does not indicate that the replacement is right. This is the nub of the problem. (Determining "right" is of course hard, if even possible, for these locations.) In this case your explanation goes just about far enough to say that removal is justified, but the single Linked 1912 is not explained. The same applies to the other conversions to links. In each case, the LMT changes and often the year that standard time starts is changed. I, and others, would strongly prefer the data to be left alone until research shows something clearly better. If filtering to reduce the size of the database is needed, then it should be done in code, not in the original source data.

Stephen
On 31 July 2014 20:20, Paul Eggert <eggert@cs.ucla.edu> wrote:
You can also see changes directly from the experimental version, using URLs like this:
http://github.com/eggert/tz/compare/2014e...master
I don't normally use a web-based interface, though; I use 'git' directly. For example, on my development host the shell command 'git diff 2014e' outputs a text file containing all the changes since 2014e.
One benefit of Stephen's repository, though, is that it isolates changes to data, which are otherwise lumped in with commentary and code changes, of which there have been quite a lot — many overlapping — since 2014e. Running 'git diff 2014e' gives me over 14000 lines of output which are not so useful as they would seem. In any case, I am glad to see this project moving forward with what is hopefully the beginning of a much-needed, well-researched cleanup, not the end. -- Tim Parenti
participants (13)
- Alan Barrett
- Clive D.W. Feather
- David Patte ₯
- Derick Rethans
- Lester Caine
- Marc Lehmann
- Patrice Scattolin
- Paul Eggert
- Russ Allbery
- Stephen Colebourne
- Tim Parenti
- Tobias Conradi
- Zoidsoft