Non-valid timezones, is there a rule to remove them?
Howdy! Looking at the zone.tab file, I can see a few timezones that, although valid in the past, are not valid anymore. As an example, reported on the GNOME Bugzilla[0]: Asia/Novosibirsk used to be UTC+7, but nowadays it is considered the same as Asia/Almaty, which is UTC+6. What is the rule about keeping or removing these no-longer-used timezones? Would you guys accept a patch removing the not-used-anymore ones? What kind of information should I provide/get to be sure that a timezone could be removed? Thanks in advance and Best Regards. [0]: https://bugzilla.gnome.org/show_bug.cgi?id=722419 -- Fabiano Fidêncio
On 22 January 2014 21:05, Fabiano Fidêncio <fabiano@fidencio.org> wrote:
Would you guys accept a patch removing the not-used-anymore ones?
Generally speaking, no. If the history is different between Novosibirsk and Almaty (as we have), then we retain both zones for historical conversion purposes. If the data we have is wrong, that's another matter. We currently have Novosibirsk as observing UTC+7 year-round since March 2011. If they switched back to UTC+6, do you know when they did so, or have any news reports documenting the change? -- Tim Parenti
Fabiano Fidêncio wrote:
Looking at the zone.tab file, I can see a few timezones that, although valid in the past, are not valid anymore. As an example, reported on the GNOME Bugzilla[0]: Asia/Novosibirsk used to be UTC+7, but nowadays it is considered the same as Asia/Almaty, which is UTC+6.
This appears to be a problem with some mapping between MSDN and the tz database. If so, I suggest writing to whoever's maintaining that mapping. (The tz database itself does not contain the mapping.)
On Jan 22, 2014, at 6:31 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Fabiano Fidêncio wrote:
Looking at the zone.tab file, I can see a few timezones that, although valid in the past, are not valid anymore. As an example, reported on the GNOME Bugzilla[0]: Asia/Novosibirsk used to be UTC+7, but nowadays it is considered the same as Asia/Almaty, which is UTC+6.
This appears to be a problem with some mapping between MSDN and the tz database. If so, I suggest writing to whoever's maintaining that mapping.
The link in the GNOME bug goes to http://msdn.microsoft.com/en-us/library/ms912391(v=winembedded.11).aspx which is *NOT* a mapping between Microsoft's Time Zone Indexes and tz database tzids; it's a list of Time Zone Indexes, names for the zone in question, and time offsets and descriptions of the locale for the zone. Some people might take the description of the locale for the name and use it to try to guess the tzid to which it maps, but, as far as I know, Microsoft makes no claim that the "Time" column on the page can be easily mapped to a tzid. So there's no mapping to correct, other than perhaps a mapping made by Evolution-EWS. As Tim Parenti asks:
If the data we have is wrong, that's another matter. We currently have Novosibirsk as observing UTC+7 year-round since March 2011. If they switched back to UTC+6, do you know when they did so, or have any news reports documenting the change?
The entry in the europe file for Asia/Novosibirsk is

#
# From Paul Eggert (2006-08-19): I'm guessing about Tomsk here; it's
# not clear when it switched from +7 to +6.
# Novosibirskaya oblast', Tomskaya oblast'.
Zone Asia/Novosibirsk  5:31:40 -       LMT     1919 Dec 14 6:00
                       6:00    -       NOVT    1930 Jun 21 # Novosibirsk Time
                       7:00    Russia  NOV%sT  1991 Mar 31 2:00s
                       6:00    Russia  NOV%sT  1992 Jan 19 2:00s
                       7:00    Russia  NOV%sT  1993 May 23 # say Shanks & P.
                       6:00    Russia  NOV%sT  2011 Mar 27 2:00s
                       7:00    -       NOVT

and the entry in the asia file for Asia/Almaty is

# Almaty (formerly Alma-Ata), representing most locations in Kazakhstan
Zone Asia/Almaty       5:07:48 -           LMT     1924 May 2 # or Alma-Ata
                       5:00    -           ALMT    1930 Jun 21 # Alma-Ata Time
                       6:00    RussiaAsia  ALM%sT  1991
                       6:00    -           ALMT    1992
                       6:00    RussiaAsia  ALM%sT  2005 Mar 15
                       6:00    -           ALMT

If Microsoft's table is correct to say

201 N. Central Asia Standard Time (GMT+06:00) Almaty, Novosibirsk

then the tzdb's Asia/Novosibirsk is wrong and needs to be updated to have a standard-time offset of +6:00, starting at whatever date and time it changed from +7:00. (As Parenti notes, this does *NOT* mean that Asia/Novosibirsk should be made an alias of Asia/Almaty, much less deleted, unless the history of the two zones is known to be identical post-1970 and appears to be identical pre-1970.)

If, however, Novosibirsk is 7 hours from GMT and Almaty is 6 hours from GMT, then Microsoft's table is wrong and needs to be updated to separate Almaty and Novosibirsk.
Sorry, that makes no sense. How can a timezone ever become invalid? We do have old names, where a timezone has had the name we call it by changed - then we keep the old name forever, for backward compatibility (possibly except for a case where it existed such a short time with the wrong name that it would probably never have been used - but I don't think that case has ever arisen.)
Probably I'm using the wrong term. Let me try to explain with another example: Argentina/San_Juan. IIRC, it had a different timezone than the rest of Argentina for one year or so and then switched back to the Buenos Aires timezone. So, what I mean by invalid is keeping a different name when the differentiation doesn't exist anymore. Even in those cases
Unless Asia/Almaty was always the same as Asia/Novosibirsk and both changed from UTC+7 to UTC+6 at the same time (having two identical zones is something that isn't terribly useful, but has happened, and where it is found, we tend to make one of them just be an alias for the other - just the same as if had first been called one name and then the other). But if that's not the case, and I believe here it isn't, then how would you translate an old timestamp from Asia/Novosibirsk (from when it was still UTC+7) if the rules you're using are those for Asia/Almaty which is & was (at the relevant time) UTC+6 ?
Hmmm. I think I got it now. And it probably also answers the question I asked above.
This appears to be a problem with some mapping between MSDN and the tz database. If so, I suggest writing to whoever's maintaining that mapping.
The link in the GNOME bug goes to
http://msdn.microsoft.com/en-us/library/ms912391(v=winembedded.11).aspx
which is *NOT* a mapping between Microsoft's Time Zone Indexes and tz database tzids; it's a list of Time Zone Indexes, names for the zone in question, and time offsets and descriptions of the locale for the zone.
Some people might take the description of the locale for the name and use it to try to guess the tzid to which it maps, but, as far as I know, Microsoft makes no claim that the "Time" column on the page can be easily mapped to a tzid.
Yes, for sure they don't do this. Unfortunately we are forced to map to their format, getting the info from libical (which in turn gets the info from the zone.tab file).
So there's no mapping to correct, other than perhaps a mapping made by Evolution-EWS.
Yes, and I'm the one trying to do the mapping and getting confused :-)
As Tim Parenti asks:
If the data we have is wrong, that's another matter. We currently have Novosibirsk as observing UTC+7 year-round since March 2011. If they switched back to UTC+6, do you know when they did so, or have any news reports documenting the change?
The entry in the europe file for Asia/Novosibirsk is
#
# From Paul Eggert (2006-08-19): I'm guessing about Tomsk here; it's
# not clear when it switched from +7 to +6.
# Novosibirskaya oblast', Tomskaya oblast'.
Zone Asia/Novosibirsk  5:31:40 -       LMT     1919 Dec 14 6:00
                       6:00    -       NOVT    1930 Jun 21 # Novosibirsk Time
                       7:00    Russia  NOV%sT  1991 Mar 31 2:00s
                       6:00    Russia  NOV%sT  1992 Jan 19 2:00s
                       7:00    Russia  NOV%sT  1993 May 23 # say Shanks & P.
                       6:00    Russia  NOV%sT  2011 Mar 27 2:00s
                       7:00    -       NOVT
and the entry in the asia file for Asia/Almaty is
# Almaty (formerly Alma-Ata), representing most locations in Kazakhstan
Zone Asia/Almaty       5:07:48 -           LMT     1924 May 2 # or Alma-Ata
                       5:00    -           ALMT    1930 Jun 21 # Alma-Ata Time
                       6:00    RussiaAsia  ALM%sT  1991
                       6:00    -           ALMT    1992
                       6:00    RussiaAsia  ALM%sT  2005 Mar 15
                       6:00    -           ALMT
If Microsoft's table is correct to say
201 N. Central Asia Standard Time (GMT+06:00) Almaty, Novosibirsk
then the tzdb's Asia/Novosibirsk is wrong and needs to be updated to have a standard-time offset of +6:00, starting at whatever date and time it changed from +7:00. (As Parenti notes, this does *NOT* mean that Asia/Novosibirsk should be made an alias of Asia/Almaty, much less deleted, unless the history of the two zones is known to be identical post-1970 and appears to be identical pre-1970.)
Yeah, I got it, and now it makes sense to keep both for historical reasons.
If, however, Novosibirsk is 7 hours from GMT and Almaty is 6 hours from GMT, then Microsoft's table is wrong and needs to be updated to separate Almaty and Novosibirsk.
Hmmm. You're right, and the only thing I can do for now is open a bug in Microsoft's bugzilla (or equivalent). Thank you all for the explanations. :-) Best Regards, -- Fabiano Fidêncio
Fabiano Fidêncio wrote:
Yeah, I got it, and now it makes sense to keep both for historical reasons.
Not so much 'It makes sense', but if I am looking at previous dates then I expect to get the same offset as was applied at the time, so historic data never becomes invalid. There is still a debate as to times prior to 1970, and currently we do not have a reliable means of determining that. It seems that Microsoft may be using a different cutoff date or different rules for providing historic data? So we need to know at what point we are no longer able to rely on a service! -- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Even though the rule currently in effect is identical to that of a neighboring zone, that doesn't invalidate the first zone. The main reason is scheduling: going back into the past, the difference is a matter of historical record and needs to be kept for software that cares about such things, such as calendars. For example, if you are in Argentina/San_Juan and your calendar has a meeting in the past (for legal reasons you are keeping track of your meetings) from when the zones were different, the old rule is still in force for it. Were Argentina/San_Juan removed, that meeting time would move by 1h when Argentina/Buenos Aires is applied in its place. This is more obvious if you have a meeting where some attendees were in Argentina/Buenos Aires and some were in Argentina/San_Juan. With the old timezone, the meeting (say a conference call) occurred at two different wall-clock hours, as it actually happened at the time, whereas applying the Argentina/Buenos Aires rule to everyone makes it appear to have occurred at the same wall-clock hour for both groups. The UTC time never changed, just the wall clock, but that's what we need the tzid to get right. If the meeting occurred around midnight, it could even change day for some attendees! So to maintain historical accuracy the old rule cannot be taken out.

On 23/01/2014 5:40 AM, Fabiano Fidêncio wrote:
Sorry, that makes no sense. How can a timezone ever become invalid? We do have old names, where a timezone has had the name we call it by changed - then we keep the old name forever, for backward compatibility (possibly except for a case where it existed such a short time with the wrong name that it would probably never have been used - but I don't think that case has ever arisen.)
Probably I'm using the wrong term. Let me try to explain with another example: Argentina/San_Juan. IIRC, it had a different timezone than the rest of Argentina for one year or so and then switched back to the Buenos Aires timezone. So, what I mean by invalid is keeping a different name when the differentiation doesn't exist anymore. Even in those cases
-- Patrice Scattolin | Principal Member Technical Staff | 514.905.8744 Oracle WebCenter Mobile applications 600 Blvd de Maisonneuve West Suite 1900 Montreal, Quebec
On Jan 23, 2014, at 9:08 AM, Patrice Scattolin <patrice.scattolin@oracle.com> wrote:
For example, if you are in Argentina/San_Juan and your calendar has a meeting in the past (for legal reasons you are keeping track of your meetings) from when the zones were different, the old rule is still in force for it. Were Argentina/San_Juan removed, that meeting time would move by 1h when Argentina/Buenos Aires is applied in its place.
Exactly. I think the confusion is what the timezone data is used for. If you think it is ONLY used to display the correct current local time given a UTC time reference, then the notion of an “invalid” time zone makes sense. But in fact that’s only one small application. The bigger picture is that you have a lot of applications that store timestamps for events that may go back decades. Calendar is an example. File system time stamps, or source control timestamps, are another. It is essential that such timestamps are interpreted correctly. And to do that, you must have a database that tells you the UTC to local time mapping for all time stamp values of interest. For a lot of us, that means timestamps back to the Unix epoch (1970). For some, it means timestamps back a lot farther than that. paul
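To make the point concrete, here is a minimal sketch using the thread's Novosibirsk/Almaty example rather than the Argentine one. It assumes the fixed offsets quoted in this thread for early 2012 (Novosibirsk at UTC+7 year-round since March 2011, Almaty at UTC+6 with no DST since 2005); the meeting instant and the helper name wall_clock are hypothetical, and real software would read the offsets from the tz database instead of hard-coding them.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets as quoted in the thread for early 2012 (assumption:
# hard-coded here only for illustration, not read from tzdata).
NOVOSIBIRSK_2012 = timezone(timedelta(hours=7), "NOVT")
ALMATY_2012 = timezone(timedelta(hours=6), "ALMT")

# A past meeting stored as a UTC instant (hypothetical example value).
meeting_utc = datetime(2012, 1, 15, 17, 30, tzinfo=timezone.utc)


def wall_clock(instant, tz):
    """Render a UTC instant as local wall-clock time in the given zone."""
    return instant.astimezone(tz).strftime("%Y-%m-%d %H:%M %Z")


# Interpreting the stored instant with the zone the attendee was really in.
correct = wall_clock(meeting_utc, NOVOSIBIRSK_2012)
# Interpreting it with the substituted zone after a hypothetical "cleanup".
wrong = wall_clock(meeting_utc, ALMATY_2012)

print(correct)  # 2012-01-16 00:30 NOVT
print(wrong)    # 2012-01-15 23:30 ALMT
```

The UTC instant is the same in both cases; only the rendering differs, and here it even lands on a different calendar day.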
On Thu, Jan 23, 2014 at 7:33 PM, <Paul_Koning@dell.com> wrote:
On Jan 23, 2014, at 9:08 AM, Patrice Scattolin <patrice.scattolin@oracle.com> wrote:
For example, if you are in Argentina/San_Juan and your calendar has a meeting in the past (for legal reasons you are keeping track of your meetings) from when the zones were different, the old rule is still in force for it. Were Argentina/San_Juan removed, that meeting time would move by 1h when Argentina/Buenos Aires is applied in its place.
Exactly. I think the confusion is what the timezone data is used for.
That's exactly the confusion I had. I have to say this is one of the kindest MLs I've ever been on. Everything is clear now. I've checked that I can c&p the CLDR table into libical without legal issues, and I'm going with this approach. Thank you so much guys. You rock! Best Regards, -- Fabiano Fidêncio
On Thu, Jan 23, 2014 at 5:40 AM, Fabiano Fidêncio <fabiano@fidencio.org> wrote:
The link in the GNOME bug goes to
http://msdn.microsoft.com/en-us/library/ms912391(v=winembedded.11).aspx
which is *NOT* a mapping between Microsoft's Time Zone Indexes and tz database tzids; it's a list of Time Zone Indexes, names for the zone in question, and time offsets and descriptions of the locale for the zone.
Some people might take the description of the locale for the name and use it to try to guess the tzid to which it maps, but, as far as I know, Microsoft makes no claim that the "Time" column on the page can be easily mapped to a tzid.
Yes, for sure they don't do this. Unfortunately we are forced to map to their format, getting the info from libical (which in turn gets the info from the zone.tab file).
AFAIK, Unicode CLDR maintains the Windows->tzid mapping, which also requires the country as input in order to narrow down which zone is intended. For example, the disambiguation between UTC+06:00 zones: http://unicode.org/cldr/trac/browser/tags/release-24/common/supplemental/win... -Andrew
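A minimal sketch of why the country matters for the lookup. The table below is illustrative only: its entries and the "001" default-territory convention are assumptions made for this example, not copied from the CLDR file, and the function name to_tzid is hypothetical. A real consumer should load the actual CLDR data.

```python
# Illustrative entries only (assumption): NOT copied from the CLDR data.
# Key: (Windows zone name, territory code); "001" stands for "default".
SAMPLE_MAP = {
    ("Central Asia Standard Time", "001"): "Asia/Almaty",
    ("Central Asia Standard Time", "KZ"): "Asia/Almaty",
    ("Central Asia Standard Time", "KG"): "Asia/Bishkek",
    ("N. Central Asia Standard Time", "001"): "Asia/Novosibirsk",
}


def to_tzid(windows_zone, territory=None, table=SAMPLE_MAP):
    """Return a tzid for a Windows zone, preferring a territory-specific entry.

    Falls back to the default ("001") entry, and returns None when the
    Windows zone is not in the table at all.
    """
    if territory is not None and (windows_zone, territory) in table:
        return table[(windows_zone, territory)]
    return table.get((windows_zone, "001"))


print(to_tzid("Central Asia Standard Time", "KG"))
print(to_tzid("Central Asia Standard Time"))
print(to_tzid("N. Central Asia Standard Time", "RU"))
```

Without the territory, every UTC+06:00 user of the same Windows zone collapses onto one tzid, which is the kind of guess that led to the confusion in the GNOME bug.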
If, however, Novosibirsk is 7 hours from GMT and Almaty is 6 hours from GMT, then Microsoft's table is wrong and needs to be updated to separate Almaty and Novosibirsk.
Hmmm. You're right, and the only thing I can do for now is open a bug in Microsoft's bugzilla (or equivalent).
"N. Central Asia Standard Time" used to be displayed as "(GMT+06:00) Almaty, Novosibirsk", but "Almaty" was removed from the display name by the December 2009 cumulative time zone update for Microsoft Windows operating systems [http://support.microsoft.com/kb/976098]. Later, the UTC offset of the zone was changed to +7 by the August 2011 update [http://support.microsoft.com/kb/2570791]. So, although some old MSDN documentation may still have the out-of-date information, the Windows time zone data is up to date for this one. I'm maintaining a mapping data between the IANA tzids and Windows time zones in the Unicode CLDR project and review the data about quarterly basis [ http://www.unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml ]. It looks like Windows 8.1 started providing an API that returns an IANA Time Zone Database ID [ http://msdn.microsoft.com/en-us/library/windows/apps/windows.globalization.c... ], but I have not tried it yet. -Yoshito
Yoshito,
I'm maintaining a mapping data between the IANA tzids and Windows time zones in the Unicode CLDR project and review the data about quarterly basis [ http://www.unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml ].
I'd say you're doing awesome work! Do you think I could use your work to improve libical's timezone information? We are using libical in Evolution (and its friends) and, in evolution-ews, it would be nice to have this mapping working properly. Another question is ... do you think cldr could replace libical? This is the first time I've heard about the project ...
It looks like Windows 8.1 started providing an API that returns an IANA Time Zone Database ID [ http://msdn.microsoft.com/en-us/library/windows/apps/windows.globalization.c...], but I have not tried it yet.
I am not able to open this link, unfortunately. ps: Yoshito, please remind me to buy you a beer whenever we meet in person! Best Regards, -- Fabiano Fidêncio
On Thu, Jan 23, 2014 at 5:05 PM, Fabiano Fidêncio <fabiano@fidencio.org> wrote:
Yoshito,
I'm maintaining a mapping data between the IANA tzids and Windows time zones in the Unicode CLDR project and review the data about quarterly basis [ http://www.unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml ].
I'd say you're doing awesome work! Do you think I could use your work to improve libical's timezone information? We are using libical in Evolution (and its friends) and, in evolution-ews, it would be nice to have this mapping working properly. Another question is ... do you think cldr could replace libical? This is the first time I've heard about the project
Ignore my last question, please :-) After a brief reading I got what cldr is supposed to do (and what it isn't, which is actually the most important part :-)). But, please, don't ignore the question about using your material to improve libical's timezone information. Best Regards, -- Fabiano Fidêncio
I'm maintaining a mapping data between the IANA tzids and Windows time zones in the Unicode CLDR project and review the data about quarterly basis [ http://www.unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml ].
-Yoshito
I've found a few differences between your mapping table and my tzdata [ http://paste.stg.fedoraproject.org/4440/90540445/]. I would be happy to provide them as a patch; just let me know if that is of interest to you. Moreover, what is the reason for these kinds of differences? Could they vary from distro to distro? Best Regards, -- Fabiano Fidêncio
I'm maintaining a mapping data between the IANA tzids and Windows time zones in the Unicode CLDR project and review the data about quarterly basis [ http://www.unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml ].
-Yoshito
I've found a few differences between your mapping table and my tzdata [ http://paste.stg.fedoraproject.org/4440/90540445/]. I would be happy to provide them as a patch; just let me know if that is of interest to you.
Moreover, what is the reason for these kinds of differences? Could they vary from distro to distro?
These are FAQs. I think the CLDR project should provide a document about this. This document [ http://cldr.unicode.org/development/development-process/design-proposals/ext... ] is a little bit old, but you can see some useful background information about the mapping data. (Sorry for my crappy English)

The CLDR project defines a stable set of tzids, because tzids are also used as keys for locale data such as display names. For example:

TZ database: America/Argentina/Buenos_Aires
CLDR: America/Buenos_Aires

Old versions of the tz database only had "America/Buenos_Aires", and the CLDR project uses it as the key for the localized display names of the zone. Later, the tz database reorganized the Argentina zones, and "America/Buenos_Aires" was moved to the backward file as below:

Link America/Argentina/Buenos_Aires America/Buenos_Aires

Because we don't want CLDR locale data files to change the key identifying the zone, we preserve America/Buenos_Aires as the canonical name of the zone, and America/Argentina/Buenos_Aires is added as an alias. The mapping is defined in another place [ http://www.unicode.org/repos/cldr/trunk/common/bcp47/timezone.xml]

<type name="arbue" description="Buenos Aires, Argentina" alias="America/Buenos_Aires America/Argentina/Buenos_Aires"/>

In this file, the first entry of alias is the canonical 'long' id for the zone in the CLDR project, and the remaining entries in the alias attribute are its aliases. That means the consumer of windowsZones.xml needs to use this additional mapping data.

'Missing' data, such as "Australia/Lord_Howe", is really unmappable. In the tz database, this zone is defined as

Zone Australia/Lord_Howe 10:36:20 -  LMT   1895 Feb
                         10:00    -  EST   1981 Mar
                         10:30    LH LHST

However, Windows does not have any zone using the UTC+10:30 offset. I use a small tool in the ICU project for maintaining the mapping data, and I have exception data for such zones [ http://source.icu-project.org/repos/icu/icuapps/trunk/WinTZ/src/com/ibm/icu/... ].
I think the comments below explain why these are not included.

    /*
     * There are some Olson time zones that do not have the same base UTC offset in
     * Windows time zones. These zones are not supported by Windows.
     */
    static final String[] NO_BASE_OFFSET_MATCH_ZONES_ARRAY = {
        "Australia/Eucla",      // +8:45
        "Australia/Lord_Howe",  // +10:30
        "Etc/GMT-14",           // +14:00
        "Pacific/Chatham",      // +12:45
        "Pacific/Kiritimati",   // +14:00
        "Pacific/Marquesas",    // -9:30
        "Pacific/Norfolk",      // +11:30
    };

    /*
     * These Olson time zones are using different DST rules from Windows zones with
     * same base offset.
     */
    static final String[] NO_DST_RULE_MATCH_ZONES_ARRAY = {
        // UTC-10:00/North American DST rule.
        // Closest match - "Hawaiian Standard Time" (no DST)
        "America/Adak",

        // UTC-08:00/no DST.
        // Closest match - "Pacific Standard Time" (observes DST).
        "Etc/GMT+8",
        "America/Metlakatla",
        "Pacific/Pitcairn",

        // UTC-09:00/no DST
        // Closest match - "Alaskan Standard Time" (observes DST).
        "Etc/GMT+9",
        "Pacific/Gambier",

        // UTC-06:00/Southern Hemisphere style DST rule.
        // Closest match - "Central America Standard Time" (observes Northern Hemisphere style DST rule).
        "Pacific/Easter",

        // UTC-03:00 zone with North American DST rule.
        // Closest match - "Greenland Standard Time" (observes EU DST rule).
        "America/Miquelon",

        // UTC+02:00 with DST (Mar - Sep).
        // Closest match - "E. Europe Standard Time", "Israel Standard Time" and some others
        "Asia/Gaza",
        "Asia/Hebron",
    };

There is a request to include 'unmappable' zones in the data [ http://unicode.org/cldr/trac/ticket/5589] and I'm planning to work on this in the near future.

I don't want to hijack this ML for discussing CLDR-specific implementation. If you have further questions, please post them directly to the CLDR project. You can post your questions to the CLDR user mailing list (cldr-users@unicode.org) [ http://www.unicode.org/consortium/distlist.html#cldr_list] or problem reports/new feature requests to the CLDR trac [ http://unicode.org/cldr/trac].
Thanks, Yoshito
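As a rough sketch of the extra lookup described above (first alias entry is the CLDR canonical id, the rest are aliases), the element quoted in this thread can be turned into an alias-to-canonical table. The function name canonical_map and the wrapping <root> element are assumptions made for this example; the real timezone.xml has more surrounding structure.

```python
import xml.etree.ElementTree as ET

# The <type> element as quoted in the thread.
SNIPPET = (
    '<type name="arbue" description="Buenos Aires, Argentina" '
    'alias="America/Buenos_Aires America/Argentina/Buenos_Aires"/>'
)


def canonical_map(xml_fragment):
    """Map every id listed in an alias attribute to the first (canonical) id."""
    # Wrap in a hypothetical root so one or more <type> elements parse cleanly.
    root = ET.fromstring("<root>" + xml_fragment + "</root>")
    mapping = {}
    for elem in root.iter("type"):
        ids = elem.get("alias", "").split()
        if not ids:
            continue  # no alias attribute, nothing to map
        canonical = ids[0]
        for tzid in ids:
            mapping[tzid] = canonical
    return mapping


aliases = canonical_map(SNIPPET)
print(aliases["America/Argentina/Buenos_Aires"])  # America/Buenos_Aires
```

A consumer comparing windowsZones.xml against a local tzdata would canonicalize both sides this way first, so that a renamed zone is not reported as missing.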
Date: Thu, 23 Jan 2014 03:05:47 +0100
From: Fabiano Fidêncio <fabiano@fidencio.org>
Message-ID: <CAK9pz9+4Gu8ziAH-eAkQPCRwYRzVqYAcsYrf7ynFiN9Xt_zhCA@mail.gmail.com>

| Looking at the zone.tab file I can see a few timezones that, although valid
| in the past, they are not valid anymore.

Sorry, that makes no sense. How can a timezone ever become invalid? We do have old names, where a timezone has had the name we call it by changed - then we keep the old name forever, for backward compatibility (possibly except for a case where it existed such a short time with the wrong name that it would probably never have been used - but I don't think that case has ever arisen.) But the only way a timezone could ever become invalid would be if it had never been valid in the first place.

| I can pick as example, reported on
| the GNOME Bugzilla[0], Asia/Novosibirsk, that used to be UTC+7, but
| nowadays is considered the same one than Asia/Almaty, which one is UTC+6.

Unless Asia/Almaty was always the same as Asia/Novosibirsk and both changed from UTC+7 to UTC+6 at the same time (having two identical zones is something that isn't terribly useful, but has happened, and where it is found, we tend to make one of them just be an alias for the other - just the same as if had first been called one name and then the other). But if that's not the case, and I believe here it isn't, then how would you translate an old timestamp from Asia/Novosibirsk (from when it was still UTC+7) if the rules you're using are those for Asia/Almaty which is & was (at the relevant time) UTC+6 ?

| How is the rule about keep or remove these not-used-anymore timezones?

They are never removed. The bug is the "not used anymore", which is the wrong thing to do, and most likely results in incorrect conversions.

| Would you guys accept a patch removing the not-used-anymore ones?

I suspect not.

| What kind of information should I provide/get to be sure that one timezone
| could be removed?

You could find evidence that the data we have (historical data) is in fact wrong, and that Asia/Novosibirsk and Asia/Almaty (for the example you gave, the same would apply in any other case) have in fact always used the same timezone, at least since 1970 (but the further back in time they have been the same, the better) and that once corrected, the two zones would be identical. But even then neither will be completely removed, one might just be made into an alias of the other (and no guarantee that would even happen - it would also depend upon the projected likelihood of the two cities remaining on the same timezone for the projected future.)

kre
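The "identical since 1970" test can be sketched as below. The two histories are a simplification (an assumption of this example): they keep only the standard offsets and local dates from the Zone entries quoted earlier in the thread, ignore the Russia/RussiaAsia DST rules, and ignore the exact transition time of day. The helper names offset_on and identical_since are hypothetical.

```python
from datetime import date

# (date the standard offset starts applying, offset in hours), oldest first.
# Simplified from the Zone lines quoted in the thread; DST rules ignored.
NOVOSIBIRSK = [
    (date(1930, 6, 21), 7),
    (date(1991, 3, 31), 6),
    (date(1992, 1, 19), 7),
    (date(1993, 5, 23), 6),
    (date(2011, 3, 27), 7),
]
ALMATY = [
    (date(1930, 6, 21), 6),
]


def offset_on(history, day):
    """Return the offset of the last segment starting on or before `day`."""
    current = None
    for start, offset in history:
        if start <= day:
            current = offset
    return current


def identical_since(a, b, cutoff):
    """True if both histories agree at the cutoff and at every later change.

    Offsets are piecewise constant, so checking the cutoff plus each
    segment start after it covers every interval.
    """
    checkpoints = {cutoff} | {start for start, _ in a + b if start > cutoff}
    return all(offset_on(a, day) == offset_on(b, day) for day in checkpoints)


# The two zones agreed for a while (both +6 in 2000)...
print(offset_on(NOVOSIBIRSK, date(2000, 1, 1)), offset_on(ALMATY, date(2000, 1, 1)))
# ...but their histories since 1970 differ, so neither is an alias candidate.
print(identical_since(NOVOSIBIRSK, ALMATY, date(1970, 1, 1)))
```

Agreement on a single date, even today's date, says nothing about whether one zone can stand in for the other.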
participants (10)
Andrew Paprocki
Fabiano Fidêncio
Guy Harris
Lester Caine
Patrice Scattolin
Paul Eggert
Paul_Koning@Dell.com
Robert Elz
Tim Parenti
yoshito_umaoka@us.ibm.com