[PATCH] Replace some zones with links when that doesn't lose useful info.
This is like the 2013-09-02 change, but for the Pacific this time.
* antarctica (Antarctica/DumontDUrville):
* australasia (Pacific/Tahiti, Pacific/Saipan, Pacific/Majuro)
(Pacific/Chuuk, Pacific/Pohnpei, Pacific/Palau, Pacific/Funafuti)
(Pacific/Midway, Pacific/Wake, Pacific/Wallis):
Remove zone, replacing each with a link to a region that has had
the same UTC offset since 1970.  This removes data that were
largely invented, either by us or almost surely by Shanks or his
sources.
* asia, australasia, northamerica: Create links accordingly.
* NEWS: Document this.
---
 NEWS         |  9 +++++++++
 antarctica   |  7 +------
 asia         |  3 +++
 australasia  | 53 +++++++++++++++++++----------------------------------
 northamerica |  1 +
 5 files changed, 33 insertions(+), 40 deletions(-)

diff --git a/NEWS b/NEWS
index 82c83c9..222f91e 100644
--- a/NEWS
+++ b/NEWS
@@ -58,6 +58,15 @@ Unreleased, experimental changes
     UTC+6 and not UTC+8.  (Thanks to Luther Ma and to Alois Treindl;
     Treindl sent helpful translations of two papers by Guo Qingsheng.)
 
+    Some zones have been turned into links, when they differed from existing
+    zones only for older UTC offsets where the data were likely invented.
+    These changes affect UTC offsets in pre-1970 time stamps only.  This is
+    similar to the change in release 2013e, except this time for the Pacific.
+    The affected zones are: Antarctica/DumontDUrville, Pacific/Chuuk,
+    Pacific/Funafuti, Pacific/Majuro, Pacific/Midway, Pacific/Palau,
+    Pacific/Pohnpei, Pacific/Saipan, Pacific/Tahiti, Pacific/Wake, and
+    Pacific/Wallis.
+
     Asia/Shanghai's pre-standard-time UT offset has been changed from
     8:05:57 to 8:05:43, the location of Xujiahui Observatory.  Its
     transition to standard time has been changed from 1928 to 1901.
diff --git a/antarctica b/antarctica
index 9e9d118..16aedaa 100644
--- a/antarctica
+++ b/antarctica
@@ -180,15 +180,10 @@ Zone Indian/Kerguelen 0 - zzz 1950 # Port-aux-Français
 # year-round base in the main continent
 # Dumont d'Urville, Île des Pétrels, -6640+14001, since 1956-11
 # <http://en.wikipedia.org/wiki/Dumont_d'Urville_Station> (2005-12-05)
+# See Pacific/Port_Moresby.
 #
 # Another base at Port-Martin, 50km east, began operation in 1947.
 # It was destroyed by fire on 1952-01-14.
-#
-# Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
-Zone Antarctica/DumontDUrville 0	-	zzz	1947
-			10:00	-	PMT	1952 Jan 14 # Port-Martin Time
-			0	-	zzz	1956 Nov
-			10:00	-	DDUT	# Dumont-d'Urville Time
 
 # France & Italy - year-round base
 # Concordia, -750600+1232000, since 2005
diff --git a/asia b/asia
index 4d9faeb..68994d5 100644
--- a/asia
+++ b/asia
@@ -1432,6 +1432,9 @@ Zone Asia/Tokyo 9:18:59 - LMT 1887 Dec 31 15:00u
 			9:00	Japan	J%sT
 # Since 1938, all Japanese possessions have been like Asia/Tokyo.
 
+# The following region has been like Asia/Tokyo since at least 1970.
+Link Asia/Tokyo Pacific/Palau
+
 # Jordan
 #
 # From <http://star.arabia.com/990701/JO9.html>
diff --git a/australasia b/australasia
index 605d8dd..cd9c3e4 100644
--- a/australasia
+++ b/australasia
@@ -355,8 +355,7 @@ Zone Pacific/Gambier -8:59:48 - LMT 1912 Oct # Rikitea
 			-9:00	-	GAMT	# Gambier Time
 Zone Pacific/Marquesas	-9:18:00 -	LMT	1912 Oct
 			-9:30	-	MART	# Marquesas Time
-Zone Pacific/Tahiti	-9:58:16 -	LMT	1912 Oct # Papeete
-			-10:00	-	TAHT	# Tahiti Time
+# See Pacific/Honolulu for Tahiti.
 # Clipperton (near North America) is administered from French Polynesia;
 # it is uninhabited.
 
@@ -366,11 +365,16 @@ Zone Pacific/Guam -14:21:00 - LMT 1844 Dec 31
 			9:39:00 -	LMT	1901 # Agana
 			10:00	-	GST	2000 Dec 23 # Guam
 			10:00	-	ChST	# Chamorro Standard Time
+Link Pacific/Guam Pacific/Saipan # N Mariana Is
 
 # Kiribati
 # Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
 Zone Pacific/Tarawa	11:32:04 -	LMT	1901 # Bairiki
 			12:00	-	GILT	# Gilbert Is Time
+Link Pacific/Tarawa Pacific/Funafuti # Tuvalu
+Link Pacific/Tarawa Pacific/Majuro # Marshall Is (most locations)
+Link Pacific/Tarawa Pacific/Wake # in US minor outlying islands
+Link Pacific/Tarawa Pacific/Wallis # Wallis and Futuna
 Zone Pacific/Enderbury	-11:24:20 -	LMT	1901
 			-12:00	-	PHOT	1979 Oct # Phoenix Is Time
 			-11:00	-	PHOT	1995
@@ -381,29 +385,20 @@ Zone Pacific/Kiritimati -10:29:20 - LMT 1901
 			14:00	-	LINT
 
 # N Mariana Is
-# Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
-Zone Pacific/Saipan	-14:17:00 -	LMT	1844 Dec 31
-			9:43:00 -	LMT	1901
-			9:00	-	MPT	1969 Oct # N Mariana Is Time
-			10:00	-	MPT	2000 Dec 23
-			10:00	-	ChST	# Chamorro Standard Time
+# See Pacific/Guam.
 
 # Marshall Is
+# See Pacific/Tarawa for most locations.
 # Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
-Zone Pacific/Majuro	11:24:48 -	LMT	1901
-			11:00	-	MHT	1969 Oct # Marshall Islands Time
-			12:00	-	MHT
 Zone Pacific/Kwajalein	11:09:20 -	LMT	1901
 			11:00	-	MHT	1969 Oct
 			-12:00	-	KWAT	1993 Aug 20 # Kwajalein Time
 			12:00	-	MHT
 
 # Micronesia
+# See Pacific/Guadalcanal for Pohnpei.
+# See Pacific/Port_Moresby for Chuuk.
 # Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
-Zone Pacific/Chuuk	10:07:08 -	LMT	1901
-			10:00	-	CHUT	# Chuuk Time
-Zone Pacific/Pohnpei	10:32:52 -	LMT	1901 # Kolonia
-			11:00	-	PONT	# Pohnpei Time
 Zone Pacific/Kosrae	10:51:56 -	LMT	1901
 			11:00	-	KOST	1969 Oct # Kosrae Time
 			12:00	-	KOST	1999
@@ -509,15 +504,15 @@ Zone Pacific/Norfolk 11:11:52 - LMT 1901 # Kingston
 			11:30	-	NFT	# Norfolk Time
 
 # Palau (Belau)
-# Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
-Zone Pacific/Palau	8:57:56 -	LMT	1901 # Koror
-			9:00	-	PWT	# Palau Time
+# See Asia/Tokyo.
 
 # Papua New Guinea
 # Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
 Zone Pacific/Port_Moresby 9:48:40 -	LMT	1880
 			9:48:32	-	PMMT	1895 # Port Moresby Mean Time
 			10:00	-	PGT	# Papua New Guinea Time
+Link Pacific/Port_Moresby Antarctica/DumontDUrville
+Link Pacific/Port_Moresby Pacific/Chuuk # in Micronesia
 
 # Pitcairn
 # Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
@@ -532,6 +527,7 @@ Zone Pacific/Pago_Pago 12:37:12 - LMT 1879 Jul 5
 			-11:00	-	NST	1967 Apr # N=Nome
 			-11:00	-	BST	1983 Nov 30 # B=Bering
 			-11:00	-	SST	# S=Samoa
+Link Pacific/Pago_Pago Pacific/Midway # in US minor outlying islands
 
 # Samoa (formerly and also known as Western Samoa)
 
@@ -619,6 +615,7 @@ Zone Pacific/Apia 12:33:04 - LMT 1879 Jul 5
 # Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
 Zone Pacific/Guadalcanal 10:39:48 -	LMT	1912 Oct # Honiara
 			11:00	-	SBT	# Solomon Is Time
+Link Pacific/Guadalcanal Pacific/Pohnpei # in Micronesia
 
 # Tokelau Is
 #
@@ -657,9 +654,7 @@ Zone Pacific/Tongatapu 12:19:20 - LMT 1901
 			13:00	Tonga	TO%sT
 
 # Tuvalu
-# Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
-Zone Pacific/Funafuti	11:56:52 -	LMT	1901
-			12:00	-	TVT	# Tuvalu Time
+# See Pacific/Tarawa.
 
 # US minor outlying islands
 
@@ -724,21 +719,13 @@ Zone Pacific/Funafuti 11:56:52 - LMT 1901
 # Fri. 6:30A Lv. HONOLOLU (Pearl Harbor), H.I. H.L.T. Ar. 5:30P Sun.
 # " 3:00P Ar. MIDWAY ISLAND . . . . . . . . . M.L.T. Lv. 6:00A "
 #
-Zone Pacific/Midway	-11:49:28 -	LMT	1901
-			-11:00	-	NST	1956 Jun 3
-			-11:00	1:00	NDT	1956 Sep 2
-			-11:00	-	NST	1967 Apr # N=Nome
-			-11:00	-	BST	1983 Nov 30 # B=Bering
-			-11:00	-	SST	# S=Samoa
+# See Pacific/Pago_Pago.
 
 # Palmyra
 # uninhabited since World War II; was probably like Pacific/Kiritimati
 
 # Wake
-# Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
-Zone Pacific/Wake	11:06:28 -	LMT	1901
-			12:00	-	WAKT	# Wake Time
-
+# See Pacific/Tarawa.
 
 # Vanuatu
 # Rule	NAME	FROM	TO	TYPE	IN	ON	AT	SAVE	LETTER/S
@@ -753,9 +740,7 @@ Zone Pacific/Efate 11:13:16 - LMT 1912 Jan 13 # Vila
 			11:00	Vanuatu	VU%sT	# Vanuatu Time
 
 # Wallis and Futuna
-# Zone	NAME		GMTOFF	RULES	FORMAT	[UNTIL]
-Zone Pacific/Wallis	12:15:20 -	LMT	1901
-			12:00	-	WFT	# Wallis & Futuna Time
+# See Pacific/Tarawa.
 
 ###############################################################################
 
diff --git a/northamerica b/northamerica
index 66f362b..842f6e4 100644
--- a/northamerica
+++ b/northamerica
@@ -592,6 +592,7 @@ Zone Pacific/Honolulu -10:31:26 - LMT 1896 Jan 13 12:00 #Schmitt&Cox
 			-10:00	-	HST
 
 Link Pacific/Honolulu Pacific/Johnston
+Link Pacific/Honolulu Pacific/Tahiti # in French Polynesia
 
 # Now we turn to US areas that have diverged from the consensus since 1970.
 
-- 
1.9.1
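For readers less familiar with the syntax used in the patch: a line of the form "Link TARGET LINK-NAME" makes LINK-NAME an alias for TARGET, so "Link Asia/Tokyo Pacific/Palau" gives Pacific/Palau exactly the data of Asia/Tokyo, including Tokyo's LMT and abbreviations. The following is only a minimal Python sketch of that resolution, using the eleven Link lines added by the patch. The helper names parse_links and resolve are hypothetical and are not part of zic or tzcode.

```python
# Illustrative sketch only; not part of tzcode. The Link lines are copied
# from the patch, the helper functions are hypothetical.
LINK_LINES = """\
Link Asia/Tokyo Pacific/Palau
Link Pacific/Guam Pacific/Saipan # N Mariana Is
Link Pacific/Tarawa Pacific/Funafuti # Tuvalu
Link Pacific/Tarawa Pacific/Majuro # Marshall Is (most locations)
Link Pacific/Tarawa Pacific/Wake # in US minor outlying islands
Link Pacific/Tarawa Pacific/Wallis # Wallis and Futuna
Link Pacific/Port_Moresby Antarctica/DumontDUrville
Link Pacific/Port_Moresby Pacific/Chuuk # in Micronesia
Link Pacific/Pago_Pago Pacific/Midway # in US minor outlying islands
Link Pacific/Guadalcanal Pacific/Pohnpei # in Micronesia
Link Pacific/Honolulu Pacific/Tahiti # in French Polynesia
"""


def parse_links(text):
    """Return {link_name: target} for every Link line in text."""
    links = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        fields = line.split()
        if len(fields) == 3 and fields[0] == "Link":
            _, target, link_name = fields
            links[link_name] = target
    return links


def resolve(name, links):
    """Follow links until reaching a name that is not itself a link."""
    seen = set()
    while name in links:
        if name in seen:
            raise ValueError(f"link cycle at {name}")
        seen.add(name)
        name = links[name]
    return name


links = parse_links(LINK_LINES)
print(resolve("Pacific/Tahiti", links))  # prints Pacific/Honolulu
```

After the patch, any lookup of one of the eleven affected names returns the target zone's full history, which is why the pre-1970 offsets (and the abbreviations) of the old zones are no longer visible.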
As we have discussed before, changes like this are disruptive to many people. The LMT and early time values are used out there, irrespective of their accuracy. Replacing them with values that are no longer remotely relevant to the location in question is just plain wrong.

Stephen

On 9 July 2014 07:49, Paul Eggert <eggert@cs.ucla.edu> wrote:
This is like the 2013-09-02 change, but for the Pacific this time.
* antarctica (Antarctica/DumontDUrville):
* australasia (Pacific/Tahiti, Pacific/Saipan, Pacific/Majuro)
(Pacific/Chuuk, Pacific/Pohnpei, Pacific/Palau, Pacific/Funafuti)
(Pacific/Midway, Pacific/Wake, Pacific/Wallis):
Remove zone, replacing each with a link to a region that has had
the same UTC offset since 1970.  This removes data that were
largely invented, either by us or almost surely by Shanks or his
sources.
* asia, australasia, northamerica: Create links accordingly.
* NEWS: Document this.
---
 NEWS         |  9 +++++++++
 antarctica   |  7 +------
 asia         |  3 +++
 australasia  | 53 +++++++++++++++++++----------------------------------
 northamerica |  1 +
 5 files changed, 33 insertions(+), 40 deletions(-)
Stephen Colebourne wrote:
As we have discussed before, changes like this are disruptive to many people. The LMT and early time values are used out there, irrespective of their accuracy.
It was discussed at some length, and to some extent we're just repeating that discussion now. There's one thing new, though: we now have had significant practical experience. The earlier set of changes along these lines was published in release 2013e (2013-09-19), and it hasn't caused significant disruption in the field. In practice it seems that end users don't much care about things like the time zone of Guadeloupe in 1899 -- which is probably a good thing, since the pre-2013e database was wrong anyway. I understand that simplifying away spurious pre-1970 timestamps can cause more work for detailed regression testing, but that kind of work should be routine as the tz database is always mutating for other reasons anyway. Besides, the tail should not be wagging the dog here: regression testing should be our servant, not our master.
On 9 July 2014 16:16, Paul Eggert <eggert@cs.ucla.edu> wrote:
Stephen Colebourne wrote:
As we have discussed before, changes like this are disruptive to many people. The LMT and early time values are used out there, irrespective of their accuracy.
It was discussed at some length, and to some extent we're just repeating that discussion now. There's one thing new, though: we now have had significant practical experience. The earlier set of changes along these lines was published in release 2013e (2013-09-19), and it hasn't caused significant disruption in the field. In practice it seems that end users don't much care about things like the time zone of Guadeloupe in 1899 -- which is probably a good thing, since the pre-2013e database was wrong anyway.
Well it seems that you're going to make the change no matter what, so talking about it does feel rather futile.

It's no surprise that changing relatively minor locations results in few issues. Nor is it a surprise that this list wouldn't hear of any issues, because it's so far from end users. I can guarantee that the data you have and are planning to destroy is in somebody's database somewhere, as both Joda-Time and Java SE 8 expose the full data right back to and including LMT to all users for all zones.

As I said last time, the issue isn't removing wrong values, it's about replacing them with even more wrong ones. The LMT values in particular, which used to have some meaning, now do not. If you had proposed an alternate way to define LMT values (which were based on the actual city location) then I might be less frustrated. Instead we are back to taking a sledgehammer to data that wasn't causing anyone any harm, replacing it with data that is clearly worse.

<shakes head in despair>

Stephen
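To make concrete what an LMT value "based on the actual city location" means: local mean time follows from longitude alone, at 15 degrees of longitude per hour of time. The Python sketch below assumes Papeete lies at roughly 149 degrees 34 minutes west (an assumed coordinate, not taken from this thread) and reproduces the -9:58:16 offset in the removed Pacific/Tahiti entry. The function names are hypothetical.

```python
# Illustrative sketch: local mean time (LMT) offset derived from longitude.
# The Papeete coordinate is an assumption made for this example.

def lmt_offset_seconds(degrees, minutes=0, seconds=0, east=True):
    """Return the LMT offset from UT in seconds for a given longitude.

    The Earth turns 15 degrees per hour, so 15 arcseconds of longitude
    correspond to 1 second of time.
    """
    arcseconds = degrees * 3600 + minutes * 60 + seconds
    offset = round(arcseconds / 15)
    return offset if east else -offset


def format_offset(total_seconds):
    """Format a signed offset in seconds as [-]H:MM:SS."""
    sign = "-" if total_seconds < 0 else ""
    hours, rest = divmod(abs(total_seconds), 3600)
    mins, secs = divmod(rest, 60)
    return f"{sign}{hours}:{mins:02d}:{secs:02d}"


papeete = lmt_offset_seconds(149, 34, east=False)
print(format_offset(papeete))  # prints -9:58:16

# LMT of Pacific/Honolulu as shown in the patch context: -10:31:26
honolulu = -(10 * 3600 + 31 * 60 + 26)
print(papeete - honolulu)  # prints 1990
```

The second figure is the point being argued: once Pacific/Tahiti is a link to Pacific/Honolulu, its reported LMT is off by 1990 seconds (about 33 minutes) from the value implied by Papeete's own longitude.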
On Wed, 9 Jul 2014, Stephen Colebourne wrote:
As I said last time, the issue isn't removing wrong values, it's about replacing them with even more wrong ones. The LMT values in particular, which used to have some meaning, now do not. If you had proposed an alternate way to define LMT values (which were based on the actual city location) then I might be less frustrated. Instead we are back to taking a sledgehammer to data that wasn't causing anyone any harm, replacing it with data that is clearly worse.
Let me just voice my agreement to this statement.

cheers,
Derick

-- 
http://derickrethans.nl | http://xdebug.org
Like Xdebug? Consider a donation: http://xdebug.org/donate.php
twitter: @derickr and @xdebug
Posted with an email client that doesn't mangle email: alpine
I also agree with Stephen and Derick on this.

Database records, in any database, should not be merged just because they contain the same data. My feeling is that the relationship of the records in question is coincidental, and therefore the records should remain separate.

On 2014-07-10 11:41, Derick Rethans wrote:
On Wed, 9 Jul 2014, Stephen Colebourne wrote:
As I said last time, the issue isn't removing wrong values, it's about replacing them with even more wrong ones. The LMT values in particular, which used to have some meaning, now do not. If you had proposed an alternate way to define LMT values (which were based on the actual city location) then I might be less frustrated. Instead we are back to taking a sledgehammer to data that wasn't causing anyone any harm, replacing it with data that is clearly worse.

Let me just voice my agreement to this statement.

cheers,
Derick
On Jul 10, 2014, at 12:35 PM, David Patte ₯ <dpatte@relativedata.com> wrote:
I also agree with Stephen and Derick on this
Database records, in any database, should not be merged just because they contain the same data. My feeling is that the relationship of the records in question is coincidental, and therefore the records should remain separate.
Agreed. Optimization should be done as a post-processing step. Keep all the source records distinct. After they have been turned into their final internal representation for the system that uses the data (for example, “zic” output files in Unix systems), then and only then look for identical bits, and turn things into links.

In our system, I’ve done exactly that. This is the right place partly because it’s automatic: it depends on data comparison rather than on human effort to keep explicit links up to date. In addition, it is the right place because we use abbreviated data (we keep only zone data from 2001 up), so we end up with a very large number of matches that would not be there if you looked at the source records.

paul
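The post-processing idea can be sketched roughly as follows. This is not Paul Koning's implementation and not how zic works. It is a simplified Python illustration that compares UTC offsets only (abbreviations and DST rules are ignored), reduces the UNTIL fields of the Zone lines in the patch to whole years, and uses hypothetical helper names.

```python
# Illustrative sketch: group zones whose data are identical after a cutoff
# year. Offsets and UNTIL years are taken from Zone lines in the patch;
# UNTIL values are reduced to whole years as a simplification.
from collections import defaultdict


def hms(text):
    """Convert an offset such as '11:32:04' or '-9:58:16' to seconds."""
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("-").split(":")]
    while len(parts) < 3:
        parts.append(0)
    hours, minutes, seconds = parts
    return sign * (hours * 3600 + minutes * 60 + seconds)


# name -> list of (UTC offset, UNTIL year or None for "still in effect")
ZONES = {
    "Pacific/Tarawa": [("11:32:04", 1901), ("12:00", None)],
    "Pacific/Funafuti": [("11:56:52", 1901), ("12:00", None)],
    "Pacific/Majuro": [("11:24:48", 1901), ("11:00", 1969), ("12:00", None)],
    "Pacific/Wake": [("11:06:28", 1901), ("12:00", None)],
    "Pacific/Wallis": [("12:15:20", 1901), ("12:00", None)],
    "Pacific/Chuuk": [("10:07:08", 1901), ("10:00", None)],
    "Pacific/Port_Moresby": [("9:48:40", 1880), ("9:48:32", 1895),
                             ("10:00", None)],
}


def eras_since(history, cutoff_year):
    """Keep only the eras that extend past cutoff_year."""
    return tuple(
        (hms(offset), until)
        for offset, until in history
        if until is None or until > cutoff_year
    )


def group_identical(zones, cutoff_year):
    """Return sorted groups of zone names with identical data after cutoff."""
    groups = defaultdict(list)
    for name, history in zones.items():
        groups[eras_since(history, cutoff_year)].append(name)
    return sorted(sorted(names) for names in groups.values())


for cutoff in (1900, 1950, 1970):
    print(cutoff, group_identical(ZONES, cutoff))
```

With a cutoff of 1900 all seven zones stay distinct; with 1950, Pacific/Majuro still stands apart because of its 11:00 offset until 1969; with 1970 only two groups remain. Under this scheme the cutoff is a choice made by the consumer of the data, and the source records themselves are left untouched.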
David Patte ₯ wrote:
the relationship of the records in question is coincidental, and therefore the records should remain separate
This is a good point, and in some cases in hindsight I went too far in merging zones that were coincidentally the same (e.g., merging Antarctica/Syowa with Africa/Nairobi), so I'll undo those changes soon. However, in many cases the relationship is not coincidental, as we're talking about most-likely-invented data from a single source, and for these cases we can safely discard the spurious data. This is not a question about optimizing the implementation; it's a question of having the tz data reflect our knowledge as accurately as the current format allows. Currently the tz database pretends to know more than it does, and to the extent that we can make it more honest and accurate we should do so.
On Jul 10, 2014, at 2:39 PM, Paul Eggert <eggert@CS.UCLA.EDU> wrote:
David Patte ₯ wrote:
the relationship of the records in question is coincidental, and therefore the records should remain separate
This is a good point, and in some cases in hindsight I went too far in merging zones that were coincidentally the same (e.g., merging Antarctica/Syowa with Africa/Nairobi), so I'll undo those changes soon. However, in many cases the relationship is not coincidental, as we're talking about most-likely-invented data from a single source, and for these cases we can safely discard the spurious data.
This is not a question about optimizing the implementation; it's a question of having the tz data reflect our knowledge as accurately as the current format allows. Currently the tz database pretends to know more than it does, and to the extent that we can make it more honest and accurate we should do so.
This makes a lot of sense. The TZ source is a cultural/legal data repository, so it should group things together if they are the same thing in those worlds.

paul
On Wed, Jul 09, 2014 at 08:16:12AM -0700, Paul Eggert <eggert@cs.ucla.edu> wrote:
repeating that discussion now. There's one thing new, though: we now have had significant practical experience. The earlier set of changes along these lines was published in release 2013e (2013-09-19), and it hasn't caused significant disruption in the
Something else is new, too: in the past, changes didn't have to cause "significant disruption in the field" to not be done. For example, a lot of care was used to make the tzcode portable even to weird and niche systems that, in many cases, probably don't even exist. You could doubtlessly rip out a lot of code without causing "significant disruption in the field" (if it works on gnu/linux, bsd and solaris, then failure probably isn't significant...). The question is, why have these standards changed so drastically w.r.t. tzdata?
field. In practice it seems that end users don't much care about things like the time zone of Guadeloupe in 1899 -- which is probably a good thing, since the pre-2013e database was wrong anyway.
I think almost everybody agrees to that.
other reasons anyway. Besides, the tail should not be wagging the dog here: regression testing should be our servant, not our master.
You are shooting down your own strawmen here. In previous discussions, most people seem to have been concerned about stability of timestamps, not about accuracy, correctness or regression testing.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      schmorp@schmorp.de
      -=====/_/_//_/\_,_/ /_/\_\
On 10 July 2014 21:20, Marc Lehmann <schmorp@schmorp.de> wrote:
field. In practice it seems that end users don't much care about things like the time zone of Guadeloupe in 1899 -- which is probably a good thing, since the pre-2013e database was wrong anyway.
I think almost everybody agrees to that.
One major difference I noticed when comparing the recent Pacific <https://github.com/eggert/tz/commit/662016b64ae64bc8d2680f349d1a7813dc1ee536> and Africa <https://github.com/eggert/tz/commit/4b4e789d5c5ee79366b4606d139cbb9eb1d5a28d> changes to the September 2013 changes in the Caribbean <https://github.com/eggert/tz/commit/c7bc2f8e1a65146a920977f297524e25df642674> is that, in the previous changes, the affected and coalesced zones were geographically close, and in all but one case, the most recent information lost was from 1912 and was only LMT. Needless to say, it's hard to sustain serious objection to such relatively minor changes, since as we know, most end-users don't really care about 100-year-old timestamps.

In contrast, the removals from the Pacific and Africa are sweeping and diverse. The coalesced zones are geographically disparate and many have differences as recent as the late 1960s that have completely been removed.

Additionally, I find it ironic that after a series of patches clarifying abbreviations, including the proper use of JCST versus JST, it has been decided that many are to be thrown out, and that we should link Palau into JST. I have little doubt that many of these, even those we invented, are in use... and at the very least, greater care should be taken to ensure that we aren't giving an "unfriendly" abbreviation to a region.

Whatever happened to having one zone per ISO-coded region? Yes, technically, we have all those zones, often as links, but I think there's something to be said for decoupling these and letting each region remain independent, as they truly are. (For those genuinely concerned about the extra space requirements, I'd recommend that links conditional upon a winnowing threshold could be a feature that we should consider building into zic to achieve that end.)

Although we have never claimed to be an authoritative compendium of all timekeeping facts, tz and its associated commentary have been highly regarded in the past for this purpose. It is a shame to see some of this historical documentation discarded for the sake of expediency that is marginal at best. Let's not forget that our code is art, and that art is often messy.

-- 
Tim Parenti
Tim Parenti wrote:
Whatever happened to having one zone per ISO-coded region?
That's never been the guideline -- all we've had per country code is at least one entry, which may be a Link, not a Zone. Africa/Juba is a recent example; Europe/Vatican an older one.

I hope your other specific concerns will be addressed by the partial reverts that I'll propose soon. More generally, though, I fear that the data in question -- and I'm including the entries in the 1950-1970 range -- are so bogus that we do our users no favors by propagating them.
Marc Lehmann wrote:
why have these standards changed so drastically w.r.t. tzdata?
The standards haven't changed. Nor are the proposed changes "drastic"; they're merely nibbling around some rarely-explored edges (once the problems already pointed out have been fixed). What *has* changed is my understanding of how spurious the old Shanks-based data entries are. They're really bad. I can't prove it, but I have the strong impression that most of the entries that we haven't already checked were simply invented. If I'd known how bad they were I would not have put them there in the first place.
most people seem to have been concerned about stability of timestamps. not about accuracy, correctness or regression testing.
I doubt whether there's a genuine lack of concern about accuracy or correctness. On the contrary, I think we all want the data to be as accurate and correct as it can be. The only dispute here is how much stability trumps these other concerns.
I can't prove it, but I have the strong impression that most of the entries that we haven't already checked were simply invented.
I think that's the key there. I realize the accuracy of Shanks has come into question lately, but I think that until data is proven bogus, stability trumps this suspicion. Unfortunately, we all know how difficult it can be to prove OR disprove this data. -- Tim Parenti
On Thu, 10 Jul 2014, Paul Eggert wrote:
I doubt whether there's a genuine lack of concern about accuracy or correctness. On the contrary, I think we all want the data to be as accurate and correct as it can be. The only dispute here is how much stability trumps these other concerns.
Here's where I'd draw the line: If we obtain new information with higher assurance than the old information (or guesses), then we should update the database with the new information, for the sake of accuracy. If we believe that the old information has low assurance, but we do not have higher assurance information to replace it, then we should leave the old information in place, for the sake of stability. For example, if you believe that the date that a zone switched from LMT to a standard time with a "round number" offset from UT is just a guess with low assurance, then leave it alone, until better information becomes available; don't link the zone to another zone that was similar but had a different "guess" for the date of the switch from LMT to a standard time. If you learn better information, then adjust the database accordingly. --apb (Alan Barrett)
On 11/07/14 08:04, Alan Barrett wrote:
On Thu, 10 Jul 2014, Paul Eggert wrote:
I doubt whether there's a genuine lack of concern about accuracy or correctness. On the contrary, I think we all want the data to be as accurate and correct as it can be. The only dispute here is how much stability trumps these other concerns.
Here's where I'd draw the line: If we obtain new information with higher assurance than the old information (or guesses), then we should update the database with the new information, for the sake of accuracy. If we believe that the old information has low assurance, but we do not have higher assurance information to replace it, then we should leave the old information in place, for the sake of stability.
For example, if you believe that the date that a zone switched from LMT to a standard time with a "round number" offset from UT is just a guess with low assurance, then leave it alone, until better information becomes available; don't link the zone to another zone that was similar but had a different "guess" for the date of the switch from LMT to a standard time. If you learn better information, then adjust the database accordingly.
I had thought there was an agreement that accurate historic data would be retained when the previous cull was proposed even though the 1970 limit is used? There are still documented historic updates which have not been included, and I hope that moving forward as other archives become available, proven historic data will be incorporated? Some of the 'invented' material IS of concern, but a portion of that is simply a matter of when a country started using timezones. Exact dates may not be available but that some historic changes happened is a fact? We just may never be able to confirm a proven date?
--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Lester Caine wrote:
I had thought there was an agreement that accurate historic data would be retained when the previous cull was proposed even though the 1970 limit is used?
Yes, we want to keep data that we have a reasonable amount of confidence in. The recent proposal attempted to discard data for which the only source (Shanks) has proven to be very unreliable in the past, when applied to out-of-the-way locations before 1970. (I clearly went too far in that proposal, and plan to fix that soon.) The general rule of thumb Shanks used seems to have been that when there was a gap in his database, he made something up. The tz database does not need to perpetuate this tradition everywhere.
On Thu, Jul 10, 2014 at 09:58:29PM -0700, Paul Eggert <eggert@cs.ucla.edu> wrote:
The standards haven't changed. Nor are the proposed changes
As a long-time lurker, I beg to differ. Maybe the word "standards" is misleading, but the effective maintenance has definitely changed. Never before did I hear that changes from one set of (likely bogus) data to another set of (likely bogus) data is ok because it doesn't cause "significant disruption".
entries are. They're really bad. I can't prove it, but I have the strong impression that most of the entries that we haven't already checked were simply invented. If I'd known how bad they were I would not have put them there in the first place.
That may well be, but replacing one set of invented data by another set of invented data doesn't seem to be an obvious improvement.
most people seem to have been concerned about stability of timestamps. not about accuracy, correctness or regression testing.
I doubt whether there's a genuine lack of concern about accuracy or correctness.
Who is concerned that the older data is less accurate/correct than the newer one and therefore the data needs to be changed?
On the contrary, I think we all want the data to be as accurate and correct as it can be. The only dispute here is how much stability trumps these other concerns.
What other concerns? The new data isn't more accurate or more correct than the old one, it's just different. Correctness and accuracy do trump stability, but neither accuracy nor correctness seem to be involved in this change - the change simply lumps together zones because the pre-1970 data was deemed incorrect, without there being reason to believe that this merge improves either correctness or accuracy. People in this discussion primarily complain about needless instability, not about instability caused by fixes.
--
Marc Lehmann, schmorp@schmorp.de
Deliantra, the free code+content MORPG: http://www.deliantra.net
On 07/11/2014 07:19 AM, Marc Lehmann wrote:
Never before did I hear that changes from one set of (likely bogus) data to another set of (likely bogus) data is ok because it doesn't cause "significant disruption".
Even if you didn't hear about it, we have made changes like that in the past, and we did not encounter problems in the field because the changes were done reasonably conservatively, just as the current set of changes will be done reasonably conservatively once things are ironed out. I appreciate having more eyes to check over the changes these days to make sure that significant disruptions will not ensue, and so far we've gotten useful reports from Alan Barrett, Tim Parenti, and David Patte along these lines, which will improve the quality of the database. We do sometimes make mistakes and generate a release that does cause significant disruption. The most recent example of this was the disruption caused by the 2014c release, which broke GNOME's GLib. Testing to avoid significant disruption to end users should focus on those sorts of issues; in contrast, there's no real need to worry about minor adjustments to the placeholder UT offset of Upper Volta in 1911.
The charge is that this is "changes from one set of (likely bogus) data to another set of (likely bogus) data". Simply because you think it hasn't caused issues does not make it so. Those of us who listen to this list are the consumers and customers of this data, and our feedback should inform on the desirability of change. Yet every other person who has responded to this thread has questioned this change and indicated they would prefer it not to occur. What has to happen to make it stop? No one has an issue with changing data to make it better. Everyone has an issue with changing data from possibly bogus to possibly bogus. Standard maxim - if it ain't broke don't fix it.
Stephen
On 11 July 2014 19:51, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 07/11/2014 07:19 AM, Marc Lehmann wrote:
Never before did I hear that changes from one set of (likely bogus) data to another set of (likely bogus) data is ok because it doesn't cause "significant disruption".
Even if you didn't hear about it, we have done changes like that in the past, and we did not encounter problems in the field because the changes were done reasonably conservatively, just as the current set of changes will be done reasonably conservatively once things are ironed out. I appreciate having more eyes to check over the changes these days to make sure that significant disruptions will not ensue, and so far we've gotten useful reports from Alan Barrett, Tim Parenti, and David Patte along these lines, which will improve the quality of the database.
We do sometimes make mistakes and generate a release that does cause significant disruption. The most recent example of this was the disruption caused by the 2014c release, which broke GNOME's GLib. Testing to avoid significant disruption to end users should focus on those sorts of issues; in contrast, there's no real need to worry about minor adjustments to the placeholder UT offset of Upper Volta in 1911.
Participants (9): Alan Barrett, David Patte ₯, Derick Rethans, Lester Caine, Marc Lehmann, Paul Eggert, Paul_Koning@dell.com, Stephen Colebourne, Tim Parenti