[PATCH 1/3] * backward: Move some links back to primary files.
This should allay concerns that the links would go away any time soon. Suggested by Stephen Colebourne in <http://mm.icann.org/pipermail/tz/2013-September/019801.html>. Change "`" to "'"; these days, "`" and "'" are not symmetric. * antarctica (Antarctica/McMurdo): * europe (Europe/Jersey, Europe/Guernsey, Europe/Isle_of_Man) (Europe/Mariehamn, Europe/Busingen, Europe/Vatican, Europe/San_Marino) (Arctic/Longyearbyen, Europe/Ljubljana, Europe/Podgorica) (Europe/Sarajevo, Europe/Skopje, Europe/Zagreb, Europe/Bratislava): * northamerica (America/St_Barthelemy, America/Marigot): * southamerica (America/Lower_Princes, America/Kralendijk): Move here from 'backward'. This reverts a 2013-08-09 change. --- antarctica | 15 +++++++-------- backward | 19 ------------------- europe | 28 +++++++++++++++++++++------- northamerica | 44 +++++++++++++++++++++++--------------------- southamerica | 12 ++++++++---- 5 files changed, 59 insertions(+), 59 deletions(-) diff --git a/antarctica b/antarctica index 4cb7476..234e59c 100644 --- a/antarctica +++ b/antarctica @@ -16,9 +16,9 @@ # # Except for the French entries, # I made up all time zone abbreviations mentioned here; corrections welcome! -# FORMAT is `zzz' and GMTOFF is 0 for locations while uninhabited. +# FORMAT is 'zzz' and GMTOFF is 0 for locations while uninhabited. -# These rules are stolen from the `southamerica' file. +# These rules are stolen from the 'southamerica' file. # Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S Rule ArgAQ 1964 1966 - Mar 1 0:00 0 - Rule ArgAQ 1964 1966 - Oct 15 0:00 1:00 S @@ -231,7 +231,7 @@ Zone Antarctica/Syowa 0 - zzz 1957 Jan 29 # Scott Base, Ross Island, since 1957-01. # See Pacific/Auckland. # -# These rules for New Zealand are stolen from the `australasia' file. +# These rules for New Zealand are stolen from the 'australasia' file. 
# Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S Rule NZAQ 1974 only - Nov 3 2:00s 1:00 D Rule NZAQ 1975 1988 - Oct lastSun 2:00s 1:00 D @@ -269,11 +269,11 @@ Rule NZAQ 2008 max - Apr Sun>=1 2:00s 0 S # From Lee Hotz (2001-03-08): # I queried the folks at Columbia who spent the summer at Vostok and this is # what they had to say about time there: -# ``in the US Camp (East Camp) we have been on New Zealand (McMurdo) +# "in the US Camp (East Camp) we have been on New Zealand (McMurdo) # time, which is 12 hours ahead of GMT. The Russian Station Vostok was # 6 hours behind that (although only 2 miles away, i.e. 6 hours ahead # of GMT). This is a time zone I think two hours east of Moscow. The -# natural time zone is in between the two: 8 hours ahead of GMT.'' +# natural time zone is in between the two: 8 hours ahead of GMT." # # From Paul Eggert (2001-05-04): # This seems to be hopelessly confusing, so I asked Lee Hotz about it @@ -339,10 +339,7 @@ Zone Antarctica/Palmer 0 - zzz 1965 # # # McMurdo Station, Ross Island, since 1955-12 -# See Pacific/Auckland. -# # Amundsen-Scott South Pole Station, continuously occupied since 1956-11-20 -# See Pacific/Auckland. # # From Chris Carrier (1996-06-27): # Siple, the first commander of the South Pole station, @@ -363,3 +360,5 @@ Zone Antarctica/Palmer 0 - zzz 1965 # makes all of the clocks run fast. So every couple of days, # we have to go around and set them back 5 minutes or so. # Maybe if we let them run fast all of the time, we'd get to leave here sooner!! 
+ +Link Pacific/Auckland Antarctica/McMurdo diff --git a/backward b/backward index 561e2ff..b8db6bb 100644 --- a/backward +++ b/backward @@ -18,19 +18,13 @@ Link America/Indiana/Indianapolis America/Fort_Wayne Link America/Indiana/Indianapolis America/Indianapolis Link America/Argentina/Jujuy America/Jujuy Link America/Indiana/Knox America/Knox_IN -Link America/Curacao America/Kralendijk Link America/Kentucky/Louisville America/Louisville -Link America/Curacao America/Lower_Princes -Link America/Guadeloupe America/Marigot Link America/Argentina/Mendoza America/Mendoza Link America/Rio_Branco America/Porto_Acre Link America/Argentina/Cordoba America/Rosario Link America/Denver America/Shiprock -Link America/Guadeloupe America/St_Barthelemy Link America/St_Thomas America/Virgin -Link Pacific/Auckland Antarctica/McMurdo Link Pacific/Auckland Antarctica/South_Pole -Link Europe/Oslo Arctic/Longyearbyen Link Asia/Ashgabat Asia/Ashkhabad Link Asia/Kolkata Asia/Calcutta Link Asia/Chongqing Asia/Chungking @@ -74,20 +68,7 @@ Link America/Havana Cuba Link Africa/Cairo Egypt Link Europe/Dublin Eire Link Europe/London Europe/Belfast -Link Europe/Prague Europe/Bratislava -Link Europe/Zurich Europe/Busingen -Link Europe/London Europe/Guernsey -Link Europe/London Europe/Isle_of_Man -Link Europe/London Europe/Jersey -Link Europe/Belgrade Europe/Ljubljana -Link Europe/Helsinki Europe/Mariehamn -Link Europe/Belgrade Europe/Podgorica -Link Europe/Rome Europe/San_Marino -Link Europe/Belgrade Europe/Sarajevo -Link Europe/Belgrade Europe/Skopje Link Europe/Chisinau Europe/Tiraspol -Link Europe/Rome Europe/Vatican -Link Europe/Belgrade Europe/Zagreb Link Europe/London GB Link Europe/London GB-Eire Link Etc/GMT GMT+0 diff --git a/europe b/europe index 52f15e1..4f972a5 100644 --- a/europe +++ b/europe @@ -440,6 +440,9 @@ Zone Europe/London -0:01:15 - LMT 1847 Dec 1 0:00s 1:00 - BST 1971 Oct 31 2:00u 0:00 GB-Eire %s 1996 0:00 EU GMT/BST +Link Europe/London Europe/Jersey +Link Europe/London 
Europe/Guernsey +Link Europe/London Europe/Isle_of_Man Zone Europe/Dublin -0:25:00 - LMT 1880 Aug 2 -0:25:21 - DMT 1916 May 21 2:00 -0:25:21 1:00 IST 1916 Oct 1 2:00s @@ -1105,7 +1108,9 @@ Zone Europe/Helsinki 1:39:52 - LMT 1878 May 31 1:39:52 - HMT 1921 May # Helsinki Mean Time 2:00 Finland EE%sT 1983 2:00 EU EE%sT -# Use Europe/Helsinki for the Aaland Islands. + +# Aaland Is +Link Europe/Helsinki Europe/Mariehamn # France @@ -1255,7 +1260,10 @@ Zone Europe/Berlin 0:53:28 - LMT 1893 Apr # Source for the time in Busingen 1980: # http://www.srf.ch/player/video?id=c012c029-03b7-4c2b-9164-aa5902cd58d3 -# Use Europe/Zurich for Busingen. +# From Arthur David Olson (2012-03-03): +# Busingen and Zurich have shared clocks since 1970. + +Link Europe/Zurich Europe/Busingen # Georgia # Please see the "asia" file for Asia/Tbilisi. @@ -1474,7 +1482,9 @@ Zone Europe/Rome 0:49:56 - LMT 1866 Sep 22 1:00 C-Eur CE%sT 1944 Jul 1:00 Italy CE%sT 1980 1:00 EU CE%sT -# Use Europe/Rome also for San Marino and Vatican City. + +Link Europe/Rome Europe/Vatican +Link Europe/Rome Europe/San_Marino # Latvia @@ -1853,7 +1863,7 @@ Zone Europe/Oslo 0:43:00 - LMT 1895 Jan 1 # before 1895, and therefore probably changed the local time somewhere # between 1895 and 1925 (inclusive). -# From Paul Eggert (2013-08-09): +# From Paul Eggert (2013-09-02): # # Actually, Jan Mayen was never occupied by Germany during World War II, # so it must have diverged from Oslo time during the war, as Oslo was @@ -1879,6 +1889,7 @@ Zone Europe/Oslo 0:43:00 - LMT 1895 Jan 1 # # All these events predate our cutoff date of 1970, so use Europe/Oslo # for these regions. +Link Europe/Oslo Arctic/Longyearbyen # Poland # Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S @@ -2449,11 +2460,14 @@ Zone Europe/Belgrade 1:22:00 - LMT 1884 # Shanks & Pottenger don't give as much detail, so go with Kozelj. 
1:00 - CET 1982 Nov 27 1:00 EU CE%sT -# Use Europe/Belgrade also for Bosnia and Herzegovina, Croatia, Macedonia, -# Montenegro, and Slovenia. +Link Europe/Belgrade Europe/Ljubljana # Slovenia +Link Europe/Belgrade Europe/Podgorica # Montenegro +Link Europe/Belgrade Europe/Sarajevo # Bosnia and Herzegovina +Link Europe/Belgrade Europe/Skopje # Macedonia +Link Europe/Belgrade Europe/Zagreb # Croatia # Slovakia -# See Europe/Prague. +Link Europe/Prague Europe/Bratislava # Slovenia # See Europe/Belgrade. diff --git a/northamerica b/northamerica index 418261f..d3b124e 100644 --- a/northamerica +++ b/northamerica @@ -20,7 +20,7 @@ # Howse writes (pp 121-125) that time zones were invented by # Professor Charles Ferdinand Dowd (1825-1904), # Principal of Temple Grove Ladies' Seminary (Saratoga Springs, NY). -# His pamphlet ``A System of National Time for Railroads'' (1870) +# His pamphlet "A System of National Time for Railroads" (1870) # was the result of his proposals at the Convention of Railroad Trunk Lines # in New York City (1869-10). His 1870 proposal was based on Washington, DC, # but in 1872-05 he moved the proposed origin to Greenwich. @@ -40,8 +40,8 @@ # From Paul Eggert (2001-03-06): # Daylight Saving Time was first suggested as a joke by Benjamin Franklin -# in his whimsical essay ``An Economical Project for Diminishing the Cost -# of Light'' published in the Journal de Paris (1784-04-26). +# in his whimsical essay "An Economical Project for Diminishing the Cost +# of Light" published in the Journal de Paris (1784-04-26). # Not everyone is happy with the results: # # I don't really care how time is reckoned so long as there is some @@ -167,8 +167,8 @@ Zone PST8PDT -8:00 US P%sT # of the Aleutian islands. No DST. # From Paul Eggert (1995-12-19): -# The tables below use `NST', not `NT', for Nome Standard Time. -# I invented `CAWT' for Central Alaska War Time. +# The tables below use 'NST', not 'NT', for Nome Standard Time. 
+# I invented 'CAWT' for Central Alaska War Time. # From U. S. Naval Observatory (1989-01-19): # USA EASTERN 5 H BEHIND UTC NEW YORK, WASHINGTON @@ -237,9 +237,9 @@ Zone PST8PDT -8:00 US P%sT # H.R. 6, Energy Policy Act of 2005, SEC. 110. DAYLIGHT SAVINGS. # (a) Amendment- Section 3(a) of the Uniform Time Act of 1966 (15 # U.S.C. 260a(a)) is amended-- -# (1) by striking `first Sunday of April' and inserting `second +# (1) by striking 'first Sunday of April' and inserting 'second # Sunday of March'; and -# (2) by striking `last Sunday of October' and inserting `first +# (2) by striking 'last Sunday of October' and inserting 'first # Sunday of November'. # (b) Effective Date- Subsection (a) shall take effect 1 year after the # date of enactment of this Act or March 1, 2007, whichever is later. @@ -678,13 +678,13 @@ Zone America/Boise -7:44:49 - LMT 1883 Nov 18 12:15:11 # and Switzerland counties have their own time zone histories as noted below. # # Shanks partitioned Indiana into 345 regions, each with its own time history, -# and wrote ``Even newspaper reports present contradictory information.'' +# and wrote "Even newspaper reports present contradictory information." # Those Hoosiers! Such a flighty and changeable people! # Fortunately, most of the complexity occurred before our cutoff date of 1970. # # Other than Indianapolis, the Indiana place names are so nondescript -# that they would be ambiguous if we left them at the `America' level. -# So we reluctantly put them all in a subdirectory `America/Indiana'. +# that they would be ambiguous if we left them at the 'America' level. +# So we reluctantly put them all in a subdirectory 'America/Indiana'. # From Paul Eggert (2005-08-16): # http://www.mccsc.edu/time.html says that Indiana will use DST starting 2006. @@ -948,8 +948,8 @@ Zone America/Kentucky/Monticello -5:39:24 - LMT 1883 Nov 18 12:20:36 # This story is too entertaining to be false, so go with Howse over Shanks. 
# # From Paul Eggert (2001-03-06): -# Garland (1927) writes ``Cleveland and Detroit advanced their clocks -# one hour in 1914.'' This change is not in Shanks. We have no more +# Garland (1927) writes "Cleveland and Detroit advanced their clocks +# one hour in 1914." This change is not in Shanks. We have no more # info, so omit this for now. # # Most of Michigan observed DST from 1973 on, but was a bit late in 1975. @@ -989,7 +989,7 @@ Zone America/Menominee -5:50:27 - LMT 1885 Sep 18 12:00 # occupied 1857/1900 by the Navassa Phosphate Co # US lighthouse 1917/1996-09 # currently uninhabited -# see Mark Fineman, ``An Isle Rich in Guano and Discord'', +# see Mark Fineman, "An Isle Rich in Guano and Discord", # _Los Angeles Times_ (1998-11-10), A1, A10; it cites # Jimmy Skaggs, _The Great Guano Rush_ (1994). @@ -1023,7 +1023,7 @@ Zone America/Menominee -5:50:27 - LMT 1885 Sep 18 12:00 # Milne J. Civil time. Geogr J. 1899 Feb;13(2):173-94 # <http://www.jstor.org/stable/1774359>. # -# See the `europe' file for Greenland. +# See the 'europe' file for Greenland. # Canada @@ -1224,7 +1224,7 @@ Zone America/St_Johns -3:30:52 - LMT 1884 # most of east Labrador -# The name `Happy Valley-Goose Bay' is too long; use `Goose Bay'. +# The name 'Happy Valley-Goose Bay' is too long; use 'Goose Bay'. # Zone NAME GMTOFF RULES FORMAT [UNTIL] Zone America/Goose_Bay -4:01:40 - LMT 1884 # Happy Valley-Goose Bay -3:30:52 - NST 1918 @@ -1405,7 +1405,6 @@ Zone America/Montreal -4:54:16 - LMT 1884 -5:00 Mont E%sT 1974 -5:00 Canada E%sT - # Ontario # From Paul Eggert (2006-07-09): @@ -2211,7 +2210,7 @@ Zone America/Dawson -9:17:40 - LMT 1900 Aug 20 # From Paul Eggert (1996-06-12): # For an English translation of the decree, see # <a href="http://mexico-travel.com/extra/timezone_eng.html"> -# ``Diario Oficial: Time Zone Changeover'' (1996-01-04). +# "Diario Oficial: Time Zone Changeover" (1996-01-04). 
# </a> # From Rives McDow (1998-10-08): @@ -2640,7 +2639,7 @@ Rule CR 1991 1992 - Jan Sat>=15 0:00 1:00 D # go with Shanks & Pottenger. Rule CR 1991 only - Jul 1 0:00 0 S Rule CR 1992 only - Mar 15 0:00 0 S -# There are too many San Joses elsewhere, so we'll use `Costa Rica'. +# There are too many San Joses elsewhere, so we'll use 'Costa Rica'. # Zone NAME GMTOFF RULES FORMAT [UNTIL] Zone America/Costa_Rica -5:36:13 - LMT 1890 # San Jose -5:36:13 - SJMT 1921 Jan 15 # San Jose Mean Time @@ -2931,7 +2930,10 @@ Zone America/Grenada -4:07:00 - LMT 1911 Jul # St George's # Zone NAME GMTOFF RULES FORMAT [UNTIL] Zone America/Guadeloupe -4:06:08 - LMT 1911 Jun 8 # Pointe a Pitre -4:00 - AST -# Use America/Guadeloupe also for St Barthelemy and for St Martin (French part). +# St Barthelemy +Link America/Guadeloupe America/St_Barthelemy +# St Martin (French part) +Link America/Guadeloupe America/Marigot # Guatemala # @@ -3177,7 +3179,7 @@ Zone America/Panama -5:18:08 - LMT 1890 -5:00 - EST # Puerto Rico -# There are too many San Juans elsewhere, so we'll use `Puerto_Rico'. +# There are too many San Juans elsewhere, so we'll use 'Puerto_Rico'. # Zone NAME GMTOFF RULES FORMAT [UNTIL] Zone America/Puerto_Rico -4:24:25 - LMT 1899 Mar 28 12:00 # San Juan -4:00 - AST 1942 May 3 @@ -3196,7 +3198,7 @@ Zone America/St_Lucia -4:04:00 - LMT 1890 # Castries -4:00 - AST # St Pierre and Miquelon -# There are too many St Pierres elsewhere, so we'll use `Miquelon'. +# There are too many St Pierres elsewhere, so we'll use 'Miquelon'. # Zone NAME GMTOFF RULES FORMAT [UNTIL] Zone America/Miquelon -3:44:40 - LMT 1911 May 15 # St Pierre -4:00 - AST 1980 May diff --git a/southamerica b/southamerica index ff47cef..ea264a1 100644 --- a/southamerica +++ b/southamerica @@ -1349,9 +1349,13 @@ Zone America/Curacao -4:35:47 - LMT 1912 Feb 12 # Willemstad -4:30 - ANT 1965 # Netherlands Antilles Time -4:00 - AST -# Sint Maarten -# Caribbean Netherlands -# Use America/Curacao. 
+# From Arthur David Olson (2011-06-15): +# use links for places with new iso3166 codes. +# The name "Lower Prince's Quarter" is both longer than fourteen charaters +# and contains an apostrophe; use "Lower_Princes" below. + +Link America/Curacao America/Lower_Princes # Sint Maarten +Link America/Curacao America/Kralendijk # Caribbean Netherlands # Ecuador # @@ -1646,7 +1650,7 @@ Rule Uruguay 1937 1941 - Mar lastSun 0:00 0 - # Whitman gives 1937 Oct 3; go with Shanks & Pottenger. Rule Uruguay 1937 1940 - Oct lastSun 0:00 0:30 HS # Whitman gives 1941 Oct 24 - 1942 Mar 27, 1942 Dec 14 - 1943 Apr 13, -# and 1943 Apr 13 ``to present time''; go with Shanks & Pottenger. +# and 1943 Apr 13 "to present time"; go with Shanks & Pottenger. Rule Uruguay 1941 only - Aug 1 0:00 0:30 HS Rule Uruguay 1942 only - Jan 1 0:00 0 - Rule Uruguay 1942 only - Dec 14 0:00 1:00 S -- 1.8.1.2
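For readers unfamiliar with the data format, the Link lines this patch moves out of 'backward' have the shape "Link TARGET LINK-NAME". The following is a minimal sketch of reading them, not part of the tz code itself; the function name parse_links and the excerpt constant are hypothetical, and the excerpt is copied from the 'europe' hunks above.

```python
def parse_links(text):
    """Return {link_name: target} for every 'Link TARGET LINK-NAME' line."""
    links = {}
    for raw in text.splitlines():
        # Drop a trailing '#' comment, as on the Belgrade links in the patch.
        line = raw.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) == 3 and fields[0] == "Link":
            _, target, name = fields
            links[name] = target
    return links


# Excerpt taken from the 'europe' hunks of the patch above.
EUROPE_EXCERPT = """\
Link Europe/London Europe/Jersey
Link Europe/London Europe/Guernsey
Link Europe/London Europe/Isle_of_Man
Link Europe/Belgrade Europe/Ljubljana # Slovenia
Link Europe/Prague Europe/Bratislava
"""

links = parse_links(EUROPE_EXCERPT)
for name, target in sorted(links.items()):
    print(f"{name} -> {target}")
```

Whether a Link line sits in 'backward' or in a regional file makes no difference to a reader like this one; the patch changes only where the line is maintained.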
* northamerica (America/Anguilla, America/Dominica, America/Grenada) (America/Guadeloupe, America/Montserrat, America/St_Kitts) (America/St_Lucia, America/St_Vincent, America/Tortola) (America/St_Thomas): Link to America/Port_of_Spain instead of having a separate zone that differs only in LMT. Each LMT entry is just a placeholder for unavailable info, and by itself does not justify a separate zone. Using just one zone simplifies maintenance, makes the runtime a bit smaller, and can help simplify user selection of zones. * southamerica (America/Aruba): Likewise, for America/Curacao. * backward (America/Virgin): * northamerica (America/St_Barthelemy, America/Marigot): Adjust to the other changes, by not linking to a link. --- backward | 2 +- northamerica | 50 ++++++++++++-------------------------------------- southamerica | 5 +---- 3 files changed, 14 insertions(+), 43 deletions(-) diff --git a/backward b/backward index b8db6bb..06fb192 100644 --- a/backward +++ b/backward @@ -23,7 +23,7 @@ Link America/Argentina/Mendoza America/Mendoza Link America/Rio_Branco America/Porto_Acre Link America/Argentina/Cordoba America/Rosario Link America/Denver America/Shiprock -Link America/St_Thomas America/Virgin +Link America/Port_of_Spain America/Virgin Link Pacific/Auckland Antarctica/South_Pole Link Asia/Ashgabat Asia/Ashkhabad Link Asia/Kolkata Asia/Calcutta diff --git a/northamerica b/northamerica index d3b124e..3a572ba 100644 --- a/northamerica +++ b/northamerica @@ -2547,9 +2547,7 @@ Zone America/Santa_Isabel -7:39:28 - LMT 1922 Jan 1 0:20:32 ############################################################################### # Anguilla -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Anguilla -4:12:16 - LMT 1912 Mar 2 - -4:00 - AST +Link America/Port_of_Spain America/Anguilla # Antigua and Barbuda # Zone NAME GMTOFF RULES FORMAT [UNTIL] @@ -2871,9 +2869,7 @@ Zone America/Havana -5:29:28 - LMT 1890 -5:00 Cuba C%sT # Dominica -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone 
America/Dominica -4:05:36 - LMT 1911 Jul 1 0:01 # Roseau - -4:00 - AST +Link America/Port_of_Spain America/Dominica # Dominican Republic @@ -2922,18 +2918,13 @@ Zone America/El_Salvador -5:56:48 - LMT 1921 # San Salvador -6:00 Salv C%sT # Grenada -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Grenada -4:07:00 - LMT 1911 Jul # St George's - -4:00 - AST - +Link America/Port_of_Spain America/Grenada # Guadeloupe -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Guadeloupe -4:06:08 - LMT 1911 Jun 8 # Pointe a Pitre - -4:00 - AST +Link America/Port_of_Spain America/Guadeloupe # St Barthelemy -Link America/Guadeloupe America/St_Barthelemy +Link America/Port_of_Spain America/St_Barthelemy # St Martin (French part) -Link America/Guadeloupe America/Marigot +Link America/Port_of_Spain America/Marigot # Guatemala # @@ -3100,12 +3091,7 @@ Zone America/Martinique -4:04:20 - LMT 1890 # Fort-de-France -4:00 - AST # Montserrat -# From Paul Eggert (2006-03-22): -# In 1995 volcanic eruptions forced evacuation of Plymouth, the capital. -# world.gazetteer.com says Cork Hill is the most populous location now. -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Montserrat -4:08:52 - LMT 1911 Jul 1 0:01 # Cork Hill - -4:00 - AST +Link America/Port_of_Spain America/Montserrat # Nicaragua # @@ -3187,15 +3173,10 @@ Zone America/Puerto_Rico -4:24:25 - LMT 1899 Mar 28 12:00 # San Juan -4:00 - AST # St Kitts-Nevis -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/St_Kitts -4:10:52 - LMT 1912 Mar 2 # Basseterre - -4:00 - AST +Link America/Port_of_Spain America/St_Kitts # St Lucia -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/St_Lucia -4:04:00 - LMT 1890 # Castries - -4:04:00 - CMT 1912 # Castries Mean Time - -4:00 - AST +Link America/Port_of_Spain America/St_Lucia # St Pierre and Miquelon # There are too many St Pierres elsewhere, so we'll use 'Miquelon'. 
@@ -3206,10 +3187,7 @@ Zone America/Miquelon -3:44:40 - LMT 1911 May 15 # St Pierre -3:00 Canada PM%sT # St Vincent and the Grenadines -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/St_Vincent -4:04:56 - LMT 1890 # Kingstown - -4:04:56 - KMT 1912 # Kingstown Mean Time - -4:00 - AST +Link America/Port_of_Spain America/St_Vincent # Turks and Caicos # @@ -3243,11 +3221,7 @@ Zone America/Grand_Turk -4:44:32 - LMT 1890 -5:00 TC E%sT # British Virgin Is -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Tortola -4:18:28 - LMT 1911 Jul # Road Town - -4:00 - AST +Link America/Port_of_Spain America/Tortola # Virgin Is -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/St_Thomas -4:19:44 - LMT 1911 Jul # Charlotte Amalie - -4:00 - AST +Link America/Port_of_Spain America/St_Thomas diff --git a/southamerica b/southamerica index ea264a1..f2e8f12 100644 --- a/southamerica +++ b/southamerica @@ -631,10 +631,7 @@ Zone America/Argentina/Ushuaia -4:33:12 - LMT 1894 Oct 31 -3:00 - ART # Aruba -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Aruba -4:40:24 - LMT 1912 Feb 12 # Oranjestad - -4:30 - ANT 1965 # Netherlands Antilles Time - -4:00 - AST +Link America/Curacao America/Aruba # Bolivia # Zone NAME GMTOFF RULES FORMAT [UNTIL] -- 1.8.1.2
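The "not linking to a link" adjustment in the commit message can be illustrated with a small sketch. It assumes links are held as a plain dictionary of link name to target; the helper names resolve and links_to_links are hypothetical and do not come from zic. The sample data is the state the tree would be in if America/St_Thomas and America/Guadeloupe became links while 'backward' and the St_Barthelemy and Marigot entries were left untouched.

```python
def resolve(name, links):
    """Follow Link entries until reaching a name that is not itself a link."""
    seen = set()
    while name in links:
        if name in seen:
            raise ValueError(f"link cycle at {name}")
        seen.add(name)
        name = links[name]
    return name


def links_to_links(links):
    """Names whose target is itself a link rather than a Zone."""
    return sorted(name for name, target in links.items() if target in links)


# Hypothetical intermediate state: two former zones are now links,
# but three older links still point at them.
links = {
    "America/St_Thomas": "America/Port_of_Spain",
    "America/Guadeloupe": "America/Port_of_Spain",
    "America/Virgin": "America/St_Thomas",
    "America/St_Barthelemy": "America/Guadeloupe",
    "America/Marigot": "America/Guadeloupe",
}

bad = links_to_links(links)
fixed = {name: resolve(name, links) for name in links}
print(bad)
print(fixed["America/Virgin"])
```

Retargeting each offending name at its resolved zone gives the same result as the 'backward' and 'northamerica' hunks above, where all three names end up linked directly to America/Port_of_Spain.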
On 2013-09-03 01:34, Paul Eggert wrote: ...
Link to America/Port_of_Spain instead of having a separate zone that differs only in LMT. Each LMT entry is just a placeholder for unavailable info, and by itself does not justify a separate zone. Using just one zone simplifies maintenance, makes the runtime a bit smaller, and can help simplify user selection of zones. * southamerica (America/Aruba): Likewise, for America/Curacao. * backward (America/Virgin): * northamerica (America/St_Barthelemy, America/Marigot):
Folks seem to have forgotten that initial LMT entries contain the Standard Time start date, as well as the prior GMTOFF based on longitude of that location. While I agree that merging zones with different GMTOFF makes no significant difference, merging zones with different Standard Time start date loses useful historical data, unless the tzdb is seen mainly as a source of DST changes since 1970. In the changes highlighted below, Standard Time start dates vary from 1890-1921, with some different dates in 1911-1912. Zones should be merged only if all linked locations have the same Standard Time start date, if we wish to maintain the history of when Standard Time was established in each time zone. ...
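The remark that the LMT GMTOFF is "based on longitude" rests on simple arithmetic: the Earth turns 360 degrees in 24 hours, so one degree of longitude corresponds to 240 seconds of local mean time. A minimal sketch of that conversion follows; the function names are hypothetical, and the two offsets are taken from the Zone lines removed by the patch.

```python
def parse_gmtoff(text):
    """Convert a GMTOFF field such as '-4:12:16' to signed seconds."""
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("+-").split(":")]
    parts += [0] * (3 - len(parts))  # allow 'h' and 'h:mm' forms
    hours, minutes, seconds = parts
    return sign * (hours * 3600 + minutes * 60 + seconds)


def implied_longitude(gmtoff):
    """Degrees east of Greenwich implied by a local-mean-time offset."""
    return parse_gmtoff(gmtoff) / 240.0  # 240 s of time per degree


# Offsets copied from the removed Zone lines in the patch.
for zone, off in [("America/Anguilla", "-4:12:16"), ("America/Aruba", "-4:40:24")]:
    print(zone, off, round(implied_longitude(off), 2))
```

Negative results are degrees west. The sketch says nothing about how accurately the original offsets were measured; it only shows what longitude each offset implies.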
# Anguilla -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Anguilla -4:12:16 - LMT 1912 Mar 2 - -4:00 - AST +Link America/Port_of_Spain America/Anguilla
# Antigua and Barbuda # Zone NAME GMTOFF RULES FORMAT [UNTIL] @@ -2871,9 +2869,7 @@ Zone America/Havana -5:29:28 - LMT 1890 -5:00 Cuba C%sT
# Dominica -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Dominica -4:05:36 - LMT 1911 Jul 1 0:01 # Roseau - -4:00 - AST +Link America/Port_of_Spain America/Dominica
# Dominican Republic
@@ -2922,18 +2918,13 @@ Zone America/El_Salvador -5:56:48 - LMT 1921 # San Salvador -6:00 Salv C%sT
# Grenada -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Grenada -4:07:00 - LMT 1911 Jul # St George's - -4:00 - AST - +Link America/Port_of_Spain America/Grenada # Guadeloupe -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Guadeloupe -4:06:08 - LMT 1911 Jun 8 # Pointe a Pitre - -4:00 - AST +Link America/Port_of_Spain America/Guadeloupe ... # Montserrat -# From Paul Eggert (2006-03-22): -# In 1995 volcanic eruptions forced evacuation of Plymouth, the capital. -# world.gazetteer.com says Cork Hill is the most populous location now. -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Montserrat -4:08:52 - LMT 1911 Jul 1 0:01 # Cork Hill - -4:00 - AST +Link America/Port_of_Spain America/Montserrat ... # St Kitts-Nevis -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/St_Kitts -4:10:52 - LMT 1912 Mar 2 # Basseterre - -4:00 - AST +Link America/Port_of_Spain America/St_Kitts
# St Lucia -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/St_Lucia -4:04:00 - LMT 1890 # Castries - -4:04:00 - CMT 1912 # Castries Mean Time - -4:00 - AST +Link America/Port_of_Spain America/St_Lucia ... # St Vincent and the Grenadines -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/St_Vincent -4:04:56 - LMT 1890 # Kingstown - -4:04:56 - KMT 1912 # Kingstown Mean Time - -4:00 - AST +Link America/Port_of_Spain America/St_Vincent ... # British Virgin Is -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Tortola -4:18:28 - LMT 1911 Jul # Road Town - -4:00 - AST +Link America/Port_of_Spain America/Tortola
# Virgin Is -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/St_Thomas -4:19:44 - LMT 1911 Jul # Charlotte Amalie - -4:00 - AST +Link America/Port_of_Spain America/St_Thomas ... # Aruba -# Zone NAME GMTOFF RULES FORMAT [UNTIL] -Zone America/Aruba -4:40:24 - LMT 1912 Feb 12 # Oranjestad - -4:30 - ANT 1965 # Netherlands Antilles Time - -4:00 - AST +Link America/Curacao America/Aruba
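The merge criterion proposed above (merge only when the transition dates agree) can be sketched as a simple grouping. This is an illustration under stated assumptions, not a proposal for zic: the UNTIL fields are copied as raw strings from the removed LMT lines quoted above, the helper names are hypothetical, and raw strings are compared without normalizing them, so "1911 Jul" and "1911 Jul 1 0:01" count as different.

```python
from collections import defaultdict

# UNTIL field of the initial LMT line for each zone the patch turns into a link.
LMT_UNTIL = {
    "America/Anguilla": "1912 Mar 2",
    "America/Dominica": "1911 Jul 1 0:01",
    "America/Grenada": "1911 Jul",
    "America/Guadeloupe": "1911 Jun 8",
    "America/Montserrat": "1911 Jul 1 0:01",
    "America/St_Kitts": "1912 Mar 2",
    "America/St_Lucia": "1890",
    "America/St_Vincent": "1890",
    "America/Tortola": "1911 Jul",
    "America/St_Thomas": "1911 Jul",
}


def group_by_lmt_end(until_by_zone):
    """Group zone names by the date on which their LMT entry ends."""
    groups = defaultdict(list)
    for zone, until in until_by_zone.items():
        groups[until].append(zone)
    return dict(groups)


def mergeable(zones, until_by_zone):
    """True when every listed zone leaves LMT on the same recorded date."""
    return len({until_by_zone[z] for z in zones}) == 1


groups = group_by_lmt_end(LMT_UNTIL)
for until, zones in sorted(groups.items()):
    print(until, zones)
print(mergeable(list(LMT_UNTIL), LMT_UNTIL))
```

Under this criterion the ten zones fall into five groups rather than one. Note that for St_Lucia and St_Vincent the 1890 date ends LMT but begins CMT and KMT respectively; AST starts in 1912 in both removed entries.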
On 3 September 2013 09:15, Brian Inglis <Brian.Inglis@systematicsw.ab.ca> wrote:
On 2013-09-03 01:34, Paul Eggert wrote:
Link to America/Port_of_Spain instead of having a separate zone that differs only in LMT. Each LMT entry is just a placeholder for unavailable info, and by itself does not justify a separate zone. Using just one zone simplifies maintenance, makes the runtime a bit smaller, and can help simplify user selection of zones. * southamerica (America/Aruba): Likewise, for America/Curacao. * backward (America/Virgin): * northamerica (America/St_Barthelemy, America/Marigot):
Folks seem to have forgotten that initial LMT entries contain the Standard Time start date, as well as the prior GMTOFF based on longitude of that location.
While I agree that merging zones with different GMTOFF makes no significant difference, merging zones with different Standard Time start date loses useful historical data, unless the tzdb is seen mainly as a source of DST changes since 1970.
In the changes highlighted below, Standard Time start dates vary from 1890-1921, with some different dates in 1911-1912.
Zones should be merged only if all linked locations have the same Standard Time start date, if we wish to maintain the history of when Standard Time was established in each time zone.
Yes, unfortunately I agree with Brian. I note that the patch also changed the observable time-zone abbreviations (KMT/CMT etc). Again, while this might be invented data, through existence over a long period it has become undeletable. As such, most of this patch would need reverting :-( Stephen
Brian Inglis wrote:
merging zones with different Standard Time start date loses useful historical data
That's not a problem for the proposed change, because in this particular case, no useful historical data are lost. There is no real evidence that the start dates actually differed. All we have is some guesswork from Shanks. The Shanks data often contain guesswork, and the abovementioned transition dates from LMT are almost certainly part of that guesswork. The associated time zone abbreviations are also guesswork (in this case, my guesswork and not Shanks'), so they do not contain any useful historical data either.
On 3 September 2013 15:47, Paul Eggert <eggert@cs.ucla.edu> wrote:
Brian Inglis wrote:
merging zones with different Standard Time start date loses useful historical data
That's not a problem for the proposed change, because in this particular case, no useful historical data are lost. There is no real evidence that the start dates actually differed. All we have is some guesswork from Shanks. The Shanks data often contain guesswork, and the abovementioned transition dates from LMT are almost certainly part of that guesswork.
The associated time zone abbreviations are also guesswork (in this case, my guesswork and not Shanks'), so they do not contain any useful historical data either.
Your long experience with the tzdb leads you to assert that these are guesswork, therefore OK to delete. Unfortunately, the consumers of the tzdb see the data as fact, and since the data has been there for many years that simply reaffirms the "fact" status. I'd also suggest that Shanks information has at least some degree of thought behind it, certainly better than picking up the thoughts of some other location that has always been under another jurisdiction. I really wasn't joking when I suggested that pretty much no refactoring is now possible. It's certainly the case that pretty much every refactoring you can think of has some data loss. thanks Stephen
Stephen Colebourne wrote:
Shanks information has at least some degree of thought behind it,
As a longtime reader of the Shanks data, I have a good feeling for when it's reliable and when it isn't. This is definitely a place where it isn't.
the consumers of the tzdb see the data as fact
That's a problem, since much of the data are wrong. And it's all the more reason to omit dubious data when we can, which is the case here.
Paul Eggert wrote:
Stephen Colebourne wrote:
Shanks information has at least some degree of thought behind it,
As a longtime reader of the Shanks data, I have a good feeling for when it's reliable and when it isn't. This is definitely a place where it isn't.
Followers of Time Team will recognise this :) Nice drawings of some ancient site which when reopened bears no relation to real remains ...
the consumers of the tzdb see the data as fact
That's a problem, since much of the data are wrong. And it's all the more reason to omit dubious data when we can, which is the case here.
Am I detecting something of a pattern here? If material is inaccurate or unsubstantiated, then fine, it needs to be removed. Git version control is not ideal for tracking this type of update, but it does at least retain the history of corrections, and the commit message should explain the detail. But in keeping what remains, can we at least agree that the quality of the material is heading in the right direction?
-- Lester Caine
On 3 September 2013 16:52, Paul Eggert <eggert@cs.ucla.edu> wrote:
Stephen Colebourne wrote:
the consumers of the tzdb see the data as fact
That's a problem, since much of the data are wrong. And it's all the more reason to omit dubious data when we can, which is the case here.
If you were just removing incorrect data, then that would be an enhancement. But you are in fact removing one set of dubious data and replacing it with another set of dubious data from an entirely different location, via the link. That's a problem (not an earth-shattering one, but it's still negative, not positive). Whereas if you did nothing, made no change, then we wouldn't be having this discussion. It is my judgement that this change, like many recently, is not what I would describe as an enhancement; it is just refactoring. It's your desire to refactor that is at the heart of pretty much every recent thread about reverts. If you had made no changes since May, apart from to the C code and to current local time, then I, and many others, would be entirely happy. Stephen
Stephen Colebourne <scolebourne@joda.org> writes:
If you had made no changes since May, apart from to the C code and to current local time, then I, and many others, would be entirely happy.
I've been sitting on this comment for quite some time, but, seriously, haven't you noticed that it's trivial for you or anyone else to produce a tz distribution that satisfies these requirements? If that's what you want, well, you have all of the tools required to do this right now, and could have been doing this since May with far less investment of effort than you're currently putting into long messages to this mailing list. I guess I don't understand why you seem so deeply invested in convincing Paul that you're right and he's wrong when it's fairly trivially possible for you to generate exactly the data that you want according to the criteria that you want followed without insisting that other people do the work for you. If you want to freeze all historical data in stone at the state they were in as of May of this year and only adopt forward-looking changes, then by all means do so! Set up a web site, set up a Git repository, cherry-pick the changes you want, and enjoy the perfect control you then have over your data source. I have been following this mailing list for many, many years, and when I first started following it I read all messages to this list going back to the foundation of the list. With that information in mind, I'm quite comfortable in saying that the maintenance policy that you're asking for is not the maintenance policy that ado used, and is not the maintenance policy that has ever been used for this project. I don't understand why you're so insistent on pushing a different maintenance policy on the project against multiple objections instead of just filtering the project changes down to the ones you approve of and publishing your own work. -- Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
On 3 September 2013 22:27, Russ Allbery <rra@stanford.edu> wrote:
Stephen Colebourne <scolebourne@joda.org> writes:
If you had made no changes since May, apart from to the C code and to current local time, then I, and many others, would be entirely happy.
I've been sitting on this comment for quite some time, but, seriously, haven't you noticed that it's trivial for you or anyone else to produce a tz distribution that satisfies these requirements? If that's what you want, well, you have all of the tools required to do this right now, and could have been doing this since May with far less investment of effort than you're currently putting into long messages to this mailing list.
I guess I don't understand why you seem so deeply invested in convincing Paul that you're right and he's wrong when it's fairly trivially possible for you to generate exactly the data that you want according to the criteria that you want followed without insisting that other people do the work for you.
If you want to freeze all historical data in stone at the state they were in as of May of this year and only adopt forward-looking changes, then by all means do so! Set up a web site, set up a Git repository, cherry-pick the changes you want, and enjoy the perfect control you then have over your data source.
I have been following this mailing list for many, many years, and when I first started following it I read all messages to this list going back to the foundation of the list. With that information in mind, I'm quite comfortable in saying that the maintenance policy that you're asking for is not the maintenance policy that ado used, and is not the maintenance policy that has ever been used for this project. I don't understand why you're so insistent on pushing a different maintenance policy on the project against multiple objections instead of just filtering the project changes down to the ones you approve of and publishing your own work.
There are a number of points here.
Firstly, creating a fork of the data is entirely unsatisfactory. Alongside stability, I have indicated that ubiquity is a key value of the data. The same data is used everywhere from Unix to Java to mobile phones. Creating a fork greatly devalues the data (the value of the two parts is less than the value of the whole). In addition, the tzdb charter requires changes to have consensus, so I have every right to complain here.
Secondly, I'm not speaking on behalf of myself, but on behalf of Java development generally. I'd also suggest that I'm expressing the opinions of enterprise software development more generally (see the comments to the list from Bloomberg and Oracle RDBMS).
Thirdly, I note that the leading supporters of Paul's approach are from an academic background (Paul, Guy, yourself). With respect, I wonder if that academic background insulates you from the needs of enterprise software, primarily stability.
Fourthly, it seems to me that the recent batch of changes are far in excess of what has happened over previous years. For example, https://github.com/eggert/tz/commits/master/backward shows that the backward file was modified a number of times in the past few years, but almost always for changes to the spelling or naming of zone IDs (something which I've not opposed, even though I know CLDR finds that problematic).
Finally, I'm NOT asking for all historical data to be frozen. I'm asking for no historical data to be changed UNLESS the replacement is a clear enhancement. This is a common-sense and perfectly reasonable request of the database. It's also how all well-respected APIs and data sets (like CLDR) operate. Such an approach does explicitly rule out zone ID merging that loses the start date of offsets or abbreviations, even if those are guesswork/invented (because the replacement is not an enhancement, it's worse).
To be clear, I hate the fact that I'm having to write these emails. I think this list should be incredibly quiet, only receiving an email when a current time changes, or when someone does some historical research into a zone. It's not my fault that the database maintainer is insisting on making unnecessary and negative changes, and then refuses to revert them when the lack of consensus is clear.
In summary, given the importance of the data set, and how it is currently being abused, I have no choice but to pursue my objections.
Stephen
Stephen Colebourne wrote:
ubiquity is a key value of the data. The same data is used everywhere from Unix to Java to mobile phones.
No, it's pretty routinely filtered before it hits many platforms. One example: QNX has unsigned time_t, which by design filters out all data before 1970.
Furthermore, there is an inevitable delay in propagating changes to the field. Even if we're talking a single host with 64-bit signed time_t (so that it matches Java's 'long'), I've seen situations where Java's copy of the data disagrees with the POSIX copy. And certainly a distributed application cannot assume ubiquity, as the client and server may be updated at different times. So, for various reasons unrelated to the proposed changes, it's already the case that applications cannot assume that the data are ubiquitous and that the same data are used everywhere.
That's not to say that we should introduce changes merely for the sake of changes; far from it. I agree with you that stability is a good property. But we shouldn't be inhibited from change because of the goal of having the data be the same everywhere. That goal is unattainable, and always has been.
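To make the time_t point concrete, here is a minimal Python sketch of how an unsigned time_t discards pre-1970 data. It is not QNX or tz code: the function names and the sample transition instants are hypothetical, and the only assumption carried over from the message is that an unsigned time_t cannot hold a negative count of seconds since the Epoch.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_seconds(moment: datetime) -> int:
    """Seconds since 1970-01-01 00:00:00; negative before the Epoch."""
    return int((moment - EPOCH).total_seconds())


def fits_unsigned_time_t(seconds: int) -> bool:
    """An unsigned time_t has no way to store a negative value."""
    return seconds >= 0


# Hypothetical transition instants, chosen only for illustration.
transitions = [
    datetime(1912, 2, 12, 4, 35, 47, tzinfo=timezone.utc),
    datetime(1969, 12, 31, 23, 59, 59, tzinfo=timezone.utc),
    datetime(1970, 1, 1, 0, 0, 0, tzinfo=timezone.utc),
    datetime(2013, 9, 4, 15, 32, 0, tzinfo=timezone.utc),
]

# Keep only the instants an unsigned time_t could represent.
kept = [t for t in transitions if fits_unsigned_time_t(epoch_seconds(t))]
dropped = [t for t in transitions if t not in kept]

print("kept:   ", [t.isoformat() for t in kept])
print("dropped:", [t.isoformat() for t in dropped])
```

On such a platform every transition before the Epoch is simply invisible, whatever the source files say about it.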
I'm not speaking on behalf of myself, but on behalf of Java development generally.
These comments would have more weight if they pointed to user problems that occurred when we made similar changes in the past. Based on my experience I'm skeptical that there were significant user problems. I've asked the list for reports of problems but nobody else has reported problems either. This suggests that the concerns are misplaced. On this list I have also noted that the changes promise to make life easier for users in some cases, by omitting irrelevant choices. This is a real advantage that should trump stability concerns.
the leading supporters of Paul's approach are from an academic background (Paul, Guy, yourself)
This appears to be based on a misconception. I won't speak for Guy and Russ, but my career has been spent more in industry than in academia. I developed most of the tz database while in industry: I worked on enterprise software, and built several distributed applications involving many clients and using the tz database. I am attempting to use the tz maintenance practices that I used while in industry.
the recent batch of changes are far in excess of what has happened over previous years.
Sometimes I get up the energy to fix things. Often I don't. (Let's not be looking at gift horses in the mouth. :-)
zone ID merging that loses the start date of offsets or abbreviations, even if those are guesswork/invented (because the replacement is not an enhancement, it's worse).
I've had quite a bit of experience in dealing with the Shanks data. From my experience the proposed change is a fairer representation of what we know than the previous version was. You're right that we don't know that the new version is correct and the old is wrong (both are guesses), but it's not right to say that the new version is worse.
On 4 September 2013 15:32, Paul Eggert <eggert@cs.ucla.edu> wrote:
ubiquity is a key value of the data. The same data is used everywhere from Unix to Java to mobile phones.
No, it's pretty routinely filtered before it hits many platforms. One example: QNX has unsigned time_t, which by design filters out all data before 1970.
Furthermore, there is an inevitable delay in propagating changes to the field. Even if we're talking a single host with 64-bit signed time_t (so that it matches Java's 'long'), I've seen situations where Java's copy of the data disagrees with the POSIX copy. And certainly a distributed application cannot assume ubiquity, as the client and server may be updated at different times. So, for various reasons unrelated to the proposed changes, it's already the case that applications cannot assume that the data are ubiquitous and that the same data are used everywhere.
Not the ubiquity I meant. I meant that if there exists a fork of tzdb, then it will over time diverge. If Scotland has its own time different from England, then one tzdb might name it Edinburgh and the forked tzdb uses Glasgow. That divergence, or non-ubiquity, would be very unhelpful to everyone that needs time-zone data.
I'm not speaking on behalf of myself, but on behalf of Java development generally.
These comments would have more weight if they pointed to user problems that occurred when we made similar changes in the past. Based on my experience I'm skeptical that there were significant user problems. I've asked the list for reports of problems but nobody else has reported problems either. This suggests that the concerns are misplaced.
The data is used far more widely than just zic. Some of those uses work directly from the source tzdb data. Some of those uses expose source tzdb data that is not exposed via the zic binary. Thus, it certainly is the case that there are people affected by each change. Chances are they won't know it until a month or two down the line. For example, the removal of "Castries Mean Time" and "Kingstown Mean Time" will be visible in Joda-Time, and the change to the end of LMT will be visible in Joda-Time and JSR-310. Most people will adapt, not complain. But it is clearly a fact to say that data has been deleted and that deletion is observable to consumers of the data. The only argument is whether that data was sufficiently in error to warrant deletion.
On this list I have also noted that the changes promise to make life easier for users in some cases, by omitting irrelevant choices. This is a real advantage that should trump stability concerns.
There are other ways (winnowing) to reduce the selection problem, because that problem is based off zic. For those of us parsing the source tzdb files directly, any data loss is data loss.
zone ID merging that loses the start date of offsets or abbreviations, even if those are guesswork/invented (because the replacement is not an enhancement, it's worse).
I've had quite a bit of experience in dealing with the Shanks data. From my experience the proposed change is a fairer representation of what we know than the previous version was. You're right that we don't know that the new version is correct and the old is wrong (both are guesses), but it's not right to say that the new version is worse.
Changes like this are a longstanding part of maintenance, and I'm becoming inclined to think that we shouldn't discontinue this practice purely from a desire to not change things.
Again, I don't want to stop enhancements; that would be counter-productive to all. However, by making the change you are asserting that it is more true that all 10 locations in the Caribbean have had the same local time since the year dot than that they had different times (that we just don't know anything about). I.e., the fact that you have made a change at all means that the data should now be more reliable. Yet you accept that both the before and after are guesses.
So, if you can reply to this email and say "with additional research I can say with reasonable certainty that all 10 locations have always had the same local time" then the change is entirely justified. If you can't, then the change should be reverted as not based on enough evidence.
(To put it another way, the barrier required for changing the ID like this is higher than the barrier was when the ID was created in the first place.)
For example, America/Curacao and America/Aruba have exactly the same time since the year dot apart from the LMT value (they do have the same LMT end date). As such a Link is appropriate. Whereas America/Tortola and America/Port_of_Spain have different LMT end dates. This means that Joda-Time users will see a change in local time when querying between 1911-07-01 and 1912-03-02.
More broadly, I'd suggest it would have been wiser to only suggest this patch once the current mess has settled down.
Stephen
On 09/04/13 09:24, Stephen Colebourne wrote:
If Scotland has its own time different from England, then one tzdb might name it Edinburgh and the forked tzdb uses Glasgow. That divergence, or non-ubiquity, would be very unhelpful to everyone that needs time-zone data.
Yes, we should try to avoid this. Nothing like that is being proposed, thank goodness.
For example, the removal of "Castries Mean Time" and "Kingstown Mean Time" will be visible in Joda-Time, and the change to the end of LMT will be visible in Joda-Time and JSR-310.
No historical data are being lost here. "Castries Mean Time" and "Kingstown Mean Time" are artifices of the tz database (I should know, since I invented them) and do not reflect any known historical data. More generally, no doubt regression tests will report changes because of the proposed patch, because that's what regression tests do: they report changes. But ordinary users won't care that time stamps in Aruba on February 12, 1912 from 04:35:47 to 04:40:24 UTC will have a UTC offset that differs by a few minutes. They just won't. We've done this sort of thing before, and it doesn't cause problems. The tz database has often seen updates like this, and it's just not that big a deal.
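The few-minute window mentioned above can be reproduced with a small Python sketch. The two LMT offsets are assumptions inferred from the window quoted in the message (04:35:47 and 04:40:24), as is the idea that both zones leave LMT at local midnight on 1912-02-12; the names `hms` and `ut_instant` are hypothetical helpers, not tz code.

```python
from datetime import datetime, timedelta, timezone


def hms(hours: int, minutes: int, seconds: int) -> timedelta:
    """Build a timedelta from hours, minutes and seconds."""
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


# Assumed LMT offsets (both west of Greenwich, hence negative),
# inferred from the window quoted in the message.
CURACAO_LMT = -hms(4, 35, 47)
ARUBA_LMT = -hms(4, 40, 24)

# Assumed: both zones stop using LMT at local midnight on this date.
LOCAL_END_OF_LMT = datetime(1912, 2, 12, 0, 0, 0)


def ut_instant(local: datetime, offset: timedelta) -> datetime:
    """Convert a local wall-clock time to universal time: UT = local - offset."""
    return (local - offset).replace(tzinfo=timezone.utc)


curacao_end = ut_instant(LOCAL_END_OF_LMT, CURACAO_LMT)
aruba_end = ut_instant(LOCAL_END_OF_LMT, ARUBA_LMT)
window = aruba_end - curacao_end

print(curacao_end.strftime("%H:%M:%S"))  # 04:35:47
print(aruba_end.strftime("%H:%M:%S"))    # 04:40:24
print(window)                            # 0:04:37
```

Under these assumptions the two zones disagree only during the 277 seconds between the two transition instants, which is the interval a regression test would flag after a merge.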
America/Curacao and America/Aruba have exactly the same time since the year dot apart from the LMT value (they do have the same LMT end date).
The transitions don't have the same UTC end time, though, so merging these two would also cause regression software to report a change. If "no change" is the criterion, then no changes will pass muster.
On Wed 2013-09-04T11:01:24 -0700, Paul Eggert hath writ:
But ordinary users won't care that time stamps in Aruba on February 12, 1912 from 04:35:47 to 04:40:24 UTC will have a UTC offset that differs by a few minutes. They just won't.
I request caution in making it clear to ordinary users that the name UTC cannot be proleptically extended to dates prior to 1960. No such concept existed in contemporary records.
--
Steve Allen <sla@ucolick.org> WGS-84 (GPS)
UCO/Lick Observatory--ISB Natural Sciences II, Room 165 Lat +36.99855
1156 High Street Voice: +1 831 459 3046 Lng -122.06015
Santa Cruz, CA 95064 http://www.ucolick.org/~sla/ Hgt +250 m
Steve Allen wrote:
On Wed 2013-09-04T11:01:24 -0700, Paul Eggert hath writ:
But ordinary users won't care that time stamps in Aruba on February 12, 1912 from 04:35:47 to 04:40:24 UTC will have a UTC offset that differs by a few minutes. They just won't.
I request caution in making it clear to ordinary users that the name UTC cannot be proleptically extended to dates prior to 1960. No such concept existed in contemporary records.
Being pedantic :) 1963 ... But none of the sources actually quote the date for Recommendation 374 :(
--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On Wed 2013-09-04T20:43:18 +0100, Lester Caine hath writ:
I request caution in making it clear to ordinary users that the name UTC cannot be proleptically extended to dates prior to 1960. No such concept existed in contemporary records.
Being pedantic :) 1963 ... But none of the sources actually quote the date for Recommendation 374 :(
I have examined the pages and kept images of that document.
http://www.ucolick.org/~sla/leapsecs/timescales.html
The 10th Plenary Assembly of the CCIR was held in Geneva from 1963-01-15 to 1963-02-16. During that meeting Recommendation 374 was approved to supersede Recommendation 319. Recommendation 374 does not contain any form of the words "coordination", "coordinated", nor the letters "UTC". The text states that the broadcasts of time signals should "be offset to keep the time pulses in close agreement with UT2" and "maintained within approximately 100 ms of universal time UT2" with steps of exactly 50 ms.
--
Steve Allen <sla@ucolick.org>
On 09/04/13 11:21, Steve Allen wrote:
But ordinary users won't care that time stamps in Aruba on February 12, 1912 from 04:35:47 to 04:40:24 UTC will have a UTC offset that differs by a few minutes. They just won't.
I request caution in making it clear to ordinary users that the name UTC cannot be proleptically extended to dates prior to 1960. No such concept existed in contemporary records.
Yeowch! You're right; sorry about that. I should have written "GMT". I was misled by zdump's output. Should we change the output of "zdump" etc to fix this error? Currently zdump says "UTC" for old time stamps, which isn't correct. Should it say "UT" instead? Or is even "UT" a bad idea for a time stamp in (say) 1627? I also should have mentioned that even with GMT, my comment was incorrect in some sense. Common practice back then for Dutch possessions was to use non-integer offsets from GMT, and the tz format cannot represent these. I don't have good data for Aruba, but Capt. Thomas Henry Tizard of the Royal Navy reported that Curacao's port kept time at -04:35:46.9; see Milne 1899.
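As an illustration of the representation problem, the following Python sketch rounds Tizard's reported value to whole seconds. It assumes, as the message says, that the tz format cannot hold the fractional part; the rounding rule (round half to even) and the helper names are hypothetical choices for this example, not what zic does.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def parse_offset(text: str) -> Decimal:
    """Parse '[+-]hh:mm:ss[.fff]' into a signed number of seconds."""
    sign = -1 if text.startswith("-") else 1
    hours, minutes, seconds = text.lstrip("+-").split(":")
    total = int(hours) * 3600 + int(minutes) * 60 + Decimal(seconds)
    return sign * total


def to_whole_seconds(offset: Decimal) -> int:
    """Round to the nearest whole second (assumed rule: half to even)."""
    return int(offset.quantize(Decimal("1"), rounding=ROUND_HALF_EVEN))


def format_offset(seconds: int) -> str:
    """Render whole seconds as [-]h:mm:ss."""
    sign = "-" if seconds < 0 else ""
    hours, rest = divmod(abs(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    return f"{sign}{hours}:{minutes:02d}:{secs:02d}"


reported = parse_offset("-04:35:46.9")   # value reported for Curacao's port
stored = to_whole_seconds(reported)      # what a whole-second format can keep
error = Decimal(stored) - reported       # rounding error in seconds

print(format_offset(stored), "rounding error:", error, "s")
```

The rounded value is a tenth of a second away from the reported one, far below the precision of the civil time stamps under discussion.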
On Wed 2013-09-04T14:40:57 -0700, Paul Eggert hath writ:
Yeowch! You're right; sorry about that. I should have written "GMT". I was misled by zdump's output.
Except that GMT prior to 1925 means one thing for civil timestamps and a different thing for nautical timestamps.
Should we change the output of "zdump" etc to fix this error? Currently zdump says "UTC" for old time stamps, which isn't correct. Should it say "UT" instead? Or is even "UT" a bad idea for a time stamp in (say) 1627?
Unlike UTC, the concept of UT can be validly extended into the indefinite past.
I also should have mentioned that even with GMT, my comment was incorrect in some sense. Common practice back then for Dutch possessions was to use non-integer offsets from GMT, and the tz format cannot represent these. I don't have good data for Aruba, but Capt. Thomas Henry Tizard of the Royal Navy reported that Curacao's port kept time at -04:35:46.9; see Milne 1899.
One might argue over the validity of the conventional formula for UT being used in the far past, and for the purposes of geodesy it is relevant to be precise, but to the relevant precision for civil timestamps any reasonable conventional expression for UT is as valid as the timestamps themselves.
--
Steve Allen <sla@ucolick.org>
Steve Allen wrote:
Unlike UTC, the concept of UT can be validly extended into the indefinite past.
Thanks for bringing this up. Here's a proposed patch that tries to fix the occurrences of this problem that I found in the tz code and data. I've pushed this to the experimental repository.
From 42801d2b1fb0a644096c228daec1be20278d9bd0 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 4 Sep 2013 19:07:31 -0700
Subject: [PATCH] Correct some UTC-vs-UT solecisms.

In several places the code and documentation incorrectly used "UTC"
to describe time stamps that might precede the introduction of UTC
and for which UTC is therefore undefined. Change these uses to "UT",
as that's the correct term when talking about these time stamps.
Problem reported by Steve Allen in
<http://mm.icann.org/pipermail/tz/2013-September/019907.html>.
The major compatibility issue here is with 'zdump -v'; it'll now
output "UT" instead of the possibly-incorrect "UTC".
Many files change in minor ways in the commentary.
* zdump.c (show):
* zic.c (inzsub, addtype):
In output, say "UT" rather than "UTC", since the time stamp we're
talking about might precede the introduction of UTC.
---
 Makefile      |  4 ++--
 Theory        |  8 ++++----
 australasia   |  2 +-
 date.1        |  2 +-
 date.c        |  2 +-
 etcetera      |  4 ++--
 europe        |  4 ++--
 localtime.c   | 12 ++++++------
 newctime.3    |  4 ++--
 newstrftime.3 |  4 ++--
 newtzset.3    |  5 +++--
 strftime.c    |  2 +-
 tzfile.5      |  8 ++++----
 tzfile.h      |  4 ++--
 zdump.c       |  2 +-
 zic.8         | 12 ++++++------
 zic.c         |  6 +++---
 17 files changed, 43 insertions(+), 42 deletions(-)

diff --git a/Makefile b/Makefile
index e439055..41b6ffc 100644
--- a/Makefile
+++ b/Makefile
@@ -28,7 +28,7 @@ LOCALTIME= GMT
 # time zone files, or adding it to a time zone file).
 # (When a POSIX-style environment variable is handled, the rules in the
 # template file are used to determine "spring forward" and "fall back" days and
-# times; the environment variable itself specifies UTC offsets of standard and
+# times; the environment variable itself specifies UT offsets of standard and
 # summer time.)
 # Alternately, if you discover you've got the wrong time zone, you can just
 # zic -p rightzone
@@ -196,7 +196,7 @@ GCC_DEBUG_FLAGS = -Dlint -g3 -O3 -fno-common -fstrict-aliasing \
 # that gives an offset to add to the time_t when converting it.
 # "timelocal" is equivalent to "mktime".
 # "timegm" is like "timelocal" except that it turns a struct tm into
-# a time_t using UTC (rather than local time as "timelocal" does).
+# a time_t using UT (rather than local time as "timelocal" does).
 # "timeoff" is like "timegm" except that it accepts a second (long) argument
 # that gives an offset to use when converting to a time_t.
 # "posix2time" and "time2posix" are described in an included manual page.
diff --git a/Theory b/Theory
index ea2d004..39ef7ac 100644
--- a/Theory
+++ b/Theory
@@ -43,7 +43,7 @@ POSIX has the following properties and limitations.
 		"+" and "-" in the names.
 	offset is of the form '[+-]hh:[mm[:ss]]' and specifies the
-		offset west of UTC. 'hh' may be a single digit; 0<=hh<=24.
+		offset west of UT. 'hh' may be a single digit; 0<=hh<=24.
 		The default DST offset is one hour ahead of standard time.
 	date[/time],date[/time] specifies the beginning and end of DST.
 		If this is absent,
@@ -189,7 +189,7 @@ Points of interest to folks with other systems:
 	but this functionality was removed in later versions of BSD.
 * In SVR2, time conversion fails for near-minimum or near-maximum
-	time_t values when doing conversions for places that don't use UTC.
+	time_t values when doing conversions for places that don't use UT.
 	This package takes care to do these conversions correctly.
 The functions that are conditionally compiled if STD_INSPIRED is defined
@@ -428,14 +428,14 @@ in decreasing order of importance:
 	Use 'LMT' for local mean time of locations before the introduction
 	of standard time; see "Scope of the tz database".
-	Use UTC (with time zone abbreviation 'zzz') for locations while
+	Use UT (with time zone abbreviation 'zzz') for locations while
 	uninhabited. The 'zzz' mnemonic is that these locations are,
 	in some sense, asleep.
 Application writers should note that these abbreviations are ambiguous
 in practice: e.g. 'EST' has a different meaning in Australia than
 it does in the United States. In new applications, it's often better
-to use numeric UTC offsets like '-0500' instead of time zone
+to use numeric UT offsets like '-0500' instead of time zone
 abbreviations like 'EST'; this avoids the ambiguity.
diff --git a/australasia b/australasia
index 74ebee2..3e1bfc6 100644
--- a/australasia
+++ b/australasia
@@ -736,7 +736,7 @@ Zone Pacific/Funafuti 11:56:52 - LMT 1901
 # 1886-1891; Baker was similar but exact dates are not known.
 # Inhabited by civilians 1935-1942; U.S. military bases 1943-1944;
 # uninhabited thereafter.
-# Howland observed Hawaii Standard Time (UTC-10:30) in 1937;
+# Howland observed Hawaii Standard Time (UT-10:30) in 1937;
 # see page 206 of Elgen M. Long and Marie K. Long,
 # Amelia Earhart: the Mystery Solved, Simon & Schuster (2000).
 # So most likely Howland and Baker observed Hawaii Time from 1935
diff --git a/date.1 b/date.1
index 7957f74..72c44d0 100644
--- a/date.1
+++ b/date.1
@@ -134,7 +134,7 @@ the seconds part of the new time; if no seconds are given, zero is assumed.
 These options are available:
 .TP
 .BR \-u " or " \-c
-Use UTC when setting and showing the date and time.
+Use Universal Time when setting and showing the date and time.
 .TP
 .BI "\-r " seconds
 Output the date that corresponds to
diff --git a/date.c b/date.c
index d8ff50e..fe1311a 100644
--- a/date.c
+++ b/date.c
@@ -118,7 +118,7 @@ main(const int argc, char *argv[])
 		switch (ch) {
 		default:
 			usage();
-		case 'u': /* do it in UTC */
+		case 'u': /* do it in UT */
 		case 'c':
 			dogmt();
 			break;
diff --git a/etcetera b/etcetera
index a9ff729..9ba7f7b 100644
--- a/etcetera
+++ b/etcetera
@@ -31,9 +31,9 @@ Link Etc/GMT Etc/GMT0
 # even though this is the opposite of what many people expect.
 # POSIX has positive signs west of Greenwich, but many people expect
 # positive signs east of Greenwich. For example, TZ='Etc/GMT+4' uses
-# the abbreviation "GMT+4" and corresponds to 4 hours behind UTC
+# the abbreviation "GMT+4" and corresponds to 4 hours behind UT
 # (i.e. west of Greenwich) even though many people would expect it to
-# mean 4 hours ahead of UTC (i.e. east of Greenwich).
+# mean 4 hours ahead of UT (i.e. east of Greenwich).
 #
 # In the draft 5 of POSIX 1003.1-200x, the angle bracket notation allows for
 # TZ='<GMT-4>+4'; if you want time zone abbreviations conforming to
diff --git a/europe b/europe
index 4f972a5..04f61d9 100644
--- a/europe
+++ b/europe
@@ -1863,7 +1863,7 @@ Zone Europe/Oslo 0:43:00 - LMT 1895 Jan 1
 # before 1895, and therefore probably changed the local time somewhere
 # between 1895 and 1925 (inclusive).
-# From Paul Eggert (2013-09-02):
+# From Paul Eggert (2013-09-04):
 #
 # Actually, Jan Mayen was never occupied by Germany during World War II,
 # so it must have diverged from Oslo time during the war, as Oslo was
@@ -1874,7 +1874,7 @@ Zone Europe/Oslo 0:43:00 - LMT 1895 Jan 1
 # 1941 with a small Norwegian garrison and continued operations despite
 # frequent air ttacks from Germans. In 1943 the Americans established a
 # radiolocating station on the island, called "Atlantic City". Possibly
-# the UTC offset changed during the war, but I think it unlikely that
+# the UT offset changed during the war, but I think it unlikely that
 # Jan Mayen used German daylight-saving rules.
 #
 # Svalbard is more complicated, as it was raided in August 1941 by an
diff --git a/localtime.c b/localtime.c
index a0a4e5e..619a656 100644
--- a/localtime.c
+++ b/localtime.c
@@ -77,11 +77,11 @@ static const char gmt[] = "GMT";
 #endif /* !defined TZDEFDST */
 struct ttinfo { /* time type information */
-	int_fast32_t tt_gmtoff; /* UTC offset in seconds */
+	int_fast32_t tt_gmtoff; /* UT offset in seconds */
 	int tt_isdst; /* used to set tm_isdst */
 	int tt_abbrind; /* abbreviation list index */
 	int tt_ttisstd; /* TRUE if transition is std time */
-	int tt_ttisgmt; /* TRUE if transition is UTC */
+	int tt_ttisgmt; /* TRUE if transition is UT */
 };
 struct lsinfo { /* leap second information */
@@ -842,7 +842,7 @@ getrule(const char *strp, register struct rule *const rulep)
 /*
 ** Given the Epoch-relative time of January 1, 00:00:00 UTC, in a year, the
-** year, a rule, and the offset from UTC at the time that rule takes effect,
+** year, a rule, and the offset from UT at the time that rule takes effect,
 ** calculate the Epoch-relative time that rule takes effect.
 */
@@ -925,10 +925,10 @@ transtime(const time_t janfirst, const int year,
 	}
 	/*
-	** "value" is the Epoch-relative time of 00:00:00 UTC on the day in
+	** "value" is the Epoch-relative time of 00:00:00 UT on the day in
 	** question. To get the Epoch-relative time of the specified local
 	** time on that day, add the transition time and the current offset
-	** from UTC.
+	** from UT.
 	*/
 	return value + rulep->r_time + offset;
 }
@@ -1379,7 +1379,7 @@ gmtsub(const time_t *const timep, const int_fast32_t offset,
 #ifdef TM_ZONE
 	/*
 	** Could get fancy here and deliver something such as
-	** "UTC+xxxx" or "UTC-xxxx" if offset is non-zero,
+	** "UT+xxxx" or "UT-xxxx" if offset is non-zero,
 	** but this is no time for a treasure hunt.
 	*/
 	if (offset != 0)
diff --git a/newctime.3 b/newctime.3
index 3583a91..8528deb 100644
--- a/newctime.3
+++ b/newctime.3
@@ -165,7 +165,7 @@ includes the following fields:
 	int tm_yday; /\(** day of year (0 - 365) \(**/
 	int tm_isdst; /\(** is summer time in effect? \(**/
 	char \(**tm_zone; /\(** abbreviation of timezone name \(**/
-	long tm_gmtoff; /\(** offset from UTC in seconds \(**/
+	long tm_gmtoff; /\(** offset from UT in seconds \(**/
 .fi
 .RE
 .PP
@@ -184,7 +184,7 @@ is non-zero if summer time is in effect.
 .PP
 .I Tm_gmtoff
 is the offset (in seconds) of the time represented
-from UTC, with positive values indicating east
+from UT, with positive values indicating east
 of the Prime Meridian.
 .SH FILES
 .ta \w'/usr/local/etc/zoneinfo/posixrules\0\0'u
diff --git a/newstrftime.3 b/newstrftime.3
index ef79e4d..5b0dcdc 100644
--- a/newstrftime.3
+++ b/newstrftime.3
@@ -164,7 +164,7 @@ using AM/PM notation.
 is replaced by the second as a decimal number (00-60).
 .TP
 %s
-is replaced by the number of seconds since the Epoch, UTC (see mktime(3)).
+is replaced by the number of seconds since the Epoch, UT (see mktime(3)).
 .TP
 %T
 is replaced by the time in the format %H:%M:%S.
@@ -211,7 +211,7 @@ is replaced by the time zone name,
 or by the empty string if this is not determinable.
 .TP
 %z
-is replaced by the offset from UTC in the format +HHMM or -HHMM as appropriate,
+is replaced by the offset from UT in the format +HHMM or -HHMM as appropriate,
 with positive values representing locations east of Greenwich,
 or by the empty string if this is not determinable.
 .TP
diff --git a/newtzset.3 b/newtzset.3
index fd6b677..8162920 100644
--- a/newtzset.3
+++ b/newtzset.3
@@ -26,8 +26,9 @@ in the system time conversion information directory, is used by
 If
 .B TZ
 appears in the environment but its value is a null string,
-Coordinated Universal Time (UTC) is used (without leap second
-correction). If
+Universal Time (UT) is used, with the abbreviation "UTC"
+and without leap second
+correction. If
 .B TZ
 appears in the environment and its value is not a null string:
 .IP
diff --git a/strftime.c b/strftime.c
index 821ce7f..aba3d33 100644
--- a/strftime.c
+++ b/strftime.c
@@ -501,7 +501,7 @@ label:
 				diff = t->TM_GMTOFF;
 #else /* !defined TM_GMTOFF */
 				/*
-				** C99 says that the UTC offset must
+				** C99 says that the UT offset must
 				** be computed by looking only at
 				** tm_isdst. This requirement is
 				** incorrect, since it means the code
diff --git a/tzfile.5 b/tzfile.5
index 10698a2..e92eaed 100644
--- a/tzfile.5
+++ b/tzfile.5
@@ -20,7 +20,7 @@ These values are, in order:
 .TP
 .I tzh_ttisgmtcnt
-The number of UTC/local indicators stored in the file.
+The number of UT/local indicators stored in the file.
 .TP
 .I tzh_ttisstdcnt
 The number of standard/wall indicators stored in the file.
@@ -83,7 +83,7 @@ and a one-byte value for
 .IR tt_abbrind .
 In each structure,
 .I tt_gmtoff
-gives the number of seconds to be added to UTC,
+gives the number of seconds to be added to UT,
 .I tt_isdst
 tells whether
 .I tm_isdst
@@ -118,9 +118,9 @@ time zone environment variables.
 .PP
 Finally there are
 .I tzh_ttisgmtcnt
-UTC/local indicators, each stored as a one-byte value;
+UT/local indicators, each stored as a one-byte value;
 they tell whether the transition times associated with local time types
-were specified as UTC or local time,
+were specified as UT or local time,
 and are used when a time zone file is used in handling POSIX-style
 time zone environment variables.
 .PP
diff --git a/tzfile.h b/tzfile.h
index d04fe04..0cf2943 100644
--- a/tzfile.h
+++ b/tzfile.h
@@ -55,7 +55,7 @@ struct tzhead {
 ** tzh_timecnt (char [4])s coded transition times a la time(2)
 ** tzh_timecnt (unsigned char)s types of local time starting at above
 ** tzh_typecnt repetitions of
-** one (char [4]) coded UTC offset in seconds
+** one (char [4]) coded UT offset in seconds
 ** one (unsigned char) used to set tm_isdst
 ** one (unsigned char) that's an abbreviation list index
 ** tzh_charcnt (char)s '\0'-terminated zone abbreviations
@@ -68,7 +68,7 @@ struct tzhead {
 ** if absent, transition times are
 ** assumed to be wall clock time
 ** tzh_ttisgmtcnt (char)s indexed by type; if TRUE, transition
-** time is UTC, if FALSE,
+** time is UT, if FALSE,
 ** transition time is local time
 ** if absent, transition times are
 ** assumed to be local time
diff --git a/zdump.c b/zdump.c
index 3d5ec64..2a9860c 100644
--- a/zdump.c
+++ b/zdump.c
@@ -628,7 +628,7 @@ show(char *zone, time_t t, int v)
 			(void) printf(tformat(), t);
 		} else {
 			dumptime(tmp);
-			(void) printf(" UTC");
+			(void) printf(" UT");
 		}
 		(void) printf(" = ");
 	}
diff --git a/zic.8 b/zic.8
index d0d34ce..5c8b59c 100644
--- a/zic.8
+++ b/zic.8
@@ -288,13 +288,13 @@ This is the name used in creating the time conversion information file for the
 zone.
 .TP
 .B GMTOFF
-The amount of time to add to UTC to get standard time in this zone.
+The amount of time to add to UT to get standard time in this zone.
 This field has the same format as the
 .B AT
 and
 .B SAVE
 fields of rule lines;
-begin the field with a minus sign if time must be subtracted from UTC.
+begin the field with a minus sign if time must be subtracted from UT.
 .TP
 .B RULES/SAVE
 The name of the rule(s) that apply in the time zone or,
@@ -315,10 +315,10 @@ a slash (/) separates standard and daylight abbreviations.
 .TP
 .B UNTILYEAR [MONTH [DAY [TIME]]]
-The time at which the UTC offset or the rule(s) change for a location.
+The time at which the UT offset or the rule(s) change for a location.
 It is specified as a year, a month, a day, and a time of day.
 If this is specified,
-the time zone information is generated from the given UTC offset
+the time zone information is generated from the given UT offset
 and rule change until the time specified.
 The month, day, and time of day have the same format as the IN, ON, and AT
 fields of a rule; trailing fields can be omitted, and default to the
@@ -485,9 +485,9 @@ If,
 for a particular zone,
 a clock advance caused by the start of daylight saving
 coincides with and is equal to
-a clock retreat caused by a change in UTC offset,
+a clock retreat caused by a change in UT offset,
 .IR zic
-produces a single transition to daylight saving at the new UTC offset
+produces a single transition to daylight saving at the new UT offset
 (without any change in wall clock time).
 To get separate transitions
 use multiple zone continuation lines
diff --git a/zic.c b/zic.c
index 1715c4a..97786ae 100644
--- a/zic.c
+++ b/zic.c
@@ -1005,7 +1005,7 @@ inzsub(register char **const fields, const int nfields, const int iscont)
 	}
 	z.z_filename = filename;
 	z.z_linenum = linenum;
-	z.z_gmtoff = gethms(fields[i_gmtoff], _("invalid UTC offset"), TRUE);
+	z.z_gmtoff = gethms(fields[i_gmtoff], _("invalid UT offset"), TRUE);
 	if ((cp = strchr(fields[i_format], '%')) != 0) {
 		if (*++cp != 's' || strchr(cp, '%') != 0) {
 			error(_("invalid abbreviation format"));
@@ -2079,7 +2079,7 @@ wp = ecpyalloc(_("no POSIX environment variable for zone"));
 		INITIALIZE(ktime);
 		if (useuntil) {
 			/*
-			** Turn untiltime into UTC
+			** Turn untiltime into UT
 			** assuming the current gmtoff and
 			** stdoff values.
 			*/
@@ -2253,7 +2253,7 @@ addtype(const zic_t gmtoff, const char *const abbr, const int isdst,
 		exit(EXIT_FAILURE);
 	}
 	if (! (-1L - 2147483647L <= gmtoff && gmtoff <= 2147483647L)) {
-		error(_("UT offset out of range"));
+		error(_("UT offset out of range"));
 		exit(EXIT_FAILURE);
 	}
 	gmtoffs[i] = gmtoff;
--
1.8.1.2
On Wed 2013-09-04T19:09:52 -0700, Paul Eggert hath writ:
Steve Allen wrote:
Unlike UTC, the concept of UT can be validly extended into the indefinite past.
Thanks for bringing this up. Here's a proposed patch that tries to fix the occurrences of this problem that I found in the tz code and data. I've pushed this to the experimental repository.
These patches make my POSIX conformance side very uncomfortable. The comments about the data are generally more accurate by using UT. The comments in and about the POSIX API get very confusing for not using UTC (as is done in the POSIX docs themselves), and I think they might best be left referring to UTC. Unfortunately there is no single good answer for the name of the Greenwich-like time scale that is presumed by computer APIs. In the absence of switching the zdump output string between UT and UTC as of 1960 some sort of apologetic explanation may be required.
--
Steve Allen <sla@ucolick.org>
UCO/Lick Observatory--ISB, Natural Sciences II, Room 165
1156 High Street, Santa Cruz, CA 95064
Voice: +1 831 459 3046
http://www.ucolick.org/~sla/
WGS-84 (GPS): Lat +36.99855, Lng -122.06015, Hgt +250 m
Thanks for the review. I pushed the following patch into the experimental repository to try to fix (or at least document) the problems that you noted. diff --git a/newctime.3 b/newctime.3 index 8528deb..ece8507 100644 --- a/newctime.3 +++ b/newctime.3 @@ -36,8 +36,6 @@ asctime, ctime, difftime, gmtime, localtime, mktime \- convert date and time to .I Ctime\^ converts a long integer, pointed to by .IR clock , -representing the time in seconds since -00:00:00 UTC, 1970-01-01, and returns a pointer to a string of the form .br @@ -59,6 +57,17 @@ These unusual formats are designed to make it less likely that older software that expects exactly 26 bytes of output will mistakenly output misleading values for out-of-range years. .PP +The +.BI * clock +time stamp represents the time in seconds since 1970-01-01 00:00:00 +Coordinated Universal Time (UTC). +The POSIX standard says that time stamps must be nonnegative +must ignore leap seconds. +Many implementations extend POSIX by allowing negative time stamps, +and can therefore represent time stamps that that predate the +introduction of UTC and are some other flavor of Universal Time (UT). +Some implementations support leap seconds, in contradiction to POSIX. +.PP .I Localtime\^ and .I gmtime\^ @@ -186,6 +195,7 @@ is non-zero if summer time is in effect. is the offset (in seconds) of the time represented from UT, with positive values indicating east of the Prime Meridian. +The field's name is derived from Greenwich Mean Time, a precursor of UT. .SH FILES .ta \w'/usr/local/etc/zoneinfo/posixrules\0\0'u /usr/local/etc/zoneinfo time zone information directory diff --git a/newstrftime.3 b/newstrftime.3 index 5b0dcdc..d39915c 100644 --- a/newstrftime.3 +++ b/newstrftime.3 @@ -164,7 +164,7 @@ using AM/PM notation. is replaced by the second as a decimal number (00-60). .TP %s -is replaced by the number of seconds since the Epoch, UT (see mktime(3)). +is replaced by the number of seconds since the Epoch (see newctime(3)). 
.TP %T is replaced by the time in the format %H:%M:%S. @@ -211,7 +211,8 @@ is replaced by the time zone name, or by the empty string if this is not determinable. .TP %z -is replaced by the offset from UT in the format +HHMM or -HHMM as appropriate, +is replaced by the offset from the Prime Meridian +in the format +HHMM or \(miHHMM as appropriate, with positive values representing locations east of Greenwich, or by the empty string if this is not determinable. .TP diff --git a/newtzset.3 b/newtzset.3 index 8162920..3689e50 100644 --- a/newtzset.3 +++ b/newtzset.3 @@ -27,8 +27,9 @@ If .B TZ appears in the environment but its value is a null string, Universal Time (UT) is used, with the abbreviation "UTC" -and without leap second -correction. If +and without leap second correction; please see +.IR newctime (3) +for more about UT, UTC, and leap seconds. If .B TZ appears in the environment and its value is not a null string: .IP diff --git a/zdump.8 b/zdump.8 index af3277d..106361a 100644 --- a/zdump.8 +++ b/zdump.8 @@ -61,6 +61,14 @@ Time discontinuities are found by sampling the results returned by localtime at twelve-hour intervals. This works in all real-world cases; one can construct artificial time zones for which this fails. +.PP +In the output, "UT" denotes the value returned by +.IR gmtime (3), +which uses UTC for modern time stamps and some other UT flavor for +time stamps that predate the introduction of UTC. +No attempt is currently made to have the output use "UTC" for newer +and "UT" for older time stamps, +partly because the exact date of the introduction of UTC is problematic. .SH "SEE ALSO" newctime(3), tzfile(5), zic(8) .\" This file is in the public domain, so clarified as of
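The time-stamp wording added to newctime.3 and the %z wording in newstrftime.3 can be illustrated with a small sketch. It is written in Python rather than against the tz C code, and the helper names `from_time_stamp` and `format_gmtoff` are hypothetical, not part of any tz interface. It assumes the POSIX model described in the patch: seconds counted from 1970-01-01 00:00:00 with leap seconds ignored, plus the common extension of allowing negative values.

```python
from datetime import datetime, timedelta, timezone

# The Epoch named in the newctime.3 text: 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_time_stamp(seconds):
    """Convert seconds since the Epoch to a UT date, ignoring leap seconds.

    Negative values are accepted, which is the extension to POSIX that the
    man page text mentions; such dates predate UTC and are some other
    flavor of UT.
    """
    return EPOCH + timedelta(seconds=seconds)


def format_gmtoff(gmtoff):
    """Format a tm_gmtoff-style value as +HHMM or -HHMM.

    gmtoff is in seconds, positive east of the Prime Meridian, as in the
    newctime.3 description of tm_gmtoff.
    """
    sign = "-" if gmtoff < 0 else "+"
    hours, minutes = divmod(abs(gmtoff) // 60, 60)
    return f"{sign}{hours:02d}{minutes:02d}"
```

Doing the arithmetic from a fixed Epoch value, rather than calling a platform conversion routine, keeps the sketch independent of whether the local C library accepts negative time stamps.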
On Sep 4, 2013, at 2:32 AM, Stephen Colebourne <scolebourne@joda.org> wrote:
Thirdly, I note that the leading supporters of Paul's approach are from an academic background (Paul, Guy, yourself).
Actually, no, I'm not "from an academic background", except to the extent that I have an academic degree; everywhere I've worked since 1979 was a business of some sort, not in academia. ("alum" is short for "alumni"/"alumnae", if you're guessing from my e-mail address that I work in academia; it's a mail-forwarding service.)
With respect, I wonder if that academic background insulates from the needs of enterprise software, primarily stability.
I've worked as a software developer at various companies, such as Sun Microsystems, Network Appliance, and Apple; only one of them was a major producer of "enterprise" equipment when I worked there (Sun was a technical workstation company, not "the dot in dot com", then, and Apple hasn't been particularly enterprise-oriented in ages). As for whether I'm a "supporter of Paul's approach", I think that's an overstatement. I'm somebody who's certainly sympathetic to concerns about having to offer end-users choices between tzids that won't make any real difference to them in practice - i.e., I'm not 100% an *opponent* of Paul's approach - but I think the *ideal* way to handle that is not to offer end-users choices between tzids, which is why I've cited OS X's "Time Zone" subpane of "Date & Time" in System Preferences as the right way to handle that, rather than the cheesy "I know! I'll just turn all the underscores in tzids into spaces and make an option menu out of them!" stuff done by some other OSes. There are now data files freely available that can support that. For developers that are hard-coding tzids into code or files, they'd have to know what they're doing and pick the appropriate tzids. For those letting the users specify a tzid, something OS X-style would be best, so they don't need to know about tzids.
Finally, I'm NOT asking for all historical data to be frozen. I'm asking for no historical data to be changed UNLESS the replacement is a clear enhancement. This is a common sense and perfectly reasonable request of the database. It's also how all well-respected APIs and data sets (like CLDR) operate. Such an approach does explicitly rule out zone ID merging that loses the start date of offsets or abbreviations, even if those are guesswork/invented (because the replacement is not an enhancement, it's worse).
I personally think it reasonable that if two entries differ in the start date for standardized times, *and* the start dates have a reliable source for them, they not be coalesced. If the start dates are just guesses, that sounds like a change from one set of data that won't necessarily give you the same answer to another set of data that won't necessarily give you the same answer, so I don't think it's clear that merging them is a bad idea - stability might be an argument against merging them, but the only way in which the old data could be considered "better" is that it doesn't represent a change, not that it necessarily better reflects reality. I.e., in the case you mention, the replacement is "worse" only in that it's different, not in that it's further from reality. And I think anybody using the tzdb to handle pre-1970 dates, or claiming that their APIs can be used to handle pre-1970 dates, should either state it as "you can use it, but we make no claim that the results will be correct, so don't rely on it for anything where a difference of an hour or two will actually matter" or should back up their confidence with historical research (and send the results of that research to Paul so the comments can be updated).
On 4 September 2013 18:00, Guy Harris <guy@alum.mit.edu> wrote:
On Sep 4, 2013, at 2:32 AM, Stephen Colebourne <scolebourne@joda.org> wrote: Actually, no, I'm not "from an academic background", except to the extent that I have an academic degree; everywhere I've worked since 1979 was a business of some sort, not in academia. ("alum" is short for "alumni"/"alumnae", if you're guessing from my e-mail address that I work in academia; it's a mail-forwarding service.) As for whether I'm a "supporter of Paul's approach", I think that's an overstatement. I'm somebody who's certainly sympathetic to concerns about having to offer end-users choices between tzids that won't make any real difference to them in practice - i.e., I'm not 100% an *opponent* of Paul's approach - but I think the *ideal* way to handle that is not to offer end-users choices between tzids, which is why I've cited OS X's "Time Zone" subpane of "Date & Time" in System Preferences as the right way to handle that, rather than the cheesy "I know! I'll just turn all the underscores in tzids into spaces and make an option menu out of them!" stuff done by some other OSes. There are now data files freely available that can support that.
Moving on from my dumb academic statement, I agree entirely that a good user interface should not expose raw or mangled raw zone IDs. CLDR offers good textual forms for example.
And I think anybody using the tzdb to handle pre-1970 dates, or claiming that their APIs can be used to handle pre-1970 dates, should either state it as "you can use it, but we make no claim that the results will be correct, so don't rely on it for anything where a difference of an hour or two will actually matter" or should back up their confidence with historical research (and send the results of that research to Paul so the comments can be updated).
I might add something to JSR-310 to that effect. thanks Stephen
Stephen Colebourne wrote:
I'll just turn all the underscores in tzids into spaces and make an option menu out of them!" stuff done by some other OSes. There are now data files freely available that can support that. Moving on from my dumb academic statement, I agree entirely that a good user interface should not expose raw or mangled raw zone IDs. CLDR offers good textual forms for example.
Or after asking their home country, only add a timezone selection if that is necessary? I think that the naming of zones is something of a smoke screen, although we do need a good cross reference as to what location uses what timezone - and when, and on edge cases it may well be difficult for a local to pick which of two time zones is right for them. It is the quality of the data that is the first problem, and making sure that what is presented is as complete as possible rather than saying 'It's too difficult so we will not try'. The best people to research local data are the locals themselves, and while in some parts of the world documentary evidence may be nonexistent, even that is a fact that should be advised! Basically, despite previous best practice, there is now a growing need to complete the picture, but it does require the assistance of the world in general to do that. -- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On 09/04/13 11:07, Lester Caine wrote:
Or after asking their home country, only add a timezone selection if that is necessary?
Yes, tzselect already does that, and I assume other country-based selectors also do that. This is one reason that we want to winnow out choices -- more often, it will mean that knowing just the country is enough.
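The point that knowing the country is often enough can be sketched as follows. This is not how tzselect itself is implemented; it is a minimal Python illustration that assumes the tab-separated layout of the tz distribution's zone.tab file (country code, coordinates, TZ name, optional comment). The function names are hypothetical.

```python
from collections import defaultdict


def zones_by_country(zone_tab_text):
    """Group TZ names by country code from zone.tab-style text.

    Assumes tab-separated columns: code, coordinates, TZ, optional comment.
    Blank lines and '#' comment lines are skipped.
    """
    zones = defaultdict(list)
    for line in zone_tab_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        code, _coordinates, tz = line.split("\t")[:3]
        zones[code].append(tz)
    return zones


def needs_zone_prompt(zones, country_code):
    """Return True only when the country has more than one zone to choose from."""
    return len(zones.get(country_code, [])) > 1
```

With this model, every pair of zones that is merged or moved out of the main table reduces the number of countries for which `needs_zone_prompt` returns True, which is the winnowing effect described above.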
Stephen Colebourne <scolebourne@joda.org> writes:
Secondly, I'm not speaking on behalf of myself, but on behalf of Java development generally.
That's quite a mantle for you to be assuming for yourself. I have to say that I would be very reluctant to make a similar claim. I'm not sure I'd want to be responsible for expressing the opinions of the entire Java development community as a self-appointed spokesperson.
Thirdly, I note that the leading supporters of Paul's approach are from an academic background (Paul, Guy, yourself). With respect, I wonder if that academic background insulates from the needs of enterprise software, primarily stability.
I'm not that horribly interested in a war of credentials, but perhaps some additional professional background would be helpful for context. I am, professionally, an IT architect and software developer for the central IT department at Stanford University. I'm not sure whether you count that as an academic background or not, but my day job is running a large heterogeneous server infrastructure. My professional specialty is authentication systems, which tend to care very deeply about accurate time, but my involvement in this mailing list is out of personal interest and curiosity. I am also a member of the Debian Technical Committee and one of the editors of the Debian Policy documentation, so I'm reasonably familiar with distribution packaging issues, although I am not personally involved in the Debian packaging of the tz database or in time zone selection during the Debian installation process. That said, I want to make it clear that I don't speak for Debian, let alone for Linux packagers in general. I'm participating, as I hope all of us are, as an individual with some interest in helping the project make the best decisions that we can make, and to provide support to Paul as the primary maintainer.
Fourthly, it seems to me that the recent batch of changes are far in excess of what has happened over previous years. For example, https://github.com/eggert/tz/commits/master/backward shows that the backward file was modified a number of times in the past few years, but almost always for changes to the spelling or naming of zone IDs (something which I've not opposed, even though I know CLDR finds that problematic).
I think it's important to distinguish between two different things that are happening. There have been quite a few changes made recently on a trial basis in an attempt to address some of the geopolitical concerns. I have no specific comments on that other than to say that I wholeheartedly approve of and support the *process* that Paul has been using in trying to reach consensus on how to address those problems, including floating trials and then backing them out when people disagree with them. I'm personally frustrated by people treating every proposed change as if the world might end; by all means, argue your side of this debate if you have strong opinions, but some of the comments have bordered on accusing Paul of acting in bad faith, and that's not sitting well with me. I have my own opinions about the origin of the recent flood of geopolitical concerns (and they're much harsher than Paul's, which is one of the reasons why I've been sitting on my hands and letting Paul handle it, much better than I would have). But I think it's important to remember that Paul is making a good-faith effort to address issues that have been raised, and to act and debate accordingly. However, apart from that set of changes, from where I sit, you and a small number of other people have gone beyond that argument and have now started objecting to nearly every change Paul makes for any reason, including changes that would have been entirely uncontroversial in previous years. And that's what I'm taking exception to.
Finally, I'm NOT asking for all historical data to be frozen. I'm asking for no historical data to be changed UNLESS the replacement is a clear enhancement.
The bar that you're setting for "clear enhancement" is not consistent with how this project has ever been run in the past.
In summary, given the importance of the data set, and how it is currently being abused, I have no choice but to pursue my objections.
You are bringing far too much drama to this situation, in my opinion. Paul has demonstrated repeatedly that he's not only open to reasonable discussion, he's open to reverting changes even when people aren't able to express coherent objections but are just upset. None of this is the end of the world. There are some very hard problems around the intersection of geopolitics and time zone selection, but there are also multiple layers of correction and UI between the core database and those issues. And the changes that people are worried about have not been in any official release. I understand why you want there to be a single tz database run according to your criteria, but that isn't an option on the table. There can be multiple tz databases, of which one is run according to your criteria, or there can be one database where you provide only one point of input among many other people. The contributions you have brought are very valuable. It is very useful to have someone deeply involved in the project who understands and cares about Java's use of the database, particularly since some of those uses are quite different than the typical POSIX use of the database (which was its original raison d'etre and still tends to shine through, although the project has grown beyond that). I would certainly prefer for you to continue to collaborate here. But from where I sit, the recent discussions have felt like more of a hostile takeover than a collaboration, particularly when you casually dismiss all the work Paul has done over the past several months. That is what prompted me to speak up. -- Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
On 4 September 2013 20:19, Russ Allbery <rra@stanford.edu> wrote:
Stephen Colebourne <scolebourne@joda.org> writes: There have been quite a few changes made recently on a trial basis in an attempt to address some of the geopolitical concerns. I have no specific comments on that other than to say that I wholeheartedly approve of and support the *process* that Paul has been using in trying to reach consensus on how to address those problems, including floating trials and then backing them out when people disagree with them. I'm personally frustrated by people treating every proposed change as if the world might end;
Part of the problem is that commits to the repo form the basis of the permanent record of the group. IMO, if the commits are experimental, then they should occur first on a branch, and only be included onto the master branch once they are accepted. With the current way of working, someone in 5 years time is going to have to wade through a lot of rubbish over the past month to see what really happened. Secondly, I would be less peeved if reversions were actually reversions. Each reversion that has occurred has been partial rather than complete, or has included some other unrelated change. A much better practice would be to revert in full, and then reapply a smaller change with the hope that may be more acceptable. Again, such an approach will prove hugely helpful in the future when looking back at the archive of events.
by all means, argue your side of this debate if you have strong opinions, but some of the comments have bordered on accusing Paul of acting in bad faith, and that's not sitting well with me.
The partial reversion described at "revert most" certainly felt like bad faith to me even if it wasn't intended as such.
However, apart from that set of changes, from where I sit, you and a small number of other people have gone beyond that argument and have now started objecting to nearly every change Paul makes for any reason, including changes that would have been entirely uncontroversial in previous years. And that's what I'm taking exception to. The bar that you're setting for "clear enhancement" is not consistent with how this project has ever been run in the past.
When the database moved to IANA, I argued that it should have gone to CLDR. They have a full technical architecture committee and a team of people used to dealing with complex political and i18n issues backed by enterprises such as IBM and Oracle. If it had gone there, the stability that I am calling for would have happened. For the record, I am not objecting to every change, I am objecting to every change that actively deletes data in a way that can be observed by a consumer, unless such a change is a clear enhancement. That may seem to you to be beyond what has happened in the past, but to me it is exactly how I expect the pre-eminent source of time-zone data to operate. In addition, if you read the entire archives of the last month as I did today, you will find numerous people asking for and expecting a greater degree of stability than recent events have shown, or has perhaps been the case in the past. Please don't fall into the mistake of assuming it's just me.
But from where I sit, the recent discussions have felt like more of a hostile takeover than a collaboration, particularly when you casually dismiss all the work Paul has done over the past several months.
I hope I have highlighted that a greater degree of stability is now a major concern not just of me, but of others, and I hope that will be taken into account in the future. I don't consider a request for stability to be a hostile takeover, but I do consider some of the proposed commits to be far in excess of what is acceptable to me. For the record, TZ Coordinator is not a job I would enjoy. Stephen
Stephen Colebourne <scolebourne@joda.org> writes:
Part of the problem is that commits to the repo form the basis of the permanent record of the group. IMO, if the commits are experimental, then they should occur first on a branch, and only be included onto the master branch once they are accepted. With the current way of working, someone in 5 years time is going to have to wade through a lot of rubbish over the past month to see what really happened.
I think you're reading too much into this. The use of any modern VCS for this project at all is quite new; prior to the change of maintainer, it was maintained in SCCS and the SCCS files were not even generally public. There are many possible workflows with Git, and they are, to a large extent, a matter of personal opinion and personal comfort. I probably would have used a branch, but I might not have, and in any event I don't think it's particularly important. I would certainly not object if Paul decided that maintaining a high quality commit stream with rich metadata was a maintenance standard that he wanted to adopt for the project, but to me this is a lot more about maintainer workflow than a vital output product of the project, and if it's not work he wants to do, oh well. We will all certainly survive, just as we survived just fine through years of ado using private SCCS files. I think your second point here is more on point:
Secondly, I would be less peeved if reversions were actually reversions. Each reversion that has occurred has been partial rather than complete, or has included some other unrelated change. A much better practice would be to revert in full, and then reapply a smaller change with the hope that may be more acceptable.
I do think that the way the changes were made and reverted has made it more difficult to understand what portion of the change was reverted and what portion of the change was not reverted. It's fairly easy for anyone to determine the remaining changes by doing a git diff across the relevant objects, but I do concur that using the git revert command to back out the whole change and then applying the change that one wants to retain as a separate change is cleaner and easier for everyone else to understand. This has the advantage of providing a place to put rationale for the retained change separate from rationale for the original change. It is, however, inconsistent with a GNU ChangeLog style of documentation of project history, at least as I understand the ChangeLog standard. It's much more of a "native Git" way of handling things.
When the database moved to IANA, I argued that it should have gone to CLDR. They have a full technical architecture committee and a team of people used to dealing with complex political and i18n issues backed by enterprises such as IBM and Oracle. If it had gone there, the stability that I am calling for would have happened.
I'm very glad this didn't happen. I would have had much less respect for and much less faith in that sort of process. The tz database has always been a labor of love by a small number of technical professionals acting on their own time and with their best personal judgement. It is, in that way, either a throwback to a different era of computing or a participant in the free software world, depending on your preferred way of looking at things. Personally, I believe that's one of the major reasons why it has been so successful over such a long period of time.
For the record, I am not objecting to every change, I am objecting to every change that actively deletes data in a way that can be observed by a consumer, unless such a change is a clear enhancement.
It is certainly within your right to make that objection. I personally disagree with you, and am taking this opportunity to say so. I do not believe that is the correct standard to use when deciding on changes. I prefer the standard that has been used by Paul, ado, kre, and others in the past, which takes into account stability, accuracy of data, proliferation of zones, and ease of future maintenance to arrive at a criterion with more nuance than that. The policy that I'm speaking up to advocate for is that the people doing the work should have the most weight in deciding on the strategy that's the most effective. They're the ones bearing the burden of the work, and they're also the ones with the most experience with the data and the most understanding of which data points are significant and which are not. This, like most software projects, is as much art as science; subjective judgements matter, and are the path to increased quality and consistency when the same judgement is consistently applied. Putting aside the issues around tzselect and geopolitical sorting for the moment, most of the other changes that have happened recently would probably have never been noticed by anyone who wasn't reading the commit stream for the project. There seems to be quite a tempest in a teapot here, at least from my perspective.
I don't consider a request for stability to be a hostile takeover, but I do consider some of the proposed commits to be far in excess of what is acceptable to me.
I hope you can understand why I find phrases like "far in excess of what is acceptable to me" to be hostile and confrontational, and not helpful in reaching a collaborative consensus. That's a bargaining tactic in a hostile negotiation, which I would really hope we could avoid. -- Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
<<On Wed, 04 Sep 2013 14:01:32 -0700, Russ Allbery <rra@stanford.edu> said:
I hope you can understand why I find phrases like "far in excess of what is acceptable to me" to be hostile and confrontational, and not helpful in reaching a collaborative consensus. That's a bargaining tactic in a hostile negotiation, which I would really hope we could avoid.
Seconded. I've actually been moderately unhappy with some of the recent changes myself, but I've refrained from saying anything because I think the atmosphere has been somewhat poisoned by the confrontational attitudes being taken by other participants. I do think, on a more general level, that it is a bit unfortunate that the format of the tzdata files is now effectively frozen because so many external users have chosen to parse the source rather than the compiled form. This will make it much more difficult to make changes in the future, should they be needed. I suppose this can be put down to the human-readable nature of the source files, as compared to the binary format emitted by zic (which has itself never had a backward-incompatible change, and like the source files is machine-independent). -GAWollman
On Sep 4, 2013, at 2:23 PM, Garrett Wollman <wollman@csail.mit.edu> wrote:
I do think, on a more general level, that it is a bit unfortunate that the format of the tzdata files is now effectively frozen because so many external users have chosen to parse the source rather than the compiled form. This will make it much more difficult to make changes in the future, should they be needed.
If we want to do that (e.g., if we were to introduce the CLDR concept of meta-zones into the tzdb, so that the CLDR doesn't have to duplicate transition time/date information that's in the tzdb - see the usesMetazone items in metaZones.xml), we might end up creating a new source file format and have a tool that reads the new format and emits the old format.
On 4 September 2013 22:32, Guy Harris <guy@alum.mit.edu> wrote:
On Sep 4, 2013, at 2:23 PM, Garrett Wollman <wollman@csail.mit.edu> wrote:
I do think, on a more general level, that it is a bit unfortunate that the format of the tzdata files is now effectively frozen because so many external users have chosen to parse the source rather than the compiled form. This will make it much more difficult to make changes in the future, should they be needed.
If we want to do that (e.g., if we were to introduce the CLDR concept of meta-zones into the tzdb, so that the CLDR doesn't have to duplicate transition time/date information that's in the tzdb - see the usesMetazone items in metaZones.xml), we might end up creating a new source file format and have a tool that reads the new format and emits the old format.
Just to note that I parse the source tzdb because the binary data does not contain all the data AFAIK. A new source format that generates a file in the old format would work so long as the distribution contained the generated old format files. Stephen
On Sep 4, 2013, at 2:36 PM, Stephen Colebourne <scolebourne@joda.org> wrote:
Just to note that I parse the source tzdb because the binary data does not contain all the data AFAIK.
Which data is that? Perhaps we should see if we can extend the binary data format (in a backwards-compatible fashion, so that old code can read the new files) to include that data.
On 4 September 2013 22:45, Guy Harris <guy@alum.mit.edu> wrote:
On Sep 4, 2013, at 2:36 PM, Stephen Colebourne <scolebourne@joda.org> wrote:
Just to note that I parse the source tzdb because the binary data does not contain all the data AFAIK.
Which data is that?
I have seen Java programs that expose every last drop of data in the files, except the comments. Notably, I don't believe that the binary data provides the rules themselves which allow future DST to be calculated. The parsers I write use those rules to create DST transitions to the far future. Stephen
Stephen Colebourne wrote:
Notably, I don't believe that the binary data provides the rules themselves which allow future DST to be calculated.
The last item in a version-2 tzfile is a POSIX (System V style) TZ string to be used for the indefinite future of the zone. My tzfile parser (the Perl module) makes use of this. There are a small number of zones for which such a TZ string can't be formulated, and in those cases zic tries to provide 400 years of explicit future transitions, so if the zone's rules are based on the Gregorian calendar you can repeat that 400 years indefinitely. There are an even smaller number of zones for which neither of these strategies works, but in those cases the zic source can't represent the rule either.

(Incidentally, I've posted a patch more than once that would make the 400 year hack more robust, and it seems to have got lost each time. It's sitting on a branch in my public git repo.)

The extra information not in the tzfiles that you can get from the zic source is limited to things like whether a transition was the result of a long-term rule as opposed to a one-off decision, and what the rule was. I don't think this is really meaningful information for the purposes for which the zone data is intended. It also doesn't look as if previous practice has tried to maintain this information in a meaningful way, other than perhaps in the comments. For example, the UK during the 1980s and early 1990s had DST dates officially set separately by Statutory Instrument for each year, but the individual decisions tended to follow a consistent pattern that the "europe" file describes by three multi-year "Rule" entries.

-zefram
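To illustrate the footer described above, here is a minimal sketch in C of pulling the trailing TZ string out of a tzfile image that is already in memory. The function name tzfile_footer is hypothetical and not part of tzcode. It relies only on the file ending with newline, TZ string, newline, and on the TZ string itself containing no newline; a careful reader would instead walk the header counts to find the footer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the footer TZ string of a version-2 (or later) tzfile image
 * into out.  Returns 0 on success, -1 if no footer can be located
 * or out is too small.  Hypothetical helper, not tzcode's own API.
 */
static int
tzfile_footer(const unsigned char *buf, size_t len,
	      char *out, size_t outsize)
{
	size_t start, end, n;

	if (len < 6 || memcmp(buf, "TZif", 4) != 0)
		return -1;		/* not a tzfile */
	if (buf[4] == '\0')
		return -1;		/* version 1: no footer */
	if (buf[len - 1] != '\n')
		return -1;		/* footer must end in newline */

	/* Scan back from the final newline to the one before it. */
	end = len - 1;
	start = end;
	while (start > 0 && buf[start - 1] != '\n')
		start--;
	if (start == 0)
		return -1;		/* no opening newline found */

	n = end - start;		/* may be 0: no POSIX form */
	if (n + 1 > outsize)
		return -1;
	memcpy(out, buf + start, n);
	out[n] = '\0';
	return 0;
}
```

An empty result corresponds to the case where nothing sits between the two newlines, that is, zic could not formulate a TZ string for the zone.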
Zefram wrote:
The extra information not in the tzfiles that you can get from the zic source is limited to things like whether a transition was the result of a long-term rule as opposed to a one-off decision, and what the rule was. I don't think this is really meaningful information for the purposes for which the zone data is intended. It also doesn't look as if previous practice has tried to maintain this information in a meaningful way, other than perhaps in the comments.
That's correct. In the data I've often coalesced repeated entries into a single rule that explains these entries. Usually these entries came from Shanks, so who knows how reliable they are? But sometimes they came from more-reliable sources. It's generally not that well-defined whether a series of laws, some marked as extensions of others, constitute one rule or several in the tz sense, and generally speaking I coalesced them when I could. Conversely, sometimes just a single legal rule (e.g., "first Sunday in April unless that's Easter, in which case the second Sunday") had to be split into multiple tz rules, since the tz rules don't have the notion of Easter built-in, so I had to enter each Easter manually as a special case.
Zefram wrote:
I've posted a patch more than once that would make the 400 year hack more robust, and it seems to have got lost each time. It's sitting on a branch in my public git repo.
Sorry about that, it was still in my inbox. I revisited it today. It inserts an artificial mark into the zic binary file so that, for zones in which POSIX TZ strings cannot support time stamps into the indefinite future, a consumer of the binary file can more easily determine where its data are valid.

This jogged my brain to look at two other items in my inbox that are relevant and less problematic. I fixed these and pushed the fixes into the experimental github.

First, "Record that San Luis is at UTC-3, not UTC-4 with perpetual DST." <http://mm.icann.org/pipermail/tz/2013-September/019996.html> adjusts Argentina/San_Luis to not have a weird setting of perpetual DST indefinitely into the future, a setting that POSIX TZ strings cannot represent.

Second, "Support time stamps past 2038 in zones like America/Santiago" <http://mm.icann.org/pipermail/tz/2013-September/020013.html> fixes zic to generate TZ strings that work into the indefinite future for almost all zones where this was a problem, thus removing the need for the artificial mark in these zones.

There's one remaining zone which has the problem, namely Asia/Tehran, but I don't think your 400-year change fixes it. So, I'm hoping that your fix is no longer needed, in that the other two patches have fixed things in a different way.

Anyway, thanks for bringing this up, as it prompted fixes for the problems in the database or the way it's processed on POSIX platforms.
Paul Eggert wrote:
There's one remaining zone which has the problem, namely Asia/Tehran, but I don't think your 400-year change fixes it.
Yeah, that zone doesn't work to a Gregorian-based rule.
So, I'm hoping that your fix is no longer needed, in that the other two patches have fixed things in a different way.
I don't think that follows. Even if the 400 year hack is no longer used for any current zone, that doesn't mean it'll never be applicable in the future. If zic is to retain the logic for the 400 year hack, it ought to be the robustified form of the hack. -zefram
Zefram wrote:
If zic is to retain the logic for the 400 year hack, it ought to be the robustified form of the hack.
Fair enough. But then I have a question about the change. It increases the window from 400 to 402 years. Is that part of the change needed? As I understand it, it's to avoid coalescing (say) a 399-year run of a rule to an adjacent one-off that *happens* to look like the extension of the rule. But is there really any harm to that? Such coalescing is what we already do, when preparing the input data.
Paul Eggert wrote:
It increases the window from 400 to 402 years. Is that part of the change needed?
Yes, that's an essential part of the robustification.
As I understand it, it's to avoid coalescing (say) a 399-year run of a rule to an adjacent one-off that *happens* to look like the extension of the rule.
You misunderstand it. The situation of concern is where the 400 year period governed by a rule is immediately preceded by something that is *different from* what the rule would have. If the preceding DST behaviour has transitions later in the year than the transitions produced by the rule, the last preceding transitions could occur less than 400 years before the last rule-generated transition that zic puts in the output.

For example, suppose you have a rule (not expressible in POSIX form) applying from 2012 onwards that has transitions in March and September, plus some one-off transitions in November 2013. zic sets max_year=2413, with the intent that the 400 years 2014 to 2413 inclusive will be repeated. The last transition listed in the tzfile is in 2413-09, less than 400 years after the last one-off transition in 2013-11.

The problem is how a tzfile reader is to determine which 400-year period to repeat. When creating the tzfile, zic internally has a clear idea of what period it intends to be repeated, but it doesn't write anything into the tzfile to make that period explicit. Indeed, it doesn't even explicitly indicate that it intends any such repetition. (Adding a flag for this, in the 15 reserved octets, might be a good idea.) The nearest thing the tzfile has to an explicit statement of the repeat period is the last transition time in the file; an obvious approach is to repeat the 400 years immediately preceding the last transition time.

A tzfile reader could potentially do better by rounding the last transition time up to the end of the containing calendar year. That would fix the example that I gave above. But it's based on knowledge of zic implementation details. Interpretation of the tzfile should be more objective than that. There are also edge cases around using extended times of day, whereby a transition notionally associated with one year could actually be located in a different year. (And do you use UT or local time for the year boundaries?)
Adding on some margin to the period of future explicit transitions ensures that any reasonable implementation of the repeat on the reading side will work. Two years might be overkill, but with the kind of edge cases available I'm not convinced that one year would suffice, and an extra two or so transitions in a tzfile is very cheap overkill. -zefram
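The example above can be checked with a little date arithmetic. The following C sketch is illustrative only: the names days_from_civil and in_naive_repeat_window are hypothetical, and the day-of-month values used with them (the first of each month) are assumptions, since the example gives only months. It counts days in the proleptic Gregorian calendar and asks whether a given day lies inside the 400 years (146097 days) that immediately precede the last transition, which is the window a naive reader would repeat.

```c
#include <stdint.h>

#define DAYS_PER_REPEAT 146097	/* days in 400 Gregorian years */

/* Days since 1970-01-01 for a proleptic Gregorian date (m is 1..12). */
static int64_t
days_from_civil(int64_t y, int m, int d)
{
	int64_t era, yoe, doy, doe;

	y -= m <= 2;			/* treat Jan/Feb as end of prior year */
	era = (y >= 0 ? y : y - 399) / 400;
	yoe = y - era * 400;		/* year of era, 0..399 */
	doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
	doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
	return era * DAYS_PER_REPEAT + doe - 719468;
}

/* Nonzero if day lies in the 400 years ending at last (inclusive). */
static int
in_naive_repeat_window(int64_t day, int64_t last)
{
	return day > last - DAYS_PER_REPEAT && day <= last;
}
```

With the last transition taken as 2413-09-01, a one-off transition on 2013-11-01 falls inside the naive window and would wrongly be repeated every 400 years, whereas anything before September 2013 would not.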
On Fri, Sep 6, 2013, at 11:38, Zefram wrote:
You misunderstand it. The situation of concern is where the 400 year period governed by a rule is immediately preceded by something that is *different from* what the rule would have. If the preceding DST behaviour has transitions later in the year than the transitions produced by the rule, the last preceding transitions could occur less than 400 years before the last rule-generated transition that zic puts in the output.
Why not put a non-transition (i.e. a transition to the same offset already in effect) at the beginning and end of the 400 year period?
Zefram wrote:
For example, suppose you have a rule (not expressible in POSIX form) applying from 2012 onwards that has transitions in March and September, plus some one-off transitions in November 2013.
The idea behind the recent changes is that all the rules one can write can be expressed in extended-POSIX form; if that's good enough, then the scenario you describe is no longer possible. Can you supply an example illustrating the problem? Does this have anything to do with yearistype.sh? Come to think of it, there is one situation the zic code doesn't cover -- perpetual DST. The code claims that this is a POSIX no-no, but on further thought that's incorrect; for example, for San Luis's perpetual-DST (before the database changed yesterday) we can use TZ='WART4WARST,J1/0,J365/24'. I'll come up with a patch to fix this.
This causes zic to generate, e.g., TZ='WART4WARST,J1/0,J365/24'
for perpetual DST in San Luis, Argentina.  See
<http://mm.icann.org/pipermail/tz/2013-September/020056.html>.
---
 zic.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/zic.c b/zic.c
index 260dc2e..502d81e 100644
--- a/zic.c
+++ b/zic.c
@@ -1886,7 +1886,7 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount)
 				stdrp = rp;
 		}
 		if (stdrp != NULL && stdrp->r_stdoff != 0)
-			return;	/* We end up in DST (a POSIX no-no). */
+			dstrp = stdrp;	/* We end up in DST. */
 		/*
 		** Horrid special case: if year is 2037,
 		** presume this is a zone handled on a year-by-year basis;
@@ -1913,12 +1913,16 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount)
 		return;
 	}
 	(void) strcat(result, ",");
-	if (stringrule(result, dstrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) {
+	if (dstrp == stdrp)
+		(void) strcat(result, "J1/0");
+	else if (stringrule(result, dstrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) {
 		result[0] = '\0';
 		return;
 	}
 	(void) strcat(result, ",");
-	if (stringrule(result, stdrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) {
+	if (dstrp == stdrp)
+		(void) strcat(result, "J365/24");
+	else if (stringrule(result, stdrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) {
 		result[0] = '\0';
 		return;
 	}
-- 
1.8.1.2
Paul Eggert wrote:
The idea behind the recent changes is that all the rules one can write can be expressed in extended-POSIX form;
Two periods of DST each year can't be expressed, or two-stage onset of DST. Anything involving more than two Rule entries being applicable up to max_year.
Does this have anything to do with yearistype.sh?
I hadn't considered year-type rules. That's another way to confound POSIX-TZ-ification.
San Luis's perpetual-DST (before the database changed yesterday) we can use TZ='WART4WARST,J1/0,J365/24'.
Emitting that kind of TZ value would be a bad idea. What you've actually written there, because of the way the transition time-of-day gets interpreted, has an hour each year of standard time. If you fix that (glossing over the question of whether it can be fixed), you're calling for two transitions to occur simultaneously, the behaviour of which is not well defined. -zefram
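The stray hour can be worked out by hand, since in a POSIX TZ string the start time is read in standard time and the end time in daylight saving time. Here is a small C sketch of that arithmetic for a non-leap year; the function name std_seconds_per_year and its parameters are hypothetical, chosen only for this illustration, and offsets are given as hours west of UTC as in the TZ string itself.

```c
#define SECSPERHOUR	3600L
#define SECSPERDAY	86400L
#define DAYSPERNYEAR	365L

/*
 * For a TZ string of the form STD<std_west>DST,J1/0,J365/<end_hour>,
 * return the seconds of standard time between DST ending in one
 * (non-leap) year and DST starting again in the next.
 */
static long
std_seconds_per_year(long std_west, long dst_west, long end_hour)
{
	/* DST starts J1 at 00:00 *standard* time; add hours west for UTC. */
	long start_utc = 0 * SECSPERDAY + std_west * SECSPERHOUR;

	/* DST ends J365 (day index 364) at end_hour *daylight* time. */
	long end_utc = 364 * SECSPERDAY + end_hour * SECSPERHOUR
	  + dst_west * SECSPERHOUR;

	/* The next DST start is one non-leap year after this one. */
	long next_start_utc = start_utc + DAYSPERNYEAR * SECSPERDAY;

	return next_start_utc - end_utc;
}
```

For WART4WARST,J1/0,J365/24 this gives std_seconds_per_year(4, 3, 24), which is 3600: DST ends at 03:00 UTC on January 1 and restarts at 04:00 UTC. Pushing the end hour out by the DST offset (end_hour of 25) makes the two instants coincide, which is the simultaneous-transition case whose behaviour POSIX leaves undefined.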
Zefram wrote:
What you've actually written there, because of the way the transition time-of-day gets interpreted, has an hour each year of standard time. If you fix that (glossing over the question of whether it can be fixed), you're calling for two transitions to occur simultaneously, the behaviour of which is not well defined.
Good catch. Since POSIX doesn't say what to do when the end-DST and start-DST transitions are simultaneous, this idea relies on an extension to POSIX. Since we're already relying on other extensions to POSIX as part of the recent changes, it should be OK to rely on this one as well, so long as we document what we're doing. Here's a further patch to do that. I plan to look into the other points your email mentions after addressing Arthur David Olson's points.
From 88e130ed2f5a9e13310f38d08428f1230067d568 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Sun, 8 Sep 2013 07:49:22 -0700 Subject: [PATCH] Improve the support for perpetual DST.
Problem reported by Zefram in <http://mm.icann.org/pipermail/tz/2013-September/020059.html>. * localtime.c (tzparse): Elide simultaneous entries out of and into DST. Since this optimization can elide all entries, avoid looping forever looking for entries that will never arrive. While we're at it, fix another portability bug where the code assumed wraparound on signed integer overflow. If the DST stop and start times are simultaneous, assume perpetual DST; the old version of this code did this for San Luis but I suspect it might not have done so for hypothetical examples. * newtzset.3, tzfile.5: Mention that as an extension to POSIX, if DST stops and starts at the same instant, it's assumed to be in effect all year. Give an example. Also, mention the old posix limit of 23 hours rather than 24. * zic.c (stringrule): Omit the "J" in January and February, as this can save a byte or two in the output. (rule_cmp): New function. (stringzone): Do a better job of constructing the standard-time abbreviation when there is perpetual DST. Defer to the new stringrule to construct the times for perpetual DST. Fix bug noted by Zefram, which caused a stray hour of standard time to be inserted in an otherwise perpetual DST. Previously, this code generated "WARST4WARST,J1/0,J365/24" for the San Luis example; now it generates "WART4WARST,0/1,0". Not only does this fix the bug, it is a bit shorter and more likely to work better with non-tzcode implementations that mistakenly treat this as specifying standard time all year. 
--- localtime.c | 47 ++++++++++++++++++++++++--------------------- newtzset.3 | 16 ++++++++++++++-- tzfile.5 | 9 ++++++--- zic.c | 63 +++++++++++++++++++++++++++++++++++++++++++++---------------- 4 files changed, 93 insertions(+), 42 deletions(-) diff --git a/localtime.c b/localtime.c index 91a3171..eb9a1a6 100644 --- a/localtime.c +++ b/localtime.c @@ -1008,6 +1008,7 @@ tzparse(const char *name, register struct state *const sp, struct rule start; struct rule end; register int year; + register int yearlim; register time_t janfirst; time_t starttime; time_t endtime; @@ -1035,35 +1036,39 @@ tzparse(const char *name, register struct state *const sp, atp = sp->ats; typep = sp->types; janfirst = 0; - sp->timecnt = 0; - for (year = EPOCH_YEAR; - sp->timecnt + 2 <= TZ_MAX_TIMES; - ++year) { - time_t newfirst; + yearlim = EPOCH_YEAR + YEARSPERREPEAT; + for (year = EPOCH_YEAR; year < yearlim; year++) { + int_fast32_t yearsecs; starttime = transtime(janfirst, year, &start, stdoffset); endtime = transtime(janfirst, year, &end, dstoffset); - if (starttime > endtime) { - *atp++ = endtime; - *typep++ = 1; /* DST ends */ - *atp++ = starttime; - *typep++ = 0; /* DST begins */ - } else { - *atp++ = starttime; - *typep++ = 0; /* DST begins */ - *atp++ = endtime; - *typep++ = 1; /* DST ends */ + if (starttime != endtime) { + if (&sp->ats[TZ_MAX_TIMES - 2] < atp) + break; + yearlim = year + YEARSPERREPEAT + 1; + if (starttime > endtime) { + *atp++ = endtime; + *typep++ = 1; /* DST ends */ + *atp++ = starttime; + *typep++ = 0; /* DST begins */ + } else { + *atp++ = starttime; + *typep++ = 0; /* DST begins */ + *atp++ = endtime; + *typep++ = 1; /* DST ends */ + } } - sp->timecnt += 2; - newfirst = janfirst; - newfirst += year_lengths[isleap(year)] * - SECSPERDAY; - if (newfirst <= janfirst) + yearsecs = (year_lengths[isleap(year)] + * SECSPERDAY); + if (time_t_max - janfirst < yearsecs) break; - janfirst = newfirst; + janfirst += yearsecs; } + sp->timecnt = atp - sp->ats; + if 
(!sp->timecnt) + sp->typecnt = 1; /* Perpetual DST. */ } else { register int_fast32_t theirstdoffset; register int_fast32_t theirdstoffset; diff --git a/newtzset.3 b/newtzset.3 index bb40c01..b05a6a3 100644 --- a/newtzset.3 +++ b/newtzset.3 @@ -108,7 +108,8 @@ follows summer time is assumed to be one hour ahead of standard time. One or more digits may be used; the value is always interpreted as a decimal number. The hour must be between zero and 24, and the minutes (and -seconds) \(em if present \(em between zero and 59. If preceded by a +seconds) \(em if present \(em between zero and 59. (Older versions +of POSIX do not allow the hour to be 24.) If preceded by a .RB `` \(mi '', the time zone shall be east of the Prime Meridian; otherwise it shall be west (which may be indicated by an optional preceding @@ -132,6 +133,9 @@ describes when the change back happens. Each .I time field describes when, in current local time, the change to the other time is made. +As an extension to POSIX, if daylight saving time stops and +starts at the same instant of time, daylight saving time is +assumed to be in effect all year. .IP The format of .I date @@ -183,7 +187,7 @@ or .RB `` \(pl ''). As an extension to POSIX, the hours part of .I time -can range from \(mi167 to 167; this allows for unusual rules such +can range from \(mi167 through 167; this allows for unusual rules such as "the Saturday before the first Sunday of March". The default, if .I time is not given, is @@ -212,6 +216,14 @@ stands for Israel standard time (IST) and Israel daylight time (IDT), fourth Thursday in March (i.e., 02:00 on the first Friday on or after March 23), and fall back at 02:00 on the last Sunday in October. .TP +.B WART4WARST,0/1,0 +stands for Western Argentina Summer Time (WARST), 3 hours behind UTC. 
+There is a dummy transition to standard time on January 1 at 02:00 +daylight saving time, and a simultaneous transition back to DST at +01:00 standard time, so DST is in effect all year and the initial +.B WART +is a placeholder. +.TP .B WGT3WGST,M3.5.0/\(mi2,M10.5.0/\(mi1 stands for Western Greenland Time (WGT) and Western Greenland Summer Time (WGST), 3 hours behind UTC, where clocks follow the EU rules of diff --git a/tzfile.5 b/tzfile.5 index c7bd40e..e69cf3e 100644 --- a/tzfile.5 +++ b/tzfile.5 @@ -145,10 +145,13 @@ POSIX-TZ-environment-variable-style string for use in handling instants after the last transition time stored in the file (with nothing between the newlines if there is no POSIX representation for such instants). -This string may use a minor extension to the POSIX TZ format: the -hours part of its transition times may be signed and range from +As described in +.IR newtzset (3), +this string may use two minor extensions to the POSIX TZ format. +First, the hours part of its transition times may be signed and range from \(mi167 through 167 instead of the POSIX-required unsigned values -from 0 through 24. +from 0 through 24 (formerly 23). Second, if DST stops and starts +at the same time, it is assumed to be in effect all year. .SH SEE ALSO newctime(3), newtzset(3) .\" This file is in the public domain, so clarified as of diff --git a/zic.c b/zic.c index 502d81e..cd787b0 100644 --- a/zic.c +++ b/zic.c @@ -1804,7 +1804,11 @@ stringrule(char *result, const struct rule *const rp, const zic_t dstoff, total = 0; for (month = 0; month < rp->r_month; ++month) total += len_months[0][month]; - (void) sprintf(result, "J%d", total + rp->r_dayofmonth); + /* Omit the "J" in Jan and Feb, as that's shorter. 
*/ + if (rp->r_month <= 1) + (void) sprintf(result, "%d", total + rp->r_dayofmonth - 1); + else + (void) sprintf(result, "J%d", total + rp->r_dayofmonth); } else { register int week; register int wday = rp->r_wday; @@ -1842,6 +1846,20 @@ stringrule(char *result, const struct rule *const rp, const zic_t dstoff, return 0; } +static int +rule_cmp(struct rule const *a, struct rule const *b) +{ + if (!a) + return -!!b; + if (!b) + return 1; + if (a->r_hiyear != b->r_hiyear) + return a->r_hiyear < b->r_hiyear ? -1 : 1; + if (a->r_month - b->r_month != 0) + return a->r_month - b->r_month; + return a->r_dayofmonth - b->r_dayofmonth; +} + static void stringzone(char *result, const struct zone *const zpfirst, const int zonecount) { @@ -1851,6 +1869,7 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount) register struct rule * dstrp; register int i; register const char * abbrvar; + struct rule stdr, dstr; result[0] = '\0'; zp = zpfirst + zonecount - 1; @@ -1874,19 +1893,17 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount) if (stdrp == NULL && dstrp == NULL) { /* ** There are no rules running through "max". - ** Let's find the latest rule. + ** Find the latest std rule in stdabbrrp + ** and latest rule of any type in stdrp. */ + register struct rule *stdabbrrp = NULL; for (i = 0; i < zp->z_nrules; ++i) { rp = &zp->z_rules[i]; - if (stdrp == NULL || rp->r_hiyear > stdrp->r_hiyear || - (rp->r_hiyear == stdrp->r_hiyear && - (rp->r_month > stdrp->r_month || - (rp->r_month == stdrp->r_month && - rp->r_dayofmonth > stdrp->r_dayofmonth)))) - stdrp = rp; + if (rp->r_stdoff == 0 && rule_cmp(stdabbrrp, rp) < 0) + stdabbrrp = rp; + if (rule_cmp(stdrp, rp) < 0) + stdrp = rp; } - if (stdrp != NULL && stdrp->r_stdoff != 0) - dstrp = stdrp; /* We end up in DST. 
*/ /* ** Horrid special case: if year is 2037, ** presume this is a zone handled on a year-by-year basis; @@ -1894,6 +1911,24 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount) */ if (stdrp != NULL && stdrp->r_hiyear == 2037) return; + + if (stdrp != NULL && stdrp->r_stdoff != 0) { + /* Perpetual DST. */ + stdr.r_month = dstr.r_month = TM_JANUARY; + stdr.r_dycode = dstr.r_dycode = DC_DOM; + stdr.r_dayofmonth = dstr.r_dayofmonth = 1; + stdr.r_tod = 2 * SECSPERHOUR; + dstr.r_tod = stdr.r_tod - stdrp->r_stdoff; + stdr.r_todisstd = dstr.r_todisstd = FALSE; + stdr.r_todisgmt = dstr.r_todisgmt = FALSE; + stdr.r_stdoff = 0; + dstr.r_stdoff = stdrp->r_stdoff; + stdr.r_abbrvar + = (stdabbrrp ? stdabbrrp->r_abbrvar : ""); + dstr.r_abbrvar = stdrp->r_abbrvar; + stdrp = &stdr; + dstrp = &dstr; + } } if (stdrp == NULL && (zp->z_nrules != 0 || zp->z_stdoff != 0)) return; @@ -1913,16 +1948,12 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount) return; } (void) strcat(result, ","); - if (dstrp == stdrp) - (void) strcat(result, "J1/0"); - else if (stringrule(result, dstrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) { + if (stringrule(result, dstrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) { result[0] = '\0'; return; } (void) strcat(result, ","); - if (dstrp == stdrp) - (void) strcat(result, "J365/24"); - else if (stringrule(result, stdrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) { + if (stringrule(result, stdrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) { result[0] = '\0'; return; } -- 1.8.1.2
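The stringrule change in this patch leans on the two day-of-year forms that POSIX TZ strings allow: Jn counts from 1 and never counts February 29, while a bare n counts from 0 and does count it. Before March the two differ only by one, so the shorter bare form can be substituted there. The following C sketch mirrors that logic in isolation; format_posix_day is a hypothetical stand-in written for this illustration, not the function from zic.c.

```c
#include <stddef.h>
#include <stdio.h>

/* Month lengths in a non-leap year, January first. */
static const int len_months[12] = {
	31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
};

/*
 * Write the POSIX TZ date field for a fixed day of the month.
 * month is 0-based (0 = January); dayofmonth is 1-based.
 */
static void
format_posix_day(char *out, size_t outsize, int month, int dayofmonth)
{
	int total = 0;
	int m;

	for (m = 0; m < month; m++)
		total += len_months[m];
	if (month <= 1)		/* Jan or Feb: zero-based form is shorter */
		(void) snprintf(out, outsize, "%d", total + dayofmonth - 1);
	else			/* Mar onward: J form ignores Feb 29 */
		(void) snprintf(out, outsize, "J%d", total + dayofmonth);
}
```

January 1 comes out as 0 rather than J1, while March 1 stays J60, because from March onward the zero-based form would shift by a day in leap years.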
Paul Eggert wrote:
Since we're already relying on other extensions to POSIX as part of the recent changes, it should be OK to rely on this one as well,
If you're going to extend POSIX TZ as much as necessary, it can be extended much more cleanly than this. Allowing a wider range of time-of-day in the TZ format is qualitatively less troublesome than defining entirely new semantics. -zefram
Zefram wrote:
If you're going to extend POSIX TZ as much as necessary, it can be extended much more cleanly than this.
Absolutely, but one goal of extending POSIX TZ in that less-clean way is to make the extension more likely to work with current systems; that is why my original idea specified January 1 through December 31.
Allowing a wider range of time-of-day in the TZ format is qualitatively less troublesome
OK, but we can implement perpetual DST merely in terms of a wider range of time-of-day. Further testing with tzcode2013d, GNU/Linux and Solaris suggests that we can use a pattern like this: WART4WARST,0/0,J365/25 This already works with these existing systems, i.e., it's a POSIX extension that's already widely supported. Here's a revised patch to implement this. I've pushed this into the experimental version on github.
From 30364485a6fdc0333f48905db387e07abb70bac1 Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Sun, 8 Sep 2013 07:49:22 -0700 Subject: [PATCH] Improve the support for perpetual DST.
Problem reported by Zefram in <http://mm.icann.org/pipermail/tz/2013-September/020059.html>. * localtime.c (tzparse): Elide simultaneous entries out of and into DST if DST goes for an hour more than a year (actually, for more than the DST offset more than a year). Since this optimization can elide all entries, avoid looping forever looking for entries that will never arrive. While we're at it, fix another portability bug where the code assumed wraparound on signed integer overflow. * newtzset.3, tzfile.5: Mention that as an extension to POSIX, if DST covers the entire year plus the DST offset, it's assumed to be in effect all year. Give an example. * zic.c (stringrule): Omit the "J" in January and February, as this can save a byte or two in the output. (rule_cmp): New function. (stringzone): Do a better job of constructing the standard-time abbreviation when there is perpetual DST. Defer to the new stringrule to construct the times for perpetual DST. Fix bug noted by Zefram, which caused a stray hour of standard time to be inserted in an otherwise perpetual DST. Previously, this code generated "WARST4WARST,J1/0,J365/24" for the San Luis example; now it generates "WART4WARST,0/0,J365/25". 
--- localtime.c | 51 +++++++++++++++++++++++++++-------------------- newtzset.3 | 16 ++++++++++++++- tzfile.5 | 10 +++++++--- zic.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++--------------- 4 files changed, 102 insertions(+), 41 deletions(-) diff --git a/localtime.c b/localtime.c index 91a3171..f2004b5 100644 --- a/localtime.c +++ b/localtime.c @@ -1008,6 +1008,7 @@ tzparse(const char *name, register struct state *const sp, struct rule start; struct rule end; register int year; + register int yearlim; register time_t janfirst; time_t starttime; time_t endtime; @@ -1035,35 +1036,43 @@ tzparse(const char *name, register struct state *const sp, atp = sp->ats; typep = sp->types; janfirst = 0; - sp->timecnt = 0; - for (year = EPOCH_YEAR; - sp->timecnt + 2 <= TZ_MAX_TIMES; - ++year) { - time_t newfirst; + yearlim = EPOCH_YEAR + YEARSPERREPEAT; + for (year = EPOCH_YEAR; year < yearlim; year++) { + int_fast32_t yearsecs; starttime = transtime(janfirst, year, &start, stdoffset); endtime = transtime(janfirst, year, &end, dstoffset); - if (starttime > endtime) { - *atp++ = endtime; - *typep++ = 1; /* DST ends */ - *atp++ = starttime; - *typep++ = 0; /* DST begins */ - } else { - *atp++ = starttime; - *typep++ = 0; /* DST begins */ - *atp++ = endtime; - *typep++ = 1; /* DST ends */ + yearsecs = (year_lengths[isleap(year)] + * SECSPERDAY); + if (starttime > endtime + || (starttime < endtime + && (endtime - starttime + < (yearsecs + + (stdoffset - dstoffset))))) { + if (&sp->ats[TZ_MAX_TIMES - 2] < atp) + break; + yearlim = year + YEARSPERREPEAT + 1; + if (starttime > endtime) { + *atp++ = endtime; + *typep++ = 1; /* DST ends */ + *atp++ = starttime; + *typep++ = 0; /* DST begins */ + } else { + *atp++ = starttime; + *typep++ = 0; /* DST begins */ + *atp++ = endtime; + *typep++ = 1; /* DST ends */ + } } - sp->timecnt += 2; - newfirst = janfirst; - newfirst += year_lengths[isleap(year)] * - SECSPERDAY; - if (newfirst <= janfirst) + if (time_t_max - janfirst < yearsecs) 
break; - janfirst = newfirst; + janfirst += yearsecs; } + sp->timecnt = atp - sp->ats; + if (!sp->timecnt) + sp->typecnt = 1; /* Perpetual DST. */ } else { register int_fast32_t theirstdoffset; register int_fast32_t theirdstoffset; diff --git a/newtzset.3 b/newtzset.3 index bb40c01..618c92d 100644 --- a/newtzset.3 +++ b/newtzset.3 @@ -132,6 +132,10 @@ describes when the change back happens. Each .I time field describes when, in current local time, the change to the other time is made. +As an extension to POSIX, daylight saving is assumed to be in effect +all year if it begins January 1 at 00:00 and ends December 31 at +24:00 plus the difference between daylight saving and standard time, +leaving no room for standard time in the calendar. .IP The format of .I date @@ -183,7 +187,7 @@ or .RB `` \(pl ''). As an extension to POSIX, the hours part of .I time -can range from \(mi167 to 167; this allows for unusual rules such +can range from \(mi167 through 167; this allows for unusual rules such as "the Saturday before the first Sunday of March". The default, if .I time is not given, is @@ -212,6 +216,16 @@ stands for Israel standard time (IST) and Israel daylight time (IDT), fourth Thursday in March (i.e., 02:00 on the first Friday on or after March 23), and fall back at 02:00 on the last Sunday in October. .TP +.B WART4WARST,J1/0,J365/25 +stands for Western Argentina Summer Time (WARST), 3 hours behind UTC. +There is a dummy transition to standard time on December 31 at 25:00 +daylight saving time (i.e., 24:00 standard time, equivalent to January +1 at 00:00 standard time), and a simultaneous transition to daylight +saving time on January 1 at 00:00 standard time, so daylight saving +time is in effect all year and the initial +.B WART +is a placeholder. 
+.TP .B WGT3WGST,M3.5.0/\(mi2,M10.5.0/\(mi1 stands for Western Greenland Time (WGT) and Western Greenland Summer Time (WGST), 3 hours behind UTC, where clocks follow the EU rules of diff --git a/tzfile.5 b/tzfile.5 index c7bd40e..b2d1a4d 100644 --- a/tzfile.5 +++ b/tzfile.5 @@ -145,10 +145,14 @@ POSIX-TZ-environment-variable-style string for use in handling instants after the last transition time stored in the file (with nothing between the newlines if there is no POSIX representation for such instants). -This string may use a minor extension to the POSIX TZ format: the -hours part of its transition times may be signed and range from +As described in +.IR newtzset (3), +this string may use two minor extensions to the POSIX TZ format. +First, the hours part of its transition times may be signed and range from \(mi167 through 167 instead of the POSIX-required unsigned values -from 0 through 24. +from 0 through 24. Second, DST is in effect all year if it starts +January 1 at 00:00 and ends December 31 at 24:00 plus the difference +between daylight saving and standard time. .SH SEE ALSO newctime(3), newtzset(3) .\" This file is in the public domain, so clarified as of diff --git a/zic.c b/zic.c index 502d81e..15a1f5f 100644 --- a/zic.c +++ b/zic.c @@ -1804,7 +1804,11 @@ stringrule(char *result, const struct rule *const rp, const zic_t dstoff, total = 0; for (month = 0; month < rp->r_month; ++month) total += len_months[0][month]; - (void) sprintf(result, "J%d", total + rp->r_dayofmonth); + /* Omit the "J" in Jan and Feb, as that's shorter. 
*/ + if (rp->r_month <= 1) + (void) sprintf(result, "%d", total + rp->r_dayofmonth - 1); + else + (void) sprintf(result, "J%d", total + rp->r_dayofmonth); } else { register int week; register int wday = rp->r_wday; @@ -1842,6 +1846,20 @@ stringrule(char *result, const struct rule *const rp, const zic_t dstoff, return 0; } +static int +rule_cmp(struct rule const *a, struct rule const *b) +{ + if (!a) + return -!!b; + if (!b) + return 1; + if (a->r_hiyear != b->r_hiyear) + return a->r_hiyear < b->r_hiyear ? -1 : 1; + if (a->r_month - b->r_month != 0) + return a->r_month - b->r_month; + return a->r_dayofmonth - b->r_dayofmonth; +} + static void stringzone(char *result, const struct zone *const zpfirst, const int zonecount) { @@ -1851,6 +1869,7 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount) register struct rule * dstrp; register int i; register const char * abbrvar; + struct rule stdr, dstr; result[0] = '\0'; zp = zpfirst + zonecount - 1; @@ -1874,19 +1893,17 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount) if (stdrp == NULL && dstrp == NULL) { /* ** There are no rules running through "max". - ** Let's find the latest rule. + ** Find the latest std rule in stdabbrrp + ** and latest rule of any type in stdrp. */ + register struct rule *stdabbrrp = NULL; for (i = 0; i < zp->z_nrules; ++i) { rp = &zp->z_rules[i]; - if (stdrp == NULL || rp->r_hiyear > stdrp->r_hiyear || - (rp->r_hiyear == stdrp->r_hiyear && - (rp->r_month > stdrp->r_month || - (rp->r_month == stdrp->r_month && - rp->r_dayofmonth > stdrp->r_dayofmonth)))) - stdrp = rp; + if (rp->r_stdoff == 0 && rule_cmp(stdabbrrp, rp) < 0) + stdabbrrp = rp; + if (rule_cmp(stdrp, rp) < 0) + stdrp = rp; } - if (stdrp != NULL && stdrp->r_stdoff != 0) - dstrp = stdrp; /* We end up in DST. 
*/ /* ** Horrid special case: if year is 2037, ** presume this is a zone handled on a year-by-year basis; @@ -1894,6 +1911,27 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount) */ if (stdrp != NULL && stdrp->r_hiyear == 2037) return; + + if (stdrp != NULL && stdrp->r_stdoff != 0) { + /* Perpetual DST. */ + dstr.r_month = TM_JANUARY; + dstr.r_dycode = DC_DOM; + dstr.r_dayofmonth = 1; + dstr.r_tod = 0; + dstr.r_todisstd = dstr.r_todisgmt = FALSE; + dstr.r_stdoff = stdrp->r_stdoff; + dstr.r_abbrvar = stdrp->r_abbrvar; + stdr.r_month = TM_DECEMBER; + stdr.r_dycode = DC_DOM; + stdr.r_dayofmonth = 31; + stdr.r_tod = SECSPERDAY + stdrp->r_stdoff; + stdr.r_todisstd = stdr.r_todisgmt = FALSE; + stdr.r_stdoff = 0; + stdr.r_abbrvar + = (stdabbrrp ? stdabbrrp->r_abbrvar : ""); + dstrp = &dstr; + stdrp = &stdr; + } } if (stdrp == NULL && (zp->z_nrules != 0 || zp->z_stdoff != 0)) return; @@ -1913,16 +1951,12 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount) return; } (void) strcat(result, ","); - if (dstrp == stdrp) - (void) strcat(result, "J1/0"); - else if (stringrule(result, dstrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) { + if (stringrule(result, dstrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) { result[0] = '\0'; return; } (void) strcat(result, ","); - if (dstrp == stdrp) - (void) strcat(result, "J365/24"); - else if (stringrule(result, stdrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) { + if (stringrule(result, stdrp, dstrp->r_stdoff, zp->z_gmtoff) != 0) { result[0] = '\0'; return; } -- 1.8.1.2
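The "perpetual DST" convention described in the newtzset.3 change above can be sketched as a small predicate. This is only an illustration of the documented condition (DST begins January 1 at 00:00 and ends December 31 at 24:00 plus the DST/standard difference); the struct and function names are hypothetical and are not taken from zic.c or localtime.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SECSPERDAY = 86400 };

/* Hypothetical, simplified view of the two transitions in a POSIX-style
   TZ rule such as "WART4WARST,J1/0,J365/25".  Not a real tz structure. */
struct tz_rule_sketch {
    int  dst_start_jday;  /* "Jn" day on which DST starts, 1..365 */
    long dst_start_secs;  /* seconds after midnight, in standard time */
    int  dst_end_jday;    /* "Jn" day on which DST ends, 1..365 */
    long dst_end_secs;    /* seconds after midnight, in daylight time */
    long save_secs;       /* daylight saving time minus standard time */
};

/* True if the rule leaves no room for standard time in the calendar:
   DST starts at J1/0 and ends at J365 at 24:00 plus the saving. */
static bool
is_perpetual_dst(const struct tz_rule_sketch *r)
{
    assert(r != NULL);
    return r->dst_start_jday == 1
        && r->dst_start_secs == 0
        && r->dst_end_jday == 365
        && r->dst_end_secs == SECSPERDAY + r->save_secs;
}
```

With a one-hour saving, "J1/0,J365/25" satisfies the predicate, whereas the "J1/0,J365/24" that the removed zic.c lines used to emit does not, since it would leave an hour of standard time at the end of the year.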
Zefram wrote:
Two periods of DST each year can't be expressed, or two-stage onset of DST. Anything involving more than two Rule entries being applicable up to max_year.
Good point. I have pushed what I think is a merge of your change into the experimental version on github (see attached). In the current tz data this change affects only Asia/Tehran, and does not affect the output of 'zdump -v Asia/Tehran'. I assume that makes sense; if not, please let me know.
From cd270af1583e13bf70331d78d8aecc56052190b0 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 8 Sep 2013 23:53:35 -0700
Subject: [PATCH] * zic.c: Tweak 400-years-hack to handle some edge cases better.
Derived from Zefram's patch mentioned in <http://mm.icann.org/pipermail/tz/2013-July/019470.html>. With the current tz data, this affects only the Asia/Tehran file, and it doesn't affect zdump output. (YEAR_BY_YEAR_ZONE): New constant. (stringzone): Return it when applicable. (outzone): Search through a couple of extra years when extending. Extend when we ran past 2037 in the data, too. --- zic.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 65 insertions(+), 8 deletions(-) diff --git a/zic.c b/zic.c index dcab3aa..9939195 100644 --- a/zic.c +++ b/zic.c @@ -1873,6 +1873,8 @@ rule_cmp(struct rule const *a, struct rule const *b) return a->r_dayofmonth - b->r_dayofmonth; } +enum { YEAR_BY_YEAR_ZONE = 1 }; + static int stringzone(char *result, const struct zone *const zpfirst, const int zonecount) { @@ -1925,7 +1927,7 @@ stringzone(char *result, const struct zone *const zpfirst, const int zonecount) ** do not try to apply a rule to the zone. */ if (stdrp != NULL && stdrp->r_hiyear == 2037) - return -1; + return YEAR_BY_YEAR_ZONE; if (stdrp != NULL && stdrp->r_stdoff != 0) { /* Perpetual DST. */ @@ -2006,6 +2008,7 @@ outzone(const struct zone * const zpfirst, const int zonecount) register int max_envvar_len; register int prodstic; /* all rules are min to max */ register int compat; + register int do_extend; max_abbr_len = 2 + max_format_len + max_abbrvar_len; max_envvar_len = 2 * max_abbr_len + 5 * 9; @@ -2055,7 +2058,8 @@ outzone(const struct zone * const zpfirst, const int zonecount) ** Generate lots of data if a rule can't cover all future times. 
*/ compat = stringzone(envvar, zpfirst, zonecount); - if (noise && compat != 0) { + do_extend = compat < 0 || compat == YEAR_BY_YEAR_ZONE; + if (noise && compat != 0 && compat != YEAR_BY_YEAR_ZONE) { if (compat < 0) warning("%s %s", _("no POSIX environment variable for zone"), @@ -2069,12 +2073,27 @@ outzone(const struct zone * const zpfirst, const int zonecount) zpfirst->z_name, compat); } } - if (envvar[0] == '\0') { - if (min_year >= ZIC_MIN + YEARSPERREPEAT) - min_year -= YEARSPERREPEAT; + if (do_extend) { + /* + ** Search through a couple of extra years past the obvious + ** 400, to avoid edge cases. For example, suppose a non-POSIX + ** rule applies from 2012 onwards and has transitions in March + ** and September, plus some one-off transitions in November + ** 2013. If zic looked only at the last 400 years, it would + ** set max_year=2413, with the intent that the 400 years 2014 + ** through 2413 will be repeated. The last transition listed + ** in the tzfile would be in 2413-09, less than 400 years + ** after the last one-off transition in 2013-11. Two years + ** might be overkill, but with the kind of edge cases + ** available we're not sure that one year would suffice. 
+ */ + enum { years_of_observations = YEARSPERREPEAT + 2 }; + + if (min_year >= ZIC_MIN + years_of_observations) + min_year -= years_of_observations; else min_year = ZIC_MIN; - if (max_year <= ZIC_MAX - YEARSPERREPEAT) - max_year += YEARSPERREPEAT; + if (max_year <= ZIC_MAX - years_of_observations) + max_year += years_of_observations; else max_year = ZIC_MAX; /* ** Regardless of any of the above, @@ -2084,7 +2103,7 @@ outzone(const struct zone * const zpfirst, const int zonecount) */ if (prodstic) { min_year = 1900; - max_year = min_year + YEARSPERREPEAT; + max_year = min_year + years_of_observations; } } /* @@ -2250,6 +2269,44 @@ error(_("can't determine time zone abbreviation to use just after until time")); starttime = tadd(starttime, -gmtoff); } } + if (do_extend) { + /* + ** If we're extending the explicitly listed observations + ** for 400 years because we can't fill the POSIX-TZ field, + ** check whether we actually ended up explicitly listing + ** observations through that period. If there aren't any + ** near the end of the 400-year period, add a redundant + ** one at the end of the final year, to make it clear + ** that we are claiming to have definite knowledge of + ** the lack of transitions up to that point. + */ + struct rule xr; + struct attype *lastat; + xr.r_month = TM_JANUARY; + xr.r_dycode = DC_DOM; + xr.r_dayofmonth = 1; + xr.r_tod = 0; + for (lastat = &attypes[0], i = 1; i < timecnt; i++) + if (attypes[i].at > lastat->at) + lastat = &attypes[i]; + if (lastat->at < rpytime(&xr, max_year - 1)) { + /* + ** Create new type code for the redundant entry, + ** to prevent it being optimised away. 
+ */ + if (typecnt >= TZ_MAX_TYPES) { + error(_("too many local time types")); + exit(EXIT_FAILURE); + } + gmtoffs[typecnt] = gmtoffs[lastat->type]; + isdsts[typecnt] = isdsts[lastat->type]; + ttisstds[typecnt] = ttisstds[lastat->type]; + ttisgmts[typecnt] = ttisgmts[lastat->type]; + abbrinds[typecnt] = abbrinds[lastat->type]; + ++typecnt; + addtt(rpytime(&xr, max_year + 1), typecnt-1); + } + } writezone(zpfirst->z_name, envvar); free(startbuf); free(ab); -- 1.8.1.2
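For readers wondering why the patch above works in units of 400 years: the Gregorian leap-year pattern repeats every 400 years, and that span contains a whole number of weeks, so weekday-based rules line up again too. The sketch below checks this arithmetic. YEARSPERREPEAT is the name used in the patch; the helper functions are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

enum { YEARSPERREPEAT = 400, DAYSPERWEEK = 7 };

/* Gregorian leap-year test; intended for positive years. */
static bool
is_leap(long year)
{
    return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
}

/* Number of days in the nyears Gregorian years starting at first_year. */
static long
days_in_span(long first_year, int nyears)
{
    long days = 0;
    int i;

    assert(first_year > 0 && nyears >= 0);
    for (i = 0; i < nyears; i++)
        days += is_leap(first_year + i) ? 366 : 365;
    return days;
}

/* True if the span is a whole number of weeks, so that dates after the
   span fall on the same weekdays as the corresponding dates before it. */
static bool
span_preserves_weekdays(long first_year, int nyears)
{
    return days_in_span(first_year, nyears) % DAYSPERWEEK == 0;
}
```

Any 400 consecutive years contain 146097 days, which is exactly 20871 weeks. A 100-year span such as 2014 through 2113 does not have this property.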
On Wed, Sep 4, 2013, at 17:36, Stephen Colebourne wrote:
Just to note that I parse the source tzdb because the binary data does not contain all the data AFAIK. A new source format that generates a file in the old format would work so long as the distribution contained the generated old format files.
The only data that the binary data doesn't contain is the question whether some transition was based on a "last sunday" rule or a fixed date, which is an entirely academic question for past data, and which it actually _does_ usually contain for the present/open-ended-future data (where it doesn't, it attempts to include a full 400 years on the assumption that the transitions will repeat exactly after 400 years). It's not clear why you need that information. Is it possible that you made this design decision before the 64-bit format was added, and needed to support dates after 2038?
On 5 September 2013 15:12, <random832@fastmail.us> wrote:
The only data that the binary data doesn't contain is the question whether some transition was based on a "last sunday" rule or a fixed date, which is an entirely academic question for past data, and which it actually _does_ usually contain for the present/open-ended-future data (where it doesn't, it attempts to include a full 400 years on the assumption that the transitions will repeat exactly after 400 years)
It's not clear why you need that information. Is it possible that you made this design transition before the 64-bit format was added, and needed to support dates after 2038?
We do support data after 2038, and do expose the actual rule for DST. So a JSR-310 API user can query the fact that the rule is at 1am wall time on the Last Sunday in March etc. http://download.java.net/jdk8/docs/api/java/time/zone/ZoneOffsetTransitionRu... Even if the binary data does now contain the info, with the parser written, working, and tested there would be no value to us in changing to the binary form. Stephen
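To make the "last Sunday in March" kind of rule concrete, here is a minimal C sketch that turns such a rule into a day of the month for a given year. It is not the JSR-310 API and not code from zic; the function names are hypothetical, and the weekday computation uses Sakamoto's well-known method for the Gregorian calendar.

```c
#include <assert.h>

/* Day of the week for a Gregorian date: 0 = Sunday ... 6 = Saturday.
   Uses Sakamoto's method; month is 1..12. */
static int
day_of_week(int year, int month, int day)
{
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};

    assert(1 <= month && month <= 12);
    assert(1 <= day && day <= 31);
    if (month < 3)
        year -= 1;
    return (year + year / 4 - year / 100 + year / 400
            + t[month - 1] + day) % 7;
}

/* Day of the month of the last Sunday in March.  March always has 31
   days, so step back from the 31st by that day's weekday number. */
static int
last_sunday_of_march(int year)
{
    return 31 - day_of_week(year, 3, 31);
}
```

A consumer that reads only explicit transitions from a compiled file sees the resulting dates (for instance the 31st in 2013 and the 30th in 2014) but not the rule that generated them, which is the distinction discussed above.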
I do think, on a more general level, that it is a bit unfortunate that the format of the tzdata files is now effectively frozen because so many external users have chosen to parse the source rather than the compiled form. This will make it much more difficult to make changes in the future, should they be needed.
If my understanding is correct, the output from zic does not preserve daylight saving amount. The TimeZone class in our project (ICU) exposes an API for accessing base offset and daylight saving amount separately, for historical reasons (the same thing is found in the JDK Calendar class). This is one of the reasons why we decided to customize zic and use the text source format as input. Personally, though, I think consumers of our code don't care about the daylight saving amount (mostly +1 hour, with a few exceptions). -Yoshito
yoshito_umaoka@us.ibm.com wrote:
If my understanding is correct, the output from zic does not preserve daylight saving amount.
Interesting, this is a better case than Stephen offered. The tzfile does indicate *which* clock settings are DST, so it's possible to determine what is the most recently used base offset for any time. But that's not necessarily the current base offset if there's a change of base offset while a zone remains on DST. There are some instances of that in the database, such as America/Argentina/San_Luis's change from -03:00 to -04:00 on 2008-01-21 while on DST. -zefram
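The heuristic Zefram describes, and the case where it breaks, can be sketched as follows. The record layout is a hypothetical simplification, not the real tzfile format, and the instants used in any example data would be placeholders rather than real transition times.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified transition record; not the tzfile layout. */
struct transition_sketch {
    long at;     /* instant at which this local time type takes effect */
    long utoff;  /* total offset from UT in seconds, including any DST */
    int  isdst;  /* nonzero if this local time type is flagged as DST */
};

/* Return the UT offset of the most recent non-DST entry at or before t,
   or fallback if there is none.  Entries must be sorted by `at`.
   This yields the most recently *used* base offset, which is not
   necessarily the base offset currently in force. */
static long
last_standard_offset(const struct transition_sketch *tr, size_t n,
                     long t, long fallback)
{
    long off = fallback;
    size_t i;

    assert(tr != NULL || n == 0);
    for (i = 0; i < n && tr[i].at <= t; i++)
        if (!tr[i].isdst)
            off = tr[i].utoff;
    return off;
}
```

For a sequence shaped like the San Luis case (standard time at -03:00, then DST at -02:00, then DST at -03:00 after the base moved to -04:00), this function still reports -03:00 during the final period, because no non-DST entry at -04:00 has yet appeared. That is exactly the information the compiled form does not carry.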
( Written earlier but not sent :( ) Stephen Colebourne wrote:
I don't consider a request for stability to be a hostile takeover, but I do consider some of the proposed commits to be far in excess of what is acceptable to me. For the record, TZ Coordinator is not a job I would enjoy.
I'd second that, but personally I feel that there is still a difference of opinion as to the target we are trying to reach. 'Simplifying' the data may be removing material of questionable origin, but it would be nice to see a little more substance to justify that. My own trawling possibly supports some of Paul's hatchet work, but I am still uncomfortable that historic records are being lost, and that there is still an underlying design to ignore pre-1970 data when doing so creates an area of the calendar where we can't rely on consistency between operating systems. If culling pre-1970 data is allowed, then we end up with a second-class view of the data in some environments. I now have a local copy of the git repo and can browse at leisure, and one thing that DVCS is very good at is providing a platform to play and experiment. I do agree with Stephen that the correct action, once the initial refactoring was objected to, should have been to roll back to a mutually acceptable version. At that point changes could have been cherry-picked and those that were agreed reapplied. Where data is then removed it could be individually committed and tagged. And to further tidy up the repo: moving the data to its own repo would keep data commits away from code ones. -- Lester Caine - G8HFL
On 09/04/13 13:08, Stephen Colebourne wrote:
commits to the repo form the basis of the permanent record of the group.
No, the github repository eggert/tz is my personal experimental version. The permanent record of this group (including all the diffs sent via email) is published by the IANA at <http://www.iana.org/time-zones>. The recent flurry of detailed comments about relatively minor points in the experimental version has tempted me to go back to how Arthur David Olson did it, which was to not publish experimental versions. Perhaps Arthur's way was better after all -- it was less work for everybody involved, anyway.
A much better practice would be to revert in full
I suspect that we're looking at a clash of maintenance styles here. I prefer not to create clones of earlier versions (if you want the earlier version, it's easy enough to find it), but other people do. No matter which style is chosen, people who prefer the other style will find it irritating. I'm not sure it's worth micro-managing this, though.
On 4 September 2013 22:42, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 09/04/13 13:08, Stephen Colebourne wrote:
commits to the repo form the basis of the permanent record of the group.
No, the github repository eggert/tz is my personal experimental version. The permanent record of this group (including all the diffs sent via email) is published by the IANA at <http://www.iana.org/time-zones>.
I don't see a git repo there (or any other kind of repo). If there isn't such a thing then there should be one. Patches on the mailing list are insufficient to keep track of the changes as recently shown.
The recent flurry of detailed comments about relatively minor points in the experimental version has tempted me to go back to how Arthur David Olson did it, which was to not publish experimental versions.
It's only by having such a git repo that I have been able to meaningfully criticise the changes and figure out what is going on. The problems stem from changes that were not accepted.
A much better practice would be to revert in full
I suspect that we're looking at a clash of maintenance styles here. I prefer not to create clones of earlier versions (if you want the earlier version, it's easy enough to find it), but other people do. No matter which style is chosen, people who prefer the other style will find it irritating.
As an open and public project, it should be possible and desirable to have external people review things. The incomplete reversions made life extremely difficult. What I was asking for was that you do a full git revert for the original commit, and then add a second commit with the more accepted changes. That way if the second "fixed" commit turns out to need to be reversed then that can be easily seen as well. Git is all about lots of small commits. Stephen
Stephen Colebourne wrote:
The permanent record of this group (including all the diffs sent via email) is published by the IANA at <http://www.iana.org/time-zones>.
I don't see a git repo there (or any other kind of repo).
The permanent record isn't published in the form of a public git repository. I long ago asked that IANA create one for that purpose, but it wasn't feasible given their infrastructure and (understandable) security concerns. If my experimental repository isn't set up the way you like, feel free to make a repository of your own and manage it the way you prefer. Git works well for that sort of thing.
The Shanks data often contain guesswork, and the abovementioned transition dates from LMT are almost certainly part of that guesswork.
-------
This is true. Gary Christian (CEO of Astrolabe) has admitted that time zone info represents "interpretations", in trying to argue for its copyrightability on the idea that time zone data aren't "facts" (apparently not understanding the legal definition). Also, the transitions from LMT to the onset of standard time are grossly underreported in the ACS Atlas. This data is nowhere near complete, but it would be nice to keep this info in the tzdb.
From what I understand, the original research by Thomas Shanks et al. involved going through local newspapers all over the globe looking for reports of time changes. Many times they would see a time for a baseball game stated and then a later game on another date with a different time format (or something similar), but no exact transition date and time for daylight savings onset, so they probably did an interpolation and preferred weekends for setting this change, but they guard this information as to exactly what criteria they used as a secret. Personally I think the data would have gained a lot more respect if they had cited their sources and found a way to highlight these uncertain periods both in book form and in software.
It would have been nice if the IATASSIM had kept all its data going back to its founding.
On Tue, Sep 3, 2013 at 10:47 AM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Brian Inglis wrote:
merging zones with different Standard Time start date loses useful historical data
That's not a problem for the proposed change, because in this particular case, no useful historical data are lost. There is no real evidence that the start dates actually differed. All we have is some guesswork from Shanks. The Shanks data often contain guesswork, and the abovementioned transition dates from LMT are almost certainly part of that guesswork.
The associated time zone abbreviations are also guesswork (in this case, my guesswork and not Shanks'), so they do not contain any useful historical data either.
Zoidsoft wrote:
so they probably did an interpolation and preferred weekends for setting this change
That's pretty much what I inferred from looking at the Shanks data. Partly it's because I use the same process when doing my own research. See, for example, the latest change to Pacific/Johnston. For the proposed changes to America/Anguilla etc. the most likely explanation is that Shanks et al. lacked transition info and guessed, and (worse) guessed different transition times in different locations because they made individual guesses and didn't check their work later. The tz data have long been maintained trying to fix stuff like this when I find it. In several places I've corrected Shanks data when I think it's bogus, even if I can't prove it's bogus. These are noted in the comments.
they guard this information as to exactly what criteria they used as a secret
That's where we differ from Shanks. The new Pacific/Johnston entry has a comment saying that it's guesswork, and giving the primary source on which the guesswork is based. It's another place where Shanks is incorrect and I'm supplying my own guesswork, but at least I'm trying to note my assumptions. Changes like this are a longstanding part of maintenance, and I'm becoming inclined to think that we shouldn't discontinue this practice purely from a desire to not change things.
Paul Eggert wrote:
The tz data have long been maintained trying to fix stuff like this when I find it. In several places I've corrected Shanks data when I think it's bogus,
Your recent discussion of Shanks's methodology and quality, and how to deal with it, has been most enlightening. I think it would be useful for you to put these notes into the distributed files, either as a section in Theory or as a new Sources file. -zefram
On 09/04/2013 10:16 AM, Zefram wrote:
Your recent discussion of Shanks's methodology and quality, and how to deal with it, has been most enlightening. I think it would be useful for you to put these notes into the distributed files, either as a section in Theory or as a new Sources file.
Good idea. I pushed this patch. This affects only commentary, so it should be safe.
From 9d3b5229caa1cef1a9000f9612fac5ce60304355 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sat, 14 Sep 2013 22:28:37 -0700
Subject: [PATCH] * Theory (Accuracy of the tz database): New section.
It contains material moved here from other sections, along with material taken from my recent emails to the tz mailing list. Suggested by Zefram in <http://mm.icann.org/pipermail/tz/2013-September/019863.html>. --- Theory | 160 ++++++++++++++++++++++++++++++++++++++++++++++++++++------------- 1 file changed, 130 insertions(+), 30 deletions(-) diff --git a/Theory b/Theory index 0c1ffdd..1c78a46 100644 --- a/Theory +++ b/Theory @@ -216,7 +216,10 @@ data, the world is partitioned into regions whose clocks all agree about time stamps that occur after the somewhat-arbitrary cutoff point of the POSIX Epoch (1970-01-01 00:00:00 UTC). For each such region, the database records all known clock transitions, and labels the region -with a notable location. +with a notable location. Although 1970 is a somewhat-arbitrary +cutoff, there are significant challenges to moving the cutoff earlier +even by a decade or two, due to the wide variety of local practices +before computer timekeeping became prevalent. Clock transitions before 1970 are recorded for each such location, because most POSIX-compatible systems support negative time stamps and @@ -224,39 +227,136 @@ could misbehave if data were omitted for pre-1970 transitions. However, the database is not designed for and does not suffice for applications requiring accurate handling of all past times everywhere, as it would take far too much effort and guesswork to record all -details of pre-1970 civil timekeeping. The pre-1970 data in this -database covers only a tiny sliver of how clocks actually behaved; -the vast majority of the necessary information was lost or never -recorded, and much of what little remains is fabricated. -Although 1970 is a somewhat-arbitrary cutoff, there are significant -challenges to moving the cutoff back even by a decade or two, due to -the wide variety of local practices before computer timekeeping -became prevalent. 
- -Local mean time (LMT) offsets are recorded in the database only -because the format requires an offset. They should not be considered -meaningful, and should not prompt creation of zones merely because two -locations differ in LMT. Historically, not only did different -locations in the same zone typically use different LMT offsets, often -different people in the same location maintained mean-time clocks that -differed significantly, many people used solar or some other time -instead of mean time, and standard time often replaced LMT only -gradually at each location. As for leap seconds, civil time was not -based on atomic time before 1972, and we don't know the history of -earth's rotation accurately enough to map SI seconds to historical -solar time to more than about one-hour accuracy. See: Morrison LV, -Stephenson FR. Historical values of the Earth's clock error Delta T -and the calculation of eclipses. J Hist Astron. 2004;35:327-36 -<http://adsabs.harvard.edu/full/2004JHA....35..327M>; Historical -values of the Earth's clock error. J Hist Astron. 2005;36:339 -<http://adsabs.harvard.edu/full/2005JHA....36..339M>. - -As noted in the README file, the tz database is not authoritative -(particularly not for pre-1970 time stamps), and it surely has errors. +details of pre-1970 civil timekeeping. + + +----- Accuracy of the tz database ----- + +The tz database is not authoritative, and it surely has errors. Corrections are welcome and encouraged. Users requiring authoritative data should consult national standards bodies and the references cited in the database's comments. +Errors in the tz database arise from many sources: + + * The tz database predicts future time stamps, and current predictions + will be incorrect after future governments change the rules. 
+ For example, if today someone schedules a meeting for 13:00 next + October 1, Casablanca time, and tomorrow Morocco changes its + daylight saving rules, software can mess up after the rule change + if it blithely relies on conversions made before the change. + + * The pre-1970 data in this database cover only a tiny sliver of how + clocks actually behaved; the vast majority of the necessary + information was lost or never recorded. Thousands more zones would + be needed if the tz database's scope were extended to cover even + just the known or guessed history of standard time; for example, + the current single entry for France would need to split into dozens + of entries, perhaps hundreds. + + * Most of the pre-1970 data comes from unreliable sources, often + astrology books that lack citations and whose compilers evidently + invented entries when the true facts were unknown, without + reporting which entries were known and which were invented. + These books often contradict each other or give implausible entries, + and on the rare occasions when their old data are checked they are + typically found to be incorrect. + + * For the UK the tz database relies on years of first-class work done by + Joseph Myers and others; see <http://www.polyomino.org.uk/british-time/>. + Other countries are not done nearly as well. + + * Sometimes, different people in the same city would maintain clocks + that differed significantly. Railway time was used by railroad + companies (which did not always agree with each other), + church-clock time was used for birth certificates, etc. + Often this was merely common practice, but sometimes it was set by law. + For example, from 1891 to 1911 the UT offset in France was legally + 0:09:21 outside train stations and 0:04:21 inside. + + * Although a named location in the tz database stands for the + containing region, its pre-1970 data entries are often accurate for + only a small subset of that region. 
For example, Europe/London + stands for the United Kingdom, but its pre-1847 times are valid + only for locations that have London's exact meridian, and its 1847 + transition to GMT is known to be valid only for the L&NW and the + Caledonian railways. + + * The tz database does not record the earliest time for which a + zone's data is thereafter valid for every location in the region. + For example, Europe/London is valid for all locations in the its + region after GMT was made the standard time, but the date of + standardization (1880-08-02) is not in the tz database, other than + in commentary. For many zones the earlist time of validity is + unknown. + + * The tz database does not record a region's boundaries, and in many + cases the boundaries are not known. For example, the zone + America/Kentucky/Louisville represents a region around the city of + Louisville, the boundaries of which are unclear. + + * Changes that are modeled as instantaneous transitions in the tz + database were often spread out over hours, days, or even decades. + + * Even if the time is specified by law, locations sometimes + deliberately flout the law. + + * Early timekeeping practices, even assuming perfect clocks, were + often not specified to the accuracy that the tz database requires. + + * Sometimes historical timekeeping was specified more precisely + than what the tz database can handle. For example, from 1909 to + 1937 Netherlands clocks were legally UT+00:19:32.13, but the tz + database cannot represent the fractional second. + + * Even when all the timestamp transitions recorded by the tz database + are correct, the tz rules that generate them may not faithfully + reflect the historical rules. For example, from 1922 until World + War II the UK moved clocks forward the day following the third + Saturday in April unless that was Easter, in which case it moved + clocks forward the previous Sunday. 
Because the tz database has no + way to specify Easter, these exceptional years are entered as + separate tz Rule lines, even though the legal rules did not change. + + * The tz database models pre-standard time using the Gregorian + calendar and local mean time (LMT), but many people used other + calendars and other timescales. For example, the Roman Empire used + the Julian calendar, and had 12 varying-length daytime hours with a + non-hour-based system at night. + + * Early clocks were less reliable, and the data do not represent this + unreliability. + + * As for leap seconds, civil time was not based on atomic time before + 1972, and we don't know the history of earth's rotation accurately + enough to map SI seconds to historical solar time to more than + about one-hour accuracy. See: Morrison LV, Stephenson FR. + Historical values of the Earth's clock error Delta T and the + calculation of eclipses. J Hist Astron. 2004;35:327-36 + <http://adsabs.harvard.edu/full/2004JHA....35..327M>; + Historical values of the Earth's clock error. J Hist Astron. 2005;36:339 + <http://adsabs.harvard.edu/full/2005JHA....36..339M>. + + * The relationship between POSIX time (that is, UTC but ignoring leap + seconds) and UTC is not agreed upon after 1972. Although the POSIX + clock officially stops during an inserted leap second, at least one + proposed standard has it jumping back a second instead; and in + practice POSIX clocks more typically either progress glacially during + a leap second, or are slightly slowed while near a leap second. + + * The tz database does not represent how uncertain its information is. + Ideally it would contain information about when the data are + incomplete or dicey. Partial temporal knowledge is a field of + active research, though, and it's not clear how to apply it here. + +In short, many, perhaps most, of the tz database's pre-1970 and future +time stamps are either wrong or misleading. 
Any attempt to pass the +tz database off as the definition of time should be unacceptable to +anybody who cares about the facts. In particular, the tz database's +LMT offsets should not be considered meaningful, and should not prompt +creation of zones merely because two locations differ in LMT or +transitioned to standard time at different dates. + ----- Names of time zone rule files ----- -- 1.8.3.1
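As a small illustration of why LMT offsets vary continuously from place to place, the sketch below derives a local mean time offset from longitude, using the fact that the mean sun moves 15 degrees per hour (240 seconds of time per degree). The function name is hypothetical, and the Paris longitude mentioned in the note after the code (about 2.3372 degrees east) is an assumption supplied for illustration, not a figure from the patch.

```c
#include <assert.h>

/* Local mean time offset east of Greenwich, rounded to whole seconds.
   One degree of longitude corresponds to 240 seconds of mean time. */
static long
lmt_offset_seconds(double longitude_deg_east)
{
    double secs;

    assert(-180.0 <= longitude_deg_east && longitude_deg_east <= 180.0);
    secs = longitude_deg_east * 240.0;
    /* Round half away from zero without needing <math.h>. */
    return (long) (secs >= 0 ? secs + 0.5 : secs - 0.5);
}
```

Under that assumption, a longitude of 2.3372 degrees east gives 561 seconds, i.e. 0:09:21, matching the legal offset for France quoted in the Theory text above. Any two towns a fraction of a degree apart get different values, which is why differing LMT alone is a poor reason to create a zone.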
On 15 September 2013 07:32, Paul Eggert <eggert@cs.ucla.edu> wrote:
+ * The tz database does not record the earliest time for which a
+   zone's data is thereafter valid for every location in the region.
+   For example, Europe/London is valid for all locations in the its
+   region after GMT was made the standard time, but the date of
+   standardization (1880-08-02) is not in the tz database, other than
+   in commentary.  For many zones the earlist time of validity is
+   unknown.
"all locations in the its region" is off, but I'm not sure what to change it to (simply "all locations in its region" without the "the"? "all locations in the appropriate region" or the like?). Also, in the last sentence, "earlist" should be "earliest". Cheers, Philip -- Philip Newton <philip.newton@gmail.com>
Thanks for reporting those bugs. I pushed this into the experimental version:
From a48e137422570ce2b52cb495f5c910a87718ff75 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sun, 15 Sep 2013 18:36:34 -0500
Subject: [PATCH] * Theory: Fix typos noted by Philip Newton
 in <http://mm.icann.org/pipermail/tz/2013-September/020197.html>.

---
 Theory | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Theory b/Theory
index 1c78a46..0b02ddc 100644
--- a/Theory
+++ b/Theory
@@ -284,10 +284,10 @@ Errors in the tz database arise from many sources:
  * The tz database does not record the earliest time for which a
    zone's data is thereafter valid for every location in the region.
-   For example, Europe/London is valid for all locations in the its
+   For example, Europe/London is valid for all locations in its
    region after GMT was made the standard time, but the date of
    standardization (1880-08-02) is not in the tz database, other than
-   in commentary.  For many zones the earlist time of validity is
+   in commentary.  For many zones the earliest time of validity is
    unknown.

  * The tz database does not record a region's boundaries, and in many
--
1.8.3.1
What a waste of a morning. Changing the motherboard only took 30 minutes, but I'm 3 hours into getting 'vista' running again ... unfortunately need that for the accounts software - and I need to get an end of year run :( Oh for the time when I can use Linux software and still keep the accountant happy :( Paul Eggert wrote:
Good idea. I pushed this patch. This affects only commentary, so it should be safe.
Nice summary of the state of play ... But I feel that the 'problem' of managing what good history is available is still being brushed under the carpet ... hoping it will go away? Filtering the data before building a post-1970 POSIX based 'clock' is fine, and that is what is currently being targeted.

One thing that comes to mind though is that currently I'm seeing 'summertime' transitions prior to 1970. Derick has confirmed that he is using the binary data to build the PHP libraries, so presumably these are currently included? What will the situation be post the 'post-1970 only' clean-up? Will these be dropped back to a simple time offset? It's a bit like the problem of identifying which version of tz data is being used so you know that it may be out of date, but there may be two versions of a new distribution?

As I have said, I don't use 'seconds' as my time base anyway, so processing leapseconds is just a matter of using 1/86401 on a leapsecond day rather than 1/86400 so I'm also not restricted when it comes to fractional seconds. So it would complete the picture if the rare rules like the Netherlands could be incorporated while not affecting the base POSIX rules? It is almost a case that 'extended mode' should support fractional seconds anyway?

Working mainly from database stored material, the concept of 'NULL' as an answer has a number of advantages, but is not practical when using numeric based results. I keep coming back to some sort of 'timezone' result which is in essence 'NULL' such as any LMT results can be taken as 'can be recalculated using local location data' while 'LMTZ' indicates a documented local standard. This removes the need to create any new timezones for what is essentially outside the scope of the 'default' TZ distribution?

Given that the theory file contains additional documentation on the content of the 'data' package, would it not make sense to include it with that package?
-- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On 09/15/2013 07:31 AM, Lester Caine wrote:
I'm seeing 'summertime' transitions prior to 1970. Derick has confirmed that he is using the binary data to build the PHP libraries, so presumably these are currently included?
Yes, every unique location has a time zone history back to the introduction of standard time.
What will the situation be post the 'post-1970 only' clean-up?
No change there. It's just that there would be fewer unique locations; more would be like Europe/Vatican, which is simply a link to Europe/Rome.
So it would complete the picture if the rare rules like the Netherlands could be incorporated while not affecting the base POSIX rules?
We could easily extend the zic input format, without affecting the output of zic, simply by having it drop fractional seconds.
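As a rough illustration of the idea above (accept fractional seconds on input, emit whole seconds on output), here is a minimal Python sketch. It is not zic's code: the function names parse_offset and to_whole_seconds are hypothetical, and the sample value 0:19:32.13 is only an assumed example of the kind of fractional offset the Netherlands case would need.

```python
from fractions import Fraction


def parse_offset(text: str) -> Fraction:
    """Parse '[-]H[:MM[:SS[.fff]]]' into an exact number of seconds.

    Hypothetical helper for illustration; not part of zic.
    """
    sign = -1 if text.startswith("-") else 1
    parts = text.lstrip("+-").split(":")
    parts += ["0"] * (3 - len(parts))  # pad missing minutes/seconds
    hours, minutes, seconds = parts
    total = int(hours) * 3600 + int(minutes) * 60 + Fraction(seconds)
    return sign * total


def to_whole_seconds(offset: Fraction) -> int:
    """Drop the fractional part, truncating toward zero."""
    return int(offset)


exact = parse_offset("0:19:32.13")   # assumed sample input
print(exact, to_whole_seconds(exact))
```

Keeping the exact value in the input while truncating only at output time means existing whole-second consumers would see no change, which is the property the reply above is after.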
It is almost a case that 'extended mode' should support fractional seconds anyway?
That hasn't been discussed, but something like that seems reasonable.
This removes the need to create any new timezones for what is essentially outside the scope of the 'default' TZ distribution?
I don't think anybody's proposing creating a new zone for every location on the planet, no. The problem is that we'd need thousands of zones merely to handle standard time.
Paul Eggert wrote:
This removes the need to create any new timezones for what is essentially outside the scope of the 'default' TZ distribution?
I don't think anybody's proposing creating a new zone for every location on the planet, no. The problem is that we'd need thousands of zones merely to handle standard time.
That is the bit I'm not currently getting? I can't see that there are that many more 'standard time' timezones. Yes there may be a lot of cross reference links that indicate when a location joined some sort of standard time, and I'm happy that is a separate list, but why would there be 'thousands' of new timezones? Perhaps it's just interpretation, but if there is evidence for thousands more why are we saying it's all been lost?
Lester Caine wrote:
why would there be 'thousands' of new timezones?
Because, if one defines a "region" to be "a set of geographical locations where civil-time clocks have agreed since standard time was introduced", there must be thousands of such regions, each unique as far as UT-offset history goes. Shanks identifies over 300 such regions for Indiana alone, and most likely he missed some. (Again, I'm ignoring UT offset before standard time was introduced, and I'm ignoring the date of transition to standard time.)
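The definition of "region" above amounts to grouping locations by identical UT-offset history. A minimal Python sketch of that grouping follows; the town names and histories are made up purely for illustration, and the (year, offset) tuple representation is an assumption, not the tz data format.

```python
from collections import defaultdict

# Hypothetical data: each history is a tuple of (year, UT offset in hours)
# transitions since standard time was introduced.
histories = {
    "Town A": ((1900, -6), (1950, -5)),
    "Town B": ((1900, -6), (1950, -5)),
    "Town C": ((1900, -6), (1960, -5)),
    "Town D": ((1900, -5),),
}


def regions(histories):
    """Group locations whose clocks have always agreed (same history)."""
    groups = defaultdict(list)
    for location, history in histories.items():
        groups[history].append(location)
    return sorted(sorted(group) for group in groups.values())


print(regions(histories))
```

Only Town A and Town B share a region here; a single differing transition date is enough to split Town C off, which is why the count of regions grows so quickly.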
The ACS Atlas would give a clue as to how many would be needed; approximately 2000 zones in the Olson format would be needed to match what is in the ACS atlas and it isn't really "complete". The mapping of time zones appears mathematically to be like a logarithmic spike with one end toward relative uniformity (around 400 zones on the low end after 1970) and the other end to where each clock has its own unique time zone history (aka LMT). It is a matter of where you want to draw the line. The farther in the past you go toward the date of onset of standard time or LMT the more chaotic it gets (to the point where it simply would be impossible to have certainty for many locations).

Due to the fact that most people reside in cities, ACS went about as far as reasonably possible given the limited manpower. It was no doubt an extremely tedious task, but I do wish that they had made their research methods and criteria for selecting daylight transitions public. As the data currently stand, my confidence in transition times for DST for dates in the first half of the 20th century is shaky at best. I only feel relatively confident if it is near the dead of winter or the height of summer for this period in history.
PS - I recently moved back to Pulaski, NY and I had a conversation with a real estate broker and he said that there were 42 Amish families now living in the Pulaski area (when I was born I knew of none). They don't appear to observe DST: their horse-drawn carriages (which are nearly as ubiquitous as cars around here), which go by every morning taking their kids to school by the cemetery, did not keep up with the shift in DST (according to my father). Given their lack of electricity I'm not even sure they have clocks. This kind of political division is another dimension that divides not by boundaries on a map, but by who is keeping the clocks.
On 3 September 2013 08:33, Paul Eggert <eggert@cs.ucla.edu> wrote:
This should allay concerns that the links would go away any time soon. Suggested by Stephen Colebourne in <http://mm.icann.org/pipermail/tz/2013-September/019801.html>. Change "`" to "'"; these days, "`" and "'" are not symmetric. * antarctica (Antarctica/McMurdo): * europe (Europe/Jersey, Europe/Guernsey, Europe/Isle_of_Man) (Europe/Mariehamn, Europe/Busingen, Europe/Vatican, Europe/San_Marino) (Arctic/Longyearbyen, Europe/Ljubljana, Europe/Podgorica) (Europe/Sarajevo, Europe/Skopje, Europe/Zagreb, Europe/Bratislava): * northamerica (America/St_Barthelemy, America/Marigot): * southamerica (America/Lower_Princes, America/Kralendijk): Move here from 'backward'. This reverts a 2013-08-09 change.
Thank you. I note that America/Shiprock and Antarctica/South_Pole were not reverted. In my view, stability indicates that they should be reverted. For example, as per http://norbertlindenberg.com/ecmascript/intl.html#sec-6.4.2, the normalized ID seen in ECMAscript for the south pole will change from South_Pole to Auckland. While this is not incorrect from a local time perspective, it doesn't seem right from a human perspective. I note that this Link is also cross-ISO boundary, which should also be a warning sign. At a minimum I would ask that the "backward" Link be from South_Pole to McMurdo. I don't know enough about Shiprock to comment. Stephen
Stephen Colebourne wrote:
I note that America/Shiprock and Antarctica/South_Pole were not reverted. In my view, stability indicates that they should be reverted.
Those names were put into zone.tab in error, as they didn't follow the rules given in Theory (not the current stable rules, nor the currently-proposed rules, nor even the older rules). We need to be able to correct errors in the database; stability doesn't trump fixing mistakes.
While this is not incorrect from a local time perspective, it doesn't seem right from a human perspective.
On the contrary, it's a win from the human perspective. First, it simplifies the choice of TZ, by removing irrelevant choices. Second, for this particular example, the folks in the South Pole are well aware that they keep New Zealand time, since that's where they get their supplies from, and so this change will match their human perspective.
I would ask that the "backward" Link be from South_Pole to McMurdo
Longstanding practice has been to avoid links-to-links; otherwise zic can misbehave.
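To make the links-to-links point concrete, here is a small Python sketch that rewrites a table of links so that every link points directly at a Zone. The function flatten_links and the dict representation are hypothetical; the sample names assume, as discussed in this thread, that Antarctica/McMurdo is itself a link to Pacific/Auckland.

```python
def flatten_links(links, zones):
    """Return a copy of links in which every link targets a Zone directly."""
    flat = {}
    for name in links:
        target = links[name]
        seen = {name}
        while target in links:          # target is itself a link: follow it
            if target in seen:
                raise ValueError(f"link cycle involving {target}")
            seen.add(target)
            target = links[target]
        if target not in zones:
            raise ValueError(f"{name} resolves to unknown zone {target}")
        flat[name] = target
    return flat


zones = {"Pacific/Auckland"}
links = {
    "Antarctica/McMurdo": "Pacific/Auckland",
    # The requested link-to-link; flattening makes it point at the Zone.
    "Antarctica/South_Pole": "Antarctica/McMurdo",
}
print(flatten_links(links, zones))
```

With the data flattened this way, the result does not depend on the order in which links are processed, which is one plausible reason to avoid chained links in the source files.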
participants (13)
- Brian Inglis
- Garrett Wollman
- Guy Harris
- Lester Caine
- Paul Eggert
- Philip Newton
- random832@fastmail.us
- Russ Allbery
- Stephen Colebourne
- Steve Allen
- yoshito_umaoka@us.ibm.com
- Zefram
- Zoidsoft