Getting close to new release: testing, testing, ...
We need to release a new version soon, because of Fiji.

I did further testing and it appears that the recent changes to add
meta-information to the zic output won't fly, because they're
incompatible with the GNU C library on Ubuntu 12.04.  So we need to
rethink that feature.  There's no rush to introduce it, so let's omit
it from the next release.

Also, the -t option recently added to tzselect seems like it will be
superseded by the tzwinnow approach, whenever we get that working and
stable, so for now I think I'd rather omit this.

There are a few other problems found by "make public", mostly links
that were in the wrong files.  I'll post proposed changes soon.

If other people could do some testing, that'd be nice.  You can run
"make public".  The output should contain the following diagnostics:

warning: "antarctica", line 318: Antarctica/Palmer: pre-2013 clients may mishandle distant timestamps
warning: "asia", line 1412: 24:00 not handled by pre-1998 versions of zic
warning: "asia", line 2332: 24:00 not handled by pre-1998 versions of zic
warning: "asia", line 1026: Asia/Jerusalem: pre-2013 clients may mishandle distant timestamps (rule from "asia", line 1021)
warning: "asia", line 1315: Asia/Amman: pre-1994 clients may mishandle distant timestamps
warning: "asia", line 2016: Asia/Gaza: pre-2013 clients may mishandle distant timestamps
warning: "asia", line 2347: Asia/Hebron: pre-2013 clients may mishandle distant timestamps
warning: "australasia", line 269: Pacific/Fiji: pre-2013 clients may mishandle distant timestamps
warning: "europe", line 998: America/Godthab: pre-2013 clients may mishandle distant timestamps
warning: "southamerica", line 1141: America/Santiago: pre-2013 clients may mishandle distant timestamps
warning: "southamerica", line 1313: Pacific/Easter: pre-2013 clients may mishandle distant timestamps

along with a very long section that starts as follows, which shows
that the tz code is still broken on platforms with unsigned 32-bit
time_t (this is a longstanding problem).

checking uint32_t zones ...
--- tzpublic/int64_t.out   2013-09-11 02:43:50.871343089 -0700
+++ tzpublic/uint32_t.out  2013-09-11 02:43:55.907343185 -0700
@@ -210,278 +210,6 @@
 Europe/Andorra  Sun Mar 29 01:00:00 2037 UT = Sun Mar 29 03:00:00 2037 CEST isdst=1
 Europe/Andorra  Sun Oct 25 00:59:59 2037 UT = Sun Oct 25 02:59:59 2037 CEST isdst=1
 Europe/Andorra  Sun Oct 25 01:00:00 2037 UT = Sun Oct 25 02:00:00 2037 CET isdst=0
-Europe/Andorra  Sun Mar 28 00:59:59 2038 UT = Sun Mar 28 01:59:59 2038 CET isdst=0
-Europe/Andorra  Sun Mar 28 01:00:00 2038 UT = Sun Mar 28 03:00:00 2038 CEST isdst=1
-Europe/Andorra  Sun Oct 31 00:59:59 2038 UT = Sun Oct 31 02:59:59 2038 CEST isdst=1
This was experimental, and it appears that the tzwinnow approach will
be better.  We need to cut a new stable release soon, and the -t
option might make it harder to integrate tzwinnow later, so let's
omit -t for now.

* .gitignore: Remove time.tab.
* Makefile (ZONETABTYPE): Remove.  All uses removed.
  (time.tab): Remove.  All uses removed.
* Theory: Omit discussion of time.tab.
* zone-time.awk: Remove.
* tzselect.8: Omit -t and time.tab.
* tzselect.ksh (ZONETABTYPE): Remove.  All uses removed.
  Remove -t ZONETABTYPE option.
* zone.tab: Restore first comment line, since there's no longer a need
  to distinguish this file from time.tab.
---
 .gitignore    |  1 -
 Makefile      | 22 ++++------------------
 Theory        | 10 +++-------
 tzselect.8    | 51 +--------------------------------------------------
 tzselect.ksh  | 13 +++----------
 zone-time.awk | 34 ----------------------------------
 zone.tab      |  2 +-
 7 files changed, 12 insertions(+), 121 deletions(-)
 delete mode 100644 zone-time.awk

diff --git a/.gitignore b/.gitignore
index 18dbbcc..2b93d4b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -6,7 +6,6 @@
 ChangeLog
 date
 leapseconds
-time.tab
 tzselect
 version.h
 yearistype
diff --git a/Makefile b/Makefile
index eb0ea59..9052eeb 100644
--- a/Makefile
+++ b/Makefile
@@ -40,15 +40,6 @@
 LOCALTIME=	GMT
 POSIXRULES=	America/New_York

-# Default time zone table type for 'tzselect'.  See tzselect.8 for details.
-# Possible values are:
-#  'time' - for a smaller time zone table
-#  'zone' - for a backward compatible time zone table; it contains
-#	alternative TZ values present for compatibility with older versions of
-#	this software.
-
-ZONETABTYPE=	zone
-
 # Also see TZDEFRULESTRING below, which takes effect only
 # if the time zone files cannot be accessed.
@@ -329,11 +320,11 @@
 YDATA=		$(PRIMARY_YDATA) pacificnew etcetera backward
 NDATA=		systemv factory
 SDATA=		solar87 solar88 solar89
 TDATA=		$(YDATA) $(NDATA) $(SDATA)
-TABDATA=	iso3166.tab time.tab zone.tab
+TABDATA=	iso3166.tab zone.tab
 DATA=		$(YDATA) $(NDATA) $(SDATA) $(TABDATA) \
		leap-seconds.list yearistype.sh
 WEB_PAGES=	tz-art.htm tz-link.htm
-AWK_SCRIPTS=	checktab.awk leapseconds.awk zone-time.awk
+AWK_SCRIPTS=	checktab.awk leapseconds.awk
 MISC=		usno1988 usno1989 usno1989a usno1995 usno1997 usno1998 \
		$(WEB_PAGES) $(AWK_SCRIPTS) workman.sh \
		zoneinfo2tdf.pl
@@ -352,8 +343,8 @@ install: all $(DATA) $(REDO) $(DESTDIR)$(TZLIB) $(MANS)
	$(ZIC) -y $(YEARISTYPE) \
		-d $(DESTDIR)$(TZDIR) -l $(LOCALTIME) -p $(POSIXRULES)
	-rm -f $(DESTDIR)$(TZDIR)/iso3166.tab \
-		$(DESTDIR)$(TZDIR)/time.tab $(DESTDIR)$(TZDIR)/zone.tab
-	cp iso3166.tab time.tab zone.tab $(DESTDIR)$(TZDIR)/.
+		$(DESTDIR)$(TZDIR)/zone.tab
+	cp iso3166.tab zone.tab $(DESTDIR)$(TZDIR)/.
	-mkdir $(DESTDIR)$(TOPDIR) $(DESTDIR)$(ETCDIR)
	cp tzselect zic zdump $(DESTDIR)$(ETCDIR)/.
	-mkdir $(DESTDIR)$(TOPDIR) $(DESTDIR)$(MANDIR) \
@@ -430,9 +421,6 @@ posix_right: posix_only leapseconds

 zones:		$(REDO)

-time.tab:	$(YDATA) zone.tab zone-time.awk
-	$(AWK) -f zone-time.awk $(YDATA) >$@
-
 $(DESTDIR)$(TZLIB): $(LIBOBJS)
	-mkdir -p $(DESTDIR)$(TOPDIR) $(DESTDIR)$(LIBDIR)
	ar ru $@ $(LIBOBJS)
@@ -450,7 +438,6 @@ tzselect: tzselect.ksh
		-e 's|\(REPORT_BUGS_TO\)=.*|\1=$(BUGEMAIL)|' \
		-e 's|TZDIR=[^}]*|TZDIR=$(TZDIR)|' \
		-e 's|\(TZVERSION\)=.*|\1=$(VERSION)|' \
-		-e 's|^\(ZONETABTYPE\)=.*|\1=$(ZONETABTYPE)|' \
		<$? >$@
	chmod +x $@
@@ -467,7 +454,6 @@ check_web: $(WEB_PAGES)
 clean_misc:
	rm -f core *.o *.out \
-		time.tab \
		date leapseconds tzselect version.h zdump zic yearistype
 clean: clean_misc
	rm -f -r tzpublic
diff --git a/Theory b/Theory
index 13b5565..0c1ffdd 100644
--- a/Theory
+++ b/Theory
@@ -357,13 +357,9 @@ in decreasing order of importance:

 The file 'zone.tab' lists geographical locations used
 to name time zone rule files.  It is intended to be an
 exhaustive list of names for geographic regions as described
 above; this is a subset of the
-Zone entries in the data.  The file 'time.tab' is a simplified
-version of 'zone.tab', the intent being that entries are coalesced
-if their time stamps agree after 1970, which means the entries are
-distinct in 'zone.tab' only because of the abovementioned political
-constraints.  Although a 'zone.tab' location's longitude corresponds
-to its LMT offset with one hour for every 15 degrees east longitude,
-this relationship is not exact and is not true for 'time.tab'.
+names in the data.  Although a 'zone.tab' location's longitude
+corresponds to its LMT offset with one hour for every 15 degrees east
+longitude, this relationship is not exact.

 Older versions of this package used a different naming scheme,
 and these older names are still supported.
diff --git a/tzselect.8 b/tzselect.8
index 39436ae..1dd721a 100644
--- a/tzselect.8
+++ b/tzselect.8
@@ -10,9 +10,6 @@ tzselect \- select a time zone
 .B \-n
 .I limit
 ] [
-.B \-t
-.I zonetabtype
-] [
 .B \-\-help
 ] [
 .B \-\-version
@@ -73,52 +70,8 @@ When
 is used, display the closest
 .I limit
 locations (default 10).
-.TP
-.BI "\-t " zonetabtype
-Make selections from the time zone table of type
-.I zonetabtype.
-Possible
-.I zonetabtype
-values include:
-.RS
-.TP
-.B time
-A time zone table with a smaller set of zone names.
-.TP
-.B zone
-A time zone table that also contains alternative zone names, for
-backward compatibility with older versions of this software.  The
-alternative names are not needed for proper operation of time stamps;
-they are present only to avoid surprises with people who are
-accustomed to the old names.  These alternative names arose from
-political issues that are outside the scope of
-.BR tzselect .
 .PP
-For example, both tables have entries for countries like
-Bosnia, Croatia, and Serbia, which are in a zone where the clocks
-have all agreed since 1970.  Although the
-.B time
-table lists "Europe/Belgrade" for this zone wherever it occurs, the
-.B zone
-table instead lists the names "Europe/Sarajevo", "Europe/Zagreb",
-etc. under Bosnia, Croatia, etc.  This means that the
-.B "\-t\ time"
-option causes
-.B tzselect
-to generate "Europe/Belgrade" for this zone, whereas
-.B "\-t\ zone"
-causes it to generate different names depending on the country,
-names that are equivalent in effect to "Europe/Belgrade".
-.PP
-The default
-.I zonetabtype
-is system-dependent, so applications that care about the set of
-names that
-.B tzselect
-generates should use the
-.B "\-t"
-option.  Regardless of what options are used, applications should not
-assume that
+Applications should not assume that
 .BR tzselect 's
 output matches the user's political preferences.
 .RE
@@ -144,8 +97,6 @@ Name of the directory containing time zone data files (default:
 \f2TZDIR\fP\f3/iso3166.tab\fP
 Table of ISO 3166 2-letter country codes and country names.
 .TP
-\f2TZDIR\fP\f3/time.tab\fP
-.TP
 \f2TZDIR\fP\f3/zone.tab\fP
 Tables of country codes, latitude and longitude, zone names, and
 descriptive comments.
diff --git a/tzselect.ksh b/tzselect.ksh
index 3e7788e..1934dd0 100644
--- a/tzselect.ksh
+++ b/tzselect.ksh
@@ -3,7 +3,6 @@
 PKGVERSION='(tzcode) '
 TZVERSION=see_Makefile
 REPORT_BUGS_TO=tz@iana.org
-ZONETABTYPE=zone

 # Ask the user about the time zone, and output the resulting TZ value to stdout.
 # Interact with the user via stderr and stdin.
@@ -44,7 +43,7 @@
 coord=
 location_limit=10

-usage="Usage: tzselect [--version] [--help] [-c COORD] [-n LIMIT] [-t ZONETABTYPE]
+usage="Usage: tzselect [--version] [--help] [-c COORD] [-n LIMIT]
 Select a time zone interactively.

 Options:
@@ -60,10 +59,6 @@ Options:
   -n LIMIT
     Display at most LIMIT locations when -c is used (default $location_limit).

-  -t ZONETABTYPE
-    Use time zone table ZONETABTYPE.  ZONETABTYPE should be one of
-    'time' or 'zone'.
-
   --version
     Output version information.
@@ -72,15 +67,13 @@ Options:

 Report bugs to $REPORT_BUGS_TO."

-while getopts c:n:t:-: opt
+while getopts c:n:-: opt
 do
     case $opt$OPTARG in
     c*)
	coord=$OPTARG ;;
     n*)
	location_limit=$OPTARG ;;
-    t*)
-	ZONETABTYPE=$OPTARG ;;
     -help)
	exec echo "$usage" ;;
     -version)
@@ -100,7 +93,7 @@
 # Make sure the tables are readable.
 TZ_COUNTRY_TABLE=$TZDIR/iso3166.tab
-TZ_ZONE_TABLE=$TZDIR/$ZONETABTYPE.tab
+TZ_ZONE_TABLE=$TZDIR/zone.tab
 for f in $TZ_COUNTRY_TABLE $TZ_ZONE_TABLE
 do
	<$f || {
diff --git a/zone-time.awk b/zone-time.awk
deleted file mode 100644
index 5210c1f..0000000
--- a/zone-time.awk
+++ /dev/null
@@ -1,34 +0,0 @@
-# Generate 'time.tab' from 'zone.tab'.  Standard input should be the zic input.
-
-# This file is in the public domain.
-
-# Contributed by Paul Eggert.
-
-$1 == "Link" { link[$3] = $2 }
-
-END {
-	FS = "\t"
-	while (getline < "zone.tab") {
-		line = $0
-		if (line ~ /^# TZ zone descriptions/)
-			line = "# TZ zone descriptions, with a smaller set of zone names"
-		if (line ~ /^# 4. Comments;/) {
-			print "# Zones can cross country-code boundaries, so the"
-			print "# location named by column 3 need not lie in the"
-			print "# locations identified by columns 1 or 2."
-		}
-		if (line ~ /^[^#]/) {
-			code = $1
-			target = $3
-			while (link[target])
-				target = link[target]
-			if (already_seen[code, target])
-				continue
-			already_seen[code, target] = 1
-			line = code "\t" $2 "\t" target
-			if ($4)
-				line = line "\t" $4
-		}
-		print line
-	}
-}
diff --git a/zone.tab b/zone.tab
index 7d4c575..fa4df5f 100644
--- a/zone.tab
+++ b/zone.tab
@@ -1,4 +1,4 @@
-# TZ zone descriptions, with alternative zone names for backward compatibility
+# TZ zone descriptions
 #
 # This file is in the public domain, so clarified as of
 # 2009-05-17 by Arthur David Olson.
--
1.8.1.2
Reported by Alois Treindl in
<http://mm.icann.org/pipermail/tz/2011-August/008722.html>.
---
 zic.8 | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/zic.8 b/zic.8
index b1d3348..61113b9 100644
--- a/zic.8
+++ b/zic.8
@@ -470,10 +470,8 @@ input, intended to illustrate many of its features.
 .ta \w'# Rule\0\0'u +\w'NAME\0\0'u +\w'FROM\0\0'u +\w'1973\0\0'u +\w'TYPE\0\0'u +\w'Apr\0\0'u +\w'lastSun\0\0'u +\w'2:00\0\0'u +\w'SAVE\0\0'u
 .sp
 # Rule	NAME	FROM	TO	TYPE	IN	ON	AT	SAVE	LETTER/S
-Rule	Swiss	1940	only	-	Nov	2	0:00	1:00	S
-Rule	Swiss	1940	only	-	Dec	31	0:00	0	-
-Rule	Swiss	1941	1942	-	May	Sun>=1	2:00	1:00	S
-Rule	Swiss	1941	1942	-	Oct	Sun>=1	0:00	0	-
+Rule	Swiss	1941	1942	-	May	Mon>=1	1:00	1:00	S
+Rule	Swiss	1941	1942	-	Oct	Mon>=1	2:00	0	-
 .sp .5
 Rule	EU	1977	1980	-	Apr	Sun>=1	1:00u	1:00	S
 Rule	EU	1977	only	-	Sep	lastSun	1:00u	0	-
@@ -484,7 +482,7 @@ Rule	EU	1996	max	-	Oct	lastSun	1:00u	0	-
 .sp
 .ta \w'# Zone\0\0'u +\w'Europe/Zurich\0\0'u +\w'0:34:08\0\0'u +\w'RULES/SAVE\0\0'u +\w'FORMAT\0\0'u
 # Zone	NAME	GMTOFF	RULES	FORMAT	UNTIL
-Zone	Europe/Zurich	0:34:08	-	LMT	1848 Sep 12
+Zone	Europe/Zurich	0:34:08	-	LMT	1855
	0:29:44	-	BMT	1894 Jun
	1:00	Swiss	CE%sT	1981
	1:00	EU	CE%sT
@@ -495,16 +493,14 @@ Link	Europe/Zurich	Switzerland
 .fi
 In this example, the zone is named Europe/Zurich but it has an alias as
 Switzerland.  Zurich was 34 minutes and 8 seconds west of GMT until
-1848-09-12 at 00:00, when the offset changed to 29 minutes and 44
+1855-01-01 at 00:00, when the offset changed to 29 minutes and 44
 seconds.  After 1894-06-01 at 00:00 Swiss daylight saving rules
 (defined with lines beginning with "Rule Swiss") apply, and the GMT
 offset became one hour.  From 1981 to the present, EU daylight saving
 rules have applied, and the UTC offset has remained at one hour.
 .PP
-In 1940, daylight saving time applied from November 2 at 00:00 to
-December 31 at 00:00.  In 1941 and 1942, daylight saving time applied
-from the first Sunday in May at 02:00 to the first Sunday in October
-at 00:00.
+In 1941 and 1942, daylight saving time applied from the first Monday
+in May at 01:00 to the first Monday in October at 02:00.
 The pre-1981 EU daylight-saving rules have no effect
 here, but are included for completeness.  Since 1981, daylight
 saving has begun on the last Sunday in March at 01:00 UTC.
--
1.8.1.2
Mostly this moves links so that files can be zic'ed standalone.

* antarctica (Antarctica/McMurdo): Move to australasia.
* australasia (Pacific/Johnston): Move to northamerica.
* checktab.awk: Add special case for America/Montreal, pending the
  tzwinnow approach.
* northamerica (America/Anguilla, America/Dominica, America/Grenada)
  (America/Guadeloupe, America/St_Barthelemy, America/Marigot)
  (America/Montserrat, America/St_Kitts, America/St_Lucia)
  (America/St_Vincent, America/Tortola, America/St_Thomas):
  Move to southamerica.
* southamerica: Receive above-described moves.
* zic.c (writezone): Remove unused local.
---
 antarctica   |  4 ++--
 australasia  |  3 ++-
 checktab.awk |  3 +++
 northamerica | 23 +++++++++--------------
 southamerica | 13 +++++++++++++
 zic.c        |  1 -
 6 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/antarctica b/antarctica
index 234e59c..5333b7b 100644
--- a/antarctica
+++ b/antarctica
@@ -360,5 +360,5 @@ Zone Antarctica/Palmer	0	-	zzz	1965
 # makes all of the clocks run fast.  So every couple of days,
 # we have to go around and set them back 5 minutes or so.
 # Maybe if we let them run fast all of the time, we'd get to leave here sooner!!
-
-Link Pacific/Auckland Antarctica/McMurdo
+#
+# See 'australasia' for Antarctica/McMurdo.
diff --git a/australasia b/australasia
index c822b08..8685d00 100644
--- a/australasia
+++ b/australasia
@@ -496,6 +496,7 @@ Zone Pacific/Auckland	11:39:04 -	LMT	1868 Nov 2
 Zone Pacific/Chatham	12:13:48 -	LMT	1957 Jan 1
			12:45	Chatham	CHA%sT

+Link Pacific/Auckland Antarctica/McMurdo

 # Auckland Is
 # uninhabited; Maori and Moriori, colonial settlers, pastoralists, sealers,
@@ -768,7 +769,7 @@ Zone Pacific/Funafuti	11:56:52 -	LMT	1901
 # We have no better information, so for now, assume this has been true
 # indefinitely into the past.
 #
-Link Pacific/Honolulu Pacific/Johnston
+# See 'northamerica' for Pacific/Johnston.

 # Kingman
 # uninhabited
diff --git a/checktab.awk b/checktab.awk
index 5cdce56..fec4f62 100644
--- a/checktab.awk
+++ b/checktab.awk
@@ -9,6 +9,9 @@ BEGIN {
	if (!zone_table) zone_table = "zone.tab"
	if (!want_warnings) want_warnings = -1

+	# A special (and we hope temporary) case.
+	tztab["America/Montreal"] = 1
+
	while (getline <iso_table) {
		iso_NR++
		if ($0 ~ /^#/) continue
diff --git a/northamerica b/northamerica
index 55755dd..c3921d3 100644
--- a/northamerica
+++ b/northamerica
@@ -600,6 +600,8 @@ Zone Pacific/Honolulu	-10:31:26 -	LMT	1896 Jan 13 12:00 #Schmitt&Cox
	-10:30	-	HST	1947 Jun 8 2:00	#Schmitt&Cox+2
	-10:00	-	HST

+Link Pacific/Honolulu Pacific/Johnston
+
 # Now we turn to US areas that have diverged from the consensus since 1970.

 # Arizona mostly uses MST.
@@ -2547,7 +2549,7 @@ Zone America/Santa_Isabel	-7:39:28 -	LMT	1922 Jan 1 0:20:32
 ###############################################################################

 # Anguilla
-Link America/Port_of_Spain America/Anguilla
+# See 'southamerica'.

 # Antigua and Barbuda
 # Zone	NAME	GMTOFF	RULES	FORMAT	[UNTIL]
@@ -2869,7 +2871,7 @@ Zone	America/Havana	-5:29:28 -	LMT	1890
	-5:00	Cuba	C%sT

 # Dominica
-Link America/Port_of_Spain America/Dominica
+# See 'southamerica'.

 # Dominican Republic
@@ -2918,13 +2920,10 @@ Zone America/El_Salvador -5:56:48 -	LMT	1921 # San Salvador
	-6:00	Salv	C%sT

 # Grenada
-Link America/Port_of_Spain America/Grenada
 # Guadeloupe
-Link America/Port_of_Spain America/Guadeloupe
 # St Barthelemy
-Link America/Port_of_Spain America/St_Barthelemy
 # St Martin (French part)
-Link America/Port_of_Spain America/Marigot
+# See 'southamerica'.

 # Guatemala
 #
@@ -3086,7 +3085,7 @@ Zone America/Martinique	-4:04:20 -	LMT	1890 # Fort-de-France
	-4:00	-	AST

 # Montserrat
-Link America/Port_of_Spain America/Montserrat
+# See 'southamerica'.

 # Nicaragua
 #
@@ -3168,10 +3167,8 @@ Zone America/Puerto_Rico -4:24:25 -	LMT	1899 Mar 28 12:00 # San Juan
	-4:00	-	AST

 # St Kitts-Nevis
-Link America/Port_of_Spain America/St_Kitts
-
 # St Lucia
-Link America/Port_of_Spain America/St_Lucia
+# See 'southamerica'.

 # St Pierre and Miquelon
 # There are too many St Pierres elsewhere, so we'll use 'Miquelon'.
@@ -3182,7 +3179,7 @@ Zone America/Miquelon	-3:44:40 -	LMT	1911 May 15 # St Pierre
	-3:00	Canada	PM%sT

 # St Vincent and the Grenadines
-Link America/Port_of_Spain America/St_Vincent
+# See 'southamerica'.

 # Turks and Caicos
 #
@@ -3216,7 +3213,5 @@ Zone America/Grand_Turk	-4:44:32 -	LMT	1890
	-5:00	TC	E%sT

 # British Virgin Is
-Link America/Port_of_Spain America/Tortola
-
 # Virgin Is
-Link America/Port_of_Spain America/St_Thomas
+# See 'southamerica'.
diff --git a/southamerica b/southamerica
index aab2e1a..464c548 100644
--- a/southamerica
+++ b/southamerica
@@ -1642,6 +1642,19 @@ Zone America/Paramaribo	-3:40:40 -	LMT	1911
 Zone America/Port_of_Spain -4:06:04 -	LMT	1912 Mar 2
	-4:00	-	AST

+Link America/Port_of_Spain America/Anguilla
+Link America/Port_of_Spain America/Dominica
+Link America/Port_of_Spain America/Grenada
+Link America/Port_of_Spain America/Guadeloupe
+Link America/Port_of_Spain America/Marigot
+Link America/Port_of_Spain America/Montserrat
+Link America/Port_of_Spain America/St_Barthelemy
+Link America/Port_of_Spain America/St_Kitts
+Link America/Port_of_Spain America/St_Lucia
+Link America/Port_of_Spain America/St_Thomas
+Link America/Port_of_Spain America/St_Vincent
+Link America/Port_of_Spain America/Tortola
+
 # Uruguay
 # From Paul Eggert (1993-11-18):
 # Uruguay wins the prize for the strangest peacetime manipulation of the rules.
diff --git a/zic.c b/zic.c
index e59a15f..60bcdfa 100644
--- a/zic.c
+++ b/zic.c
@@ -1760,7 +1760,6 @@ writezone(const char *const name, const char *const string)
		fprintf(fp, "name=%s%c", name, 0);
	for (i = 0; i < genoptions; i++) {
		register char const *v = genoption[i];
-		register int namelen = strchr(v, '=') - v;
		fprintf(fp, "%s%c", v, 0);
	}
	fprintf(fp, "%c\n%s\n", 0, string);
--
1.8.1.2
Further testing found that it was incompatible with Ubuntu 12.04 glibc
so this feature requires redesign and more testing.

* Makefile (ZFLAGS): Remove comment about name and version info.
  Make it an empty var instead.
* tzfile.5, tzfile.h: Remove description of meta-information.
* zic.8: Remove options -n and -o.
* zic.c: Don't include <stddef.h>.
  (genoption, genoptions, genname, addgenoption, writevalue): Remove.
  (usage, main, writezone): Remove support for -n and -o.
---
 Makefile |  3 +--
 tzfile.5 | 29 ++++------------------------
 tzfile.h |  6 ------
 zic.8    | 14 --------------
 zic.c    | 59 ++---------------------------------------------------------
 5 files changed, 7 insertions(+), 104 deletions(-)

diff --git a/Makefile b/Makefile
index 9052eeb..e493f2e 100644
--- a/Makefile
+++ b/Makefile
@@ -241,8 +241,7 @@
 LDFLAGS=	$(LFLAGS)
 zic=		./zic
 ZIC=		$(zic) $(ZFLAGS)
-# Uncomment this to put name and version info into zic output files.
-#ZFLAGS= -n -o version='$(VERSION)'
+ZFLAGS=

 # The name of a Posix-compliant `awk' on your system.
 AWK=		awk
diff --git a/tzfile.5 b/tzfile.5
index edfa475..4089910 100644
--- a/tzfile.5
+++ b/tzfile.5
@@ -134,43 +134,22 @@
 For version-2-format time zone files, the above header and data are
 followed by a second header and data, identical in format except that
 eight bytes are used for each transition time or leap second time.
-After the second header and data,
-and just before the end of the file, comes a newline-enclosed,
+After the second header and data comes a newline-enclosed,
 POSIX-TZ-environment-variable-style string for use in handling
 instants after the last transition time stored in the file
 (with nothing between the newlines if there is no POSIX representation
 for such instants).
 .PP
-Version-3-format time zone files have the following additions:
-.IP
-The POSIX-TZ-style string may use two minor extensions to the
-POSIX TZ format, as described in
+For version-3-format time zone files, the POSIX-TZ-style string may
+use two minor extensions to the POSIX TZ format, as described in
 .IR newtzset (3).
 First, the hours part of its transition times may be signed and range
 from \(mi167 through 167 instead of the POSIX-required unsigned values
 from 0 through 24.  Second, DST is in effect all year if it starts
 January 1 at 00:00 and ends December 31 at 24:00 plus the difference
 between daylight saving and standard time.
-.IP
-The newline-enclosed POSIX-TZ-style string is preceded by a section
-containing auxiliary meta-information that is not needed to process
-time stamps.  This section consists of another copy of the
-newline-enclosed POSIX-TZ-style string (this is for the benefit of
-version-2-only clients), followed by a four-byte integer size value,
-followed by zero or more NUL-terminated byte strings, followed by an
-additional NUL.  The size value is the total number of bytes in all
-the byte strings, including the trailing NULs at the end of the
-strings, but not including the additional NUL.  Each byte string
-consists of a name-value pair separated by "=".  Names consist of
-ASCII letters, digits and underscores, and start with a letter;
-duplicate names are not allowed.  Two common names are "name", the
-Zone name for the data, and "version", the data's version number.
-Values can contain any bytes except NUL.
 .PP
-Future additions to the format may insert more data just before the
-newline-enclosed POSIX-TZ-style string at the end of the file, so
-clients should not assume that this string immediately follows
-the auxiliary meta-information.
+Future changes to the format may append more data.
 .SH SEE ALSO
 newctime(3), newtzset(3)
 .\" This file is in the public domain, so clarified as of
diff --git a/tzfile.h b/tzfile.h
index 233563c..a2955dd 100644
--- a/tzfile.h
+++ b/tzfile.h
@@ -89,12 +89,6 @@ struct tzhead {
 **	Second, its DST start time may be January 1 at 00:00 and its stop
 **	time December 31 at 24:00 plus the difference between DST and
 **	standard time, indicating DST all year.
-**	Third, the newline-enclosed TZ string is preceded by a new section
-**	consisting of another copy of the string, followed by a four-byte
-**	integer size value, followed by zero or more NUL-terminated
-**	name=value byte strings, followed by an additional NUL.  The size
-**	value gives the total size of the name=value byte strings,
-**	including their terminating NUL bytes, but excluding the additional NUL.
 */

 /*
diff --git a/zic.8 b/zic.8
index 61113b9..b61ebd3 100644
--- a/zic.8
+++ b/zic.8
@@ -15,11 +15,6 @@ zic \- time zone compiler
 .B \-l
 .I localtime
 ] [
-.B \-n
-] [
-.B \-o
-.IB name = value
-] [
 .B \-p
 .I posixrules
 ] [
@@ -67,15 +62,6 @@
 will act as if the input contained a link line of the form
 .ti +.5i
 Link	\fItimezone\fP		localtime
 .TP
-.B "\-n"
-Store each zone's name into its generated file, as meta-information
-with the name "name" and value the zone's name.
-.TP
-.BI "\-o " name = value
-Store the given name-value pair into the generated file, as
-meta-information.  This option can be repeated, once for each distinct
-name.
-.TP
 .BI "\-p " timezone
 Use the given time zone's rules when handling POSIX-format
 time zone environment variables.
diff --git a/zic.c b/zic.c
index 60bcdfa..9939195 100644
--- a/zic.c
+++ b/zic.c
@@ -9,7 +9,6 @@
 #include "tzfile.h"

 #include <stdarg.h>
-#include <stddef.h>

 #define ZIC_VERSION	'3'
@@ -141,9 +140,6 @@ static int	yearistype(int year, const char * type);
 static int		charcnt;
 static int		errors;
 static const char *	filename;
-static const char **	genoption;
-static int		genoptions;
-static int		genname;
 static int		leapcnt;
 static int		leapseen;
 static zic_t		leapminyear;
@@ -436,8 +432,7 @@ static _Noreturn void
 usage(FILE *stream, int status)
 {
	(void) fprintf(stream, _("%s: usage is %s \
-[ --version ] [ --help ] [ -v ] [ -l localtime ]\\\n\
-\t[ -n ] [ -o name=value ]... [ -p posixrules ] \\\n\
+[ --version ] [ --help ] [ -v ] [ -l localtime ] [ -p posixrules ] \\\n\
\t[ -d directory ] [ -L leapseconds ] [ -y yearistype ] [ filename ... ]\n\
\n\
Report bugs to %s.\n"),
@@ -451,29 +446,6 @@ static const char *	directory;
 static const char *	leapsec;
 static const char *	yitcommand;

-static int
-addgenoption(char const *option)
-{
-	register char const *o = option;
-	register ptrdiff_t namelen;
-	register int i;
-	if (! (isascii (*o) && isalpha(*o)))
-		return 0;
-	while (*++o != '=')
-		if (! (isascii (*o) && (isalnum(*o) || *o == '_')))
-			return 0;
-	namelen = o - option;
-	if (namelen == sizeof "name" - 1
-	    && memcmp(option, "name", namelen) == 0)
-		return 0;
-	for (i = 0; i < genoptions; i++)
-		if (strncmp(genoption[i], option, namelen + 1) == 0)
-			return 0;
-	genoption = erealloc(genoption, (genoptions + 1) * sizeof *genoption);
-	genoption[genoptions++] = option;
-	return 1;
-}
-
 int
 main(int argc, char **argv)
 {
@@ -504,7 +476,7 @@ main(int argc, char **argv)
		} else if (strcmp(argv[i], "--help") == 0) {
			usage(stdout, EXIT_SUCCESS);
		}
-	while ((c = getopt(argc, argv, "d:l:p:L:no:vsy:")) != EOF && c != -1)
+	while ((c = getopt(argc, argv, "d:l:p:L:vsy:")) != EOF && c != -1)
		switch (c) {
			default:
				usage(stderr, EXIT_FAILURE);
@@ -528,17 +500,6 @@ _("%s: More than one -l option specified\n"),
					exit(EXIT_FAILURE);
				}
				break;
-			case 'n':
-				genname = TRUE;
-				break;
-			case 'o':
-				if (!addgenoption(optarg)) {
-					fprintf(stderr,
-						_("%s: %s: invalid -o option\n"),
-						progname, optarg);
-					exit(EXIT_FAILURE);
-				}
-				break;
			case 'p':
				if (psxrules == NULL)
					psxrules = optarg;
@@ -1432,7 +1393,6 @@ writezone(const char *const name, const char *const string)
	register int		leapcnt32, leapi32;
	register int		timecnt32, timei32;
	register int		pass;
-	register int_fast32_t	genlen;
	static char *		fullname;
	static const struct tzhead	tzh0;
	static struct tzhead		tzh;
@@ -1748,21 +1708,6 @@ writezone(const char *const name, const char *const string)
		(void) putc(ttisgmts[i], fp);
	}
	(void) fprintf(fp, "\n%s\n", string);
-
-	genlen = 0;
-	if (genname)
-		genlen += sizeof "name=" + strlen (name);
-	for (i = 0; i < genoptions; i++)
-		genlen += strlen (genoption[i]) + 1;
-	puttzcode(genlen, fp);
-
-	if (genname)
-		fprintf(fp, "name=%s%c", name, 0);
-	for (i = 0; i < genoptions; i++) {
-		register char const *v = genoption[i];
-		fprintf(fp, "%s%c", v, 0);
-	}
-	fprintf(fp, "%c\n%s\n", 0, string);
	if (ferror(fp) || fclose(fp)) {
		(void) fprintf(stderr, _("%s: Error writing %s\n"),
			       progname, fullname);
--
1.8.1.2
On Wed, Sep 11, 2013, at 5:55, Paul Eggert wrote:
> We need to release a new version soon, because of Fiji.
>
> I did further testing and it appears that the recent changes to add
> meta-information to the zic output won't fly, because they're
> incompatible with the GNU C library on Ubuntu 12.04.  So we need to
> rethink that feature.  There's no rush to introduce it, so let's omit
> it from the next release.
We're getting stuck on ways to compatibly extend the current format. What about adding another file in a new location, and maintaining the current output as a backwards-compatibility thing? It would also allow for a chance to clean things up like adding isstd/isgmt to the main 'ttinfo' structure, and no longer maintaining two copies of everything.
random832@fastmail.us wrote:
> We're getting stuck on ways to compatibly extend the current format.
Appending new fields to the existing format is a serviceable mechanism. The problem with glibc, and other tzfile parsers that are sensitive to such changes, is best addressed by giving advance notice of the new format before releasing the new zic.
> What about adding another file in a new location,
I don't think there's any pressing need to do this, but it has its attractions. If we do work on an incompatibly different file format, it should be a long-term project to produce a format to serve for the next 30 years, not a hasty process that we'd need to redo. You've listed a couple of issues we could tackle this way, and I've got my own laundry list of desiderata.

-zefram
On Wed, Sep 11, 2013, at 8:50, Zefram wrote:
> random832@fastmail.us wrote:
>> We're getting stuck on ways to compatibly extend the current format.
>
> Appending new fields to the existing format is a serviceable
> mechanism.  The problem with glibc, and other tzfile parsers that are
> sensitive to such changes, is best addressed by giving advance notice
> of the new format before releasing the new zic.
My thinking is: if it's not going to actually be compatible with current implementations, what's the point of being compatible at all?
>> What about adding another file in a new location,
>
> I don't think there's any pressing need to do this, but it has its
> attractions.  If we do work on an incompatibly different file format,
> it should be a long-term project to produce a format to serve for the
> next 30 years, not a hasty process that we'd need to redo.  You've
> listed a couple of issues we could tackle this way, and I've got my
> own laundry list of desiderata.
I had been thinking about a PNG-like "chunk" structure which would allow implementations to ignore anything they don't understand. I thought about proposing this back when we were talking about tz-to-metazone mapping (something which I still think should belong to this project rather than CLDR). That would take care of future extensibility.
random832@fastmail.us wrote:
> My thinking is, If it's not going to actually be compatible with
> current implementations, what's the point of being compatible at all?
Much less change is required to cope with the extended format than would be required to cope with a new format. I'm going to have to make such code changes myself: my tzfile-parsing Perl module will reject '3' for the version byte. (It does so because it wasn't obvious, at version 2, in what manner future changes to the format would be signalled, nor to what extent they'd maintain compatibility. Indeed, along with accepting '3' I'll have to change the way the POSIXish-TZ field is processed.)
I had been thinking about a PNG-like "chunk" structure which would allow implementations to ignore anything they don't understand.
Yes, PNG's chunk system is a good model. More generally, a new format should have a well-defined extension mechanism, letting parsers know which parts they need to understand and letting them skip the inessential parts without understanding them. -zefram
On Wed, Sep 11, 2013, at 10:13, Zefram wrote:
It does so because it wasn't obvious, at version 2, in what manner future changes to the format would be signalled, nor to what extent they'd maintain compatibility.
I don't understand this - it seems to me it was very obvious, precisely from the extent to which version 2 maintained compatibility with version 1 (including an entire redundant copy of everything, and being unwilling to do even so little as add entries to the existing ttinfo table that are not used in the period covered by 32-bit timestamps, instead creating a second ttinfo table). It hadn't even occurred to me that any tools would simply reject any higher version number, since if that were the case there would be no tools that could actually make use of the backwards-compatible version-1-format data in a version-2 file.
On 2013-09-11 15:41, random832@fastmail.us wrote:
On Wed, Sep 11, 2013, at 10:13, Zefram wrote:
It does so because it wasn't obvious, at version 2, in what manner future changes to the format would be signalled, nor to what extent they'd maintain compatibility.
I don't understand this - it seems to me it was very obvious, precisely from the extent to which version 2 maintained compatibility with version 1 (including an entire redundant copy of everything, and being unwilling to do even so little as add entries to the existing ttinfo table that are not used in the period covered by 32-bit timestamps, instead creating a second ttinfo table).
It hadn't even occurred to me that any tools would simply reject any higher version number, since if that were the case there would be no tools that could actually make use of the backwards-compatible version-1-format data in a version-2 file.
Even so, it ought to be documented how version-0 parsers and version-2 parsers are supposed to parse version-N data files. -- -=( Ian Abbott @ MEV Ltd. E-mail: <abbotti@mev.co.uk> )=- -=( Tel: +44 (0)161 477 1898 FAX: +44 (0)161 718 3587 )=-
Ian Abbott wrote:
It ought to be documented how version-0 parsers and version-2 parsers are supposed to parse version-N data files.
I snuck that into one of yesterday's patches. The latest tzfile(5) ends with: "Future changes to the format may append more data." That describes how version 0 parsers can parse version 2 format, and is one plausible way that any version 4 might evolve from version 3. Version 3 is an exception, since it appends zero data to version 2, and slightly extends the interpretation of the version-2 format in a way that seems to match existing practice already in most cases (and the exceptions don't misbehave significantly).
I don't think there's any pressing need to do this, but it has its attractions. If we do work on an incompatibly different file format, it should be a long-term project to produce a format to serve for the next 30 years, not a hasty process that we'd need to redo.
Agreed. And I don't think we should be sacrificing clean design for any kind of compatibility.
I had been thinking about a PNG-like "chunk" structure which would allow implementations to ignore anything they don't understand. I thought about proposing this back when we were talking about tz-to-metazone mapping (something which I still think should belong to this project rather than CLDR). That would take care of future extensibility.
What's wrong with XML. Parsers are easy to obtain. The code that interprets the structure can ignore any entity they don't understand, giving complete extensibility. Or you can have a specific entity: <mandatory type="abc"> ... </mandatory> which means "if you don't know what an 'abc' is, raise an error now rather than trying to understand the contents". -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
Clive D.W. Feather wrote:
What's wrong with XML.
It's hideously verbose; it doesn't actually provide a representation of the data (it only provides structure); it has awkward corners that end up poorly tested; as it's a textual format it's subject to all the usual problems around character sets; the context encourages the use of forgiving parsers for the actual textual data representation, which leads to incompatibility; and it's a bad design in the first place because it tries to tackle too many classes of job. -zefram
Zefram <zefram@fysh.org> wrote:
|to incompatibility; and it's a bad design in the first place because it
|tries to tackle too many classes of job.
And ... I still wonder what was wrong with SGML. (That is: soap is fine, but where are the bath <em/salts/?)
|-zefram
Besides, I am lucky that this wonderful project has a maintainer who is genius enough to have an overview of what is actually the current state. I would have voted for him if I had known that in the first place. Ciao, --steffen
Zefram wrote:
What's wrong with XML. It's hideously verbose; it doesn't actually provide a representation of the data (it only provides structure); it has awkward corners that end up poorly tested; as it's a textual format it's subject to all the usual problems around character sets; the context encourages the use of forgiving parsers for the actual textual data representation, which leads to incompatibility; and it's a bad design in the first place because it tries to tackle too many classes of job.
Seconded. XML has a place in passing material between systems, but as a general store it's simply wrong. OSM originated as XML, but the raw data is now in a binary file format to cut the flab. I am looking at creating an SQL database containing the current data, which can then also have a built-in history of changes to that data. This will allow filtering in several ways, such as the number of timezones on a particular date, and the history of the 'creation' of data for a particular timezone can be displayed. Outputting a current clean set of data could then be done in any format as required, providing a base on which to hang the 'evidence' as additional records that can be displayed as HTML pages. -- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On Thu, 12 Sep 2013, Clive D.W. Feather wrote:
What's wrong with XML. Parsers are easy to obtain.
Please not XML. The tzfile parser needs to be lightweight, whereas XML parsers tend to be large, or limited to a subset of the format.
The code that interprets the structure can ignore any entity they don't understand, giving complete extensibility.
You can do the same with simple binary formats, such as Type-Length-Value, and you can have a bit in the type field to say ...
"if you don't know what an 'abc' is, raise an error now rather than trying to understand the contents".
--apb (Alan Barrett)
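Alan's Type-Length-Value suggestion with a criticality bit can be sketched in a few lines. The record layout here (16-bit type whose high bit means "critical", 16-bit length) is purely hypothetical, chosen to illustrate the mechanism rather than to propose an actual tz format:

```python
import struct

CRITICAL = 0x8000  # hypothetical: high bit of the 16-bit type field

def parse_tlv(data, handlers):
    """Walk TLV records: dispatch known types, silently skip unknown
    non-critical ones, and refuse to continue past an unknown critical
    record -- the "raise an error now" behavior Clive asked for."""
    pos, out = 0, []
    while pos < len(data):
        rtype, length = struct.unpack_from(">HH", data, pos)
        value = data[pos + 4 : pos + 4 + length]
        base = rtype & ~CRITICAL
        if base in handlers:
            out.append(handlers[base](value))
        elif rtype & CRITICAL:
            raise ValueError("unknown critical record type %#x" % base)
        # else: unknown and non-critical -- skip without understanding it
        pos += 4 + length
    return out
```

This gives the same extensibility as PNG's chunk system (ancillary vs. critical chunks) at a fraction of the parser weight of XML.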
On 11 September 2013 10:55, Paul Eggert <eggert@cs.ucla.edu> wrote:
We need to release a new version soon, because of Fiji.
I'm afraid I've very much lost track of what changes have happened and what have not. This particularly concerns me due to the much larger than typical data deletion in this release (much of which I, and other source data readers, have disagreed with). It does seem to me that it would be worthwhile releasing the next version with minimal and non-harmful data changes (i.e., just changes like Fiji) while other things are cleared up. We also desperately need a non-experimental git repo so we can see what is going into the actual releases. In lieu of this, a single patch of all the data changes since the last release would seem necessary to perform an effective review. Stephen
Stephen Colebourne wrote:
I've very much lost track of what changes have happened and what have not.
To do that, compare the latest stable release (2013d, or 8f10e5c in the experimental repository) to the master head. If you have the git repository, you can run this shell command:

git diff 8f10e5c...HEAD

Or you can visit this URL:

https://github.com/eggert/tz/compare/8f10e5c...HEAD

and click on 'Files Changed' to see each change to each file.

I'd like to simplify this process by adding tags to the experimental repository, so that you can say something like '2013d...HEAD' instead; see <https://github.com/eggert/tz/issues/1>. This should make it more convenient to look at old releases via git. Unfortunately Git has multiple types of tags and Github has a "Releases" feature, and I haven't yet had time to understand all the issues involved.

As for releasing just the Fiji change, the typical practice has been for downstream distributions to do that sort of thing, instead of doing it in the tz release itself. For example, Debian squeeze <http://packages.debian.org/squeeze/tzdata> is currently using 2012g with just the fall-2012 Brazilian DST patch (along with some Debian-specific patches). I expect that we can continue with this practice. The current changeset is larger than usual, but the more-problematic proposals have been reverted or were never added in the first place. In the past we published far more drastic changes and the world rolled along pretty much at the same rate as before.
On 12 September 2013 11:29, Paul Eggert <eggert@cs.ucla.edu> wrote:
I've very much lost track of what changes have happened and what have not.
To do that, compare the latest stable release (2013d, or 8f10e5c in the experimental repository) to the master head. If you have the git repository, you can run this shell command:
git diff 8f10e5c...HEAD
Or you can visit this URL:
https://github.com/eggert/tz/compare/8f10e5c...HEAD
and click on 'Files Changed' to see each change to each file.
I can do this, yes, but more important is that the reasons behind the changes to these 36 files are lost amongst those 74 commits. For effective review, I think it would be quite prudent to summarize specifically what changes are (and are not) made by this changeset, because this is the sum of so many different changes that I really can't make sense of it all from just a diff, and I don't believe I'm alone. -- Tim Parenti
Tim Parenti wrote:
I can do this, yes, but more important is that the reasons behind the changes to these 36 files are lost amongst those 74 commits. For effective review, I think it would be quite prudent to summarize specifically what changes are (and are not) made by this changeset, because this is the sum of so many different changes that I really can't make sense of it all from just a diff, and I don't believe I'm alone.
Using a DVCS properly does allow for finer commit detail; that is one of the reasons for my suggestion to split data from code, to make that easier to achieve. I run Hg here rather than Git, as it provides more visual tools to view and manage changes. It allows me to maintain a local repo with my own changes and cherry-pick other commits. And to add to the picture, tools like TortoiseHg run on both Linux and Windows; the online view falls short of what can be achieved locally offline. -- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On 09/12/13 08:39, Tim Parenti wrote:
it would be quite prudent to summarize specifically what changes are (and are not) made by this changeset
Yes, although I normally write a change summary when preparing a release and publish it as part of the release announcement, the proposed change is large enough that it'd be helpful to circulate a draft of the change summary now. Here is a draft for the current set of proposed changes, which is:

https://github.com/eggert/tz/compare/8f10e5c...738ad89

Changes affecting near-future time stamps

This year Fiji will start DST on October 27, not October 20. (Thanks to David Wheeler for the heads-up.) For now, guess that Fiji will continue to spring forward the Sunday before the fourth Monday in October.

Changes affecting current and future time zone abbreviations

Use WIB/WITA/WIT rather than WIT/CIT/EIT for alphabetic Indonesian time zone abbreviations since 1932. (Thanks to George Ziegler, Priyadi Iman Nurcahyo, Zakaria, Jason Grimes, Martin Pitt, and Benny Lin.) This affects Asia/Dili, Asia/Jakarta, Asia/Jayapura, Asia/Makassar, and Asia/Pontianak.

Use ART (UTC-3, standard time), rather than WARST (also UTC-3, but daylight saving time) for San Luis, Argentina since 2009.

Changes affecting Godthab time stamps after 2037 if version mismatch

Allow POSIX-like TZ strings where the transition time's hour can range from -167 through 167, instead of the POSIX-required 0 through 24. E.g., TZ='FJT-12FJST,M10.3.1/146,M1.3.4/75' for the new Fiji rules. This is a more-compact way to represent far-future time stamps for America/Godthab, America/Santiago, Antarctica/Palmer, Asia/Gaza, Asia/Hebron, Asia/Jerusalem, Pacific/Easter, and Pacific/Fiji. Other zones are unaffected by this change. (Derived from a suggestion by Arthur David Olson.)

Allow POSIX-like TZ strings where daylight saving time is in effect all year. E.g., TZ='WART4WARST,J1/0,J365/25' for Western Argentina Summer Time all year. This supports a more-compact way to represent the 2013d data for America/Argentina/San_Luis. Because of the change for San Luis noted above this change does not affect the current data.
(Thanks to Andrew Main (Zefram) for suggestions that improved this change.)

Where these two TZ changes take effect, there is a minor extension to the tz file format in that it allows new values for the embedded TZ-format string, and the tz file format version number has therefore been increased from 2 to 3 as a precaution. Version-2-based client code should continue to work as before for all time stamps before 2038. Existing version-2-based client code (tzcode, GNU/Linux, Solaris) has been tested on version-3-format files, and typically works in practice even for time stamps after 2037; the only known exception is America/Godthab.

Changes affecting time stamps before 1970

Pacific/Johnston is now a link to Pacific/Honolulu. This corrects some errors before 1947.

Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. These changes affect only time stamps before 1943. The affected zones are: Africa/Juba, America/Anguilla, America/Aruba, America/Dominica, America/Grenada, America/Guadeloupe, America/Marigot, America/Montserrat, America/St_Barthelemy, America/St_Kitts, America/St_Lucia, America/St_Thomas, America/St_Vincent, America/Tortola, and Europe/Vaduz.

Change Kingston Mean Time from -5:07:12 to -5:07:11. This affects America/Cayman, America/Jamaica and America/Grand_Turk time stamps from 1890 to 1912.

Change the UT offset of Bern Mean Time from 0:29:44 to 0:29:46. This affects Europe/Zurich time stamps from 1853 to 1894. (Thanks to Alois Treindl.)

Change the date of the circa-1850 Zurich transition from 1849-09-12 to 1853-07-16, overriding Shanks with data from Messerli about postal and telegraph time in Switzerland.

Changes affecting time zone abbreviations before 1970

For Asia/Jakarta, use BMT (not JMT) for mean time from 1923 to 1932, as Jakarta was called Batavia back then.
Changes affecting API

The 'zic' command now outputs a dummy transition when far-future data can't be summarized using a TZ string, and uses a 402-year window rather than a 400-year window. For the current data, this affects only the Asia/Tehran file. It does not affect any of the time stamps that this file represents, so zdump outputs the same information as before. (Thanks to Andrew Main (Zefram).)

The 'date' command has a new '-r' option, which lets you specify the integer time to display, a la FreeBSD.

The 'tzselect' command has two new options '-c' and '-n', which let you select a zone based on latitude and longitude.

The 'zic' command's '-v' option now warns about constructs that require the new version-3 binary file format. (Thanks to Arthur David Olson for the suggestion.)

Support for floating-point time_t has been removed. It was always dicey, and POSIX no longer requires it. (Thanks to Eric Blake for suggesting to the POSIX committee to remove it, and thanks to Alan Barrett, Clive D.W. Feather, Andy Heninger, Arthur David Olson, and Alois Treindl, for reporting bugs and elucidating some of the corners of the old floating-point implementation.)

The signatures of 'offtime', 'timeoff', and 'gtime' have been changed back to the old practice of using 'long' to represent UT offsets. This had been inadvertently and mistakenly changed to 'int_fast32_t'. (Thanks to Christos Zoulas.)

The code avoids undefined behavior on integer overflow in some more places, including gmtime, localtime, mktime and zdump.

Changes affecting the zdump utility

zdump now outputs "UT" when referring to Universal Time, not "UTC". "UTC" does not make sense for time stamps that predate the introduction of UTC, whereas "UT", a more-generic term, does. (Thanks to Steve Allen for clarifying UT vs UTC.)
Data changes affecting behavior of tzselect and similar programs

Country code BQ is now called the more-common name "Caribbean Netherlands" rather than the more-official "Bonaire, St Eustatius & Saba".

Remove from zone.tab the names America/Montreal, America/Shiprock, and Antarctica/South_Pole, as they are equivalent to existing same-country-code zones for post-1970 time stamps. The data for these names are unchanged, so the names continue to work as before.

Changes affecting code internals

zic -c now runs way faster on 64-bit hosts when given large numbers.

zic now uses vfprintf to allocating and freeing some memory.

tzselect now computes the list of continents from the data, rather than have it hard-coded.

Minor changes pacify GCC 4.7.3 and GCC 4.8.1.

Changes affecting the build procedure

The 'leapseconds' file is now generated automatically from a new file 'leap-seconds.list', which is a copy of <ftp://time.nist.gov/pub/leap-seconds.list>. A new source file 'leapseconds.awk' implements this. The goal is simplification of the future maintenance of 'leapseconds'.

When building the 'posix' or 'right' subdirectories, if the subdirectory would be a copy of the default subdirectory, it is now made a symbolic link if that is supported. This saves about 2 MB of file system space.

The links America/Shiprock and Antarctica/South_Pole have been moved to the 'backward' file. This affects only nondefault builds that omit 'backward'.

Changes affecting version-control only

.gitignore now ignores 'date'.

Changes affecting documentation and commentary

Changes to the 'tzfile' man page

It now mentions that the binary file format may be extended in future versions by appending data. It now refers to the 'zdump' and 'zic' man pages.

Changes to the 'zic' man page

It lists conditions that elicit a warning with '-v'. It says that the behavior is unspecified when duplicate names are given, or if the source of one link is the target of another.
Its examples are updated to match the latest data. The definition of white space has been clarified slightly. (Thanks to Michael Deckers.)

Changes to the 'Theory' file

There is a new section about the accuracy of the tz database, describing the many ways that errors can creep in, and explaining why so many of the pre-1970 time stamps are wrong or misleading (thanks to Steve Allen, Lester Caine, and Garrett Wollman for discussions that contributed to this).

The 'Theory' file describes LMT better (this follows a suggestion by Guy Harris).

It refers to the 2013 edition of POSIX rather than the 2004 edition.

It's mentioned that excluding 'backward' should not affect the other data, and it suggests at least one zone.tab name per inhabited country (thanks to Stephen Colebourne).

Some longstanding restrictions on names are documented, e.g., 'America/New_York' precludes 'America/New_York/Bronx'.

It gives more reasons for the 1970 cutoff.

It now mentions which time_t variants are supported, such as signed integer time_t. (Thanks to Paul Goyette for reporting typos in an experimental version of this change.)

(Thanks to Philip Newton for correcting typos in these changes.)

Documentation and commentary is more careful to distinguish UT in general from UTC in particular. (Thanks to Steve Allen.)

Add a better source for the Zurich 1894 transition. (Thanks to Pierre-Yves Berger.)

Update shapefile citations in tz-link.htm. (Thanks to Guy Harris.)
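The extended POSIX-like TZ string above, M10.3.1/146, encodes "the Sunday before the fourth Monday in October" as the third Monday of October plus 146 hours (6 days and 2 hours). A quick sanity check of that arithmetic, in plain Python rather than tzcode:

```python
from datetime import datetime, timedelta

def nth_weekday(year, month, n, weekday):
    """Date of the n-th occurrence of a weekday (Mon=0 .. Sun=6) in a month."""
    first = datetime(year, month, 1)
    days_ahead = (weekday - first.weekday()) % 7
    return first + timedelta(days=days_ahead + 7 * (n - 1))

# M10.3.1/146 (POSIX weekday 1 = Monday): third Monday of October,
# plus 146 hours, lands on the following Sunday at 02:00 local time.
fiji_dst_start = nth_weekday(2013, 10, 3, 0) + timedelta(hours=146)
print(fiji_dst_start)  # 2013-10-27 02:00:00
```

For 2013 this yields Sunday, October 27, matching the Fiji DST start date in the draft summary.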
Minor glitch?
zic now uses vfprintf to allocating and freeing some memory.
Should that be 'to avoid allocating and freeing some memory'? On Tue, Sep 17, 2013 at 2:38 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 09/12/13 08:39, Tim Parenti wrote:
it would be quite prudent to summarize specifically what changes are (and are not) made by this changeset
Yes, although I normally write a change summary when preparing a release and publish it as part of the release announcement, the proposed change is large enough that it'd be helpful to circulate a draft of the change summary now. Here is a draft for the current set of proposed changes, which is:
https://github.com/eggert/tz/compare/8f10e5c...738ad89
[...]
Changes affecting code internals
zic -c now runs way faster on 64-bit hosts when given large numbers.
zic now uses vfprintf to allocating and freeing some memory.
tzselect now computes the list of continents from the data, rather than have it hard-coded.
Minor changes pacify GCC 4.7.3 and GCC 4.8.1. [...]
-- Jonathan Leffler <jonathan.leffler@gmail.com> #include <disclaimer.h> Guardian of DBD::Informix - v2013.0521 - http://dbi.perl.org "Blessed are we who can laugh at ourselves, for we shall never cease to be amused."
Thank you very much for preparing this; I think it was vital on this occasion. On 17 September 2013 22:38, Paul Eggert <eggert@cs.ucla.edu> wrote:
Changes affecting time stamps before 1970
Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. These changes affect only time stamps before 1943. The affected zones are: Africa/Juba, America/Anguilla, America/Aruba, America/Dominica, America/Grenada, America/Guadeloupe, America/Marigot, America/Montserrat, America/St_Barthelemy, America/St_Kitts, America/St_Lucia, America/St_Thomas, America/St_Vincent, America/Tortola, and Europe/Vaduz.
From my perspective, these are the only changes that concern me. The change was IMO unnecessary, deleted longstanding information, and gained little if anything in return. The new values are, from the perspective of an end consumer, more confusing/worse/wrong than the originals (accepting that all the data is guesswork does not mean that one guess should take priority over another when that data has been in existence for a long time).
If others want to save the deleted data in these locations, then they need to speak up now. I don't intend on pursuing the matter further without indications that others understand the dubious nature of these changes. Stephen
As I said earlier, I also feel that removing the transition dates from LMT is an unwelcome loss of data from the database. On 2013-09-17 18:10, Stephen Colebourne wrote:
Thank you very much for preparing this; I think it was vital on this occasion.
On 17 September 2013 22:38, Paul Eggert <eggert@cs.ucla.edu> wrote:
Changes affecting time stamps before 1970
Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. These changes affect only time stamps before 1943. The affected zones are: Africa/Juba, America/Anguilla, America/Aruba, America/Dominica, America/Grenada, America/Guadeloupe, America/Marigot, America/Montserrat, America/St_Barthelemy, America/St_Kitts, America/St_Lucia, America/St_Thomas, America/St_Vincent, America/Tortola, and Europe/Vaduz. From my perspective, these are the only changes that concern me. The change was IMO unnecessary, deleted longstanding information, and gained little if anything in return. The new values are, from the perspective of an end consumer, more confusing/worse/wrong than the originals (accepting that all the data is guesswork does not mean that one guess should take priority over another when that data has been in existence for a long time).
If others want to save the deleted data in these locations, then they need to speak up now. I don't intend on pursuing the matter further without indications that others understand the dubious nature of these changes.
Stephen
Stephen Colebourne <scolebourne@joda.org> writes:
On 17 September 2013 22:38, Paul Eggert <eggert@cs.ucla.edu> wrote:
Changes affecting time stamps before 1970
Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. These changes affect only time stamps before 1943. The affected zones are: Africa/Juba, America/Anguilla, America/Aruba, America/Dominica, America/Grenada, America/Guadeloupe, America/Marigot, America/Montserrat, America/St_Barthelemy, America/St_Kitts, America/St_Lucia, America/St_Thomas, America/St_Vincent, America/Tortola, and Europe/Vaduz.
From my perspective, these are the only changes that concern me. The change was IMO unnecessary, deleted longstanding information, and gained little if anything in return.
The longstanding information appears to be of negative value, so losing it is itself a gain, I think. It means less unsourced or poorly-sourced data that people can be fooled into thinking is actually meaningful. Someone who wants to tackle this problem can certainly work out higher-quality information about transitions and pre-standardized time offsets, and such data seems, to me at least, like it would be valuable to record in an expanded database. But the data in question here seems to just be a meaningless distraction from that effort. It doesn't appear to be of sufficient quality to serve as a foundation for further work. (Thanks to Paul for the extensive discussion and background for the Europe/Vaduz change.) Given that, I support making this change. -- Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
I agree it's possible that the transition dates may not be known to be perfect. Maybe I am misunderstanding, but doesn't this proposal, in effect, remove our ability to document improved transition dates in areas outside of the active regions? I thought the goal was to increase the overall accuracy and usability of the complete database. But the current proposal removes timezone information, and also removes a way of recording improvements when they are discovered. On 2013-09-17 19:14, Russ Allbery wrote:
Stephen Colebourne <scolebourne@joda.org> writes:
On 17 September 2013 22:38, Paul Eggert <eggert@cs.ucla.edu> wrote:
Changes affecting time stamps before 1970
Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. These changes affect only time stamps before 1943. The affected zones are: Africa/Juba, America/Anguilla, America/Aruba, America/Dominica, America/Grenada, America/Guadeloupe, America/Marigot, America/Montserrat, America/St_Barthelemy, America/St_Kitts, America/St_Lucia, America/St_Thomas, America/St_Vincent, America/Tortola, and Europe/Vaduz. From my perspective, these are the only changes that concern me. The change was IMO unecessary, deleted longstanding information and gained little if anything in return. The longstanding information appears to be of negative value, so losing it is itself a gain, I think. It means less unsourced or poorly-sourced data that people can be fooled into thinking is actually meaningful.
Someone who wants to tackle this problem can certainly work out higher-quality information about transitions and pre-standardized time offsets, and such data seems, to me at least, like it would be valuable to record in an expanded database. But the data in question here seems to just be a meaningless distraction from that effort. It doesn't appear to be of sufficient quality to serve as a foundation for further work.
(Thanks to Paul for the extensive discussion and background for the Europe/Vaduz change.)
Given that, I support making this change.
David Patte ₯ <dpatte@relativedata.com> writes:
Maybe I am misunderstanding, but doesn't this proposal, in effect, remove our ability to document improved transition dates in areas outside of the active regions?
No, any zone that's turned into a link can be trivially turned back into a zone with its own rules with no user-visible impact.
I thought the goal was to increase the overall accuracy and usability of the complete database. But the current proposal removes timezone information, and also removes a way of recording improvements when they are discovered.
I think you have misunderstood what a link is in the tz database. All a link says is that the given zone has exactly the same time transitions and abbreviations as another given zone. It doesn't remove any timezone information at all. If there is high-quality information for that zone, it can be made not a link (by copying the rules of the zone to which it was linked) and then modified to include that information. The end user of the tz database doesn't know whether a zone identifier is a link or not. It behaves the same way either way from the user perspective. -- Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
Yes, but you are removing transition dates from LMT. That's the loss of data that concerns me. The transition dates currently may not be accurate, as you say, but if we determine a perfectly accurate transition date for a linked zone that differs from the zone we link to, we no longer have a way of recording it in the database. On 2013-09-17 21:32, Russ Allbery wrote:
David Patte ₯ <dpatte@relativedata.com> writes:
Maybe I am misunderstanding, but doesn't this proposal, in effect, remove our ability to document improved transition dates in areas outside of the active regions? No, any zone that's turned into a link can be trivially turned back into a zone with its own rules with no user-visible impact.
I thought the goal was to increase the overall accuracy and usability of the complete database. But the current proposal removes timezone information, and also removes a way of recording improvements when they are discovered. I think you have misunderstood what a link is in the tz database. All a link says is that the given zone has exactly the same time transitions and abbreviations as another given zone. It doesn't remove any timezone information at all. If there is high-quality information for that zone, it can be made not a link (by copying the rules of the zone to which it was linked) and then modified to include that information.
The end user of the tz database doesn't know whether a zone identifier is a link or not. It behaves the same way either way from the user perspective.
David Patte ₯ <dpatte@relativedata.com> writes:
Yes, but you are removing transition dates from LMT. That's the loss of data that concerns me. The transition dates currently may not be accurate, as you say, but if we determine a perfectly accurate transition date for a linked zone that differs from the zone we link to, we no longer have a way of recording it in the database.
Given that the whole point of my message to which you are responding was to describe in detail how this change doesn't affect our ability to record accurate LMT transition dates for those zones, I'm not sure how to proceed with this discussion. So far as I can tell, nothing about this change makes it more difficult to record an accurate LMT transition date for those zones. The editing steps required to record that information in the database files will be slightly different. That's all. Apologies for just saying the same thing again in different words, but I'm not quite sure what else to say. -- Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
On Sep 17, 2013, at 6:46 PM, David Patte ₯ <dpatte@relativedata.com> wrote:
Yes, but you are removing transition dates from LMT. That's the loss of data that concerns me. The transition dates currently may not be accurate, as you say, but if we determine a perfectly accurate transition date for a link, one that is different from the location we link to, we no longer have a way of recording it in the database.
If by that you mean that if we link together Gondwanaland/Central_City and Gondwanaland/Coast_City because their post-Epoch history is the same, and later determine that they adopted standard time in different years, we have no way of recording that in the database, the answer is "yes, we just break the link". Before breaking the link, we have two tzids, Gondwanaland/Central_City and Gondwanaland/Coast_City, which, on UN*X systems using zic, are handled by one compiled file with two hard links to it. After breaking the link, we have two tzids, Gondwanaland/Central_City and Gondwanaland/Coast_City, which, on UN*X systems using zic, are handled by two separate compiled files. Other systems might think that the existence of a Link line means that one of those is the "real" tzid and the other is an alias, but it's wrong of them to think so, and I think we should make that very very very very clear in the Theory document. (And, yes, this means that http://norbertlindenberg.com/ecmascript/intl.html#sec-6.4.2 should continue to treat the "backward" file as special, with paragraph 2 changed to If ianaTimeZone is a Link name *that appears in the “backward” file of the IANA Time Zone Database*, then let ianaTimeZone be the corresponding Zone name as specified in that file. Only links in the "backward" file should be treated as aliases, all others should be treated as "for now, these two tzids behave the same, but that's not guaranteed to remain true forever".)
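Guy's Gondwanaland example can be made concrete in zic's input format. The fragment below is purely hypothetical: the offsets, the CCT abbreviation, and the transition dates are invented for illustration (only the tzids come from the message above), and the Link syntax is 'Link <target> <link-name>' as documented for zic.

```
# While the two histories are believed identical, Coast_City is a Link:
Zone Gondwanaland/Central_City	7:52:58	-	LMT	1912 Jan 1
				8:00	-	CCT
Link	Gondwanaland/Central_City	Gondwanaland/Coast_City

# Breaking the link later: Coast_City gets its own Zone recording its
# newly researched LMT-to-standard transition; both tzids keep working.
Zone Gondwanaland/Coast_City	7:58:12	-	LMT	1917 Mar 1
				8:00	-	CCT
```

Either way zic compiles both names; breaking the link only changes whether the two tzids share one compiled history or carry two distinct ones.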
So, are you saying that wherever our best guess is that the transition from LMT to standard time is different, the zones should be kept separate - not linked? If so, then I agree. I don't want to lose transition dates where we already have different transition dates in the db. On 2013-09-17 22:08, Guy Harris wrote:
and later determine that they adopted standard time in different years, we have no way of recording that in the database, the answer is "yes, we just break the link".
On Sep 17, 2013, at 8:39 PM, David Patte ₯ <dpatte@relativedata.com> wrote:
So, are you saying that wherever our best guess is that the transition from LMT to standard time is different, the zones should be kept separate - not linked?
No. I'm saying that if we were to get rid of that transition information (if, for example, we have little confidence in it), and make some tzdb zones Links to another zone when, with the removal of the transition information, they have the same information, that does not prevent us from later putting the transition information back in (if, for example, we get more reliable information) and making those tzdb zones separate again. Removing the transition information would, obviously, mean that it would not currently be in the database, but it would *NOT* mean that we could never put it back in. (That does raise the question of what should be done if not all locations within a tzdb zone adopted standard time on the same date: "split the zone" means creating zones that would probably be of no interest to many end-users (I suspect a large fraction of the localtime() calls on my machine convert times that result from stat() and company, and few if any of those date back to the previous millennium, much less before the adoption of nationwide standard time in the US), so I don't think we should do that until we have some way for packagers of the tzdb to winnow out that stuff; "pick a location in the zone and use that" gives wrong answers for locations other than the one chosen and locations that went to standard time on the same date.)
Guy Harris wrote:
On Sep 17, 2013, at 8:39 PM, David Patte ₯ <dpatte@relativedata.com> wrote:
So, are you saying that wherever our best guess is that the transition from LMT to standard time is different, the zones should be kept separate - not linked?
No.
I'm saying that if we were to get rid of that transition information (if, for example, we have little confidence in it), and make some tzdb zones Links to another zone when, with the removal of the transition information, they have the same information, that does not prevent us from later putting the transition information back in (if, for example, we get more reliable information) and making those tzdb zones separate again.
Removing the transition information would, obviously, mean that it would not currently be in the database, but it would *NOT* mean that we could never put it back in.
(That does raise the question of what should be done if not all locations within a tzdb zone adopted standard time on the same date:
"split the zone" means creating zones that would probably be of no interest to many end-users (I suspect a large fraction of the localtime() calls on my machine convert times that result from stat() and company, and few if any of those date back to the previous millennium, much less before the adoption of nationwide standard time in the US), so I don't think we should do that until we have some way for packagers of the tzdb to winnow out that stuff;
"pick a location in the zone and use that" gives wrong answers for locations other than the one chosen and locations that went to standard time on the same date.)
No need to shout ... nothing I've said implied we could not put data back, but what I'm going on about here is that currently people are USING data with these values in it. *YES* correcting data in the future is the target, but changing one assumption for another in the short term is simply wrong. So I'd prefer to maintain that information, and I'm happy for it simply to be moved to the 'extended' data store. -- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On 18 September 2013 02:32, Russ Allbery <rra@stanford.edu> wrote:
David Patte ₯ <dpatte@relativedata.com> writes:
Maybe I am misunderstanding, but doesn't this proposal, in effect, remove our ability to document improved transition dates in areas outside of the active regions?
No, any zone that's turned into a link can be trivially turned back into a zone with its own rules with no user-visible impact.
On 18 September 2013 09:19, Guy Harris <guy@alum.mit.edu> wrote:
I'm saying that if we were to get rid of that transition information (if, for example, we have little confidence in it), and make some tzdb zones Links to another zone when, with the removal of the transition information, they have the same information, that does not prevent us from later putting the transition information back in (if, for example, we get more reliable information) and making those tzdb zones separate again. Removing the transition information would, obviously, mean that it would not currently be in the database, but it would *NOT* mean that we could never put it back in.
Both these comments are perfectly reasonable. A Link can, in technical terms, be converted back to a Zone if additional researched/reliable data is found. However, it is my understanding of Paul's comments that he will refuse to do so:

Me: None of the above should stop existing Zone and Link entries from being expanded with researched historic data, i.e. pre-1970 data for the existing set of IDs should remain in the main files... http://article.gmane.org/gmane.comp.time.tz/7171

Paul: That doesn't follow. If the main files currently have a link, and we want to turn this into a zone only because of pre-1970 data, we want to keep it a link in the main file, so that we can support existing implementations that are based on the typical current practice. http://article.gmane.org/gmane.comp.time.tz/7175

Me: Furthermore, had someone provided detailed pre-1970 data for America/Aruba a year ago, I think you would have accepted it. Yet you are arguing that now you've made it a Link you can no longer accept it. I would suggest that isn't logical or best practice. http://article.gmane.org/gmane.comp.time.tz/7178

Paul: No, we've excluded similar pre-1970 data in the past, e.g., Europe/Zagreb. http://article.gmane.org/gmane.comp.time.tz/7181

Based on this I infer that the conversion from Zone to Link is not temporary, but permanent. There will be no way to reinsert the data into the main tzdb (it might be reinsertable into a secondary/extended/pre-1970 file, but not the main data set as currently designed). As such, and from my perspective, the changes under discussion result in a permanently worse set of data for source code consumers like Java and PHP, which rely on accurate data for each ID (and don't care about the Link vs Zone distinction or zic compiled data size). My strong preference is to retain this kind of data (reverting the deletions) until there is some alternate means of representing it and a long enough period for source code data consumers to adapt their parsers (i.e. at least 6 months).

Stephen
On Wed, 18 Sep 2013, Stephen Colebourne wrote:
Based on this I infer that the conversion from Zone to Link is not temporary, but permanent. There will be no way to reinsert the data into the main tzdb (it might be reinsertable into a secondary/extended/pre-1970 file, but not the main data set as currently designed).
As such, and from my perspective, the changes under discussion result in a permanently worse set of data for source code consumers like Java and PHP, which rely on accurate data for each ID (and don't care about the Link vs Zone distinction or zic compiled data size).
My strong preference is to retain this kind of data (reverting the deletions) until there is some alternate means of representing it and a long enough period for source code data consumers to adapt their parsers (i.e. at least 6 months).
I fully support this. cheers, Derick -- http://derickrethans.nl | http://xdebug.org Like Xdebug? Consider a donation: http://xdebug.org/donate.php twitter: @derickr and @xdebug
On Tue, 17 Sep 2013, Stephen Colebourne wrote:
On 17 September 2013 22:38, Paul Eggert <eggert@cs.ucla.edu> wrote:
Changes affecting time stamps before 1970
Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. These changes affect only time stamps before 1943. The affected zones are: Africa/Juba, America/Anguilla, America/Aruba, America/Dominica, America/Grenada, America/Guadeloupe, America/Marigot, America/Montserrat, America/St_Barthelemy, America/St_Kitts, America/St_Lucia, America/St_Thomas, America/St_Vincent, America/Tortola, and Europe/Vaduz.
From my perspective, these are the only changes that concern me.
I also disagree with this change, except when two zones differ only in LMT value, in which case I think it's fine to convert one to a link. If two zones differ in the date of transition from LMT to standard time, then I think that we should retain that information, not convert one to a link, even if the difference is only in pre-1970 data. If we know that old data was wrong, then we should correct it, but if we merely suspect that the old data is unreliable then we should retain our best estimate, not remove it. --apb (Alan Barrett)
On 18.09.2013 10:28, Alan Barrett wrote:
If we know that old data was wrong, then we should correct it, but if we merely suspect that the old data is unreliable then we should retain our best estimate, not remove it.
--apb (Alan Barrett)
So I support the removal, since I think the discussed data obviously appear to be of very questionable nature. We cannot even consider the discussed data as "our best estimate". Of course, there is no 100% guarantee - no black and white. If someone comes to know better, it is easy to add the lost data again. But we should really not leave the users in a state of wrong assumptions. Stability of data should not be the primary concern; correctness should. And most users (near 100%) don't care about old, that is to say archaeological, timezone data.
Meno Hochschild <mhochschild@gmx.de> wrote:

|On 18.09.2013 10:28, Alan Barrett wrote:
|> If we know that old data was wrong, then we should correct it, but if
|> we merely suspect that the old data is unreliable then we should
|> retain our best estimate, not remove it.
|
|So I support the removal, since I think the discussed data obviously

I don't.

|appear to be of very questionable nature. We cannot even consider the

What do you mean by that? The people who collected the data are Bachelors of Science and Masters of Science. Sorry? (Not that I give anything on that, I'm German ;) Do you know it any better?

|discussed data as "our best estimate". Of course, there is no 100%

Well of course tz can, because it did and does and continues to do so. You do not get used to anything better than that, can you?

I still fail to understand the discussion. It is clear that it gets fuzzy in the past, and the only problem about that is that pupils don't learn anything about it at all. Of course we can ask the question whether knowing an exact time is of any value at all, or wouldn't it be cooler to wear only the titanium, gold and diamonds? A few seconds here and there, they will get eroded away soon anyway; the leap-second specialists will do it and drop 'em, so as to make algorithms easier or i-don't-know-why. To be honest, I don't even know if life was at all possible before 1972, as I don't have any serious data to confirm it.

--steffen
On 18 September 2013 10:59, Meno Hochschild <mhochschild@gmx.de> wrote:
So I support the removal, since I think the discussed data obviously appear to be of very questionable nature. We cannot even consider the discussed data as "our best estimate". Of course, there is no 100% guarantee - no black and white. If someone comes to know better, it is easy to add the lost data again. But we should really not leave the users in a state of wrong assumptions. Stability of data should not be the primary concern; correctness should. And most users (near 100%) don't care about old, that is to say archaeological, timezone data.
Pre-1970 data matters to some more than others. I can see a range of positions:
a) delete all pre-1970 data
b) only have Zones for areas distinct after 1970, other IDs are Links, full data where available for each Zone
c) only have IDs for areas distinct after 1970, full data where available for each ID
d) create new IDs where data only differs before 1970
I'm arguing for (c), which I previously believed was the tzdb's goal. The data deletion is based on (b).

The quality of the data deleted is also of different value to different people. I'll try to explain it in a different way...

We know that the quality of the historic data for the Caribbean is dubious. Let's give it an accuracy rating of 20%. One argument is that removing data with 20% accuracy is a good thing, and that is an understandable position. However, it's important to look at the consequences of the deletion. Previously, location A (eg. Guadeloupe) had a 20% accuracy rating for its historic data. After the change it still has a 20% accuracy rating, just with different data. But that 20% accuracy rating refers to location B (eg. Port of Spain). From the perspective of location A, the accuracy is now lower, say 5%, because 20% accurate data for location B clearly is even less accurate for location A.

I understand that the distinction here is fine. But it's rather like saying "we published a guess 10 years ago for when the first factory opened in Brussels, but it is now OK to replace that data with the guess we made for when the first factory opened in London" (assuming we recorded factory opening dates). I value each guess being distinct for each location in the absence of better information.

Stephen
The use I have for the data is primarily the accurate display of the sky at any given moment, which the post-1970 data does very well. Astronomers generally avoid using zones and use UTC or GMT/LMT for past events, but there are some programs such as Starry Night and Skymap (and my software) that use the PC's clock and a zone offset (which can be set to past dates for researching such phenomena as the timing and accuracy of occultations, with select planetary perturbation terms, etc.).

Astrologers (not astronomers) have been the primary source for this older information, and anyone who wants to display the sky with only the knowledge of what time is on their watch would be interested in this historical data. While the time zone information of the past (particularly the first half of the 20th century) is largely suspect (and really, anyone who has done even modest research can tell this), it is important to anyone with this interest. Since the historical data doesn't change much, I maintain a separate database for this info (with warnings), since the tzdb can't promise that it will always be there and many have no interest in this type of use.

One might want to keep in mind that there are many who have written access to the tzdb in other languages. Here is one example: http://code.google.com/p/delphi-tzdb/wiki/TZDBNewsAndUpdates

To some extent the historical usage accuracy issue is offset by the fact that the particular use I describe above has worn grooves into the more densely populated areas, which have a tendency to be more accurate than outlying areas. On Wed, Sep 18, 2013 at 8:22 AM, Stephen Colebourne <scolebourne@joda.org> wrote:
On 18 September 2013 10:59, Meno Hochschild <mhochschild@gmx.de> wrote:
So I support the removal, since I think the discussed data obviously appear to be of very questionable nature. We cannot even consider the discussed data as "our best estimate". Of course, there is no 100% guarantee - no black and white. If someone comes to know better, it is easy to add the lost data again. But we should really not leave the users in a state of wrong assumptions. Stability of data should not be the primary concern; correctness should. And most users (near 100%) don't care about old, that is to say archaeological, timezone data.
Pre-1970 data matters to some more than others. I can see a range of positions:
a) delete all pre-1970 data
b) only have Zones for areas distinct after 1970, other IDs are Links, full data where available for each Zone
c) only have IDs for areas distinct after 1970, full data where available for each ID
d) create new IDs where data only differs before 1970
I'm arguing for (c), which I previously believed was the tzdb's goal. The data deletion is based on (b).
The quality of data deleted is also of different value to different people. I'll try to explain it in a different way...
We know that the quality of the historic data for the Caribbean is dubious. Let's give it an accuracy rating of 20%. One argument is that removing data with 20% accuracy is a good thing, and that is an understandable position. However, it's important to look at the consequences of the deletion. Previously, location A (eg. Guadeloupe) had a 20% accuracy rating for its historic data. After the change it still has a 20% accuracy rating, just with different data. But that 20% accuracy rating refers to location B (eg. Port of Spain). From the perspective of location A, the accuracy is now lower, say 5%, because 20% accurate data for location B clearly is even less accurate for location A.
I understand that the distinction here is fine. But it's rather like saying "we published a guess 10 years ago for when the first factory opened in Brussels, but it is now OK to replace that data with the guess we made for when the first factory opened in London" (assuming we recorded factory opening dates). I value each guess being distinct for each location in the absence of better information.
Stephen
Can I suggest that we split this into two updates, the first containing only the Fiji change, and the second with everything else? The Fiji change will need to roll into our production systems quickly. The raft of other changes has a non-negligible chance of causing a glitch or two as it moves through our internal tooling, so it would be nice to decouple the two. Thanks, -- Andy Heninger On Tue, Sep 17, 2013 at 2:38 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 09/12/13 08:39, Tim Parenti wrote:
it would be quite prudent to summarize specifically what changes are (and are not) made by this changeset
Yes, although I normally write a change summary when preparing a release and publish it as part of the release announcement, the proposed change is large enough that it'd be helpful to circulate a draft of the change summary now. Here is a draft for the current set of proposed changes, which is:
https://github.com/eggert/tz/compare/8f10e5c...738ad89
Changes affecting near-future time stamps
This year Fiji will start DST on October 27, not October 20. (Thanks to David Wheeler for the heads-up.) For now, guess that Fiji will continue to spring forward the Sunday before the fourth Monday in October.
Changes affecting current and future time zone abbreviations
Use WIB/WITA/WIT rather than WIT/CIT/EIT for alphabetic Indonesian time zone abbreviations since 1932. (Thanks to George Ziegler, Priyadi Iman Nurcahyo, Zakaria, Jason Grimes, Martin Pitt, and Benny Lin.) This affects Asia/Dili, Asia/Jakarta, Asia/Jayapura, Asia/Makassar, and Asia/Pontianak.
Use ART (UTC-3, standard time), rather than WARST (also UTC-3, but daylight saving time) for San Luis, Argentina since 2009.
Changes affecting Godthab time stamps after 2037 if version mismatch
Allow POSIX-like TZ strings where the transition time's hour can range from -167 through 167, instead of the POSIX-required 0 through 24. E.g., TZ='FJT-12FJST,M10.3.1/146,M1.3.4/75' for the new Fiji rules. This is a more-compact way to represent far-future time stamps for America/Godthab, America/Santiago, Antarctica/Palmer, Asia/Gaza, Asia/Hebron, Asia/Jerusalem, Pacific/Easter, and Pacific/Fiji. Other zones are unaffected by this change. (Derived from a suggestion by Arthur David Olson.)
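The extended hour range can be unpacked by hand: in the Fiji string above, M10.3.1 names the Monday of the third week of October, and /146 places the transition 146 hours (6 days and 2 hours) after that day's midnight, i.e. on the following Sunday at 02:00 — the Sunday before the fourth Monday. A rough sketch of the arithmetic (the helper names are mine, not tzcode's):

```python
from datetime import date, datetime, timedelta
import calendar

def nth_weekday(year, month, week, posix_day):
    """Return the date matching a POSIX M<month>.<week>.<day> field.

    posix_day uses the POSIX convention: 0 = Sunday .. 6 = Saturday.
    week 5 means "the last such weekday of the month".
    """
    # Python's date.weekday(): Monday = 0 .. Sunday = 6; convert.
    py_day = (posix_day - 1) % 7
    days = [d for d in range(1, calendar.monthrange(year, month)[1] + 1)
            if date(year, month, d).weekday() == py_day]
    return date(year, month, days[-1] if week == 5 else days[week - 1])

def transition(year, month, week, posix_day, hours):
    """Midnight of the selected day plus an hour offset that may exceed 24."""
    d = nth_weekday(year, month, week, posix_day)
    return datetime(d.year, d.month, d.day) + timedelta(hours=hours)

# M10.3.1/146: third Monday of October 2013, plus 146 hours.
print(transition(2013, 10, 3, 1, 146))  # 2013-10-27 02:00:00, a Sunday
```

By the same arithmetic, M1.3.4/75 (third Thursday of January plus 75 hours) lands on Sunday 2014-01-19 at 03:00, matching the announced end of Fiji DST.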
Allow POSIX-like TZ strings where daylight saving time is in effect all year. E.g., TZ='WART4WARST,J1/0,J365/25' for Western Argentina Summer Time all year. This supports a more-compact way to represent the 2013d data for America/Argentina/San_Luis. Because of the change for San Luis noted above, this change does not affect the current data. (Thanks to Andrew Main (Zefram) for suggestions that improved this change.)
Where these two TZ changes take effect, there is a minor extension to the tz file format in that it allows new values for the embedded TZ-format string, and the tz file format version number has therefore been increased from 2 to 3 as a precaution. Version-2-based client code should continue to work as before for all time stamps before 2038. Existing version-2-based client code (tzcode, GNU/Linux, Solaris) has been tested on version-3-format files, and typically works in practice even for time stamps after 2037; the only known exception is America/Godthab.
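The version bump is visible in the first five bytes of a compiled file: per the tzfile format, bytes 0-3 are the magic "TZif" and byte 4 is the version, NUL for the original format or the ASCII characters '2' or '3' for the later ones. A hedged sketch of checking that byte (the function name is mine, and synthetic header bytes are used rather than a real zoneinfo file):

```python
import struct

def tzfile_version(header: bytes) -> int:
    """Return the tzfile format version from the first 5 bytes.

    Bytes 0-3 must be the magic b'TZif'; byte 4 is the version:
    NUL for version 1, b'2' or b'3' for the later formats.
    """
    magic, version = struct.unpack('4sc', header[:5])
    if magic != b'TZif':
        raise ValueError('not a tzfile')
    return 1 if version == b'\x00' else int(version)

# A version-3 file, such as the new America/Godthab, starts like this:
print(tzfile_version(b'TZif3' + b'\x00' * 15))  # 3
```

A version-2-only reader that checks nothing but the magic will happily read such a file's version-2 data block, which is why the old clients mentioned above typically keep working.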
Changes affecting time stamps before 1970
Pacific/Johnston is now a link to Pacific/Honolulu. This corrects some errors before 1947.
Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. These changes affect only time stamps before 1943. The affected zones are: Africa/Juba, America/Anguilla, America/Aruba, America/Dominica, America/Grenada, America/Guadeloupe, America/Marigot, America/Montserrat, America/St_Barthelemy, America/St_Kitts, America/St_Lucia, America/St_Thomas, America/St_Vincent, America/Tortola, and Europe/Vaduz.
Change Kingston Mean Time from -5:07:12 to -5:07:11. This affects America/Cayman, America/Jamaica and America/Grand_Turk time stamps from 1890 to 1912.
Change the UT offset of Bern Mean Time from 0:29:44 to 0:29:46. This affects Europe/Zurich time stamps from 1853 to 1894. (Thanks to Alois Treindl).
Change the date of the circa-1850 Zurich transition from 1849-09-12 to 1853-07-16, overriding Shanks with data from Messerli about postal and telegraph time in Switzerland.
Changes affecting time zone abbreviations before 1970
For Asia/Jakarta, use BMT (not JMT) for mean time from 1923 to 1932, as Jakarta was called Batavia back then.
Changes affecting API
The 'zic' command now outputs a dummy transition when far-future data can't be summarized using a TZ string, and uses a 402-year window rather than a 400-year window. For the current data, this affects only the Asia/Tehran file. It does not affect any of the time stamps that this file represents, so zdump outputs the same information as before. (Thanks to Andrew Main (Zefram).)
The 'date' command has a new '-r' option, which lets you specify the integer time to display, a la FreeBSD.
The 'tzselect' command has two new options '-c' and '-n', which let you select a zone based on latitude and longitude.
The 'zic' command's '-v' option now warns about constructs that require the new version-3 binary file format. (Thanks to Arthur David Olson for the suggestion.)
Support for floating-point time_t has been removed. It was always dicey, and POSIX no longer requires it. (Thanks to Eric Blake for suggesting to the POSIX committee to remove it, and thanks to Alan Barrett, Clive D.W. Feather, Andy Heninger, Arthur David Olson, and Alois Treindl, for reporting bugs and elucidating some of the corners of the old floating-point implementation.)
The signatures of 'offtime', 'timeoff', and 'gtime' have been changed back to the old practice of using 'long' to represent UT offsets. This had been inadvertently and mistakenly changed to 'int_fast32_t'. (Thanks to Christos Zoulas.)
The code avoids undefined behavior on integer overflow in some more places, including gmtime, localtime, mktime and zdump.
Changes affecting the zdump utility
zdump now outputs "UT" when referring to Universal Time, not "UTC". "UTC" does not make sense for time stamps that predate the introduction of UTC, whereas "UT", a more-generic term, does. (Thanks to Steve Allen for clarifying UT vs UTC.)
Data changes affecting behavior of tzselect and similar programs
Country code BQ is now called the more-common name "Caribbean Netherlands" rather than the more-official "Bonaire, St Eustatius & Saba".
Remove from zone.tab the names America/Montreal, America/Shiprock, and Antarctica/South_Pole, as they are equivalent to existing same-country-code zones for post-1970 time stamps. The data for these names are unchanged, so the names continue to work as before.
Changes affecting code internals
zic -c now runs way faster on 64-bit hosts when given large numbers.
zic now uses vfprintf to avoid allocating and freeing some memory.
tzselect now computes the list of continents from the data, rather than have it hard-coded.
Minor changes pacify GCC 4.7.3 and GCC 4.8.1.
Changes affecting the build procedure
The 'leapseconds' file is now generated automatically from a new file 'leap-seconds.list', which is a copy of <ftp://time.nist.gov/pub/leap-seconds.list>. A new source file 'leapseconds.awk' implements this. The goal is simplification of the future maintenance of 'leapseconds'.
When building the 'posix' or 'right' subdirectories, if the subdirectory would be a copy of the default subdirectory, it is now made a symbolic link if that is supported. This saves about 2 MB of file system space.
The links America/Shiprock and Antarctica/South_Pole have been moved to the 'backward' file. This affects only nondefault builds that omit 'backward'.
Changes affecting version-control only
.gitignore now ignores 'date'.
Changes affecting documentation and commentary
Changes to the 'tzfile' man page
It now mentions that the binary file format may be extended in future versions by appending data.
It now refers to the 'zdump' and 'zic' man pages.
Changes to the 'zic' man page
It lists conditions that elicit a warning with '-v'.
It says that the behavior is unspecified when duplicate names are given, or if the source of one link is the target of another.
Its examples are updated to match the latest data.
The definition of white space has been clarified slightly. (Thanks to Michael Deckers.)
Changes to the 'Theory' file
There is a new section about the accuracy of the tz database, describing the many ways that errors can creep in, and explaining why so many of the pre-1970 time stamps are wrong or misleading (thanks to Steve Allen, Lester Caine, and Garrett Wollman for discussions that contributed to this).
The 'Theory' file describes LMT better (this follows a suggestion by Guy Harris).
It refers to the 2013 edition of POSIX rather than the 2004 edition.
It's mentioned that excluding 'backward' should not affect the other data, and it suggests at least one zone.tab name per inhabited country (thanks to Stephen Colebourne).
Some longstanding restrictions on names are documented, e.g., 'America/New_York' precludes 'America/New_York/Bronx'.
It gives more reasons for the 1970 cutoff.
It now mentions which time_t variants are supported, such as signed integer time_t. (Thanks to Paul Goyette for reporting typos in an experimental version of this change.)
(Thanks to Philip Newton for correcting typos in these changes.)
Documentation and commentary is more careful to distinguish UT in general from UTC in particular. (Thanks to Steve Allen.)
Add a better source for the Zurich 1894 transition. (Thanks to Pierre-Yves Berger.)
Update shapefile citations in tz-link.htm. (Thanks to Guy Harris.)
Andy Heninger wrote:
Can I suggest that we split this into two updates, the first containing only the Fiji change
For convenience, if you'd like to have just the Fiji patch (relative to 2013d) it's appended to this email. As for redoing the patches and generating two updates, sorry, I thought I'd covered this point earlier, but I can't seem to find it in my outgoing mail, so I guess not. Anyway, in the past we haven't bothered to split out changes like that, and common practice has been for software distributions to incorporate just the changes they want, if they're leery about upgrading to a new release. For example, Fedora 19 is using 2013c, but is incorporating post-2013c patches for Morocco and Israel; see: http://pkgs.fedoraproject.org/cgit/tzdata.git/diff/tzdata.spec?h=f19 I expect this sort of thing to continue with 2013d too, as well as with 2013e whenever it comes out. Here's the Fiji-only patch that I mentioned above. (I haven't tested it.) --- old/australasia +++ new/australasia @@ -352,16 +352,25 @@ Zone Indian/Cocos 6:27:40 - LMT 1900 # today confirmed that Fiji will start daylight savings at 2 am on Sunday 21st # October 2012 and end at 3 am on Sunday 20th January 2013. # http://www.fiji.gov.fj/index.php?option=com_content&view=article&id=6702&cat... + +# From the Fijian Government Media Center (2013-08-30) via David Wheeler: +# Fiji will start daylight savings on Sunday 27th October, 2013 and end at 3am +# on Sunday 19th January, 2014.... move clocks forward by one hour from 2am +# http://www.fiji.gov.fj/Media-Center/Press-Releases/DAYLIGHT-SAVING-STARTS-ON... # -# From Paul Eggert (2012-08-31): -# For now, guess a pattern of the penultimate Sundays in October and January. +# From Paul Eggert (2013-09-09): +# For now, guess that Fiji springs forward the Sunday before the fourth +# Monday in October. This matches both recent practice and +# timeanddate.com's current spring-forward prediction. +# For the January 2014 transition we guessed right while timeanddate.com +# guessed wrong, so leave the fall-back prediction alone. 
 # Rule	NAME	FROM	TO	TYPE	IN	ON	AT	SAVE	LETTER/S
 Rule	Fiji	1998	1999	-	Nov	Sun>=1	2:00	1:00	S
 Rule	Fiji	1999	2000	-	Feb	lastSun	3:00	0	-
 Rule	Fiji	2009	only	-	Nov	29	2:00	1:00	S
 Rule	Fiji	2010	only	-	Mar	lastSun	3:00	0	-
-Rule	Fiji	2010	max	-	Oct	Sun>=18	2:00	1:00	S
+Rule	Fiji	2010	max	-	Oct	Sun>=21	2:00	1:00	S
 Rule	Fiji	2011	only	-	Mar	Sun>=1	3:00	0	-
 Rule	Fiji	2012	max	-	Jan	Sun>=18	3:00	0	-
 # Zone	NAME	GMTOFF	RULES	FORMAT	[UNTIL]
On Tue, Sep 17, 2013 at 5:59 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Andy Heninger wrote:
Can I suggest that we split this into two updates, the first containing only the Fiji change
For convenience, if you'd like to have just the Fiji patch (relative to 2013d) it's appended to this email.
As for redoing the patches and generating two updates, sorry, I thought I'd covered this point earlier, but I can't seem to find it in my outgoing mail, so I guess not. Anyway, in the past we haven't bothered to split out changes like that, and common practice has been for software distributions to incorporate just the changes they want, if they're leery about upgrading to a new release. For example, Fedora 19 is using 2013c, but is incorporating post-2013c patches for Morocco and Israel; see:
We try to have the time zone data in use be exactly the latest public tz data, identified by its public name, 2013d or whatever. Patching is always possible; it's just cleaner and more convenient not to have to. The Fedora approach of patching the data but continuing to identify it as something that it is not can lead to real confusion.

In an ideal world, from my perspective, changes that affect the present or near future time would be kept separate from other changes, and have a fast-track release process. And perhaps substantial cleanup and historical data updates would be kept away from the busy times in March-April and September-October, when all too many countries seem to think it's OK to announce that they changed their clocks last weekend.

Thanks,
-- Andy
Andy Heninger wrote:
The Fedora approach of patching the data but continuing to identify it as something that it is not can lead to real confusion.
Sure, but Fedora doesn't identify it as plain 2013c: Fedora 19, for example, uses what it calls version 2013c-2. Debian is similar; Debian 7.1 uses what it calls 2013c-0wheezy1. OpenSUSE 12.3 uses its own 2013d-1.2. And so forth. This common approach works and scales reasonably well to a large number of distributions.

If we changed the tz maintenance approach, and identified some patches to be higher priority and shipped them out in a separate tarball, that would complicate maintenance. Almost inevitably we'd need one or more forks in the upstream tz release: at least a "stable" branch versus an "even more stable" branch, and possibly more branches besides, depending on which distributions want which changes. Each new branch would need a separate release schedule and separate testing, and maintaining all the branches would be more hassle both for upstream and downstream maintainers.

For a sufficiently-complicated software system this kind of complexity might be worthwhile, but the tz code and data are reasonably simple, and such complexity hasn't been needed in the past, even for tz updates that were fairly hefty. I'll admit to having a bit of a bias here, as much of the maintenance burden would fall on me while the benefit would accrue to you; but even so the overall benefits of changing the tz maintenance procedure don't clearly outweigh the costs.
On Wed, 18 Sep 2013, Andy Heninger wrote:
In an ideal world, from my perspective, changes that affect the present or near future time would be kept separate from other changes, and have a fast-track release process. And perhaps substantial cleanup and historical data updates would be kept away from the busy times in March-April, and September-October, when all too many countries seem to think it's OK to announce that they changed their clocks last weekend.
From my point of view, as the person who handles tzdata updates for NetBSD, I would prefer to have no controversial changes in any tzdata update ever.

I suggest that a possible way of achieving the goal of no controversial changes would be to have at least two branches in the upstream repository. I'll name the branches "proposed" and "approved" for the sake of this message. Changes could be committed first to the "proposed" branch, then merged to the "approved" branch after discussion. Releases would be made from the "approved" branch. I also like the idea of avoiding potentially disruptive changes during the busy times of March-April and September-October.

If we had already been using this scheme over the past month or so, then the Fiji and Liechtenstein changes would be in the "approved" branch, and would be released soon, while the changes that some people are unhappy about would be in the "proposed" branch, and would be discussed further, and possibly reverted or modified.

Of course, any OS vendor can do its own separation of changes into different categories, and merge only the uncontroversial tzdata changes into OS release branches. The question has never come up before, for NetBSD, because we have not been aware of such controversial changes before.

--apb (Alan Barrett)
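[Editorial note: the two-branch scheme described above maps directly onto ordinary git branching. The sketch below builds a throwaway repository to show the flow; the branch names come from the message, while the file names, commit messages, and temp-directory setup are purely illustrative.]

```shell
#!/bin/sh
# Sketch of the "proposed"/"approved" flow in a throwaway repository.
# Branch names are from Alan's message; everything else is made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email tz@example.com
git config user.name "tz tester"
git commit -q --allow-empty -m "initial"
git branch approved            # releases would be cut from this branch
git checkout -q -b proposed    # changes land here first, for discussion
echo "Fiji 2013 DST change" > australasia
git add australasia
git commit -q -m "Add Fiji change"
# After list discussion approves the change, merge it across:
git checkout -q approved
git merge -q --no-edit proposed
git log --oneline -1           # the vetted change is now on "approved"
```

A release would then be tagged on "approved", while anything still contentious stays parked on "proposed" for further discussion.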
Alan Barrett wrote:
Of course, any OS vendor can do its own separation of changes into different categories, and merge only the uncontroversial tzdata changes into OS release branches. The question has never come up before, for NetBSD, because we have not been aware of such controversial changes before.
I have to ask ... where do you stand on omitting pre-1972 data? -- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
On Thu, 19 Sep 2013, Lester Caine wrote:
Alan Barrett wrote:
Of course, any OS vendor can do its own separation of changes into different categories, and merge only the uncontroversial tzdata changes into OS release branches. The question has never come up before, for NetBSD, because we have not been aware of such controversial changes before.
I have to ask ... where do you stand on omitting pre-1972 data?
In the short term, I think that the tz project should stick to the existing practice of using 1970 as a cutoff date. In the long term, I think that the tz project should attempt to provide as much high-quality data as is reasonably feasible, and that users of the tz data should have the ability to "winnow" and install only a subset of the data. Here, users of the tz data include OS vendors, appliance vendors, other software projects, software packagers, system administrators, and end users.

--apb (Alan Barrett)
Alan Barrett wrote:
we have not been aware of such controversial changes before.
And that's the main difference. In the past, development was done privately, with intermediate patches sometimes emailed to the list but often not, and the only real notice of a change was a new release. Now that I've been doing things on a public github site, we're getting lots more comments from people affected by changes as they're being considered. So even though the development practice is roughly the same as before, and even though the proposed set of changes is far less intrusive than some previous changesets, people are understandably concerned by seeing the details of a process that were formerly invisible.

I continue to be tempted to go back to the old way of keeping development private, not only because it'd be less work for me, but because I suspect it'd be less work for everyone else.
While noisy, I think the very public development process is a good thing, and that we will end up with better data overall as a result. I don't see this as being at all incompatible with having quick, small updates for late breaking rule changes. But pretty much no matter what is decided here as a policy for managing releases, I can deal with it.

-- Andy

On Thu, Sep 19, 2013 at 8:37 AM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Alan Barrett wrote:
we have not been aware of such controversial changes before.
And that's the main difference. In the past, development was done privately, with intermediate patches sometimes emailed to the list but often not, and the only real notice of a change was a new release. Now that I've been doing things on a public github site, we're getting lots more comments from people affected by changes as they're being considered. So even though the development practice is roughly the same as before, and even though the proposed set of changes is far less intrusive than some previous changesets, people are understandably concerned by seeing the details of a process that were formerly invisible.
I continue to be tempted to go back to the old way of keeping development private, not only because it'd be less work for me, but because I suspect it'd be less work for everyone else.
Andy Heninger <aheninger@google.com> writes:
While noisy, I think the very public development process is a good thing, and that we will end up with better data overall as a result. I don't see this as being at all incompatible with having quick, small updates for late breaking rule changes.
The tricky part with quick, small releases is that, if you've already staged things for the next release, you have to revert those changes or you have to branch. (You can, of course, branch proactively to do the staging, but it amounts to the same thing.) This means an additional level of complexity to the development process and possibly the versioning process that historically hasn't been used, and basically means more work for Paul. -- Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
Russ Allbery <rra@stanford.edu> wrote:

|Andy Heninger <aheninger@google.com> writes:
|> While noisy, I think the very public development process is a good
|> thing, and that we will end up with better data overall as a result. I
|> don't see this as being at all incompatible with having quick, small
|> updates for late breaking rule changes.
|
|The tricky part with quick, small releases is that, if you've already
|staged things for the next release, you have to revert those changes or
|you have to branch. (You can, of course, branch proactively to do the
|staging, but it amounts to the same thing.)
|
|This means an additional level of complexity to the development process
|and possibly the versioning process that historically hasn't been used,
|and basically means more work for Paul.

This is not true with modern VCSs, and especially git(1). I can think of a number of scenarios that could be used to cleanly separate development of TZ; having different mainline branches for data and code changes comes to mind immediately. Additional sub-branches for speculative work may easily be created as forks from those branches. That is what happens today almost everywhere, in projects much larger than TZ (though possibly more mature in respect to code changes). And if there really is a change so pervasive that it causes conflicts, there is git-rerere(1) (which I have never had a need to use, however). Anyway, in the worst case (no git-rebase(1) easily possible), you git-cherry-pick(1) a range of commits, plus possibly other commits individually, onto the main line of development. I'm pretty sure that Paul Eggert is aware of all this. I'm convinced that even complex workflows with multiple branches, as mentioned in the first paragraph, are easier to accomplish and much easier to maintain than with sccs(1) or cvs(1) and manually mailed patch bombs.

|Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>

The pretty old one

--steffen
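[Editorial note: as a concrete illustration of the cherry-pick flow described above, the following throwaway-repository sketch lifts a single urgent data fix onto a release branch while leaving the rest of the staged work behind. All repository contents, branch names, and commit messages here are hypothetical.]

```shell
#!/bin/sh
# Hypothetical repo: a release branch plus a development branch holding
# both a controversial cleanup and an urgent Fiji fix.  Only the Fiji
# commit is cherry-picked onto the release line.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email tz@example.com
git config user.name "tz tester"
echo base > australasia
git add australasia
git commit -q -m "2013d release"
git branch release
# Development accumulates mixed changes:
echo "link cleanup" > europe
git add europe
git commit -q -m "Controversial cleanup"
echo "fiji rules" >> australasia
git add australasia
git commit -q -m "Fiji DST change"
fiji=$(git rev-parse HEAD)
# Pick only the urgent commit onto the release branch:
git checkout -q release
git cherry-pick -x "$fiji" >/dev/null
git log --format=%s release   # Fiji change present, cleanup absent
```

The `-x` flag records the original commit id in the new commit message, which helps downstream maintainers see which upstream change was taken.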
On 19 September 2013 16:37, Paul Eggert <eggert@cs.ucla.edu> wrote:
Alan Barrett wrote:
we have not been aware of such controversial changes before.
And that's the main difference. In the past, development was done privately, with intermediate patches sometimes emailed to the list but often not, and the only real notice of a change was a new release. Now that I've been doing things on a public github site, we're getting lots more comments from people affected by changes as they're being considered. So even though the development practice is roughly the same as before, and even though the proposed set of changes is far less intrusive than some previous changesets, people are understandably concerned by seeing the details of a process that were formerly invisible.
I humbly suggest that it's not the more open process that caused recent long threads, but the nature of the changes proposed. One email described it as "revolution not evolution" IIRC. Had you used the same open process, but not made any of the controversial changes, I simply do not think that there would have been the same level of debate. I.e., not all of us see the last two months as being "roughly the same as before".

The real question at this point is not what has happened, but what will happen. Do you believe you have effective consensus to release as is? Or do you believe you will gain greater harmony by selecting an uncontroversial subset to release now?

Stephen
On 19 September 2013 13:21, Andy Heninger <aheninger@google.com> wrote:
While noisy, I think the very public development process is a good thing, and that we will end up with better data overall as a result. I don't see this as being at all incompatible with having quick, small updates for late breaking rule changes.
I, too, fail to see any incompatibility. While I am now mildly receptive to Paul's assertions that some of the changes are not major issues (although I still don't agree with all of them, for reasons others have mentioned), there is a clear separation in my mind: The Fiji change needs to happen. A vast majority of the others simply don't.

Yes, these other changes may (or may not) improve upon some aspect of tz's broader goals, but regardless they are of a completely different level of importance. If we are to turn away from the more ardent "if it ain't broke, don't fix it" mentality of the past (and I maintain that that is an okay thing to do), and thus start making changes to the scope and/or spirit of this project, it would indeed be wise to separate that long-term incremental work in the project's evolution from the short-term necessities of publishing correct data for timestamps in the immediate vicinity of now.

I don't see such an approach requiring more than three Git branches: one for official releases, one for timely data changes mostly ready for release, and one for longer-term development of the database and code itself.

On 19 September 2013 15:06, Stephen Colebourne <scolebourne@joda.org> wrote:
I humbly suggest that it's not the more open process that caused recent long threads, but the nature of the changes proposed. One email described it as "revolution not evolution" IIRC. Had you used the same open process, but not made any of the controversial changes, I simply do not think that there would have been the same level of debate. I.e., not all of us see the last two months as being "roughly the same as before".
Not just the nature of the changes proposed, but also how those proposals were presented to the list. Nearly all of the feedback came after proposed patches were pushed, as it seems little feedback was sought in advance of those patches being created. I respect the challenges of Paul's maintenance role, and do not fault him for trying to make things easier on himself; however, his personal wishlist for the future of tz is not congruent with the users' collective wishlist. The more public development process under Paul's leadership is indeed a good thing, but I think it requires a bit more proactive consensus on the direction we should be moving with things.

On 19 September 2013 15:06, Stephen Colebourne <scolebourne@joda.org> wrote:
The real question at this point is not what has happened, but what
will happen. Do you believe you have effective consensus to release as
is? Or do you believe you will gain greater harmony by selecting an uncontroversial subset to release now?
I'm interested in the response to this. -- Tim Parenti
Tim Parenti wrote:
Nearly all of the feedback came after proposed patches were pushed
That is how it's always been done. I always maintained a private repository before, as did Arthur David Olson and Robert Elz. I pushed changes into the repository first, and didn't always post them on the list afterwards, and I assume ADO and RE did that as well -- it's a natural way to proceed. It's odd to see a complaint that little feedback was sought, given that this time every proposed change was posted to the list, including a draft release notice -- in other words, far more effort was made this time to get feedback.

In hindsight, opening up the development process in this way was probably a mistake. It's slowed development, and made it less fun. And let's not discount the cost of making things less fun in a volunteer project. The large quantity of repeated discussion of trivialities has driven one valuable contributor off the mailing list, a person who in the past contributed more useful changes to the data than anybody in this discussion other than ADO and myself. That's a net minus.

All specific technical objections to the proposed changes have been responded to. The remaining objections are either vague (so there's not much specific one can do), or suggestions to complicate the development process (not a good idea right now, if ever), or are about such trivial matters that no real users will care. There have been dozens of emails on the LMT transition issue, which for our data boil down to a relatively small number of specific questions like this one: on March 2, 1912, did St Kitts advance its clocks by 10 minutes and 52 seconds at 04:10:52 UT, or by 6 minutes and 4 seconds at 04:06:04 UT? Really? Dozens of emails about a minor cleanup of what is almost surely bogus noise? All specific new objections that have come up during the discussion of the release notice have been addressed, albeit perhaps not to everyone's liking.
It's improbable we'll ever get complete consensus on the changes, but the current changeset will work and is in the spirit of how the project has always been maintained, so I'm inclined to release what we have now. The process has been too messy this time, admittedly, and I will try to do better next time.
Paul Eggert <eggert@cs.ucla.edu> wrote:

|it less fun. And let's not discount the cost of making
|things less fun in a volunteer project. The large quantity
|have been responded to. The remaining objections are either
|vague (so there's not much specific one can do), or
|suggestions to complicate the development process (not a
|Dozens of emails about a minor cleanup of what is almost
|surely bogus noise?

To have it said once: I would love to see as maintainer of the TZ database a kind person who respects cultural circumstances when they arise, not one who irons over them. It's surely only because of misunderstanding, but that doesn't make it any better.

The only real, usable improvement in all that discussion was the suggestion that the data could be improved and that the tools could be adjusted to clamp the range that is used to build the binary data, so as to save space (and maybe speed up some algorithm; I have never dealt with the TZ code). Zefram seemed to be willing to write the code necessary for that. Imho it would be an improvement if that were possible: adding some new valid historical data, while leaving it out of a build on purpose, on request. *Much* better than the other way around, for sure.

Speaking of it, and being on a level with it:

?0[steffen@sherwood tz.git]$ git blame --line-porcelain origin/master -- europe | sed -n 's/^author //p' | sort | uniq -c
   2868 Arthur David Olson
    155 Paul Eggert
?0[steffen@sherwood tz.git]$ git lo -1 0fdbcdc
* 0fdbcdc Release tzcode2013a and tzdata2013a.
?0[steffen@sherwood tz.git]$ git blame --line-porcelain 0fdbcdc -- europe | sed -n 's/^author //p' | sort | uniq -c
   2925 Arthur David Olson
     48 Paul Eggert

Personally, and having spoken with no one whomsoever, I am of the opinion that the TZ database would be better off if a project like NetBSD, which aims at producing portable code and regularly tests it on a widespread basis, would hold both the code and the data as trustee for IANA. I think this change would be a win for both parts of TZ. It is hard to imagine that, there, such changes, and especially the irritating ones, would be introduced at first.

I agree that it was much nicer when I didn't know how the result was achieved.

--steffen
On 20 September 2013 14:07, Steffen Daode <sdaoden@gmail.com> wrote:
Speaking of it, and being on a level with it:
?0[steffen@sherwood tz.git]$ git blame --line-porcelain origin/master -- europe | sed -n 's/^author //p' | sort | uniq -c
   2868 Arthur David Olson
    155 Paul Eggert
?0[steffen@sherwood tz.git]$ git lo -1 0fdbcdc
* 0fdbcdc Release tzcode2013a and tzdata2013a.
?0[steffen@sherwood tz.git]$ git blame --line-porcelain 0fdbcdc -- europe | sed -n 's/^author //p' | sort | uniq -c
   2925 Arthur David Olson
     48 Paul Eggert
I suspect your point would be made more clearly if you expressed in words what you tried to convey there. I believe not everyone knows git inside-out and can immediately grasp the relevance of the command lines you used and the output they produced. Cheers, Philip
Philip Newton wrote:
I suspect your point would be made more clearly if you expressed in words what you tried to convey there.
I think he was trying to say that Arthur David Olson has far more entries in the git commit log than I, and therefore he's by far the main creator of the database (and I am a mere interloper :-). There are a couple of things wrong about that analysis. First, until recently ADO did all the commits, including committing changes that I wrote, which means a simple git+grep check will wrongly credit him with most of my changes. Second, ADO used SCCS, which creates a separate commit record for each change to each file, whereas with git (which is what I'm using) there's one commit record per change even if the change affects multiple files.
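[Editorial note: the distinction Paul draws can be seen directly in git's metadata, since every commit records an author and a committer separately; an import that copies the committer into the author field loses the real authorship, and blame then miscredits. A small demonstration in a throwaway repository, with names and file contents made up for illustration:]

```shell
#!/bin/sh
# Demonstrates git's separate author/committer fields.  git blame reports
# the author field, so a faithful import credits the writer even when
# someone else committed the change.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "Zone data" > europe
git add europe
# One person writes the change, another commits it:
GIT_AUTHOR_NAME="Paul Eggert" GIT_AUTHOR_EMAIL=pe@example.com \
GIT_COMMITTER_NAME="Arthur David Olson" GIT_COMMITTER_EMAIL=ado@example.com \
git commit -q -m "change written by one person, committed by another"
# blame reports the author field, not the committer:
git blame --line-porcelain europe | sed -n 's/^author //p'   # prints "Paul Eggert"
git log -1 --format='author=%an committer=%cn'
```

An import from SCCS that only knows who checked each delta in can set nothing but the committer's name in both fields, which is exactly the miscrediting described above.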
+1

It would be good to have a public update that keeps things more or less as they are until
* there is a consensus about what to change exactly
* there is time to test if more radical changes actually break things

Gunther

On 18/09/2013 0:43, Andy Heninger wrote:
Can I suggest that we split this into two updates, the first containing only the Fiji change, and the second with everything else?
The Fiji change will need to roll into our production systems quickly. The raft of other changes have a non-negligible chance of causing a glitch or two as they move through our internal tooling, so it would be nice to decouple the two.
Thanks,
-- Andy Heninger
On Thu, Sep 19, 2013 at 7:02 AM, gunther vermeir <gunther.vermeir@oracle.com> wrote:
+1
it would be good to have a public update that keeps things more or less as they are until
* there is a consensus about what to change exactly
* there is time to test if more radical changes actually break things
Gunther
On 18/09/2013 0:43, Andy Heninger wrote:
Can I suggest that we split this into two updates, the first containing only the Fiji change, and the second with everything else?
The Fiji change will need to roll into our production systems quickly. The raft of other changes have a non-negligible chance of causing a glitch or two as they move through our internal tooling, so it would be nice to decouple the two.
+1 here as well. I would prefer that only the pending Fiji change be pushed out now, and that work on the other changes continue after DST season has passed. -Andrew
On 9/17/13 11:38 PM, Paul Eggert wrote:
Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. ... Europe/Vaduz.
I have already pointed out that it is wrong to link Europe/Vaduz with Europe/Zurich. Vaduz (Liechtenstein) is a different country, with a different timezone history: no DST in 1941 and 1942. It should not be linked with Zurich, Switzerland. I do not understand why the current zone for Vaduz, which is correct, should be thrown away and replaced with a link to an incorrect zone. This is against the Theory rules, as pre-1970 differences are only supposed to be ignored for areas belonging to the same country.
On 9/17/13 11:38 PM, Paul Eggert wrote:
Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. ... Europe/Vaduz.
I have already pointed out that it is wrong to link Europe/Vaduz with Europe/Zurich. Vaduz (Liechtenstein) is a different country, with a different timezone history: no DST in 1941 and 1942. It should not be linked with Zurich, Switzerland. I do not understand why the current zone for Vaduz, which is correct, should be thrown away and replaced with a link to an incorrect zone. This is against the Theory rules, as pre-1970 differences are only supposed to be ignored for areas belonging to the same country.

PS: If it is really true that from now on, you want to ignore differences between countries, if they apply only to pre-1970, when since 1970 the two countries agree, you would also have to remove Europe/Paris and link it with Europe/Berlin, because since 1970 these two zones have not differed. The same applies to MANY central European zones, which all agree since 1970. It would be a horrible loss of correct history information in TZ.
On Wed, 18 Sep 2013, Alois Treindl wrote:
I have already pointed out that it is wrong to link Europe/Vaduz with Europe/Zurich
Vaduz (Liechtenstein) is a different country, with a different timezone history: no DST in 1941 and 1942. It should not be linked with Zurich, Switzerland.
Do you have a source for that?
I do not understand why the current zone for Vaduz, which is correct, should be thrown away and be replaced with a link to an incorrect zone.
This is against the Theory rules, as pre-1970 differences are supposed to be only ignored for areas belonging to the same country.
PS: If it is really true that from now on, you want to ignore differences between countries, if they apply only to pre-1970, when since 1970 the two countries agree, you would also have to remove Europe/Paris and link it with Europe/Berlin, because since 1970 these two zones have not differed.
The same applies to MANY central European zones, which all agree since 1970.
It would be a horrible loss of correct history information in TZ.
I agree, but it doesn't seem that Paul understands that this data destruction is not wanted :-/

cheers,
Derick
On 2013-09-18 10:50, Alois Treindl wrote:
On 9/17/13 11:38 PM, Paul Eggert wrote:
Some zones have been turned into links, when they differ from existing zones only in older data that was likely invented or that differs only in LMT or transition from LMT. ... Europe/Vaduz.
I have already pointed out that it is wrong to link Europe/Vaduz with Europe/Zurich
Vaduz (Liechtenstein) is a different country, with a different timezone history: no DST in 1941 and 1942. It should not be linked with Zurich, Switzerland.
I do not understand why the current zone for Vaduz, which is correct, should be thrown away and be replaced with a link to an incorrect zone.
This is against the Theory rules, as pre-1970 differences are supposed to be only ignored for areas belonging to the same country.
PS: If it is really true that from now on, you want to ignore differences between countries, if they apply only to pre-1970, when since 1970 the two countries agree, you would also have to remove Europe/Paris and link it with Europe/Berlin, because since 1970 these two zones have not differed.
The same applies to MANY central European zones, which all agree since 1970.
It would be a horrible loss of correct history information in TZ.
+1 --
On Thu, 12 Sep 2013, Paul Eggert wrote:
To do that, compare the latest stable release (2013d, or 8f10e5c in the experimental repository) to the master head. If you have the git repository, you can run this shell command:
git diff 8f10e5c...HEAD
Or you can visit this URL:
https://github.com/eggert/tz/compare/8f10e5c...HEAD
and click on 'Files Changed' to see each change to each file.
I'd like to simplify this process by adding tags to the experimental repository, so that you can say something like '2013d...HEAD' instead; see <https://github.com/eggert/tz/issues/1>. This should make it more convenient to look at old releases via git.
Actually, most people would probably create a "feature branch" for each version (from master), then merge into master upon release and tag it.
Unfortunately Git has multiple types of tags and Github has a "Releases" feature, and I haven't yet had time to understand all the issues involved.
I wouldn't bother with that :-) cheers, Derick -- http://derickrethans.nl | http://xdebug.org Like Xdebug? Consider a donation: http://xdebug.org/donate.php twitter: @derickr and @xdebug Posted with an email client that doesn't mangle email: alpine
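[Editorial note: the "multiple types of tags" mentioned above are lightweight and annotated tags. For naming releases the way proposed earlier in the thread ('2013d...HEAD'), an annotated tag is the conventional choice, since it is a full object recording tagger and date; GitHub's "Releases" feature is layered on top of tags. A small sketch in a throwaway repository, with all names invented:]

```shell
#!/bin/sh
# Lightweight vs. annotated tags, demonstrated in a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email tz@example.com
git config user.name "tz tester"
git commit -q --allow-empty -m "Release tzdata2013d"
git tag 2013d-light                  # lightweight: a bare ref to the commit
git tag -a 2013d -m "tzdata2013d"    # annotated: a tag object with tagger/date
git cat-file -t 2013d                # prints "tag"
git cat-file -t 2013d-light          # prints "commit"
```

Commands like `git diff 2013d...HEAD` work the same with either kind; the difference is only the recorded metadata.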
Derick Rethans wrote:
Unfortunately Git has multiple types of tags and Github has a "Releases" feature, and I haven't yet had time to understand all the issues involved. I wouldn't bother with that:-)
Well, I've taken a step on ... http://lsces.org.uk/hg/tzdata/

In theory I should be able to match the actual releases to versions in that repo.

It's probably obvious in some workflows, but why is the Makefile included in the data? There are 269 commits to the Makefile and 1165 relating to the raw data. (There were fewer until I corrected the spelling of a couple of file names ;) )

I've included it in the repo simply because it is present in what I'm trying to sync with, but I'm not quite sure how to 'retrospectively' tag versions with their release number. http://lsces.org.uk/hg/tzdata/file/1a3f8e8a3ee7 is 2013d ... got it sussed. Now I just need to remember how to do that in-line.

-- Lester Caine - G8HFL
Lester Caine wrote:
http://lsces.org.uk/hg/tzdata/file/1a3f8e8a3ee7 is 2013d ... got it sussed. Now I just need to remember how to do that in-line.
Half way there ... http://lsces.org.uk/hg/tzdata/tags

So http://lsces.org.uk/hg/tzdata/archive/tzdata2013d.tar.bz2 pulls an archived copy of the files.

-- Lester Caine - G8HFL
FYI, someone previously did all the work to import the full tzdata history into a GitHub repo with all releases tagged, but it stops at a certain point in 2011: https://github.com/valodzka/tzdata

The nice thing about preserving complete history like this is that it allows you to view the blame for each data file and see exactly when each line in the file last changed, by tzdata release tag: https://github.com/valodzka/tzdata/blame/master/northamerica

If an "official" repo is ever created, these features are very useful.

-Andrew

On Fri, Sep 13, 2013 at 3:00 PM, Lester Caine <lester@lsces.co.uk> wrote:
Lester Caine wrote:
http://lsces.org.uk/hg/tzdata/file/1a3f8e8a3ee7 is 2013d ... got it sussed. Now I just need to remember how to do that in-line.
Half way there ... http://lsces.org.uk/hg/tzdata/tags So http://lsces.org.uk/hg/tzdata/archive/tzdata2013d.tar.bz2 pulls an archived copy of the files.
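The blame-by-release-tag feature Andrew mentions can be demonstrated in a throwaway repository. This is a sketch only: the file contents, commit messages, and tag names below are illustrative, not taken from the real tzdata history.

```shell
# Sketch: how a fully tagged import makes per-line history queryable,
# shown in a throwaway repository (contents and tag names are made up).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email tz@example.org
git config user.name tz

printf 'line1\n' > northamerica
git add northamerica
git commit -qm 'import 2013c data'
git tag tzdata2013c

printf 'line2\n' >> northamerica
git commit -qam 'update for 2013d'
git tag tzdata2013d

# git blame names the commit that last touched each line; --porcelain
# makes the commit hash for line 2 easy to extract:
c=$(git blame --porcelain -L2,2 northamerica | head -n1 | cut -d' ' -f1)

# Map that commit back to the release tag pointing at it:
git tag --points-at "$c"
```

With every release tagged, the last command reports `tzdata2013d` for the newer line, which is exactly the "when did this line last change, by release" view the GitHub blame page provides.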
Andrew Paprocki wrote:
FYI, someone previously did all the work to import the full tzdata history into a GitHub repo with all releases tagged, but it stops at a certain point in time in 2011: https://github.com/valodzka/tzdata
The nice thing about preserving complete history like this is that it allows you to view the blame for each data file and see exactly when each line in the file last changed by tzdata release tag: https://github.com/valodzka/tzdata/blame/master/northamerica
If an "official" repo is ever created, these features are very useful.
https://github.com/eggert Which is what I've cloned from so I can add pre-1970 material, but I feel that Paul's SHOULD just be his own workspace, with a separate 'master'. -- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Derick Rethans <tz@derickrethans.nl> writes:
Actually, most people would probably create a "feature branch" for each version (from master), then merge into master upon release and tag it.
I certainly wouldn't. That's way more overhead than I would do for even very large projects unless I had a special need, let alone something like tz. -- Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
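For reference, the branch-merge-tag sequence Derick describes (and Russ considers overkill for tz) would look roughly like this, run in a throwaway repository; the branch and tag names are illustrative only:

```shell
# Sketch of a per-release "feature branch" workflow, demonstrated in a
# throwaway repository (branch and tag names are illustrative).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email tz@example.org
git config user.name tz

echo base > northamerica
git add northamerica
git commit -qm 'initial import'
main=$(git rev-parse --abbrev-ref HEAD)   # default branch name varies

git checkout -qb 2013e                    # branch for the next release
echo change >> northamerica
git commit -qam 'update northamerica'

git checkout -q "$main"
git merge -q --no-ff -m 'merge 2013e' 2013e   # merge on release...
git tag -a tzdata2013e -m 'Release 2013e'     # ...and tag it

git tag -l
```

The `--no-ff` merge keeps each release's commits grouped under a merge commit; the cost Russ objects to is the extra branch bookkeeping per release, versus simply tagging the tip of a single branch.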
participants (22)
- Alan Barrett
- Alois Treindl
- Andrew Paprocki
- Andy Heninger
- Clive D.W. Feather
- David Patte ₯
- Derick Rethans
- gunther vermeir
- Guy Harris
- Ian Abbott
- Jonathan Leffler
- Lester Caine
- Meno Hochschild
- Paul Eggert
- Philip Newton
- random832@fastmail.us
- Russ Allbery
- Steffen Daode Nurpmeso
- Stephen Colebourne
- Tim Parenti
- Zefram
- Zoidsoft