Behaviour in the face of multiple identical rules
This is more of an awareness-raising post than anything else... the data here is very far in the past.

While trying to validate Noda Time against zic, I came up against issues with data releases between 1993 and 1996 in Egypt/Cairo. It turns out that this is because the Egypt rule is duplicated in the asia file, and this affects things.

As a smallish example, if you create a file called testzone like this:

Rule Egypt 1900 only - Oct 1 0:00 0 -
Rule Egypt 1943 1945 - Nov 1 0:00 0 -
Rule Egypt 1945 only - Apr 16 0:00 1:00 " DST"
Rule Egypt 1957 only - May 10 0:00 1:00 " DST"
Rule Egypt 1957 1958 - Oct 1 0:00 0 -
Rule Egypt 1958 only - May 1 0:00 1:00 " DST"
# Rule Egypt 1957 only - May 10 0:00 1:00 " DST"

# Zone NAME GMTOFF RULES FORMAT [UNTIL]
Zone Africa/Cairo 2:05:00 - LMT 1900 Oct
                  2:00 Egypt EET%s
... then execute:

$ zic -d . testzone && zdump -v -c 1955,1958 $PWD/Africa/Cairo

You'll see output like this for the May 1957 transition:

.../Africa/Cairo Thu May 9 21:59:59 1957 UT = Thu May 9 23:59:59 1957 EET isdst=0 gmtoff=7200
.../Africa/Cairo Thu May 9 22:00:00 1957 UT = Fri May 10 01:00:00 1957 EET DST isdst=1 gmtoff=10800

That looks okay - the transition is around midnight, as described. If, however, you uncomment that last "Rule" line above - which is an exact duplicate of the earlier one - you get:

.../Africa/Cairo Thu May 9 20:59:59 1957 UT = Thu May 9 22:59:59 1957 EET isdst=0 gmtoff=7200
.../Africa/Cairo Thu May 9 21:00:00 1957 UT = Fri May 10 00:00:00 1957 EET DST isdst=1 gmtoff=10800
Now the transition is one hour earlier. As it happens, this duplicate rule doesn't make any difference to Noda Time - it still emits the first output.

I *strongly suspect* that I shouldn't care about this - that situations with duplicate rules should be deemed out of scope for any implementation. However, if anyone can think of any good justification why an implementation *should* behave like zic in this case, I'd be really interested to hear it.

(I'm not asking for zic to change here. It already emits a warning because the rule is defined in multiple files, and I think that's good enough.)

Jon
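An aside on the arithmetic: the two dumps differ by exactly the DST saving. The rule's 0:00 is a wall-clock time; reading it against the standard offset (gmtoff=7200) gives the first transition instant, and reading it against the DST offset (gmtoff=10800) gives the second. The Python sketch below only reproduces those numbers. `wall_to_ut` is a name made up for this illustration, and whether zic actually reaches the earlier instant by this route is an assumption, not something the dumps prove.

```python
from datetime import datetime, timedelta


def wall_to_ut(wall: datetime, utoff_seconds: int) -> datetime:
    """Convert a wall-clock time to UT, given the offset in force (seconds east of UT)."""
    return wall - timedelta(seconds=utoff_seconds)


# "Rule Egypt 1957 only - May 10 0:00 ..." names a wall-clock time.
rule_wall = datetime(1957, 5, 10, 0, 0, 0)

# Read against standard time (gmtoff=7200), matching the first dump.
as_standard = wall_to_ut(rule_wall, 7200)

# Read against DST (gmtoff=10800), matching the second dump.
as_dst = wall_to_ut(rule_wall, 10800)

print(as_standard)           # 1957-05-09 22:00:00
print(as_dst)                # 1957-05-09 21:00:00
print(as_standard - as_dst)  # 1:00:00
```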
A bug indeed. Our clue is the zic.c comment "Mark which rules to do in the current year" which zic is hellbent on doing even if they're duplicates.

A start on a fix is the (tab munged) change below; it flags the case reported by Jon Skeet, and does not produce any false reports of errors for the files in the time zone package. It should pick up glitches other than exactly duplicated rules.

An unmunged version of the patch is attached.

--ado

1.1 2959 lines
1.3 2966 lines
*** /tmp/,azic.c 2015-07-19 14:29:15.286318600 -0400
--- /tmp/,bzic.c 2015-07-19 14:29:15.391324600 -0400
***************
*** 2372,2377 ****
--- 2372,2384 ----
      jtime == max_time)
          continue;
  jtime = tadd(jtime, -offset);
+ if (jtime == ktime) {
+     eats(zp->z_filename, zp->z_linenum, rp->r_filename, rp->r_linenum);
+     warning(_("two rules for same instant"));
+     rp = &zp->z_rules[k];
+     eats(zp->z_filename, zp->z_linenum, rp->r_filename, rp->r_linenum);
+     error(_("two rules for same instant"));
+ }
  if (k < 0 || jtime < ktime) {
      k = j;
      ktime = jtime;

On Sun, Jul 19, 2015 at 1:36 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Jon Skeet wrote:
if anyone can think of any good justification why an implementation *should* behave like zic in this case, I'd be really interested to hear it.
I think you've identified a bug in zic. Perhaps Arthur can chime in?
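For input that has not reached zic yet, a much cruder text-level check can catch the exact-duplicate case from the start of the thread. The Python sketch below is not part of zic or of the patch; `find_duplicate_rules` is a hypothetical helper. It assumes the keyword is spelled out as `Rule`, and it relies on `shlex` to drop `#` comments and honour quoted fields such as `" DST"` (badly quoted input would raise `ValueError`). It cannot see distinct rules that merely resolve to the same instant.

```python
import shlex
from collections import defaultdict


def find_duplicate_rules(sources):
    """Return {rule fields: [(filename, line number), ...]} for repeated Rule lines.

    `sources` maps a filename to that file's text. Two lines count as the same
    rule when their fields match after comment removal and quote handling.
    """
    seen = defaultdict(list)
    for filename, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            fields = shlex.split(line, comments=True)
            if fields and fields[0] == "Rule":
                seen[tuple(fields)].append((filename, lineno))
    return {rule: places for rule, places in seen.items() if len(places) > 1}


# A cut-down testzone with the last Rule line uncommented.
testzone = '''Rule Egypt 1957 only - May 10 0:00 1:00 " DST"
Rule Egypt 1957 1958 - Oct 1 0:00 0 -
Rule Egypt 1958 only - May 1 0:00 1:00 " DST"
Rule Egypt 1957 only - May 10 0:00 1:00 " DST"
'''

for rule, places in find_duplicate_rules({"testzone": testzone}).items():
    print(rule, "->", places)
```

Passing several files in one `sources` dict also covers the case where the same rule appears in two different files.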
Arthur David Olson wrote:
An unmunged version of the patch is attached.
Thanks; that was fast! I found one minor glitch (ktime could be used before it's initialized) and installed the attached variant: this also reindents the new code to fit, and documents constraints on simultaneous transitions. Although the new diagnostics are a bit wordy, it's probably not worth spending a lot of time tweaking them for this unlikely error.
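The glitch is visible in the patch as posted: the `jtime == ktime` comparison runs before the `k < 0` test, so the first pass compares against a `ktime` that has not been set yet. The installed variant is not reproduced in this thread, so the Python sketch below is only one way to order the checks, with invented names (`earliest_rule`, `candidates`) and plain integers standing in for instants.

```python
def earliest_rule(candidates):
    """Pick the candidate with the smallest instant and report ties.

    `candidates` is a list of (label, instant) pairs. Returns (index of the
    earliest candidate or None, list of (earlier label, later label) ties).
    """
    k = None      # index of the best candidate so far; None means "none yet"
    ktime = None  # its instant; only meaningful once k is set
    ties = []
    for j, (label, jtime) in enumerate(candidates):
        # Check that a best candidate exists before comparing against ktime.
        if k is not None and jtime == ktime:
            ties.append((candidates[k][0], label))
        if k is None or jtime < ktime:
            k, ktime = j, jtime
    return k, ties


# Illustrative instants only; the duplicate shares the instant of the original.
rules = [("May 10 rule", 100), ("Oct 1 rule", 500), ("May 10 rule (duplicate)", 100)]
k, ties = earliest_rule(rules)
print(k, ties)
```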
The attached suppresses a warning that was (incorrectly) generated by GCC 4.9.2 due to the previous patch.
participants (3)
- Arthur David Olson
- Jon Skeet
- Paul Eggert