Date: Wed, 27 Sep 2006 17:38:45 -0700
From: Ken Pizzini <tz.@explicate.org>
Message-ID: <20060928003845.GA17660@866863.msa.explicate.org>

| I'll make an attempt at making the text clearer... but then again,
| since I understood the original text and you found it misleading,
| perhaps you'd like to take a stab at clarifying it?

I suspect the problem (like a lot of things that lead to ambiguities) is that it all depends upon your state of mind when you start (what you already believe is true).

If you have it in mind that the rules define periods during which a particular offset from UTC (and a particular abbreviation) applies, then you're likely to read the text in an entirely different way than if you start out believing that what is being defined is a set of points at which the offset from UTC (and/or the associated abbreviation) alters.

For anyone who thinks carefully about it, the first of those two is clearly not rational. For example, consider the following two lines (rules) from some version or other of the australasia file (this might not be current, it's just a version I had conveniently lying around):

    Rule AT 1991 1999 - Oct Sun>=1  2:00s 1:00 -
    Rule AT 1991 max  - Mar lastSun 2:00s 0    -

If you believe that "specifies a range of times during which an offset applies" is the correct interpretation, then the first of those rules says that from some Sunday in early Oct 1991 (the 6th, as it happened) until some Sunday in early Oct 1999, the offset from UTC (for Tasmania) should have been +11:00 (the base offset is +10:00). The second rule says that from some Sunday in late March 1991 (the 31st that year) into the unknown indefinite future, the offset from UTC is +10:00.

What that would have to mean is that all during 1992, 1993, ... there were two offsets defined to run concurrently. That would be absurd, so, proof by contradiction (reductio ad absurdum, or something like that), the original hypothesis must be incorrect.
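The Sunday dates quoted for the two rules above can be checked mechanically. What follows is only a rough Python sketch: it handles just the "Sun>=1" and "lastSun" forms used by those two rules, and the helper names (first_sunday_on_or_after, last_sunday) are made up for illustration, not taken from zic.

```python
import calendar
from datetime import date, timedelta

SUNDAY = 6  # date.weekday() numbers Monday as 0 and Sunday as 6


def first_sunday_on_or_after(year, month, day):
    """Resolve an ON field such as 'Sun>=1' to a calendar date."""
    d = date(year, month, day)
    return d + timedelta(days=(SUNDAY - d.weekday()) % 7)


def last_sunday(year, month):
    """Resolve an ON field of 'lastSun' to a calendar date."""
    last_day = calendar.monthrange(year, month)[1]
    d = date(year, month, last_day)
    return d - timedelta(days=(d.weekday() - SUNDAY) % 7)


# The two AT rules quoted above, for the first few years they cover.
for year in (1991, 1992, 1993):
    print("Mar lastSun:", last_sunday(year, 3),
          " Oct Sun>=1:", first_sunday_on_or_after(year, 10, 1))
```

For 1991 this gives 31 March and 6 October, matching the dates mentioned above.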
On the other hand, if you treat the rules as simply saying

    Mar 31 1991 (02:00s) change offset to 10:00
    Oct  6 1991 (02:00s) change offset to 11:00
    Mar 29 1992 (02:00s) change offset to 10:00
    Oct  4 1992 (02:00s) change offset to 11:00
    Mar 28 1993 (02:00s) change offset to 10:00
    Oct  3 1993 (02:00s) change offset to 11:00
    (etc)

that is, as a shorthand notation for writing all of that out (which would also be possible, of course), then it all fits perfectly well. We have the transitions, and the offset (and abbreviation) between any two transitions, the offset (and abbreviation) after the last transition, and even what applies before the first transition, are all trivial to obtain.

If the zic.8 text needs clarifying, perhaps what is needed is not any of the changes to the text that have been suggested, but to make it quite clear that a list of transitions is what is being specified, not a list of ranges of times (those of us who have "grown up" alongside the development of the database simply know this, but it is apparently not as clear to those who have started looking at it more recently).

On spaces separating fields, I suspect the answer is that it all works the same way as the (unix shell) read command - white space separates fields until we have as many fields as we need - after that all the rest of the input line (including anything which would otherwise be a separator) just gets included in the value of the final field.

So, to the unix shell

    echo a b c | read var

puts (aside from problems of using "read" from a pipe) "a b c" into var.

    echo a b c | read v1 v2

puts "a" into v1, and "b c" into v2, and

    echo a b c | read v1 v2 v3 v4

puts "a" into v1, "b" into v2, "c" into v3 and "" (empty) into v4.

In the database source format, the "until" is the final field (it would be the last name on the "read" command, if there were one), so if this parsing method is assumed, then the "spaces in until field are OK" all just works out...
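The same splitting rule can be sketched outside the shell. This is a minimal Python illustration of the read-style behaviour described above, not a reproduction of what zic's C code does: split_fields is a hypothetical helper, and the Zone line is invented for the example rather than taken from the database.

```python
def split_fields(line, nfields):
    """Split on white space into exactly nfields fields, shell-read style.

    The first nfields - 1 fields are delimited by white space; whatever
    is left over (embedded white space included) becomes the final field.
    Fields that are missing come back as empty strings.
    """
    parts = line.strip().split(None, nfields - 1)
    parts += [""] * (nfields - len(parts))
    return parts


# The three shell examples from the text.
print(split_fields("a b c", 1))  # ['a b c']
print(split_fields("a b c", 2))  # ['a', 'b c']
print(split_fields("a b c", 4))  # ['a', 'b', 'c', '']

# An invented Zone line: six fields, the last being UNTIL, spaces and all.
ZONE_LINE = "Zone Example/Place 10:00 AT EST 1999 Oct Sun>=1 2:00s"
print(split_fields(ZONE_LINE, 6)[-1])  # 1999 Oct Sun>=1 2:00s
```

Capping the number of splits is what lets the final field carry embedded spaces without any quoting.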
It is also certainly not really harder to parse: the parsing method simply finds the first N fields (delimited by white space) and leaves whatever is left over (if anything) as the final field - that's trivial to code.

Explaining it is not really difficult either - though perhaps a few extra words are needed to make it clear that the line has a fixed maximum number of fields, and that any excess data is all part of the final field (white space included).

kre