Re: Time zone confusion and implementation hints

July 6, 2010

    Date:        Mon, 5 Jul 2010 17:26:54 +0100
    From:        Tony Finch <dot@dotat.at>
    Message-ID:  <alpine.LSU.2.00.1007051718140.10878@hermes-2.csi.cam.ac.uk>

  | Because time_t is still usually signed 32 bits. (I don't know how the tz
  | code deals with 64 bit time_t.)

It deals with it just fine - correct handling of 64 bit time_t's was added
some time ago now - and several systems are using 64 bit time_t's with no
problems of note (current NetBSD is one of those - that is, when NetBSD 6 is
released it will be 64 bit time_t's throughout, and FreeBSD probably is as well).
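
For anyone wondering what the 32 bit limit means in practice, here is a small
sketch of the arithmetic. It assumes time_t counts seconds from
1970-01-01 00:00:00 UTC and ignores leap seconds; the helper name
time_t_range is made up for this illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def time_t_range(bits):
    """Smallest and largest values of a signed integer time_t of this width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1

# Signed 32 bit time_t: convert the two extremes to calendar dates.
lo32, hi32 = time_t_range(32)
first = EPOCH + timedelta(seconds=lo32)   # 1901-12-13 20:45:52 UTC
last = EPOCH + timedelta(seconds=hi32)    # 2038-01-19 03:14:07 UTC

# Signed 64 bit time_t: too wide for datetime, so only estimate the span
# in (average Gregorian) years on either side of the epoch.
lo64, hi64 = time_t_range(64)
years64 = hi64 / (365.2425 * 86400)       # on the order of 3e11 years

print(first.isoformat(), last.isoformat(), f"{years64:.3g}")
```

So a 32 bit time_t happens to cover roughly the same period in which the
data means anything at all, and a 64 bit one removes the limit for every
practical purpose.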

However ...

Yves Goerge <nospam.list@unclassified.de> said:
  |  Okay, if I just specify that my calendar won't work before 1900 and after
  | 2100, it's 2,4 kB. Per timezone. 

Before 1900 (approximately, it varies from timezone to timezone) all of
this is nonsense anyway, as there was no real standardised time - you
could perhaps go back another 100 years in some areas, but we don't have
much in the way of reliable data for anywhere in that period, and essentially
no incentive to go and collect it where there are any historical records
from which we might be able to get the data - it just isn't important
for any practical purpose.

Into the future, for any data past July 6, 2010, we're all just speculating.
That is, any answer is a guess.   For the near future (say, July 7, 2010)
we have a very high confidence level that our speculation will turn out to
be correct - the further into the future we go, the more that level drops.

Guessing time offsets more than about 3 or 4 years into the future in
many timezones seems to be a wildly dangerous thing to do if you're going
to claim any degree of confidence in your answer - that is, if you're
not clearly labelling it as a guess.

I am at least glad that Guy Harris' message is being treated seriously.
I have been continually amazed at the number of messages we see here from
people who want to work with the timezone data in some software or other,
and who then set out to attempt to write parsers for the source data files.

Other than as an academic exercise, frankly, that's insane.

The only purpose of the source files is to be maintained by the people on
this list as the most accurate representation we can manage of the current
(and past) known intentions of the various authorities who set the world's
time offsets.  The format of those files can change, has changed, and will
change again, whenever that is needed to better express what is in the
various policies.

For example, one thing that we're currently lacking, which hasn't yet
been enough of a problem for us to need to fix - but might be one day,
is any way of expressing dates that vary depending upon the various
variable religious holidays that exist - we have the "yearistype" method
of handling conditional evaluation (though I don't think that is generally
regarded as a particularly good solution), but nothing that calculates
the dates of the various events that have been known to affect summer time
transition events when they clash.   To the best of my knowledge, all of those
dates can be calculated with sufficient code, if and when we ever decide it
is needed - if we do that, it would probably mean another change of some form
to the timezone source data files.

When that happens, for sure, zic will deal with it.

The other representation is the zic output - that's simple, and in a format
that is essentially never going to change in any incompatible way, as that's
the format that software everywhere is reading to actually convert times
between UTC and local time (both directions) - and because its format already
allows for everything (being so simple).   Yes, it is verbose - but that's
OK, computers easily deal with volumes of data, and while this stuff is
growing, and will continue to grow, it is growing at a much slower rate
than RAM sizes and processing speeds (even I/O transfer rates).

Any code that is being written for the purpose of actually usefully translating
times between UTC and the various local timezones, and which you actually
expect to last longer than a year or so (and which isn't being written just
to prove that "yes, it is possible to write a tzdata parser in lisp/fortran/
apl/...") really should be working exclusively with the output from zic - if
the binary format is a problem, then a 10 minute conversion program could
convert the file into any other format that you'd prefer instead.
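
To give an idea of how little such a conversion program needs, here is a
rough sketch that reads the version 1 (32 bit) data block of a zic output
file, following the layout documented in tzfile(5), and returns the
transitions as plain tuples.  The function name read_tzif_v1 and the tuple
layout are my own choices for the illustration, not part of any existing
library, and the sketch ignores the leap second table, the std/wall and
UT/local indicators, and the 64 bit block that follows in version 2 files.

```python
import struct

def read_tzif_v1(data):
    """Return [(utc_seconds, utc_offset_seconds, is_dst, abbreviation), ...]
    for each transition in the version 1 block of a compiled zic file."""
    if data[:4] != b"TZif":
        raise ValueError("not a zic output (TZif) file")

    # Header: magic (4), version (1), reserved (15), then six big-endian
    # 32 bit counts.
    isutcnt, isstdcnt, leapcnt, timecnt, typecnt, charcnt = struct.unpack(
        ">6l", data[20:44])
    pos = 44

    # Transition times, as seconds since the epoch (UTC).
    times = struct.unpack(f">{timecnt}l", data[pos:pos + 4 * timecnt])
    pos += 4 * timecnt

    # For each transition, an index into the local time type table.
    type_idx = data[pos:pos + timecnt]
    pos += timecnt

    # Local time types: UTC offset (int32), is_dst (uint8), abbrev index (uint8).
    types = [struct.unpack(">lBB", data[pos + 6 * i:pos + 6 * i + 6])
             for i in range(typecnt)]
    pos += 6 * typecnt

    # NUL-terminated abbreviation strings, packed one after another.
    abbrevs = data[pos:pos + charcnt]

    def abbreviation(start):
        end = abbrevs.index(b"\0", start)
        return abbrevs[start:end].decode("ascii")

    return [(t, types[i][0], bool(types[i][1]), abbreviation(types[i][2]))
            for t, i in zip(times, type_idx)]
```

Feed it the bytes of a compiled zone file (on many Unix systems those live
under /usr/share/zoneinfo, though that path is an assumption about your
installation) and write the tuples out as CSV, JSON, or whatever else your
software would rather read.  Anything that needs times outside the 32 bit
range would read the second, 64 bit, block in the same way.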

Yves - this rant has not been aimed at you particularly, you're at least
considering Guy's message, but at all the others out there who keep trying to
do the same thing, and then attempt to argue against changes to the tzdata
source format, because their particular parser wouldn't be able to handle
that.    Tough!

kre

Robert Elz