John Cowan <cowan@drv.cbc.com>
Nathan Myers wrote:
a "double", relative to Jan 1, 2001, ... would give microsecond accuracy for now, with precision decreasing as we pass and recede from that date; or we could use a 64-bit integer count of microseconds from practically any choice of epochs. Each representation has its advantages.
First, note that the term "epoch" refers to the moment of time represented as zero, not the range of expressible times. The Unix epoch is 1970 Jan 1 00:00:00 GMT.
I believe I used the term "epoch" correctly. Of course (as I said) the choice of epoch is of no interest given a fixed-precision time value. A floating-point time has greatest precision near the epoch. With a 48 bit mantissa, precision would still be better than a millisecond for dates throughout recorded history.
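The precision claim can be checked numerically. The following is a minimal sketch, assuming an IEEE 754 double (53-bit significand, somewhat more than the 48 bits assumed above) holding seconds relative to the epoch; the helper name `spacing_seconds` and the mean-year constant are illustrative choices, not part of any proposal in this thread. It needs Python 3.9 or later for `math.ulp`.

```python
import math

# Mean Gregorian year in seconds (365.2425 days); an approximation that is
# good enough for an order-of-magnitude check.
SECONDS_PER_YEAR = 365.2425 * 86_400


def spacing_seconds(years_from_epoch: float) -> float:
    """Gap between adjacent doubles at this distance from the epoch.

    The time value is assumed to be a double counting seconds from the
    epoch, so the gap is the best resolution available at that date.
    """
    seconds = abs(years_from_epoch) * SECONDS_PER_YEAR
    return math.ulp(seconds)


if __name__ == "__main__":
    for years in (1, 25, 100, 5000):
        print(f"{years:>5} years from epoch: {spacing_seconds(years):.3e} s")
```

Under these assumptions the spacing is about 0.12 microseconds at 25 years from the epoch and about 31 microseconds at 5000 years, which is consistent with sub-millisecond precision over recorded history.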
... I favor the Java convention: a 64-bit signed integer representing milliseconds, with the same epoch as Unix. That provides sufficient resolution for normal purposes (anything that *requires* microsecond resolution probably requires microcode, embedded programming, or the like), and has a range clear back to the Carboniferous Period (~300 My B.P.)
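The range figure is easy to verify. This is a rough sketch of the arithmetic for a signed 64-bit tick count; the function name `span_years` and the mean-year constant are assumptions made for illustration.

```python
# Mean Gregorian year in seconds (365.2425 days).
SECONDS_PER_YEAR = 365.2425 * 86_400

# Largest value of a signed 64-bit integer.
MAX_INT64 = 2**63 - 1


def span_years(ticks_per_second: int) -> float:
    """Years reachable on either side of the epoch with a signed 64-bit count."""
    return MAX_INT64 / ticks_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"milliseconds: {span_years(1_000):.3e} years each way")
    print(f"microseconds: {span_years(1_000_000):.3e} years each way")
```

With millisecond ticks the reach is roughly 292 million years in each direction; with microsecond ticks it shrinks to roughly 292 thousand years, which is still far more than recorded history.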
Some of us do write microcode and embedded programs, and code in such environments tends to make more use of time quantities than is conventional (hence the term "real-time"). Still, the current Java convention seems tolerable for most uses.
The second familiar problem is the crossover periods for time changes. Converting from microseconds-past-epoch to local time is well-defined; but conversion back demands accommodation to [1] local time that doesn't exist (during "spring forward" gaps); and [2] local time that is ambiguous (before or after "fall back"). Since there isn't any really satisfactory solution to these, they just need to be visible (and unavoidable) in the interface.
The ADO package uses the is_dst field to resolve the ambiguity. Except during "fall back" overlaps, this field isn't required when converting from broken-out local time to absolute time, but it is used to resolve the ambiguity in that case.
The is_dst field fails to solve the problem. The problem is that the caller of the conversion function is generally not equipped to "resolve the ambiguity", or even to determine whether there is an ambiguity to resolve. The overwhelming majority of programs do not (and probably cannot) correctly or meaningfully use the is_dst field. The conversion function itself can tell whether there might be an ambiguity, but has no way to communicate it back to the caller, or to identify the set of alternatives. I believe that the conversion function must return a value that represents 0, 1, or 2 time values, so that the caller is equipped to (and forced to) deal with the ambiguity after all the necessary information is available. We also need a time representation equivalent to "NaN" that permits us to talk about times shortly before the epoch.
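One way to picture a conversion that yields 0, 1, or 2 time values is sketched below. It is only an illustration under stated assumptions: the transition table is hand-written for US Eastern time in 2021 rather than read from the TZ database, and the names `TRANSITIONS`, `offset_at`, and `local_to_utc` are hypothetical, not an existing interface.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Hypothetical zone table: (UTC instant the offset takes effect, UTC offset).
# Hand-coded for US Eastern time in 2021; a real implementation would read
# these rules from the TZ database.
TRANSITIONS = [
    (datetime.min, timedelta(hours=-5)),                  # EST
    (datetime(2021, 3, 14, 7, 0), timedelta(hours=-4)),   # spring forward, EDT
    (datetime(2021, 11, 7, 6, 0), timedelta(hours=-5)),   # fall back, EST
]


def offset_at(utc: datetime) -> timedelta:
    """UTC offset in effect at the given (naive) UTC instant."""
    starts = [start for start, _ in TRANSITIONS]
    return TRANSITIONS[bisect_right(starts, utc) - 1][1]


def local_to_utc(local: datetime) -> list:
    """Return every UTC instant that displays as `local` in this zone.

    The list is empty inside a "spring forward" gap, has one entry for an
    ordinary local time, and has two entries inside a "fall back" overlap.
    """
    candidates = []
    for offset in sorted({off for _, off in TRANSITIONS}):
        utc = local - offset
        # Keep the candidate only if this offset really applies at that instant.
        if offset_at(utc) == offset:
            candidates.append(utc)
    return sorted(candidates)


if __name__ == "__main__":
    print(local_to_utc(datetime(2021, 3, 14, 2, 30)))   # gap: no instants
    print(local_to_utc(datetime(2021, 7, 1, 12, 0)))    # ordinary: one instant
    print(local_to_utc(datetime(2021, 11, 7, 1, 30)))   # overlap: two instants
```

Because the result is a list, the caller cannot obtain a time value without first looking at how many there are, which is the property argued for above; a flag supplied before the call gives no such guarantee.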
The third problem is that the TZ database on line is available only in a form that would be very clumsy to handle in an on-line transaction. That is, if it changes at all (which happens frequently), there is no way to check for rule changes in a particular locality without downloading the whole thing and unpacking it, which would introduce an unacceptable delay in an interactive application.
Why so? A server can load single "zoneinfo" binary files as needed from a conventional HTTP server: several could be provided around the Internet, and intranets could have one or two available also.
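Loading single zone files on demand might look roughly like the sketch below. Everything specific in it is an assumption: the base URL is a placeholder, and `ZoneCache`, `http_fetch`, and the `refresh` flag are hypothetical names. The only format detail relied on is that compiled zoneinfo files begin with the magic bytes `TZif`. A stale cached copy is returned when the server cannot be reached, as one possible form of failure tolerance.

```python
import urllib.request

# Placeholder server; no such service is implied to exist.
BASE_URL = "http://tz.example.org/zoneinfo/"


def http_fetch(zone: str) -> bytes:
    """Download one compiled zone file, e.g. 'America/New_York'."""
    with urllib.request.urlopen(BASE_URL + zone, timeout=5) as resp:
        return resp.read()


class ZoneCache:
    """Fetch zone files one at a time and keep them in memory."""

    def __init__(self, fetch=http_fetch):
        self._fetch = fetch      # injectable so the cache can be tested offline
        self._cache = {}

    def get(self, zone: str, refresh: bool = False) -> bytes:
        if zone in self._cache and not refresh:
            return self._cache[zone]
        try:
            data = self._fetch(zone)
        except OSError:
            # urllib's URLError and HTTPError are OSError subclasses.
            # Fall back to a stale copy rather than failing outright.
            if zone in self._cache:
                return self._cache[zone]
            raise
        if not data.startswith(b"TZif"):
            raise ValueError(f"{zone}: not a compiled zoneinfo file")
        self._cache[zone] = data
        return data
```

Only the zones a server actually uses are ever transferred, and a rule check for one locality costs one small request instead of a download of the whole database.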
This seems like the beginning of a reasonable approach. Better failure tolerance than "Error 404" seems needed.

Nathan Myers
ncm@cantrip.org