Date: Sun, 06 Mar 2005 23:18:26 -0800
From: Paul Eggert <eggert@CS.UCLA.EDU>
Message-ID: <87mztfudod.fsf@penguin.cs.ucla.edu>

| It is entirely reasonable to run a program for 32-bit time_t, since we
| need to generate predictions for future conversions only out to the
| year 2038. It isn't necessarily reasonable to do it for 64-bit
| time_t, since (in theory, anyway) we'd need to generate predictions
| for future conversions out to the year 292,277,026,596 or so, assuming
| I've done the arithmetic right.

Since there's no possible rational reason for pretending to know what
the DST rules will be like in the year 3000 (or even 2100), attempting
to generate DST transitions that far off into the future is just absurd.

| For example, suppose we arbitrarily cut off 100 years into the future.
| Do we need to generate new tables every year, as the cutoff time
| advances? This may sound like a trivial issue but in practice trivial
| issues like these build up.

Every year, probably not, but from time to time, probably. We do that
anyway - simply having a new set of data generated for each new OS
version distribution is probably going to stay safe. I mean, how many
people do you still expect to be running Windows XT (or NetBSD 2, or
Solaris 10) 50 years from now? If the data remains stable (if the rules
change, obviously it needs to be regenerated anyway) then people will be
getting new code with updated data in it frequently enough for us not to
worry about a 100 year cutoff, or about the regeneration that will be
needed to make sure that the end point is far enough away not to bother
anyone.

| But unfortunately it is required for POSIX compliance, at least if
| tm_year is representable as an int (true for years up to about 2**31
| on most hosts) and if TZ uses the POSIX format.

What exactly is required for POSIX conformance? Do they require that we
get DST conversions correct (for everywhere on the planet) for all years
that are representable as an int?
If they do, screw posix (they're asking for the impossible) - but
frankly, I doubt it.

Note that the database we're dealing with is a list of DST conversions.
DST rules are all that matters here. Everything else is just an
algorithmic conversion - I'm not suggesting that we don't pretend to
convert the time_t with the value (~0 - 100) into a struct tm (assuming
it fits), but I also don't care if the result we get from that turns out
(after we get to that time, and know what human representation it
actually has) to be an hour or two (or even a day or two) different from
what we guessed and converted the time_t into.

| > Backwards, 1970 is far enough to be accurate.
|
| Here I think you're being a bit too modest in aim. The existing code
| already works for dates back to 1901 (in 32-bit time_t),

Does it really? For all timezones, for all DST rules?

| I'd say we might as well go back at least to the introduction of
| standard time (circa 1850), for time_t wide enough to support that. I
| don't see any fundamental technical objection to going back that far.

I doubt that we get to decide what a time_t format should be - in fact,
some of the recent changes to the code are there precisely because we
don't get to make that decision (if we did, we wouldn't be bothering
with that floating point nonsense).

The range of a time_t is something that the OS dumps upon us. The job
of this code is to convert that into a struct tm. By all means,
generate a struct tm for every time_t supported by the implementation
that can be represented in a struct tm (another data type that we don't
get to define), just don't pretend that we're necessarily going to have
the DST rules 100% correct for times before about 1970, or for times
further into the future than it is reasonable to expect the DST rules
to remain stable.
I mean, worrying about future DST in Israel is just plain crazy -
historical evidence would suggest that there will be a change of
government there sometime in the next 10 years, and the new one will
decide on a whole different set of rules.

About the only thing that I'd suggest is that we make it clear to people
who deal with dates that they should choose an appropriate data type for
the actual purpose they need to represent the data, and that time_t is
most certainly not appropriate for everything.

Personally (for example), I think it would be just plain crazy to
express people's date of birth as a time_t (just which second, or even
sub-second for some proposed time_t extensions, do you record as the
time of birth anyway? When the head appears, when the big toe is
finally extracted, when the umbilical cord is cut? And who cares
anyway?) And which records that retain DoB information also bother to
record the timezone that applied at the place of birth, so the correct
DST conversions can be done on it?

One of the true evils of computing is the temptation to add meaningless
precision to all kinds of data, just because there is space available to
allow it to be added.

kre
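
A rough check of the time_t ranges quoted in the message (1901 and 2038
for 32-bit, roughly the year 292,277,026,596 for 64-bit) can be sketched
as below. This is only an illustration, not anything from the tz code:
it assumes a signed time_t counting seconds since 1970-01-01 UTC with no
leap seconds, it approximates the 64-bit limit using the mean Gregorian
year, and the function names are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: time_t is a signed count of seconds since this epoch.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Mean Gregorian year: 365.2425 days * 86400 seconds per day.
SECONDS_PER_GREGORIAN_YEAR = 31_556_952


def time_t_limits_32():
    """Return the earliest and latest UTC instants of a signed 32-bit time_t."""
    earliest = EPOCH + timedelta(seconds=-(2**31))
    latest = EPOCH + timedelta(seconds=2**31 - 1)
    return earliest, latest


def approx_last_year_64():
    """Estimate the calendar year in which a signed 64-bit time_t runs out.

    Python's datetime stops at year 9999, so this uses plain integer
    arithmetic with the mean year length instead of a real calendar.
    """
    return 1970 + (2**63 - 1) // SECONDS_PER_GREGORIAN_YEAR


if __name__ == "__main__":
    earliest, latest = time_t_limits_32()
    print("32-bit earliest:", earliest.isoformat())
    print("32-bit latest:  ", latest.isoformat())
    print("64-bit last year (approx):", approx_last_year_64())
```

None of this says anything about DST rules, of course; it only bounds
the range of instants that the conversion code may be asked about.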