Stephen Colebourne <scolebourne@joda.org> writes:
Guy/Paul/Russ, I thought I'd explained pretty clearly why this recent change is problematic. All the replies miss the point.
I got your point just fine. I just think you're wrong. It's quite possible for me to have a complete understanding of your position and continue to disagree with you, as is the case here.
The tzdb is not some kind of theoretical project; it is used directly as input data by millions of developers. Those developers have never previously had to cross-reference tzdb data with any other data (i.e., when somewhere was inhabited) to get a reasonable answer to the question "what is local time in 1930" (reasonable, not accurate).
And this has not changed. "What is local time in 1930" still returns a reasonable but not accurate response for McMurdo, insofar as such a thing exists given that the question is undefined. It's just a different reasonable but not accurate answer than it used to return, similar to how finalize() now runs at a different but still reasonable point in a Java program with a new version of the JVM.
Describing the input as malformed is unhelpful to the debate, because a developer just using the data has no idea from the tzdb that this input *is* malformed.
I completely agree that this would be a nice thing to fix. That was my point about preferring to return errors for undefined inputs. However, it's very difficult to do this, for exactly the same sorts of reasons that finalize() is still part of the Java language despite the fact that almost every use of it is wrong. This software and database have existed for many years, and their behavior has always been to return a reasonable but inaccurate response for dates in the past prior to standardized time. If we had a time machine to go back and change the original behavior to cause localtime() to return an error for such inputs, that would probably be, overall, a better situation. However, we don't. The current reality is that innumerable programs exist in the wild that will find localtime() failing to be highly surprising. Yes, POSIX and other standards say that it *can* fail, but in practice it *doesn't* fail, which means that a lot of software does not handle the failure case at all. Making this change would probably involve creating a new interface (not only in C but in the other languages that have relied on the historic behavior of the API) that can now return errors for undefined questions, and then a lot of data collection about where the boundaries of undefined should be. It's quite a large project. I do think the world would be a better place if someone completed that project, but I also don't think that it's that horribly important. Computing has survived for many years with the current behavior.
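To make the unhandled failure case concrete, here is a minimal sketch in C of a caller that does check the result of localtime(). The helper name format_local_time and the output format are hypothetical and chosen only for illustration; they are not part of tzcode or of any existing interface. It only demonstrates checking the NULL return that the standards already permit, not the new interface discussed above.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper, for illustration only: format t as local time
 * into buf. Returns 0 on success and -1 if localtime() reports failure
 * or the formatted result does not fit in buflen bytes.
 */
int format_local_time(time_t t, char *buf, size_t buflen)
{
    /* localtime() is allowed to return NULL, for example when the
     * resulting year cannot be represented in tm_year. Note that it
     * returns a pointer to static storage and is not thread-safe. */
    struct tm *tm = localtime(&t);
    if (tm == NULL) {
        return -1;
    }

    /* strftime() returns 0 when the result does not fit in buflen. */
    if (strftime(buf, buflen, "%Y-%m-%d %H:%M:%S %Z", tm) == 0) {
        return -1;
    }
    return 0;
}
```

A great deal of existing code passes the result of localtime() straight to strftime() or dereferences it without the NULL check, which is why a database change that made localtime() start failing for early dates would break it.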
I don't just want *an* answer, and I'm not obsessed by the *right* answer; my problem here is a *wrong* answer (any clock change, including DST, before 1956 is wrong).
I don't see why you think that a DST shift somehow crosses some line into making the database responses unreasonable.
Such wrong data implies human activity when there was none.
I don't see any such implication. It's a simple backward-projection of current rules into the past. If anything, it's your proposal that implies human activity and makes the clocks less accurate, since it implies that at some point someone made a conscious decision to introduce DST to a location that previously had non-DST local time. But that's not what happened. Instead, people brought the existing DST rules (and time zone) of their staging base with them, and if you had asked those original inhabitants about times prior to their arrival, they would have projected those rules backwards, just like the database now does.

-- 
Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>