    Date:        Sun, 14 Jan 2024 11:41:10 -0800
    From:        Paul Eggert <eggert@cs.ucla.edu>
    Message-ID:  <0573ccfb-4c07-4886-916c-f521180e949a@cs.ucla.edu>

Note: in what follows, lines starting " | " are quotes of text written
by Paul in the message to which I'm replying, except for those lines
which start " | >", which are lines Paul quoted from a message Steve
sent (see the header for full names & e-mail addresses).

  | Although that's one interpretation of the standard, it's not the only
  | one.

It is, however, approximately the correct one.

  | As I've been saying, although the POSIX and C standards can easily
  | be misinterpreted,

Anything can be misinterpreted, that means nothing.

  | they have a better interpretation which says that on
  | a system with tm_gmtoff and tm_zone strftime need not use mktime or
  | equivalent, not even for %s.

Nothing says that anything needs to use mktime().

What the spec for strftime("%s") says is that the result, in a POSIX
system, must represent the same value as mktime() on the same struct tm
would produce. That is, the value that should be produced is specified.
That's all. Allowing the implementation to produce any answer it likes
would make it kind of difficult for anyone to use reliably, don't you
think?

The mechanism the implementation uses to produce that specified result
is entirely up to it, provided it uses only the data that users are told
they need to provide (otherwise the implementation risks using garbage).

  | > The struct tm handed to strftime must be one returned by
  | > an immediately preceding call to localtime or gmtime.
  |
  | This is good advice,

It actually isn't. It isn't required at all. All that is required
is that the fields required for the conversions specified in the format
string be correctly initialised to the desired values.
Certainly calling one of the functions which fills in a struct tm will
do that, and that's a very common usage, but it isn't the only way
(using the results from parsedate() on systems that have it is another,
as is simply doing a scanf() on a date/time string, perhaps one
previously created by strftime()). There are many other ways, including
simply reading the struct tm from a file.

  | and (at least in a "should" form) it should be in POSIX.

It certainly should not.

  | While I was at it I noticed that the man page doesn't say strftime
  | behaves as if tzset were called (even though this is no longer needed).

But in general, it doesn't, only for the 3 conversions that need it.

In any case "behaves as if tzset() were called" is more or less (not
fully) irrelevant to the call of strftime() itself. What is crucial
about it is that calls to tzset() can affect later function results,
and the lifetimes of data returned from earlier calls - so it is
important to know when that might happen.

  | Yes, but the standards give leeway as to how to "make this work" for %z
  | and %Z, and this leeway includes using members like tm_gmtoff and
  | tm_zone that the C standard does not specify.

Certainly, POSIX has added stuff which the C standard does not require
to exist - C is trying to be able to run in more environments than just
POSIX ones, which necessarily affects just how much it can specify when
dealing with interfaces to external systems (like time).

E.g.: in C, a time_t is *not* necessarily a count of seconds since some
epoch, and simply printing a time_t value and expecting that to be
seconds since the epoch, in a portable C application, is incorrect.
POSIX specifies it as an integer count of seconds since
1970-01-01T00:00:00Z (at exactly 86400 seconds per day, every day,
always). C does not.
A C time_t might be a count of milliseconds, of microseconds, or of
2-second units, or BCD encoded, or almost anything (though I think
POSIX, at least, now requires it to be an integer type - I believe it
was once allowed to be a float).

  | > (Which brings me back to my conclusion that %s
  | > shouldn't exist, because it's impossible to implement correctly.

Nonsense. It is trivial to implement correctly. Perhaps you mean
that the specification does not achieve what you want it to produce -
that's a different issue entirely. That is, your "correctly" means
"what I want" rather than "as specified". And you're certainly right
that would be impossible, as what you want, and what I want, and what
someone else wants might all be different - the implementation needs to
pick one of them (or add more interface to select) - it cannot simply
guess which one the current user expects to happen, and implement that.
That is impossible.

Lots of functions don't do what I'd like them to do. Sad, but true.
Live with it.

  | It's impossible only if one uses a too-strict interpretation of the
  | standards.

I have no idea what that means. It isn't impossible no matter how
strictly the standard is interpreted (which should always be "very").

  | Let's not do that, as it would make our implementations
  | worse, our users more confused, and our software buggier.

All that is needed is to make it clear just what the %s value
represents. It is *not* the time_t value that produced this struct tm -
it cannot be, as no such thing need exist. It is the time_t value which
localtime() would convert into the same values as are in the struct tm
given to strftime() (for the fields that strftime() uses, or might).
Note, only localtime() for this, never gmtime() or anything else.
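As a concrete illustration of that relationship, here is a minimal
sketch in C. It assumes a system whose strftime() supports %s (ISO C
does not specify it), and percent_s_matches_mktime() is a hypothetical
helper name, not part of any library. It checks only that the %s output
equals what mktime() returns for a copy of the same struct tm, in the
current TZ.

```c
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper: returns 1 if strftime("%s") on *tm_in agrees
 * with mktime() applied to a copy of *tm_in, 0 if they differ, and
 * -1 if either call fails.  A (time_t)-1 result from mktime() is
 * treated as failure here, although it can also be a valid time.
 */
int percent_s_matches_mktime(const struct tm *tm_in)
{
    struct tm copy = *tm_in;        /* mktime() may normalise its argument */
    time_t expected = mktime(&copy);
    char buf[32];

    if (expected == (time_t)-1)
        return -1;

    /* %s is assumed to be supported by this strftime() */
    if (strftime(buf, sizeof buf, "%s", tm_in) == 0)
        return -1;

    return strtoll(buf, NULL, 10) == (long long)expected;
}
```

For a struct tm obtained from localtime(&t) this should return 1 on an
implementation that behaves as described above. A hand-filled struct tm
is where an implementation that consults tm_gmtoff could differ.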
  | > You might think that the sequence
  | >
  | >     struct tm *tm = localtime(&t);
  | >     strftime(buf, sizeof buf, "%s", tm);
  | >
  | > is fundamentally guaranteed to place a decimal representation
  | > of t into buf, where "fundamentally" implies that it just
  | > *has* to work, even in the face of serious bugs in other,

Come on, be serious. Nothing is ever guaranteed to work in the face of
serious bugs. If there's a serious bug in cc, you might not even be
able to compile the code to test that (for example). If the startup
code (what used to be crt0 but I think that's been replaced by something
different - never mind) has a serious bug, your code might never start
running. If ... (I could go on forever).

  | > unrelated parts of the time-conversion logic. But no, this
  | > sequence is in fact utterly vulnerable to bugs in other
  | > parts of the time-conversion logic,

Everything is vulnerable to bugs in all kinds of things. The whole
of tzcode assumes that read(2) works, so that the TZif file can be read
to get the information it contains, but if there were a bug in read(2)
such that every second byte was complemented, or something else weird
like that, nothing would work. Do you worry about that, and abandon
all uses of tzcode because of it? I certainly don't.

We cannot write specifications on the assumption that other things will
be broken, or we could not expect to rely upon anything at all.
Instead, we assume that everything works, and write code based upon that
assumption, and then if something doesn't behave as expected, we first
double check that our expectation is correct (that is, don't simply
assume that because it looks as if it should do X, that X is what it
must do - verify that the specification says that), and then if that's
true, and the implementation isn't doing what it should, we file a bug
report and get the thing fixed.

  | > because it is necessarily
  | > equivalent to the sequence
  | >
  | >     struct tm *tm = localtime(&t);
  | >     time_t t2 = mktime(tm);

Yes.
  | > which sets t2 == t only in the presence of a perfectly-
  | > implemented mktime,

Of course, and a perfectly implemented localtime(), and a perfectly
implemented compiler, and correctly functioning hardware, and ...

Incidentally, a bug free localtime() is much harder to achieve than
a bug free mktime(), as mktime() can easily be implemented simply by
making calls to localtime() and comparing the results with the input
struct tm, until the input time_t to localtime() which produces the
expected results is found. Perhaps not all that efficient, but very
easy, and if localtime() is correct, then so will be mktime().

mktime() needs to normalise the values in the struct tm first, or they'd
never compare equal to localtime() results of course - strftime()
doesn't need to do that, as its results are unspecified if any of the
relevant struct tm values are out of their specified ranges.

  | > and also given certain other constraints,
  | > such as that TZ has not changed.

Yes - mktime() uses the current TZ specified local time to do its
conversion, just as localtime() does. You might as well say that

	struct tm *tm1 = localtime(&t);
	struct tm *tm2 = localtime(&t);

isn't guaranteed to produce the same values in *tm1 and *tm2, as it
depends upon a perfectly implemented localtime() and that TZ isn't
altered between the two calls, and ... (and that t doesn't change in
the interim).

There's nothing specific to mktime() or strftime("%s") which makes
things any different in this area.

  | Assuming that localtime and strftime both succeed (localtime returns
  | non-null and strftime's output fits), then a warning stated this baldly
  | would be incorrect for current tzcode as its strftime %s is indeed the
  | inverse of localtime.

As it should be. Exactly that, and nothing else. Ever.
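The earlier remark that mktime() can be built from nothing but calls to
localtime() can be sketched roughly as follows. This is only an
illustration under stated assumptions, not how any particular
implementation does it: search_mktime() and tm_cmp() are hypothetical
names, the input is assumed to be already normalised, tm_isdst is
ignored, the search is limited to 1970 through early 2038, and a plain
binary search like this can miss local times close to a backward clock
jump (such as the end of summer time).

```c
#include <time.h>

/* Compare broken-down times field by field, most significant first.
   Assumes all fields are within their normal ranges. */
static int tm_cmp(const struct tm *a, const struct tm *b)
{
    int d;

    if ((d = a->tm_year - b->tm_year) != 0) return d;
    if ((d = a->tm_mon  - b->tm_mon)  != 0) return d;
    if ((d = a->tm_mday - b->tm_mday) != 0) return d;
    if ((d = a->tm_hour - b->tm_hour) != 0) return d;
    if ((d = a->tm_min  - b->tm_min)  != 0) return d;
    return a->tm_sec - b->tm_sec;
}

/* Hypothetical mktime() substitute: binary search for the time_t that
   localtime() converts into *want.  Returns (time_t)-1 if none found. */
time_t search_mktime(const struct tm *want)
{
    long long lo = 0, hi = 0x7fffffffLL;   /* assumed search range */

    while (lo <= hi) {
        long long mid = lo + (hi - lo) / 2;
        time_t t = (time_t)mid;
        struct tm *got = localtime(&t);
        int c;

        if (got == NULL)
            return (time_t)-1;
        c = tm_cmp(got, want);
        if (c == 0)
            return t;
        if (c < 0)
            lo = mid + 1;       /* localtime(mid) is earlier than wanted */
        else
            hi = mid - 1;       /* localtime(mid) is later than wanted */
    }
    return (time_t)-1;
}
```

If localtime() is correct, the value found here is one that localtime()
maps back to the requested fields, which is the property the %s
conversion is described as having.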
In this regard, note that localtime() uses the TZ timezone, not
anything different, so what you're saying is that strftime("%s") uses
the TZ timezone, and never anything else (whatever value might happen
to be in the tm_gmtoff field of the struct tm passed to it).

  | > Please rely on %s only if you're the implementor of
  | > date(1) or the equivalent.

Nonsense. Further, the implementor of date(1) doesn't care about %s at
all; the '+format' operand is simply passed directly to strftime()
without being examined at all (and then the leading '+' in the resulting
string is removed - there are reasons for doing it that way rather than
removing the '+' first).

  | It's true that strftime %s has problems on other platforms,

What platforms have issues? That is, of ones which actually support %s
of course. (Though it sounds as if the current unreleased but patched
tzcode might perhaps be one of them.)

  | so a portability warning is appropriate for tzcode strftime's man
  | page.

It would be better to file bugs against the broken ones. This isn't
a case (like say "echo") where there are two competing specifications,
and people simply will not agree on which is correct, so we just tell
everyone to avoid it for safety.

  | NetBSD's strftime_z does that. But it's not needed in current tzcode,
  | which addresses the problem in a simpler way.

There is no problem to address. It is just that %s is not designed to
do what some people apparently want it to do (which is, in general, not
really all that useful).

If there is a real need for something different, the way to deal with
that is to suggest to the implementors that some other conversion be
added (or a modifier applied to the %s conversion perhaps) to achieve
different results - not to arbitrarily change the specification of %s
and by so doing break code which is justifiably relying upon it working
as it is specified to work.

kre