I have a question about the strftime function. This is not related to time zones per se, but there are a lot of date/time programming experts here, and of course there's a reference implementation of strftime in tzcode. [Full disclosure: I am posting the same question to StackOverflow: https://stackoverflow.com/questions/51145727/ .]

Most definitions of the %H, %M, and %S specifiers (including those in the ANSI/ISO C Standards) contain language like

    %H is replaced by the hour (24-hour clock) as a decimal number (00-23).

Now, that little notation "(00-23)" at the end tells you the range of numbers you're going to see *if the struct tm you're printing was just created by localtime or gmtime*.

My question concerns the presence or absence of a requirement that the struct tm being referenced by strftime *must* have been one just generated by localtime or gmtime. I don't think there's such a requirement, but if there isn't, those notations "(00-23)" and "(00-59)" don't really have any normative implications for strftime -- they're more of a misplaced requirement on the values generated by localtime and gmtime.

But if strftime doesn't restrict the ranges of the various struct tm fields to their "normal" values, it makes me wonder how much latitude a push-the-boundaries programmer actually has when calling strftime. Are the values as unlimited as they are when calling, say, mktime?

To make the question more concrete, it came to me when I found myself writing this code:

    time_t dt = t2 - t1;
    struct tm *tmp = gmtime(&dt);
    if(dt >= 86400)    /* fold whole days into the hour count */
        tmp->tm_hour += dt / 86400 * 24;
    strftime(etbuf, sizeof(etbuf), "elapsed: %H:%M:%S", tmp);

That is, I'm subtracting two times, and letting gmtime and strftime convert the difference to HH:MM:SS format for me. But if the time delta was a day or more, I'm not printing it as days; I'm just lumping it in with the hours (which, it's true, might end up being a very large number if there were months or years between t1 and t2).
So, language lawyer question: Is this legal and portable? When calling strftime, are tm_hour, tm_min, and the rest limited to their normal ranges, or not? (Yes, I've tried it, and nonnormalized values work -- not at all surprisingly -- as expected under the popular implementations, but that's not the question.)
    Date:        Mon, 02 Jul 2018 22:13:51 -0400
    From:        scs@eskimo.com (Steve Summit)
    Message-ID:  <2018Jul02.2213.scs.0001@quinine.home>

  | But if strftime doesn't restrict the ranges of the various struct
  | tm fields to their "normal" values, it makes me wonder how much
  | latitude a push-the-boundaries programmer actually has when
  | calling strftime. Are the values as unlimited as they are when
  | calling, say, mktime?

What POSIX says is:

    If any of the specified values are outside the normal range, the characters stored are unspecified.

which means that if you provide a value for tm_hour outside the 0..23 range, what happens is anyone's guess (and you cannot complain.)

kre
Robert Elz said:
  | But if strftime doesn't restrict the ranges of the various struct
  | tm fields to their "normal" values, it makes me wonder how much
  | latitude a push-the-boundaries programmer actually has when
  | calling strftime. Are the values as unlimited as they are when
  | calling, say, mktime?

What POSIX says is:

    If any of the specified values are outside the normal range, the characters stored are unspecified.

which means that if you provide a value for tm_hour outside the 0..23 range, what happens is anyone's guess (and you cannot complain.)
Not quite. There's an important difference between "unspecified" and "undefined".

In this case, the resulting string could be anything but it must still be a string and the call can't break anything else. If it had been "undefined", then the call - or even the *existence* of the call in your code - can cause the program to behave in any way the implementation likes, including - for example - wiping or corrupting the entire filesystem.

-- 
Clive D.W. Feather          | If you lie to the compiler,
Email: clive@davros.org     | it will get its revenge.
Web: http://www.davros.org  |   - Henry Spencer
Mobile: +44 7973 377646
Clive D.W. Feather wrote:
In this case, the resulting string could be anything but it must still be a string and the call can't break anything else.
Yes, and tzcode substitutes "?" for some values out of range, e.g., month names when tm_mon < 0. This behavior is common in other implementations. Alternatively, strftime could simply return 0, pretending that the resulting string is infinitely long.

The trickiest issue here is when tm_year < 1 - 1900, that is, before Gregorian year 1. Although C11 and POSIX-2017 say that strftime must produce a decimal number, they don't say what the number should be. tzcode, like all other implementations I know of, says the years before year 1 are years 0, -1, -2, .... This follows the astronomical tradition used by Kepler and popularized by Cassini <https://en.wikipedia.org/wiki/Astronomical_year_numbering>.
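Because the output for out-of-range fields is unspecified, a caller who wants predictable results could check the fields before calling strftime. The following is only a minimal sketch: tm_in_normal_range is a hypothetical helper, not part of tzcode or any standard library, and the ranges are the ones the C standard lists for struct tm. tm_year is left unchecked because the standard gives no range for it.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper (an assumption for illustration): returns true
   when every range-limited field of *tp lies in the normal range the
   C standard lists for struct tm.  tm_year is not checked. */
bool tm_in_normal_range(const struct tm *tp)
{
    return tp->tm_sec  >= 0 && tp->tm_sec  <= 60   /* 60 allows a leap second */
        && tp->tm_min  >= 0 && tp->tm_min  <= 59
        && tp->tm_hour >= 0 && tp->tm_hour <= 23
        && tp->tm_mday >= 1 && tp->tm_mday <= 31
        && tp->tm_mon  >= 0 && tp->tm_mon  <= 11
        && tp->tm_wday >= 0 && tp->tm_wday <= 6
        && tp->tm_yday >= 0 && tp->tm_yday <= 365;
}
```

A caller might fall back to its own formatting whenever this returns false, rather than depend on what a particular strftime happens to print.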
Paul wrote:
Clive D.W. Feather wrote:
In this case, the resulting string could be anything but it must still be a string and the call can't break anything else.
...tzcode substitutes "?" for some values out of range, e.g., month names when tm_mon < 0.
...The trickiest issue here is when tm_year < 1 - 1900, that is, before Gregorian year 1...
Thanks for the several replies. I guess this is the best that can reasonably be hoped for. (I shall take care not to imagine that the code of mine which prompted the question is strictly conforming.)

Somehow I'd never considered %A and %B; thanks for pointing that out.

There are occasionally implementations that try to make various kinds of points by playing with particularly novel interpretations of undefined or unspecified behavior (I'm pretty sure gcc once did something particularly extreme with #pragma), so I'm now thinking that somewhere there's a world where it might be fun to map out-of-range tm_mon and tm_yday values to Tricember, Pentember, Dodecahember, and Humpday, Caturday, Yesterday, etc. (Or perhaps not. Forget I mentioned it. And let's not even think about %p.)
Steve Summit wrote:
I'm now thinking that somewhere there's a world where it might be fun to map out-of-range tm_mon and tm_yday values to Tricember, Pentember, Dodecahember
There's another point I forgot to mention, stretching things from the application side instead of from the implementation side. strftime is required to support dates that are not in the Gregorian calendar. For example, a program using the Julian calendar can fill in a struct tm with the date February 29, 1900, and strftime is obliged to process it to the corresponding string even though this date does not exist in the Gregorian calendar. This is well-defined in C11 and POSIX-2017. You can even process "February 30, 1712" (a valid date in Sweden!) using strftime.

Similarly for timestamps. You can use strftime with "%H:%M:%S" to generate the timestamp 07:32:60 if you like, even if the implementation conforms strictly to POSIX and does not support leap seconds at all.
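As a rough illustration of the point above, the sketch below fills in a struct tm by hand and passes it straight to strftime, so nothing checks the date against the Gregorian calendar. The function name format_fields and the format string are assumptions made for this example; every field stays inside its normal range (tm_mday 1-31, tm_sec 0-60), so the result does not depend on unspecified behavior.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: formats caller-supplied calendar fields without
   going through mktime or localtime, so no calendar validation occurs.
   Returns whatever strftime returns. */
size_t format_fields(char *buf, size_t size,
                     int year, int month, int day,
                     int hour, int min, int sec)
{
    struct tm tm = {0};
    tm.tm_year = year - 1900;   /* years since 1900 */
    tm.tm_mon  = month - 1;     /* months since January, 0-11 */
    tm.tm_mday = day;           /* 1-31, whatever the month */
    tm.tm_hour = hour;          /* 0-23 */
    tm.tm_min  = min;           /* 0-59 */
    tm.tm_sec  = sec;           /* 0-60 is the normal range */
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S", &tm);
}
```

Calling format_fields(buf, sizeof buf, 1900, 2, 29, 7, 32, 60) stores "1900-02-29 07:32:60" in buf, and passing 1712, 2, 30 handles the Swedish date the same way.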
Steve Summit wrote:
When calling strftime, are tm_hour, tm_min, and the rest limited to their normal ranges, or not?
Although they aren't limited, you can't rely on what strftime outputs from out-of-range values. The C11 standard and POSIX-2018 both say, "If any of the specified values is outside the normal range, the characters stored are unspecified."
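Given that, one portable way to get the elapsed-time output from the original question is to skip struct tm entirely and do the arithmetic directly. This is only a sketch under stated assumptions: format_elapsed is a hypothetical helper, the elapsed time is assumed to be non-negative, and the "elapsed: " prefix simply mirrors the format string in the question.

```c
#include <stdio.h>

/* Hypothetical helper: formats a non-negative count of elapsed seconds
   as HH:MM:SS, folding whole days into the hour count instead of
   relying on strftime with an out-of-range tm_hour.
   Returns the value returned by snprintf. */
int format_elapsed(char *buf, size_t size, long long secs)
{
    long long hours = secs / 3600;         /* may exceed 23 */
    int minutes = (int)(secs / 60 % 60);   /* 0-59 */
    int seconds = (int)(secs % 60);        /* 0-59 */
    return snprintf(buf, size, "elapsed: %02lld:%02d:%02d",
                    hours, minutes, seconds);
}
```

With the variables from the question, a call might look like format_elapsed(etbuf, sizeof(etbuf), (long long)difftime(t2, t1)); an interval of 90061 seconds then prints as "elapsed: 25:01:01".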
participants (4)
- Clive D.W. Feather
- Paul Eggert
- Robert Elz
- scs@eskimo.com