Re: big-picture comments on your proposed extensions to ISO C <time.h>

(I hope you don't mind, Paul, if I reply on tz)
Paul Eggert wrote on 1998-09-29 01:34 UTC:
Here are some big-picture comments on your proposed extensions to ISO C <time.h>. I've hacked up a copy of your HTML proposal along these lines, if you're interested in reading it. (I still don't have a revised rationale, though.)
It's very hard work to come up with a good spec! After trying my hand at it, I respect the work you've done so far; the following comments are in the spirit of improving it to be as useful as possible.
I have some more detailed comments once you've had time to digest these bigger-picture comments, but first things first.....
* Terminology proposal: let's rename `xtime' to `stime' (short for ``standard C time''), uniformly. The rest of my discussion assumes this new terminology.
Hm, see my separate posting on the many different renaming proposals that I received.
* struct stime is inconvenient for programmers. It's hard to portably subtract such values, much less do other arithmetic on them. Even comparing them is error-prone. This structure's members are derived from a similar structure in POSIX.1, but the POSIX.1 structure was predicated on not having integer types longer than 32 bits.
Instead, let's just use a signed integer count of the number of time intervals since the epoch. We can define a type stime_t for this integer, and a stime_t macro STIMES_PER_SEC giving the number of time intervals per second. This makes time arithmetic much, much easier.
I am afraid, but I strongly disagree here. I think my current approach is more robust, functional, and therefore preferable. The reason: Most C 9X implementations will not provide any type with more than 64-bits. If we take a 64-bit type, then you can have only nanosecond resolution over an interval of 58 years. This is unacceptable; even the notorious Y2K COBOL programs fail only after 99 years. Note that this is independent of whether you use a 64-bit int or a 64-bit float type; 64 bits is just not enough for a high-resolution timestamp. About the "convenience" argument in general (and I'll come back to this for Antoine's xtime_get return value critique): Compared to modern programming languages with their variable length arrays, exceptions, garbage collectors, controlled name spaces, etc., C is and will always be a comparatively simple and quite inconvenient early-1970s language. If you want convenience, look for another language than C. The interesting thing about C these days is mostly that it is ubiquitously implemented. Most programming environments for more modern and comfortable languages are built on top of the existing C API. If you read the C++, Java, and Ada95 standards, then it becomes embarrassingly obvious that the abilities of their standard libraries have been limited by the existing abilities of the C library; it is therefore extremely important to get the functionality of the C API right. The comfort should certainly come after functional criteria. I do not understand, why it should be difficult to portably add and subtract struct xtime values. The new <stdint.h> offers reasonably good support here. The algorithm is trivial: Before any arithmetic, you first check whether there is a leap second in one of the arguments and abort if there is (because arithmetic is ill-defined in this case). Then you just add sec and nsec separately and adjust for the nsec overflow. Note that non-leap nsec values do not overflow the nsec type, because 2 * 1e9 < 2**31.
It all fits so nicely together with nanoseconds, I couldn't think of a more appropriate representation. The POSIX.1b people certainly made a good choice here. In programming languages such as C++ or Ada where operators like "+" and "-" can be overloaded, the authors of bindings to the xtime API will certainly add the corresponding overloadings, and then xtime is as easy to use as a float type. If you insist, we could add macros or functions for struct xtime arithmetic, but I think it is better to let users do this themselves to make sure they have thought about the leap second aspects of arithmetic, which are certainly application dependent. (Again, the lack of an exception mechanism in C prevents us from building an interface that is robust and comfortable at the same time, so we stay on the robust side and leave the comfort to languages with proper mechanisms for it.) How would you represent leap seconds in your stime_t arithmetic type? This was, after all, besides the resolution concerns, the major reason for getting rid of time_t and its idea of using an arithmetic type.
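A minimal sketch of the addition algorithm described above may help make the discussion concrete. The struct layout is assumed from this thread (a 64-bit second count plus a nanosecond field in which values of 1000000000 and above mark a leap second); the helper name xtime_add and its bool return convention are hypothetical and not part of the proposal.

```c
#include <stdbool.h>
#include <stdint.h>

/* Layout assumed from the thread, not copied from the proposal text. */
struct xtime {
    int_fast64_t sec;   /* seconds since the epoch */
    int_fast32_t nsec;  /* 0..999999999, or >= 1000000000 in a leap second */
};

#define NSEC_PER_SEC 1000000000

/* Hypothetical helper: stores a + b in *sum and returns true.
   Returns false if an operand is malformed or lies within a leap
   second, or if the second count would overflow. */
bool xtime_add(struct xtime a, struct xtime b, struct xtime *sum)
{
    if (a.nsec < 0 || b.nsec < 0)
        return false;
    if (a.nsec >= NSEC_PER_SEC || b.nsec >= NSEC_PER_SEC)
        return false;               /* leap second: arithmetic ill-defined */

    /* Both values are below 1e9, so the sum is below 2e9 < 2**31. */
    int_fast32_t nsec = a.nsec + b.nsec;
    int carry = 0;
    if (nsec >= NSEC_PER_SEC) {
        nsec -= NSEC_PER_SEC;
        carry = 1;
    }

    /* First overflow check: adding the two second counts. */
    if (b.sec > 0 && a.sec > INT_FAST64_MAX - b.sec)
        return false;
    if (b.sec < 0 && a.sec < INT_FAST64_MIN - b.sec)
        return false;
    int_fast64_t sec = a.sec + b.sec;

    /* Second overflow check: propagating the nanosecond carry. */
    if (carry && sec == INT_FAST64_MAX)
        return false;

    sum->sec = sec + carry;
    sum->nsec = nsec;
    return true;
}
```

For example, adding {1, 600000000} and {2, 700000000} yields {4, 300000000}, while any operand with nsec >= 1000000000 is rejected.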
* We should pass timestamps by value whenever possible, instead of by reference. Passing timestamps by reference makes the program more error-prone and (these days) slows it down. This is particularly important for stime_get.
That is an interesting point. Actually, I have never done any measurements on whether passing a 96-bit struct by reference is more efficient than passing it by value with the usual C compilers. Does anyone have any numbers on this for typical 32-bit processors? For stime_get however this is not an option: the return value is certainly required for comfortable error checking. Other programming languages with exceptions could probably return the time value here, but in C we have no exceptions and therefore (as xtime_get can fail), we will have to use it in some form of if statement with an explicit exception handler.
* We should constrain the implementation to support times through a reasonable upper bound (e.g. 9999 AD, or perhaps something a bit sooner since we'll have to reform our calendar well before then). But we shouldn't insist on astronomical timescales; that places an undue burden on the implementation.
What would this burden be? We know that 32-bit is too small, and the next best size is 64-bit words or with nanoseconds a 96-bit struct. The last non-power-of-two machine that I worked on was dumped >10 years ago and never had a C compiler. These are of academic interest. There is no practical intermediate size between a useful range and a 64-bit second counter. I think it is much more of a burden for the application programmer in the end to leave the time encoding and epoch undefined.
* We should specify better what happens before the epoch for TIME_UTC and TIME_TAI.
Actually, I would suggest that we specify the functionality of a xtime_make(&xtp, &tmptr, NULL); call completely by providing an example of a correct implementation in the standard (should be possible in less than 50 lines). Real code is much clearer here than any pseudo-mathematic specification. I started to add such example code to my web page a few days ago, but was interrupted and haven't yet gotten around to finishing it. Volunteers are welcome (especially getting the leap year formula for negative years correct might be a bit of a brain tweezer). Such example code resolves many ambiguities and also should solve your concern here. As far as TIME_TAI is concerned, I do not expect *ANY* implementation to support xtime_conv with pre-1972 TIME_TAI values. The other functions do not care about the difference between TIME_TAI and TIME_UTC anyway, and the semantics of the TIME_UTC clock relative to the Gregorian calendar will be specified in the example code.
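As an illustration of the kind of example code meant here, the following sketch converts a proleptic Gregorian date to a count of seconds since 1970-01-01, including the leap-year handling for years before the epoch and for negative years. It uses a well-known era-based day-counting technique, not code from the proposal; the function names are hypothetical, year 0 is taken to be 1 BC (astronomical numbering), and leap seconds are ignored.

```c
#include <stdint.h>

/* Days from 1970-01-01 to the given proleptic Gregorian date.
   month is 1..12, day is 1..31; year may be zero or negative. */
int64_t days_from_civil(int64_t year, int month, int day)
{
    /* Treat March as the first month so the leap day falls at year end. */
    int64_t y = year - (month <= 2);
    /* Floor division by 400, correct for negative years too. */
    int64_t era = (y >= 0 ? y : y - 399) / 400;
    int64_t yoe = y - era * 400;                        /* 0..399 */
    int64_t mp  = month + (month > 2 ? -3 : 9);         /* Mar=0 .. Feb=11 */
    int64_t doy = (153 * mp + 2) / 5 + day - 1;         /* 0..365 */
    int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; /* 0..146096 */
    return era * 146097 + doe - 719468;
}

/* Seconds since 1970-01-01 00:00:00, without leap seconds. */
int64_t seconds_from_civil(int64_t year, int month, int day,
                           int hour, int min, int sec)
{
    return days_from_civil(year, month, day) * 86400
         + hour * 3600 + min * 60 + sec;
}
```

Under these assumptions, 1970-01-01 maps to day 0, 1969-12-31 to day -1, and 2000-03-01 to day 11017.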
* We shouldn't insist on a particular type like `int_fast64_t' for timestamps. I assume that this type was chosen to encourage portability,
I chose it for range and efficiency, and fortunately, with int_fast64_t, C provides me with exactly what I had in mind. I am not sure what type of portability you are referring to.
but it's not truly portable, since int_fast64_t might have more than 64 bits.
Please explain what the problem would be there. I don't see any.
Also, as mentioned above, it's way way overkill for almost all applications. It's better to keep the timestamp type a bit more abstract, like stime_t, and to place constraints on it as needed, as described above.
The abstractness of types like time_t in ISO C 89 was not done because this abstractness was considered to be good and beautiful design. On the contrary, it was a necessary hack because the ISO C standard had to be backwards compatible with a few strange C implementations. In this respect, ISO C 89 was a step backwards compared to K&R C. (Special thanks to Nick Maclaren <nmm1@cus.cam.ac.uk> for providing me with the historic background on this.) To my big surprise, many people today start to admire the hacky type mess that ISO C had to introduce because of bad PC compilers as glorious and wonderful abstractness, having completely forgotten the original reasons for all the awful type-uncertainties of ISO C 89.
* The stime_get interface is confusing, since it has several operational functions. It's better to separate them out into one C function for each operational function; e.g. we should have a separate function stime_getres to get the resolution, instead of passing a flag to stime_get to have it return the resolution instead of the current time. Similarly, we should have a separate function that gives us an error bound on the clock, rather than have a TIME_SYNC flag.
I considered defining several functions, and then deliberately packed all these functions into a single one. The main reason is that this packing of functionality allows implementors to add more functionality by just giving additional option and return bits a meaning, without cluttering up the namespace in non-portable ways. My proposed xtime_get interface allows you to add additional clocks to be requested (UT1 is one example I mentioned in a note in the proposal), or allows additional data to be requested (such as whether the clock is coming in over a trusted channel, when the last connection to this clock was, the estimated error interval size, etc.). I prefer to have a single more universal function that provides a clear way of adding functionality in a portable way over a set of functions that leave no room for extensions. It is always easy to add another bit, but it is difficult to add another function to implementations in a binary-compatible way (thinking about shared libraries and related issues).
* We should require at least POSIX-style time zone specifications for tz_prep, and should suggest Olson-style. As things stand, there's little that portable code can do with that function.
Requiring POSIX-style time zone specifications as a minimum functionality would be ok for me, but so would be to leave the string completely application defined and just mentioning POSIX as one possible implementation. I don't know whether the Olson-style ones are standardizable unless we get some sort of official ISO registry for time zone specifications. As I have pointed out before, I don't like the continent prefix in the Olson-style names, therefore I do not want to make this particular syntax immortal in an ISO standard (I much prefer just ":Paris" over "Europe/Paris"). Note that the tzstring will usually be directly entered by the user (say via a config file or via an environment variable), and the portable application does not have to be aware of the syntax used on this system.
* We should codify the convention that getenv("TZ") returns the user-preferred time zone. A null tzstring should expand to a system-defined time zone. This functionality is in practice and is useful e.g. in mailer software, where you may or may not want to allow the user to specify the time zone.
OK, that sounds like a good idea.
* stime_make still has the old mktime problem that the function can't distinguish a request for ``3 days after Feb 28'' from a request for ``1 month before Mar 31''. It's silly that mktime thinks that 1 month before Mar 31 is Mar 3 (or 2, if it's a leap year). We should fix this.
I think I fixed this very nicely by not requiring xtime_make to handle *any* invalid time representations. If you want 3 days after Feb 28, then you just add 3 * 86400 to the sec field. The extremely ugly hacks with mktime overflows and underflows have become obsolete by defining the encoding of the time representation to allow direct arithmetic. It is very difficult to define mktime overflow behavior nicely, so why bother if there is no need?
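The mktime behaviour under discussion can be reproduced with standard C alone. The sketch below asks mktime for day 31 of the month before March and reports which day of March it normalizes to; the helper name is hypothetical, and the time of day is set to noon only to stay clear of daylight-saving transitions in the local time zone.

```c
#include <time.h>

/* Hypothetical helper: asks mktime for "1 month before Mar 31" of the
   given year and returns the day of March that the out-of-range date
   normalizes to, or -1 if mktime fails or lands outside March. */
int month_before_mar31(int year)
{
    struct tm tm = {0};
    tm.tm_year = year - 1900;
    tm.tm_mon = 1;        /* February, i.e. March minus one month */
    tm.tm_mday = 31;      /* day 31 does not exist in February */
    tm.tm_hour = 12;      /* noon, away from DST transitions */
    tm.tm_isdst = -1;     /* let the library decide about DST */

    if (mktime(&tm) == (time_t)-1)
        return -1;
    return tm.tm_mon == 2 ? tm.tm_mday : -1;
}
```

For 1998 this yields 3 (March 3), and for the leap year 1996 it yields 2, matching the complaint about mktime quoted above.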
* The proposal for strfstime still involves magic; e.g. how can the strftime easily determine the time zone abbreviation?
strfxtime gets the timezone object passed as a parameter. All information can be stored in there and accessed directly. No need to extend struct tm.
I'd rather have a strftime that could in principle be written by the user; it's common practice to write augmented strftime implementations, which grind their teeth over questions like these, and it'd be better if we let people write such functions cleanly.
If you need access to the zone name, then simply use strfxtime to access it. I don't understand why it should be stored in any struct tm extension if we have a function to read it.
This will involve a new type `struct stm' that contains extra members so that no more magic is needed. Something like this:
struct stm { stime_t year; int month; int month_day; int hour; int minute; int second; stime_t utc_offset; stime_t dst_offset; const char *zone; const char *zone_description; };
No, please no additional structs. The existing one is good enough to represent and handle a full broken-down time. All other information is fully accessible via strfxtime. I consider the mixture of broken-down time and timezone information conceptually dubious. For me a time zone is a function that maps broken-down times onto UTC, and not just auxiliary data on a broken-down time. The existing tm_isdst field seems to me to be sufficient (although not optimal) to handle ambiguities.
* We shouldn't have a separate error function just for stime. Instead, functions that report errors should yield an error number, which can be passed as an argument to strerror. This will simplify the interface (e.g. we don't need to worry about LC_MESSAGES).
That's perhaps worth thinking about. It depends on how detailed you want to make these error messages, and whether a single number can carry all information. Timezone strings can be rather tricky to get right and a comfortable diagnostic might be useful here.
Thanks for your comments.
Markus
--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>

Date: Tue, 29 Sep 1998 20:56:18 +0100 From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
Instead, let's just use a signed integer count of the number of time intervals since the epoch. We can define a type stime_t for this integer, and a stime_t macro STIMES_PER_SEC giving the number of time intervals per second. This makes time arithmetic much, much easier.
I am afraid, but I strongly disagree here. I think my current approach is more robust, functional, and therefore preferable. The reason: Most C 9X implementations will not provide any type with more than 64-bits.
A 64-bit signed type with a 1970 epoch can represent years past 9999 with 25 bits for the fraction; this is better than 100 ns resolution, which suffices for the vast majority of practical applications today. Conversely, the xtime proposal would limit the implementation's ability to supply very high quality clocks in a portable way. For example, I don't think it would fully support Bernstein's TAI-based approach which (if memory serves) has a 64-bit fraction. In contrast, using a single integer will suffice even for Bernstein's approach, so long as the C compiler supports 128-bit integers. As processors get faster (supporting higher-res timestamps) and bigger (supporting longer integers), I expect this is what high-quality implementers will choose. In other words, I fear that the struct xtime proposal is too specific for a C-language standard. Its clock has too much range and precision for most practical implementations; and yet it is not precise enough for the most demanding applications. It's better for the spec to be less specific, so that it can allow both practical and very-high-quality implementations.
C is and will always be a comparatively simple and quite inconvenient early-1970s language.
The C standard should cater to C programmers, as well as to people implementing higher-level systems atop C. The time interface should be both simple and convenient, as much as possible. I've done a lot of time programming, and I find the original POSIX.1 rules (a single integer timestamp) to be much easier to use, and much more reliable to maintain, than the C89 rules (an opaque timestamp with problematic constructors and extractors) or the POSIX.1-1996 extensions (a separate subsecond counter). This also goes for code that I've seen written by others. Simplicity is a big virtue here, because most programmers (even hard-core implementers) make a lot of mistakes in this area. Even the C _standard_ (as well as the draft C9x <time.h> changes) contains several bugs in this area, partly due to complexity. We should strive to keep the basic interface as simple as possible.
I do not understand, why it should be difficult to portably add and subtract struct xtime values.
I've seen it done wrong so often in real code (also with its similar BSD predecessor struct timeval). I've even seen errors in time-arithmetic macros in the .h files supplied by vendors! It really _is_ error-prone, and we should have an interface that is less likely to provoke errors.
Then you just add sec and nsec separately and adjust for the nsec overflow.
The adjustment to sec can overflow. This requires two overflow checks if you're worried about time overflow, whereas my proposal requires just one overflow check.
Note that non-leap nsec values do not overflow the nsec type
True, but leap nsec values do. I think it makes perfect sense to add 1 second to a leap-second timestamp; you said that arithmetic is not well-defined in that case, but I don't see why. There are other issues involved in adding, subtracting, multiplying and dividing struct xtime values. (Yes, it should be convenient to multiply and divide them -- this is needed for many kinds of interval calculations.) It's a very buggy process to get this all right. Requiring users to do this is just asking for trouble. For example, when I added sub-second resolution to GNU make (a feature that should appear in the next version), I found the struct timespec approach of POSIX.1-1996 to be so difficult to integrate with the existing logic, that I was forced to pack its two values into a 64-bit integer and use the resulting integer. I would have been much happier with a single-integer timestamp. In other words, I'm not making up this objection abstractly; I'm basing it on implementation experience. Struct timespec is simply a pain to use.
How would you represent leap seconds in your stime_t arithmetic type?
The most natural approach is to have xtime_get (or a variant) return a boolean flag specifying whether the timestamp is within a leap second. This would be needed only with TIME_UTC timestamps. Another possibility is to return a negative number instead (which can't lead to any confusion). I've toyed with this idea as well, but it may be a bit too tricky.
For stime_get however this is not an option: the return value is certainly required for comfortable error checking.
No, you can return a negative number -E such that E is suitable as an argument to strerror. This is a natural way to report errors for many of the xtime functions.
There is no practical intermediate size between a useful range and a 64-bit second counter.
There is if you have a single unified counter, as described above.
* We should specify better what happens before the epoch for TIME_UTC and TIME_TAI.
Actually, I would suggest that we specify the functionality of a xtime_make(&xtp, &tmptr, NULL); call completely by providing an example of a correct implementation in the standard (should be possible in less than 50 lines). Real code is much clearer here than any pseudo-mathematic specification.
Good idea, though I would be surprised if it could be done in less than 50 lines.
As far as TIME_TAI is concerned, I do not expect *ANY* implementation to support xtime_conv with pre-1972 TIME_TAI values.
I think Bernstein's implementation does.
but it's not truly portable, since int_fast64_t might have more than 64 bits.
Please explain what the problem would be there. I don't see any.
You can't portably fwrite int_fast64_t values (or struct xtime values) and read them back in again on another implementation. Also, I expect C9x compiler vendors to want to have compiler options that modify the type identified by int_fast64_t. If such options are used, you won't be able to fwrite a struct xtime out and read it back in even within the same implementation. It's in that sense that struct xtime is not a completely portable type; you can't output it from one implementation and read it back into another, even in textual form, without possibly having some problems.
The abstractness of types like time_t in ISO C 89 was not done because this abstractness was considered to be good and beautiful design. On the contrary, it was a necessary hack because the ISO C standard had to be backwards compatible with a few strange C implementations.
I don't agree with this assessment. The *_t convention was more of a POSIX creation. For example, different vendors used different int widths for user ids, so they had to institute uid_t for sanity's sake. time_t was just more of the same.
I don't know whether the Olson-style ones are standardizable unless we get some sort of official ISO registry for time zone specifications.
The IETF has talked about such a registry, but the latest relevant draft ftp://ftp.isi.edu/internet-drafts/draft-ietf-calsch-ical-12.txt punts, suggesting that implementers may just want to use the Olson names. (Why should they go to the work of building a registry when us tz folks are doing it for free? :-) I don't think the Olson-style names should be required; but they should be suggested, at least in a footnote with a URL.
Note that the tzstring will usually be directly entered by the user (say via a config file or via an environment variable), and the portable application does not have to be aware of the syntax used on this system.
Except for applications that help users choose the time zone! There are a surprisingly large number of them. (Emacs, say. :-)
* stime_make still has the old mktime problem that the function can't distinguish a request for ``3 days after Feb 28'' from a request for ``1 month before Mar 31''. It's silly that mktime thinks that 1 month before Mar 31 is Mar 3 (or 2, if it's a leap year). We should fix this.
I think I fixed this very nicely by not requiring xtime_make to handle *any* invalid time representations.
Sorry, I missed that point.
If you want 3 days after Feb 28, then you just add 3 * 86400 to the sec field.
That doesn't work in the presence of leap seconds.
It is very difficult to define mktime overflow behavior nicely, so why bother if there is no need?
I'm somewhat sympathetic to this argument, but I fear that people who are used to mktime will be less sympathetic. With mktime, you can easily ask for 3 months after a given date; with the new interface, it's not so easy.
If you need access to the zone name, then simply use strfxtime to access it.
OK, that sounds reasonable; but there is still one value that is quite inconvenient to get via strftime: the UTC offset (to the nearest second, please! :-). Perhaps a new format spec could be added to get that?
Timezone strings can be rather tricky to get right and a comfortable diagnostic might be useful here.
Having strerror decode the error number doesn't preclude having a comfortable diagnostic. (The error number could even encode the position in the time zone string that contained the error.) In practice, though, I think this is way overkill and a simple traditional errno value will suffice.

Paul Eggert wrote, discussing with Markus:
Instead, let's just use a signed integer count of the number of time intervals since the epoch. We can define a type stime_t for this integer, and a stime_t macro STIMES_PER_SEC giving the number of time intervals per second. This makes time arithmetic much, much easier.
Hmm. I am not sure I follow you here, since for the vast majority of applications, struct xtime arithmetic will be defined as follows:
    struct xtime addxtime(struct xtime x, struct xtime y)
    {
        return (struct xtime){.sec = x.sec+y.sec};
    }
and similar. In words (instead of specific C9X code ;-): only considering the seconds count, and dropping the fractional part completely. And I believe this is probably the right way to do most things (while I agree with Paul that, in certain circumstances, this might prove insufficient, like his example about GNU make).
A 64-bit signed type with a 1970 epoch can represent years past 9999 with 25 bits for the fraction; this is better than 100 ns resolution, which suffices for the vast majority of practical applications today. <snip> using a single integer will suffice even for Bernstein's approach, so long as the C compiler supports 128-bit integers.
This proposal has a clear drawback: like clock_t in C89, it requires specifying the resolution of the type itself, and thus a lot of multiplication and division by a constant like XTIME_PER_SEC (and before anyone else points it out, I note that with 128-bit integers, choosing XTIME_PER_SEC to be 1LLL<<64 greatly eases things).
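The range and resolution figures for a 64-bit counter with 25 fraction bits, and the scaling by a per-second constant that this reply objects to, can be checked with a short sketch. The type and macro names follow the ones suggested in this thread, but the definitions and helper functions below are assumptions made purely for illustration.

```c
#include <stdint.h>

typedef int64_t stime_t;                      /* assumed 64-bit counter */
#define STIMES_PER_SEC ((stime_t)1 << 25)     /* 25 bits of fraction */

/* Builds a timestamp from whole seconds and nanoseconds (0..999999999).
   Every conversion needs a multiplication and a division by constants. */
stime_t stime_from_parts(int64_t sec, int32_t nsec)
{
    return sec * STIMES_PER_SEC
         + ((int64_t)nsec * STIMES_PER_SEC) / 1000000000;
}

/* Whole seconds of a timestamp, rounding toward negative infinity. */
int64_t stime_whole_seconds(stime_t t)
{
    int64_t q = t / STIMES_PER_SEC;
    if (t % STIMES_PER_SEC < 0)
        q--;
    return q;
}

/* Largest whole-second count that fits: 2**38 - 1. */
int64_t stime_max_seconds(void)
{
    return INT64_MAX / STIMES_PER_SEC;
}
```

With these definitions the counter spans 2**38 - 1 = 274877906943 seconds, about 8710 Gregorian years on either side of the epoch, so a 1970 epoch reaches past the year 9999. One tick is 1e9 / 2**25 ns, roughly 29.8 ns, which is finer than 100 ns.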
In other words, I fear that the struct xtime proposal is too specific for a C-language standard. Its clock has too much range and precision for most practical implementations; and yet it is not precise enough for the most demanding applications. It's better for the spec to be less specific, so that it can allow both practical and very-high-quality implementations.
Well, I agree with you here, but OTOH, the present state of affairs of time_t shows me that at least some specification is required. Having an opaque struct xtime like the proposed struct tmx would be the worst, of course. But I do not like Paul's proposal if it means an XTIME_PER_SEC either.
C is and will always be a comparatively simple and quite inconvenient early-1970s language.
The C standard should cater to C programmers, as well as to people implementing higher-level systems atop C. The time interface should be both simple and convenient, as much as possible.
Agreed. Also, it should be fairly easy to build atop it a convenient interface using the (modern) tools provided by other programming languages like C++, Ada or Java (not an exhaustive list). But everybody agrees about that, and every proposal achieves it in some sense, AFAIK.
I've done a lot of time programming, and I find the original POSIX.1 rules (a single integer timestamp) to be much easier to use, and much more reliable to maintain, than the C89 rules (an opaque timestamp with problematic constructors and extractors) or the POSIX.1-1996 extensions (a separate subsecond counter).
This is a more interesting point. I agree with you that POSIX.1's approach is by far the easiest to use. But my understanding was that its position regarding leap seconds (and/or handling TAI times) is far from clear (cumbersome is more appropriate here). Also, *if* we disregard subseconds, Markus' proposal is essentially the same as the initial POSIX, isn't it?
I do not understand, why it should be difficult to portably add and subtract struct xtime values.
I've seen it done wrong so often in real code (also with its similar BSD predecessor struct timeval). I've even seen errors in time-arithmetic macros in the .h files supplied by vendors! It really _is_ error-prone, and we should have an interface that is less likely to provoke errors.
Agreed here. As I said above, most programmers *should* use only the sec field. But certainly some will *try* to use nsec as well, and that is where the problems will begin. Another idea strikes me right now:
- keeping struct xtime
- specifying that the first member, named sec and typed int_fast64_t, should be what it is now in Markus' draft
- allowing other fields, provided they are self-contained, for counting more precise quantities
- adding addxtime, diffxtime, mulxtime, divxtime to the draft
- adding to the draft a way to convert from double to xtime and back, either with an API (well, two), or with a macro giving the precision or any subfield.
What about this sketch?
How would you represent leap seconds in your stime_t arithmetic type?
The most natural approach is to have xtime_get (or a variant) return a boolean flag specifying whether the timestamp is within a leap second. This would be needed only with TIME_UTC timestamps.
I do not like that. This way, we would have to add a parameter to both xtime_make and xtime_breakup: either an input argument or an output one. Furthermore, there should be some way for xtime_breakup to disambiguate between UTC and TAI: so one more parameter...
For stime_get however this is not an option: the return value is certainly required for comfortable error checking.
No, you can return a negative number -E such that E is suitable as an argument to strerror. This is a natural way to report errors for many of the xtime functions.
Yes, but it should not be *required* to be done this way: the Committee has strong feelings against standardizing it this way, since the errno mechanism is very problematic to use in some environments (it was tolerated for math.h because of wide existing practice). OTOH, I agree that the errno mechanism, or any other mechanism suitable to the implementation, should be relied upon, instead of standardizing anything else (so tz_error should be dropped, IMHO). Anyway, this is just my point of view, not the official answer from the Committee ;-)
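The negative-return convention debated above could look roughly like the following sketch. Everything here is hypothetical: the function name, the clock identifiers, and the choice of EDOM and ERANGE (picked only because ISO C defines them) are assumptions, and the sketch relies on time() counting seconds since 1970, which ISO C does not promise. It also only works for timestamps that are never legitimately negative, such as the current time.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

typedef int64_t stime_t;            /* single-integer timestamp, assumed */

/* Hypothetical clock identifiers, named to avoid clashing with <time.h>. */
enum { CLK_UTC = 1, CLK_TAI = 2 };

/* Returns the current time in whole seconds, or -E on failure, where E
   is an errno value that can be handed to strerror. */
stime_t stime_get_sketch(int clock_type)
{
    if (clock_type != CLK_UTC)
        return -(stime_t)EDOM;      /* clock not supported in this sketch */

    time_t now = time(NULL);
    if (now == (time_t)-1)
        return -(stime_t)ERANGE;    /* calendar time not available */
    return (stime_t)now;
}

/* Message for a failed call, or NULL if the result was a timestamp. */
const char *stime_error_text(stime_t result)
{
    return result < 0 ? strerror((int)-result) : NULL;
}
```

A caller would test the result for a negative value and pass it to stime_error_text, with no separate error function and no use of the global errno.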
but it's not truly portable, since int_fast64_t might have more than 64 bits.
Please explain what the problem would be there. I don't see any.
You can't portably fwrite int_fast64_t values (or struct xtime values) and read them back in again on another implementation.
Yes you can, as long as either:
- all the platforms are 2's complement,
- or you isolate the sign bit in some way,
- or you deal only with positive values,
because C9X now guarantees that all the significant bits are the low-order 63 ones, even if int_fast64_t happens to be, say, 72 bits long. And remember that you are still required to handle the big-endian/little-endian duality...
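One way to sidestep both the width and the byte-order concerns is to write the value one byte at a time in a fixed order. The sketch below is an illustration only: the function names are hypothetical, the on-disk format (8 bytes, most significant first, two's complement) is an assumption, and values outside the 64-bit range are not handled.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes the low 64 bits of v, most significant byte first.
   Returns 0 on success, -1 on a write error. */
int put_i64(FILE *f, int_fast64_t v)
{
    /* Conversion to unsigned is modular, so this yields the two's
       complement bit pattern whatever the native representation is. */
    uint_fast64_t u = (uint_fast64_t)v;
    for (int shift = 56; shift >= 0; shift -= 8) {
        if (putc((int)((u >> shift) & 0xFF), f) == EOF)
            return -1;
    }
    return 0;
}

/* Reads 8 bytes written by put_i64. Returns 0 on success, -1 on EOF. */
int get_i64(FILE *f, int_fast64_t *out)
{
    uint_fast64_t u = 0;
    for (int i = 0; i < 8; i++) {
        int c = getc(f);
        if (c == EOF)
            return -1;
        u = (u << 8) | (uint_fast64_t)c;
    }
    if (u & (UINT64_C(1) << 63)) {
        /* Negative: low 63 bits minus 2**63, computed without overflow. */
        uint_fast64_t low = u & UINT64_C(0x7FFFFFFFFFFFFFFF);
        *out = (int_fast64_t)low - INT64_C(0x7FFFFFFFFFFFFFFF) - 1;
    } else {
        *out = (int_fast64_t)u;
    }
    return 0;
}
```

The sign bit is isolated explicitly on input, which corresponds to the second of the three conditions listed above.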
Also, I expect C9x compiler vendors to want to have compiler options that modify the type identified by int_fast64_t. If such options are used, you won't be able to fwrite a struct xtime out and read it back in even among the same implementation.
This one is a good point.
It's that sense in which struct xtime is not a completely portable type; you can't output it from one implementation and read it back into another, even in textual form, without possibly having some problems.
In textual form, I do not see the problem. What is wrong with
    fprintf(output, "%" PRIdFAST64 ":%" PRIdFAST32 "\n", xt.sec, xt.nsec);
then
    fscanf(input, "%" SCNdFAST64 ":%" SCNdFAST32 "\n", &xt.sec, &xt.nsec);
?
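A self-contained version of this textual round trip, using the <inttypes.h> format macros, might look as follows. The struct layout is assumed from the thread and the two helper names are hypothetical; string buffers stand in for the files so that the example needs no I/O setup.

```c
#include <inttypes.h>
#include <stdio.h>

/* Layout assumed from the thread. */
struct xtime {
    int_fast64_t sec;
    int_fast32_t nsec;
};

/* Formats xt as "sec:nsec". Returns 0 on success, -1 if buf is too small. */
int xtime_to_text(char *buf, size_t size, struct xtime xt)
{
    int n = snprintf(buf, size, "%" PRIdFAST64 ":%" PRIdFAST32,
                     xt.sec, xt.nsec);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Parses "sec:nsec". Returns 0 on success, -1 on malformed input.
   Note that the scanf family needs pointers to the members. */
int xtime_from_text(const char *buf, struct xtime *xt)
{
    int n = sscanf(buf, "%" SCNdFAST64 ":%" SCNdFAST32,
                   &xt->sec, &xt->nsec);
    return n == 2 ? 0 : -1;
}
```

Because the text carries decimal digits rather than a memory image, the result does not depend on how wide int_fast64_t happens to be on either side.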
The abstractness of types like time_t in ISO C 89 was not done because this abstractness was considered to be good and beautiful design. On the contrary, it was a necessary hack because the ISO C standard had to be backwards compatible with a few strange C implementations.
I don't agree with this assessment. The *_t convention was more of a POSIX creation. For example, different vendors used different int widths for user ids, so they had to institute uid_t for sanity's sake. time_t was just more of the same.
I believe you are both right. Paul is right about the abstraction concept (also used in fpos_t, for example, in C89). And Markus is right, because what he implies is that time_t could have been made much more precise in its definition (only integer types, and/or only counts of seconds) had there not been broadly different existing practices...
I don't think the Olson-style names should be required; but they should be suggested, at least in a footnote with a URL.
This is not standard practice in the text of an ISO Standard (I think this is even prohibited). However, mentioning it in the Rationale, perhaps with fuller explanations, seems easier to do (I do not know about the URL).
* stime_make still has the old mktime problem that the function can't distinguish a request for ``3 days after Feb 28'' from a request for ``1 month before Mar 31''. It's silly that mktime thinks that 1 month before Mar 31 is Mar 3 (or 2, if it's a leap year). We should fix this.
I think I fixed this very nicely by not requiring xtime_make to handle *any* invalid time representations.
Sorry, I missed that point.
Well, I too missed it (so since we are at least two, reasonably involved in the matter, I believe it is worth a mention in the Rationale ;-). And I do not feel comfortable with its implication. That is, to do arithmetic with a struct tm, you need to do
    xtime_make(&xt, &tm, &the_tz);
    if( xt.nsec >= 1000000000 ) {
        /* do something when a leap second occurred */
    }
    xt.sec += 3 * 86400L;   /* or xt.sec -= 30 * 86400L; */
    xtime_break(&tm, &xt, &the_tz);
in the latter case (varying by a given number of days), and
    if( --tm.tm_mon < 0 ) { --tm.tm_year; tm.tm_mon = 11; }
    last = last_day_in_month[isleap(tm.tm_year + 1900)][tm.tm_mon];
    if( tm.tm_mday > last ) { tm.tm_mday = last; }
in the former case, assuming 1 month before Mar 31st should yield the last day of February (if it should yield Feb 28th + 3 days, the code is even more complicated, because the last line should be followed by the snippet above...) The advantage is that the programmer is requested to specify the behaviour in the latter case. But that is certainly not
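The month-stepping case can be written out as a small self-contained function. This is a sketch of one possible policy (clamp to the last day of the target month), not something the draft prescribes; the function and helper names are invented here.

```c
#include <time.h>

/* Gregorian leap-year rule; expects the full year number, not tm_year. */
static int is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Step tm back one calendar month, clamping the day to the month's last day. */
void tm_prev_month_clamped(struct tm *tm)
{
    static const int last_day[2][12] = {
        {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31},
        {31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}
    };
    if (--tm->tm_mon < 0) {        /* January wraps to December of the prior year */
        tm->tm_year--;
        tm->tm_mon = 11;
    }
    int last = last_day[is_leap(tm->tm_year + 1900)][tm->tm_mon];
    if (tm->tm_mday > last)        /* e.g. Mar 31 becomes Feb 28 or Feb 29 */
        tm->tm_mday = last;
}
```

Only tm_year, tm_mon and tm_mday are adjusted; fields such as tm_wday and tm_yday are left for the caller to recompute.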
If you want 3 days after Feb 28, then you just add 3 * 86400 to the sec field.
That doesn't work in the presence of leap seconds.
Yes it does (using Markus' acceptation of xtime)... unless of course you are referring to the point I avoided above, that is if the beginning point was in a leap second. In fact, this is the very point of Markus' choice for representing time. UTC (unlike TAI) is a discontinuous scale that is viewed as continuous by the vast majority of applications. Markus' idea isolates the discontinuity in a separate space, thus allowing the majority to handle times without being burdened with the leap seconds, while keeping the notion present.
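A minimal sketch of the day-shifting case follows. It assumes, as the discussion suggests, that a timestamp inside a leap second is marked by an nsec value of 1000000000 or more; the struct layout and the function name are hypothetical.

```c
#include <stdint.h>

struct xtime { int_fast64_t sec; int_fast32_t nsec; };  /* hypothetical layout */

/* Shift *xt by a whole number of days.
   Returns 0 on success, or -1 (leaving *xt untouched) if the starting
   point lies inside a leap second, where the result is ill-defined. */
int xtime_add_days(struct xtime *xt, int days)
{
    if (xt->nsec >= 1000000000)
        return -1;
    xt->sec += (int_fast64_t)days * 86400;
    return 0;
}
```

Because the sec field does not count leap seconds in this model, the result is the same wall-clock time on the target day.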
If you need access to the zone name, then simply use strfxtime to access it.
OK, that sounds reasonable; but there is still one value that is quite inconvenient to get via strftime: the UTC offset (to the nearest second, please! :-). Perhaps a new format spec could be added to get that?
I think this should be left to the programmer, using first strfxtime("%z"...), then parsing the result and multiplying the hour count by 3600, the minute count by 60... OK, you get the idea. In short, I do not see the real user need. (OTOH, adding a new specifier is certainly not a problem!) Antoine
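The parsing step described above might look like the sketch below. It assumes the formatted offset has the "+hhmm" or "-hhmm" shape that C99 strftime produces for "%z"; the function name is invented, and an offset with a seconds part would need a longer format.

```c
#include <stdio.h>

/* Parse "+hhmm" or "-hhmm" into seconds east of UTC.
   Returns 0 on success, -1 on malformed input. */
int utc_offset_seconds(const char *z, long *out)
{
    char sign;
    int hh, mm;
    if (sscanf(z, "%c%2d%2d", &sign, &hh, &mm) != 3)
        return -1;
    if ((sign != '+' && sign != '-') || hh < 0 || mm < 0 || mm > 59)
        return -1;
    long secs = hh * 3600L + mm * 60L;   /* hours and minutes to seconds */
    *out = (sign == '-') ? -secs : secs;
    return 0;
}
```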

Date: Wed, 30 Sep 1998 16:01:33 +0200 From: Antoine Leca <Antoine.Leca@renault.fr> Paul Eggert wrote, discussing with Markus:
let's just use a signed integer count of the number of time intervals since the epoch. We can define a type stime_t for this integer, and a stime_t macro STIMES_PER_SEC giving the number of time intervals per second. This makes time arithmetic much, much easier.
Hmm. I am not sure I follow you here, since for the vast majority of applications, struct xtime arithmetic will be defined as follows
struct xtime addxtime(struct xtime x, struct xtime y) { return (struct xtime){.sec = x.sec+y.sec}; }
But this gives an answer that can be off by as much as two seconds; an inaccuracy like that would break many applications. Your remark reinforces my impression that the struct xtime method is too error-prone in practice.
This proposal has a clear drawback: like clock_t in C89, it requires specifying the resolution of the type itself, which requires a lot of multiplication/division with a constant like XTIME_PER_SEC
But any time proposal must specify the resolution somehow. struct xtime specifies it as nanoseconds, for example; so there is an implied XTIME_PER_SEC value of 1000000000. I don't see this as a special drawback of my proposal.
*if* we disregard subseconds, Markus' proposal is essentially the same as the initial POSIX, isn't it?
Yes; but that's also true for my proposal. If STIMES_PER_SEC is 1, then it degenerates to the original POSIX.1 as a special case.
Another idea is striking me right now:
- keeping struct xtime
- specifying that the first member, named sec, typed int_fast64_t, should be what it is now with Markus' draft
- allowing other fields, provided they are self-contained, for counting more precise quantities
- adding to the draft addxtime, diffxtime, mulxtime, divxtime
- adding to the draft a way to convert from double to xtime and back, either with an API (well, two), or with a macro giving the precision or any subfield.
What about this sketch?
Something like that could be done, but it is complicated; if you're going to have that level of complexity, you might as well make struct xtime be an opaque type (xtime_t, say) and be done with it. That way, I could define xtime_t to be a signed integer value, and Markus could make it a structure.
This would be a reasonable committee-like compromise that might satisfy neither of us (:-). In practice, though, I hope that the signed-integer-value approach would win out, as it is so much more convenient to use; much as the POSIX.1 time_t won out over the C89 opaque time_t.
How would you represent leap seconds in your stime_t arithmetic type?
The most natural approach is to have xtime_get (or a variant) return a boolean flag specifying whether the timestamp is within a leap second. This would be needed only with TIME_UTC timestamps.
This way, we should add a parameter to both xtime_make and xtime_breakup: either an input argument, or an output one.
Yes, this would be needed if we use a separate boolean.
Furthermore, there should be some way for xtime_breakup to disambiguate between UTC and TAI: so one more parameter...
I don't understand this comment. xtime_breakup is defined to work only on TIME_UTC timestamps. There's no need to break up TIME_TAI timestamps.
For stime_get however this is not an option: the return value is certainly required for comfortable error checking.
No, you can return a negative number -E such that E is suitable as an argument to strerror. This is a natural way to report errors for many of the xtime functions.
Yes, but it should not be *required* to be done this way: the Committee has strong feelings against standardizing this, since the errno mechanism is very problematic to use in some environments (this was tolerated for math.h because of wide existing practice).
The approach that I'm suggesting does not use errno -- it returns a value suitable as an argument to strerror, without communicating it via errno. So I think the suggestion shouldn't run afoul of the anti-errno sentiment.
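The convention under discussion can be illustrated with a stand-in function. Everything named demo_ or DEMO_ below is hypothetical and is not part of the xtime draft; the sketch only shows a negative return value -E being handed to strerror without errno ever being set.

```c
#include <errno.h>
#include <string.h>
#include <time.h>

enum { DEMO_CLOCK_UTC = 1 };      /* hypothetical clock identifier */

/* Hypothetical stand-in for a clock query: returns 0 on success, or -E
   where E is an <errno.h> value, without touching errno. */
int demo_clock_get(int clock_type, time_t *out)
{
    if (clock_type != DEMO_CLOCK_UTC)
        return -EDOM;             /* unsupported clock type */
    *out = time(NULL);
    return 0;
}

/* Turn a return value from demo_clock_get into a message. */
const char *demo_clock_error(int rc)
{
    return rc < 0 ? strerror(-rc) : "no error";
}
```

A caller would test for a negative result and, if needed, pass it to demo_clock_error for a printable message.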
It's that sense in which struct xtime is not a completely portable type; you can't output it from one implementation and read it back into another, even in textual form, without possibly having some problems.
In textual form, I do not see the problem. What is wrong with
fprintf(output, "%" PRIdFAST64 ":%" PRIdFAST32 "\n", xt.sec, xt.nsec); then fscanf(input, "%" SCNdFAST64 ":%" SCNdFAST32 "\n", &xt.sec, &xt.nsec);
That code doesn't report overflow well if the destination host's sec field is smaller than the source's. However, my comments were about binary I/O. If you're willing to go to textual form, then the signed integer approach is also portable, though I admit that it's a bit trickier, as you must also output the resolution.
I don't think the Olson-style names should be required; but they should be suggested, at least in a footnote with a URL.
This is not standard practice in the text of an ISO Standard (I think this is even prohibited).
Wow, an ISO standard can't say ``should''? OK, if so, the Rationale would be fine.
That doesn't work in the presence of leap seconds.
Yes it does (using Markus' acceptation of xtime)... unless of course you are referring to the point I avoided above, that is if the beginning point was in a leap second.
Sorry, I should have been clearer. You are correct: adding 3 * 86400 to the sec field doesn't work if the starting timestamp is within a leap second. Also, even ignoring the problem of starting within a leap second, there is a difference between wanting a time that is 3*86400 seconds later, and wanting a time that is 3 days later, because the former wants to respect leap seconds but the latter wants to ignore them. C89 mktime supports both sorts of requests (assuming int is large enough to represent 3*86400) but the struct xtime proposal supports only the latter. Or perhaps the intent is that if you want to add 3*86400 seconds then you have to convert your TIME_UTC timestamp to some other clock type, do the arithmetic there, and then convert back? If so, which clock type is recommended? TIME_TAI would be the best, but presumably it's not always available. So I'm a bit puzzled if this is indeed the intent.
there is still one value that is quite inconvenient to get via strftime: the UTC offset (to the nearest second, please! :-). Perhaps a new format spec could be added to get that?
I think this should be left to the programmer, using first strfxtime("%z"...), then parsing the result and multiplying the hour count by 3600, the minute count by 60.
That is inconvenient, but it would work, so long as %z outputs the UTC offset to 1-second resolution if necessary, not just the 1-minute resolution in the current proposal.

Paul Eggert wrote, quoting me:
Hmm. I am not sure I follow you here, since for the vast majority of applications, struct xtime arithmetic will be defined as follows
struct xtime addxtime(struct xtime x, struct xtime y) { return (struct xtime){.sec = x.sec+y.sec}; }
But this gives an answer that can be off by as much as two seconds;
Yes. And this is the current accuracy of *all* applications using the present API.
an inaccuracy like that would break many applications.
I do not think so (given the current state of affairs), although I agree with you that improving the accuracy would be a real gain. Remember that my remark above was just highlighting a *possible* (and quite probable IMHO) behaviour, and certainly not the one I would recommend for the general case.
Your remark reinforces my impression that the struct xtime method is too error-prone in practice.
It looks like we both agree that an API for doing basic arithmetic on the "basic" times would be a gain.
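For comparison with the seconds-only addxtime quoted above, here is a sketch of an addition routine that keeps the sub-second part. It follows the procedure Markus outlined earlier in the thread (reject leap-second arguments, add both fields, carry the nsec overflow). The struct layout, the return convention and the assumption that nsec is never negative are illustrative choices, not part of the draft.

```c
#include <stdint.h>

struct xtime { int_fast64_t sec; int_fast32_t nsec; };  /* hypothetical layout */

/* Add two non-leap values. Returns 0 on success, -1 if either argument is
   inside a leap second (nsec >= 1e9), where the sum is ill-defined. */
int addxtime(struct xtime *sum, struct xtime x, struct xtime y)
{
    if (x.nsec >= 1000000000 || y.nsec >= 1000000000)
        return -1;
    sum->sec = x.sec + y.sec;
    sum->nsec = x.nsec + y.nsec;          /* below 2e9, so it fits in 32 bits */
    if (sum->nsec >= 1000000000) {        /* carry into the seconds field */
        sum->nsec -= 1000000000;
        sum->sec += 1;
    }
    return 0;
}
```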
This proposal has a clear drawback: like clock_t in C89, it requires specifying the resolution of the type itself, which requires a lot of multiplication/division with a constant like XTIME_PER_SEC
But any time proposal must specify the resolution somehow. struct xtime specifies it as nanoseconds, for example; so there is an implied XTIME_PER_SEC value of 1000000000. I don't see this as a special drawback of my proposal.
To make my point clear: it requires the cost of the multiplication/division. Markus' proposal, OTOH, makes it possible to avoid it (at the cost of accuracy, as you highlighted), or to reduce it to the minimum.
*if* we disregard subseconds, Markus' proposal is essentially the same as the initial POSIX, isn't it?
Yes; but that's also true for my proposal. If STIMES_PER_SEC is 1, then it degenerates to the original POSIX.1 as a special case.
I missed this point; thanks for making it clear.
Another idea is striking me right now: <snip>
Something like that could be done, but it is complicated; if you're going to have that level of complexity, you might as well make struct xtime be an opaque type (xtime_t, say) and be done with it.
That is what I tried to avoid...
This would be a reasonable committee-like compromise that might satisfy neither of us (:-).
:-)
Furthermore, there should be some way for xtime_breakup to disambiguate between UTC and TAI: so one more parameter...
I don't understand this comment. xtime_breakup is defined to work only on TIME_UTC timestamps. There's no need to break up TIME_TAI timestamps.
But it also works if a TIME_TAI value is passed, without any effort. This looks like a nice side effect of Markus' design to me. (I know that from a Standard C point of view, this is undefined behaviour, so this is not guaranteed to work; however, I do not see how it can fail).
In textual form, I do not see the problem. What is wrong with
fprintf(output, "%" PRIdFAST64 ":%" PRIdFAST32 "\n", xt.sec, xt.nsec); then fscanf(input, "%" SCNdFAST64 ":%" SCNdFAST32 "\n", &xt.sec, &xt.nsec);
That code doesn't report overflow well if the destination host's sec field is smaller than the source's.
This will require having a value outside the range ]-1<<63s, +1<<63s[... that is almost ]-300 Gyears, 300 Gyears[ ! Not a problem in practice for quite some time, I think ! ;-) Antoine

Date: Thu, 01 Oct 1998 10:29:34 +0200 From: Antoine Leca <Antoine.Leca@renault.fr>
But any time proposal must specify the resolution somehow. struct xtime specifies it as nanoseconds, for example; so there is an implied XTIME_PER_SEC value of 1000000000. I don't see this as a special drawback of my proposal.
To make my point clear: it requires the cost of the multiplication/division. Markus' proposal, OTOH, makes it possible to avoid it (at the cost of accuracy, as you highlighted), or to reduce it to the minimum.
I don't see why struct xtime will be more efficient in terms of multiplication and division; on the contrary, I think it's less efficient. At the low level, few clocks operate at exactly 1-ns speed, so struct xtime will need to be implemented by multiplication and division internally. (This is what Solaris 2.6 does for struct timespec, for example.) Under my proposal, this multiplication and division could be avoided. You're right that if the user needs to do time arithmetic, he may well have to do some multiplication and division; but that's pretty much inevitable: it's true even for the struct xtime proposal. On the other hand, if the user doesn't need to do any time arithmetic, then under the single-integer proposal it's possible that no multiplication or division will need to be done at all, even internally; the user will be dealing directly with the hardware clock value. This is a win.
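To make the internal cost concrete, here is a sketch of the conversion an implementation would have to perform when its hardware counter does not tick at exactly 1 ns. The function name, the struct layout and the bound on hz are assumptions made for illustration; under the single-integer proposal the raw tick count could be returned as is, with hz playing the role of STIMES_PER_SEC.

```c
#include <stdint.h>

struct xtime { int_fast64_t sec; int_fast32_t nsec; };  /* hypothetical layout */

/* Convert a raw tick count to seconds plus nanoseconds.
   Assumes 0 < hz <= 1000000000 so that (ticks % hz) * 1e9 fits in 64 bits. */
struct xtime ticks_to_xtime(uint64_t ticks, uint64_t hz)
{
    struct xtime xt;
    xt.sec = (int_fast64_t)(ticks / hz);                 /* one division */
    xt.nsec = (int_fast32_t)((ticks % hz) * UINT64_C(1000000000) / hz);
    return xt;
}
```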
That code doesn't report overflow well if the destination host's sec field is smaller than the source's.
This will require having a value outside the range ]-1<<63s, +1<<63s[... that is almost ]-300 Gyears, 300 Gyears[ !
You're right that this is obviously not a problem in terms of what times the clock will return, but it is a problem for routines that need to print and read clock values reliably, regardless of what the values are.
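A reading routine that does report out-of-range input can be built on strtoimax instead of fscanf. The sketch below is one possible approach, with an invented function name; it handles only the seconds field and, unlike the error-return convention discussed earlier, it does use errno internally because that is how strtoimax signals range errors.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>

/* Parse a decimal seconds field into int_fast64_t.
   Returns 0 on success, -1 on malformed input or a value out of range. */
int parse_sec_field(const char *text, int_fast64_t *out)
{
    char *end;
    errno = 0;
    intmax_t v = strtoimax(text, &end, 10);
    if (end == text || *end != '\0')
        return -1;                    /* no digits, or trailing junk */
    if (errno == ERANGE || v < INT_FAST64_MIN || v > INT_FAST64_MAX)
        return -1;                    /* does not fit on this host */
    *out = (int_fast64_t)v;
    return 0;
}
```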

Markus Kuhn wrote:
As I have pointed out before, I don't like the continent prefix in the Olson-style names, therefore I do not want to make this particular syntax immortal in an ISO standard (I much prefer just ":Paris" over "Europe/Paris").
Interesting example. Is ``:Paris'' "Paris, France", or "Paris, Texas", or perhaps one of the other half-dozen or so cities named Paris? I'm not particularly thrilled with the continent name either, but it does serve a purpose. --Ken Pizzini

Ken Pizzini wrote on 1998-09-30 00:22 UTC:
Markus Kuhn wrote:
As I have pointed out before, I don't like the continent prefix in the Olson-style names, therefore I do not want to make this particular syntax immortal in an ISO standard (I much prefer just ":Paris" over "Europe/Paris").
Interesting example. Is ``:Paris'' "Paris, France", or "Paris, Texas", or perhaps one of the other half-dozen or so cities named Paris?
Remember that time zone names refer to the most populated area within a region with common time zone history. This rule should already resolve practically all ambiguities. If there are really two Paris that are both candidates for TZ entries as they are both most populated areas in different time zone regions, then they should get qualifiers added (or at least all but the largest one should). This way, all those little Paris clones in the US are not of concern any more, and "Paris" would be guaranteed to refer to the real big one under the Eiffel Tower. To remove any ambiguity, we have the coordinates of the place, and a GUI TZ selector tool can easily indicate on a map what region we are talking about.
I'm not particularly thrilled with the continent name either, but it does serve a purpose.
But not very well. How many Paris are there in the US alone? An ISO 3166-1 country code or, where necessary, an ISO 3166-2 country/region code for those hypothetical cases where an ambiguity could occur would serve this purpose much better. The continent names come from the file organization of the Olson DB, and this implementation detail should IMHO not leak through to the name space. That's why I am not particularly happy with seeing iCalendar people making these continent/city names more permanent by quoting them in their standards. If I had to design proper tz names from scratch, they might look more like
fr.paris us.tx.paris
Markus -- Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>

Markus Kuhn wrote:
Interesting example. Is ``:Paris'' "Paris, France", or "Paris, Texas", or perhaps one of the other half-dozen or so cities named Paris?
If I had to design proper tz names from scratch, they might look more like
fr.paris us.tx.paris
You will need to include 3166-2 notation as well for Paris, e.g. "fr.75.Paris", since there are at least 4 places in Metropolitan France with such a name (add fr.71.Paris_l'Hopital, fr.26.Petit_Paris, fr.62.Paris_Plage). Of course, one is somewhat bigger than the others. So I had to cheat to get the names, because my system cleverly answered with only the first when I asked for this list... So, this is just to say that ":Paris", like ":Los_Angeles", should be unambiguous enough when speaking about *world* time zones, unless a small city lost in the fields far from the capital in a big country chooses to have its own history of time, and this city happens to be named Paris... Antoine

Markus Kuhn writes:
Before any arithmetic, you first check whether there is a leap second in one of the arguments and abort if there is (because arithmetic is ill-defined in this case).
Of course, libtai doesn't have that problem. libtai provides support routines for addition, subtraction, comparison, halving, conversions to and from the external TAI64NA format, and floating-point approximations. Applications don't need to worry about the internal details.
xtime_get can fail
In libtai, tai_now() and taia_now() always succeed. ---Dan 1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html

"D. J. Bernstein" wrote on 1998-10-03 02:23 UTC:
Markus Kuhn writes:
Before any arithmetic, you first check whether there is a leap second in one of the arguments and abort if there is (because arithmetic is ill-defined in this case).
Of course, libtai doesn't have that problem.
libtai provides support routines for addition, subtraction, comparison, halving, conversions to and from the external TAI64NA format, and floating-point approximations. Applications don't need to worry about the internal details.
xtime_get can fail
In libtai, tai_now() and taia_now() always succeed.
I think this claim is a bit unrealistic. The only common hardware platform where a correct and robust implementation of your libtai is possible would be a GPS receiver that has been running for at least 12 minutes with an antenna outside a building. I am not sure what the scope of your API is, but that sounds somewhat restricted to me. I use systems every day that might not know about TAI.
You might have defined yourself some sort of synthetic pseudo-TAI that you send through various correction filters and adjust to the real TAI whenever you have correct leap-second tables and an external UTC reference available, but that in the end is not guaranteed to have a reliable relation to the real TAI as published by BIPM/IERS. If it is not TAI, then please do not call it TAI, or be accused of sloppy terminology.
If you use some not clearly defined pseudo-TAI, then in the end what you provide is probably just one of the possible implementation forms of xtime's TIME_MONOTONIC, which is also guaranteed to always succeed, but which does not claim to have a relationship with TAI and which can therefore even be implemented on embedded systems with no non-volatile memory whatsoever and with no UTC or TAI reference. The xtime API was designed for a much broader scope and I think is much better suited as a template for an ISO standard.
I have looked at your libtai, and the claim that it doesn't have any of the problems that the xtime API has certainly does not hold up under even superficial examination. The clock model in your API seems to be a simplistic subset, and regarding the guarantees you make about the availability of TAI, it probably either uses misleading terminology or is just not implementable in a robust and reliable way on many platforms.
Markus -- Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>

Markus Kuhn writes:
xtime_get can fail
In libtai, tai_now() and taia_now() always succeed.
I think this claim is a bit unrealistic.
RTFM. These functions return their best guess as to the current time. Accuracy is a quality-of-implementation issue.
The only common hardware platform where a correct and robust implementation of your libtai is possible would be a GPS receiver that has been running for at least 12 minutes with an antenna outside a building.
Where do you get these absurd ideas? libtai works with whatever clock is available. There's nothing special about GPS.
If it is not TAI, then please do not call it TAI,
It is TAI. See http://pobox.com/~djb/proto/tai64.txt. ---Dan 1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html

Date: Sun, 04 Oct 1998 11:43:30 +0100 From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk> "D. J. Bernstein" wrote on 1998-10-03 02:23 UTC:
In libtai, tai_now() and taia_now() always succeed.
I think this claim is a bit unrealistic.
But do you agree that Bernstein's library could be used to implement a version of xtime_get(...,TIME_TAI) that always succeeds, so long as it never sets the TIME_SYNC bit of the result? In other words, so long as xtime_get never _claims_ to be accurate within 1 s, it would conform to your spec.

Paul Eggert wrote on 1998-10-05 21:02 UTC:
Date: Sun, 04 Oct 1998 11:43:30 +0100 From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
"D. J. Bernstein" wrote on 1998-10-03 02:23 UTC:
In libtai, tai_now() and taia_now() always succeed.
I think this claim is a bit unrealistic.
But do you agree that Bernstein's library could be used to implement a version of xtime_get(...,TIME_TAI) that always succeeds, so long as it never sets the TIME_SYNC bit of the result? In other words, so long as xtime_get never _claims_ to be accurate within 1 s, it would conform to your spec.
I don't want to forbid this interpretation of the standard, so yes, I would say it could be used. (see below for why)
For high quality implementations, I would prefer to see that implementations know what they do not know. For instance, I think a leap second table should have an expiry date (at the moment this would be around 12 months), because the information necessary to update it (IERS Bulletin C) is distributed every six months, and if a leap second table has not been updated for more than 12 months then it is very likely that using it beyond the last known TAI-UTC offset will lead to wrong results.
Note that implementations that are happy with such a best effort estimate of TAI can always use tz_jump to go back to the last known leap second and use xtime_conv() to read out the offset at that time. So even if the current TIME_TAI becomes unavailable due to an aging leap second table, with a bit more effort, applications can still get a more dangerous best effort estimate.
TIME_SYNC in TIME_TAI and TIME_UTC has basically the same meaning (i.e., we have now a link to a reference clock), while the availability of TIME_TAI means that we have reason to believe that we know what the current UTC-TAI value is (or we otherwise have received TAI directly in cases where we don't know UTC at all).
Do you think the specification should be formulated more strictly here and if yes, what would be your preferred formulation? Please keep in mind that my goal was to write an easily understandable specification that allows but does not enforce high quality implementations. Therefore I intentionally left details open, like when exactly TIME_TAI is allowed to be available, because how to do this correctly depends a lot on what resources the system has access to.
We can always add additional stricter requirements to the C API later in separate documents, for instance in further ISO standards about time protocols to be used in POSIX environments (with beautiful facilities for fully automated remote timezone and UTC-TAI table updates, etc.). Markus -- Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
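The "know what you do not know" idea for leap second tables can be sketched as follows. The structure, its field names and the function are all hypothetical; the draft discussed here deliberately leaves such details to the implementation.

```c
#include <stdint.h>

/* Hypothetical leap-second table summary: the last known TAI-UTC offset and
   the UTC second after which the table should no longer be trusted. */
struct leap_table {
    int tai_minus_utc;        /* seconds */
    int_fast64_t expires;     /* seconds since the epoch */
};

/* Store TAI-UTC in *offset and return 0, or return -1 without touching
   *offset if the table has expired at utc_sec. */
int tai_offset_at(const struct leap_table *t, int_fast64_t utc_sec, int *offset)
{
    if (utc_sec >= t->expires)
        return -1;            /* the implementation knows that it does not know */
    *offset = t->tai_minus_utc;
    return 0;
}
```

An implementation that prefers a best-effort estimate could fall back to the stored offset when this function fails, as long as it does not then claim the result is synchronized.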
participants (5)
- Antoine Leca
- D. J. Bernstein
- Ken Pizzini
- Markus Kuhn
- Paul Eggert