(I hope you don't mind, Paul, if I reply on tz.)
Paul Eggert wrote on 1998-09-29 01:34 UTC:
Here are some big-picture comments on your proposed extensions to ISO C <time.h>. I've hacked up a copy of your HTML proposal along these lines, if you're interested in reading it. (I still don't have a revised rationale, though.)
It's very hard work to come up with a good spec! After trying my hand at it, I respect the work you've done so far; the following comments are in the spirit of improving it to be as useful as possible.
I have some more detailed comments once you've had time to digest these bigger-picture comments, but first things first.....
* Terminology proposal: let's rename `xtime' to `stime' (short for ``standard C time''), uniformly. The rest of my discussion assumes this new terminology.
Hm, see my separate posting on the many different renaming proposals that I received.
* struct stime is inconvenient for programmers. It's hard to portably subtract such values, much less do other arithmetic on them. Even comparing them is error-prone. This structure's members are derived from a similar structure in POSIX.1, but the POSIX.1 structure was predicated on not having integer types longer than 32 bits.
Instead, let's just use a signed integer count of the number of time intervals since the epoch. We can define a type stime_t for this integer, and a stime_t macro STIMES_PER_SEC giving the number of time intervals per second. This makes time arithmetic much, much easier.
I am afraid I strongly disagree here. I think my current approach is more robust and functional, and therefore preferable. The reason: most C 9X implementations will not provide any type with more than 64 bits. If we take a 64-bit type, then you can have only nanosecond resolution over an interval of 58 years. This is unacceptable; even the notorious Y2K COBOL programs fail only after 99 years. Note that this is independent of whether you use a 64-bit int or a 64-bit float type; 64 bits are just not enough for a high-resolution timestamp.
About the "convenience" argument in general (and I'll come back to this for Antoine's xtime_get return value critique): compared to modern programming languages with their variable-length arrays, exceptions, garbage collectors, controlled name spaces, etc., C is and will always be a comparatively simple and quite inconvenient early-1970s language. If you want convenience, look for a language other than C. The interesting thing about C these days is mostly that it is ubiquitously implemented. Most programming environments for more modern and comfortable languages are built on top of the existing C API. If you read the C++, Java, and Ada95 standards, then it becomes embarrassingly obvious that the abilities of their standard libraries have been limited by the existing abilities of the C library. It is therefore extremely important to get the functionality of the C API right. Comfort should certainly come after functional criteria.
I do not understand why it should be difficult to portably add and subtract struct xtime values. The new <stdint.h> offers reasonably good support here. The algorithm is trivial: before any arithmetic, you first check whether there is a leap second in one of the arguments and abort if there is (because arithmetic is ill-defined in this case). Then you just add sec and nsec separately and adjust for the nsec overflow. Note that non-leap nsec values do not overflow the nsec type, because 2 * 1e9 < 2**31.
It all fits so nicely together with nanoseconds that I couldn't think of a more appropriate representation. The POSIX.1b people certainly made a good choice here. In programming languages such as C++ or Ada, where operators like "+" and "-" can be overloaded, the authors of bindings to the xtime API will certainly add the corresponding overloadings, and then xtime is as easy to use as a float type. If you insist, we could add macros or functions for struct xtime arithmetic, but I think it is better to let users do this themselves to make sure they have thought about the leap second aspects of arithmetic, which are certainly application dependent. (Again, the lack of an exception mechanism in C prevents us from providing an interface that is both robust and comfortable, so we stay on the robust side and leave the comfort to languages with proper mechanisms for it.)
How would you represent leap seconds in your stime_t arithmetic type? Besides the resolution concerns, this was after all the major reason for getting rid of time_t and its idea of using an arithmetic type.
* We should pass timestamps by value whenever possible, instead of by reference. Passing timestamps by reference makes the program more error-prone and (these days) slows it down. This is particularly important for stime_get.
That is an interesting point. Actually, I have never measured whether passing a 96-bit struct by reference is more efficient than passing it by value with the usual C compilers. Does anyone have any numbers on this for typical 32-bit processors? For stime_get, however, this is not an option: the return value is certainly required for comfortable error checking. Other programming languages with exceptions could probably return the time value here, but C has no exceptions, and therefore (as xtime_get can fail) we will have to use it in some form of if statement with an explicit exception handler.
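The calling pattern meant here, with the timestamp filled in by reference and the return value tested in an if statement, might look like the sketch below. demo_xtime_get is a hypothetical stand-in built on ISO C time(), not the proposed function; the DEMO_TIME_UTC value and the convention of returning zero on failure are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

struct xtime {
    int_fast64_t sec;
    int_fast32_t nsec;
};

#define DEMO_TIME_UTC 1   /* hypothetical clock identifier */

/* Hypothetical stand-in: fills *xtp and returns clock_type, or 0 on failure. */
static int demo_xtime_get(struct xtime *xtp, int clock_type)
{
    assert(xtp != NULL);
    if (clock_type != DEMO_TIME_UTC)
        return 0;                    /* unsupported clock */
    time_t now = time(NULL);
    if (now == (time_t)-1)
        return 0;                    /* the system has no usable clock */
    /* Assumes time_t counts seconds since 1970, as on POSIX systems. */
    xtp->sec = (int_fast64_t)now;
    xtp->nsec = 0;                   /* time() offers no sub-second part */
    return clock_type;
}

/* Typical use: the status is checked before the timestamp is touched. */
static int print_now(void)
{
    struct xtime t;
    if (!demo_xtime_get(&t, DEMO_TIME_UTC)) {
        fputs("clock not available\n", stderr);
        return -1;
    }
    printf("%lld s since the epoch\n", (long long)t.sec);
    return 0;
}
```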
* We should constrain the implementation to support times through a reasonable upper bound (e.g. 9999 AD, or perhaps something a bit sooner since we'll have to reform our calendar well before then). But we shouldn't insist on astronomical timescales; that places an undue burden on the implementation.
What would this burden be? We know that 32 bits are too small, and the next best size is a 64-bit word or, with nanoseconds, a 96-bit struct. The last non-power-of-two machine that I worked on was dumped more than 10 years ago and never had a C compiler. Such machines are of academic interest only. There is no practical intermediate size between a useful range and a 64-bit second counter. I think that in the end it is much more of a burden for the application programmer to leave the time encoding and epoch undefined.
* We should specify better what happens before the epoch for TIME_UTC and TIME_TAI.
Actually, I would suggest that we specify the functionality of a xtime_make(&xtp, &tmptr, NULL); call completely by providing an example of a correct implementation in the standard (this should be possible in less than 50 lines). Real code is much clearer here than any pseudo-mathematical specification. I started to add such example code to my web page a few days ago, but was interrupted and have not yet got around to finishing it. Volunteers are welcome (getting the leap year formula for negative years correct in particular might be a bit of a brain teaser). Such example code resolves many ambiguities and should also address your concern here.
As far as TIME_TAI is concerned, I do not expect *ANY* implementation to support xtime_conv with pre-1972 TIME_TAI values. The other functions do not care about the difference between TIME_TAI and TIME_UTC anyway, and the semantics of the TIME_UTC clock relative to the Gregorian calendar will be specified in the example code.
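As a rough idea of how the calendar part of such example code might look, including negative years, here is a sketch based on a commonly used era-based day-count formula. It is not the example code from the web page. The function names, the 1970-01-01 epoch, the proleptic Gregorian calendar with astronomical year numbering (year 0 is 1 BC), and the restriction to tm_sec values 0 to 59 (no leap seconds, no time zone) are all assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Days from 1970-01-01 to the given proleptic Gregorian date.
 * y may be zero or negative; m is 1..12, d is 1..31. */
static int64_t days_from_civil(int64_t y, int m, int d)
{
    y -= (m <= 2);                                   /* year starts in March */
    /* Floor division by 400, written out because C division truncates. */
    int64_t era = (y >= 0 ? y : y - 399) / 400;
    int64_t yoe = y - era * 400;                     /* year of era, 0..399 */
    int64_t doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    int64_t doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;   /* day of era */
    return era * 146097 + doe - 719468;              /* shift to 1970-01-01 */
}

/* Seconds since 1970-01-01 00:00:00 for a broken-down UTC time. */
static int64_t utc_seconds(const struct tm *tm)
{
    assert(tm != NULL);
    int64_t days = days_from_civil((int64_t)tm->tm_year + 1900,
                                   tm->tm_mon + 1, tm->tm_mday);
    return days * 86400 + tm->tm_hour * 3600 + tm->tm_min * 60 + tm->tm_sec;
}
```

Counting years from March puts the leap day at the end of the shifted year, which is why the same expression works on both sides of year 0 once the division by 400 is made to round downwards.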
* We shouldn't insist on a particular type like `int_fast64_t' for timestamps. I assume that this type was chosen to encourage portability,
I chose it for range and efficiency, and fortunately, with int_fast64_t, C provides me with exactly what I had in mind. I am not sure what type of portability you are referring to.
but it's not truly portable, since int_fast64_t might have more than 64 bits.
Please explain what the problem would be there. I don't see any.
Also, as mentioned above, it's way way overkill for almost all applications. It's better to keep the timestamp type a bit more abstract, like stime_t, and to place constraints on it as needed, as described above.
The abstractness of types like time_t in ISO C 89 was not introduced because this abstractness was considered to be good and beautiful design. On the contrary, it was a necessary hack because the ISO C standard had to be backwards compatible with a few strange C implementations. In this respect, ISO C 89 was a step backwards compared to K&R C. (Special thanks to Nick Maclaren <nmm1@cus.cam.ac.uk> for providing me with the historic background on this.) To my big surprise, many people today start to admire the hacky type mess that ISO C had to introduce because of bad PC compilers as glorious and wonderful abstractness, having completely forgotten the original reasons for all the awful type uncertainties of ISO C 89.
* The stime_get interface is confusing, since it has several operational functions. It's better to separate them out into one C function for each operational function; e.g. we should have a separate function stime_getres to get the resolution, instead of passing a flag to stime_get to have it return the resolution instead of the current time. Similarly, we should have a separate function that gives us an error bound on the clock, rather than have a TIME_SYNC flag.
I considered defining several functions, and then deliberately packed all of them into a single one. The main reason is that this packing of functionality allows implementors to add more functionality by just giving additional option and return bits a meaning, without cluttering up the namespace in non-portable ways. My proposed xtime_get interface allows additional clocks to be requested (UT1 is one example I mentioned in a note in the proposal), or additional data to be requested (such as whether the clock is coming in over a trusted channel, when the last connection to this clock was, the estimated size of the error interval, etc.). I prefer a single, more universal function that provides a clear way of adding functionality in a portable way over a set of functions that leave no room for extensions. It is always easy to add another bit, but it is difficult to add another function to implementations in a binary-compatible way (thinking about shared libraries and related issues).
* We should require at least POSIX-style time zone specifications for tz_prep, and should suggest Olson-style. As things stand, there's little that portable code can do with that function.
Requiring POSIX-style time zone specifications as a minimum functionality would be OK for me, but so would leaving the string completely application-defined and just mentioning POSIX as one possible implementation. I don't know whether the Olson-style ones are standardizable unless we get some sort of official ISO registry for time zone specifications. As I have pointed out before, I don't like the continent prefix in the Olson-style names, and therefore I do not want to make this particular syntax immortal in an ISO standard (I much prefer just ":Paris" over "Europe/Paris"). Note that the tzstring will usually be entered directly by the user (say via a config file or an environment variable), and the portable application does not have to be aware of the syntax used on this system.
* We should codify the convention that getenv("TZ") returns the user-preferred time zone. A null tzstring should expand to a system-defined time zone. This functionality is in practice and is useful e.g. in mailer software, where you may or may not want to allow the user to specify the time zone.
OK, that sounds like a good idea.
* stime_make still has the old mktime problem that the function can't distinguish a request for ``3 days after Feb 28'' from a request for ``1 month before Mar 31''. It's silly that mktime thinks that 1 month before Mar 31 is Mar 3 (or 2, if it's a leap year). We should fix this.
I think I fixed this very nicely by not requiring xtime_make to handle *any* invalid time representations. If you want 3 days after Feb 28, then you just add 3 * 86400 to the sec field. The extremely ugly hacks with mktime overflows and underflows have been made obsolete by defining the encoding of the time representation to allow direct arithmetic. It is very difficult to define mktime overflow behavior nicely, so why bother if there is no need?
* The proposal for strfstime still involves magic; e.g. how can the strftime easily determine the time zone abbreviation?
strfxtime gets the timezone object passed as a parameter. All information can be stored in there and accessed directly. No need to extend struct tm.
I'd rather have a strftime that could in principle be written by the user; it's common practice to write augmented strftime implementations, which grind their teeth over questions like these, and it'd be better if we let people write such functions cleanly.
If you need access to the zone name, then simply use strfxtime to access it. I don't understand why it should be stored in any struct tm extension if we have a function to read it.
This will involve a new type `struct stm' that contains extra members so that no more magic is needed. Something like this:
struct stm {
    stime_t year;
    int month;
    int month_day;
    int hour;
    int minute;
    int second;
    stime_t utc_offset;
    stime_t dst_offset;
    const char *zone;
    const char *zone_description;
};
No, please no additional structs. The existing one is good enough to represent and handle a full broken-down time. All other information is fully accessible via strfxtime. I consider the mixture of broken-down time and timezone information conceptually dubious. For me a time zone is a function that maps broken-down times onto UTC, and not just auxiliary data on a broken-down time. The existing tm_isdst field seems to me to be sufficient (although not optimal) to handle ambiguities.
* We shouldn't have a separate error function just for stime. Instead, functions that report errors should yield an error number, which can be passed as an argument to strerror. This will simplify the interface (e.g. we don't need to worry about LC_MESSAGES).
That's perhaps worth thinking about. It depends on how detailed you want to make these error messages, and whether a single number can carry all the information. Timezone strings can be rather tricky to get right, and a comfortable diagnostic might be useful here.
Thanks for your comments.
Markus
--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>