Functions add/diff/cmp for xtime
Several commentators have expressed a desire to have a few basic predefined arithmetic functions on xtime values. I suggest adding the following three functions. These functions are so simple that I feel the easiest way of specifying their behaviour is by providing a sample implementation (i.e., using C as the specification language instead of English):

------------------------------------------------------------------------
#include <math.h>  /* floor(), modf() */

/*
 * Assertions:
 *
 *   fabs(xtime_diff(xtime_add(t, d), t) - d) <= 0.5e-9
 *     for all t, d with fabs(t.sec + d) < 2**62
 *     and t.nsec in the valid range
 *
 *   xtime_cmp(xtime_add(t, d), t)
 *     is in the set {-1, 0, 1}, has the same sign as d,
 *     and is 0 iff d == 0.0
 */

int xtime_cmp(struct xtime t1, struct xtime t2)
{
  if (t1.sec == t2.sec && t1.nsec == t2.nsec)
    return 0;
  if (t1.sec > t2.sec || (t1.sec == t2.sec && t1.nsec > t2.nsec))
    return 1;
  return -1;
}

double xtime_diff(struct xtime t1, struct xtime t2)
{
  if (t1.nsec > 1000000000) t1.nsec = 1000000000;
  if (t2.nsec > 1000000000) t2.nsec = 1000000000;
  return (t1.sec - t2.sec) + (t1.nsec - t2.nsec) / 1.0e9;
}

struct xtime xtime_add(struct xtime t, double d)
{
  double ds;

  if (t.nsec > 1000000000) t.nsec = 1000000000;
  t.nsec += floor(modf(d, &ds) * 1.0e9 + 0.5);
  t.sec += ds;
  while (t.nsec >= 1000000000) t.sec++, t.nsec -= 1000000000;
  while (t.nsec < 0) t.sec--, t.nsec += 1000000000;
  return t;
}
------------------------------------------------------------------------

Note that concerning leap seconds, the arithmetic functions follow exactly the spirit of the TIME_UTC definition and do not count inserted leap seconds as time worth counting. So even if there is an inserted leap second 1972-06-30 23:59:60 UTC, the xtime_diff difference between 1972-06-30 00:00:00 UTC and 1972-07-01 00:00:00 UTC is still precisely 86400.0 seconds. This allows you to use time intervals such as 86400.0 in order to mean "one day later" on the UTC clock, which is what most applications need (and what POSIX has provided so far). Note that I have also carefully defined the behaviour of the functions for the case that they operate inside leap seconds, both to conform to this model and to fulfil the - I think - desirable algebraic property that I specified in the assertions.

If you are *really* interested in the number of SI seconds between two UTC timestamps (very few applications will really need this, and those should usually use TIME_MONOTONIC instead of UTC), then just convert both via xtime_conv() into TAI timestamps (assuming the information is available), and then use xtime_diff on the resulting TAI timestamps:

  if (!xtime_conv(&tai1, TIME_TAI, &utc1, TIME_UTC) &&
      !xtime_conv(&tai2, TIME_TAI, &utc2, TIME_UTC))
    return xtime_diff(tai1, tai2);
  else {
    /* relevant leap second history not available */
  }

Very simple, very intuitive, very robust, very clear semantics, and very straightforward. *** I love it. *** :-)

Other questions:

Has anyone made some experiments, or can anyone contribute arguments, to decide whether passing 96-bit structs by pointer or by value is more desirable?

Has anyone completed the UTC part of the xtime_make() specification code in the proposal, or do I have to do this myself?

Is anyone willing to merge the xtime text with the existing C9X draft and also include the change requests that Paul submitted to ANSI?

Any suggestions regarding the sample code of the above functions?

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org,  home page: <http://www.cl.cam.ac.uk/~mgk25/>
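[A short usage sketch may make the assertions concrete. This is an illustration only, not part of the proposal text: the function leap_demo and the assumption about which midnight its parameter denotes are hypothetical. Assume midnight holds the TIME_UTC value of 1972-07-01 00:00:00 UTC, the midnight that ends the inserted leap second, and that sec does not count leap seconds.]

   #include <assert.h>

   void leap_demo(struct xtime midnight)
   {
     struct xtime day_before, in_leap;

     day_before = midnight;
     day_before.sec -= 86400;       /* 1972-06-30 00:00:00 */
     in_leap = midnight;
     in_leap.sec -= 1;
     in_leap.nsec = 1500000000;     /* 1972-06-30 23:59:60.5 */

     assert(xtime_diff(midnight, day_before) == 86400.0); /* one day */
     assert(xtime_cmp(in_leap, midnight) == -1); /* sorts before it */
   }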
Markus Kuhn writes:
If you are *really* interested in the number of SI seconds between two UTC timestamps (very few applications will really need this,
Ever heard of TCP/IP? How about UNIX? How about system logs? Timestamps in logs aren't just for clock displays; they're also used for profiling and accounting. Time differences are crucial. Why do you refuse to admit that clocks are used this way?
and those should usually use TIME_MONOTONIC instead of UTC
Do you expect people to record two clocks in their logs? A single TAI clock is better for everybody.

---Dan
1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html
D. J. Bernstein wrote:
Markus Kuhn writes:
If you are *really* interested in the number of SI seconds between two UTC timestamps (very few applications will really need this,
But there *are* applications that need it, right?
Timestamps in logs aren't just for clock displays; they're also used for profiling and accounting. Time differences are crucial.
Why do you refuse to admit that clocks are used this way?
OTOH, your API impedes the use Markus is speaking about (discarding leap seconds), and that is a nuisance sometimes. (Many people are *very* comfortable with the fact that a day is 86400 seconds, not a pseudo-random value...)

The only reasonable solution is to have the option. Currently, the only API which gives me the choice one way or another is Markus', provided that every struct xtime parameter may be tagged with the information whether it is really TIME_TAI *or* TIME_UTC (as defined by Markus, with the added point that the calendar should be the proleptic Gregorian). This is the very reason why I prefer Markus' solution: it allows both kinds of clocks to be handled nicely. OTOH:

- C89 is incomplete in this area.

- POSIX states: I support only UTC without leap seconds; that is broken from the QoI point of view.

- C9X states: once the number of leap seconds for a timestamp is determined (as per mkxtime or zonetime), it cannot be changed*, effectively giving us a new clock scale that happens to coincide with UTC for a limited range of time: that is broken.

  *: unless you specifically add tmx.tm_leapseconds = _NOLEAPSECONDS; before calling mkxtime a second time; IMHO, this is a nuisance, and it won't be done this way by programmers.

- Mr Bernstein only provides TAI as a basis: while I agree the same effects can be achieved, I believe this is not that natural to most people. In particular, there is a danger if struct tai is transported from one host to another when they disagree about the leap second table: effectively, this means either *every* host should have an up-to-date leap second table, which is unrealistic, or struct tai should not be exchanged - but then why specify it?
and those should usually use TIME_MONOTONIC instead of UTC
Do you expect people to record two clocks in their logs? A single TAI clock is better for everybody.
I disagree. This is the same as saying "English is better for everybody", or "the Euro is better for everybody". While I agree with the second ;-), I disagree with the first. YMMV ;-)

Also, while TAI timestamps might be better, this is certainly not what is happening: everybody uses UTC (or LT) timestamps, usually in textual form, with :60 indicating leap seconds. This is certainly equivalent, but it is not the same.

Antoine
Antoine Leca wrote on 1998-10-07 10:29 UTC:
D. J. Bernstein wrote:
Markus Kuhn writes:
If you are *really* interested in the number of SI seconds between two UTC timestamps (very few applications will really need this,
But there *are* applications that need it, right?
Sure there are. But *much* fewer than some people here tend to think. You need something like TAI to get correct SI second intervals across reboots of systems where TIME_MONOTONIC is not preserved. You certainly need TAI to control astronomical instruments, to time astronomical or geophysical observations, or to navigate spacecraft. You need TAI, as Dan pointed out, for some types of accounting (although you really should first investigate the details here before you just assume that charging for leap seconds is what is really desired).
The only reasonable solution is to have the option. Currently, the only API which gives me the choice one way or another is Markus', provided that every struct xtime parameter may be tagged with the information whether it is really TIME_TAI *or* TIME_UTC (as defined by Markus, with the added point that the calendar should be the proleptic Gregorian).
Note that there is no reason for the tagging to show up in the bits of the value. The tagging can be done implicitly by naming your variables in software such that they make clear whether they are intended for UTC or TAI or whatever values. If you want to do this more formally, then in languages like Ada you have powerful subtyping mechanisms that allow you to build APIs that make it unlikely that programmers will accidentally mix UTC and TAI values in an inappropriate way. For this, you would of course have overloaded versions of xtime.get.

Question: What precisely does "proleptic" mean, and where is it defined? The most official definition of the Gregorian calendar that I have to reference is ISO 8601, and it does not use the term "proleptic". My time and astronomy references do not define the term either. (I suspect it means "extended before the time it was defined", but I would like to get a confirmation and a reference.)

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org,  home page: <http://www.cl.cam.ac.uk/~mgk25/>
Markus Kuhn wrote:
Note that there is no reason for the tagging to show up in the bits of the value. The tagging can be done implicitly by naming your variables in software such that they make clear whether they are intended for UTC or TAI or whatever values.
It is not *necessary* for the timestamp-type to appear in the timestamp, but doing so costs only a few bits and provides an excellent defensive programming mechanism: it prevents you from attempting to convert TAI timestamps to the local timezone, a meaningless activity.
Question: What precisely does "proleptic" mean, and where is it defined? The most official definition of the Gregorian calendar that I have to reference is ISO 8601, and it does not use the term "proleptic". My time and astronomy references do not define the term either.
The WWWebster dictionary defines "prolepsis" as:

   anticipation: the representation or assumption of a future act or
   development as if presently existing or accomplished [...]

"Proleptic" would be the standard adjectival form.

The HP MPE/ix documentation (http://jazz.external.hp.com/src/year2000/dateintr.txt) says:

# All the date intrinsics follow what is called the "Proleptic Calendar".
# Stated in simple terms this calendar ignores the fact that calendars in
# different countries changed at different times (around the year 1753).
# In other words, there are no lost days and there is no year 0. This is
# similar to the calendar used by ALLBASE/SQL date/time functions.

The Julian Day count is based on a "proleptic Julian calendar", i.e. the Julian calendar as if it had been in use from 4713 B.C.E. to the present.

--
John Cowan     http://www.ccil.org/~cowan     cowan@ccil.org
     You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
     You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
          Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)
Antoine Leca writes:
OTOH, your API impedes the use Markus is speaking about (discarding leap seconds), and that is a nuisance sometimes.
Oh, really? Show us some programs where you claim it's a nuisance.
(Many people are *very* comfortable with the fact that a day is 86400 seconds, not a pseudo-random value...)
That isn't a fact; it's a fantasy. A UTC day is _not_ guaranteed to be 86400 seconds. If your code doesn't work correctly during leap seconds then it's wrong. libtai makes it easy to do the right thing.
effectively, this means either *every* host should have an up-to-date leap second table, which is unrealistic,
Repeating that assertion doesn't make it true. The cost of distributing up-to-date leap-second tables is minor---certainly much less than the costs imposed on future programmers by Markus's API.
Also, while TAI timestamps might be better, this is certainly not what is happening: everybody uses UTC (or LT) timestamps,
Wrong. Serious accounting doesn't rely on amateur toys like syslog.

---Dan
1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html
D. J. Bernstein wrote:
Antoine Leca writes:
OTOH, your API impedes the use Markus is speaking about (discarding leap seconds), and that is a nuisance sometimes.
Oh, really? Show us some programs where you claim it's a nuisance.
My usual use for the date and time functions in the C library is to calculate a date that is X u away from a given point, where "X" is a number and "u" is some unit of time. The most usual is "in 1 month", but a very common one is "in 1 week". The first computation requires the use of mktime normalisation. But the second does not (if I start with a time_t value), because, *in the absence of leap seconds*, I just need to add 7*86400L to my start value. I do not claim this is the most used function of the library, because it certainly isn't. But since you asked for an example, I give you the best I can find. Also, don't argue with me that your library allows it. I know it perfectly well. My point is that it is not *that* easy.
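[A minimal sketch of the two cases just described - an editor's illustration, assuming a POSIX-style time_t with 86400-second days; the helper names are made up:]

   #include <time.h>

   /* "in 1 week": plain arithmetic suffices, because every
      POSIX day has exactly 86400 seconds */
   time_t one_week_later(time_t start)
   {
     return start + 7 * 86400L;
   }

   /* "in 1 month": months differ in length, so mktime()
      normalisation is needed */
   time_t one_month_later(struct tm start)
   {
     start.tm_mon += 1;      /* may leave tm_mon == 12 ...         */
     start.tm_isdst = -1;    /* let the implementation decide DST  */
     return mktime(&start);  /* ... mktime() carries it to tm_year */
   }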
(Many people are *very* comfortable with the fact that a day is 86400 seconds, not a pseudo-random value...)
That isn't a fact; it's a fantasy. A UTC day is _not_ guaranteed to be 86400 seconds. If your code doesn't work correctly during leap seconds then it's wrong.
Good point, but I won't argue with that. I, and most users as well, do not care what precise value a UTC day, or hour, or second, has. They (and I) want reliable and mostly straightforward ways to resolve *their* problems, not to have to write convoluted programs to handle the fantasy of Earth movement.
libtai makes it easy to do the right thing.
For knowledgeable people, yes, and everybody here seems to agree with you. My point is that it *requires* some users to "do the right thing" for the most simple tasks, even if their tasks have no real link to the bare reality of the UTC time scale. BTW, what is your objective? Do you think we should take your library and paste it into the C standard?
effectively, this means either *every* host should have an up-to-date leap second table, which is unrealistic,
Repeating that assertion doesn't make it true.
What is wrong? The need for the table (which you are speaking about just after)? Or the fact that its distribution is unrealistic?

If it is the realism, let me give you an example: I am in charge of the maintenance of thousands of machines running Windows 95. When W95 appeared on the market, it carried a wrong table for *our* time zone (change at midnight local time, the last Sunday of September!). We are in 1998, 60 months away from the needed change in the table; nevertheless, we had quite a number of users who shifted two weeks ago... because their tables had not been updated.

I agree with you that the updating of the leap second table is much more critical, so it cannot suffer a delay that big. But since the difference is only seconds, I am quite sure *most* of my users won't be up to date at any given point in time. Again, there is a difference between a knowledgeable person, like you are, and Jean Utilisateur-Moyen. Also, I do not want a C standard that requires every host to run NTP or an equivalent protocol...
The cost of distributing up-to-date leap-second tables is minor---certainly much less than the costs imposed on future programmers by Markus's API.
At least in my part of the world, the cost of distributing any patch is much heavier than any burden imposed on any programmer, however severe it is (that is a little too strong, but you should get the idea). The very point is that my fellow programmers (and I as well) do not accept this reality, and certainly won't change their habits, so don't do it. But that does not change the fact.

Also, where do you find overwhelming costs behind the adoption of Markus's proposal? I am not Markus, and I do not particularly want to promote his work. But I also see only a very limited part of the picture, so I would like to know where the problems of each solution are, to try to forge a better one. If we spend our time criticizing, nothing will happen.

Antoine
Antoine Leca writes:
I just need to add 7*86400L to my start value.
Is that actually guaranteed to work with Markus's library? Don't you get an invalid time if nsec starts out above 999999999?
They (and I) want reliable and mostly straightforward ways to resolve *their* problems, not to have to write convoluted programs to handle the fantasy of Earth movement.
With libtai the programmer simply says what he means:

   caltime_utc(&ct,&sec,(int *) 0,(int *) 0);  /* gmtime() */
   ct.date.day += 7;
   caltime_tai(&ct,&sec);                      /* mktime() */

What exactly is ``convoluted'' about that? How has libtai ``impeded'' time handling?
BTW, what is your objective? Do you think we should take your library and paste it into the C standard?
Of course not. libtai is still under development.
Also, where do you find overwhelming costs behind the adoption of Markus's proposal?
Markus's proposal is a disaster from the programmer's point of view. I didn't say this was an ``overwhelming'' cost; I simply said that it's much larger than the cost of distributing leap-second tables.

---Dan
1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html
"D. J. Bernstein" wrote on 1998-10-08 17:15 UTC:
Antoine Leca writes:
I just need to add 7*86400L to my start value.
Is that actually guaranteed to work with Markus's library?
Yes, it is. That was the reason why some head scratching went into the selection of how xtime_add() and xtime_diff() treat leap second values.
Don't you get an invalid time if nsec starts out above 999999999?
No. xtime_add first converts 23:59:60.1234 into 24:00:00 before adding the other parameter. A leap second is additional time between 23:59:59.99999 and 24:00:00. This additional time is not available when we go outside the leap second. So we first have to "conceptually" shrink the borders of the leap second to a zero interval, and then the original leap second timestamp gets trapped at 24:00:00, from where you continue with normal arithmetic.

An alternative (probably much more intuitive) model of thought for the same behaviour is the following: TIME_UTC does not count leap seconds as real time. It ignores the time inserted by a leap second. Think of it in the following image: If you drive with your car along the timeline, think of leap seconds as slippery intervals of the street of time where your wheels block and your odometer (xtime_diff) stops and therefore ignores the leap second. In this model of thought, xtime_add first has to bring you out of the leap second, and then has to add the required time interval in real non-leap time. If you start driving inside a slippery leap second, your odometer doesn't start to count from 0 to 7*86400L before you have left the leap second. If you visualize UTC arithmetic this way, it suddenly becomes very intuitive (at least for me; try it yourself).

This way, a lot of algebraic properties are nicely preserved by xtime_add and xtime_diff. If they were defined on a single set (not xtime + double), and if we ignore cosmological overflow, then they would form proper group operations.
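[A numeric sketch of this "trapped at 24:00:00" behaviour, using the xtime_add() sample from the start of this thread. The sec value 1234567 is made up; assume it denotes a 23:59:59 that immediately precedes an inserted leap second:]

   struct xtime t;

   t.sec = 1234567;      /* hypothetical 23:59:59 before a leap second */
   t.nsec = 1500000000;  /* i.e. we are at 23:59:60.5, inside the leap */

   t = xtime_add(t, 0.0);
   /* nsec was clamped to 10^9 and then normalised away:
      t is now { 1234568, 0 }, i.e. 00:00:00 sharp */

   t = xtime_add(t, 7 * 86400.0);
   /* t is now { 1839368, 0 }: exactly one week of non-leap
      time after the midnight that ended the leap second */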
They (and I) want reliable and mostly straightforward ways to resolve *their* problems, not to have to write convoluted programs to handle the fantasy of Earth movement.
With libtai the programmer simply says what he means:
caltime_utc(&ct,&sec,(int *) 0,(int *) 0);  /* gmtime() */
ct.date.day += 7;
caltime_tai(&ct,&sec);                      /* mktime() */
With xtime, the programmer simply says what she means:

   #define DAY 86400.0

   t = xtime_add(t, 7*DAY);

In Ada (a language with strong typing and full operator overloading), the equivalent API will look even simpler, because xtime_add becomes of course

   function "+"(Left : Xtime; Right : Double) return Xtime;

   t := t + 7 * DAY;

In an Ada API, you would of course also add

   function "+"(Left : Xtime; Right : Double) return Xtime;
   function "+"(Left : Xtime; Right : Long_Integer) return Xtime;
   function "+"(Left : Xtime; Right : Xtime) return Xtime;
   function "+"(Left : Double; Right : Xtime) return Xtime;
   function "+"(Left : Long_Integer; Right : Xtime) return Xtime;

but for C this would probably be clumsy and overkill. The implementation of those should be obvious from the xtime_add example.
What exactly is ``convoluted'' about that? How has libtai ``impeded'' time handling?
Beauty is in the eye of the beholder, but I reserve the right to consider my code more straightforward. I also do not have to define an equivalent of mktime() with overflow rules as you seem to have, because I always add seconds directly. Look at the tmx proposal to see how convoluted just the proper definition of mktime becomes in that case. Code becomes difficult to read once people start to add months and years via hidden overflow rules, as months and years differ in length. I would feel highly uncomfortable doing my time arithmetic via mktime()-like functions. That was after all the whole reason for completely abandoning time_t in my proposal.
Markus's proposal is a disaster from the programmer's point of view.
Thank you for your honest opinion. (Wait, so must be POSIX and BSD then, on which my proposal is *very* closely modeled after all ...)

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org,  home page: <http://www.cl.cam.ac.uk/~mgk25/>
Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk> wrote:
I also do not have to define an equivalent of mktime() with overflow rules as you seem to have, because I always add seconds directly. Look at the tmx proposal to see how convoluted just the proper definition of mktime becomes in that case. Code becomes difficult to read once people start to add months and years via hidden overflow rules, as months and years differ in length. I would feel highly uncomfortable doing my time arithmetic via mktime()-like functions.
Which reminds me. I think that there _should_ be an API for doing date arithmetic, and I agree that mktime() is problematic with its auto-normalization rules having to guess at what the application programmer was trying to do. Perhaps something like:

   struct tm add_tm(struct tm base, struct tm delta);

This, to compute the date six months and three days before date X:

   struct tm d;
   d.tm_hour = d.tm_min = d.tm_sec = 0;
   d.tm_year = 0;
   d.tm_mday = -3;
   d.tm_mon = -6;
   newX = add_tm(X, d);

Now the API itself needs a fair bit more thought than I've given it; I'm just tossing this one out as a starting point for discussion. But I feel strongly that it is an important function to have. Application programmers are likely to get things like leap year rules wrong (whether from ignorance or laziness), yet the desire to use these non-constant-number-of-seconds intervals is fairly common. Since routines such as xtime_breakup() already need to have full knowledge of the structure of our calendar, it makes sense to make use of this knowledge for broken-down date arithmetic also.

--Ken Pizzini
Ken Pizzini writes:
This, to compute the date six months and three days before date X:
Could you explain exactly what you mean by that? What is 1 month before 31 March? What is 1 month before 16 March? What is 1 month before 1 March? What is 1 month after 1 February? What is 1 month after 16 February? What is 1 month after 28 February?
Now the API itself needs a fair bit more thought than I've given it;
Semantics first, please.

---Dan
1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html
Date: 9 Oct 1998 23:17:26 -0000
From: "D. J. Bernstein" <djb@cr.yp.to>

   What is 1 month before 31 March?

I see two answers to that question:

 * ``It's an error to ask that question.'' I.e. the mktime replacement would report an error if asked that question.

 * 29 February if it's a leap year, 28 February otherwise.

There are reasonable arguments for either answer. Perhaps the standard should leave it open.

A closely related question is ``What is 1 minute before 23:59:60?''.

Here's a good one: ``What is 1 month before 1 February 1995 in Kiritimati?''. There was no 1 January 1995 in Kiritimati; they skipped a day by moving the clocks ahead 24 hours. Similar issues arise when asking about times that occur within smaller, more typical UTC offset changes.

A number of obscure questions arise here, and I'm not sure it's worth standardizing all the answers.
Paul Eggert writes:
* 29 February if it's a leap year, 28 February otherwise.
If that's 1 month before 31 March, then what's 1 month before 30 March? 1 month before 29 March? 1 month before 28 March? And what's 1 month after 28 February? 1 month after 27 February? 1 month _before_ 27 February? I wouldn't mind adding more date-arithmetic functions to libtai/caldate, but I haven't seen a coherent explanation of the desired functions.
* ``It's an error to ask that question.''
I find it difficult to believe that users would be satisfied with that answer. What applications do you have in mind?

---Dan
1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html
Date: 10 Oct 1998 15:35:05 -0000
From: "D. J. Bernstein" <djb@cr.yp.to>

Paul Eggert writes:
* 29 February if it's a leap year, 28 February otherwise.
   If that's 1 month before 31 March, then what's 1 month before 30
   March? 1 month before 29 March?

The same as 1 month before 31 March.

   1 month before 28 March?

28 February.

   And what's 1 month after 28 February? 1 month after 27 February?
   1 month _before_ 27 February?

28 March, 27 March, and 27 January.

Obviously date arithmetic like this won't have the nice properties of ordinary arithmetic: e.g. when you add 1 month and then subtract 1 month you may not end up where you started. Too bad, but that's life.

The real problems occur when people start futzing with UTC offsets, e.g. the Kiritimati example I mentioned earlier. Then it becomes much harder to specify rules that match most people's intuition, since people don't have as much intuition for those sorts of problems. And there are also problems when you consider people futzing with calendars, e.g. what's 1 month before 10 October 1752 in Great Britain? But we're starting to stray from the subject....
Paul Eggert writes:
The same as 1 month before 31 March.
Okay. You can easily implement that on top of something like mktime():

   void month_add(struct caldate *out,struct caldate *in,int m)
   {
     *out = *in;
     out->month += m;
     normalize(out);            /* carry month overflow, like mktime() */
     if (out->day != in->day) { /* we spilled into the following month */
       out->day = FIRST - 1;    /* back up to the last day of the
                                   intended month */
       normalize(out);
     }
   }

This example certainly doesn't convince me that mktime()'s handling of invalid dates is a bad thing.

---Dan
1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html
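[normalize() and FIRST above are pseudocode. A compilable sketch of the same idea in terms of struct tm and mktime() - an editor's translation, not Dan's code - could read as follows; mktime() plays the role of normalize(), and tm_mday == 0 denotes the day before the first of the month:]

   #include <time.h>

   struct tm month_add_tm(struct tm in, int m)
   {
     struct tm out = in;

     out.tm_mon += m;
     out.tm_isdst = -1;
     mktime(&out);                    /* e.g. 31 Feb -> 3 March     */
     if (out.tm_mday != in.tm_mday) { /* spilled into next month    */
       out.tm_mday = 0;               /* last day of previous month */
       mktime(&out);                  /* -> 28 (or 29) February     */
     }
     return out;
   }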
On 11 Oct 1998 01:32:39 -0000, D. J. Bernstein <djb@cr.yp.to> wrote:
This example certainly doesn't convince me that mktime()'s handling of invalid dates is a bad thing.
True, the examples we have been dealing with have not been crossing the line that mktime() has trouble with. Those problems mainly occur when one has to deal with a discontinuity, such as the one imposed by a Summer Time crossing. But I'm prepared to drop the idea, because of the point that Joe Celko raised: there is *no* well-accepted standard in the "real world" for how to deal with this. Each application will have to implement its own rules. I still wish there were some good way to provide applications with the knowledge of the calendar which the time portion of the C library has, so that we didn't have to have each application writer make their own idiosyncratic mistakes in this realm, but I cannot think of how to do this in a reasonable manner.

--Ken Pizzini
On 9 Oct 1998 23:17:26 -0000, D. J. Bernstein <djb@cr.yp.to> wrote:
What is 1 month before 31 March? What is 1 month before 16 March? What is 1 month before 1 March?
What is 1 month after 1 February? What is 1 month after 16 February? What is 1 month after 28 February?
Of the questions you ask, the only one which I find poorly defined is "1 month before 31 March". If I had wanted, say, 30 days before any of those dates, I would have used d.tm_mday = -30 (or even just direct arithmetic on a TIME_UTC xtime date). (Of course, there are many other poorly defined situations; I just don't think that the other dates you ask about are problematic.)
Now the API itself needs a fair bit more thought than I've given it;
Semantics first, please.
The basic plan is: add the (signed) values in the delta to the corresponding fields in the base tm, using overflow rules appropriate to each field. Thus an underflow in tm_mon causes tm_year to decrement, and an overflow in tm_mday causes an increment of tm_mon. As to the ambiguous cases, how about this revision to my off-the-cuff API (see the sketch below):

   int add_tm(struct tm *value, struct tm delta, timezone_t tz,
              int guess_flag);

If "guess_flag" is zero, then add_tm() shall fail if the request is not well defined ("one month after March 31"). If guess_flag is non-zero, then add_tm() shall force some plausible interpretation (*) on the result (e.g., adjusting a tm_mon where tm_mday is in {29,30,31} and the target month won't hold that value results in tm_mday being forced to the maximum value for the target month).

(*) Yes, this needs to be much better defined.

I'm first interested in whether others think that the add_tm() function would be appropriate to add to the standard, assuming that it can be adequately defined. I suspect that an appropriately weasel-worded description can be worked out that should satisfy most potential users of such a function most of the time, even though I sincerely doubt that there exists a single definition which will work for all possible application domains.

My main concern is: this is an often-desired function, as ill-defined as it may be. Using C89 the best one could do is to use mktime() to do the normalization, but the mktime() interface loses information which can be helpful to the implementation in trying to disambiguate what the user was trying to request. While in an ideal world we could tell our clients that their business rules are ill-defined, as a practical matter we will need to do our best to handle requests of the form "six months from today". Sometimes we are allowed to interpret this as "180 days from today"; sometimes we are required to add 6 to tm_mon, and overflow into tm_year if need be; and what to do if the tm_mday does not exist in the target month varies (but typically for such business rules, we take the last day of the target month in this situation).

I suppose that what would be even better is if there were some reasonable way to directly expose the encoding of the calendar (which the implementation needs in order to handle xtime_breakup(), e.g.) so that an application can make use of it for its needs, as I feel that per-application attempts to duplicate this information tend to be quite error-prone. I just can't think of such an interface, and add_tm() is intended to provide the main functionality that I think applications would want such exposure for.

--Ken Pizzini
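[One possible sketch of the add_tm() semantics just proposed - an illustration under stated assumptions, not a worked-out specification: the timezone argument is omitted, only the month/day clamping rule is handled specially, and the Gregorian leap year rule is hard-coded:]

   #include <time.h>

   static int days_in_month(int year, int mon)  /* year AD, mon 0-11 */
   {
     static const int len[12] =
       { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

     if (mon == 1 &&
         year % 4 == 0 && (year % 100 != 0 || year % 400 == 0))
       return 29;
     return len[mon];
   }

   int add_tm(struct tm *value, struct tm delta, int guess_flag)
   {
     struct tm t = *value;

     t.tm_year += delta.tm_year;
     t.tm_mon  += delta.tm_mon;
     while (t.tm_mon < 0)  { t.tm_mon += 12; t.tm_year--; }
     while (t.tm_mon > 11) { t.tm_mon -= 12; t.tm_year++; }

     if (t.tm_mday > days_in_month(t.tm_year + 1900, t.tm_mon)) {
       if (!guess_flag)
         return -1;               /* "31 April": not well defined */
       t.tm_mday = days_in_month(t.tm_year + 1900, t.tm_mon);
     }

     /* day/hour/minute/second carries are unambiguous:
        let mktime() do them */
     t.tm_mday += delta.tm_mday;
     t.tm_hour += delta.tm_hour;
     t.tm_min  += delta.tm_min;
     t.tm_sec  += delta.tm_sec;
     t.tm_isdst = -1;
     if (mktime(&t) == (time_t)-1)
       return -1;
     *value = t;
     return 0;
   }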
Ken Pizzini writes:
I just don't think that the other dates you ask about are problematic.
You misunderstand. Some people seem to believe that 1 month before 31 May is ``clearly'' not 1 May; it is ``obviously'' 30 April. Does this mean that 1 month before 30 May is 29 April? And 1 month before 29 May is 28 April? And 1 month before 25 May is 24 April? On the other hand, surely 1 month before 1 May is 1 April, and 1 month before 2 May is 2 April. So 1 month before 25 May is 25 April?
If "guess_flag" is zero, then add_tm() shall fail if the request is not well defined ("one month after March 31").
What exactly do you mean by ``well defined''? What should this function do with 31 March plus -1 months and -3 days? For which values of M and D is ``31 March plus M months and D days'' defined, and what is the result in those cases?
If guess_flag is non-zero, then add_tm() shall force some plausible interpretation (*) on the result (e.g., adjusting a tm_mon where tm_mday is in {29,30,31} and the target month won't hold that value results in tm_mday being forced to the maximum value for the target month).
How does that adjustment interact with the added days? What is 31 March plus 1 month plus 1 day? What is 31 March plus 1 month plus -1 day?
Using C89 the best one could do is to use mktime() to do the normalization, but the mktime() interface loses information which can be helpful to the implementation in trying to disambiguate what the user was trying to request.
I still haven't seen a complete, coherent explanation of what _you_ are trying to request. At least the mktime() behavior is comprehensible.
My main concern is: this is an often-desired function,
_When_ is it desired? Point out some programs that are currently using mktime() and obtaining unsatisfactory results.
I suppose that what would be even better is if there were some reasonable way to directly expose the encoding of the calendar
libtai provides caldate_mjd() and caldate_frommjd().

---Dan
1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html
Date: Fri, 09 Oct 1998 13:00:37 +0100
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>

   we first have to "conceptually" shrink the borders of the leap
   second to a zero interval, and then the original leap second
   timestamp gets trapped at 24:00:00, from where you continue with
   normal arithmetic.

   An alternative (probably much more intuitive) model of thought for
   the same behaviour is the following: TIME_UTC does not count leap
   seconds as real time. It ignores the time inserted by a leap
   second. Think of it in the following image: If you drive with your
   car along the timeline, think of leap seconds as slippery intervals
   of the street of time where your wheels block and your odometer
   (xtime_diff) stops and therefore ignores the leap second.

These models are entertaining, and it's fun to play with the algebra, but I fear that you've been involved with the problem a bit too much, and you need to step back and take a deep breath. We need a model that's simple and easy to explain; the explanations above are neither.

So.... why not use the official model instead? Officially, since 1972, UTC-TAI has been a (negative) integer number of seconds, and when a leap second is inserted, UTC-TAI decreases by 1. What this means is pretty simple: on an implementation whose internal clock ticks TAI, the UTC clock ticks right along with the internal clock -- except during an inserted leap second, where the UTC clock is adjusted back by one second.

When converting a UTC clock to a printed representation, it's conventional to use :60 for the inserted leap second, but this is merely a notation to indicate that the UTC clock is repeating, much as the German-standard 'A' and 'B' suffixes are notations for repeated local time when the UTC offset decreases.

Viewed in this light, struct xtime's TIME_UTC is not really UTC, as TIME_UTC clocks have special values during an inserted leap second, whereas UTC clocks simply go back 1 second. TIME_UTC is therefore a compromise between UTC clocks (which are not monotonic) and POSIX clocks (which have no leap seconds). TIME_UTC therefore suffers the complexity of a solution that is neither fish nor fowl.

The struct xtime proposal would be simplified if it didn't use this complicated interface, and instead used either true UTC, or true POSIX. (Of course, both true UTC and true POSIX could be supported, by having two different clock types.) I am dubious about standardizing on a new halfway-between-UTC-and-POSIX clock type that has never been used in practice and which has some nontrivial conceptual problems.
Paul Eggert wrote:
[O]n an implementation whose internal clock ticks TAI, the UTC clock ticks right along with the internal clock -- except during an inserted leap second, where the UTC clock is adjusted back by one second.
What's official about that? I don't see how a UTC clock has been "adjusted backward". Rather, a UTC minute can sometimes contain 61 seconds, properly labeled from 0 to 60. Similarly, a UTC day sometimes contains 86401 seconds. This is not at all the same as adjusting a clock, where the clock was wrong before and is hopefully now correct.
When converting a UTC clock to a printed representation, it's conventional to use :60 for the inserted leap second, but this is merely a notation to indicate that the UTC clock is repeating,
Do you have some kind of authority for this claim?

--
John Cowan     http://www.ccil.org/~cowan     cowan@ccil.org
     You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
     You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
          Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)
Paul Eggert wrote on 1998-10-09 19:16 UTC:
These models are entertaining, and it's fun to play with the algebra, but I fear that you've been involved with the problem a bit too much, and you need to step back and take a deep breath. We need a model that's simple and easy to explain; the explanations above are neither.
I don't think that was a fair comment, because I am just as convinced that my model is simple, straightforward, easy to understand, and robust in practice. I did step back numerous times, and I have seriously considered your alternative model, and again and again concluded that it is problematic. Below you will find another simple application scenario, which I hope you will study seriously, and which I hope will help you to understand and acknowledge the problems that I see in your concept.
So.... why not use the official model instead? Officially, since 1972, UTC-TAI has been a (negative) integer number of seconds, and when a leap second is inserted, UTC-TAI decreases by 1. What this means is pretty simple: on an implementation whose internal clock ticks TAI, the UTC clock ticks right along with the internal clock -- except during an inserted leap second, where the UTC clock is adjusted back by one second.
When converting a UTC clock to a printed representation, it's conventional to use :60 for the inserted leap second, but this is merely a notation to indicate that the UTC clock is repeating, much as the German-standard 'A' and 'B' suffixes are notations for repeated local time when the UTC offset decreases.
Ok. So far nothing wrong in your argument. The big intellectual accident happens in the next sentence, and from then on the conclusions get dubious:
Viewed in this light, struct xtime's TIME_UTC is not really UTC, as TIME_UTC clocks have special values during an inserted leap second, whereas UTC clocks simply go back 1 second.
Sorry, but this is just obviously wrong: UTC clocks display a special overflow value (60 - 60.999...) during the leap second, *exactly* as struct xtime does (nsec values 1e9 through 2e9-1). There is absolutely no conceptual difference between "real UTC" and TIME_UTC, since there exists an obvious bijective deterministic mapping between both. Struct xtime is just a simple fully static encoding of the full YYYY-MM-DD HH:MM:SS display as found on any official UTC clock. "Static" means here "independent of dynamically changing leap second tables".
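[For illustration, the displayed seconds-of-minute can be recovered from this encoding in a couple of lines. This sketch is an editor's addition; it assumes the TIME_UTC epoch falls on a minute boundary, that sec excludes leap seconds as described, and that t is not before the epoch:]

   int displayed_second(struct xtime t)
   {
     int s = (int)(t.sec % 60);

     if (t.nsec >= 1000000000)  /* inside an inserted leap second */
       s += 1;                  /* displayed as :60 */
     return s;
   }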
TIME_UTC is therefore a compromise between UTC clocks (which are not monotonic) and POSIX clocks (which have no leap seconds).
No, TIME_UTC is by all means a fully correct UTC clock.
TIME_UTC therefore suffers the complexity of a solution that is neither fish nor fowl.
Again, I feel "neither fish nor fowl" is not a fair comment. There exists a simple and obvious eternally valid algorithm that converts in a bijective way between a YYYY-MM-DD HH:MM:SS display of an official UTC clock and a struct xtime value. On the other hand: there exists no eternally valid static algorithm for this conversion in your modified TAI, because the timestamp versus displayed time relationship changes each time you update a leap second table. This can be very problematic and confusing for the non-expert user (and even the expert), as I hope the following illustrates.

Practical example: Say you have an electronic commerce system which knows only the leap seconds up to the end of 1998. You enter into this electronic commerce system a command to reject all contracts that expire after 2000-01-01 00:00:00, because for instance at that time a new law comes into effect that is unacceptable for your business. This law is so unacceptable that even accepting a contract that expires 2000-01-01 00:00:01 is absolutely unacceptable for you, and your legal department would send you to prison if your system did that.

Now the following happens: Your implementation uses the current leap second table, which does not yet contain the mid-1999 leap second. It converts the 2000-01-01 00:00:00 cut-off date into an integer timestamp T based on the assumption that there will not be a further leap second until then, as the current table suggests. Months later you receive an updated leap second table and you install it in your system, not knowing what fatal side effects this will have for you. The new leap second table contains an additional leap second in mid-1999. This will cause the timestamp T suddenly to be interpreted as 2000-01-01 00:00:01 by your system, because UTC clocks "repeat" one second between now and then, as your system has just learned from the update. Your application software naturally contains no code to update these integer timestamps. Your integer timestamps just change their real-life meaning as leap second tables get updated, and nobody expected that, because they didn't read the fine print in the libtai manual that came in volume 13 of the system reference documentation.

Now someone sends you a contract (e.g., a multi-million stock market option) that expires on 2000-01-01 00:00:01, trying to take advantage of the new law that will be in force by then. Fatally, your system accepts this contract against what the specification said, because after converting 2000-01-01 00:00:01 to an integer and doing an integer comparison, the expiry date of this contract is now NOT any more after the cut-off date (which was originally entered as 2000-01-01 00:00:00). Result: You lose millions of dollars and go to prison. Or your lawyers kill you right away. End of example (and I could come up with numerous more).

Doesn't this convince you at least a bit that there are numerous applications where we care much less about the real number of seconds until some date than we care about what exactly the precise official UTC notation for this date is in YYYY-MM-DD HH:MM:SS notation? Your lawyers are not at all interested in the fact that this contract expires 41865474 seconds from now. However, they are definitely interested in whether it expires on 2000-01-01 00:00:01 or 1999-12-31 23:59:59, because this can make a few million dollars' difference on the options market in case, say, the deal would in the former case be subject to the new tax.
My TIME_UTC (and also POSIX) provides this reliable relationship between the easy to process integer struct value and the full broken-down time. Your modified TAI does not provide this reliable relationship! (Unless, of course, you put your leap second table under a revision control system and attach to every timestamp a version code that identifies according to which revision of the leap second table this timestamp should be interpreted when converting back to an external UTC representation. But that would of course be a ridiculous fix.)

I can think of numerous examples where the faithful encoding of UTC is what really matters. Just think about legal investigations in the context of some fraud, where timestamps in digitally signed documents were encoded using your modified TAI encoding. The timestamp would be meaningless unless you also signed with each document the entire leap second table that is necessary to interpret it, right?

Relevant in the real world is UTC, not TAI. The values displayed on UTC clocks usually matter, and not the number of seconds between two events. Therefore, it is absolutely paramount that there is a secure way of converting without any ambiguity between an xtime value and what a UTC clock displays. You open a whole bag of risks by making the meaning of every timestamp on the UTC scale dependent on the interpretation of a leap second table.
The struct xtime proposal would be simplified if it didn't use this complicated interface, and instead used either true UTC, or true POSIX. (Of course, both true UTC and true POSIX could be supported, by having two different clock types.)
I certainly do true UTC. There is no such thing as true POSIX, because POSIX does not allow accurate representation of time. What some POSIX implementations therefore do at the moment, as an ugly hack, is to use a 1000 ppm frequency-shift ramp to compensate for this inconsistency in the specification, while others (probably the majority) just repeat 23:59:59. I hope you do not consider either of these an acceptable long-term solution.
I am dubious about standardizing on a new halfway-between-UTC-and-POSIX clock type that has never been used in practice and which has some nontrivial conceptual problems.
I am sorry, but I couldn't disagree more. My proposal is a clear and faithful one-to-one implementation of real UTC, and I consider it to be free of fundamental conceptual problems, as I hope to have pointed out in great detail in past postings. It is exactly as complex as needed, not one bit too simple and not one bit too complicated. I consider it to be ready for standardization. Only the details of the text need some polishing, not the basic design principles.

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org,  home page: <http://www.cl.cam.ac.uk/~mgk25/>
Markus observes that, thanks to leap seconds, we don't know how many TAI seconds there will be before (say) 2100-01-01 00:00:00 UTC. If you ask libtai now, its guess will be a minute away from the truth.

What Markus doesn't realize is that POSIX times suffer the same problem. Thanks to future changes in the calendar, we don't know how many non-leap seconds there will be before (say) 9999-01-01 00:00:00 UTC. If you ask Markus's vaporware library, its guess will be wrong by one or more _days_! Markus's error here is an order of magnitude worse than the leap-second error.

Shall I review his examples of ``risks'' in this light?
My TIME_UTC (and also POSIX) provides this reliable relationship between the easy to process integer struct value and the full broken-down time.
False. Markus's ``reliable relationship'' will fail as soon as the Gregorian calendar expires. The POSIX rules break down even sooner.

---Dan
1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html
Date: Fri, 09 Oct 1998 23:13:35 +0100
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>

Paul Eggert wrote on 1998-10-09 19:16 UTC:
struct xtime's TIME_UTC is not really UTC, as TIME_UTC clocks have special values during an inserted leap second, whereas UTC clocks simply go back 1 second.
   ... There is absolutely no conceptual difference between "real UTC"
   and TIME_UTC, since there exists an obvious bijective deterministic
   mapping between both.

No, because "real UTC" is ambiguous during inserted leap seconds.

By "real UTC" I mean the sort of UTC described by Fig. 2 of Terry Quinn, The BIPM and the Accurate Measure of Time, Proc IEEE 79, 7 (1991-07), 894-905. This is the UTC that is denoted in the expression `UTC-TAI', which is commonly used to denote how many leap seconds have been inserted. Since 1972, the graph of UTC-TAI versus time has been a staircase function that looks like this:

   --+
     +----+
          +---------+
                    +---....

where each decrease corresponds to an inserted leap second. This graph makes sense only if UTC clocks are adjusted backwards by one second during a leap, as I described above. That is, the `:60' notation is used to disambiguate leap-second timestamps, but it's not part of "real UTC".

   Struct xtime is just a simple fully static encoding of the full
   YYYY-MM-DD HH:MM:SS display as found on any official UTC clock.

Let's call this display "display UTC", as opposed to the "real UTC" mentioned above. Then clearly the TIME_UTC struct xtime encodes "display UTC", not "real UTC".

Another way to say it is that struct xtime is a broken-down representation of UTC, one that contains a special encoding to disambiguate UTC timestamps within leap seconds. struct xtime is not as broken-down as struct tm is, but it _is_ broken-down to some extent. And its conceptual problems stem from this fact.

I've heard that, in some C implementations, time_t encodes a broken-down value, so that localtime simply picks the bits out of the encoded representation and stuffs them into the struct tm. This conforms to C89, but it means time_t is quite inconvenient to deal with directly. struct xtime has many of the problems that these C implementations have (albeit in a more limited way). I'd prefer an internal representation that didn't have these problems, if one is possible.

   There exists no eternally valid static algorithm for this
   conversion in your modified TAI, because the timestamp versus
   displayed time relationship changes each time you update a leap
   second table.

True, but any application requiring such an algorithm should not use TIME_TAI; it should use TIME_UTC.

   You enter into this electronic commerce system a command to reject
   all contracts that expire after 2000-01-01 00:00:00,

Such an application should not use TIME_TAI for future timestamps, obviously. But, for the purpose of this discussion, let's assume that the programmer made a bad mistake and wrote code that attempts to convert every timestamp to TIME_TAI internally, even future timestamps.

   It converts the 2000-01-01 00:00:00 cut-off date into an integer
   timestamp T based on the assumption that there will not be a
   further leap second until then, as the current table suggests.

No, the implementation will report an error when the application attempts to make the conversion, since leap seconds aren't known that far in advance. That will preclude the later problems mentioned in your scenario. This issue is independent of whether the API uses struct xtime or an integer clock.

   there are numerous applications where we care much less about the
   real number of seconds until some date than we care about what
   exactly the precise official UTC notation for this date is

Yes, and such apps should use TIME_UTC.
   My TIME_UTC (and also POSIX) provides this reliable relationship
   between the easy to process integer struct value and the full
   broken-down time.

My proposal won't change this. (POSIX.1 is a special case of my proposal.)

   Your modified TAI does not provide this reliable relationship!

Nor does the unmodified TIME_TAI. (Sorry, I don't see why these points are relevant.)

   Just think about legal investigations in the context of some fraud,
   where timestamps in digitally signed documents were encoded using
   your modified TAI encoding.

I wouldn't recommend that people use such timestamps for that purpose, any more than I would recommend that they use TIME_MONOTONIC. The modified TIME_TAI clocks have an implementation-defined epoch, among other things. The code wouldn't even get off the ground; it'd be a silly thing to try.

   I consider [the struct xtime proposal] to be ready for
   standardization.

I can't agree. It's not implemented yet. We don't have any real-world experience with it. Even assuming we're willing to live with its conceptual problems, a lot of work is needed before it's ready for standardization. And I'd prefer a spec that had fewer conceptual problems, before implementation begins.
Paul Eggert wrote on 1998-10-10 00:24 UTC:
By "real UTC" I mean the sort of UTC described by Fig. 2 of Terry Quinn, The BIPM and the Accurate Measure of Time, Proc IEEE 79, 7 (1991-07), 894-905. This is the UTC that is denoted in the expression `UTC-TAI', which is commonly used to denote how many leap seconds have been inserted.
Well, what I consider to be the "real" UTC is the (displayed) UTC defined in the IERS Bulletins C, and not some TAI display plus a UTC-TAI offset as you seem to suggest. Obviously, the UTC-TAI difference (and not UTC itself) needs some extra definition to be fully defined during an inserted leap second. UTC-TAI over TAI diagrams will naturally have a one-second gap during inserted leap seconds (because the UTC clock goes into a range where the TAI clock never was), and it is a matter of convention how you fill this gap if you depend on a continuous fail-safe representation of UTC-TAI.

Observe what xtime_diff does in this case: it fills this gap in the UTC-TAI graph smoothly by linear interpolation between 23:59:60 and 00:00:00, which is IMHO the most useful semantics in this case. (In the case of deleted leap seconds, you never have any such ambiguity anyway.)
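[Concretely, with the xtime_diff() sample from the start of this thread, the arithmetic around the gap works out as follows - a hypothetical sketch, where midnight is assumed to hold the 00:00:00 that ends an inserted leap second:]

   #include <assert.h>

   void gap_demo(struct xtime midnight)
   {
     struct xtime just_before = midnight, in_leap = midnight;

     just_before.sec -= 1;       /* 23:59:59.0 */
     in_leap.sec -= 1;
     in_leap.nsec = 1500000000;  /* 23:59:60.5 */

     assert(xtime_diff(in_leap, just_before) == 1.0);
     /* any instant inside the leap second already sits at the
        24:00:00 boundary ... */
     assert(xtime_diff(midnight, in_leap) == 0.0);
     /* ... so no non-leap time remains before midnight */
   }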
This graph makes sense only if UTC clocks are adjusted backwards by one second during a leap, as I described above. That is, the `:60' notation is used to disambiguate leap-second timestamps, but it's not part of "real UTC".
No, the ":60" is the real UTC. There is a full inserted second (an interval of time with its own unique name) that never appears on a TAI clock. The idea of "adjusting UTC backwards" is something that you have made up. I haven't found this notion anywhere in the literature, and I also consider it not to be very helpful. Adjusting clocks backwards is just a figure of speech to explain DST switches to the general population without introducing proper notation. UTC is much better and more strictly defined than most local times. You have set up for yourself a fairly complicated model of thinking about UTC. Just stay strictly with the UTC defined by the letter of ITU-R TF.460-4 and the IERS Bulletins C, and you stay out of such conceptual troubles.
Struct xtime is just a simple fully static encoding of the full YYYY-MM-DD HH:MM:SS display as found on any official UTC clock.
Let's call this display "display UTC", as opposed to the "real UTC" mentioned above. Then clearly the TIME_UTC struct xtime encodes "display UTC", not "real UTC".
OK (although I don't like your terminology to call UTC-TAI values "real" UTC).
Another way to say it is that struct xtime is a broken-down representation of UTC, one that contains a special encoding to disambiguate UTC timestamps within leap seconds. struct xtime is not as broken-down as struct tm is, but it _is_ broken-down to some extent.
Correct. It is broken down once, in order to allow a :60 overflow just as on a displayed UTC and to provide the necessary range and resolution.
And its conceptual problems stem from this fact.
Which are? You have referred to "conceptual problems" around half a dozen times in the last few messages without ever giving a concrete example of what they are, or why your modified TAI does not have them but the combination of TIME_MONOTONIC and TIME_UTC in the context of my API does. I still don't understand exactly where you see these problems. A real-life application example that is elegant in your API but very clumsy in mine would be helpful here.
I've heard that, in some C implementations, time_t encodes a broken-down value, so that localtime simply picks the bits out of the encoded representation and stuffs them into the struct tm.
I've never seen any of these, but it probably is possible to do a correct implementation this way. Say time_t is a 64-bit word (16 nibbles) with a YYYYMMDDHHMMSS.SS BCD encoding (10000 years range, 10 ms resolution, leap seconds possible; certainly better than many other encodings that I have seen).
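[A toy decoder shows how localtime could indeed just pick the bits apart of such a word - both the encoding and this code are purely hypothetical:]

   #include <stdio.h>

   /* extract n BCD digits whose most significant digit sits in
      nibble hi (nibble 15 = most significant of the 64-bit word) */
   static int bcd_field(unsigned long long t, int hi, int n)
   {
     int v = 0;

     while (n--) {
       v = v * 10 + (int)((t >> (4 * hi)) & 0xF);
       hi--;
     }
     return v;
   }

   int main(void)
   {
     unsigned long long t = 0x1972063023596050ULL;

     printf("%04d-%02d-%02d %02d:%02d:%02d.%02d\n",
            bcd_field(t, 15, 4),   /* 1972 */
            bcd_field(t, 11, 2),   /* 06 */
            bcd_field(t,  9, 2),   /* 30 */
            bcd_field(t,  7, 2),   /* 23 */
            bcd_field(t,  5, 2),   /* 59 */
            bcd_field(t,  3, 2),   /* 60: an encoded leap second */
            bcd_field(t,  1, 2));  /* .50 */
     return 0;
   }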
This conforms to C89, but it means time_t is quite inconvenient to deal with directly.
Agreed, it is inconvenient, in that not only will difftime have to transform everything first to a seconds scale, but localtime will also have to add the UTC offset and then adjust the broken-down representation for minute, hour, day, month, and year overflows. struct xtime has none of these problems, because it has no minute, hour, day, month, and year overflows. And my equivalent of difftime is only three lines long, as I posted before.
struct xtime has many of the problems that these C implementations have (albeit in a more limited way). I'd prefer an internal representation that didn't have these problems, if one is possible.
Which problems? It would be much easier for me to answer if you actually named them. It obviously can't be the minute, hour, day, month, and year overflows, because struct xtime has none of these. There is only the nsec overflow, and I wouldn't call the single if statement necessary to handle it a "conceptual problem" (especially since it is there for good reasons: to allow leap seconds to be encoded, to provide a comfortable range and resolution, and to allow for compatibility with POSIX's struct timespec).
There exists no eternally valid static algorithm for this conversion in your modified TAI, because the timestamp versus displayed time relationship changes each time you update a leap second table.
True, but any application requiring such an algorithm should not use TIME_TAI; it should use TIME_UTC.
OK, so my critique here was based on the unfortunate fact that you have not yet published a full proposal, and I have not yet had a chance to understand completely what precisely is the set of clock and timestamp types that you plan to offer. My example was based on the assumption that you are in the TAI-only fanatics camp with Mr. Libtai. It would be helpful for the discussion if you published an actual specification in a language close to something that could be copied into the standard.

Based on what you have said so far, I am confident that I can efficiently implement your additional clocks with very few lines of code on top of xtime_get(..., TIME_UTC) and tz_jump() (note that both functions give you full access to "displayed UTC" plus a window of nearby leap seconds, even if CLOCK_TAI is not available, as you required, which is exactly what you get from NTP and DCF77). But to prove this, I obviously have to see your precise specification first.

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
Date: Sat, 10 Oct 1998 11:36:11 +0100
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>

   I don't like your terminology to call UTC-TAI values "real" UTC.

Fair enough. How about if we call the UTC-TAI based clock "internal UTC", as opposed to "display UTC"?

   what I consider to be the "real" UTC is the (displayed) UTC defined in the IERS Bulletins C, and not some TAI display plus a UTC-TAI offset as you seem to suggest.

But those bulletins define internal UTC as well as displayed UTC. That is, they specify both UTC-TAI and the display format. UTC-TAI is a first-class citizen in these standards.

   the UTC-TAI difference (and not UTC itself) needs some extra definition to be fully defined during an inserted leap second. UTC-TAI over TAI diagrams will naturally have a one second gap during inserted leap seconds (because the UTC clock goes into a range where the TAI clock never was)

The time standard does not supply, nor does it need, any ``extra definition''. E.g. the latest IERS Bulletin C at http://hpiers.obspm.fr/iers/bul/bulc/bulletinc.dat does not specify a gap in internal UTC. Governing bodies like the BIPM publish diagrams graphing UTC-TAI over TAI that don't have any gaps. And the IERS publishes tables that make it quite clear that internal UTC is adjusted by subtracting inserted leap seconds from it. For example, see:

http://hpiers.obspm.fr/webiers/general/earthor/utc/table2.html

Internal UTC is meant for clocks; display UTC is meant for humans. That is, internal UTC is suited for implementations of clocks that are based on counting seconds one-by-one; conversely, display UTC is suited for displays meant for humans.

   Adjusting clocks backwards is just a figure of speech to explain DST switches to the general population without introducing proper notation.

Sorry, you've lost me. Adjusting clocks backwards is a figure of speech? In a couple of weeks, I'll be wandering over my house, office, and car manually adjusting dozens of dumb localtime clocks backward. On that day it certainly won't feel like a figure of speech. :-) Nor is it a figure of speech with internal UTC; the ideal internal UTC clock is adjusted when leap seconds occur.
Then clearly the TIME_UTC struct xtime encodes "display UTC", not "real UTC".
You referred to "conceptual problems" around half a dozen times in the last few messages without ever giving a concrete example of what these are We've discussed several such problems, including bogus leap seconds (i.e. struct xtime values with nsec>=1000000000 that do not correspond to actual leap seconds), and glitches with time arithmetic involving true leap seconds. I'm not saying that you aren't aware of these problems and aren't working on solutions to them. What I'm saying is that they _are_ problems, and they require solutions, and that the solutions complicate the model. my equivalent of difftime is only three lines long, as I posted before. But (as I mentioned earlier) that implementation has a double-rounding bug. Also, the interface requires information loss if the times are sufficiently far apart, at least on the vast majority of hosts where double can't represent 96-bit integers exactly. There's no easy, portable fix for either problem.
struct xtime has many of the problems that these C implementations have (albeit in a more limited way)....
   There is only the nsec overflow, and I wouldn't call the single if statement necessary to handle it a "conceptual problem" (especially since it is there for good reasons...

I think it's more than just a single if statement, and that people will need if-statements sprinkled throughout their code.

Let me put the problem a different way. The time standard provides two forms of UTC: display UTC and internal UTC, each with its own advantages and disadvantages. struct xtime attempts to be a compromise between the two forms, in order to have some of the advantages of both. Unfortunately, this means struct xtime also has some of the _disadvantages_ of both. Some of these disadvantages are mentioned above. Furthermore, struct xtime is not isomorphic to either display UTC or internal UTC, so it has some problems that are uniquely its own (e.g. values with nsec>=2000000000).

   It would be helpful for the discussion if you published an actual specification in a language close to something that could be copied into the standard.

Yes, and I'm working on it. Sorry, it isn't done yet; when it is, we can do some comparisons (it'll be your turn to throw bricks :-). By the way, I've found your discussions (along with comments by the others) to be quite helpful in ironing out problems in the draft.
Paul Eggert wrote on 1998-10-11 00:25 UTC:
Adjusting clocks backwards is just a figure of speech to explain DST switches to the general population without introducing proper notation.
Sorry, you've lost me. Adjusting clocks backwards is a figure of speech? In a couple of weeks, I'll be wandering over my house, office, and car manually adjusting dozens of dumb localtime clocks backward. On that day it certainly won't feel like a figure of speech. :-)
I think it is time mankind advanced the state of the art in household clocks a bit: instead of wandering around to turn them back, you should be able to program them such that tonight they will go through the hours 1, 2A, 2B, 3, etc. Then DST switching becomes an exercise in inserting and deleting leap hours into the time scale, removing the ambiguity of the repeated hour. Most clocks have integrated circuits, therefore this functionality would cost practically nothing to add. If you know a clock manufacturer desperately looking for new features to add, feel free to forward this patent-free idea. Unfortunately, the only self-adjusting clocks that I know of are computers and radio clocks.

Actually, thinking about it, there is a neat way of encoding hour 2A in a distinguishable way even on "analog" clock displays: just stop the motion of the hour hand for 60 min during hour 2A, and it will be immediately recognizable whether you are in hour 2A or 2B (except perhaps for the first few minutes of the hour, where the hour hand hasn't yet moved much during hour 2B).

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
Markus wrote on Sunday, October 11, 1998 5:45 AM:
Most clocks have integrated circuits, therefore this functionality would cost practically nothing to add.
Sorry, but just because it is software does not mean it costs "practically nothing" to add. BTW, there are a few of these on the (U.S.) market that use radio frequencies to automatically adjust to the U.S. Atomic Clock. As I recall, the wrist watch was about $900, the analog wall clock about $200 and the digital clock/radio about $100 (all U.S. dollars). Rich Shockney mailto:rshockney@ibm.net
"Richard L. Shockney" wrote on 1998-10-11 15:38 UTC:
BTW, there are a few of these on the (U.S.) market that use radio frequencies to automatically adjust to the U.S. Atomic Clock. As I recall, the wrist watch was about $900, the analog wall clock about $200 and the digital clock/radio about $100 (all U.S. dollars).
These clocks are probably based on WWV, which is a short-wave transmitter, for which receivers are a bit more expensive than for long-wave transmissions. The German DCF77 is a very powerful (50 kW) long-wave transmitter. Long-wave radiation has the advantage that it penetrates buildings quite well, so you do not need external antennas and your clocks work in every room. I used to have a DCF77 wrist watch for around $50, and I have a DCF77 receiver, which cost around $20, connected to the serial port of my Linux box. Since suitable long-wave receivers can be implemented in a very simple chip design, with practically the only external component being the receiver coil/capacitor combination, you find DCF77 more and more often in lowest-cost products (< $50) in Germany.

NIST also operates WWVB, a 60 kHz long-wave service in the US, but it has a much weaker signal than DCF77 (50 kW). WWVB used to have 10 kW for a long time and was only very recently upgraded to 23 kW, with plans for further upgrades to 35-40 kW. The British Telecom transmitter MSF in Rugby has 27 kW at 60 kHz.

http://www.boulder.nist.gov/timefreq/
http://www.ptb.de/english/org/4/43/433/disse.htm
http://www.npl.co.uk/npl/ctm/msf.html

Once these WWVB upgrades have been completed, I would expect that radio clocks in the US mass market might become as cheap as those in Central Europe.

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
Markus Kuhn said:
Most clocks have integrated circuits,
??? Most clocks in this house have a battery and a small fixed-speed motor. -- Clive D.W. Feather | Work: <clive@linx.org> | Tel: +44 1733 705000 Regulation Officer | or: <clive@demon.net> | or: +44 973 377646 London Internet Exchange | Home: <clive@davros.org> | Fax: +44 1733 353929 (on secondment from Demon Internet)
Paul Eggert wrote on 1998-10-11 00:25 UTC:
my equivalent of difftime is only three lines long, as I posted before.
But (as I mentioned earlier) that implementation has a double-rounding bug.
I don't think so. The code in question was:

   (double) ((t1.sec - t2.sec) + (t1.nsec - t2.nsec) / 1.0e9)

where t?.sec is at least a 64-bit int and t?.nsec is at least a 32-bit int. Can you really construct input values that will lead to your claimed double-rounding error on, say, a Pentium under gcc/Linux (standard IEEE double arithmetic), or is this "bug" just a suspicion based on the common (but inappropriate) belief that floating-point arithmetic is incomprehensible magic stuff that always adds unpredictable noise in the last significant bits of the mantissa?

Note that int -> double rounding only takes place if the mantissa is shorter than the integer value. Otherwise the conversion is just a lossless and fully reversible reformatting of the number. I assume that what you are talking about is that (t1.nsec - t2.nsec) is first converted to double, and then the result of the double division is rounded. However, have you considered that the 32-bit (t1.nsec - t2.nsec) result fits completely into the >32-bit double mantissa, so NO rounding can take place here? The division is guaranteed by IEEE to yield the closest value; (t1.sec - t2.sec) could be larger than the mantissa, which will just move insignificant bits of the division result out of the mantissa but not add additional rounding uncertainty that could move us away from the closest possible value.

I have not yet found the time to formally prove it, but it looks to me very much as if there is no double rounding going on here and that the presented C code is the best you can do in an implementation. I would not know what I could do better in assembler.
Also, the interface requires information loss if the times are sufficiently far apart, at least on the vast majority of hosts where double can't represent 96-bit integers exactly. There's no easy, portable fix for either problem.
There is a straightforward way to represent the difference as a 96-bit struct xtime value. The code should be completely obvious, so I did not want to waste time posting it as well, but I have mentioned several times that, especially in languages with operator overloading and strong typing, I would of course also expect xtime-only versions of the arithmetic functions to be present in an API.

I consider double arithmetic useful here because, although I consider it unacceptable that timestamps become less precise the farther we get from the epoch, I assume that most applications are perfectly happy with floating-point values in their own calculations, where the precision decreases logarithmically with the size of the difference but is guaranteed to be independent of the age of the epoch.

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
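A minimal sketch of the "completely obvious" 96-bit difference mentioned above, assuming the struct xtime layout from the proposal (the name xtime_sub is illustrative, and the leap-second clamping performed by xtime_diff is omitted here):

------------------------------------------------------------------------
/* Sketch: exact 96-bit difference, no floating point involved.
 * Leap-second clamping is omitted; the function name is illustrative. */
struct xtime xtime_sub(struct xtime t1, struct xtime t2)
{
    struct xtime d;
    d.sec  = t1.sec - t2.sec;
    d.nsec = t1.nsec - t2.nsec;
    if (d.nsec < 0) {           /* borrow one second from sec */
        d.sec--;
        d.nsec += 1000000000;
    }
    return d;
}
------------------------------------------------------------------------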
Date: Sun, 11 Oct 1998 12:22:03 +0100
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>

   (double) ((t1.sec - t2.sec) + (t1.nsec - t2.nsec) / 1.0e9)

   where t?.sec is at least a 64-bit int and t?.nsec is at least a 32-bit int. Can you really construct input values that will lead to your claimed double-rounding error

Sure. Let's use the Sparc IEEE implementation, which straightforwardly maps `double' to IEEE 64-bit double, and let's assume round-to-even, which is the IEEE default. Then here are example input values:

   t1.sec = 9007199254740993 (i.e. 2**53 + 1)
   t1.nsec = 1000000000 (i.e. 10**9)
   t2.sec = t2.nsec = 0

The exact answer is 9007199254740994 (i.e. 2**53 + 2), a number that is exactly representable as an IEEE double. But the expression above yields 9007199254740992 (i.e. 2**53) -- it is off by 2.

   There is a straightforward way to represent the difference as a 96-bit struct xtime value. The code should be completely obvious,

The code _should_ be obvious, but it's very likely that people will get it wrong in practice. The bugs in your example code are minor in comparison to some of the stinkers I've seen in real life. Let's use a less error-prone approach.

   I consider it unacceptable that timestamps become less precise the farther we get from the epoch, I assume that most applications are perfectly happy with floating-point values in their own calculations

Sorry, I don't follow you here. If it's unacceptable for timestamps to become less precise, why is it acceptable for time differences to become less precise? After all, a timestamp is merely a time difference from an epoch. And people use time differences to compute timestamps all the time, so errors in time differences will cause errors in timestamps.
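Anyone who wants to reproduce the example above can run this minimal test (a sketch assuming IEEE 754 double, the default round-to-even mode, and a 64-bit long long):

------------------------------------------------------------------------
/* Minimal test of the double-rounding example above. */
#include <stdio.h>

int main(void)
{
    long long sec = 9007199254740993LL;   /* t1.sec  = 2**53 + 1 */
    long nsec = 1000000000L;              /* t1.nsec = 10**9     */
    double d = (sec - 0LL) + (nsec - 0L) / 1.0e9;
    /* sec -> double rounds 2**53+1 down to 2**53 (ties to even);
     * adding 1.0 then rounds 2**53+1 down to 2**53 once more.
     * Exact answer: 9007199254740994; printed: 9007199254740992. */
    printf("%.0f\n", d);
    return 0;
}
------------------------------------------------------------------------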
Paul Eggert wrote on 1998-10-12 06:21 UTC:
Date: Sun, 11 Oct 1998 12:22:03 +0100 From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
(double) ((t1.sec - t2.sec) + (t1.nsec - t2.nsec) / 1.0e9)
where t?.sec is at least 64-bit int and t?.nsec is at least 32-bit int.
Can you really construct input values that will lead to your claimed double rounding error
Sure. Let's use the Sparc IEEE implementation, which straightforwardly maps `double' to IEEE 64-bit double, and let's assume round-to-even, which is the IEEE default. Then here are example input values:
t1.sec = 9007199254740993 (i.e. 2**53 + 1)
t1.nsec = 1000000000 (i.e. 10**9)
t2.sec = t2.nsec = 0
The exact answer is 9007199254740994 (i.e. 2**53 + 2), a number that is exactly representable as an IEEE double. But the expression above yields 9007199254740992 (i.e. 2**53) -- it is off by 2.
Come on, that is a value 0.28 billion years from now. Do you have a non-pathological case where you do not fill up the double mantissa completely? We define non-pathological as follows: Clive Feather was just here in the Lab, and one of the things he said certainly needs to be changed is that the struct xtime requirement

   int_fast64_t sec;
   int_fast32_t nsec;

has to be relaxed to require sec only to represent roughly all values from the start of the year -9999 to the end of the year +9999 ("proleptic Gregorian calendar in astronomical integer year notation" for the nitpickers ;-). This includes all ISO 8601 representable time stamps, the Julian Date origin, the limits of the accuracy of the Gregorian calendar, and the entire recorded human history ("recorded" meaning "written documents have been found"), and should therefore be OK for >>99% of all applications. This range requires a bit less than 40 bits for sec, and then we do not have to care about rounding errors in IEEE double when sec exceeds the 40-bit range.

A few other things Clive mentioned: the language of my draft is not acceptable and needs a major rewrite to become an ISO proposal. It has to be prefixed with a convincing and very easy to read tutorial with some usage examples (which will later become part of the rationale) that introduces readers to concepts and problem areas such as leap seconds and time zones and clearly motivates every proposed feature (whereas my text at the moment is written more for people with long experience in the field and just gives pointers into the literature).

The *real* problems are more related to how we present the new proposal to the committee and actually get it through, and not so much to all the more bizarre technical things we have been discussing here over the past few days. In other words, we should be much more concerned about the PR aspects of the proposal than about very technical details and esoteric clock models; otherwise this will stay an academic exercise.

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
Date: Mon, 12 Oct 1998 13:04:06 +0100 From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
The exact answer is 9007199254740994 (i.e. 2**53 + 2), a number that is exactly representable as an IEEE double. But the expression above yields 9007199254740992 (i.e. 2**53) -- it is off by 2.
   Come on, that is a value 0.28 billion years from now.

But a couple of messages ago you were confident of being able to prove that the expression works for all timestamps.

   Do you have a non-pathological case where you do not fill up the double mantissa completely? We define non-pathological as follows:

Can you prove that there isn't one? Or will your definition of ``non-pathological'' (which was missing from the copy of the message that I received) define away the problem?

The implementation must work correctly for _all_ timestamps, because such timestamps will occur in practice if they are representable. (Among other things, the maximum timestamp will occur often in real code.)

In practice, simple rounding error (which is large and cannot be avoided) will be more important than this more esoteric multiple-rounding error (which is smaller and can be avoided, e.g. if the library internally uses infinite precision arithmetic). It is the simple rounding error that I am mainly objecting to; the multiple-rounding error is icing on the cake.

   int_fast64_t sec;
   int_fast32_t nsec;

   has to be relaxed to require sec only to represent roughly all values from the start of the year -9999 to the end of the year +9999

My draft spec won't place any restriction on the representable years, as that is a quality-of-implementation issue. Even these relaxed requirements are overkill for the vast majority of applications. A C-based CPU running in an automobile engine shouldn't be required to handle timestamps all the way back to the Clovis people.

   This range requires a bit less than 40 bits for sec, and then we do not have to care about rounding errors in IEEE double when sec exceeds the 40-bit range.

This statement confuses the minimum requirement with the actual implementation. If the actual implementation supports 64-bit sec (which is likely under the struct xtime proposal), then the larger timestamps are possible, and any credible implementation must handle them correctly.

   The *real* problems are more related to how we present the new proposal to the committee and actually get it through,

Yes, the politics must be handled carefully. But the technical side must also be done carefully; otherwise, what's the point of doing anything?
In fact all this bludgeoning of each other with UTC timestamps and TAI timestamps is quite irrelevant to any matter where legal future times are wanted. Future times that are of legal significance *must* be maintained in fully broken-down format, because they are invariably specified according to the local time of a specified place. There is no telling how many seconds will elapse between Now and Then, because the timezone rules may change between Now and Then, by as much as 24 hours! -- John Cowan cowan@ccil.org e'osai ko sarji la lojban.
Markus Kuhn writes:
(Wait, so must be POSIX and BSD then, on which my proposal is *very* closely modeled after all ...)
Don't try to blame BSD for your mistakes. To record a BSD timestamp I can simply call gettimeofday(). The code will never fail. The difference between two timestamps is always an excellent approximation to the real time difference---as long as the machine is _not_ running xntpd. Your API does not have a sane replacement for gettimeofday().
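That is, the entire recording step is just the familiar BSD/POSIX call (a sketch):

------------------------------------------------------------------------
/* The BSD timestamp recording referred to above: gettimeofday()
 * fills in seconds and microseconds since the epoch and, given a
 * valid pointer, does not fail in practice. */
#include <sys/time.h>

void record(struct timeval *tv)
{
    gettimeofday(tv, 0);
}
------------------------------------------------------------------------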
a lot of algebraic properites are nicely preserved by xtime_add and xtime_diff.
Don't be silly. For your notion of ``subtraction'' it isn't true that (a-b)+b=a, for example, or that a-b<a when b>0.
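One concrete instance, using the posted sample implementations (a sketch; it assumes xtime_add, xtime_diff, xtime_cmp, and struct xtime are in scope, and the sec values assume the POSIX epoch and the leap second inserted 1997-06-30 23:59:60 UTC):

------------------------------------------------------------------------
/* Sketch: (a-b)+b != a across an inserted leap second. */
#include <stdio.h>

int main(void)
{
    struct xtime a = { 867715199, 1500000000 };  /* 1997-06-30 23:59:60.5 */
    struct xtime b = { 867715199, 0 };           /* 1997-06-30 23:59:59.0 */
    struct xtime c = xtime_add(b, xtime_diff(a, b));  /* diff clamps to 1.0 */
    printf("%d\n", xtime_cmp(c, a));  /* prints 1: (a-b)+b lands after a */
    return 0;
}
------------------------------------------------------------------------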
   With xtime, the programmer simply says what she means:

      #define DAY 86400.0
      t = xtime_add(t, 7*DAY);

   In Ada (a language with strong typing and full operator overloading), [ blah, blah, blah ]
Brilliant. Now show us your definitions of MONTH and YEAR.
I also do not have to define an equivalent of mktime() with overflow rules
People also do not have to use your library. ---Dan 1000 recipients, 28.8 modem, 10 seconds. http://pobox.com/~djb/qmail/mini.html
Date: Tue, 06 Oct 1998 16:43:08 +0100
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>

   I suggest to add the following three functions.

xtime_diff and xtime_add do not always report the correct answer; they suffer from rounding errors. This is inherent to the interface, at least on the vast majority of hosts (e.g. hosts where `double' is IEEE double-precision). Applications that want to work reliably on all timestamps will therefore have to avoid these primitives.

   double xtime_diff(struct xtime t1, struct xtime t2) ...
     return (t1.sec - t2.sec) + (t1.nsec - t2.nsec) / 1.0e9;

This implementation suffers from double-rounding, so the answer is not always the closest to the true result. An implementation should be free to supply a more precise answer than the one implied by this expression.

   struct xtime xtime_add(struct xtime t, double d) ...
     double ds;...
     t.nsec += floor(modf(d, &ds) * 1.0e9 + 0.5);
     t.sec += ds;

First, this also suffers from multiple-rounding problems. Second, why is that `+ 0.5' there? strfxtime is supposed to truncate to minus infinity, so for consistency shouldn't conversion to floating point also truncate to minus infinity? Third, if we really want rounding instead of truncation, the implementation should be free to round to even (instead of rounding to minus infinity), as round-to-even is the IEEE floating-point default and typically gives better answers in practice.

   * Assertions:
   *
   * fabs(xtime_diff(xtime_add(t, d), t) - d) <= 0.5e-9
   * for all t, d with fabs(t.sec + d) < 2**62 and t.nsec in the valid range

I think this assertion is false in the presence of rounding (or truncation) errors, but I haven't checked this. There may also be problems with floating-point overflow on weird implementations allowed by the C standard; I haven't checked this either (this is not of practical importance, but the C standards nerds will care).

   * xtime_cmp(xtime_add(t, d), t)
   * is in the set {-1, 0, 1} and has the same sign as d or is 0 iff d==0.0

This assertion is false e.g. if d == 1e-10.

So many problems in what should be simple code! I still say that struct xtime is way too error-prone compared to integer timestamps. If we used integer timestamps, we wouldn't need xtime_cmp, xtime_diff, and xtime_add at all, as <, -, and + would do the job more efficiently and more naturally.

   This allows you to use time intervals such as 86400.0 in order to mean "one day later" on the UTC clock, which is what most applications need (and what POSIX provided so far).

This reinforces my previous suggestion that we should modify the spec so that TIME_UTC clocks ignore leap seconds entirely. This will make the primitives operate more consistently, and will remove the need for tricks like `if (t1.nsec > 1000000000) t1.nsec = 1000000000;' in user code.

   Has anyone made some experiments or can contribute arguments to decide on whether passing 96-bit structs by pointer or by value is more desirable? Has anyone completed the UTC part of the xtime_make() specification code in the proposal or do I have to do this myself? Is anyone willing to merge the xtime text with the existing C9X draft and also include the change requests that Paul submitted to ANSI?

All these are on my list of things to do, though I'll be proposing the TIME_UTC variant mentioned above, so it may not match exactly what you want.
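The d == 1e-10 case above can be checked mechanically against the sample implementation (a sketch; it assumes the posted definitions of struct xtime, xtime_add, and xtime_cmp are in scope):

------------------------------------------------------------------------
/* Sketch: xtime_add rounds the 0.1 ns increment away, so cmp yields
 * 0 although d != 0.0 -- the "iff" in the second assertion fails. */
#include <stdio.h>

int main(void)
{
    struct xtime t = { 0, 0 };
    double d = 1e-10;                       /* one tenth of a nanosecond */
    int c = xtime_cmp(xtime_add(t, d), t);  /* modf/floor round to 0 ns */
    printf("%d\n", c);                      /* prints 0, but d != 0.0 */
    return 0;
}
------------------------------------------------------------------------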
In message <E0zQZGb-0003Om-00@heaton.cl.cam.ac.uk>, Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk> writes
Several commentors have expressed a desire to have a few basic predefined arithmetic functions on xtime values.
I suggest to add the following three functions. These functions are so simple that I feel the easiest way of specifying their behaviour is by providing a sample implementation (i.e., using C as the specification language instead of English):
Sorry, but if they're that simple, why bother ? -- Clive D.W. Feather | Regulation Officer, LINX | Work: <clive@linx.org> Tel: +44 1733 705000 | (on secondment from | Home: <cdwf@i.am> Fax: +44 1733 353929 | Demon Internet) | <http://i.am/davros> Written on my laptop; please observe the Reply-To address
"Clive D.W. Feather" wrote on 1998-10-08 19:16 UTC:
In message <E0zQZGb-0003Om-00@heaton.cl.cam.ac.uk>, Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk> writes
Several commentors have expressed a desire to have a few basic predefined arithmetic functions on xtime values.
I suggest to add the following three functions. These functions are so simple that I feel the easiest way of specifying their behaviour is by providing a sample implementation (i.e., using C as the specification language instead of English):
Sorry, but if they're that simple, why bother ?
1) Because much more than one commentor requested them.

2) Because they can be expected to be present in higher programming languages with operator overloading anyway, and for the people who develop these bindings it will be helpful to have some guidance regarding the semantics.

3) The implementation certainly is trivial, but the semantics in the context of leap seconds cost me a few hours of scratching my head before I arrived at the ones with which I am happy now. I threw away a number of alternative functions that did not result in the nice algebraic properties and the consistency with the TIME_UTC model that the ones I proposed now have (see the assertion).

4) The functions are useful for discussions about the API, as they allow me to quickly write down algorithms based on this API (see my previous postings, where I already used them).

I believe that there is some expert insight in the selection of these functions, and that the majority of programmers might end up with less favourable functions if they are given 10 minutes to write them. The operator overloading in a C++ or Ada API will probably be more orthogonal and complete and will also include functions that replace the double by xtime, but given the presented functions, the semantics of these additional operators should be very clear and obvious; therefore I did not suggest adding them to the C API as well. Since there is no operator overloading in C, there is the danger of overwhelming the user with too many functions if you provide full orthogonality over all possible type signatures (double, xtime, int, etc.). C is a bit too primitive for designing really nice APIs.

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>