draft C9x <time.h> (followup to Clive Feather's 18 Jun comments)
   Date: Thu, 18 Jun 1998 17:32:40 +0100
   From: "Clive D.W. Feather" <clive@on-the-train.demon.co.uk>

   This explains, I hope, why there are no reentrant versions of the
   functions in <time.h> - no-one expressed any desire to have them
   added (or if they did, they didn't do anything about it).

It's fair to say that the C9x deliberations on <time.h> have not been
publicized well outside the committee. The public C9x draft was the
first I'd heard of it, and I try to follow time zone programming issues
fairly closely. On this particular issue, there is considerable
expertise outside the committee, and I hope that the committee will be
open to careful criticisms by outside experts.

There is a widely recognized need for ``reentrant'' versions of
localtime, gmtime, etc. The need is so widely recognized that these
functions are now in POSIX.1. It would be a shame for C9x to omit
them -- there's nothing POSIX-specific about them. (I quoted
``reentrant'' above because the POSIX.1 functions aren't really
reentrant if you are changing locales or time zones on the fly, but
that's a larger subject and I presume that C9x won't be able to
address it.)

   7.16.1 para 3:
   Replacing tm_zone with tm_gmtoff (except that it ought to be
   called tm_utcoff)

True, but (alas) the name `tm_gmtoff' is in common use, and it's
consistent with the name `gmtime'.

   The tm_ext and tm_extlen members are not a kludge, but rather an
   anti-kludge! ... WG14 felt the adopted proposal was optimal.

I'm afraid that the only opinions that I've seen (which are not from
WG14) are that the tm_ext and tm_extlen members are a kludge. They
don't reflect existing practice, and there's no precedent for them in
other parts of the standard. They are an experiment invented by the
committee, and it's quite likely that they won't work well in practice.

Let me give an example problem that might arise with these new members.
I assume that the implementation can allocate storage and assign its
address to tm_ext; surely this is part of the point of tm_ext. Can it
later free that storage? If so, then how can an application ever copy
struct tmx values returned by the implementation, since their tm_ext
members might be invalid? If not, then won't there be problems either
with garbage collection or reused static storage? The storage
allocation problem becomes worse if we have reentrant functions like
localtime_r, since then the static storage solution is not even
feasible and we are forced to deal with garbage collection issues.
For this reason alone, tm_ext should go.

You mentioned that the committee considered other possible approaches;
what were they, and what objections did the committee have to them?
Perhaps I can suggest improvements on them that would overcome the
objections.

   7.16.2.3 para 4:
   I don't understand your objection. The paragraph you cite means,
   in effect:

       the normalization process shall not alter a broken-down time
       that was generated by the normalization process

But the standard also requires that the second call to mktime must
return the same time_t value as the first call. That is what makes
the requirement unrealistic.

I see that I should have given more details about the problem (and I
apologize for not being clearer in my earlier comments). Here's an
example. Suppose I am in Sri Lanka, and invoke mktime on the
equivalent of 1996-10-26 00:15:00 with tm_isdst==0. There are two
distinct valid time_t values for this input, since Sri Lanka moved the
clock back from 00:30 to 00:00 that day, permanently. There is no way
to disambiguate these two time_t values with tm_isdst, since both
times are standard time. Therefore, mktime should be entitled to
return either time_t value.
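The ambiguity in the Sri Lanka example can be sketched in a few lines of C. This is an illustration only, not a real mktime. The names TRANSITION, utc_offset, local_seconds and candidates are hypothetical. Time values are plain counts of seconds since 1970-01-01 00:00:00 UTC with leap seconds ignored, and the model assumes UTC+06:30 before the change and UTC+06:00 after it.

```c
/* Hypothetical model: the instant at which the local clock stepped back
   from 00:30 to 00:00, i.e. 1996-10-25 18:00:00 UTC. */
#define TRANSITION 846266400L

/* UTC offset in seconds in effect at time t (assumed values). */
static long utc_offset(long t)
{
    return t < TRANSITION ? 23400L : 21600L;   /* +06:30, then +06:00 */
}

/* What the local clock reads at time t, as a running count of seconds. */
static long local_seconds(long t)
{
    return t + utc_offset(t);
}

/* Stores every time value whose local reading equals `local` in out[]
   and returns how many were found (0, 1 or 2 in this model). */
static int candidates(long local, long out[2])
{
    static const long offsets[2] = { 23400L, 21600L };
    int n = 0;
    for (int i = 0; i < 2; i++) {
        long t = local - offsets[i];
        if (local_seconds(t) == local)
            out[n++] = t;
    }
    return n;
}
```

Calling candidates with 846288900 (1996-10-26 00:15:00 on the local clock) finds two distinct values, 30 minutes apart. Both are standard time, so tm_isdst cannot choose between them.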
On examples like these, at least one mktime implementation (mine :-)
can return different time_t values for the same input at different
times during the execution of the program, since it uses a cache to
improve performance. It's unreasonable for the standard to disallow
this performance improvement.

   To normalize a struct tm, the implementation should do the
   equivalent of:
   - copy it to a struct tmx
   - set the additional fields and tm_isdst as described
   - normalize the result
   - copy the relevant fields back, except for tm_isdst which is just
     set to negative, zero, or positive as needed.

But the last step of this process loses information, and the next
invocation of mktime cannot reasonably be expected to intuit the lost
information, as shown in the example above.

   7.16.2.4 para 3:
   If tm_isdst is negative the zone information is not available, so
   the implementation should assume that "local time" has allowed for
   any DST in effect - in this case, 7.16.2.6 will set X2 to 0 in the
   algorithm.

Thanks for the clarification. Perhaps you could add some text to the
standard about this?

   7.16.2.6 para 1:
   A negative tm_isdst, in both struct tm and struct tmx, means
   "unavailable". If POSIX.1 tries to give it a different meaning,
   POSIX.1 is broken.

I wasn't referring to tm_isdst's value when I wrote ``negative
daylight-saving time''; I was referring to a daylight-saving UTC
offset that is less than the standard time UTC offset. I don't know
that this has ever happened in practice (I've often heard rumors about
it but none of them have ever checked out), but I'm leery that draft
C9x disallows it, since POSIX.1 does allow it.

   Note that you never need to handle negative DST: take all the UTC
   offsets that the locale uses, and make the most negative the base
   time with all others being DST (in other words, replace negative
   "summer" time with positive "winter" time).

But then tm_isdst can be nonzero even when daylight-saving time is not
in effect. E.g.
under your proposal, tm_isdst should be nonzero now in Sri Lanka (even
though Sri Lanka does not now observe daylight-saving time), because
Sri Lanka's UTC offset now (+0600) is greater than some UTC offset
that it has had in the past (e.g. +0520). If this is what is really
meant by struct tmx's tm_isdst member, then the member's name is
misleading. Its name should be
`tm_offset_from_most_negative_historical_gmtoff' or something like
that. But frankly, I don't see how such a member's contents would be
useful in practice, and I think users should avoid it entirely.

   7.16.2.6 para 2:
   These limits were chosen to allow calculations to be done in longs
   without having to make excessive effort to avoid overflow.

Even with no limits, a calculation shouldn't need excessive effort; it
should need only a relatively small sanity check near the end.

   Your statement that you can't calculate today's date is wrong:

       tm_mday = time_t_now / 86400;
       tm_sec = time_t_now % 86400

My Unix host has leap second support, so that method doesn't work.

   However, I do understand your concern. I have no objection in
   principle to changing these limits....

Good; let's remove them.

   7.16.2.6 para 3:
   (1) The algorithm assumes that "day" means 86400 seconds at all
   times

This behavior is not reasonable in the presence of leap seconds. It's
not what users will expect or want. If I'm writing an accounting
application and ask for 1 day after midnight December 31, I don't want
mktime to return December 31 merely because that day happens to have a
leap second!

Also, this behavior is not consistent with the other well-established
properties of mktime. If I am at the start of a month and add 1
month, mktime returns the start of the next month, regardless of how
many days are in the current month. Similarly, if I am at the start
of a minute and add 1 minute, mktime should return the start of the
next minute, regardless of how many seconds are in a minute.
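The month behavior described above can be observed with the standard mktime. The following is a minimal sketch. The helper names start_of_month and add_one_month are hypothetical, and it assumes the local time zone has no clock change at the midnights involved.

```c
#include <time.h>

/* Broken-down local time for midnight on the first day of the given
   month (mon is 0-11, year is the full year such as 1998). */
static struct tm start_of_month(int year, int mon)
{
    struct tm tm = {0};
    tm.tm_year = year - 1900;
    tm.tm_mon = mon;
    tm.tm_mday = 1;
    tm.tm_isdst = -1;          /* let the implementation decide about DST */
    return tm;
}

/* Adds one month in place and lets mktime normalize the result.
   Returns 0 on success, -1 if mktime reports failure. */
static int add_one_month(struct tm *tm)
{
    tm->tm_mon += 1;           /* may become 12; mktime carries into tm_year */
    tm->tm_isdst = -1;
    return mktime(tm) == (time_t)-1 ? -1 : 0;
}
```

Starting from February 1998 (28 days) the result is 1 March 1998, and starting from December 1998 it is 1 January 1999. The caller never supplies the month's length.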

   If you can construct an alternative algorithm I'll be pleased to
   look at it.

There's no simple answer to this (which is why I don't think it ought
to be in the standard). But since you asked, one way to do it is to
normalize seconds-per-minute in a way similar to days-per-month. This
is done in my free version of mktime. You can get a copy of a recent
version from the latest GNU Emacs source code
(<ftp://ftp.gnu.org/pub/gnu/emacs-20.2.tar.gz>; see the file
src/mktime.c) or from recent GNU/Linux C library sources.

   (2) S and D *are* determined - while they are
   implementation-defined in some circumstances (those when X1 and X2
   are used), the implementation should only be able to pick one
   value for any given unambiguous input and environment.

In order to choose X1 and X2, the implementation (in general) must
know S and/or D. For example, how can the implementation choose the
number of leap seconds to insert (X1) until it knows which day we're
talking about (D)? So the definition looks circular to me.

It sounds like you're trying to break the circularity by saying that
the implementation consults an oracle to choose X1 and X2. But in
that case, more explanation is needed. For more about this please see
(5) below.

   (3) It is C code (as should be clear from the typeface). There is
   no possibility of overflow if the limits of paragraph 2 are kept
   to,

Yes there is. For example, suppose tm_hour == INT_MAX && INT_MAX ==
32767. Then tm_hour*3600 overflows, even though tm_hour satisfies the
limits of paragraph 2. This is just the first example I found; there
are others. Also, if you remove the limits (as suggested by the
comments to 7.16.2.6 para 2 above), then the overflow problem becomes
worse if you assume the spec is written in C.
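The tm_hour*3600 overflow described above can be avoided with a range check before the multiplication. A minimal sketch follows. The helper name hours_to_seconds is hypothetical and is not taken from the draft or from any existing mktime.

```c
#include <limits.h>

/* Computes hour * 3600 in int arithmetic without ever evaluating an
   overflowing product.  Returns 1 and stores the result on success;
   returns 0 and leaves *result untouched if the product would not
   fit in an int. */
static int hours_to_seconds(int hour, int *result)
{
    if (hour > INT_MAX / 3600 || hour < INT_MIN / 3600)
        return 0;
    *result = hour * 3600;
    return 1;
}
```

With hour == INT_MAX the check fails on every implementation, whatever the width of int, so the caller can report an error instead of running into undefined behavior.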
If there is to be a detailed spec like this at all (which I'm not yet
convinced of, considering its problems), I suspect that it will be
much easier to write it using mathematical arithmetic than using C
arithmetic; and if done well, it won't be any harder to implement.

   and there are no promotions or conversions happening as far as I
   can tell.

The pseudocode doesn't declare the types of SS, M, Y, Z, D, or S.
However, I presume that `int' won't suffice due to potential overflow
problems like the one discussed above; and in that case, there will be
promotions or conversions. (This problem is another argument for
using mathematical notation instead of C here.)

   (5) I wrestled with this problem myself and failed to come up with
   a good answer.... can you find a better way of expressing it?

Sorry, I can't think of a simple change that will correct the problem.
The only avenue that I can think of is that the section could be
reformulated as a _constraint_ on S and D, not as a way of
_determining_ S and D. That is, S and D would not be uniquely
determined by the inputs. But this will require some real thought.

If time is pressing, I suggest removing this section completely. If
that's not acceptable, perhaps you can put in some vague English that
expresses the intent.

   (6) Yes, an error does seem to have crept into the definition of
   D. The first line should read:

       D = Y * 365 + QUOT(Z,400) * 97 + REM(Z,400) / 4 - REM(Z,400) / 100 +

This response doesn't address my criticism that the definition of D is
unmotivated. You wrote in response to (5) that ``Effectively D and S
are the TAI date and time since some epoch,'' but you didn't say what
the epoch is, nor did you say what ``effectively'' means. Please add
comments so that it's absolutely clear what D and S are, so that other
people can check your work.
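As an illustration of a day count whose epoch and terms are stated explicitly, here is a minimal sketch of a well-known Gregorian day-number calculation. It is neither the draft's definition of D nor the alternative proposed in the earlier comments. The function name days_since_march_epoch and the choice of 0000-03-01 as epoch are assumptions made for this example.

```c
#include <assert.h>

/* Days from 0000-03-01 to year-month-day in the proleptic Gregorian
   calendar.  month is 1-12 and day is 1-31.  Only dates on or after
   0000-03-01 are accepted, so every division is of non-negative values. */
static long days_since_march_epoch(long year, int month, int day)
{
    /* Count years from March so that a leap day is the last day of
       its year: January and February belong to the previous year. */
    long y = year - (month <= 2);
    int m = (month + 9) % 12;              /* March = 0, ..., February = 11 */

    assert(y >= 0);
    return 365 * y + y / 4 - y / 100 + y / 400   /* whole years plus leap days */
         + (153 * m + 2) / 5                     /* days in the months before m */
         + (day - 1);                            /* days before this one in the month */
}
```

Because the epoch is named and each term is commented, individual values are easy to check by hand: the epoch itself maps to 0, and 1970-01-01 maps to 719468.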
My comments on this section proposed a (slightly different) definition
for D in which the epoch is 0000-03-01; this change doesn't need to be
adopted as-is, but whatever definition that _is_ adopted ought to be
explained. I currently have the impression that nobody other than
yourself has ever completely understood the definition as it now
stands. This is an unsatisfactory state of affairs.

   The rest of the expression is correct (including the Y at the
   start).

Why is it Y and not Z?

   7.16.3.5:
   The zonetime function is the struct tmx version of localtime and
   gmtime, but instead of offering only 2 choices of zone it offers
   any zone.

No, zonetime doesn't offer _any_ time zone; it offers only local times
that can be characterized by a single, invariant UTC offset. In
practice, local times usually cannot be characterized this way,
because they involve daylight-saving time, or historical changes to
the underlying standard UTC offset, or both.

For example, I can't use zonetime to determine the broken-down time
for `*timer' in London's time zone, because London observes
daylight-saving time (and also because London's standard UTC offset
has not always been zero).

This means that `zonetime' is of only limited use in practical
applications. It hardly seems worth adding to the standard.

   7.16.1 para 2:
   The limits of 14400 are correct. This allows you to adjust by up
   to 10 days (not 1 day) in an unnormalized time without risking
   accidentally using _LOCALTIME. Ideally _LOCALTIME would be
   something like INT_MAX or LONG_MIN.

OK; but please add a footnote to this effect, as it's confusing
otherwise.

   7.16.2.6 para 3:
   The macros can't hit overflow with the limits of paragraph 2.

Yes they can, because those limits are in terms of LONG_MAX, but
sometimes the arguments of the macros are int, not long.
(Also, as mentioned earlier, if the limits are removed, the macros can
overflow even if int and long are the same size; but all this is
irrelevant if we're using mathematical notation, not C.)

Finally, I didn't see any response to my comment (quoted below) for
section 7.16.3.6 paragraph 5; did I miss something?

   ``If this value is outside the normal range, the characters stored
   are unspecified.''

   What is the ``normal range''? The range as output by localtime,
   the range of the Gregorian calendar, or the limits as specified in
   7.16.2.6?
participants (1)
-
Paul Eggert