comments on draft ISO C9x changes to <time.h>

June 12, 1998

      The ISO committee in charge of the C language has issued a draft for
C9x, the next major revision to C.  A copy of this (large) document is
available in:

http://osiris.dkuug.dk/JTC1/SC22/open/2620/n2620/

Section 7.16 of this draft C standard proposes a major overhaul of the
functions and datatypes defined in <time.h>.  It adds a new data type
`struct tmx' that is struct tm extended with the following members:

	int tm_version;	// version number
	int tm_zone;	// time zone offset in minutes from UTC [-1439,+1439]
	int tm_leapsecs;// number of leap seconds applied
	void *tm_ext;	// extension block
	size_t tm_extlen; // size of extension block

Also, a struct tmx's tm_isdst is the positive number of minutes of
offset if DST is in effect.  New functions mkxtime and strfxtime use
struct tmx instead of struct tm; a new function

       struct tmx *zonetime (const time_t *timer, int zone);

is the rough analog of localtime and gmtime for struct tm.

I've submitted the following comments to the ISO committee for their
review.  A copy of these comments (along with all other US public
comments on Committee Draft 1) can be found in:
http://osiris.dkuug.dk/JTC1/SC22/WG14/www/docs/n834.htm

Category: Feature that should be removed
Committee Draft subsection: 7.16
Title: changes to <time.h> need a lot of work and should be withdrawn for now
Detailed description:   

  Background and comments

    Draft C9X introduced a new time struct tmx, new macros
    _NO_LEAP_SECONDS and _LOCALTIME, and new functions mkxtime,
    zonetime, and strfxtime.

    These new functions seem to be an invention of the committee;
    they are not based on existing practice, and in some cases
    even ignore longstanding existing practice.  The new functions
    do not address many of the common problems observed with the
    C89 primitives, notably with mktime.  Nor do they add much
    functionality.

    For example, a common extension to C, now required by POSIX.1, is
    to provide reentrant versions of localtime, gmtime, etc.  This
    fills a genuine need, but it's not addressed by draft C9X.
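
    As an illustration of the reentrant interface mentioned above,
    here is a minimal sketch assuming a POSIX.1 host that provides
    gmtime_r.  The helper name utc_fields is hypothetical; the point
    is only that the caller supplies the result buffer, so two
    results can coexist.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Fill *out with the UTC broken-down time for t.  Unlike plain gmtime,
   gmtime_r writes into caller-supplied storage rather than a shared
   static buffer.  Returns 0 on success, -1 on failure.
   utc_fields is a hypothetical helper name used only for this sketch. */
int utc_fields(time_t t, struct tm *out)
{
    return gmtime_r(&t, out) ? 0 : -1;
}
```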

    There are also other genuine needs that are not addressed; just
    look at, say, the harsh words about mktime expressed by the author
    of the tide-calculation program XTide in its source code
    <http://www.universe.digex.net/~dave/files/xtide-1.6.2.tar.gz>.
    Draft C9X addresses few of the needs expressed by this author.

    Here are some more detailed comments on technical shortcomings
    in this area.

      Section 7.16.1 paragraph 3.

	The tm_zone member is an integer number of minutes.  However,
	common practice (e.g. SunOS 4.x, BSD/OS, Linux) is to have a
	member named tm_gmtoff that is a long number of seconds.  This
	is required for proper support of POSIX.1, which lets the user
	specify UTC offset to the second; it is also required for
	proper support of historical applications.  For example, the
	UTC offset of Liberia was 44 minutes and 30 seconds until May
	1972, and any program running on, say, Linux with the TZ
	environment variable set to "Africa/Monrovia" cannot operate
	correctly if the UTC offset is required to be a multiple
	of 60 seconds.
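
	The arithmetic behind the Liberia example can be sketched as
	follows.  The helper and macro names are hypothetical, and the
	negative sign assumes the tm_gmtoff convention of seconds east
	of UTC (Monrovia lies west of Greenwich).

```c
/* Return 1 if a UTC offset given in seconds can be stored exactly
   as a whole number of minutes, 0 otherwise.
   fits_in_minutes is a hypothetical helper for illustration. */
int fits_in_minutes(long gmtoff_seconds)
{
    return gmtoff_seconds % 60 == 0;
}

/* Offset of 44 minutes 30 seconds, taken as west of UTC
   (sign convention is an assumption of this sketch). */
#define MONROVIA_OLD_GMTOFF (-(44L * 60 + 30))
```

	Here MONROVIA_OLD_GMTOFF is -2670 seconds, which a member
	counting whole minutes cannot hold without rounding.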

	The tm_ext and tm_extlen members are an unprecedented kludge
	in the standard library spec.  This is not C++!  If the
	specification for struct tmx is incomplete, this suggests that
	the editorial work is not done and this type should be
	withdrawn from the standard.

      Section 7.16.2.3 paragraph 4.

        Here, draft C9X added the following new specification for mktime:

	   If the call is successful, a second call to the mktime
	   function  with  the  resulting  struct tm value shall always
	   leave it unchanged and return the same value  as  the  first
	   call.  (*)

	This specification is reasonable for mkxtime, but for mktime
	it requires changes to existing practice in a way that breaks
	existing software.  Existing software often assumes that
	tm_isdst is either negative, 0, or 1; C89 does not guarantee
	this, but it is common existing practice, so software that
	makes this assumption is portable in practice.

	Unfortunately, specification (*) cannot be satisfied without
	either adding hidden members to struct tm (which breaks binary
	compatibility) or stuffing more information into tm_isdst
	(which breaks the programs described above).

	Granted, programs shouldn't assume that a positive tm_isdst
	is 1, but it's very common in POSIX.1 programs to see
	expressions like `tzname[tm->tm_isdst]', and these expressions
	won't work if tm_isdst contains large values.
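
	A minimal sketch of the indexing problem follows.  The table
	names is a hypothetical stand-in for POSIX tzname, so that the
	example does not depend on the host's time zone settings, and
	mapping a negative tm_isdst to the standard name is a choice
	made only for this sketch.

```c
/* Pick a zone abbreviation from a two-entry table laid out like
   POSIX tzname.  Writing names[isdst] directly would index past the
   end of the table if isdst held, say, 60 (minutes of DST offset).
   Collapsing any positive value to 1 keeps the index in range. */
const char *zone_abbr(const char *const names[2], int isdst)
{
    return names[isdst > 0];
}
```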

      Section 7.16.2.4 paragraph 3.

	If tm_zone is _LOCALTIME, and if tm_isdst is preposterous
	(e.g. negative, or INT_MAX), this specification is unclear
	about what to do.  The comments in 7.16.2.6 don't help much.

      Section 7.16.2.6 paragraph 1.

	The specification for tm_isdst does not allow for negative
	daylight-saving time.  I don't know of any historical practice
	for this, but POSIX.1 allows it, and implementations that
	support POSIX.1 have to allow for it.

      Section 7.16.2.6 paragraph 2.

 	The limits on ranges for struct tmx members are unreasonable.
	Common existing practice, for example, is to invoke mktime
	with a large value for tm_sec to compute a time stamp at some
	distance from the POSIX.1 epoch.  If int and long are the same
	size, this runs afoul of the new restriction in this section,
	which limits tm_sec to one-eighth of the potential range.
	With this limitation I cannot even use mktime to compute
	today's date on my Unix host from today's time_t value!
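
	The practice described above can be sketched as follows,
	assuming a POSIX host where the value fits in int.  The helper
	name from_epoch_seconds is hypothetical, and the sketch forces
	TZ to UTC (a process-wide side effect) only so that the result
	is predictable.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Break down `seconds` past the POSIX epoch by handing mktime a
   deliberately out-of-range tm_sec and letting it normalize the
   struct.  Assumes mktime accepts arbitrary tm_sec values, as
   common implementations do.  Hypothetical helper for illustration. */
time_t from_epoch_seconds(int seconds, struct tm *out)
{
    setenv("TZ", "UTC0", 1);   /* assumption: UTC, for repeatability */
    tzset();

    struct tm tm = {0};
    tm.tm_year = 70;           /* 1970 */
    tm.tm_mon = 0;             /* January */
    tm.tm_mday = 1;
    tm.tm_sec = seconds;       /* far outside [0, 60] on purpose */
    tm.tm_isdst = 0;

    time_t t = mktime(&tm);    /* normalizes tm in place */
    *out = tm;
    return t;
}
```

	With seconds set to 897609600, which exceeds one-eighth of
	INT_MAX when int is 32 bits wide, the normalized struct should
	describe 1998-06-12 00:00:00.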

	The other limits are also unnecessary.  A well-written mktime
	should work in the presence of arbitrary values in struct
	tm members; similarly for mkxtime.

      Section 7.16.2.6 paragraph 3.

	There are so many errors in this section that it is hard to
	determine what is intended.  But from what I can tell, the
	intent is wrong.  For example, it seems to be saying that if
	the implementation supports leap seconds, and if local time is
	UTC, and if I have a struct tmx that corresponds to 1997-06-30
	00:00:00, and then add 1 to tm_mday and invoke mkxtime, I
	should get 1997-06-30 23:59:60 due to the intervening leap
	second.  This is not what I, the programmer, want or expect!

	The first sentence in this paragraph reads ``Values S and D
	shall be determined as follows''.  But the rules that follow
	do not _determine_ S and D; they merely place _constraints_
	on S and D.  This is because the implementation has some leeway
	in choosing X1 and X2.

	It's not clear in this paragraph whether we're looking at C
	code or mathematics.  Are we supposed to be using all the C
	rules for promotion, conversion, and overflow, or are the
	calculations to be done using mathematical integer arithmetic?

	The last sentence in the comment about X1 and X2 is
	incoherent; I really can't make out what it means.

	For the implementation to determine X1 and X2, it needs to
	know what D and S are.  But D and S are computed from X1 and
	X2!  More explanation is needed before I can really figure out
	what's intended here.

	The definition of D is completely unmotivated, and does not
	obey the rules of the Gregorian calendar.  Among other things,
	it uses / and % in places where it should use QUOT and REM.
	(And it can't possibly be right without a `100' in it
	somewhere.  :-) The definition should be rewritten to be
	something like the following.  (Sorry, I haven't tested this,
	as it's less than 30 minutes before the deadline for
	submitting comments in the US as this sentence is being
        written.)

	  D = // day offset since 0000-03-01

	      // contribution from year
	      Z*365 // number of non-leap days since 0000-03-01
	      + QUOT(Z, 4) // Every 4 years ends in a leap year.
	      - QUOT(Z, 100) // Every 100 years ends in a nonleap year.
	      + QUOT(Z, 400) // Every 400 years ends in a leap year.

	      // contribution from month; note we start from 03-01
	      + ((int []){ ...yday offsets, starting in March ...})
			[REM(M - 2, 12)]

	      // contribution from day of month
	      + tm_mday - 1

	      // contribution from time of day
	      + QUOT(SS, 86400)

	except of course that the expression QUOT(SS, 86400) mishandles
	leap seconds as described above.
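
	One way the date portion of this formula might be written out
	in C is sketched below.  The function name day_offset and the
	contents of the month table are assumptions of this sketch, not
	text from the draft; the time-of-day term QUOT(SS, 86400) is
	left out, and y and m are assumed to be already normalized.

```c
/* Floor-style quotient and remainder for positive b; these rely on
   division truncating toward zero. */
#define QUOT(a, b) ((a) / (b) - ((a) % (b) < 0))
#define REM(a, b)  ((a) % (b) + (b) * ((a) % (b) < 0))

/* Days from 0000-03-01 (proleptic Gregorian) to the given date.
   y: full year; m: months since January, 0..11; mday: 1-based day.
   The table is this sketch's fill-in for the per-month day offsets
   counted from March. */
long day_offset(long y, int m, int mday)
{
    static const int since_march[12] = {
        0, 31, 61, 92, 122, 153, 184, 214, 245, 275, 306, 337
    };
    long z = y - (m < 2);   /* January and February count with the prior year */

    return z * 365 + QUOT(z, 4) - QUOT(z, 100) + QUOT(z, 400)
           + since_march[REM(m - 2, 12)]
           + mday - 1;
}
```

	As a sanity check, the difference between the offsets for
	1998-06-12 and 1970-01-01 should be 10389 days.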

      Section 7.16.3.5

	This new function zonetime is of only marginal use; it seems to
	be present mostly as a way of defining how mkxtime works.

	The definition of leap seconds is incorrect.  Leap seconds are
	not a UTC-UT1 offset.  The absolute value of the difference
	between UTC and UT1 is at most 0.9 seconds, by definition.

    The changes to 7.16 seem to be hastily edited: there are a number
    of what seem to be typographical errors.  The changed text is not
    explained, and the typos make it hard to understand what was
    intended.  Here are some of the typos that I spotted despite these
    problems:

      Section 7.16.1 paragraph 2.  _LOCALTIME ``must be outside the
      range [-14400, +14400].''  Presumably this should be [-1440,
      +1440], i.e. one day's worth not ten.

      Section 7.16.2.6 paragraph 3.

	The definition for QUOT yields numerically incorrect results
	if (b)-(a) or (b)-(a)-1 overflows.  I suggest replacing it
	with the following definition, which is clearer and free of
	problems with overflow.  This definition relies on C9X's new
	guarantees about integer division.

	  #define QUOT(a,b) ((a)/(b)  -  ((a)%(b) < 0))

	Similarly, REM can overflow if (b)*QUOT(a,b) overflows.  Here
	is a better version.

	  #define REM(a,b) ((a)%(b)  +  (b) * ((a)%(b) < 0))

	The definition of Z can be written more compactly as:

	  Z = Y - (M < 2);
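
	To illustrate how these definitions behave on negative
	operands and near the limits of int, here is a small sketch.
	The wrapper names floor_quot and floor_rem are hypothetical,
	and b is assumed to be positive.

```c
#include <limits.h>

#define QUOT(a, b) ((a) / (b) - ((a) % (b) < 0))
#define REM(a, b)  ((a) % (b) + (b) * ((a) % (b) < 0))

/* Quotient rounded toward negative infinity, for b > 0.
   Plain a / b truncates toward zero, so -1 / 4 is 0 while
   floor_quot(-1, 4) is -1. */
int floor_quot(int a, int b) { return QUOT(a, b); }

/* Remainder in the range [0, b), for b > 0.
   Plain -1 % 12 is -1 while floor_rem(-1, 12) is 11. */
int floor_rem(int a, int b) { return REM(a, b); }
```

	Neither macro forms a product or difference larger in magnitude
	than its operands, so arguments such as INT_MIN do not overflow
	as long as b is positive.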

      Section 7.16.3.6 paragraph 5.

	``If this value is outside the normal range, the characters stored
	are unspecified.''  What is the ``normal range''?  The range as
	output by localtime, the range of the Gregorian calendar, or
	the limits as specified in 7.16.2.6?

  Suggestion

    Drop all changes to the <time.h> section for this revision of
    the C Standard.

    Bring in experts in this area for the next revision of the
    C Standard.  I suggest working together with the members of the
    Time Zone Mailing list <tz@elsie.nci.nih.gov>.

    Build on existing practice rather than relying on committee
    inventions, which have been error-prone in this area.

    If these suggestions are not followed, a lot of changes are
    needed to this section, as suggested by the above discussion;
    please contact me if you need more details.

Paul Eggert