
"Clive D.W. Feather" wrote on 1997-06-30 13:14 UTC:
You may be interested to know that WG14 adopted last week working paper N733, which adds the following items to the strftime() function:
%f is replaced by the weekday as a decimal number (1-7), where Monday is 1 (the ISO 8601 weekday number).
%F is equivalent to "%Y-%m-%d" (the ISO 8601 date format).
%T is equivalent to "%H:%M:%S" (the ISO 8601 time format).
%V is replaced by the ISO 8601 week number of the year (weeks begin on a Monday, and week 1 is the week that includes both January 4th and the first Thursday of the year) as a decimal number (00-53).
Thanks a lot for forwarding this interesting working paper. Please forward the following comments to the authors of N733 and whoever else might be interested. If possible, I would like to review the final text of the new strftime() definition (will the current draft be on the Web somewhere?).

Comments on WG14 proposal N733
------------------------------

Markus Kuhn <kuhn@cs.purdue.edu> -- 1997-06-30 (= 1997-W27-1)

I appreciate this proposal and I have the following related comments.

a) The valid range for %V numbers is 01-53. There is no week 00 in the ISO 8601 week numbering scheme. The week before a week 01 is either week 52 or week 53 of the previous year.

b) %V alone is not sufficient to be able to use the ISO week numbering system. Each week is associated with a year, the year in which the majority of the days of this week fall, and this is not necessarily the year in which all days of the week fall. For instance: The week 1999-W52 goes from 1999-12-27 to 2000-01-02, in other words, the day 2000-01-02 has a week notation of the form 1999-W52-7. There should be another format descriptor for the year to which the current ISO week belongs, preferably both in 4-digit (%G) and 2-digit (%g) form. If you implement an algorithm for %V, you'll get the value of %G anyway as a by-product very easily, and therefore it should be made available to the strftime() user.

c) Existing practice: The Olson tzcode package <ftp://elsie.nci.nih.gov/pub/> contains a widely used strftime() implementation that already supports:

%u  ISO 8601 week day number (1 = Monday, 7 = Sunday)
%V  ISO 8601 week number (01-53)
%G  ISO 8601 year of current week, 4-digits
%g  ISO 8601 year of current week, 2-digits

Unless there is a good rationale for the characters suggested by N733, I would suggest to stick with %u instead of %f for the weekday number, and I hope that you will add %G and %g as used in the Olson package and Arnold Robbins' strftime version 3.0.
Therefore, if I evaluate on 1977-01-02 the string %G-W%V-%u, I should get 1976-W53-7, and on 1975-12-29 I should get 1976-W01-1.

Not directly related to N733, but affecting the same part of the standard, I have a number of other suggestions:

d) The range for %S and tm_sec is currently defined to be 00-61 to provide for "as many as two leap seconds". This was based on a serious misunderstanding: there can never be two leap seconds per day, as becomes very obvious by reading ITU-R Recommendation TF.460-4 (I can send you a copy if you are interested). Since this 00-61 range is being widely quoted in other standards, this error should be fixed just to stop spreading this serious misconception of how leap seconds work. The correct range is 00-60. This is not an interoperability problem, but fixing this would make WG14 look like they know what they are doing, and it is therefore a good idea.

e) I wonder whether %W is used anywhere and whether this field could be dropped to simplify implementation and memory cost. Countries that start the week with Monday normally use ISO 8601 week numbers (%V) and not the scheme defined by %W. I suspect %W was defined based on a misconception of how week numbers work in Europe. Unless anyone can come up with an example where %W is used or needed, I suggest to drop it as it looks completely useless to me (and please don't quote standards that just copied the %W from ISO C).

f) In the definition of %y and %Y, the first two digits of a four digit year are referred to as the "century", which is problematic, since the years 1999 and 2000 belong to the 20th century, but 2001 belongs to the 21st century. Suggested better wording: "%y is replaced by the last two digits of the year as a decimal number (00-99)". Again, not a serious interoperability problem, but it makes WG14 look like they know what they are doing.
g) mktime() is the inverse function of localtime(), but there exists no portable inverse function for gmtime() that converts a struct tm given in UTC into time_t. This is a serious problem, and the addition of a new function (e.g., mkgmtime() might be a possible name) should be considered seriously. It is not possible to invert gmtime() in a 100% portable way in an application program, and in practice, I have encountered awful hacks like binary searches over the time_t range to invert gmtime() in an as portable as possible way.

See <http://www.ft.uni-erlangen.de/~mskuhn/iso-time.html> for further info.

Markus

--
Markus G. Kuhn, Computer Science grad student, Purdue University, Indiana, USA -- email: kuhn@cs.purdue.edu
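To illustrate point g), here is a rough sketch of what an inverse of gmtime() could look like if one is willing to give up strict portability. It is not a proposal text. The name sketch_mkgmtime() and the helper days_from_civil() are made up for this example, and the code assumes (as POSIX does, but ISO C does not) that time_t is an arithmetic count of seconds since 1970-01-01 00:00:00 UTC with leap seconds ignored. Unlike mktime(), it does not normalise out-of-range fields.

```c
#include <assert.h>
#include <time.h>

/* Days between 1970-01-01 and the given proleptic Gregorian date.
   month is 1..12, day is 1..31. Hypothetical helper for this sketch. */
static long days_from_civil(long year, int month, int day)
{
    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
    year -= month <= 2;                        /* count Jan/Feb with the previous year */
    long era = (year >= 0 ? year : year - 399) / 400;
    long yoe = year - era * 400;               /* year of the 400-year era, 0..399 */
    long doy = (153 * (month + (month > 2 ? -3 : 9)) + 2) / 5 + day - 1; /* day of March-based year */
    long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;                   /* day of era, 0..146096 */
    return era * 146097 + doe - 719468;        /* shift so that 1970-01-01 is day 0 */
}

/* ASSUMPTION: time_t counts seconds since the 1970 epoch without leap
   seconds. ISO C does not promise this, which is exactly the problem
   described in g). Fields are taken as given, not normalised. */
static time_t sketch_mkgmtime(const struct tm *tm)
{
    long days = days_from_civil(tm->tm_year + 1900L, tm->tm_mon + 1, tm->tm_mday);
    return (time_t)(days * 86400LL
                    + tm->tm_hour * 3600L
                    + tm->tm_min * 60L
                    + tm->tm_sec);
}
```

On a system where the assumption holds, sketch_mkgmtime(gmtime(&t)) should give back t. On any other system the result is meaningless, which is why a standard function would be preferable to code like this.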

My apologies for the extreme delay in addressing this. I have been busy and have also had trouble locating a copy of 8601 to look at.

"Markus G. Kuhn" <kuhn@cs.purdue.edu> writes
Thanks a lot for forwarding this interesting working paper. Please forward the following comments to the authors of N733 and whoever else might be interested. If possible, I would like to review the final text of the new strftime() definition (will the current draft be on the Web somewhere?).
I've taken your comments and will be producing a new paper that addresses them. No, I don't expect the draft to be on the web any time soon, though my papers are, at <http://www.gold.net/users/cdwf/c/>.
a) The valid range for %V numbers is 01-53.

b) %V alone is not sufficient to be able to use the ISO week numbering system. Each week is associated with a year, the year in which the majority of the days of this week fall, and this is not necessarily the year in which all days of the weeks fall.
I can't actually find any text in 8601 that addresses this issue. However, since what you describe appears to be current practice, I will work on that basis.
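For reference, the rule described in a) and b) can be sketched in C as follows. This is only an illustration of the week-numbering rule as stated in this thread, not text from ISO 8601 or from N733. The names struct iso_week, iso_week_of() and weeks_in_year() are invented for the example. It shows why the week's year falls out of the week-number calculation as a by-product.

```c
#include <assert.h>

/* Hypothetical result type for this sketch. */
struct iso_week {
    int year;     /* year the week belongs to (the %G value)  */
    int week;     /* week number 1..53        (the %V value)  */
    int weekday;  /* 1 = Monday .. 7 = Sunday (the %u value)  */
};

/* Weekday of December 31 of year y, with 0 = Sunday. */
static int dec31_wday(int y)
{
    return (y + y / 4 - y / 100 + y / 400) % 7;
}

/* A year has 53 weeks if it ends on a Thursday or starts on one. */
static int weeks_in_year(int y)
{
    return (dec31_wday(y) == 4 || dec31_wday(y - 1) == 3) ? 53 : 52;
}

/* year: calendar year, e.g. 1977
   yday: days since January 1 (0..365), as in tm_yday
   wday: days since Sunday (0..6), as in tm_wday */
static struct iso_week iso_week_of(int year, int yday, int wday)
{
    struct iso_week r;

    assert(yday >= 0 && yday <= 365 && wday >= 0 && wday <= 6);
    r.weekday = (wday == 0) ? 7 : wday;
    r.year = year;
    r.week = (yday - r.weekday + 11) / 7;   /* 0..53 before correction */

    if (r.week == 0) {
        /* Early January days that belong to the last week of the previous year. */
        r.year = year - 1;
        r.week = weeks_in_year(r.year);
    } else if (r.week == 53 && weeks_in_year(year) == 52) {
        /* Late December days that belong to week 1 of the next year. */
        r.year = year + 1;
        r.week = 1;
    }
    return r;
}
```

With these definitions, 1977-01-02 (yday 1, a Sunday) comes out as week 53 of 1976, and 1975-12-29 (yday 362, a Monday) as week 1 of 1976, matching the examples in the original comments. Week 00 never appears in the result.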
c) Existing practice: The Olson tzcode package <ftp://elsie.nci.nih.gov/pub/> contains a widely used strftime() implementation that supports already:

%u  ISO 8601 week day number (1 = Monday, 7 = Sunday)
%V  ISO 8601 week number (01-53)
%G  ISO 8601 year of current week, 4-digits
%g  ISO 8601 year of current week, 2-digits

Unless there is a good rationale for the characters suggested by N733,
N733 used the codes described by our Posix.2 liaison (Keld Simonsen). I will propose the codes you list instead.
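As an illustration of how these codes would be used together, here is a small sketch. It assumes a C library whose strftime() understands %G, %V and %u with the meanings listed above (the Olson implementation is said to; other libraries may not). The function name format_iso_week_date() is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: write the ISO 8601 week date (e.g. "1976-W53-7")
   for the given calendar date into buf. Returns the number of characters
   written, or 0 on failure.
   ASSUMPTION: the library's strftime() supports %G, %V and %u. */
static size_t format_iso_week_date(char *buf, size_t size,
                                   int year, int month, int day)
{
    struct tm tm;

    assert(buf != NULL && size > 0);
    memset(&tm, 0, sizeof tm);
    tm.tm_year = year - 1900;
    tm.tm_mon = month - 1;
    tm.tm_mday = day;
    tm.tm_hour = 12;      /* midday, to stay clear of DST transitions */
    tm.tm_isdst = -1;     /* let the library decide about DST */

    /* mktime() fills in tm_wday and tm_yday, which the week codes need. */
    if (mktime(&tm) == (time_t)-1)
        return 0;
    return strftime(buf, size, "%G-W%V-%u", &tm);
}
```

Called for 1977-01-02 and 1975-12-29, this should produce the two strings given in the original comments, provided the library behaves as assumed.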
Not directly related to N733, but affecting the same part of the standard, I have a number of other suggestions:

d) The range for %S and tm_sec is currently defined to be 00-61 to provide for "as many as two leap seconds". This was based on a serious misunderstanding
Yes: Jutta will tell you that I have been hammering on this for a *long* time. It *will* get fixed.
it becomes very obvious by reading ITU-R Recommendation TF.460-4 (I can send you a copy if you are interested).
Please.
e) I wonder whether %W is anywhere used and whether this field could be dropped to simplify implementation and memory cost.
Dropping things is usually a bad idea. It's unlikely to be a major cost to implementers, so I'd rather leave it in.
f) In the definition of %y and %Y, the first two digits of a four digit year are referred to as the "century", which is problematic,
Noted.
g) mktime() is the inverse function of localtime(), but there exists no portable inverse function for gmtime() that converts a struct tm given in UTC into time_t. This is a serious problem, and the addition of a new function (e.g., mkgmtime() might be a possible name) should be considered seriously. It is not possible to invert gmtime() in a 100% portable way in an application program,
The type time_t is intended to represent times in a zone-independent manner. Thus calling localtime() and gmtime() on the same time_t value should give results that reflect the relationship between UTC and the local zone. Part of my proposals add extra semantics (including zone knowledge) to mktime(), so that it will be possible to specify the UTC offset of the time to be represented.

Antoine adds:
It should be emphasized that N735 adds:

%#W - ISO 8601 week number            ) If %W would be zero, the date is
%#y - ISO 8601 week number year % 100 ) treated as belonging to week 53
%#Y - ISO 8601 week number year       ) of the previous year
This is unlikely to happen.

--
Clive D.W. Feather    | Director of Software Development  | Home email:
Tel: +44 181 371 1138 | Demon Internet Ltd.               | <clive@davros.org>
Fax: +44 181 371 1037 | <clive@demon.net>                 |
Written on my laptop; please observe the Reply-To address |