2nd of 2 forwarded messages from Bradley White: strtotm
The attached comes from Bradley White, bww+@cs.cmu.edu. Your comments will
be appreciated!

                                --ado

I often miss an at-least-pseudo-standard means of converting a reasonable
string date/time specification into a "struct tm", and, from the number of
requests on different newsgroups, it appears others do too. Perhaps adding
one to the "tz" package might be a reasonable thing to do?

If you concur, and are open to suggestions or want to open it up for
discussion, here are some random thoughts on the interface off the top of
my head. It's undoubtedly deficient.

Then again, perhaps you have considered all of this before, and decided
against it?

Brad

---------->8----------

static int
_strtotm(str, flags, np, tp)
char *str;
unsigned *flags;
struct tm *np;
struct tm *tp;
{
    /*
     * decode str into tp using (if necessary) flags and np;
     * undoubtedly requires yacc and/or lex to do a good job
     *
     * return 0 on success, non-zero (error codes?) on error
     *
     * The only flags I can currently think of are:
     *   IN:
     *      XX_DEFPAST      default to most recent past
     *                      matching time, rather than
     *                      most near future matching time,
     *                      for incomplete specifications
     *      XX_DEFDDMM      DD/MM/YY instead of MM/DD/YY
     *                      if ambiguous
     *      XX_DEFYYMM      YY/MM/DD instead of above
     *                      if ambiguous
     *   OUT:
     *      XX_DEFTIME      no time specified, used "now" time
     *      XX_DEFZONE      no zone specified, used "now" zone
     */
    return XX_EUNIMPL;
}

struct tm *
strtotm(str, flags, np, tp)
char *str;
unsigned *flags;
struct tm *np;  /* "now" for relative times */
struct tm *tp;  /* space for result, malloc if NULL */
{
    struct tm nowtm, tmptm;

    if (np == 0) {
        time_t now = time((time_t *) 0);
        np = localtime(&now);
    }
    nowtm = *np;
    if (_strtotm(str, flags, &nowtm, &tmptm) != 0)
        return 0;
#if 0
    /*
     * I'd like to do this to check/canonicalize the struct tm,
     * but it is really only appropriate if `tmptm' is in the
     * local timezone, right??? Perhaps timeoff()???
     */
    if (mktime(&tmptm) == -1)
        return 0;
#endif
    if (tp == 0 && (tp = malloc(sizeof *tp)) == 0)
        return 0;
    *tp = tmptm;
    return tp;
}
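On the question in the #if 0 block: mktime() does treat its argument as a
local time, so the check is only meaningful when the parsed value is local.
Under that assumption, a minimal sketch of "check/canonicalize" might look
like the following. The name xx_canonicalize() and its return convention are
made up for illustration and are not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: canonicalize *tp as a LOCAL time via mktime().
 * Returns 1 if mktime() changed any date/time field (the input was out
 * of range, or fell in a DST gap), 0 if the fields were already
 * canonical, and -1 if mktime() reports failure.  Note that (time_t) -1
 * is also a valid time one second before the epoch, so the failure
 * test is ambiguous for that single instant.
 */
int
xx_canonicalize(struct tm *tp)
{
    struct tm in;

    assert(tp != NULL);
    in = *tp;                   /* remember what the caller gave us */
    tp->tm_isdst = -1;          /* let mktime() decide about DST */
    if (mktime(tp) == (time_t) -1)
        return -1;
    return in.tm_year != tp->tm_year || in.tm_mon != tp->tm_mon ||
        in.tm_mday != tp->tm_mday || in.tm_hour != tp->tm_hour ||
        in.tm_min != tp->tm_min || in.tm_sec != tp->tm_sec;
}
```

Given "June 31" (tm_mon = 5, tm_mday = 31), mktime() rewrites the fields to
July 1 and the helper returns 1, so a caller could choose either to accept
the normalized value or to treat the change as a parse error. As a side
effect mktime() also fills in tm_wday and tm_yday.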
Thanks to Guy Harris and Paul Eggert for their comments, and again to Paul
Eggert for an earlier, private note.

There at least seems to be a consensus that adding a date/time parser to
ado's "tz" package would be a reasonable thing, although it may be difficult
to decide exactly what it should look like.

In this note I will try to summarize what has been said so far in the hope
that a reasonable specification may begin to take shape. The final goal
would be an implementation of sufficient quality and taste to merit
inclusion in the package.

* Prior Art

Perhaps a reasonable candidate already exists, or can be grown from existing
code. These routines (in alphabetical order) have been mentioned:

    func            who                     where
    -----------------------------------------------------
    dparsetime()    RAND/UCI                MH
    getabsdate()    Moraes                  C News
    getdate()       USL                     Sys V Rel 4
    getdate()       Bellovin/Salz/Berets    B News
    parsedate()     Hamey/Accetta           Mach
    partime()       Harrenstien/Eggert      RCS
    strptime()      Harris                  SunOS 4.1[.x]

The following evaluation criteria (assuming correctness) have been
mentioned:

    - interface (some are more useful than others)
    - date/time language
    - implementation methods
    - ease of internationalization
    - default/optional values
    - error checking
    - speed

* Interface

So far, it seems that we want to return a struct tm (as opposed to a time_t)
plus an indication of how much of the date/time string was used, leaving the
following minimal interface.

    in:     struct tm *     (to fill in [if NULL malloc or static?])
            char *          (date/time string to parse)

    out:    struct tm *     (result [NULL on error?])
            char *          (unconsumed part of string)

What else is needed? Perhaps we could list full declarations for each of the
above routines.

* The date/time language

I think we can all agree that we want to be able to parse strings like ...

    Mon Jun 22 15:26:09 EDT 1992
    Mon, 22 Jun 92 15:26:09 -0400 (EDT)
    06/22/1992 15:26:09 -0400
    1992-06-22T15:26:09

... and whatever similar strings are for different locales.
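As a rough illustration of the minimal interface applied to the last of the
example strings, here is a strtol()-style sketch: it fills in a
caller-supplied struct tm and reports the unconsumed part of the string
through an extra pointer argument. The name xx_parse_iso() and the choice of
an "endp" out-parameter are assumptions made for the example, not something
the discussion has settled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical sketch: parse a leading "YYYY-MM-DDTHH:MM:SS" from str
 * into *tp.  On success return tp and, if endp is non-NULL, store a
 * pointer to the unconsumed part of str in *endp.  On failure return
 * NULL and leave *tp untouched.  Only crude range checks are done, and
 * sscanf() is lenient (it accepts leading blanks and signs), so this
 * is an interface demonstration rather than a strict parser.
 */
struct tm *
xx_parse_iso(const char *str, struct tm *tp, const char **endp)
{
    int year, mon, mday, hour, min, sec;
    int used = 0;

    assert(str != NULL && tp != NULL);

    /* %n records how many characters were consumed so far. */
    if (sscanf(str, "%4d-%2d-%2dT%2d:%2d:%2d%n",
        &year, &mon, &mday, &hour, &min, &sec, &used) != 6)
        return NULL;
    if (mon < 1 || mon > 12 || mday < 1 || mday > 31 ||
        hour < 0 || hour > 23 || min < 0 || min > 59 ||
        sec < 0 || sec > 60)
        return NULL;

    memset(tp, 0, sizeof *tp);
    tp->tm_year = year - 1900;  /* struct tm counts years from 1900 */
    tp->tm_mon = mon - 1;       /* and months from 0 */
    tp->tm_mday = mday;
    tp->tm_hour = hour;
    tp->tm_min = min;
    tp->tm_sec = sec;
    tp->tm_isdst = -1;          /* unknown; tm_wday/tm_yday not computed */

    if (endp != NULL)
        *endp = str + used;
    return tp;
}
```

Called on "1992-06-22T15:26:09 -0400", this leaves the end pointer at
" -0400", so a caller could hand the remainder to a separate zone parser.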
However, some of the prior-art implementations can also handle strings
like ...

    now
    next Friday
    three days ago
    two years from today
    new year's eve, 1999
    half-past four the day after tomorrow
    8pm US/Pacific on the 1st Tuesday in November, 1996

... but this may seem like kitchen-sink-ism. What language do we want to
accept?

* Implementation Methods

A particular answer to the language question (e.g., the input language is
regular) may suggest a preferred implementation method (e.g., DFAs) and/or a
useful tool (e.g., lex). Paul points out that some kind of hack may be
necessary to allow any yacc-generated parsers to co-exist. On the other
hand, lex, yacc, and other formal methods are usually easy to extend, and
provide a good level of correctness-confidence. Should we limit ourselves to
regular, LL(1), or LALR(1) languages?

* Internationalization

It is clear that there needs to be support for "internationalization" (like
$LANG, $LC_TIME, ..., ???). Depending upon the implementation method and
date/time language, this may be more or less difficult. Indeed, the presence
of features in the target system, like dynamic loading, may change the best
answer. What's the favoured approach?

* Default values

How do you interpret relative times? How do you specify default values for
optional fields? How do you know where default values were used? Default
timezones probably need to be specified with something like "localtime",
"US/Eastern", or another zoneinfo name, so that standard and daylight
offsets can be given at the correct times of year. How do you indicate this?

* Out-of-range values

Should out-of-range numeric values result in a parse error? If so, is
"out-of-range" smart (e.g., knows about month lengths, leap years, leap
seconds, correct day-of-week, ....)?

Your comments are appreciated. Is this even worth pursuing?

Brad
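On the out-of-range question, one possible reading of "smart" is sketched
below: reject (rather than normalize) a struct tm whose day does not exist
in its month, using the Gregorian leap-year rule, and compute the weekday so
that a parsed "Mon" can be cross-checked. All xx_ names are hypothetical,
the weekday routine uses a commonly seen table-based formula, and a tm_sec
of 60 is allowed unconditionally because knowing when a leap second really
occurred needs external data.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Gregorian leap-year rule; year is the full year, e.g. 1992. */
int
xx_is_leap(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Number of days in month mon (0 = January) of the given full year. */
int
xx_days_in_month(int year, int mon)
{
    static const int days[12] =
        { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

    assert(mon >= 0 && mon < 12);
    return days[mon] + (mon == 1 && xx_is_leap(year));
}

/* Day of week (0 = Sunday) for a Gregorian date; mon is 0-based. */
int
xx_weekday(int year, int mon, int mday)
{
    static const int t[12] = { 0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4 };

    assert(year > 0 && mon >= 0 && mon < 12);
    if (mon < 2)                /* count Jan/Feb with the previous year */
        year--;
    return (year + year / 4 - year / 100 + year / 400 + t[mon] + mday) % 7;
}

/* Return 1 if every date/time field of *tp is in range, else 0. */
int
xx_tm_in_range(const struct tm *tp)
{
    int year;

    assert(tp != NULL);
    year = tp->tm_year + 1900;
    if (tp->tm_mon < 0 || tp->tm_mon > 11)
        return 0;
    if (tp->tm_mday < 1 || tp->tm_mday > xx_days_in_month(year, tp->tm_mon))
        return 0;
    if (tp->tm_hour < 0 || tp->tm_hour > 23)
        return 0;
    if (tp->tm_min < 0 || tp->tm_min > 59)
        return 0;
    if (tp->tm_sec < 0 || tp->tm_sec > 60)  /* 60 permits a leap second */
        return 0;
    return 1;
}
```

A parser could call xx_tm_in_range() before returning, and compare
xx_weekday() against any day name that appeared in the input. Whether a
mismatch should be an error or merely ignored is the open question.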