Thread-safe localtime(3) (was Re: Reading binary files)
Am I missing something fundamental, or would it be reasonable to not necessarily expose the data structure for the parsed conversion data, but instead offer an explicit parse routine (returning an opaque blob) and a convert routine that takes such a blob as a parameter? Then it should be easy to make tzset(3) utterly trivial -- call the zoneinfo parser on getenv("TZ") with locking -- and localtime(3) likewise trivial: call the real convert routine with the data structure cached by the latest tzset.
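A minimal sketch of the split described above, under loud assumptions: every name here (toy_tz, toy_tz_parse, toy_tz_convert, toy_tzset, toy_localtime_r, and the TOY_TZ environment variable) is hypothetical and is not part of tzcode or any libc. The "parser" only understands a fixed offset such as "+05:30"; a real one would load a zoneinfo file. The point is only the shape: callers hold an opaque pointer, and the tzset/localtime pair becomes a thin locked wrapper around the parse and convert routines.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical opaque handle: callers only ever hold a pointer to it. */
typedef struct toy_tz toy_tz;

struct toy_tz {
    long utc_offset_seconds;   /* toy state: one fixed offset east of UTC */
};

/* Toy stand-in for a zoneinfo parser: accepts "+HH:MM" or "-HH:MM".
   Returns NULL on a malformed spec or allocation failure. */
toy_tz *toy_tz_parse(const char *spec)
{
    char sign;
    int hours, minutes;

    if (spec == NULL || sscanf(spec, "%c%2d:%2d", &sign, &hours, &minutes) != 3)
        return NULL;
    if ((sign != '+' && sign != '-') || hours < 0 || hours > 23 ||
        minutes < 0 || minutes > 59)
        return NULL;

    toy_tz *tz = malloc(sizeof *tz);
    if (tz == NULL)
        return NULL;
    long offset = hours * 3600L + minutes * 60L;
    tz->utc_offset_seconds = (sign == '-') ? -offset : offset;
    return tz;
}

/* The "real convert routine": takes the parsed state as an explicit argument. */
struct tm *toy_tz_convert(const toy_tz *tz, const time_t *t, struct tm *out)
{
    assert(tz != NULL && t != NULL && out != NULL);
    time_t shifted = *t + tz->utc_offset_seconds;
    return gmtime_r(&shifted, out);
}

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static toy_tz *toy_current;    /* state cached by the latest toy_tzset() */

/* tzset analogue: parse the environment variable, then swap the cache under the lock. */
void toy_tzset(void)
{
    toy_tz *fresh = toy_tz_parse(getenv("TOY_TZ"));

    pthread_mutex_lock(&toy_lock);
    toy_tz *old = toy_current;
    toy_current = fresh;
    pthread_mutex_unlock(&toy_lock);
    free(old);
}

/* localtime analogue: a thin wrapper over the convert routine.
   Falls back to UTC when nothing valid has been cached. */
struct tm *toy_localtime_r(const time_t *t, struct tm *out)
{
    static const toy_tz utc = { 0 };

    pthread_mutex_lock(&toy_lock);
    struct tm *result = toy_tz_convert(toy_current ? toy_current : &utc, t, out);
    pthread_mutex_unlock(&toy_lock);
    return result;
}
```

Because the convert routine never touches global state, one thread can hold several parsed zones at once and convert against each of them; only the compatibility wrappers need the lock.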
That's actually what I'm working on right now. I'm creating an open source C++ library that will process timezones and provide this functionality.
From: bent@latency.net [mailto:bent@latency.net] On Behalf Of Bennett Todd
Sent: Wednesday, November 02, 2011 12:02 PM
To: Thom Hehl
Cc: tz@iana.org
Subject: Thread-safe localtime(3) (was Re: [tz] Reading binary files)
Am I missing something fundamental, or would it be reasonable to not necessarily expose the data structure for the parsed conversion data, but instead offer an explicit parse routine (returning an opaque blob) and a convert routine that takes such a blob as a parameter? Then it should be easy to make tzset(3) utterly trivial -- call the zoneinfo parser on getenv("TZ") with locking -- and localtime(3) likewise trivial: call the real convert routine with the data structure cached by the latest tzset.
It'd surely be nice if this reorg could be propagated to tzcode, in C, ultimately to wend its way into libc everywhere; forking or independently reimplementing tzcode carries a maintenance cost, as the tzdata format isn't quite completely set in stone yet, and as sufficiently clever legislators cook up sufficiently brilliant schemes to "save daylight", it's possible tzdata format and tzcode may have to co-evolve again. Any folks on this list have opinions about how the tzset/localtime api should best evolve?
On Nov 2, 2011, at 10:12 AM, Bennett Todd wrote:
It'd surely be nice if this reorg could be propagated to tzcode, in C, ultimately to wend its way into libc everywhere; forking or independently reimplementing tzcode carries a maintenance cost, as the tzdata format isn't quite completely set in stone yet, and as sufficiently clever legislators cook up sufficiently brilliant schemes to "save daylight", it's possible tzdata format and tzcode may have to co-evolve again.
Any folks on this list have opinions about how the tzset/localtime api should best evolve?
I think that, all other things being equal, if it's already evolved on at least one UN*X, it should evolve in the same direction, as tzcode is used as the basis of many libc implementations (*BSD, Mac OS X, maybe Solaris if they're keeping up with changes). I.e., I'd go with Christos Zoulas' APIs unless either 1) some other UN*X has a different API, at which point the multiple APIs should be investigated or 2) there's a significant problem with them.
From a quick look at the NetBSD APIs, we have:
timezone_t tzalloc(const char *name);
    takes a time zone name as an argument, and returns NULL on failure to load that zone and a timezone_t (an opaque handle, implemented as a pointer to an opaque structure) on success

void tzfree(const timezone_t sp);
    releases the result of tzalloc()

struct tm *localtime_rz(const timezone_t sp, const time_t * __restrict timep, struct tm *tmp);
    given a timezone_t and a time_t, fills in a struct tm - presumably returns the struct tm * passed to it

time_t mktime_z(const timezone_t sp, struct tm *tmp);
    given a timezone_t and a struct tm, returns a time_t

size_t strftime_z(const timezone_t sp, char * const s, const size_t maxsize, const char * const format, const struct tm * const t);
    given a timezone_t, a struct tm, and a format, fills in a string with the time formatted according to the string

Presumably the timezone_t is needed for the time zone name, as

1) even if NetBSD might have tm_gmtoff and tm_zone fields in struct tm, not all systems will necessarily have it

and

2) even if it does, the problem of "what happens if tm_zone points to something in the timezone state and that gets freed before a reference is made to the struct tm?" remains.

I think having APIs that explicitly take a loaded time zone as an argument is something we should do, as thread safety isn't the only issue; there are reasons to have more than one time zone used for time conversions within a given single thread of control.
That sounds all superb, and excellent. The one question I'd ask is, would it be practical and reasonable to lose the _z and _rz suffixes, and make the timezone arg an optional extra arg to localtime/mktime, with varargs? It seems saddening to carry the original API forward unchanged at the expense of blessing into an official API standard what look to a naive eye like placeholder function names.
Or, if going varargs is tasteless, maybe rename localtime_rz to tzconvert, and implement localtime as a trivial wrapper?
On Nov 2, 2011, at 12:49 PM, Bennett Todd wrote:
Or, if going varargs is tasteless, maybe rename localtime_rz to tzconvert, and implement localtime as a trivial wrapper?
It'd be implemented as a wrapper in any case. There's already precedent for the _[letters] suffixes, as per my previous mail. There are a lot of API changes I'd make if I had a time machine, including but not limited to adding a UNIX create()-with-an-e call with three arguments as an alternative to the three-argument open, and giving UNIX a mktime() with a timezone argument since Day One, but....
<<On Wed, 2 Nov 2011 15:34:49 -0400, Bennett Todd <bet@rahul.net> said:
That sounds all superb, and excellent. The one question I'd ask is, would it be practical and reasonable to lose the _z and _rz suffixes, and make the timezone arg an optional extra arg to localtime/mktime, with varargs?
The C standard does not allow this. There are no "optional extra args" in C. (A variadic function must have enough information from mandatory arguments to determine exactly how many arguments there are.) Besides which, the prototype of the functions from the C standard is fixed by the standard. -GAWollman
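To illustrate the point about variadic functions, here is a small sketch; sum_ints is a hypothetical helper, not anything proposed in this thread. The callee has no portable way to ask "was another argument passed?", so the mandatory parameter has to carry that information. The same constraint is why open() only reads its mode argument when the flags say one is present, and why a localtime(when) versus localtime(when, tz) pair could not be told apart.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums `count` int arguments.
   The mandatory `count` parameter is the only way the callee learns
   how many variadic arguments follow; reading more than were passed
   is undefined behavior. */
long sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list ap;
    long total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* the caller promised `count` ints */
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) works only because the first argument tells the function to read three more.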
On Nov 2, 2011, at 12:34 PM, Bennett Todd wrote:
That sounds all superb, and excellent. The one question I'd ask is, would it be practical and reasonable to lose the _z and _rz suffixes, and make the timezone arg an optional extra arg to localtime/mktime, with varargs?
How would you do that without breaking any existing code on any implementation? (That means "without relying on any implementation quirks", so you can't assume arguments are passed on the stack or in registers or....) How would localtime distinguish tm = localtime(when); from tm = localtime(when, tz); in a fashion that works with all C compilers on all platforms?
It seems saddening to carry the original api forward unchanged at the expense of blessing into official api standard what look to a naive eye like placeholder function names.
That's already happened - localtime() is not thread safe, so there's localtime_r(), which takes a time_t and a pointer to a struct tm as arguments, and strftime() and strptime() are also not thread-safe if you need to make different formatting or parsing calls in different locales, so there's strftime_l() and strptime_l(), which take a locale_t in addition to the regular strftime() or strptime() arguments. These are in the current Single UNIX standard (as are other _r and _l functions).
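For reference, a short sketch of the _r pattern mentioned above, using only the POSIX localtime_r() and the standard strftime(); the wrapper name format_local is an illustrative assumption. The caller supplies the struct tm, so no hidden static buffer is shared between threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical wrapper: formats `when` in the process's local time zone.
   Returns the number of characters written (excluding the terminating
   NUL), or 0 if the conversion fails or the buffer is too small. */
size_t format_local(time_t when, char *buf, size_t buflen)
{
    struct tm tm;   /* caller-owned result, unlike localtime()'s static buffer */

    assert(buf != NULL);
    if (localtime_r(&when, &tm) == NULL)
        return 0;
    return strftime(buf, buflen, "%Y-%m-%d %H:%M:%S", &tm);
}
```

Note that this is still tied to the single process-wide time zone; the _z functions discussed in this thread address that by taking the zone as an argument as well.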
Thanks for the additional info. I hadn't realized that random _x suffixes had been blessed into standards; I'd thought they were platform-specific kluges. That said, maybe it's time to add not only tzalloc and tzfree, but also tzconvert, then implement localtime_? and kin, as well as tzset and localtime, in terms of primitives that look better, particularly in documentation?
On Wed, 02 Nov 2011, Bennett Todd wrote:
That sounds all superb, and excellent. The one question I'd ask is, would it be practical and reasonable to lose the _z and _rz suffixes, and make the timezone arg an optional extra arg to localtime/mktime, with varargs?
There's a long history of using "_r" as a suffix for "re-entrant" or "thread safe" versions of functions. Using "_z" for functions that take an explicit timezone argument is new, but seems to fit the established pattern. varargs functions need a way of knowing whether or not more arguments are present. How would you do that?
--apb (Alan Barrett)
On 02.11.2011 21:20, Alan Barrett wrote:
On Wed, 02 Nov 2011, Bennett Todd wrote:
That sounds all superb, and excellent. The one question I'd ask is, would it be practical and reasonable to lose the _z and _rz suffixes, and make the timezone arg an optional extra arg to localtime/mktime, with varargs?
There's a long history of using "_r" as a suffix for "re-entrant" or "thread safe" versions of functions. Using "_z" for functions that take an explicit timezone argument is new, but seems to fit the established pattern.
varargs functions need a way of knowing whether or not more arguments are present. How would you do that?
--apb (Alan Barrett)
glibc already has localtime_r(), and the other functions are covered as well. Nevertheless, it would be nice to have a reference implementation here, too. Since POSIX has defined an API (http://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime.html), the party is over. What is still needed are localtime zones.
re, wh
On Nov 2, 12:08pm, guy@alum.mit.edu (Guy Harris) wrote:
-- Subject: Re: [tz] Thread-safe localtime(3) (was Re: Reading binary files)

| From a quick look at the NetBSD APIs, we have:
|
| timezone_t tzalloc(const char *name);
| takes a time zone name as an argument, and returns NULL on failure to load that zone and a timezone_t (an opaque handle, implemented as a pointer to an opaque structure) on success
|
| void tzfree(const timezone_t sp);
| releases the result of tzalloc()
|
| struct tm *localtime_rz(const timezone_t sp, const time_t * __restrict timep, struct tm *tmp);
| given a timezone_t and a time_t, fills in a struct tm - presumably returns the struct tm * passed to it
|
| time_t mktime_z(const timezone_t sp, struct tm *tmp);
| given a timezone_t and a struct tm, returns a time_t
|
| size_t strftime_z(const timezone_t sp, char * const s, const size_t maxsize, const char * const format, const struct tm * const t);
| given a timezone_t, a struct tm, and a format, fills in a string with the time formatted according to the string
|
| Presumably the timezone_t is needed for the time zone name, as

No, it is the loaded tzstate.

| 1) even if NetBSD might have tm_gmtoff and tm_zone fields in struct tm, not all systems will necessarily have it
|
| and
|
| 2) even if it does, the problem of "what happens if tm_zone points to something in the timezone state and that gets freed before a reference is made to the struct tm?" remains.

You document that you cannot free the timezone_t if you want to keep using the reference.

| I think having APIs that explicitly take a loaded time zone as an argument is something we should do, as thread safety isn't the only issue; there are reasons to have more than one time zone used for time conversions within a given single thread of control.

That is what has been done; why do you think otherwise?

christos
On Nov 2, 2011, at 3:49 PM, Christos Zoulas wrote:
On Nov 2, 12:08pm, guy@alum.mit.edu (Guy Harris) wrote:
| Presumably the timezone_t is needed for the time zone name, as
No, it is the loaded tzstate.
The timezone_t is a pointer to the loaded tzstate, is it not? strftime() only needs time zone information for the time zone name and offset, unless I'm missing something.
| I think having APIs that explicitly take a loaded time zone as an argument is something we should do, as thread safety isn't the only issue; there are reasons to have more than one time zone used for time conversions within a given single thread of control.
That is what has been done;
It's what has been done in NetBSD; it's not what has been done, yet, in the tzcode.
why do you think otherwise?
Because *nothing* has been done yet in the tzcode, which is what we're discussing. I'm trying to indicate that "make the tzstate per-thread data" may solve the thread-safety problem but won't solve other problems, so I'd vote for APIs that take a loaded time zone as an argument, such as the NetBSD ones, rather than just making the tzstate per-thread data.
On Wednesday 02 November 2011 13:12:01 Bennett Todd wrote:
It'd surely be nice if this reorg could be propagated to tzcode, in C, ultimately to wend its way into libc everywhere; forking or independently reimplementing tzcode carries a maintenance cost, as the tzdata format isn't quite completely set in stone yet, and as sufficiently clever legislators cook up sufficiently brilliant schemes to "save daylight", it's possible tzdata format and tzcode may have to co-evolve again.
Any folks on this list have opinions about how the tzset/localtime api should best evolve?
POSIX already covers things like localtime(). if people want to extend the time API that is part of the C library, then that discussion should happen on the POSIX mailing lists.
http://www.opengroup.org/austin/lists.html
http://pubs.opengroup.org/onlinepubs/9699919799/functions/localtime.html
-mike
<<On Wed, 2 Nov 2011 16:45:00 -0400, Mike Frysinger <vapier@gentoo.org> said:
POSIX already covers things like localtime(). if people want to extend the time API that is part of the C library, then that discussion should happen on the POSIX mailing lists.
Actually, it's generally the preference of the Austin Group to standardize existing practice, so the right thing to do is to add these extensions to tzcode, and then allow the downstream consumers to add them to their C libraries as they see fit. Then a proposal can be written to include them in POSIX (or in the next revision of C, which is a different standards committee, since they are actually C standard library functions). I would very much like to see this happen. The only quibble I would make with Christos's proposed API is that, because C does not have incomplete typedefs, a new foo_t type should not be introduced. Using such typedefs creates undesirable ordering dependencies between headers (not a problem for the Implementation but definitely a problem for applications if they want to minimize namespace pollution). Using an incomplete structure type has all the benefits of a typedef and none of the drawbacks. If the proposal is adopted by either C or POSIX, they will ignore the drawbacks and create a new _t type anyway, but since they have joint custody of the "_t" namespace, they can at least guarantee that it's defined in an official system header as well. -GAWollman
On 02/11/11 21:58, Garrett Wollman wrote:
I would very much like to see this happen. The only quibble I would make with Christos's proposed API is that, because C does not have incomplete typedefs, a new foo_t type should not be introduced. Using such typedefs creates undesirable ordering dependencies between headers (not a problem for the Implementation but definitely a problem for applications if they want to minimize namespace pollution). Using an incomplete structure type has all the benefits of a typedef and none of the drawbacks. If the proposal is adopted by either C or POSIX, they will ignore the drawbacks and create a new _t type anyway, but since they have joint custody of the "_t" namespace, they can at least guarantee that it's defined in an official system header as well.
Isn't this an incomplete typedef?
    typedef struct _timezone_opaque_t *timezone_t;
It's a typedef for an incomplete structure, I know, but there's precedent for this kind of thing.
jch
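A compilable sketch of the pattern under discussion, with hypothetical names (toy_zone, toy_zone_t, and the three functions are illustrative only). The "header" part declares an incomplete struct and a typedef for a pointer to it; the "implementation" part completes the struct, so only the implementation can look inside.

```c
#include <assert.h>
#include <stdlib.h>

/* "Header" part: the struct is declared but not defined, so callers can
   pass the handle around without seeing (or depending on) its layout. */
struct toy_zone;
typedef struct toy_zone *toy_zone_t;

toy_zone_t toy_zone_new(int offset_minutes);
int toy_zone_offset(toy_zone_t z);
void toy_zone_free(toy_zone_t z);

/* "Implementation" part: the struct is completed here. */
struct toy_zone {
    int offset_minutes;
};

toy_zone_t toy_zone_new(int offset_minutes)
{
    toy_zone_t z = malloc(sizeof *z);
    if (z != NULL)
        z->offset_minutes = offset_minutes;
    return z;
}

int toy_zone_offset(toy_zone_t z)
{
    assert(z != NULL);
    return z->offset_minutes;
}

void toy_zone_free(toy_zone_t z)
{
    free(z);
}
```

The alternative raised earlier in the thread would declare the same functions with `struct toy_zone *` directly. Any header can repeat the bare `struct toy_zone;` declaration without coordinating with the others, which is one reading of the header-ordering concern about introducing a new typedef name.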
Garrett Wollman said:
I would very much like to see this happen. The only quibble I would make with Christos's proposed API is that, because C does not have incomplete typedefs,
Huh? Of course C has incomplete typedefs.
    typedef struct fred jim;
    struct fred { int sheila; };
is perfectly legal code.
Using such typedefs creates undesirable ordering dependencies between headers (not a problem for the Implementation but definitely a problem for applications if they want to minimize namespace pollution).
There are various ways to fix this. Note that C already requires that all standard headers can be included in any order even when there are dependencies between them. (I haven't seen the proposal, so I may be misunderstanding what you're getting at. If so, please explain.)
--
Clive D.W. Feather          | If you lie to the compiler,
Email: clive@davros.org     | it will get its revenge.
Web: http://www.davros.org  |   - Henry Spencer
Mobile: +44 7973 377646
participants (10)
- Alan Barrett
- Bennett Todd
- christos@zoulas.com
- Clive D.W. Feather
- Garrett Wollman
- Guy Harris
- John Haxby
- Mike Frysinger
- Thom Hehl
- walter harms