Extension to tzcode to support additional timezones
Several of my employers have needed to work with data from multiple timezones within a single process. The traditional way to work with this has been to change the value of the TZ environment variable and invoke tzset() and switch between timezones one at a time. What I have done is extend the time(3) API to allow applications to load and use arbitrary timezones separate from the current local and GMT timezones. I've done this by adding the following API calls:

- void *tzopen(const char *name). This loads the rules for a specified timezone and returns a void * cookie. If the zone cannot be parsed it returns NULL and sets errno to EINVAL.

- void tzclose(void *zone). This "closes" a set of rules opened via tzopen() by releasing the associated resources.

- struct tm *tztime(void *zone, const time_t *t, struct tm *tm). This is like localtime_r() except that it uses the timezone rules from the passed in cookie instead of the local timezone indicated by TZ.

- time_t timetz(void *zone, struct tm * const tm). This is like mktime() except that it uses the timezone rules from the passed in cookie instead of the local timezone indicated by TZ.

The changes to the code are fairly simple since the per-zone state was already broken out into a standalone state object to handle GMT vs local time. In some cases the patches make things cleaner (I think) by removing hacks where a function pointer was queried to determine which state object to use since the state object pointer is now passed all the way down the internal time code stack.

I have a patch that is relative to the FreeBSD source tree, but all of the real code changes are in localtime.c which I believe should apply directly to the tzcode distribution. The patch is available at http://www.FreeBSD.org/~jhb/patches/tzopen.patch

Is this API something that you folks would be interested in? I currently have this as a private patch locally, but if possible I would like it adopted upstream.

--
John Baldwin
I'm forwarding this message from John Baldwin, who is not on the time zone mailing list. Those of you who are on the list, please direct replies appropriately.

--ado

________________________________________
From: John Baldwin [jhb@freebsd.org]
Sent: Tuesday, October 26, 2010 11:20 AM
To: tz@lecserver.nci.nih.gov
Subject: Extension to tzcode to support additional timezones
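For comparison, the traditional TZ-switching approach that the proposal wants to avoid can be sketched roughly as follows. This is only an illustration: format_in_zone() is a hypothetical helper, not part of tzcode or of the patch, and it is not thread-safe because TZ and the tzset() state are process-wide.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: format t as "YYYY-MM-DD HH:MM:SS ABBR" in the
 * given zone by temporarily switching TZ, then restoring the old value.
 * Returns 0 on success, -1 on failure.
 */
static int
format_in_zone(const char *zone, time_t t, char *buf, size_t len)
{
    const char *old = getenv("TZ");
    char *saved = NULL;
    struct tm tm;
    int ok = 0;

    /* Copy the old value: setenv() may invalidate the getenv() pointer. */
    if (old != NULL && (saved = strdup(old)) == NULL)
        return -1;
    if (setenv("TZ", zone, 1) != 0) {
        free(saved);
        return -1;
    }
    tzset();

    /* Format (including %Z) while the zone is still the active one. */
    if (localtime_r(&t, &tm) != NULL &&
        strftime(buf, len, "%Y-%m-%d %H:%M:%S %Z", &tm) != 0)
        ok = 1;

    /* Restore the caller's zone. */
    if (saved != NULL)
        setenv("TZ", saved, 1);
    else
        unsetenv("TZ");
    tzset();
    free(saved);
    return ok ? 0 : -1;
}
```

A call such as format_in_zone("EST5", t, buf, sizeof buf) uses a POSIX-style TZ string, so the sketch does not depend on which tzdata files are installed. Every call pays for a tzset() in each direction, and no other thread may touch the time functions meanwhile, which is the cost a per-zone handle is meant to remove.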
On Oct 26, 2010, at 9:03 AM, Olson, Arthur David (NIH/NCI) [E] wrote:
> Several of my employers have needed to work with data from multiple timezones within a single process. The traditional way to work with this has been to change the value of the TZ environment variable and invoke tzset() and switch between timezones one at a time. What I have done is extend the time(3) API to allow applications to load and use arbitrary timezones separate from the current local and GMT timezones. I've done this by adding the following API calls:
>
> - void *tzopen(const char *name). This loads the rules for a specified timezone and returns a void * cookie. If the zone cannot be parsed it returns NULL and sets errno to EINVAL.
>
> - void tzclose(void *zone). This "closes" a set of rules opened via tzopen() by releasing the associated resources.
>
> - struct tm *tztime(void *zone, const time_t *t, struct tm *tm). This is like localtime_r() except that it uses the timezone rules from the passed in cookie instead of the local timezone indicated by TZ.

Note that, on several OSes - including, as I remember, FreeBSD - "struct tm" includes a "tm_zone" field, which points to the timezone abbreviation for the time in question. If a program calls tzopen(), tztime(), and then tzclose(), is the "tm_zone" field in the "struct tm" filled in by tztime() still valid?

> I have a patch that is relative to the FreeBSD source tree, but all of the real code changes are in localtime.c which I believe should apply directly to the tzcode distribution. The patch is available at http://www.FreeBSD.org/~jhb/patches/tzopen.patch

"404 - Not Found" isn't much of a patch. :-)

> Is this API something that you folks would be interested in? I currently have this as a private patch locally, but if possible I would like it adopted upstream.

People have been suggesting this sort of thing on several occasions, so we'd be interested in an API of this sort. (In the past, I'd proposed support for this, with a patch, and somebody pointed out the tm_zone issue.)
On Oct 26, 9:58am, guy@alum.mit.edu (Guy Harris) wrote:
-- Subject: Re: Extension to tzcode to support additional timezones

| > - void *tzopen(const char *name). This loads the rules for a specified
| > timezone and returns a void * cookie. If the zone cannot be parsed it returns
| > NULL and sets errno to EINVAL.

Why return void *, when you can return an opaque type that can be typechecked?

| Note that, on several OSes - including, as I remember, FreeBSD
| - "struct tm" includes a "tm_zone" field, which points to the
| timezone abbreviation for the time in question.

The field is an OS extension, and for the OSes that really want to support it we can use a pool of immutable strings to implement it. I can provide sample code for that.

| People have been suggesting this sort of thing on several occasions,
| so we'd be interested in an API of this sort. (In the past, I'd
| proposed support for this, with a patch, and somebody pointed out
| the tm_zone issue.)

Well, I proposed the same change and I provided a patch. I have not heard any feedback on whether people like the patch (the names or the arguments or the way the patch was done) or not.

I have been curious since the early nineties why this has not been done already (since I needed this 20 years ago in a multi-timezone trading system I was writing at the time).

I would like to make progress on this. If people like the patch, I can implement the zone name pooling code and fix the manual page. I don't want to spend the time to improve on this if people think that I am doing things the wrong way or this will never be accepted.

christos
On Tuesday, October 26, 2010 1:24:46 pm Christos Zoulas wrote:
> On Oct 26, 9:58am, guy@alum.mit.edu (Guy Harris) wrote:
> -- Subject: Re: Extension to tzcode to support additional timezones
>
> | > - void *tzopen(const char *name). This loads the rules for a specified
> | > timezone and returns a void * cookie. If the zone cannot be parsed it returns
> | > NULL and sets errno to EINVAL.
>
> Why return void *, when you can return an opaque type that can be typechecked?

An opaque type would be an improvement.

> | People have been suggesting this sort of thing on several occasions,
> | so we'd be interested in an API of this sort. (In the past, I'd
> | proposed support for this, with a patch, and somebody pointed out
> | the tm_zone issue.)
>
> Well, I proposed the same change and I provided a patch. I have not heard any feedback on whether people like the patch (the names or the arguments or the way the patch was done) or not.
>
> I have been curious since the early nineties why this has not been done already (since I needed this 20 years ago in a multi-timezone trading system I was writing at the time).
>
> I would like to make progress on this. If people like the patch, I can implement the zone name pooling code and fix the manual page. I don't want to spend the time to improve on this if people think that I am doing things the wrong way or this will never be accepted.

I am not tied to my patch, I would just like the functionality in some form.

--
John Baldwin
Having an extension like this would be nice, but while we're making the interface reentrant, we should also pass in the locale as a parameter, for functions like strftime that need the locale to do their work. Many operating systems already have strftime_l and we should extend that.

Also, it would be better to use names that build on existing conventions, rather than inventing new names. How about if we use a z suffix for the new functions that have a time zone parameter? That would build on the existing tradition of using _r and _l for similar extensions. Something like this:

strftime_lz (for the strftime_l variant that has a time zone parameter)
localtime_rz (for the localtime_r variant that has a time zone parameter)
mktime_z (for the mktime variant that has a time zone parameter)

I agree that it'd be better to have a new opaque type, struct tz * (say), rather than void *.
On Oct 26, 10:33am, eggert@cs.ucla.edu (Paul Eggert) wrote:
-- Subject: Re: Extension to tzcode to support additional timezones

| Having an extension like this would be nice, but while we're
| making the interface reentrant, we should also pass in the
| locale as a parameter, for functions like strftime
| that need the locale to do their work. Many operating
| systems already have strftime_l and we should extend that.
|
| Also, it would be better to use names that build
| on existing conventions, rather than inventing new names.
| How about if we use a z suffix for the new functions that
| have a time zone parameter? That would build on the existing
| tradition of using _r and _l for similar extensions. Something
| like this:
|
| strftime_lz (for the strftime_l variant that has a time zone parameter)
| localtime_rz (for the localtime_r variant that has a time zone parameter)
| mktime_z (for the mktime variant that has a time zone parameter)

All these sound reasonable, and I will add them.

| I agree that it'd be better to have a new opaque type,
| struct tz * (say), rather than void *.

I just have:

    typedef struct __state *timezone_t;

christos
On Oct 26, 1:55pm, wollman@csail.mit.edu (Garrett Wollman) wrote:
-- Subject: Re: Extension to tzcode to support additional timezones

| <<On Tue, 26 Oct 2010 13:44:20 -0400, christos@zoulas.com (Christos Zoulas) said:
|
| > I just have:
|
| > typedef struct __state *timezone_t;
|
| Please don't. C does not have opaque typedefs, only opaque
| structures.

I understand this, and typedef'ing things is problematic because you cannot typedef things multiple times, so you end up having to include a particular header that provides the typedef to bring it in scope instead of just having to do a forward struct declaration. But is it really better in this case? I can easily change the code to 'struct tz' or 'struct timezone' if people want.

Oh, and I've implemented the string pooling code already for tm_zone:

    /*
     * Simple array based string pool.
     */
    #include <stdlib.h>
    #include <string.h>

    static char **pool;
    static size_t poollen;
    static size_t poolmax;

    char *tzstrpool(const char *);

    /*
     * This code is very simple because we don't expect to have more than
     * a handful of zones active and it should not become performance critical.
     */
    char *
    tzstrpool(const char *str)
    {
        size_t i;

        for (i = 0; i < poollen; i++)
            if (strcmp(str, pool[i]) == 0)
                return pool[i];

        if (poollen == poolmax) {
            char **npool;
            size_t npoolmax = poolmax + 20;

            /* Update poolmax only after realloc() has succeeded. */
            npool = realloc(pool, sizeof(*npool) * npoolmax);
            if (npool == NULL)
                return NULL;
            pool = npool;
            poolmax = npoolmax;
        }
        pool[poollen] = strdup(str);
        if (pool[poollen] == NULL)
            return NULL;
        return pool[poollen++];
    }
On Oct 26, 12:13pm, eggert@cs.ucla.edu (Paul Eggert) wrote:
-- Subject: Re: Extension to tzcode to support additional timezones

| On 10/26/10 12:08, Christos Zoulas wrote:
| > Oh, and I've implemented the string pooling code already for tm_zone:
|
| If we do that sort of thing, we need to take care that it's
| thread-safe.

good point.

christos
On Oct 26, 3:18pm, christos@zoulas.com (Christos Zoulas) wrote:
-- Subject: Re: Extension to tzcode to support additional timezones

| On Oct 26, 12:13pm, eggert@cs.ucla.edu (Paul Eggert) wrote:
| -- Subject: Re: Extension to tzcode to support additional timezones
|
| | On 10/26/10 12:08, Christos Zoulas wrote:
| | > Oh, and I've implemented the string pooling code already for tm_zone:
| |
| | If we do that sort of thing, we need to take care that it's
| | thread-safe.
|
| good point.

So, the simplest way to do this is with a pthread mutex (IMHO). Other alternatives are:

- use pthread once/pthread local (worse than using pthread mutex)
- use c99 __thread (does not work everywhere)
- use lockless structures (complex and possibly slow)
- create an mmapped file with the universe of all possible zone names using zic (again, complicated, os dependent etc.) and then mmap and point to that for zone names.

BTW, tm->tm_zone is broken right now since it just does:

    tmp->TM_ZONE = &sp->chars[ttisp->tt_abbrind];

and if I call tzset() with a different zone and try to look up tmp->TM_ZONE, this will possibly point to junk.

Any other ideas or preferences?

christos
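The pthread mutex option could look roughly like the sketch below. It is an assumption about how the pool might be locked, not code from any posted patch; the name tzstrpool_locked() is made up for illustration. Pooled strings are never freed, so a returned pointer stays valid after the lock is dropped.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static char **pool;
static size_t poollen;
static size_t poolmax;

/*
 * Hypothetical thread-safe variant of the string pool: returns a pointer
 * to an interned copy of str, or NULL if memory runs out.
 */
const char *
tzstrpool_locked(const char *str)
{
    char *result = NULL;
    size_t i;

    pthread_mutex_lock(&pool_lock);

    /* Linear search is fine for a handful of abbreviations. */
    for (i = 0; i < poollen; i++) {
        if (strcmp(str, pool[i]) == 0) {
            result = pool[i];
            goto out;
        }
    }

    /* Grow the pointer array; the strings themselves never move. */
    if (poollen == poolmax) {
        size_t npoolmax = poolmax + 20;
        char **npool = realloc(pool, sizeof(*npool) * npoolmax);

        if (npool == NULL)
            goto out;
        pool = npool;
        poolmax = npoolmax;
    }

    result = strdup(str);
    if (result != NULL)
        pool[poollen++] = result;
out:
    pthread_mutex_unlock(&pool_lock);
    return result;
}
```

One global lock serializes every lookup, which seems acceptable if, as stated above, only a handful of zones are expected to be active.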
On Tuesday, October 26 2010, "Christos Zoulas" wrote to "tz@lecserver.nci.nih.gov, tz@lecserver.nci.nih.gov" saying:
> BTW, tm->tm_zone is broken right now since it just does:
>
>     tmp->TM_ZONE = &sp->chars[ttisp->tt_abbrind];
>
> and if I call tzset() with a different zone and try to lookup tmp->TM_ZONE this will possibly point to junk.

This is why the POSIX specification of strftime says:

    If a struct tm broken-down time structure is created by localtime() or
    localtime_r(), or modified by mktime(), and the value of TZ is
    subsequently modified, the results of the %Z and %z strftime()
    conversion specifiers are undefined, when strftime() is called with
    such a broken-down time structure.

<http://www.opengroup.org/onlinepubs/009695399/functions/strftime.html>

This was directly inspired by the fact that the strftime "%Z" conversion specifier just prints the contents of tm_zone, on platforms that have it, and in the described scenario that field does indeed point to junk.

Could one apply the same constraint to using the tm_zone field or the %Z conversion specifier following a call to tz_free() (or whatever one wants to call it)?

--
Jonathan Lennox
lennox@cs.columbia.edu
On Oct 26, 4:15pm, lennox@cs.columbia.edu (lennox@cs.columbia.edu) wrote:
-- Subject: Re: Extension to tzcode to support additional timezones

| > and if I call tzset() with a different zone and try to lookup tmp->TM_ZONE
| > this will possibly point to junk.
|
| This is why the POSIX specification of strftime says:
|
| If a struct tm broken-down time structure is created by localtime() or
| localtime_r(), or modified by mktime(), and the value of TZ is
| subsequently modified, the results of the %Z and %z strftime()
| conversion specifiers are undefined, when strftime() is called with
| such a broken-down time structure.
|
| <http://www.opengroup.org/onlinepubs/009695399/functions/strftime.html>
|
| This was directly inspired by the fact that the strftime "%Z" conversion
| specifier just prints the contents of tm_zone, on platforms that have it,
| and in the described scenario that field does indeed point to junk.
|
| Could one apply the same constraint to using the tm_zone field or the %Z
| conversion specifier following a call to tz_free() (or whatever one wants to
| call it)?

Yes, we can document the failure scenarios (it is fairly complex to do it correctly, but doable; for example there are no such warnings for tm_zone that I know of [1]), or we can try to fix them if people think it is worthwhile and the implementation is not overly complicated.

christos

[1] the man page (NetBSD) just says:

    The tm_zone and tm_gmtoff fields exist, and are filled in, only if
    arrangements to do so were made when the library containing these
    functions was created. There is no guarantee that these fields will
    continue to exist in this form in future releases of this code.

-- I don't even understand what that means :-)
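If the restriction is only documented rather than fixed, a caller can protect itself by copying the abbreviation while the zone is still in effect. The sketch below is one way to do that with standard calls; struct stamped_tm and both function names are hypothetical. It uses strftime("%Z") instead of reading tm_zone directly, since tm_zone is an extension that is not visible everywhere.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical wrapper: a struct tm plus a private copy of its abbreviation. */
struct stamped_tm {
    struct tm tm;
    char abbr[16];
};

/* Convert *t in the current zone and copy "%Z" out immediately. */
static int
localtime_with_abbr(const time_t *t, struct stamped_tm *out)
{
    if (localtime_r(t, &out->tm) == NULL)
        return -1;
    /* strftime() returns 0 if nothing was stored (e.g. no abbreviation). */
    if (strftime(out->abbr, sizeof out->abbr, "%Z", &out->tm) == 0)
        out->abbr[0] = '\0';
    return 0;
}

/* Returns 1 if the saved abbreviation is still "EST" after TZ changes. */
static int
abbr_survives_tz_switch(void)
{
    struct stamped_tm st;
    time_t t = 0;

    setenv("TZ", "EST5", 1);    /* POSIX-style zone string: EST, UTC-5 */
    tzset();
    if (localtime_with_abbr(&t, &st) != 0)
        return 0;

    setenv("TZ", "CET-1", 1);   /* switch zones; st.abbr is our own copy */
    tzset();
    return strcmp(st.abbr, "EST") == 0;
}
```

The copy costs a few bytes per broken-down time, but it avoids depending on the lifetime of the library's internal zone state.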
On Tue, 26 Oct 2010, Christos Zoulas wrote:
> - use c99 __thread (does not work everywhere)

There is no such thing. C99 has no thread support. C1x will have thread support but it has nothing called __thread. It does have a _Thread_local storage class.

Tony.
--
f.anthony.n.finch <dot@dotat.at> http://dotat.at/
HUMBER THAMES DOVER WIGHT PORTLAND: NORTH BACKING WEST OR NORTHWEST, 5 TO 7,
DECREASING 4 OR 5, OCCASIONALLY 6 LATER IN HUMBER AND THAMES. MODERATE OR
ROUGH. RAIN THEN FAIR. GOOD.
On Oct 27, 11:15am, dot@dotat.at (Tony Finch) wrote:
-- Subject: Re: Extension to tzcode to support additional timezones

| On Tue, 26 Oct 2010, Christos Zoulas wrote:
| >
| > - use c99 __thread (does not work everywhere)
|
| There is no such thing. C99 has no thread support. C1x will have thread
| support but it has nothing called __thread. It does have a _Thread_local
| storage class.

Seems that it is part of edits to c99?

http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/C99-Thread_002dLocal-Edits.html

Anyway the point is moot, since it needs TLS and not everyone has it.

christos
On 26/10/10 18:44, Christos Zoulas wrote:
> On Oct 26, 10:33am, eggert@cs.ucla.edu (Paul Eggert) wrote:
> | I agree that it'd be better to have a new opaque type,
> | struct tz * (say), rather than void *.
>
> I just have:
>
> typedef struct __state *timezone_t;

As 'state' is a fairly common word (even though '__state' would be a system reserved identifier in C terms), it might be better to make it:

    typedef struct __tzstate *timezone_t;

--
-=( Ian Abbott @ MEV Ltd. E-mail: <abbotti@mev.co.uk> )=-
-=( Tel: +44 (0)161 477 1898 FAX: +44 (0)161 718 3587 )=-
On Oct 27, 10:03am, abbotti@mev.co.uk (Ian Abbott) wrote:
-- Subject: Re: Extension to tzcode to support additional timezones

| On 26/10/10 18:44, Christos Zoulas wrote:
| > On Oct 26, 10:33am, eggert@cs.ucla.edu (Paul Eggert) wrote:
| > | I agree that it'd be better to have a new opaque type,
| > | struct tz * (say), rather than void *.
| >
| > I just have:
| >
| > typedef struct __state *timezone_t;
|
| As 'state' is a fairly common word (even though '__state' would be a
| system reserved identifier in C terms), it might be better to make it:
|
| typedef struct __tzstate *timezone_t;

Sure, I can easily change that, but let's first settle whether we want a typedef or not. I will be fixing some API things that people pointed out to me and I'll post a new diff soon. Once we reach consensus on whether we want a typedef, I will make the necessary changes.

The second part of the discussion is how to deal with tm_zone. Do we document that if you tzfree() you are going to lose, or do we make it work using a string pool and thread locks? Or something else?

christos
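To make the typedef-versus-struct question concrete, here is a toy sketch of the "opaque structure" style: the header side exposes only an incomplete struct type, which any number of headers may forward-declare, and the layout stays private. All toy_* names are invented for illustration, and the zone is just a fixed UTC offset rather than real tzcode rules.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* ---- what a public header could expose (hypothetical names) ---- */
struct toy_tz;                  /* incomplete type: layout is hidden */
struct toy_tz;                  /* repeating the declaration is harmless */

struct toy_tz *toy_tzalloc_fixed(const char *abbr, long utoff_seconds);
void toy_tzfree(struct toy_tz *tz);
struct tm *toy_localtime_rz(const struct toy_tz *tz, const time_t *t,
    struct tm *out);
const char *toy_tzabbr(const struct toy_tz *tz);

/* ---- private to the implementation file ---- */
struct toy_tz {
    long utoff;                 /* seconds east of UTC */
    char abbr[8];               /* abbreviation, NUL-terminated */
};

struct toy_tz *
toy_tzalloc_fixed(const char *abbr, long utoff_seconds)
{
    struct toy_tz *tz;

    if (abbr == NULL || strlen(abbr) >= sizeof tz->abbr)
        return NULL;
    tz = malloc(sizeof *tz);
    if (tz == NULL)
        return NULL;
    tz->utoff = utoff_seconds;
    strcpy(tz->abbr, abbr);     /* length was checked above */
    return tz;
}

void
toy_tzfree(struct toy_tz *tz)
{
    free(tz);
}

/* Like localtime_r(), but the zone is an explicit argument. */
struct tm *
toy_localtime_rz(const struct toy_tz *tz, const time_t *t, struct tm *out)
{
    time_t shifted = *t + (time_t)tz->utoff;

    return gmtime_r(&shifted, out);
}

const char *
toy_tzabbr(const struct toy_tz *tz)
{
    return tz->abbr;
}
```

A caller only ever holds a struct toy_tz *, so the compiler rejects an unrelated pointer type, which a void * cookie would not catch. Note that the string returned by toy_tzabbr() dies with toy_tzfree(), which is the same lifetime question raised for tm_zone.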
On Wednesday, October 27 2010, "Christos Zoulas" wrote to "tz@lecserver.nci.nih.gov, tz@elsie.nci.nih.gov, jhb@freebsd.org" saying:
> The second part of the discussion is how to deal with tm_zone. Do we document that if you tzfree() you are going to lose, or do we make it work using a string pool and thread locks? Or something else?

Remember that this means not only tm_zone per se, but also strftime("%Z"), on systems which have tm_zone.

(On systems without tm_zone, as far as I can tell there's no way to get strftime("%Z") correct for non-standard timezones, unless I'm overlooking something.)

--
Jonathan Lennox
lennox@cs.columbia.edu
On 27/10/10 16:57, lennox@cs.columbia.edu wrote:
> (On systems without tm_zone, as far as I can tell there's no way to get strftime("%Z") correct for non-standard timezones, unless I'm overlooking something.)

For GNU libc, depending on the feature macros in force, tm_zone might be replaced with the double-underscored __tm_zone, in which case, it's still there but is only meant to be used by the library (e.g. by strftime("%Z")).

--
-=( Ian Abbott @ MEV Ltd. E-mail: <abbotti@mev.co.uk> )=-
-=( Tel: +44 (0)161 477 1898 FAX: +44 (0)161 718 3587 )=-
On Oct 27, 11:57am, lennox@cs.columbia.edu (lennox@cs.columbia.edu) wrote:
-- Subject: Re: Extension to tzcode to support additional timezones

| On Wednesday, October 27 2010, "Christos Zoulas" wrote to "tz@lecserver.nci.nih.gov, tz@elsie.nci.nih.gov, jhb@freebsd.org" saying:
|
| > The second part of the discussion
| > is how to deal with tm_zone. Do we document that if you tzfree() you are
| > going to lose, or do we make it work using a string pool and thread locks?
| > Or something else?
|
| Remember that this means not only tm_zone per se, but also strftime("%Z"),
| on systems which have tm_zone.

Indeed.

| (On systems without tm_zone, as far as I can tell there's no way to get
| strftime("%Z") correct for non-standard timezones, unless I'm overlooking
| something.)

We have strftime_z() that can deduce it from the timezone_t that is passed in. Otherwise it defaults to the tzname etc.

christos
On Tuesday, October 26, 2010 1:33:34 pm Paul Eggert wrote:
> Having an extension like this would be nice, but while we're making the interface reentrant, we should also pass in the locale as a parameter, for functions like strftime that need the locale to do their work. Many operating systems already have strftime_l and we should extend that.
>
> Also, it would be better to use names that build on existing conventions, rather than inventing new names. How about if we use a z suffix for the new functions that have a time zone parameter? That would build on the existing tradition of using _r and _l for similar extensions. Something like this:
>
> strftime_lz (for the strftime_l variant that has a time zone parameter)
> localtime_rz (for the localtime_r variant that has a time zone parameter)
> mktime_z (for the mktime variant that has a time zone parameter)
>
> I agree that it'd be better to have a new opaque type, struct tz * (say), rather than void *.

These all sound fine to me. I am not tied to any specific names. I mostly care about not having to maintain this functionality as a local patch.

--
John Baldwin
On Tuesday, October 26, 2010 12:58:20 pm Guy Harris wrote:
> On Oct 26, 2010, at 9:03 AM, Olson, Arthur David (NIH/NCI) [E] wrote:
>
> > Several of my employers have needed to work with data from multiple timezones within a single process. The traditional way to work with this has been to change the value of the TZ environment variable and invoke tzset() and switch between timezones one at a time. What I have done is extend the time(3) API to allow applications to load and use arbitrary timezones separate from the current local and GMT timezones. I've done this by adding the following API calls:
> >
> > - void *tzopen(const char *name). This loads the rules for a specified timezone and returns a void * cookie. If the zone cannot be parsed it returns NULL and sets errno to EINVAL.
> >
> > - void tzclose(void *zone). This "closes" a set of rules opened via tzopen() by releasing the associated resources.
> >
> > - struct tm *tztime(void *zone, const time_t *t, struct tm *tm). This is like localtime_r() except that it uses the timezone rules from the passed in cookie instead of the local timezone indicated by TZ.
>
> Note that, on several OSes - including, as I remember, FreeBSD - "struct tm" includes a "tm_zone" field, which points to the timezone abbreviation for the time in question. If a program calls tzopen(), tztime(), and then tzclose(), is the "tm_zone" field in the "struct tm" filled in by tztime() still valid?

No. However, I think this case is similarly broken:

    putenv("TZ=America/New_York");
    tzset();
    time(&t);
    localtime_r(&t, &tm);
    putenv("TZ=Europe/London");
    tzset();

At this point I believe that tm_zone will not point to "EST" and "EDT", but probably the London equivalents (FreeBSD's localtime.c uses the case where there is static storage backing lclptr and gmtptr). In the ALL_STATES case I think tm_zone would reference free'd memory just as in the case you point out above.

> > I have a patch that is relative to the FreeBSD source tree, but all of the real code changes are in localtime.c which I believe should apply directly to the tzcode distribution. The patch is available at http://www.FreeBSD.org/~jhb/patches/tzopen.patch
>
> "404 - Not Found" isn't much of a patch. :-)

Oops, fixed.

> > Is this API something that you folks would be interested in? I currently have this as a private patch locally, but if possible I would like it adopted upstream.
>
> People have been suggesting this sort of thing on several occasions, so we'd be interested in an API of this sort. (In the past, I'd proposed support for this, with a patch, and somebody pointed out the tm_zone issue.)

--
John Baldwin
participants (9)

- christos@zoulas.com
- Garrett Wollman
- Guy Harris
- Ian Abbott
- John Baldwin
- lennox@cs.columbia.edu
- Olson, Arthur David (NIH/NCI) [E]
- Paul Eggert
- Tony Finch