Re: [tz] [libc-coord] thread-safe localtime() for an arbitrary timezone
On 2023-06-15 10:03, enh wrote:
any thoughts on either renaming `struct state` (i went with `struct __timezone_t` to match the typedef to timezone_t for the pointer) or having a knob for that (i added a `#define state __timezone_t` in the `#if NETBSD_INSPIRED` in private.h)? why do i care? because tzcode pulls in the system <time.h>, which is actually fine for the tzcode code, but breaks the OpenBSD wcsftime().
We could rename 'struct state', but I'm a bit lost as to what the problem is, as I don't see mention of 'state' or 'timezone_t' in openbsd/lib/libc/time/wcsftime.c. Although there is a 'struct state' in openbsd/lib/libc/time/localtime.c I assume you're replacing that with tzcode's localtime.c somehow. User apps (which tzcode is, by default) shouldn't define symbols like __timezone_t because these symbols are reserved to the system. We could easily rename 'struct state' to 'struct timezone_struct' so that you could compile with -Dtimezone_struct=__timezone_t, though I'm still curious about why 'struct state' causes problems. I'll cc this to the tz mailing list to see whether other people there have insight into this issue. For those new to the discussion, the original thread is archived here: https://www.openwall.com/lists/libc-coord/2023/06/14/1
On Mon, Jun 19, 2023 at 11:08 PM Paul Eggert <eggert@cs.ucla.edu> wrote:
On 2023-06-15 10:03, enh wrote:
any thoughts on either renaming `struct state` (i went with `struct __timezone_t` to match the typedef to timezone_t for the pointer) or having a knob for that (i added a `#define state __timezone_t` in the `#if NETBSD_INSPIRED` in private.h)? why do i care? because tzcode pulls in the system <time.h>, which is actually fine for the tzcode code, but breaks the OpenBSD wcsftime().
We could rename 'struct state', but I'm a bit lost as to what the problem is, as I don't see mention of 'state' or 'timezone_t' in openbsd/lib/libc/time/wcsftime.c. Although there is a 'struct state' in openbsd/lib/libc/time/localtime.c I assume you're replacing that with tzcode's localtime.c somehow.
it's actually the openbsd strptime.c that's unhappy: ``` In file included from bionic/libc/tzcode/strptime.c:40: bionic/libc/tzcode/private.h:790:23: error: typedef redefinition with different types ('struct state *' vs 'struct __timezone_t *') typedef struct state *timezone_t; ^ bionic/libc/include/time.h:46:30: note: previous definition is here typedef struct __timezone_t* timezone_t; ^ 1 error generated. ``` the trick in private.h assumes that it's #included _before_ <time.h>, but that's not true in that file. (since we already have local changes in that file for gnu extensions that openbsd wasn't interested in, i've just reordered the #includes.) User apps (which tzcode is, by default) shouldn't define symbols like
__timezone_t because these symbols are reserved to the system. We could easily rename 'struct state' to 'struct timezone_struct' so that you could compile with -Dtimezone_struct=__timezone_t, though I'm still curious about why 'struct state' causes problems.
true; since tzcode _is_ the implementation for 3.5+ billion devices, i forget that others might not consider it "the implementation" :-)
I'll cc this to the tz mailing list to see whether other people there have insight into this issue. For those new to the discussion, the original thread is archived here:
( the Archives of this thread seem incomplete.) On 6/20/23 15:00:29, enh via tz wrote:
... We could rename 'struct state', but I'm a bit lost as to what the problem is, as I don't see mention of 'state' or 'timezone_t' in openbsd/lib/libc/time/wcsftime.c. Although there is a 'struct state' in openbsd/lib/libc/time/localtime.c I assume you're replacing that with tzcode's localtime.c somehow.
(Replying mostly to the Subject:) A thread-safe localtime() seems impractical. localtime() depends on the environment variable TZ. If concurrent threads set different values of TZ the results will be unpredictable. The same applies to localtime_r(). -- gil
On Jun 20, 2023, at 7:16 PM, Paul Gilmartin via tz <tz@iana.org> wrote:
( the Archives of this thread seem incomplete.)
tz mailing list archives, or lib-coord: https://www.openwall.com/lists/libc-coord/2020/01/30/1 archives? The former are incomplete because the thread didn't start on the tz mailing list; the first message, to libc-coord, was linked from Paul Eggert's message to tz in this thread: https://www.openwall.com/lists/libc-coord/2023/06/14/1
On 6/20/23 15:00:29, enh via tz wrote:
... We could rename 'struct state', but I'm a bit lost as to what the problem is, as I don't see mention of 'state' or 'timezone_t' in openbsd/lib/libc/time/wcsftime.c. Although there is a 'struct state' in openbsd/lib/libc/time/localtime.c I assume you're replacing that with tzcode's localtime.c somehow.
(Replying mostly to the Subject:) A thread-safe localtime() seems impractical. localtime() depends on the environment variable TZ. If concurrent threads set different values of TZ the results will be unpredictable. The same applies to localtime_r().
If concurrent thread set different values of TZ, perhaps the concurrent threads should use a reentrant localtime_z() routine, possibly with a different name, as has been proposed in a number of places, including in this list back in 2001: http://mm.icann.org/pipermail/tz/2001-June/011636.html That requires making such a routine common on multiple platforms, of course.... And, in fact, in the first message in the thread, the author said is anyone aware of any standardization work in this area? i know netbsd has tzalloc()/localtime_rz()/tzfree() (and tzcode has implementations), but they're the only ones who've shipped anything, right? and there's no in-progress work on any alternative? So the proposal there is for localtime_rz(), taking a "time zone" structure, or pointer to same, provided by tzalloc() and freed by jtzfree(), rather than something that looks at TZ. (Perhaps the title was a bit misleading, and should have said "lcoaltime() equivalent".)
On Jun 20, 2023, at 2:00 PM, enh via tz <tz@iana.org> wrote:
it's actually the openbsd strptime.c that's unhappy: ``` In file included from bionic/libc/tzcode/strptime.c:40:
I see the string "bionic" rather than "openbsd" there, so that appears to be *Bionic's* strptime.c that's unhappy. Is there a requirement that Bionic's strptime.c be unchanged from OpenBSD's? (BTW, is there some software using Markdown involved here? the ``` won't change the formatting to fixed format in plain-text or HTML email; in the former, what you see is pretty much what you get, and in the latter, you'd need whatever tags ``` turns into.)
(since we already have local changes in that file for gnu extensions that openbsd wasn't interested in, i've just reordered the #includes.)
Exactly.
On Jun 20, 2023, at 11:27 PM, Guy Harris via tz <tz@iana.org> wrote:
On Jun 20, 2023, at 2:00 PM, enh via tz <tz@iana.org> wrote:
it's actually the openbsd strptime.c that's unhappy: ``` In file included from bionic/libc/tzcode/strptime.c:40:
I see the string "bionic" rather than "openbsd" there, so that appears to be *Bionic's* strptime.c that's unhappy. Is there a requirement that Bionic's strptime.c be unchanged from OpenBSD's?
In fact, the openbsd strptime.c, as of CVS revision 1.32: https://cvsweb.openbsd.org/src/lib/libc/time/strptime.c?rev=1.31&content-typ... does *not* contain the line typedef struct state *timezone_t; so it appears that it's the *Bionic* strptime.c that's unhappy.
On Wed, Jun 21, 2023 at 12:05 AM Guy Harris <gharris@sonic.net> wrote:
On Jun 20, 2023, at 11:27 PM, Guy Harris via tz <tz@iana.org> wrote:
On Jun 20, 2023, at 2:00 PM, enh via tz <tz@iana.org> wrote:
it's actually the openbsd strptime.c that's unhappy: ``` In file included from bionic/libc/tzcode/strptime.c:40:
I see the string "bionic" rather than "openbsd" there, so that appears to be *Bionic's* strptime.c that's unhappy. Is there a requirement that Bionic's strptime.c be unchanged from OpenBSD's?
In fact, the openbsd strptime.c, as of CVS revision 1.32:
https://cvsweb.openbsd.org/src/lib/libc/time/strptime.c?rev=1.31&content-typ...
does *not* contain the line
typedef struct state *timezone_t;
so it appears that it's the *Bionic* strptime.c that's unhappy.
(this is answered on the other thread: it's tz's "private.h" that contains that. i'm afraid without reading _both_ threads, the tz@ thread is going to be confusing.)
From an earlier message of yours: https://www.openwall.com/lists/libc-coord/2023/06/15/2 sent only to libc-coord:
writing some basic unit tests for these functions went fine, though it was sad that i couldn't use `std::unique_ptr<timezone_t, decltype(&tzfree)> seoul{tzalloc("Asia/Seoul"), tzfree}` because the size of timezone_t isn't known. i'd be tempted to further mangle this so i have a public struct containing a large-enough char[], but i worry about just how large to make it to be safe from future tzcode changes. (and on the other hand, do i think that tzalloc() is going to be used frequently enough to warrant the effect? probably not?)
Is there some particular reason to make timezone_t an API/ABI structure, rather than a pointer to an opaque structure? Presumably if it's a structure, "tzalloc()" really means "given a tzid, fill in a structure as appropriate for that tzid, and return that structure", not "allocate a structure, fill it in as appropriate for the tzid, and return a pointer to that structure", and "tzfree()" means "free up anything pointed to by the structure whose value was passed to it" rather than "free everything pointed to by the structure pointed to by the argument and then free the structure". If you want to do that, perhaps names that don't imply that they allocate and free a timezone_t should be chosen.
On Wed, Jun 21, 2023 at 12:54 AM Guy Harris <gharris@sonic.net> wrote:
From an earlier message of yours:
https://www.openwall.com/lists/libc-coord/2023/06/15/2
sent only to libc-coord:
writing some basic unit tests for these functions went fine, though it was sad that i couldn't use `std::unique_ptr<timezone_t, decltype(&tzfree)> seoul{tzalloc("Asia/Seoul"), tzfree}` because the size of timezone_t isn't known. i'd be tempted to further mangle this so i have a public struct containing a large-enough char[], but i worry about just how large to make it to be safe from future tzcode changes. (and on the other hand, do i think that tzalloc() is going to be used frequently enough to warrant the effect? probably not?)
Is there some particular reason to make timezone_t an API/ABI structure, rather than a pointer to an opaque structure?
no, my preference would be for timezone_t to be more like DIR... typedef struct DIR DIR; means you need to write the * in signatures, but that std::unique_ptr-style safe usage "just works" without having to talk about the underlying struct or use std::remove_pointer.
Presumably if it's a structure, "tzalloc()" really means "given a tzid, fill in a structure as appropriate for that tzid, and return that structure", not "allocate a structure, fill it in as appropriate for the tzid, and return a pointer to that structure", and "tzfree()" means "free up anything pointed to by the structure whose value was passed to it" rather than "free everything pointed to by the structure pointed to by the argument and then free the structure". If you want to do that, perhaps names that don't imply that they allocate and free a timezone_t should be chosen.
On Jun 21, 2023, at 12:50 PM, enh <enh@google.com> wrote:
On Wed, Jun 21, 2023 at 12:54 AM Guy Harris <gharris@sonic.net> wrote:
Is there some particular reason to make timezone_t an API/ABI structure, rather than a pointer to an opaque structure?
no, my preference would be for timezone_t to be more like DIR...
typedef struct DIR DIR;
Like DIR, which, at least in macOS (and thus probably in most *BSDs), is fully defined in <dirent.>, or like pcap_t, which is "defined" as an opaque structure in <pcap/pcap.h> and only defined fully in a header internal to libpcap? (And, yes, pcap_t's contents are subject to change over time, have been changed over time, and will continue to change over time.
means you need to write the * in signatures, but that std::unique_ptr-style safe usage "just works" without having to talk about the underlying struct or use std::remove_pointer.
Does that work if the structure is opaque outside the routines that process tz files?
Presumably if it's a structure, "tzalloc()" really means "given a tzid, fill in a structure as appropriate for that tzid, and return that structure", not "allocate a structure, fill it in as appropriate for the tzid, and return a pointer to that structure", and "tzfree()" means "free up anything pointed to by the structure whose value was passed to it" rather than "free everything pointed to by the structure pointed to by the argument and then free the structure". If you want to do that, perhaps names that don't imply that they allocate and free a timezone_t should be chosen.
I.e., it would be timezone_t *tzalloc(const char *tzid); and void tzfree(timezone_t *);
On Wed, Jun 21, 2023 at 1:08 PM Guy Harris <gharris@sonic.net> wrote:
On Jun 21, 2023, at 12:50 PM, enh <enh@google.com> wrote:
On Wed, Jun 21, 2023 at 12:54 AM Guy Harris <gharris@sonic.net> wrote:
Is there some particular reason to make timezone_t an API/ABI structure, rather than a pointer to an opaque structure?
no, my preference would be for timezone_t to be more like DIR...
typedef struct DIR DIR;
Like DIR, which, at least in macOS (and thus probably in most *BSDs), is fully defined in <dirent.>, or like pcap_t, which is "defined" as an opaque structure in <pcap/pcap.h> and only defined fully in a header internal to libpcap? (And, yes, pcap_t's contents are subject to change over time, have been changed over time, and will continue to change over time.
NetBSD exposes its `struct _dirdesc` but FreeBSD/OpenBSD don't (though they use the same name for the private struct). somewhat unusually, bionic didn't start from any of the BSD's <dirent.h>, and uses exact the line i showed above, with the definition only in dirent.cpp itself.
means you need to write the * in signatures, but that std::unique_ptr-style safe usage "just works" without having to talk about the underlying struct or use std::remove_pointer.
Does that work if the structure is opaque outside the routines that process tz files?
Presumably if it's a structure, "tzalloc()" really means "given a tzid, fill in a structure as appropriate for that tzid, and return that structure", not "allocate a structure, fill it in as appropriate for the tzid, and return a pointer to that structure", and "tzfree()" means "free up anything pointed to by the structure whose value was passed to it" rather than "free everything pointed to by the structure pointed to by the argument and then free the structure". If you want to do that, perhaps names that don't imply that they allocate and free a timezone_t should be chosen.
I.e., it would be
timezone_t *tzalloc(const char *tzid);
and
void tzfree(timezone_t *);
exactly, just like opendir() and closedir().
On Jun 21, 2023, at 1:17 PM, enh <enh@google.com> wrote:
NetBSD exposes its `struct _dirdesc` but FreeBSD/OpenBSD don't (though they use the same name for the private struct).
Then I vote for doing as FreeBSD and OpenBSD and CupertinoBSD and systems using glibc do with DIR, rather than as NetBSD does with DIR, and have whatever the handle for a tzdb region is called be a pointer to an opaque structure.
On Wed, Jun 21, 2023 at 1:33 PM Guy Harris <gharris@sonic.net> wrote:
On Jun 21, 2023, at 1:17 PM, enh <enh@google.com> wrote:
NetBSD exposes its `struct _dirdesc` but FreeBSD/OpenBSD don't (though they use the same name for the private struct).
Then I vote for doing as FreeBSD and OpenBSD and CupertinoBSD and systems using glibc do with DIR, rather than as NetBSD does with DIR, and have whatever the handle for a tzdb region is called be a pointer to an opaque structure.
you were right first time about macOS --- it's in the NetBSD camp (well, it inlines the struct definition in the typedef, so effectively the same). i haven't looked at glibc, but musl is in the FreeBSD/OpenBSD/MountainViewBsdOnLinux camp, so i assume glibc is too :-)
On 2023-06-21 12:50, enh wrote:
my preference would be for timezone_t to be more like DIR...
typedef struct DIR DIR;
Although that would also work, that's not how NetBSD did things when it introduced timezone_t in NetBSD 6.0 (2012), and tzcode copied this part of NetBSD in tzdb 2014g. I doubt whether we should change the meaning of timezone_t at this point, as it'd be too much backward-compatibility hassle. We could use a different name, though, for the struct. How about if we change this line in tzcode private.h: typedef struct state *timezone_t; to the following? typedef struct tm_timezone *timezone_t; and change all other instances of "struct state" to "struct tm_timezone"? This would let users write "struct tm_timezone *" and "timezone_t" interchangeably. The name tm_timezone, like timezone_t, is reserved by POSIX when you include <time.h>[1], so if the <time.h> implementation uses this name it won't break any conforming programs. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#ta...
On Wed, Jun 21, 2023 at 2:18 PM Paul Eggert <eggert@cs.ucla.edu> wrote:
On 2023-06-21 12:50, enh wrote:
my preference would be for timezone_t to be more like DIR...
typedef struct DIR DIR;
Although that would also work, that's not how NetBSD did things when it introduced timezone_t in NetBSD 6.0 (2012), and tzcode copied this part of NetBSD in tzdb 2014g. I doubt whether we should change the meaning of timezone_t at this point, as it'd be too much backward-compatibility hassle.
We could use a different name, though, for the struct. How about if we change this line in tzcode private.h:
typedef struct state *timezone_t;
to the following?
typedef struct tm_timezone *timezone_t;
and change all other instances of "struct state" to "struct tm_timezone"?
This would let users write "struct tm_timezone *" and "timezone_t" interchangeably. The name tm_timezone, like timezone_t, is reserved by POSIX when you include <time.h>[1], so if the <time.h> implementation uses this name it won't break any conforming programs.
(i just checked and NetBSD does have `typedef struct __state *timezone_t;` in its <time.h>. my objection to that is that the compiler will actually use the underlying name in error messages, which is less than helpful if the names aren't close. so having s/state/tm_timezone/ seems like a step forward even just in that regard!) up to you (especially since my #define in "private.h" is unlikely to be much of a maintenance burden), but "i'd use this" (if only to get rid of my #define)...
[1]:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#ta...
participants (4)
-
enh -
Guy Harris -
Paul Eggert -
Paul Gilmartin