Questions on tzname and tm_zone
Dear all,

I have a few questions regarding how the tzdb code handles tzname and tm_zone:

* The define HAVE_TZNAME has three different "settings": 0 if tzname is not supported, 1 to support tzname as defined by the system library, and 2 to support and define tzname. In the code, though, the check for defining tzname also adds the value of TZ_TIME_T, which reduces the options. Basically, if time_tz is defined to a datatype other than time_t, tzname cannot be defined by the system library, since setting HAVE_TZNAME to 1 would still define tzname (around line 195 of localtime.c). Is there a reason for this behaviour (which is similar for USG_COMPAT)? On my platform we define a new 64-bit datatype, but would still need tzname as defined by the system library.

* In case of an invalid timezone, tzname and tm_zone behave differently: tzname is set to WILDABBR (which is possible to define), but if localtime is called tm_zone is set to "UTC", while if gmtime is called tm_zone is set to "-00", and if offtime is called (with a non-0 offset) tm_zone is WILDABBR. It seems somewhat inconsistent, so is this by design?

* The way tzname is updated in settzname is that it first sets tzname[0] and tzname[1] to either WILDABBR or UTC, and then possibly updates them with the correct names. While tzname is not considered thread safe, writing to tzname twice in this way means that even a call to tzset with the same timezone could affect other threads reading tzname. Not a huge risk, but still possible. Is this something to consider in the implementation?

* Since tzname is not thread safe, and due to the way tm_zone is handled, it is somewhat hard to detect whether a TZ string is valid. Is there a recommended way to detect an invalid string?

Best regards,
Patrik Lantto

--
Patrik Lantto - patrik.lantto@wisi.se - +46 13 210918
On 2022-12-29 05:36, Patrik Lantto via tz wrote:
> * The define HAVE_TZNAME has three different "settings": 0 if tzname is not supported, 1 to support tzname as defined by the system library, and 2 to support and define tzname. In the code, though, the check for defining tzname also adds the value of TZ_TIME_T, which reduces the options. Basically, if time_tz is defined to a datatype other than time_t, tzname cannot be defined by the system library, since setting HAVE_TZNAME to 1 would still define tzname (around line 195 of localtime.c). Is there a reason for this behaviour (which is similar for USG_COMPAT)? On my platform we define a new 64-bit datatype, but would still need tzname as defined by the system library.
Defining time_tz is intended for internal use (as per the Makefile). When time_tz is defined, the code renames all externally visible symbols so that it in effect becomes a library that's independent of the C library; for example, localtime becomes tz_localtime. Since tzname is an externally visible symbol it gets renamed to tz_tzname, and since it's renamed, tzcode must define tz_tzname even if HAVE_TZNAME is 1, because HAVE_TZNAME is about tzname, not tz_tzname.

I guess your code could include private.h and get all the renaming; that way, it'd see "#define tzname tz_tzname" and would use tz_tzname, tz_localtime, etc. But of course private.h is intended to be private....

If you're defining and using your own library it might make more sense to compile with -Dtime_tz="long long" -DHAVE_TZNAME=0 -DUSG_COMPAT=0 -DALTZONE=0 -DTM_GMTOFF=tm_gmtoff -DTM_ZONE=tm_zone, and change your code to use tm_gmtoff and/or tm_zone and/or strftime instead of tzname, daylight, timezone, and altzone. This will make the code more reliable anyway.
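A minimal sketch of the strftime route mentioned above, assuming a POSIX system that provides localtime_r. The function name format_local_stamp is made up for illustration and is not part of tzcode. It obtains the abbreviation and UTC offset through the %Z and %z conversions applied to a per-call struct tm, so it never reads tzname, daylight, timezone, or altzone:

```c
#define _POSIX_C_SOURCE 200809L  /* for localtime_r under strict ISO C modes */
#include <stddef.h>
#include <time.h>

/* Hypothetical helper (not a tzcode function): format t as local time,
   including the numeric UTC offset (%z) and the zone abbreviation (%Z).
   Returns the number of characters written, or 0 on failure. */
size_t format_local_stamp(time_t t, char *buf, size_t bufsize)
{
    struct tm tm;

    /* localtime_r fills a caller-owned struct tm, so no shared static
       buffer is involved. */
    if (localtime_r(&t, &tm) == NULL)
        return 0;

    /* strftime takes the abbreviation and offset from the broken-down
       time it is given, not from the global tzname array. */
    return strftime(buf, bufsize, "%Y-%m-%d %H:%M:%S %z %Z", &tm);
}
```

It is probably wise to call tzset() once before the first use, since POSIX does not require localtime_r to do so itself.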
> * In case of an invalid timezone, tzname and tm_zone behave differently: tzname is set to WILDABBR
I'm not seeing that. tzname is set to WILDABBR before tzset is ever called, or if !ALL_STATE and malloc fails which means tzset can't do anything. An invalid timezone causes tzname to be set to "UTC", not to WILDABBR.
> (which is possible to define), but if localtime is called tm_zone will be set to "UTC"
Yes, and that matches tzname's behavior.
> while if gmtime is called, tm_zone is set to "-00"
I don't see that. gmtime sets tm_zone to "UTC", no?
> and if offtime is called (with a non-0 offset) tm_zone is WILDABBR

The intended meaning is:
* WILDABBR means tzname or tm_zone's value is meaningless. This occurs either if tzset has not been called; or if you look at tzname[1] after setting to a timezone without daylight saving, or look at tzname[0] after setting to a timezone that has no standard time; or you called offtime with a nonzero offset (see below).

* If you call gmtime etc., you want UTC so the abbreviation is "UTC".

* An invalid TZ environment variable is treated like UTC without leap seconds, with the abbreviation "UTC". (The conventional way to get this behavior is to set TZ to the empty string.)

* If you call offtime, we haven't implemented its tz abbreviation so you get WILDABBR which means tm_zone's value is meaningless. This is not considered a big deal since the very few people who use offtime generally don't care about time zone abbreviations. I suppose this could be fixed if someone got the energy but it's low priority.
> * The way tzname is updated in settzname is that it first sets tzname[0] and tzname[1] to either WILDABBR or UTC, and then possibly updates them with the correct names. While tzname is not considered thread safe, writing to tzname twice in this way means that even a call to tzset with the same timezone could affect other threads reading tzname.
We're OK as POSIX says the behavior is undefined if users do that <https://pubs.opengroup.org/onlinepubs/9699919799/functions/tzset.html>.
> * Since tzname is not thread safe, and due to the way tm_zone is handled, it is somewhat hard to detect whether a TZ string is valid. Is there a recommended way to detect an invalid string?
Call tzalloc and see whether it returns a null pointer. Although not perfect (you can get a null pointer on memory exhaustion too) it's generally good enough.
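A rough sketch of that check, under some assumptions: tzalloc and tzfree are taken to be declared by <time.h> (as on NetBSD) or by a header you supply when linking against tzcode built with them. USE_TZALLOC, enum tz_check, and check_tz_string are hypothetical names invented for this example, not tzcode identifiers. On platforms without tzalloc the sketch simply reports that it cannot tell:

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical result type for this sketch. */
enum tz_check { TZ_CHECK_OK, TZ_CHECK_REJECTED, TZ_CHECK_UNSUPPORTED };

/* Define USE_TZALLOC (an assumed, user-chosen macro) only when the
   library you link against provides tzalloc/tzfree and their
   declarations are visible at this point. */
enum tz_check check_tz_string(const char *tz)
{
#if defined(__NetBSD__) || defined(USE_TZALLOC)
    timezone_t zone = tzalloc(tz);

    /* A null pointer means the string was rejected, or that memory
       ran out; the two cases are not distinguished here. */
    if (zone == NULL)
        return TZ_CHECK_REJECTED;

    tzfree(zone);
    return TZ_CHECK_OK;
#else
    (void)tz;                     /* no tzalloc available: cannot check */
    return TZ_CHECK_UNSUPPORTED;
#endif
}
```

Because the check allocates its own timezone object, it does not depend on the TZ environment variable or on tzname, which sidesteps the thread-safety concern raised earlier in the thread.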
Participants (2): Patrik Lantto, Paul Eggert