default value of TZDEFAULT
[This is a tzcode issue, not tzdata. Your regular discussion of zone definition and naming issues will resume shortly. :-) ] Consider this simple little program: #include <stdio.h> #include <time.h> int main() { time_t t = time(NULL); struct tm *tmp = localtime(&t); puts(asctime(tmp)); return 0; } There's nothing special about this program (that's the point). But suppose I download tzcode or tzdb and then compile the little program (call it lt.c) along with a just-built localtime(), like this: make localtime.o cc -o lt lt.c localtime.o But if I then run 'lt', it is likely not to work -- there's a pretty good chance it will print the current time in UTC, not my local time zone. If it fails, one reason is that it didn't successfully find my system's already-installed tz files. (That is, I made no attempt to build or install any new tz files.) The reason it didn't find them is obvious enough, and the Makefile contains all the hints I need: TOPDIR= /usr/local # "Compiled" time zone information is placed in the "TZDIR" directory # (and subdirectories). # Use an absolute path name for TZDIR unless you're just testing the software. # TZDIR_BASENAME should not contain "/" and should not be ".", ".." or empty. TZDIR_BASENAME= zoneinfo TZDIR= $(TOPDIR)/etc/$(TZDIR_BASENAME) And, sure enough, on my system, the already-installed tz files are not in /usr/local, they're in /usr/share. So it looks like if I change TOPDIR to /usr/share, or TZDIR to /usr/share/zoneinfo, I should be able to fix it. But no! It still doesn't work, and the reason is, my user environment doesn't have the TZ environment variable set; I'm assuming that the system's default time zone will be used. And, out of the box, the tzcode distribution assumes that the local time zone is defined by a file 'localtime' *in TZDIR*; that is, it essentially assumes that TZDEFAULT is $TZDIR/localtime. But some (perhaps even many) systems put it in /etc/localtime. That's where my (MacOS) system puts it, and a lot of Linux systems put it there, too. The upshot is that on such a system, if you compile and use localtime.c straight out of the tzcode distribution, it won't find your localtime file, and will end up falling back to UTC unless TZ is set. Since the Makefile already lets you set your own version of TZDIR (overriding the setting in tzfile.h), I think it makes sense to do the same thing for TZDEFAULT. Patch attached. With this patch in place, and with the Makefile edited to say TZDEFAULT = "/etc/localtime" fresh compiles of localtime.o correctly find my system's localtime file, and the lt program finally prints local time, as expected. (This tweak to TZDEFAULT's usage helps the extraction case -- that is, when the code tries to fetch the default zone, but not necessarily the tz file building case. I'm not sure whether 'make install' will do the right thing when TZDEFAULT is set, but it definitely doesn't do the right thing if DESTDIR and TZDEFAULT are both set. Is using DESTDIR supported? The Makefile should probably document it if so.) Diagnosing problems involving misconfigured TZDIR and TZDEFAULT variables can be surprisingly difficult. The localtime code doesn't have any way of telling you it's having trouble finding tz files in general, or the localtime file in particular. I'll address that issue in a separate thread. Steve Summit scs@eskimo.com
As mentioned in a previous message, tzcode's error handling leaves a bit to be desired. If you specify a nonexistent timezone (say, by setting the TZ environment variable to it), or if the tz files can't be found, or if you haven't specified a time zone and the system's default tz file can't be found, you quietly -- and I do mean quietly! -- get UTC instead. There's no good way to know what went wrong. I have some proposed patches to address this. I'm presenting this as a series of incremental patches, partly so that interested readers can more easy follow the progression from the current code to the fully modified, comprehensively error reporting version, and partly because some of the later stages might be thought too invasive for inclusion. Patch 1 optionally prints some simple, straightforward error messages if a requested zone can't be loaded. The printing of error messages is optional: library code obviously can't be noisy, so the default behavior is the legacy, quiet fallback to UTC. Patch 1 also includes a (one line) change to the distributed 'date' command as a demonstration of how to enable verbose error reporting. For various reasons, though, the error messages go only halfway: they can tell you that a zone couldn't be loaded, but they can't always really tell you why. So Patch 2 attempts to improve the error handling by printing more useful messages. For example, instead of printing "Can't load zone America/New_York", it might say "/usr/local/zoneinfo/America/New_York: No such file or directory". Unfortunately, this ends up requiring rather a lot of changes, and the implementation is significantly imperfect in that it uses some static global data which renders it non thread safe in the error handling case. Patch 3 refactors some code so that the relevant error reporting information can be more cleanly preserved in the local_storage structure, and for just long enough that the appropriate error message can be generated without resorting to static data. (The only remaining drawback is that I'm uncertain of the interplay between these changes and the ALL_STATE option.) Finally, patch 4 optionally prints some merely informative output messages (e.g. "trying /usr/local/zoneinfo/EST5EDT", "parsing 'EST5EDT' as Posix tz string") which do not necessarily indicate any error but which might be useful to maintainers. The four patches are presented in the following four messages, along with a certain amount of additional explanation. This all seems to work, although I've seen a few spurious warnings having to do with a missing and possibly optional 'posixrules' file. I'm pretty sure those messages are due to some misconfiguration in my test environments (and I'm pretty sure the messages are innocuous, anyway), but if they crop up for anybody else, they may need addressing. Steve Summit scs@eskimo.com
This is Patch 1. This patch improves localtime's error handling, to optionally print actual error messages if a requested zone can't be loaded. For various reasons, though, the error messages go only halfway: they will say things like "can't load zone %s", but they are *not* able to say things like "/usr/share/zoneinfo/America/New_York: No such file or directory". (Personally, I consider this sort of lack a nearly fatal flaw in a system error message.) As you might guess, this shortcoming will be addressed in Patch 2. If it's not obvious, the way to enable the optional error messages is either to call tz_set_stderr_warn_func(), which requests that they be sent to stderr, or to call tz_set_warn_func(), which lets you install your own error-printing callback function (with the signature of printf). Also enclosed is a one-line patch to tzcode's copy of date.c adding a call to tz_set_stderr_warn_func. One other mild drawback of this code is that it, unsurprisingly, uses stdio. Once this patch is applied, tzcode depends on stdio again, not long after someone's careful attempt to break that dependency. This patch is with respect to 2017c. Steve Summit scs@eskimo.com
This is Patch 2, which attempts to improve localtime's error handling further. It will tell you if it couldn't open a zoneinfo file (explicitly stating which one and why); it will tell you if it had trouble parsing a zoneinfo file; it will tell you if it had trouble parsing a Posix timezone string not referencing a file. That's all well and good, but the drawbacks are that this ends up requiring rather a lot of changes, and the implementation is significantly imperfect in that it uses some errno-style static global data which renders it non thread safe in the error handling case. As a user, patch 1's error messages (though arguably better than nothing) would have me putting my fist through the screen ("Come on, tell me *which* *file* you couldn't open!"). But Patch 2's nonreentrancy is likely to be a fatal flaw in some people's eyes, and rightly so. For those looking at the code: the error handling variables in question are tzlb_reason and tzlb_filename. The obvious non-statically-global place to stash them, for better thread safety, would be in an instance of union local_storage, but those guys don't stick around long enough under all circumstances. Addressing that will be the subject of Patch 3, at the cost of yet more code rearrangement. This patch is with respect to 2017c, plus Patch 1. Steve Summit scs@eskimo.com
This is patch 3, which repurposes the local_storage union so that it can also save the reason that an attempted zone load failed. Doing so has several implications: * local_storage changes from a union to a struct. (This means tzcode will use a bit more memory, although not permanently so.) * The allocation of local_storage is no longer performed in tzload, but rather in its callers: tzparse, gmtload, and zoneinit. This means that tzload doesn't do anything any more, so I have merged it with tzloadbody. * The allocation of local_storage in tzload used to be either on the heap or the stack, depending on whether ALL_STATE was defined. For simplicity, I've removed the heap allocation option, such that now (in all three of tzparse, gmtload, and zoneinit), local_storage is always allocated on the stack. I don't understand why it was ever important to allocate it on the heap (even in the ALL_STATE case), since it was always freed immediately afterwards. But I don't fully understand what ALL_STATE is for, so I might have missed something. This patch is with respect to 2017c, plus Patches 1 and 2. Steve Summit scs@eskimo.com
Patch 4 adds the ability to print a few extra messages having to do with tzcode's logic for determining how to interpret a requested time zone. There are only two messages so far: "trying %s", when the code is attempting to open a zoneinfo file; and "parsing '%s' as Posix tz string", when it, well, attempts to parse a Posix-style timezone string. That's not much, but it lets you watch, for example, TZ=EST5EDT try both ways but TZ=:EST5EDT only try the file. To enable "verbose debugging", call tz_set_verbose_debug(1), and call tz_set_verbose_debug(0) to turn it off again. Like the rest, this patch is with respect to 2017c. It also depends on Patch 1 (but not 2 or 3). Steve Summit scs@eskimo.com
Steve Summit wrote:
Patch 1 optionally prints some simple, straightforward error messages if a requested zone can't be loaded.
Emitting human-orineted messages is a crap way for library code to signal an error. It means the calling code has no opportunity to handle the error in a manner coherent with its own requirements and style. You at least get points for giving the caller control over where the message goes, but the global variable controlling error disposition breaks code composability: an innocent application utility function that used to internally perform a time conversion now acquires the additional externally-visible behaviour of emitting unwanted warnings, because the top-level program wants the warnings for its own time conversions. There is already a defined way for localtime() et al to report an error: they can return NULL. This doesn't give much information, but at least the caller can see that there was an error on that specific call. If you want to declare some situation to be an error, this is the place to start. If that would break too much existing code, then surely emitting warning messages under control of a new mechanism that existing code doesn't know about would also be too breaking. The better way to signal errors would be to fill a somewhat-user-parseable structure describing the error, and to return it as an additional result from the call that incurred the error, with the caller having requested this kind of information for that specific call. This implies calling a new API function instead of the existing ones. This would be better applied to any of the proposed APIs that pass more of the operands explicitly, than to the legacy APIs that use global variables all over. Of course one would provide a function to translate an error-describing structure into a human-oriented message. But think about locale for a minute: this translation potentially varies based on locale, and this is the right place for that locale to be decided. One doesn't necessarily want error messages in the same locale as the formatting of the time string requested from strftime(). Give the caller control over this. -zefram
Zefram wrote:
Steve Summit wrote:
Patch 1 optionally prints some simple, straightforward error messages if a requested zone can't be loaded.
Emitting human-orineted messages is a crap way for library code to signal an error. It means...
Everything you say is true, but:
There is already a defined way for localtime() et al to report an error: they can return NULL. This doesn't give much information, but...
That's crap, too. (It gives basically one half of one bit of information.) True story: I was moved to write the (allegedly) improved error-handling code when I wrote some tz-using code a while back (a test suite which has unfortunately languished), and submitted it to tz, and none other than Paul Eggert couldn't get it to work at first, because he neglected to update TZDIR in the tzcode Makefile to match the location of his test system's tz files, and it took him and me a couple of days and a several rounds of email to figure out the problem. I submit that the current situation doesn't just give "not much information", it gives no useful information at all. It's like the old MS-DOS systems that printed "no such file or directory, or access denied, or illegal pathname" (or whatever it was) when the mkdir command failed. Yes, global modes and static data are poor for library functions. Yes, a modern interface would "fill a somewhat-user-parseable structure describing the error" and return it. But remember: localtime() et al. are not modern interfaces! And they're already laced with global modes and static data. (And to my mind, "not modern" is not even pejorative, here. We wouldn't still be using localtime et al. if they didn't work well enough, most of the time.) But with all of that said, I suppose that a program that wanted to fail less quietly on misinstalled or misconfigured tz data could, instead of calling this new tz_set_warn_func() first, call a hypothetical new tzset_with_error(&errorinfo) first, followed by the requisite print_tz_error(&errorinfo, locale), or whatever. (What I don't want to do is force such a program to change all of its localtime and gmtime and mktime calls, since there may be arbitrarily many of those.)
Date: Mon, 11 Dec 2017 17:25:59 -0500 From: scs@eskimo.com (Steve Summit) Message-ID: <2017Dec11.1725.scs.0004@quinine.local> | But with all of that said, I suppose that a program that wanted | to fail less quietly on misinstalled or misconfigured tz data | could, instead of calling this new tz_set_warn_func() first, call | a hypothetical new tzset_with_error(&errorinfo) first, This would be one way, as all the errors/whatever you seem to be interested in are related to tzset and nothing really related to issues after tzset has succeeded (when it succeeds) properly. Alternatively, simply add some error/status info into the internal zone structure, and then add a function to extract status (much like ferror() does for stdio - though perhaps with a little more info available). All in all the second seems like a more useful addition to me, if anything really needs changing at all (the alternative is simply to create a tzverify program, which would do what tzset does, but in a user visible way, so when things are not behaving as they should, the user can just run that program and see exactly what is happening - when everything is correctly configured, none of this extra is needed, it is just extra baggage for most programs, most of the time.) kre
On 12/11/2017 04:16 PM, Robert Elz wrote:
simply add some error/status info into the internal zone structure, and then add a function to extract status (much like ferror() does for stdio - though perhaps with a little more info available).
Perhaps this could be done by having localtime etc. set tm_isdst to a negative value when there's a configuration error. If we're feeling lazy, we could use strerror (- tm->tm_isdst) to stringify the error indication.
On 12/11/2017 04:33 AM, Steve Summit wrote:
Since the Makefile already lets you set your own version of TZDIR (overriding the setting in tzfile.h), I think it makes sense to do the same thing for TZDEFAULT.
This makes sense, and it ties to an earlier suggestion by Brian Inglis here: http://mm.icann.org/pipermail/tz/2016-September/024234.html I took a first cut at doing something along those lines, to match Debian reasonably closely, and installed the attached. Although this doesn't change the data, it's a nontrivial change to the default installation procedure; comments are welcome.
participants (4)
-
Paul Eggert -
Robert Elz -
scs@eskimo.com -
Zefram