Timezone change detection
Hello. I'm working on upstreaming a patch, which originally came from NetApp, to modify localtime.c to automatically detect changes to the time zone. The use case is to make the applications handle moving between timezones, or changing the default system timezone. The idea is pretty simple: just stat(2) the timezone file inode to make sure it still refers to the same file it previously did, and make sure to do this at most once per minute. The patch (with the FreeBSD-specific build glue stripped) follows; previous discussion can be found at https://reviews.freebsd.org/D30183. The patch was written for FreeBSD version of localtime.c, which differs a bit from upstream, but as there seems to be an effort underway to sync FreeBSD with upstream, I believe it also makes sense to ask for comments here before commiting it to FreeBSD. Any feedback is welcome. Thanks :-) diff --git a/contrib/tzcode/stdtime/localtime.c b/contrib/tzcode/stdtime/localtime.c index e221c1fa396..926b24470e1 100644 --- a/contrib/tzcode/stdtime/localtime.c +++ b/contrib/tzcode/stdtime/localtime.c @@ -354,6 +354,45 @@ settzname(void) } } +#ifdef DETECT_TZ_CHANGES +/* + * Determine if there's a change in the timezone since the last time we checked. + * Returns: -1 on error + * 0 if the timezone has not changed + * 1 if the timezone has changed + */ +static int +change_in_tz(const char *name) +{ + static char old_name[PATH_MAX]; + static struct stat old_sb; + struct stat sb; + int error; + + error = stat(name, &sb); + if (error != 0) + return -1; + + if (strcmp(name, old_name) != 0) { + strlcpy(old_name, name, sizeof(old_name)); + old_sb = sb; + return 1; + } + + if (sb.st_dev != old_sb.st_dev || + sb.st_ino != old_sb.st_ino || + sb.st_ctime != old_sb.st_ctime || + sb.st_mtime != old_sb.st_mtime) { + old_sb = sb; + return 1; + } + + return 0; +} +#else /* !DETECT_TZ_CHANGES */ +#define change_in_tz(X) 0 +#endif /* !DETECT_TZ_CHANGES */ + static int differ_by_repeat(const time_t t1, const time_t t0) { @@ -379,6 +418,7 @@ register const int doextend; int stored; int nread; int res; + int ret; union { struct tzhead tzhead; char buf[2 * sizeof(struct tzhead) + @@ -427,6 +467,22 @@ register const int doextend; (void) strcat(fullname, name); name = fullname; } + if (doextend == TRUE) { + /* + * Detect if the timezone file has changed. Check + * 'doextend' to ignore TZDEFRULES; the change_in_tz() + * function can only keep state for a single file. + */ + ret = change_in_tz(name); + if (ret <= 0) { + /* + * Returns -1 if there was an error, + * and 0 if the timezone had not changed. + */ + free(fullname); + return ret; + } + } if ((fid = _open(name, OPEN_MODE)) == -1) { free(fullname); return -1; @@ -1209,12 +1265,43 @@ gmtload(struct state *const sp) (void) tzparse(gmt, sp, TRUE); } +#ifdef DETECT_TZ_CHANGES +static int +recheck_tzdata() +{ + static time_t last_checked; + struct timespec now; + time_t current_time; + int error; + + /* + * We want to recheck the timezone file every 61 sec. + */ + error = clock_gettime(CLOCK_MONOTONIC, &now); + if (error <= 0) { + /* XXX: Can we somehow report this? */ + return 0; + } + + current_time = now.tv_sec; + if ((current_time - last_checked > 61) || + (last_checked > current_time)) { + last_checked = current_time; + return 1; + } + + return 0; +} +#else /* !DETECT_TZ_CHANGES */ +#define recheck_tzdata() 0 +#endif /* !DETECT_TZ_CHANGES */ + static void tzsetwall_basic(int rdlocked) { if (!rdlocked) _RWLOCK_RDLOCK(&lcl_rwlock); - if (lcl_is_set < 0) { + if (lcl_is_set < 0 && recheck_tzdata() == 0) { if (!rdlocked) _RWLOCK_UNLOCK(&lcl_rwlock); return;
On 9/3/21 9:53 AM, Edward Tomasz Napierała via tz wrote:
The use case is to make the applications handle moving between timezones, or changing the default system timezone.
Does the use case include installing new timezone tables when a new tzdb release comes out? Or is it to handle only the case where /etc/timezone changes because the system has moved or has changed its default timezone for other reasons? Won't the change break applications like 'zdump' that call 'localtime' several times and assume that all the 'localtime' calls return consistent values for a single timezone? Perhaps it would be better to do change detection only when tzset is called explicitly; this would be less likely to break zdump etc. A smaller question: when opening the TZif file for the first time, wouldn't it be more efficient (and avoid a race) to do open+fstat instead of stat+open?
Paul Eggert via tz <tz@iana.org> writes:
On 9/3/21 9:53 AM, Edward Tomasz Napierała via tz wrote:
The use case is to make the applications handle moving between timezones, or changing the default system timezone.
Won't the change break applications like 'zdump' that call 'localtime' several times and assume that all the 'localtime' calls return consistent values for a single timezone?
I think the odds of this breaking stuff are significantly larger than the odds of it doing useful stuff. I find it particularly frightening that the patch doesn't appear to allow the calling application to have any say in the matter. regards, tom lane
On Sep 3, 2021, at 3:43 PM, Tom Lane via tz <tz@iana.org> wrote:
I think the odds of this breaking stuff are significantly larger than the odds of it doing useful stuff. I find it particularly frightening that the patch doesn't appear to allow the calling application to have any say in the matter.
Note that, in Darwin, and thus, in macOS/iOS/iPadOS/etc.: 1) there is a notification mechanism by which the code in the OS that adjusts the time zone if the machine (from an iPhone to a MacBook*) crosses tzdb region boundaries can tell all code that uses the time zone information that the time zone has changed; 2) the tzdb-reference-code-based code in libSystem registers for those notifications and reloads the time zone information when the time zone changes; so that the following program, when compiled and run on a Mac: #include <stdio.h> #include <time.h> #include <unistd.h> int main(void) { time_t now; for (;;) { now = time(NULL); printf("%s", ctime(&now)); sleep(5); } return 0; } can produce output such as Fri Sep 3 15:55:20 2021 Fri Sep 3 15:55:25 2021 Fri Sep 3 15:55:30 2021 Fri Sep 3 15:55:35 2021 Fri Sep 3 15:55:40 2021 Fri Sep 3 15:55:45 2021 Fri Sep 3 15:55:50 2021 Fri Sep 3 15:55:55 2021 Fri Sep 3 15:56:00 2021 Fri Sep 3 15:56:05 2021 Fri Sep 3 17:56:10 2021 Fri Sep 3 17:56:15 2021 Fri Sep 3 15:56:20 2021 Fri Sep 3 15:56:25 2021 if the time zone changes, and 3) the only opt-out mechanism I know of would be to somehow determine the tzdb region you want to stay in forever (note that TZ doesn't reflect it - it's not set by default, so /etc/localtime is used, so if you want to stay in the current tzdb region, you'd have to do a readlink() on /etc/localtime), do a setenv() to set TZ to that, and then call tzset(). (In that example, I turned off the automatic time zone picker, manually changed to a different time zone, and then turned the automatic time zone picker back on, as I'm not close enough to a time zone boundary to be able to test it using only the picker. But the same thing will happen if, for example, you get on a plane with your MacBook* or iPhone or iPad, cross a time zone boundary, and then look at the time in the airport where you arrive - I know this because it *did* happen to me several years ago.)
On 9/3/21 4:05 PM, Guy Harris wrote:
Note that, in Darwin, and thus, in macOS/iOS/iPadOS/etc.:
1) there is a notification mechanism by which the code in the OS that adjusts the time zone if the machine (from an iPhone to a MacBook*) crosses tzdb region boundaries can tell all code that uses the time zone information that the time zone has changed;
2) the tzdb-reference-code-based code in libSystem registers for those notifications and reloads the time zone information when the time zone changes;
That shouldn't break zdump. However, it might break an application that relies on the system-default timezone rather than on setting TZ explicitly, if the application assumes a coherent timezone. That being said, if Darwin operates the way you describe it sounds like they're willing to live with this sort of breakage. Arguably the benefit is worth the cost. (Does the same thing happen with localtime_r on Darwin? localtime_r is documented by POSIX to not necessarily do a tzset, for all the usual thread-safety and performance reasons. If localtime responds to these notifications but localtime_r does not, this could lead to mysterious behavior.) Edward seemed to be asking for something more, though. That is, he appeared to be asking for something where it's also the case that in code like this: setenv("TZ", "America/New_York", 1); time_t x = whatever; struct tm a = *localtime(&x); struct tm b = *localtime(&x); A and B might have different values if tzdb is updated between the two calls to localtime. This example is independent of whether the system default time zone has changed. (And presumably the same thing could happen with localtime_r.) Another issue would be using tzdb extensions taken from NetBSD: timezone_t tz = tzalloc("America/New_York"); time_t x = whatever; struct tm a = *localtime_rz(tz, &x, &a); struct tm b = *localtime_rz(tz, &x, &b); A and B might have different values if tzdb is updated between the two calls to localtime_rz. Even though tz is a time zone "value", it is a "value" that mutates when tzdb is updated. This usage appears to be particularly problematic, as part of the point of localtime_rz is speed and thread-safety, and I'd hate for it to have to consult the filesystem. What do you think about all this, Edward?
On Sep 3, 2021, at 6:11 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
(Does the same thing happen with localtime_r on Darwin? localtime_r is documented by POSIX to not necessarily do a tzset, for all the usual thread-safety and performance reasons. If localtime responds to these notifications but localtime_r does not, this could lead to mysterious behavior.)
Same behavior from #include <stdio.h> #include <time.h> #include <unistd.h> int main(void) { time_t now; struct tm tmstr, *tm; for (;;) { now = time(NULL); tm = localtime_r(&now, &tmstr); printf("%04d-%02d-%02d %02d:%02d:%02d\n", tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday, tm->tm_hour, tm->tm_min, tm->tm_sec); sleep(5); } return 0; } Performance shouldn't be an issue, as it doesn't reload - or even check for a change - on every call; it reloads /etc/localtime only if it's notified that the tzdb region changed. This program does *not* show a tzdb region change if I tweak the time zone: #include <stdio.h> #include <time.h> #include <unistd.h> #include <stdlib.h> int main(void) { time_t now; struct tm tmstr, *tm; setenv("TZ", "America/Los_Angeles", 1); for (;;) { now = time(NULL); tm = localtime_r(&now, &tmstr); printf("%04d-%02d-%02d %02d:%02d:%02d\n", tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday, tm->tm_hour, tm->tm_min, tm->tm_sec); sleep(5); } return 0; } I suspect the same will be the case if ctime() is used.
On Sep 3, 2021, at 6:11 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
A and B might have different values if tzdb is updated between the two calls to localtime_rz. Even though tz is a time zone "value", it is a "value" that mutates when tzdb is updated. This usage appears to be particularly problematic, as part of the point of localtime_rz is speed and thread-safety, and I'd hate for it to have to consult the filesystem.
As far as I know, the Darwin scheme does not bother checking for tzdb updates, it just handles explicit notifications that the system time zone has changed, so this would be a change over and above what you get in, for example, macOS.
Another issue would be using tzdb extensions taken from NetBSD:
timezone_t tz = tzalloc("America/New_York"); time_t x = whatever; struct tm a = *localtime_rz(tz, &x, &a); struct tm b = *localtime_rz(tz, &x, &b);
A and B might have different values if tzdb is updated between the two calls to localtime_rz. Even though tz is a time zone "value", it is a "value" that mutates when tzdb is updated. This usage appears to be particularly problematic, as part of the point of localtime_rz is speed and thread-safety, and I'd hate for it to have to consult the filesystem.
I don't think this is right. tzalloc() loads the zone data and after that the filesystem is not consulted anymore, so A and B will always be the same. In fact, I would say that the tzalloc() mechanism should be used for applications that wish to use fresh data each time the timezone file gets updated. Perhaps we should add a "const char *tzgetpath(const timezone_t);" so that the application can use whatever mechanism it wants to detect staleness (but that would require adding more information to timezone_t). If we want to implement staleness checks in the library, we should do this by adding the information to detect staleness in timezone_t, not using static variables (which are also not thread safe, and don't work with the "z" variants). christos
On 9/4/21 2:05 AM, Christos Zoulas wrote:
Perhaps we should add a "const char *tzgetpath(const timezone_t);" so that the application can use whatever mechanism it wants to detect staleness (but that would require adding more information to timezone_t). If we want to implement staleness checks in the library, we should do this by adding the information to detect staleness in timezone_t, not using static variables (which are also not thread safe, and don't work with the "z" variants).
The approach I had thought of would be to add a tzalloc variant in which you specify a flag saying "I want this timezone_t to be updated if the system changes timezone or timezone data". Using that flag would slow down localtime_rz calls that use the resulting timezone_t. There might be multiple flags depending on how much checking you want. This approach would insulate apps from the details of staleness checking. It'd allow the implementation to go to a Google-like approach, for example, where there's just one big file with all the TZif data inside it. This approach would affect only the relatively few apps that use localtime_rz, though. For apps using localtime and localtime_r, surely it'd be OK for us to change tzset so that it always checks the filesystem (which it doesn't always do now), and to make this change only for explicit calls to tzset. That wouldn't break many applications, as nobody expects localtime results to be consistent across explicit tzset boundaries. As for localtime routinely checking a la Darwin, that's problematic for the semantic reasons already discussed. If we do that, I suspect we should do it only if a build-time option tells us to. And perhaps we'd need two options, one for "look for changes to the sysadmin's selection of the default timezone", and one for "look for changes to the database".
On Sep 4, 2021, at 9:25 AM, Paul Eggert <eggert@cs.ucla.edu> wrote:
As for localtime routinely checking a la Darwin, that's problematic for the semantic reasons already discussed.
Note that at least one trademark-licensed UNIX implements that approach (albeit with a notification mechanism rather than polling i.e. routinely checking); it's unwise for UN*X programs to assume that the timezone doesn't change out from under you, especially given that there are UN*X devices that you can wear on your wrist even if you're riding in a vehicle that can cross time zone boundaries. According to C11, "The local time zone and Daylight Saving Time are implementation-defined.", so I don't see anything there that forbids the local time zone from changing out from under the program. The tzset() page in the Single UNIX Specification says The tzset() function shall use the value of the environment variable TZ to set time conversion information used by ctime, localtime, mktime, and strftime. If TZ is absent from the environment, implementation-defined default timezone information shall be used. Which also doesn't appear to forbid the local time zone from changing out from under the program if TZ isn't set. So it may be "problematic" but my sympathies for any software for which that causes problems is limited. I don't object to making something such as that not be the default, but I'll still advise software developers either to explicitly set TZ or to use localtime_rz() if they need to convert times in a specific fixed time zone. (I'd *love* to have localtime_rz(), or some equivalent, available within Wireshark, on multiple platforms including Windows, for example, so people who want to see packet time stamps etc. in the zone in which they were captured could do so without annoying and platform-dependent hacks.)
On 0903T1340, Paul Eggert wrote:
On 9/3/21 9:53 AM, Edward Tomasz Napierała via tz wrote:
The use case is to make the applications handle moving between timezones, or changing the default system timezone.
Does the use case include installing new timezone tables when a new tzdb release comes out? Or is it to handle only the case where /etc/timezone changes because the system has moved or has changed its default timezone for other reasons?
The former - it compares both the device/inode numbers, and file modification/change times. It also handles a case where there's a changed symlink in the zone file path.
Won't the change break applications like 'zdump' that call 'localtime' several times and assume that all the 'localtime' calls return consistent values for a single timezone? Perhaps it would be better to do change detection only when tzset is called explicitly; this would be less likely to break zdump etc.
That's a really good point. The mechanism is not compiled in by default, so it is already kind of opt-in, but yeah, I see the problem. I need to think about it some more.
A smaller question: when opening the TZif file for the first time, wouldn't it be more efficient (and avoid a race) to do open+fstat instead of stat+open?
I believe that's how it works right now in FreeBSD version of localtime.c.
participants (5)
-
Christos Zoulas -
Edward Tomasz Napierała -
Guy Harris -
Paul Eggert -
Tom Lane