localtime_r multiple times slower for Europe/Moscow timezone
Hi,

an Ubuntu user has reported in https://launchpad.net/bugs/868395 that the Europe/Moscow timezone is multiple times slower than other timezones. Here is the example bug.c code:

```
#include <time.h>
#include <stdio.h>

int main() {
  time_t t = time(0);
  int i;
  struct tm result;
  for(i=0; i < 10000000; i++)
    localtime_r(&t, &result);
  puts(ctime(&t));
  return 0;
}
```

Compile and run this code:

```
gcc -o bug bug.c
time TZ=Etc/UTC ./bug
time TZ=Europe/Berlin ./bug
time TZ=Europe/Moscow ./bug
```

The result on my machine is that Etc/UTC, Europe/Berlin, and other timezones take around 250 to 400 ms, but Europe/Moscow takes 1200 ms. The result is the same when running in a Debian unstable, Ubuntu 22.04 (jammy) and Ubuntu 23.04 (lunar) chroot.

Is that a bug? If yes, is that a bug in glibc?

-- 
Benjamin Drung
Debian & Ubuntu Developer
Interesting! I can confirm that I also see this in Ubuntu 22.10. However, I do NOT see it on macOS 13.1.

Walt
Benjamin Drung wrote in
 <3f5601eaa696b06a12f3a578b3128b131ed3bbef.camel@canonical.com>:
 |a Ubuntu user has been reporting in https://launchpad.net/bugs/868395
 |that the Europe/Moscow timezone is multiple times slower than other
 |timezones. Here is the example bug.c code:
 ...
 |The result on my machine is that Etc/UTC, Europe/Berlin, and other
 |timezones take around 250 to 400 ms, but Europe/Moscow takes 1200 ms.
 |The result is the same when running in a Debian unstable, Ubuntu 22.04
 |(jammy) and Ubuntu 23.04 (lunar) chroot.

Wow, what performance.

  #?0|kent:tmp$ for f in Etc/UTC Europe/Berlin Europe/Moscow; do time TZ=$f ./zt; done
  Tue Jan 10 23:51:57 2023

  real 0m0.520s
  user 0m0.515s
  sys 0m0.004s
  Wed Jan 11 00:51:58 2023

  real 0m7.202s
  user 0m7.180s
  sys 0m0.000s
  Wed Jan 11 02:52:05 2023

  real 0m2.076s
  user 0m2.071s
  sys 0m0.000s

 |Is that a bug? If yes, is that a bug in glibc?

CRUX Linux, 2022g, 2.36-3, gcc 12.2.0.

  #?0|kent:tmp$ zdump -v Europe/Berlin|wc -l
  2144
  #?0|kent:tmp$ zdump -v Europe/Moscow|wc -l
  166
  #?0|kent:tmp$ zdump -v Etc/UTC|wc -l
  6

Hm.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
Steffen Nurpmeso wrote in
 <20230110235603.tclHE%steffen@sdaoden.eu>:
 ...
 |CRUX Linux, 2022g, 2.36-3, gcc 12.2.0.
 |
 |  #?0|kent:tmp$ zdump -v Europe/Berlin|wc -l
 |  2144
 |  #?0|kent:tmp$ zdump -v Europe/Moscow|wc -l
 |  166
 |  #?0|kent:tmp$ zdump -v Etc/UTC|wc -l
 |  6

..just to correct stupidity a little bit

  $ for f in Etc/UTC Europe/Berlin Europe/Moscow; do
      ef=$(echo $f | sed -E 's|/|\\/|')
      sed -E '/^Z '"$ef"'/,/^Z /p;d' /usr/share/zoneinfo/tzdata.zi
    done

comes to

  Z Etc/UTC 0 - UTC
  Z Etc/GMT 0 - GMT
  Z Europe/Berlin 0:53:28 - LMT 1893 Ap
  1 c CE%sT 1945 May 24 2
  1 So CE%sT 1946
  1 DE CE%sT 1980
  1 E CE%sT
  Z Europe/Gibraltar -0:21:24 - LMT 1880 Au 2
  Z Europe/Moscow 2:30:17 - LMT 1880
  2:30:17 - MMT 1916 Jul 3
  2:31:19 R %s 1919 Jul 1 0u
  3 R %s 1921 O
  3 R MSK/MSD 1922 O
  2 - EET 1930 Jun 21
  3 R MSK/MSD 1991 Mar 31 2s
  2 R EE%sT 1992 Ja 19 2s
  3 R MSK/MSD 2011 Mar 27 2s
  4 - MSK 2014 O 26 2s
  3 - MSK
  Z Europe/Simferopol 2:16:24 - LMT 1880

--steffen
This is a known performance issue in glibc; see:

https://sourceware.org/bugzilla/show_bug.cgi?id=15943

It'd be nice if someone found the time to fix it; perhaps that could be you?

The bug depends on how the TZif files are built. On my Fedora 37 platform if you build with 'zic -b fat' Europe/Berlin is way faster than Europe/Moscow, but if you build with 'zic -b slim' it's the other way round. Obviously performance shouldn't be that much affected by whether TZif files are fat or slim.
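For comparing zones (or fat and slim builds) without process start-up noise, the loop from the original report can be timed in-process. The following is only a sketch: the function name time_zone_conversions is made up for illustration, and it assumes a POSIX system where setenv(), tzset() and localtime_r() are available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: returns the CPU seconds spent on `iterations`
   localtime_r() calls with TZ set to `zone`, or -1.0 if TZ cannot be set. */
static double time_zone_conversions(const char *zone, long iterations)
{
    time_t t = time(NULL);
    struct tm result;
    clock_t start, stop;
    long i;

    assert(zone != NULL && iterations >= 0);
    if (setenv("TZ", zone, 1) != 0)
        return -1.0;
    tzset();                      /* make libc pick up the new TZ value */

    start = clock();
    for (i = 0; i < iterations; i++)
        localtime_r(&t, &result); /* same call as in bug.c */
    stop = clock();

    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling it with "Etc/UTC", "Europe/Berlin" and "Europe/Moscow" and the same iteration count gives numbers that can be compared directly; the absolute values will of course depend on the machine and on how the TZif files were built.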
On Jan 10, 2023, at 6:40 PM, Paul Eggert via tz <tz@iana.org> wrote:
This is a known performance issue in glibc; see:
This dates back to the addition of 64-bit time support; in the discussion, Robbin Kawabata suggested:
1. This is another idea for supporting dates into the far future. Is it feasible for zic to encode variable information in the data file for the last set of rules, that would be used for times past the last entry in the transition table? Then localtime() would use the variables algorithmically rather than using table-driven data, for dates past the last table entry. Perhaps set a minimum size for the table entries (ie, have table entries at least up to year xxxx.)
zic would effectively map the last set of transition rules of an Olson timezone to an equivalent POSIX timezone.
where the sequence of transitions stretching into the future is compressed by providing data for an algorithm to compute them. Unless I missed a mail message in my archive search, we ended up going with
As before, zic writes a second instance of headers and data to time zone files; the second instance has eight-byte transition times to cover far-future (and far past) cases. Zic also puts a newline-enclosed POSIX-style time zone string at the end of the file when possible (or, when a zone can't be represented using POSIX, puts a newline-enclode empty string at the end of the file). (Enclosing the string in newlines makes for meaningful output from the "tail -1" command applied to time zone files.) When a POSIX-style string is available, zic does *not* write 400 years worth of data.
The files that don't have a POSIX string at the end are: America/Godthab America/Santiago Antarctica/Palmer Asia/Tehran Asia/Jerusalem Asia/Tel_Aviv Chile/Continental Chile/EasterIsland Iran Israel Pacific/Easter For zones such as America/Godthab, we use the previous dodge of writing 400 years worth of data to the time zone data file and then working modulo 400 in localtime.
("enclode" is a typo for "enclosed"; Arthur later sent out a message with the subject "yet another try at 64-bit changes", in which the typo is fixed.)

The problem appears to be that, if the tzdb region's file 1) has no transitions past the time being converted and 2) has a POSIX TZ string, then localtime_r() and localtime() will, as per the above, parse the POSIX TZ string.

If the results of the first parse of the TZ string are saved and reused for all subsequent conversions, this won't be too bad. If they are not, and you're converting a lot of times with the same tz setting, you're going to parse the same string over and over again, which seems a bit costly.

GNU libc's code does *not* save the results of the parse. Its code to test whether the time is past the last transition's time is

    else if (__glibc_unlikely (timer >= transitions[num_transitions - 1]))

which suggests that they assumed that this was an unlikely case. There may have been a time when it was unlikely, but, with the current zic, this will, I think, be the case for any tzdb region that does not currently adjust the clocks, and there are quite a few of them.

I am waiting for Sourceware to give me a Bugzilla account. Once they do, I will point this out to them.
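To illustrate the "save the results of the first parse" idea, here is a minimal sketch. It is not glibc's code: the names tz_footer_cache, parse_footer, cached_utoff and parse_calls are invented for this example, and the toy parser only understands footers without a DST rule, such as "MSK-3".

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Counts how often the toy parser actually runs. */
static unsigned long parse_calls = 0;

/* Hypothetical cache of the last parsed footer; not glibc's data structure. */
struct tz_footer_cache {
    char spec[64];   /* footer string the cached offset belongs to */
    long utoff;      /* seconds east of UTC */
    int valid;       /* nonzero once spec and utoff are filled in */
};

/* Toy parser for "name[+-]hh[:mm]" with no DST rule. POSIX offsets are
   positive west of Greenwich, so the sign is flipped. Returns 0 on success. */
static int parse_footer(const char *spec, long *utoff)
{
    const char *p = spec;
    char *end;
    long hours, minutes = 0;
    int sign = 1;

    parse_calls++;
    if (*p == '<') {                        /* quoted name such as <+03> */
        while (*p != '\0' && *p != '>')
            p++;
        if (*p != '>')
            return -1;
        p++;
    } else {
        while (isalpha((unsigned char)*p))  /* plain alphabetic name */
            p++;
    }
    if (p == spec)
        return -1;                          /* no name at all */
    if (*p == '+' || *p == '-') {
        if (*p == '-')
            sign = -1;
        p++;
    }
    if (!isdigit((unsigned char)*p))
        return -1;
    hours = strtol(p, &end, 10);
    if (*end == ':') {
        if (!isdigit((unsigned char)end[1]))
            return -1;
        minutes = strtol(end + 1, &end, 10);
    }
    if (*end != '\0')
        return -1;                          /* DST rules are out of scope here */
    *utoff = -sign * (hours * 3600 + minutes * 60);
    return 0;
}

/* Parses `spec` only when it differs from the cached footer. */
static int cached_utoff(struct tz_footer_cache *c, const char *spec, long *utoff)
{
    assert(c != NULL && spec != NULL && utoff != NULL);
    if (!c->valid || strcmp(c->spec, spec) != 0) {
        long parsed;
        if (strlen(spec) >= sizeof c->spec || parse_footer(spec, &parsed) != 0)
            return -1;
        strcpy(c->spec, spec);
        c->utoff = parsed;
        c->valid = 1;
    }
    *utoff = c->utoff;
    return 0;
}
```

With a zero-initialized struct tz_footer_cache, repeated calls to cached_utoff() with the same footer run the parser once; only a changed footer triggers another parse. A real fix would have to cache the full rule set, including DST transitions, and invalidate it when TZ changes.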
On Jan 11, 2023, at 12:07 AM, Guy Harris via tz <tz@iana.org> wrote:
GNU libc's code does *not* save the results of the parse. Its code to test whether the time is past the last transition's time is
else if (__glibc_unlikely (timer >= transitions[num_transitions - 1]))
which suggests that they assumed that this was an unlikely case. There may have been a time when it was unlikely, but, with the current zic, this will, I think, be the case for any tzdb region that does not currently adjust the clocks, and there are quite a few of them.
Including China, India, Japan, and Russia, for starters.
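One rough way to see which zones are candidates is to look at the POSIX TZ string that zic writes at the end of the TZif file: a footer without a DST rule, such as "MSK-3", means every current time lies past the last transition. The sketch below assumes a version 2 or later TZif file; the function name read_tzif_footer is made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the newline-enclosed footer of the TZif file
   at `path` into buf. Returns the footer length (0 for an empty footer),
   or -1 on error or when no footer is found. */
static long read_tzif_footer(const char *path, char *buf, size_t buflen)
{
    char head[5];
    char tail[256];
    long size, start;
    size_t n, i, len;
    FILE *f;

    assert(path != NULL && buf != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    /* "TZif" magic plus a nonzero version byte; version 1 has no footer. */
    if (fread(head, 1, sizeof head, f) != sizeof head
        || memcmp(head, "TZif", 4) != 0 || head[4] == '\0'
        || fseek(f, 0, SEEK_END) != 0 || (size = ftell(f)) < 0) {
        fclose(f);
        return -1;
    }
    /* Read at most the last 256 bytes; the footer sits at the very end. */
    start = size > (long)sizeof tail ? size - (long)sizeof tail : 0;
    if (fseek(f, start, SEEK_SET) != 0) {
        fclose(f);
        return -1;
    }
    n = fread(tail, 1, sizeof tail, f);
    fclose(f);

    if (n < 2 || tail[n - 1] != '\n')
        return -1;                  /* file does not end with a newline */
    i = n - 1;                      /* index of the closing newline */
    while (i > 0 && tail[i - 1] != '\n')
        i--;                        /* walk back to just after the opening one */
    if (i == 0)
        return -1;                  /* opening newline not within the window */
    len = n - 1 - i;
    if (len >= buflen)
        return -1;
    memcpy(buf, tail + i, len);
    buf[len] = '\0';
    return (long)len;
}
```

Pointing it at a file such as /usr/share/zoneinfo/Europe/Moscow should yield the same string that "tail -1" prints for that file. A footer without a comma-separated DST rule is only a hint, not proof, that a given libc takes the slow path.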
participants (5)
- Benjamin Drung
- Guy Harris
- Paul Eggert
- Steffen Nurpmeso
- Walt Mankowski