[PROPOSED] Simplify zdump average-taking
* zdump.c (hunt): Simplify slightly, so that one more instruction
can be int width rather than time_t width if that’s what the
compiler prefers.  No need to mention C89 in the comment, as C89
support is going away soon.  (The code still happens to work in
C89, for what it’s worth.)
---
 zdump.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/zdump.c b/zdump.c
index a020102..04953dc 100644
--- a/zdump.c
+++ b/zdump.c
@@ -732,13 +732,9 @@ hunt(timezone_t tz, time_t lot, time_t hit, bool only_ok)
 	for ( ; ; ) {
 		/* T = average of LOT and HIT, rounding down.
-		   Avoid overflow, even on oddball C89 platforms
-		   where / rounds down and TIME_T_MIN == -TIME_T_MAX
-		   so lot / 2 + hit / 2 might overflow. */
-		time_t t = (lot / 2
-			    - ((lot % 2 + hit % 2) < 0)
-			    + ((lot % 2 + hit % 2) == 2)
-			    + hit / 2);
+		   Avoid overflow. */
+		int rem_sum = lot % 2 + hit % 2;
+		time_t t = (rem_sum == 2) - (rem_sum < 0) + lot / 2 + hit / 2;
 		if (t == lot)
 			break;
 		tm_ok = my_localtime_rz(tz, &t, &tm) != NULL;
--
2.38.1
I don't mean to single out this diff, but: why are we making changes like this? I cannot imagine that the width of a single instruction in zdump (of all things!) matters to... literally anyone?

[ If someone has an embedded system that spends its time flat-out running zdump to control an elevator or something, I want to hear about it! ]

And the risk of these fixes introducing subtle errors or not being carefully reviewed seems far higher than the benefit?

I think I am missing something, but could someone explain the perspective of why these kinds of changes should be made? Thanks.

--
jhawk@alum.mit.edu
John Hawkinson

Paul Eggert via tz <tz@iana.org> wrote on Wed, 30 Nov 2022 at 20:32:26 EST in <20221201013226.90512-1-eggert@cs.ucla.edu>:
On 2022-12-01 10:06:18 (+0800), John Hawkinson via tz wrote:
I don't mean to single out this diff, but: why are we making changes like this? I cannot imagine that the width of a single instruction in zdump (of all things!) matters to...literally anyone?
I'm generally in favour of constant (thoughtful) refactoring and keeping up with developments in standardisation. It keeps eyeballs on the code. I'm sure there are several of us on tz@ who at least glance over every diff. (Or at least a few. More than one.) Having said that, I would feel more comfortable with the more arcane changes and apparent micro-optimisations if we had automated regression testing in place.
[ If someone has an embedded system that spends its time flat-out running zdump to control an elevator or something, I want to hear about it! ]
[ Shhhh! Even thinking thoughts like this can cause such devices to spring into existence, and then someone will eventually have to fix them. :-) ]
And the risk of these fixes introducing subtle errors or not being carefully reviewed seems far higher than the benefit?
I'm more worried about the changes not seeing wider testing before ending up in releases.
I think I am missing something but could someone explain the perspective of why these kinds of changes should be made?
I can't speak for Paul, but this specific change looks primarily like a simplification. The fact that it improves performance is a happy coincidence.

Philip

--
Philip Paeps
Senior Reality Engineer
Alternative Enterprises
On 2022-11-30 18:25, Philip Paeps wrote:
I can't speak for Paul but this specific change looks primarily like a simplification. The fact that it improves performance is a happy coincidence.
This particular change's main point was to fix the comment, since we are moving to C99 or later. The code simplification was a nicety that's independent from the C99 change. The delta to the generated instructions was more a note to myself than anything important.

I did run the usual regression tests, as I do with all code changes. You can run them too, by invoking "make check" or (my favorite on GNU/Linux) "make CFLAGS='$(GCC_DEBUG_FLAGS)' check". Of course the tests are not exhaustive.

zdump is by far the biggest performance bottleneck in 'make check', so at least one user (namely me) would benefit from a faster zdump, not that this change affected zdump's performance significantly.
Paul Eggert <eggert@cs.ucla.edu> wrote on Thu, 1 Dec 2022 at 00:01:47 EST in <cc9a7cc5-ddbf-94dc-c5ad-faad18437abc@cs.ucla.edu>:
This particular change's main point was to fix the comment, since we are moving to C99 or later.
The idea that moving to C99 means we should go out and remove C89-isms is troubling to me, because I don't think our test coverage is strong enough to ensure that these kinds of changes do not introduce problems.

To me, moving to C99 means we don't go out of our way to preserve C89 compatibility, not that we needlessly alter working code just because we can now use a C99 feature to simplify it. (There is also the issue of burden on code reviewers.)

--
jhawk@alum.mit.edu
John Hawkinson
On 2022-12-01 01:27, John Hawkinson wrote:
I don't think our test coverage is strong enough to ensure that these kinds of changes do not introduce problems.
Yes, that goes without saying, for almost any project, as testing doesn't 100% guarantee correctness.

The idea isn't to go out and use every C99 feature just because we can. It's to improve code quality by not insisting on portability to C89. And these improvements need not be limited to new code; they can also apply to existing code.

This is not a new thing. We did something similar in 2012 when we started assuming C89 or later. For example, this patch:

https://github.com/eggert/tz/commit/400ecf36bb0b73f6390f9641e6cb8bbfb91a5cfd

changed tzcode to use function prototypes instead of K&R style function declarations. If the policy had been to not refactor unless we could guarantee it wouldn't introduce problems, then that patch would not have been applied in 2012, since the old code ran just fine on C99 and C11 platforms. However, the patch was a win because C89 function prototypes have important technical advantages over K&R style. Although there was some immediate pain due to this conversion, in the not-so-long run it was a win. (It'll be an even bigger win once C23 becomes popular, as C23 outlaws K&R function declarations.)

We'll have similar opportunities once we start assuming C99. We can decide which C99 features to use on a case-by-case basis. We shouldn't use every possible C99 feature we can, only the ones where the long-term benefit is arguably worth the short-term cost. For example, tzcode should not use C99's variable-length arrays because they're not that portable in practice. (They were so controversial that they even became optional in later C standards, so we couldn't use them anyway.) In contrast, the C99 feature of declarations after statements can help code readability significantly, and if someone (probably me) wants to take the time to convert existing code to use this feature, that sounds like a win.
And there's quite a bit of backward compatibility code (e.g., our own implementation of snprintf) that could be removed, as in practice nobody uses or tests it any more and it's simply a maintenance burden. I wouldn't object to removing that code; quite the contrary.

I understand that doing all this will place a burden on reviewers, and will make spelunking harder. But that's OK if it lessens the overall maintenance burden in the long run.
participants (3):
- John Hawkinson
- Paul Eggert
- Philip Paeps