FW: [casting; overflow detection]
I'm forwarding this message from Christos Zoulas. --ado -----Original Message----- From: Christos Zoulas [mailto:christos@zoulas.com] Sent: Thursday, January 13, 2011 8:50 To: Olson, Arthur David (NIH/NCI) [E] Subject: Hi, Mail to the tz mailing list does not seem to go through from my server. E82CA564C4 2128 Thu Jan 13 04:00:52 christos@zoulas.com (connect to lecserver.nci.nih.gov[137.187.215.78]: Operation timed out) tz@lecserver.nci.nih.gov I I sent the following two messages: Best, christos
From christos Wed Jan 12 11:49:47 2011 From: christos@rebar.astron.com (Christos Zoulas) Date: Wed, 12 Jan 2011 11:49:47 -0500 In-Reply-To: <201101121624.p0CGOrGD018457@lecserver.nci.nih.gov> from Arthur David Olson (Jan 12, 11:24am) Organization: Astron Software X-Mailer: Mail User's Shell (7.2.6 beta(4.pl1)+dynamic 20000103) To: <tz@elsie.nci.nih.gov>, <tz@lecserver.nci.nih.gov> Subject: Re: proposed time zone package changes (Hawaii, stack overflow, et al.) Status: OR
On Jan 12, 11:24am, olsona@elsie.nci.nih.gov (Arthur David Olson) wrote: -- Subject: proposed time zone package changes (Hawaii, stack overflow, et al | If no problems are found, these changes will appear in tzcode2011a.tar.gz | and tzdata2011.tar.gz on 2011-11-23. Please don't cast the result of calloc(3). It will hide errors where the prototype is not in scope and then the function is assumed to return "int" which breaks on 64 bit systems. It's been more than 20 years (22 actually since c89) since ANSI C made allocation functions return "void *" :-) On a different subject, I've committed the reentrant versions of the timezone code to the NetBSD tree. Is there any chance that those changes will be merged anytime soon? Or do people need more time to review them? How do we go about accepting or rejecting them? Should there be a vote? Best, christos
From christos Wed Jan 12 23:00:52 2011 From: christos@rebar.astron.com (Christos Zoulas) Date: Wed, 12 Jan 2011 23:00:52 -0500 Organization: Astron Software X-Mailer: Mail User's Shell (7.2.6 beta(4.pl1)+dynamic 20000103) To: tz@lecserver.nci.nih.gov Subject: Gcc and integer overflow Status: OR
Hello, Integer overflow in ansi c has undefined behavior. Gcc 4.5.x takes advantage of this fact to optimize code. In particular you cannot depend anymore on: int oldx = x; x += delta; if ((oldx > x && delta > 0) || (oldx < x && delta < 0)) overflow; What happens in fact, is that gcc will warn, and drop the code that is "impossible" (drop the whole if statement). Unfortunately tzcode relies on the above behavior to detect overflow. I have changed the code the following way (but I have not tested it), so if you please can you check my work, or propose a better way to do it. I think we should come up with a viable solution for the next tzcode release. Many thanks, christos Index: localtime.c =================================================================== RCS file: /cvsroot/src/lib/libc/time/localtime.c,v retrieving revision 1.51 diff -u -u -r1.51 localtime.c --- localtime.c 6 Jan 2011 02:41:34 -0000 1.51 +++ localtime.c 10 Jan 2011 21:32:09 -0000 @@ -1703,9 +1703,16 @@ { int number0; + if (delta == 0) + return 0; + number0 = *number; *number += delta; - return (*number < number0) != (delta < 0); + + if (delta < 0) + return number0 < 0 && (-*number + INT_MIN) > delta; + else /* delta > 0 */ + return number0 > 0 && (INT_MAX - *number) < delta; } static int @@ -1713,9 +1720,16 @@ { long number0; + if (delta == 0) + return 0; + number0 = *number; *number += delta; - return (*number < number0) != (delta < 0); + + if (delta < 0) + return number0 < 0 && (-*number + LONG_MIN) > delta; + else /* delta > 0 */ + return number0 > 0 && (LONG_MAX - *number) < delta; } static int
The business about casting calloc is more a matter of style than anything else (do we want to program as if it were the 1980s? then use a cast. otherwise don't) but the overflow issue is more important. On 01/13/11 06:30, christos@rebar.astron.com (Christos Zoulas) wrote:
+ return number0 < 0 && (-*number + INT_MIN) > delta;
That won't work either, since the '-' could overflow. The tz code is littered with assumptions that integer overflow wraps around. You've just found the tip of the iceberg here. It would take a great deal of effort to remove all those assumptions. I suggest that, if you compile with GCC, that you compile with the -fwrapv option, to avoid problems in this area. That being said, in areas where we know about overflow problems we might as well be more portable. Here's a patch to catch the instances of overflow issues that I found. I gave up after a while, because there are lots more where these came from, but these are pretty easy fixes so we might as well put them in. This patch also alters Makefile to warn about the problem and suggest -fwrapv to GCC users. =================================================================== RCS file: RCS/Makefile,v retrieving revision 2010.14 diff -pu -r2010.14 Makefile --- Makefile 2010/10/12 16:36:50 2010.14 +++ Makefile 2011/01/13 18:25:22 @@ -110,6 +110,10 @@ LDLIBS= # -TTZ_DOMAINDIR=\"/path\" to use "/path" for gettext directory; # the default is system-supplied, typically "/usr/lib/locale" # $(GCC_DEBUG_FLAGS) if you are using GCC and want lots of checking +# $(GCC_OVERFLOW_FLAGS) if you are using GCC 3.4 or later. +# If you are using a compiler other than GCC, and if it defaults to +# undefined behavior on integer overflow, then you need to specify a flag +# saying that integer arithmetic is supposed to wrap around on overflow. # -DNO_RUN_TIME_WARNINGS_ABOUT_YEAR_2000_PROBLEMS_THANK_YOU=1 # if you do not want run time warnings about formats that may cause # year 2000 grief @@ -120,6 +124,7 @@ GCC_DEBUG_FLAGS = -Dlint -g -O -fno-comm -Wall -Wcast-qual -Wconversion -Wmissing-prototypes \ -Wnested-externs -Wpointer-arith -Wshadow \ -Wtraditional # -Wstrict-prototypes -Wwrite-strings +GCC_OVERFLOW_FLAGS = -fwrapv # # If you want to use System V compatibility code, add # -DUSG_COMPAT @@ -386,7 +391,7 @@ names: public: $(ENCHILADA) make maintainer-clean - make "CFLAGS=$(GCC_DEBUG_FLAGS)" + make "CFLAGS=$(GCC_DEBUG_FLAGS) $(GCC_OVERFLOW_FLAGS)" -mkdir /tmp/,tzpublic -for i in $(TDATA) ; do zic -v -d /tmp/,tzpublic $$i 2>&1 | grep -v "starting year" ; done for i in $(TDATA) ; do zic -d /tmp/,tzpublic $$i || exit; done =================================================================== RCS file: RCS/date.c,v retrieving revision 2007.10 diff -pu -r2007.10 date.c --- date.c 2007/12/03 14:42:13 2007.10 +++ date.c 2011/01/13 18:36:59 @@ -598,15 +598,12 @@ comptm(atmp, btmp) register const struct tm * const atmp; register const struct tm * const btmp; { - register int result; - - if ((result = (atmp->tm_year - btmp->tm_year)) == 0 && - (result = (atmp->tm_mon - btmp->tm_mon)) == 0 && - (result = (atmp->tm_mday - btmp->tm_mday)) == 0 && - (result = (atmp->tm_hour - btmp->tm_hour)) == 0 && - (result = (atmp->tm_min - btmp->tm_min)) == 0) - result = atmp->tm_sec - btmp->tm_sec; - return result; + return atmp->tm_year != btmp->tm_year + || atmp->tm_mon != btmp->tm_mon + || atmp->tm_mday != btmp->tm_mday + || atmp->tm_hour != btmp->tm_hour + || atmp->tm_min != btmp->tm_min + || atmp->tm_sec != btmp->tm_sec; } /* =================================================================== RCS file: RCS/localtime.c,v retrieving revision 2010.14 diff -pu -r2010.14 localtime.c --- localtime.c 2010/10/12 16:36:51 2010.14 +++ localtime.c 2011/01/13 17:58:57 @@ -1606,8 +1606,10 @@ int delta; int number0; number0 = *number; + if (delta < 0 ? number0 < delta - INT_MIN : INT_MAX - delta < number0) + return 1; *number += delta; - return (*number < number0) != (delta < 0); + return 0; } static int @@ -1618,8 +1620,10 @@ int delta; long number0; number0 = *number; + if (delta < 0 ? number0 < delta - LONG_MIN : LONG_MAX - delta < number0) + return 1; *number += delta; - return (*number < number0) != (delta < 0); + return 0; } static int =================================================================== RCS file: RCS/zdump.c,v retrieving revision 2010.14 diff -pu -r2010.14 zdump.c --- zdump.c 2010/10/12 16:36:51 2010.14 +++ zdump.c 2011/01/13 18:32:27 @@ -17,6 +17,7 @@ static char elsieid[] = "@(#)zdump.c 8.1 #include "time.h" /* for struct tm */ #include "stdlib.h" /* for exit, malloc, atoi */ #include "float.h" /* for FLT_MAX and DBL_MAX */ +#include "limits.h" /* for INT_MAX, LONG_MAX, etc. */ #include "ctype.h" /* for isalpha et al. */ #ifndef isascii #define isascii(x) 1 @@ -426,21 +427,25 @@ _("%s: use of -v on system with floating } } else if (0 > (time_t) -1) { /* - ** time_t is signed. Assume overflow wraps around. + ** time_t is signed. */ - time_t t = 0; - time_t t1 = 1; - - while (t < t1) { - t = t1; - t1 = 2 * t1 + 1; + if (sizeof (time_t) == sizeof (int)) { + absolute_min_time = INT_MIN; + absolute_max_time = INT_MAX; + } else if (sizeof (time_t) == sizeof (long)) { + absolute_min_time = LONG_MIN; + absolute_max_time = LONG_MAX; +#if defined LLONG_MIN && defined LLONG_MAX + } else if (sizeof (time_t) == sizeof (long long)) { + absolute_min_time = LLONG_MIN; + absolute_max_time = LLONG_MAX; +#endif + } else { + (void) fprintf(stderr, +_("%s: use of -v on system with signed time_t whose extrema are unknown\n"), + progname); + exit(EXIT_FAILURE); } - - absolute_max_time = t; - t = -t; - absolute_min_time = t - 1; - if (t < absolute_min_time) - absolute_min_time = t; } else { /* ** time_t is unsigned. =================================================================== RCS file: RCS/zic.c,v retrieving revision 2010.14 diff -pu -r2010.14 zic.c --- zic.c 2010/10/12 17:37:29 2010.14 +++ zic.c 2011/01/13 18:18:35 @@ -2533,14 +2533,11 @@ oadd(t1, t2) const long t1; const long t2; { - register long t; - - t = t1 + t2; - if ((t2 > 0 && t <= t1) || (t2 < 0 && t >= t1)) { + if (t2 < 0 ? t1 < t2 - LONG_MIN : LONG_MAX - t2 < t1) { error(_("time overflow")); exit(EXIT_FAILURE); } - return t; + return t1 + t2; } static zic_t @@ -2548,18 +2545,15 @@ tadd(t1, t2) const zic_t t1; const long t2; { - register zic_t t; - if (t1 == max_time && t2 > 0) return max_time; if (t1 == min_time && t2 < 0) return min_time; - t = t1 + t2; - if ((t2 > 0 && t <= t1) || (t2 < 0 && t >= t1)) { + if (t2 < 0 ? t1 < t2 - LONG_MIN : LONG_MAX - t2 < t1) { error(_("time overflow")); exit(EXIT_FAILURE); } - return t; + return t1 + t2; } /*
Date: Thu, 13 Jan 2011 10:52:44 -0800 From: Paul Eggert <eggert@cs.ucla.edu> Message-ID: <4D2F49FC.8070601@cs.ucla.edu> | @@ -598,15 +598,12 @@ comptm(atmp, btmp) | register const struct tm * const atmp; | register const struct tm * const btmp; | { | - register int result; | - | - if ((result = (atmp->tm_year - btmp->tm_year)) == 0 && | - (result = (atmp->tm_mon - btmp->tm_mon)) == 0 && | - (result = (atmp->tm_mday - btmp->tm_mday)) == 0 && | - (result = (atmp->tm_hour - btmp->tm_hour)) == 0 && | - (result = (atmp->tm_min - btmp->tm_min)) == 0) | - result = atmp->tm_sec - btmp->tm_sec; | - return result; | + return atmp->tm_year != btmp->tm_year | + || atmp->tm_mon != btmp->tm_mon | + || atmp->tm_mday != btmp->tm_mday | + || atmp->tm_hour != btmp->tm_hour | + || atmp->tm_min != btmp->tm_min | + || atmp->tm_sec != btmp->tm_sec; If you're going to make that change, which is OK, as date.c doesn't need more than that, then you should probably also rename the function to make it clear that it is now an equality test for struct tm's, rather than a comparison of struct tm's which it was before (but which for its usage in date.c it did not need to be). You don't get to make the same kind of change to the very similar function in localtime.c though ... Most of it is also not needed, the struct tm's are normalised, which means that "atmp->tm_hour - btmp->tm_hour" cannot possibly underflow or overflow, the values cannot possibly be big enough (in magitude) - so we don't really need to take extremes to protect against overflow that the raw data types potentially would allow. | number0 = *number; | + if (delta < 0 ? number0 < delta - INT_MIN : INT_MAX - delta < number0) | + return 1; Surely the first test there should be number0 < INT_MIN - delta ? INT_MIN is a big (in magnitude) negative number, sutracting that from a relatively small (negative) number must generate a quite large positive number, almost any value for number0 is going to be smaller than that, leading to a false detection of underflow. | number0 = *number; | + if (delta < 0 ? number0 < delta - LONG_MIN : LONG_MAX - delta < number0) | + return 1; Ditto. I haven't checked the proposed mods to zic or zdump to see whether they're correct - in general I don't much llike pandering to this nonsense from the C standards people. The chances that C will ever be used in any meaningful way on any hardware where interger overflow doesn't wrap is close to 0 (just because all the existing software is written with that assumption, and no sane manufacturer is ever going to burden themselves with needing to fix all of it). So whatever the academic possibilities are for this (and some other - such as not assuming 2's complement for negative numbers - again, today no other choice is possible) the practical environment is that they're never going to happen, and what the standards should deal with is what we actually see in practice, not might be possible in the fevered imagination of some theoretician. kre
On 01/14/2011 08:43 PM, Robert Elz wrote:
If you're going to make that change, which is OK, as date.c doesn't need more than that, then you should probably also rename the function
Yes, that'd be fine.
Most of it is also not needed, the struct tm's are normalised, which means that "atmp->tm_hour - btmp->tm_hour" cannot possibly underflow or overflow,
Yes, when they're normalized, only the tm_year subtraction can overflow. I thought it a bit clearer to use "!=" everywhere if I was to use it with tm_year, but it's a minor detail.
| number0 = *number; | + if (delta < 0 ? number0 < delta - INT_MIN : INT_MAX - delta < number0) | + return 1;
Surely the first test there should be number0 < INT_MIN - delta ?
Yes, that's correct. Thanks for catching the typo, both there, and in the "delta - LONG_MIN" case.
The chances that C will ever be used in any meaningful way on any hardware where interger overflow doesn't wrap is close to 0
Well, I'm afraid that's not true these days. With the latest GCC on x86, the following program exits with status 0 if you compile with "gcc -O2": #include <limits.h> int x = INT_MAX; enum { delta = 1 }; int main (void) { return (x + delta < x) != (delta < 0); } even though it should exit with status 1 if integer overflow wraps around. "gcc -O2"-on-x86 is a pretty common platform.
Date: Sat, 15 Jan 2011 00:02:34 -0800 From: Paul Eggert <eggert@cs.ucla.edu> Message-ID: <4D31549A.7060402@cs.ucla.edu> | Well, I'm afraid that's not true these days. With the latest GCC on x86, That's only because gcc is taking advantage of what the C standards people say is OK - it isn't because i386's can't, or doesn't, do "normal" 32 (or 64) bit 2's complement arithmetic (just as easily, or more easily than anything else). And that behaviour can be disabled (and IMO, should be ...) In practice the optimisation is useless - the code it makes go away wouldn't have been written in the first place if it wasn't important, so when it happens it just makes us programmers work harder to prevent the compiler from taking useless advantage. kre
On Sat, Jan 15, 2011 at 04:28:17PM +0700, Robert Elz <kre@munnari.OZ.AU> wrote:
In practice the optimisation is useless - the code it makes go away wouldn't have been written in the first place if it wasn't important, so when it happens it just makes us programmers work harder to prevent the compiler from taking useless advantage.
Putting my gcc head on, I have to say it is far from useless :) These kind of things often happen during macro expansion. It's like saying, optimising away this is useless, because nobody would write such code: 0 < 0 ? ... but it might be part of an abs macro or just something more complicated where some arguments are constants. also, taking advantage of the undefined nature of (signed) integer overflow makes it possible to calculate the number of loop rounds in a lot more cases (because gcc doesn't have to handle the case where it wraps around and you suddenly have a lot more iterations), and the gains from loop optimisations can be substantial. here is an article by ian lance taylor with some examples: http://www.airs.com/blog/archives/120 note that this behaviour is with us for a long time already, and gcc does provide a switch to go into non-iso mode. one can disagree with this and ask for a default mode that inhibits optimisations that compilers of the good old 90ies didn't do, but one cannot say they are useless :) -- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / schmorp@schmorp.de -=====/_/_//_/\_,_/ /_/\_\
Does the code available at... http://www.cert.org/secure-coding/integralsecurity.html ...survive gcc's latest optimizations? --ado
On 02/14/2011 07:59 AM, Olson, Arthur David (NIH/NCI) [E] wrote:
Does the code available at... http://www.cert.org/secure-coding/integralsecurity.html ...survive gcc's latest optimizations?
I doubt whether anybody knows the answer to that question. Looking at the code, I'm not sure I'd trust that code all that much, as I found a systemic bug in it after five minutes' worth of investigation. In multiple places it naively assumes that integer division can't overflow, which of course is incorrect for two's complement arithmetic.
On Mon, 14 Feb 2011, Paul Eggert wrote:
On 02/14/2011 07:59 AM, Olson, Arthur David (NIH/NCI) [E] wrote:
Does the code available at... http://www.cert.org/secure-coding/integralsecurity.html ...survive gcc's latest optimizations?
I doubt whether anybody knows the answer to that question.
Looking at the code, I'm not sure I'd trust that code all that much, as I found a systemic bug in it after five minutes' worth of investigation. In multiple places it naively assumes that integer division can't overflow, which of course is incorrect for two's complement arithmetic.
And in addition to integer division being able to overflow, the modulo operation INT_MIN % -1 is also undefined in C for two's complement arithmetic (C1X makes this undefinedness explicit after the committee confirmed it was as intended). The code appears to ignore that as well. -- Joseph S. Myers jsm@polyomino.org.uk
participants (5)
-
Joseph S. Myers -
Marc Lehmann -
Olson, Arthur David (NIH/NCI) [E] -
Paul Eggert -
Robert Elz