tzcode: uninitialized sp->charcnt gives clang analyzer warning
When building tzcode commit 7f74206b39673b1a110285c6992f70507e211060 with -DALL_STATE, clang reports a loop with a undefined termination condition because sp->charcnt is not set before its use. There should probably be a sp->charcnt=0 in zoneinit.
clang --analyze -Xanalyzer -analyzer-output=text localtime.c -DALL_STATE localtime.c:321:16: warning: The right operand of '<' is a garbage value [core.UndefinedBinaryOperatorResult] for (i = 0; i < sp->charcnt; ++i) ^ localtime.c:2210:7: note: 'err' is 0 if (err) { ^~~ localtime.c:2210:3: note: Taking false branch if (err) { ^ localtime.c:2214:3: note: Calling 'tzset_unlocked' tzset_unlocked(); ^~~~~~~~~~~~~~~~ localtime.c:1383:13: note: Assuming 'name' is null int lcl = name ? strlen(name) < sizeof lcl_TZname : -1; ^~~~ localtime.c:1383:13: note: '?' condition is false localtime.c:1384:7: note: 'lcl' is < 0 if (lcl < 0 ^~~ localtime.c:1384:7: note: '?' condition is true localtime.c:1385:9: note: Assuming 'lcl_is_set' is >= 0 ? lcl_is_set < 0 ^~~~~~~~~~~~~~ localtime.c:1384:3: note: Taking false branch if (lcl < 0 ^ localtime.c:1389:7: note: Assuming 'sp' is null if (! sp) ^~~~ localtime.c:1389:3: note: Taking true branch if (! sp) ^ localtime.c:1390:19: note: Uninitialized value stored to field 'charcnt' lclptr = sp = malloc(sizeof *lclptr); ^~~~~~~~~~~~~~~~~~~~~~ localtime.c:1392:7: note: Assuming 'sp' is non-null if (sp) { ^~ localtime.c:1392:3: note: Taking true branch if (sp) { ^ localtime.c:1393:9: note: Calling 'zoneinit' if (zoneinit(sp, name) != 0) ^~~~~~~~~~~~~~~~~~ localtime.c:1355:7: note: 'name' is null if (name && ! name[0]) { ^~~~ localtime.c:1355:12: note: Left side of '&&' is false if (name && ! name[0]) { ^ localtime.c:1369:15: note: Calling 'tzload' int err = tzload(name, sp, true); ^~~~~~~~~~~~~~~~~~~~~~ localtime.c:739:7: note: Assuming 'lsp' is null if (!lsp) ^~~~ localtime.c:739:3: note: Taking true branch if (!lsp) ^ localtime.c:740:5: note: Returning without writing to 'sp->charcnt' return errno; ^ localtime.c:1369:15: note: Returning from 'tzload' int err = tzload(name, sp, true); ^~~~~~~~~~~~~~~~~~~~~~ localtime.c:1370:9: note: Assuming 'err' is equal to 0 if (err != 0 && name && name[0] != ':' && tzparse(name, sp, NULL)) ^~~~~~~~ localtime.c:1370:18: note: Left side of '&&' is false if (err != 0 && name && name[0] != ':' && tzparse(name, sp, NULL)) ^ localtime.c:1372:9: note: 'err' is equal to 0 if (err == 0) ^~~ localtime.c:1372:5: note: Taking true branch if (err == 0) ^ localtime.c:1373:7: note: Calling 'scrub_abbrs' scrub_abbrs(sp); ^~~~~~~~~~~~~~~ localtime.c:321:16: note: The right operand of '<' is a garbage value for (i = 0; i < sp->charcnt; ++i) ^ ~~~~~~~~~~~ localtime.c:1277:4: warning: Value stored to 'theiroffset' is never read [deadcode.DeadStores] theiroffset = theirstdoffset; ^ ~~~~~~~~~~~~~~ localtime.c:1277:4: note: Value stored to 'theiroffset' is never read theiroffset = theirstdoffset; ^ ~~~~~~~~~~~~~~ 2 warnings generated.
On 8/9/21 4:47 AM, Jan Engelhardt via tz wrote:
When building tzcode commit 7f74206b39673b1a110285c6992f70507e211060 with -DALL_STATE, clang reports a loop with a undefined termination condition because sp->charcnt is not set before its use. There should probably be a sp->charcnt=0 in zoneinit.
Thanks for reporting that. Unfortunately, though, this appears to be a false alarm from Clang, as I don't see how sp->charcnt can be used without being set, because If tzload returns zero then sp->charcnt must be set. If you see a reason it is not a false alarm, please let us know. Otherwise, it might be good to file a bug report with the Clang folks. I did try running clang with -fanalyzer and _DALL_STATE, as well as gcc with similar flags (something I don't normally do, as -fanalyzer is kinda slow), and fixed some minor glitches with the attached patch. This doesn't affect the problem you mention, though.
On Monday 2021-08-09 23:32, Paul Eggert wrote:
On 8/9/21 4:47 AM, Jan Engelhardt via tz wrote:
When building tzcode commit 7f74206b39673b1a110285c6992f70507e211060 with -DALL_STATE, clang reports a loop with a undefined termination condition because sp->charcnt is not set before its use. There should probably be a sp->charcnt=0 in zoneinit.
Thanks for reporting that. Unfortunately, though, this appears to be a false alarm from Clang, as I don't see how sp->charcnt can be used without being set, because If tzload returns zero then sp->charcnt must be set.
Inside tzload, if malloc fails, then, by POSIX standardese, it ought to set errno. However, clang - rightfully, I think - does not make any particular assumptions about malloc and has found and reported the case whereby this malloc returns with NULL _and_ errno is 0. (The malloc(3) page on Linux systems mentions the corner-cases in which errno=0 can happen, namely "private malloc implementations".) tzload then returns errno, which is 0 under these pretenses, thereby signalling to its caller that everything was fine, when it fact it wasn't. This is how clang then arrived at sp->charcnt being used without initialization. Using calloc instead of malloc, or just setting the field to zero, should have little ill effect, even cosmetically.
On Aug 9, 2021, at 3:55 PM, Jan Engelhardt via tz <tz@iana.org> wrote:
Inside tzload, if malloc fails, then, by POSIX standardese, it ought to set errno. However, clang - rightfully, I think - does not make any particular assumptions about malloc and has found and reported the case whereby this malloc returns with NULL _and_ errno is 0. (The malloc(3) page on Linux systems mentions the corner-cases in which errno=0 can happen, namely "private malloc implementations".)
tzload then returns errno, which is 0 under these pretenses, thereby signalling to its caller that everything was fine, when it fact it wasn't.
Then it should be fixed not to do so, e.g. by returning ENOMEM if errno is 0. (And, as far as I know, the tzdb code works on at least some non-Unix-like platforms; if so, "POSIX says so" isn't sufficient, as malloc() is a C library function, not solely a POSIX/UN*X function. C90 says If the size of the space requested is zero, the behavior is implementation-defined, the value returned shall be either a null pointer or a unique pointer. However, in this particular case, a sizeof value for a union pointed to by a pointer is passed to malloc(), with none of the union members being zero-sized, so the size of the space requested will not be zero, and thus a null return value should mean "allocation failed".)
Guy Harris via tz <tz@iana.org> writes:
On Aug 9, 2021, at 3:55 PM, Jan Engelhardt via tz <tz@iana.org> wrote:
tzload then returns errno, which is 0 under these pretenses, thereby signalling to its caller that everything was fine, when it fact it wasn't.
Then it should be fixed not to do so, e.g. by returning ENOMEM if errno is 0.
I'm not quite sure which part you're saying should be fixed. But it's entirely legal per C99 (not POSIX) for malloc not to bother setting errno. AFAICS, C99 doesn't even specify the existence of ENOMEM. regards, tom lane
On Aug 10, 2021, at 5:58 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Guy Harris via tz <tz@iana.org> writes:
On Aug 9, 2021, at 3:55 PM, Jan Engelhardt via tz <tz@iana.org> wrote:
tzload then returns errno, which is 0 under these pretenses, thereby signalling to its caller that everything was fine, when it fact it wasn't.
Then it should be fixed not to do so, e.g. by returning ENOMEM if errno is 0.
I'm not quite sure which part you're saying should be fixed. But it's entirely legal per C99 (not POSIX) for malloc not to bother setting errno. AFAICS, C99 doesn't even specify the existence of ENOMEM.
Then a routine that calls malloc() must not return 0 on success and the value of errno on an error. Instead, it should do something such as /* Load tz data from the file named NAME into *SP. Read extended format if DOEXTEND. Return 0 on success, an errno value on failure. */ static int tzload(char const *name, struct state *sp, bool doextend) { #ifdef ALL_STATE union local_storage *lsp = malloc(sizeof *lsp); if (!lsp) { return HAVE_MALLOC_ERRNO ? errno : ENOMEM; } else { int err = tzloadbody(name, sp, doextend, lsp); free(lsp); return err; } #else union local_storage ls; return tzloadbody(name, sp, doextend, &ls); #endif } which is, in fact, what the tip of the main branch is currently doing. That ensures that tzload() always returns a non-zero value if malloc() fails, so that its caller can distinguish between "malloc succeeded" and "malloc failed". See Git commit d9b364304b9f56e7c94252e84829efba3804417b in the tz Git repository for the full change.
On 8/9/21 3:55 PM, Jan Engelhardt wrote:
Inside tzload, if malloc fails, then, by POSIX standardese, it ought to set errno. However, clang - rightfully, I think - does not make any particular assumptions about malloc and has found and reported the case whereby this malloc returns with NULL _and_ errno is 0.
localtime.c assumes POSIX behavior for other functions (e.g., 'open', 'read'), which may help explain why assuming POSIX for 'malloc' hasn't been a problem in practice. Still, I take your point that there could be platforms that support POSIX 'open', 'read', etc. but not POSIX 'malloc', or static analyzers like 'clang' that assume that the platform's malloc does not conform to POSIX. So I took the usual way out in the spirit of HAVE_POSIX_DECLS etc. by adding a compile-time option HAVE_MALLOC_ERRNO which you can set to 0 if your platform's malloc departs from standard practice. See the attached proposed patches. With these patches you should be able to run clang this way: clang --analyze -Xanalyzer -analyzer-output=text localtime.c \ -DALL_STATE -DHAVE_MALLOC_ERRNO=0 and get a clean report.
The malloc(3) page on Linux systems mentions the corner-cases in which errno=0 can happen, namely "private malloc implementations".)
That man page is outdated, as it was written when the GNU C Library was looser and goosier about memory allocation. Your email prompted me to send in a patch here: https://lore.kernel.org/linux-man/20210810193708.10277-1-eggert@cs.ucla.edu/ and I expect the man page to be updated in due course. The glibc manual says that any memory allocator replacement functions must be compatible with glibc, right down to errno. Among other things, it says "for example, the replacement 'free' should also preserve 'errno'"; this is a property of GNU 'free' that I believe will appear in the next POSIX revision.
Using calloc instead of malloc, or just setting the field to zero, should have little ill effect, even cosmetically.
I can see an ill effect, in that clearing storage merely to pacify static analysis false alarms can mask other errors. Valid reports of uninitialized variables can be useful to developers, and invariably clearing storage to zero can suppress these useful reports and can lead to less-reliable software.
On Tuesday 2021-08-10 21:51, Paul Eggert wrote:
On 8/9/21 3:55 PM, Jan Engelhardt wrote:
Inside tzload, if malloc fails, then, by POSIX standardese, it ought to set errno. However, clang - rightfully, I think - does not make any particular assumptions about malloc and has found and reported the case whereby this malloc returns with NULL _and_ errno is 0.
So I took the usual way out in the spirit of HAVE_POSIX_DECLS etc. by adding a compile-time option HAVE_MALLOC_ERRNO which you can set to 0 if your platform's malloc departs from standard practice. See the attached proposed patches. With these patches you should be able to run clang this way:
clang --analyze -Xanalyzer -analyzer-output=text localtime.c \ -DALL_STATE -DHAVE_MALLOC_ERRNO=0
and get a clean report.
Certainly not. Previously, there was a return errno; now in 5c79ca1 there is a return HAVE_MALLOC_ERRNO ? errno : ENOMEM; HAVE_MALLOC_ERRNO is a compile-time constant (set to 1 on POSIXy) so you really just have the same thing as before. If you have not yet been notified by github, my original (counter)proposal is in https://github.com/eggert/tz/pull/28 . *That* clears the clang report. It also does without any new compile-time define. It is easy to identify, as there just is no good reason for malloc(non-zero) to return NULL and not set errno - whether POSIX or not.
On 8/10/21 2:41 PM, Jan Engelhardt wrote:
With these patches you should be able to run clang this way:
clang --analyze -Xanalyzer -analyzer-output=text localtime.c \ -DALL_STATE -DHAVE_MALLOC_ERRNO=0
and get a clean report.
Certainly not.
Did you actually try it? It worked for me. I am using clang 12.0.0 (Fedora 12.0.0-2.fc34) on x86-64. The key is that -DHAVE_MALLOC_ERRNO=0 option.
If you have not yet been notified by github, my original (counter)proposal is in https://github.com/eggert/tz/pull/28 .
Patches like those are best circulated on this mailing list, as I don't often look at GitHub requests. (I wish there was some way to for GitHub to inform patch-submitters of this, but there doesn't seem to be.) Anyway, the counterproposal doesn't solve the problem portably. On platforms where malloc does not conform to POSIX, errno might be garbage after malloc fails, and tzalloc etc. shouldn't communicate that garbage to its callers when tzalloc fails in turn. Conversely, an alternative approach in which tzalloc always sets errno to ENOMEM on malloc failure would lose useful information on POSIX platforms where malloc failures can yield errno values other than ENOMEM. Something like HAVE_MALLOC_ERRNO is needed if localtime.c both (a) properly reports valid errno info and (b) ports to POSIX as well as to these non-POSIX platforms. This is because localtime.c's code cannot reliably distinguish between the two kinds of platforms simply by using runtime tests.
Paul Eggert via tz <tz@iana.org> wrote on Tue, 10 Aug 2021 at 19:11:21 EDT in <5d0f0b15-0f0f-8d67-3da5-d5fbcdbb6989@cs.ucla.edu>:
Patches like those are best circulated on this mailing list, as I don't often look at GitHub requests. (I wish there was some way to for GitHub to inform patch-submitters of this, but there doesn't seem to be.)
As noted by me on the list last year nearly a year ago: https://docs.github.com/en/github/building-a-strong-community/creating-a-pul... looks like I accidently dropped the final character in a copy/paste error, sorry about that:
Date: Sat, 15 Aug 2020 18:52:58 -0400 From: John Hawkinson <jhawk@alum.mit.edu> To: Paul Eggert <eggert@cs.ucla.edu>, Victor Perov <vic.gald@gmail.com> Cc: Time zone mailing list <tz@iana.org> Subject: Re: [tz] Change Request: Europe/Kiev to Europe/Kyiv Message-ID: <20200815225258.GA62055@alum.mit.edu> ... Paul Eggert <eggert@cs.ucla.edu> wrote on Sat, 15 Aug 2020 at 18:19:08 EDT in <cefc7f01-a3e9-f2a4-9ab9-5288147f994a@cs.ucla.edu>:
Thanks for bringing that to the mailing list (if I could shut off GitHub pull requests I would).
Although you cannot do so, we can handle them a little bit better. Please see https://docs.github.com/en/github/building-a-strong-community/creating-a-pul...
You can create a template that explains to a would-be pull request submitter than pull requests will be closed and the project uses the mailing list for development discussions. ("I understand that the tz project does not use pull requests on github, and that if I enter a pull request here, it will be closed and I will be directed to send email to tz@iana.org.")
-- jhawk@alum.mit.edu John Hawkinson
On Wednesday 2021-08-11 01:11, Paul Eggert wrote:
With these patches you should be able to run clang this way:
clang --analyze -Xanalyzer -analyzer-output=text localtime.c \ -DALL_STATE -DHAVE_MALLOC_ERRNO=0
and get a clean report.
Certainly not.
Did you actually try it? It worked for me. I am using clang 12.0.0 (Fedora 12.0.0-2.fc34) on x86-64. The key is that -DHAVE_MALLOC_ERRNO=0 option.
But I have a platform where malloc does set errno on failure, hence I am looking (only) at the -DHAVE_MALLOC_ERRNO=1 configuration and what clang outputs there.
On 8/11/21 1:47 AM, Jan Engelhardt wrote:
But I have a platform where malloc does set errno on failure, hence I am looking (only) at the -DHAVE_MALLOC_ERRNO=1 configuration and what clang outputs there.
Although the platform's malloc sets errno on failure, the static analyzer incorrectly assumes otherwise. You can work around this problem by compiling with -DHAVE_MALLOC_ERRNO=1 (the default) for the platform, and by running the static analyzer with -DHAVE_MALLOC_ERRNO=0. Or, if it's an absolute requirement to do static analysis and compilation with the same flags and to get 100% clean reports, then use -DHAVE_MALLOC_ERRNO=0 for both compilation and static analysis: although this might lose some errno information at runtime, that's less important than an absolute requirement and it's better than propagating junk errno values. Alternatively, you can write a script to remove the incorrect static-analyzer diagnostic, or simply ignore the diagnostic; this is a very common thing to do in such situations. Of course it would be better if the static analyzer didn't make incorrect assumptions about the underlying platform. A bug report to the Clang maintainers would be in order, if this problem is sufficiently annoying. There's nothing unusual about this sort of thing. I've run many static analyses using Coverity, GCC, Clang, etc. and there are almost invariably glitches where static analysis issues false alarms. And although I've sent in my fair share of bug reports, this area continues to be buggy. The thing to remember in cases like these is that static analysis should be one's servant, not one's master.
participants (5)
-
Guy Harris -
Jan Engelhardt -
John Hawkinson -
Paul Eggert -
Tom Lane