A NetBSD user reported fsck dumping core (when attempting to correct a corrupted filesystem). That's not nice. Upon investigation, the core dump turns out to be from within asctime_r() (as called via asctime() from ctime()). The code in fsck just does p = ctime(&t); where "t" contains one of the time fields from an (already determined to be corrupted, or we wouldn't be printing) inode. ctime() is (or course) just asctime(locatime(t)) locatime() can return NULL these days, and given a nonsense time value (which is quite likely under these circumstances), probably did, but asctime() doesn't expect anything but a valus "struct tm *" and will simply dereference NULL is NULL is returned from localtime() - instant core. It would be possible to make ctime() be tm = localtime(t); if (tm == NULL) return NULL; return asctime(tm); and in theory that would be adequate, but I'm actually going to suggest moving the fix into asctime() (into asctime_r() really of course), and have asctime_r() start if (timeptr == NULL) return NULL; I think that's a safer fix. Appended is the patch I suggest, made against the tzcode2009t sources (latest I believe.) kre
Sorry for all the silly typos in that message... But I think you can (all) work out what I meant to type. kre
Robert Elz <kre <at> munnari.OZ.AU> writes:
locatime() can return NULL these days, and given a nonsense time value (which is quite likely under these circumstances), probably did, but asctime() doesn't expect anything but a valus "struct tm *" and will simply dereference NULL is NULL is returned from localtime() - instant core.
Ouch, that is a good one. I don't know if these are FreeBSD specific, but the following ones take the "struct tm *" without checking the values neither: * time1() via mktime(), gmtime(). * asctime_r(). Edwin
Date: Sat, 30 Jan 2010 21:22:41 +0000 (UTC) From: Edwin Groothuis <edwin@mavetju.org> Message-ID: <loom.20100130T221814-351@post.gmane.org> | I don't know if these are FreeBSD specific, They aren't. | but the following ones take the | "struct tm *" without checking the values neither: | | * time1() via mktime(), gmtime(). Those are OK, that "struct tm" comes from the user, if the user wants their program to crash, they can pass in any bogus value they want. The only internal calls of those functions I can see (eg: from in strftime()) never pass a NULL value in (since the struct tm there is the starting point, it is typically explicitly allocated (declared as a variable) rather than obtained from one of the other functions). | * asctime_r(). Yes, that's the one (the asctime() variant interface isn't interesting itself) - technically asctime() (or asctime_r()) shouldn't need to verify its input parameters either - it is only because it is called internally that makes me suggest it (and which is why the problem could also be fixed in ctime()). I'm suggesting adding this check into asctime() (into asctime_r() really of course) because code like is in ctime() has been a programming idiom on unix systems for almost 40 years now - there will be lots of code that simply assumes that localtime() (etc) cannot fail, and so does asctime(localtime(&t)) or similar. kre
On Sat, 30 Jan 2010, Robert Elz wrote:
ctime() is (of course) just asctime(locatime(t))
locatime() can return NULL these days, and given a nonsense time value (which is quite likely under these circumstances), probably did, but asctime() doesn't expect anything but a valus "struct tm *" and will simply dereference NULL is NULL is returned from localtime() - instant core.
This code follows the C standard, so this bug is a defect in the standard. It has not been corrected in the latest committee draft. http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1425.pdf Tony. -- f.anthony.n.finch <dot@dotat.at> http://dotat.at/ GERMAN BIGHT HUMBER: SOUTHWEST 5 TO 7. MODERATE OR ROUGH. SQUALLY SHOWERS. MODERATE OR GOOD.
On Sat, Jan 30, 2010 at 5:37 PM, Tony Finch <dot@dotat.at> wrote:
On Sat, 30 Jan 2010, Robert Elz wrote:
locatime() can return NULL these days, ...
This code follows the C standard, so this bug is a defect in the standard.
How so? It's not the job of the standard to specify coding practices. If a function can return a null pointer, programmers need to check for that. Is it "a defect in the standard" that malloc() can return NULL? Or do you mean something else that I'm missing? --Bill Seymour
On Sat, Jan 30, 2010 at 07:21:17PM -0600, Bill Seymour <stdbill.h@pobox.com> wrote:
This code follows the C standard, so this bug is a defect in the standard.
How so? It's not the job of the standard to specify coding practices. If a function can return a null pointer, programmers need to check for that.
Description [#2] The ctime function converts the calendar time pointed to by timer to local time in the form of a string. It is equivalent to asctime(localtime(timer))
Is it "a defect in the standard" that malloc() can return NULL?
Or do you mean something else that I'm missing?
Yes, you should probably read the standard first before commenting about it :) -- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / schmorp@schmorp.de -=====/_/_//_/\_,_/ /_/\_\
On Sat, 30 Jan 2010, Bill Seymour wrote:
On Sat, Jan 30, 2010 at 5:37 PM, Tony Finch <dot@dotat.at> wrote:
On Sat, 30 Jan 2010, Robert Elz wrote:
locatime() can return NULL these days, ...
This code follows the C standard, so this bug is a defect in the standard.
How so? It's not the job of the standard to specify coding practices. If a function can return a null pointer, programmers need to check for that.
The standard specifies the presence of the bug that Robert reported: Section 7.23.3.2 para. 2: The ctime function converts the calendar time pointed to by timer to local time in the form of a string. It is equivalent to asctime(localtime(timer)) Section 7.23.3.4 para. 4: The localtime function returns a pointer to the broken-down time, or a null pointer if the specified time cannot be converted to local time. Section 7.23.3.1 specifies the behaviour of asctime() in terms of source code which does not check for a NULL argument. The fact that localtime() can return NULL but neither ctime() nor asctime() checks for this is exactly the bug that Robert reported and proposed a fix for. You might argue that this bug falls under the general "garbage in garbage out" clause (quoted below). I think that's pretty unsatisfactory given that it leads to the inconsistency that localtime() checks its argument but ctime() does not. (section 7.1.4) If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined. Tony. -- f.anthony.n.finch <dot@dotat.at> http://dotat.at/ GERMAN BIGHT HUMBER: SOUTHWEST 5 TO 7. MODERATE OR ROUGH. SQUALLY SHOWERS. MODERATE OR GOOD.
I've just round to looking at this discussion. Tony Finch said:
On Sat, 30 Jan 2010, Bill Seymour wrote:
On Sat, Jan 30, 2010 at 5:37 PM, Tony Finch <dot@dotat.at> wrote:
On Sat, 30 Jan 2010, Robert Elz wrote:
locatime() can return NULL these days, ...
This code follows the C standard, so this bug is a defect in the standard.
How so? It's not the job of the standard to specify coding practices. If a function can return a null pointer, programmers need to check for that.
The standard specifies the presence of the bug that Robert reported:
Section 7.23.3.2 para. 2:
The ctime function converts the calendar time pointed to by timer to local time in the form of a string. It is equivalent to
asctime(localtime(timer))
Section 7.23.3.4 para. 4:
The localtime function returns a pointer to the broken-down time, or a null pointer if the specified time cannot be converted to local time.
Section 7.23.3.1 specifies the behaviour of asctime() in terms of source code which does not check for a NULL argument.
The fact that localtime() can return NULL but neither ctime() nor asctime() checks for this is exactly the bug that Robert reported and proposed a fix for.
This situation happens when localtime is not able to convert the specified time to a broken-down time. This means that the notional broken-down time being supplied to asctime is nonsensical. WG14 has already decided (defect report 217) that asctime's behaviour is undefined if the input cannot be represented according to the given algorithm - for example, if the year is outside the range -999 to 9999. This is just another such situation. The answer - if anyone asked them - is likely to be that you shouldn't call ctime if you can't be sure it's representable. Nothing stops a given implementation making this work, but nothing requires it to either. -- Clive D.W. Feather | If you lie to the compiler, Email: clive@davros.org | it will get its revenge. Web: http://www.davros.org | - Henry Spencer Mobile: +44 7973 377646
It would be possible to make ctime() be
tm = localtime(t); if (tm == NULL) return NULL; return asctime(tm);
and in theory that would be adequate, but I'm actually going to suggest moving the fix into asctime() (into asctime_r() really of course), and have asctime_r() start
if (timeptr == NULL) return NULL;
I think that's a safer fix. Appended is the patch I suggest, made against the tzcode2009t sources (latest I believe.)
Any sentiment for (or against)... if (timeptr == NULL) return "?"; ...or... return "??? ??? ?? ??:??:?? ????\n"; ...or some variant? --ado
Date: Tue, 2 Feb 2010 17:10:57 -0500 From: "Olson, Arthur David (NIH/NCI) [E]" <olsona@dc37a.nci.nih.gov> Message-ID: <996D816825CFEA469870126E9050D3F0AA320C96@NIHMLBX11.nih.gov> | Any sentiment for (or against)... Technically, it is undefined, so anything is OK. If you prefer a non-null result, then I think it should be a 26 char buffer, not just "?" though. kre
On Tue, Feb 2, 2010 at 2:10 PM, Olson, Arthur David (NIH/NCI) [E] < olsona@dc37a.nci.nih.gov> wrote:
It would be possible to make ctime() be
tm = localtime(t); if (tm == NULL) return NULL; return asctime(tm);
and in theory that would be adequate, but I'm actually going to suggest moving the fix into asctime() (into asctime_r() really of course), and have asctime_r() start
if (timeptr == NULL) return NULL;
I think that's a safer fix. Appended is the patch I suggest, made against the tzcode2009t sources (latest I believe.)
Any sentiment for (or against)... if (timeptr == NULL) return "?"; ...or... return "??? ??? ?? ??:??:?? ????\n"; ...or some variant?
AS Robert Elz said, if you are going to use a non-null pointer, then I suggest the 26-character string. As to the contents - I have no problem with question marks followed by a newline; I think the newline at the end is a good idea as people may well rely on its presence to format their information. -- Jonathan Leffler <jonathan.leffler@gmail.com> #include <disclaimer.h> Guardian of DBD::Informix - v2008.0513 - http://dbi.perl.org "Blessed are we who can laugh at ourselves, for we shall never cease to be amused."
If you all recall ... | locatime() can return NULL these days, and given a nonsense time | value (which is quite likely under these circumstances), probably did, it turns out that wasn't the cause, the time_t value being converted was perfectly sane -- there's another bug. The fix that's currently proposed is fine, and needed, so this doesn't impact upon that, nor does it need to delay the release of the code with the fix on Tuesday (US time) - but it wouldn't hurt if this bug was fixed at the same time. Recall from the earlier mail that the core dump happened in single user mode while attempting to fix a filesystem with fsck_ffs (one that had corrupted inodes that needed fixing). It was from that I assumed that the underlying cause was probably a bogus time value that was part of garbage in an inode. But, of course, there's another difference here - in single user mode (at least on NetBSD) the timezone files are not available, or might not be, and in this case weren't (when /usr was mounted, the core dump didn't happen). From that I assumed (and yes, this was a stretch) that the bogus time value must have been one which was valid in the local timezone (that's UTC-0500 I think for the guy who had the problem), but invalid in GMT (perhaps subtracting 5 hours was just enough to move the time back enough so it would fit in a struct tm.). Unlikely, of course, but possible... And nonsense of course (well, it could have happened - it just didn't). NetBSD compiles localtime with ALL_STATE defined (this is important) (also USG_COMPAT, that one is irrelevant I believe.) The actual cause of the bug is simpler, someone was silly enough to write a line of code ... return NULL; /* "cannot happen" */ which of course guaranteed that it would, and it did... I'm tempted to say that the fix is to remove the comment, after which it will never happen again! But no. The execution path (relevant functions that are called, in order) is ctime -> localtime -> tzset -> tzsetwall -> tzload -> gmtload -> tzload -> tzparse -> tzload -> localsub -> BANG (To be strictly correct, NetBSD's code is slightly modified, using different names for the some of the internal functions, in order to deal with thread locking issues - but none of that is relevant to what's going on, so I'll just use the names from tzcode2009t). The immediate cause of the bug was the test ... if ((sp->goback && t < sp->ats[0]) || (sp->goahead && t > sp->ats[sp->timecnt - 1])) { At this point, sp->timecount was 0 (no time data was loaded), but sp->goback was true (not sure about sp->goahead, didn't need to investigate that one) and sp->ats[0] was a very large value (much much bigger than t). The reason for that is found in a combination of tzsetwall() and tzload(). tzsetwall(void) { if (lcl_is_set < 0) return; lcl_is_set = -1; #ifdef ALL_STATE if (lclptr == NULL) { lclptr = (struct state *) malloc(sizeof *lclptr); if (lclptr == NULL) { settzname(); /* all we can do */ return; } } #endif /* defined ALL_STATE */ if (tzload((char *) NULL, lclptr, TRUE) != 0) gmtload(lclptr); settzname(); } and then tzload() (with comments deleted to save space here). tzload(name, sp, doextend) register const char * name; register struct state * const sp; register const int doextend; { register const char * p; register int i; register int fid; register int stored; register int nread; union { struct tzhead tzhead; char buf[2 * sizeof(struct tzhead) + 2 * sizeof *sp + 4 * TZ_MAX_TIMES]; } u; if (name == NULL && (name = TZDEFAULT) == NULL) return -1; { register int doaccess; char fullname[FILENAME_MAX + 1]; if (name[0] == ':') ++name; doaccess = name[0] == '/'; if (!doaccess) { if ((p = TZDIR) == NULL) return -1; if ((strlen(p) + strlen(name) + 1) >= sizeof fullname) return -1; (void) strcpy(fullname, p); (void) strcat(fullname, "/"); (void) strcat(fullname, name); if (strchr(name, '.') != NULL) doaccess = TRUE; name = fullname; } if (doaccess && access(name, R_OK) != 0) return -1; if ((fid = open(name, OPEN_MODE)) == -1) return -1; } the rest of the function isn't relevant, as in the circumstances that apply, the open() always fails, as there are no zone files anywhere to open, so that "return -1" always happens. Lower down in tzload() we find ... sp->goback = sp->goahead = FALSE; but that never gets executed - in fact, almost none of the state is ever initialised, just a little in tzparse() ... dstlen = 0; sp->typecnt = 1; /* only standard time */ sp->timecnt = 0; sp->ttis[0].tt_gmtoff = -stdoffset; sp->ttis[0].tt_isdst = 0; sp->ttis[0].tt_abbrind = 0; No mention there if goback, goahead, or ats. Often, we're lucky, including in all the tests I was able to make that were small enough to actually debug, and the space returned by malloc() is all zero, and we don't see a problem. So to force this I explicitly set lclptr->ats[0] = 0xFFFFFFFFFFLL; lclptr->goback = 1; immediately after the malloc() (after lclptr is tested for non NULL of course) and sure enough, BANG... Moving the sp->goback = sp->goahead = FALSE; from way down near the end of tzload() to way up the front, so it always gets done is enough to fix this for me. I put it before even the test of name, just in case (though that one can only return if TZDEFAULT is NULL, and as that's a constant, that's unlikely!) Whether there's anything else that also needs initialising I'm not sure at the minute. Another fix (I think, I didn't try this one) would be to test sp->timecount in the if ((sp->goback && t < sp->ats[0]) || (sp->goahead && t > sp->ats[sp->timecnt - 1])) { test, it is a little rude to be enen contemplating accessing sp->ats[sp->timecnt - 1] if sp->timecount == 0 after all. That is, something like if (sp->timecount > 0 && ((sp->goback && t < sp->ats[0]) || (sp->goahead && t > sp->ats[sp->timecnt - 1]))) { Alternatives would be to init these in tzparse() with the others, or to simply memset() the state to 0's after the malloc (or use calloc()) so we always have a nice clean state array. I'll leave it for someone else to decide which way of fixing this you prefer, but this is definitely a bug that needs fixing. kre
A conservative course is to memset the state structure to all zeroes at the top of "tzload" (as below). An alternative is to change the three "malloc(sizeof *pointer)" calls to "calloc(1, sizeof *pointer) calls in localtime.c. I'm feeling conservative on this one. --ado ------- localtime.c ------- *** /tmp/geta8774 Tue Feb 16 17:24:44 2010 --- /tmp/getb8774 Tue Feb 16 17:24:44 2010 *************** *** 5,11 **** #ifndef lint #ifndef NOID ! static char elsieid[] = "@(#)localtime.c 8.10"; #endif /* !defined NOID */ #endif /* !defined lint */ --- 5,11 ---- #ifndef lint #ifndef NOID ! static char elsieid[] = "@(#)localtime.c 8.11"; #endif /* !defined NOID */ #endif /* !defined lint */ *************** *** 350,355 **** --- 350,356 ---- 4 * TZ_MAX_TIMES]; } u; + memset(sp, 0, sizeof *sp); if (name == NULL && (name = TZDEFAULT) == NULL) return -1; {
Date: Tue, 16 Feb 2010 17:33:28 -0500 From: "Olson, Arthur David (NIH/NCI) [E]" <olsona@dc37a.nci.nih.gov> Message-ID: <996D816825CFEA469870126E9050D3F0B0A1C49E@NIHMLBX11.nih.gov> | A conservative course is to memset the state structure to all | zeroes at the top of "tzload" (as below). That's probably overkill, tzload() can end up being called several times. | An alternative is to change the three "malloc(sizeof *pointer)" | calls to "calloc(1, sizeof *pointer) calls in localtime.c. That's what the NetBSD solution was. I think that's better, clearing the struct once, when it is allocated, is enough, and what's more, allowing calloc() to do it means that you get the benefit of any libc private method of getting all those 0's, rather than a loop writing a byte at a time. (Yes, I know memset() can do better than that - but to achieve that it needs at least a bunch of tests to handle all the odd cases - calloc() doesn't.) | I'm feeling conservative on this one. Yes, with that in mind, the NetBSD people also moved the init of goahead and goback up to the start of tzload() (where you proposed putting the memset()) to make sure they always get set correctly. That's probably worth doing as well - make sure that tzload() (which is responsible for setting those values) never returns without having explicitly set them. kre
participants (8)
-
Bill Seymour -
Clive D.W. Feather -
Edwin Groothuis -
Jonathan Leffler -
Marc Lehmann -
Olson, Arthur David (NIH/NCI) [E] -
Robert Elz -
Tony Finch