From: Robert Elz [mailto:kre@munnari.OZ.AU]
Sent: Tuesday, December 06, 2005 10:20 PM

| ! while (isascii((unsigned char) *cp) &&
You sometimes do better if you write that as
while (isascii(*(unsigned char *)cp) &&
It can also be a little clearer what you're intending - there's no intention here to fetch the char, then convert it to unsigned, all we want is the 0..255 value that cp points at.
If memory serves, the latter form (*(unsigned char *)cp) is not portable to all C89 hosts, whereas the former form ((unsigned char) *cp) is. The idea is that some C89 hosts might have padding bits in their unsigned char representation, and it's incorrect to access a char as if it were unsigned char. I believe this issue got cleared up in C99, so the code is portable to C99 compilers. But the zic stuff attempts to be portable to C89 (as well as earlier) compilers.
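To make the two forms concrete, here is a minimal sketch that puts each one in its own function. The function names are hypothetical and appear nowhere in the zic sources. On the common hosts where unsigned char has no padding bits and char is 8 bits wide (an assumption of this sketch), both return the same 0..255 value.

```c
/* Hypothetical helpers for illustration only; not part of zic. */

/* Former form: fetch the char, then convert its value to unsigned char.
   The conversion is defined for every char value, so this is the form
   described above as portable to C89 hosts. */
static int byte_via_value_cast(const char *cp)
{
    return (unsigned char) *cp;
}

/* Latter form: convert the pointer, then read the object as unsigned char.
   This assumes the host lets a char object be read through an
   unsigned char lvalue with the expected result. */
static int byte_via_pointer_cast(const char *cp)
{
    return *(const unsigned char *) cp;
}
```

Either result is safe to hand to isascii or isdigit, since both are nonnegative ints.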
it is certainly true that it's possible to test for digits by using >= '0' && <= '9' tests - but if that's the best way to write it, then that's what isdigit() ought to be doing.
Alas, that's not true in practice. isdigit is typically slower, and it can be quite a bit slower. For example, on my host (Debian GNU/Linux stable, GCC 4.0.2, gcc -O4), with the following code:

    int F (char *p) { return isdigit ((unsigned char) *p) != 0; }
    int G (char *p) { return '0' <= *p && *p <= '9'; }

F compiles into 12 instructions that contain a subroutine call (for a total of 31 instructions executed), whereas G compiles into 10 instructions of straight-line code.

I think part of the problem is that isdigit might be sensitive to the locale. So there's a correctness issue here as well; isdigit might actually return the wrong value, since it might think that some other byte code is a digit. (This is just a theoretical issue, as far as I know, though.)
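A rough sketch of how the range test could be packaged as a helper follows. The names is_decimal_digit and count_leading_digits are hypothetical, not anything in the zic sources. It relies on the C standard's guarantee that '0' through '9' have consecutive values in the execution character set, so the comparison does not depend on the locale or on whether plain char is signed.

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if c is one of '0'..'9'.
   No cast is needed, because the comparison is against character
   constants rather than an index into a lookup table. */
static int is_decimal_digit(char c)
{
    return '0' <= c && c <= '9';
}

/* Hypothetical usage: count the decimal digits at the start of s. */
static size_t count_leading_digits(const char *s)
{
    size_t n = 0;
    while (is_decimal_digit(s[n]))
        n++;
    return n;
}
```

The loop stops at the terminating null byte on its own, because '\0' is not in the '0'..'9' range.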
Paul's version may be textually shorter, but with that cp++ side effect buried in the middle of the && sequence, it is not nearly as easy to read.
True, but in my defense that buried cp++ was in the original code. How about this instead? It might be a bit clearer.

    char c = *cp;
    if ('0' <= c && c <= '9') {
        cp++;
        if (c == '1' && '0' <= *cp && *cp <= '4')
            cp++;
    }
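For anyone who wants to try the revised logic in isolation, here is the same fragment wrapped in a function. The wrapper and its name skip_small_number are hypothetical; the surrounding code in the actual source is not shown in this thread. It skips one digit, or two digits when the first is '1' and the second is '0' through '4'.

```c
/* Hypothetical wrapper around the fragment above; not the zic code.
   Returns a pointer just past the digits that were skipped, or cp
   itself if the string does not start with a digit. */
static const char *skip_small_number(const char *cp)
{
    char c = *cp;
    if ('0' <= c && c <= '9') {
        cp++;
        /* A second digit is consumed only after a leading '1',
           and only if it is '0'..'4'. */
        if (c == '1' && '0' <= *cp && *cp <= '4')
            cp++;
    }
    return cp;
}
```

Under this sketch, "14x" is left pointing at 'x', while "15" and "20" each advance by one character only.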