Re: Difftime code

Aug. 7, 2004

      Paul Eggert said:
...
...
...
The idea behind the sizeof test is to avoid "long double" if it's safe
to do so, since long double is expensive on some hosts.
Is it *that* expensive? I thought you were trying to avoid rounding errors.
Avoiding rounding errors is the primary goal, but efficiency is also
important.
Well, your choice.
...
...
You might be better off comparing time1 and time0 with DBL_EPSILON or use
the maxLDint trick I described.
I'd rather avoid run-time tests.  That is, I would rather have the
normal case execute code without any conditional branches.
Hmm.
...
...
But if you're worried about efficiency, why are you doing this in floating
point when they're integers?
On the modern processors where this code typically executes, floating
point is pretty fast.  (Except maybe long double, I suppose.)
I'm surprised that long double takes more than twice a normal double or
Much more than the function call, particularly if:
...
Conditional branches hurt performance, though.
I'm obviously out of date.
...
...
The problem occurs when time_t is signed and the maximum value of time_t
is the same as the maximum value of uintmax_t.
I see.  But this will happen only on weird hosts with padding bits, right?
Not that weird.
...
On other hosts (i.e. the vast majority) we don't need to check for overflow.
In particular, it only happens if:

    #if UINTMAX_MAX == INTMAX_MAX

(so if you have those symbols, or ULONG_MAX == LONG_MAX on C89, you can
test at compile time.
...
...
s4 = (time1 >> 2) - time0 / 4;   // time0 < 0, so >> is a bad idea
We can't use >> here, since time_t might be a floating type.
Oops. The code will only get executed if time_t is integer, but you're
right. Silly me.

A test for time1 >= 0 would be useful here; firstly, if it's false,
overflow can't happen. Secondly, it it's true, the compiler has enough
information to know that /4 can be converted to >>2.
...
I looked at your code more carefully and have the following thoughts.
  * The idea of adding 2 * ~((top_type)-1 >> 1) is basically the same
    idea as the current difftime's approach of subtracting 2 *
    (long_double) hibit, except it's being done with unsigned
    arithmetic and therefore avoids integer-overflow problems.
Right. Remember that overflow in signed integers is allowed to produce
*any* answer or even raise a signal.
...
* I suspect the add-the-high-bit approach will suffer from
    double-rounding on some theoretical platforms (given that it
    clearly suffers from similar problems in the existing difftime.c),
    but I know of no actual platforms where double-rounding will occur
    so it's hard to test whether it will introduce an error.
Not my area of expertise.
...
* I didn't understand why you computed time1 / 4 - time0 / 4.
    Presumably this was to avoid overflow, but surely
    it suffices to divide by 2, and compute time1 / 2 - time0 / 2.
At the time I wrote it I'd convinced myself there was a corner case that
still went wrong. Dividing by any number greater than 2 avoided it.
...
* I got confused by the "t4 > s4 - (top_type)-1 / 8" business; I
    don't really know what that code is doing.  Sorry.
s4 is 1/4 of a value within +/- 4 of the true answer.
t4 is 1/4 of dt, which is what you get if you subtract in top_type.

dt is either correct or it's wrong by UINTMAX_MAX+1.
So t4 - s4 is either close to zero or is about -UINTMAX_MAX/4.
That test determines which.
...
* Even on weird hosts I'd rather avoid runtime tests
    like time1 >=0 and time0 < 0.
Yet you're happy with an extra function call?
...
Here's the result of my cogitating on this.  From a practical point of
view, the main improvement over the current difftime is that it avoids
double-rounding on platforms with 64-bit time_t and 64-bit long
double; this is a somewhat-orthogonal change but it is a real win.
But I hope it also addresses your concerns about correctness on hosts
that have padding bits.
/* This part will be configured differently for pre-C99 hosts. */
#include <stdint.h>
#include <float.h>
#define uintmax uintmax_t
#define long_double long double
#include <time.h>
#include <limits.h>
#define TYPE_FLOATING(type) ((type) 0.4 != 0)
#define TYPE_SIGNED(type) (((type) -1) < 0)
#define TYPE_BIT(type)  (sizeof (type) * CHAR_BIT)
/*   
** Return TIME1 - TIME0, where TIME0 <= TIME1.
** This function is executed only if time_t is an integer type.
*/
static double
simple_difftime (time_t time1, time_t time0)
{
  /*
  ** Optimize the common special cases where time_t is unsigned,
  ** or can be converted to uintmax without losing information.
  */
  if (!TYPE_SIGNED (time_t))
    return time1 - time0;
  else if (INTMAX_MAX <= UINTMAX_MAX / 2)
     return (uintmax) time1 - (uintmax) time0;
This second test can be "INTMAX_MAX != UINTMAX_MAX", and can be a #if.
...
else
    {
      /*
      ** Weird.  uintmax has padding bits, and time_t is signed.
      ** Check for overflow: compare delta/2 to (time1/2 - time0/2).
      ** Overflow occurred if they differ by more than a small slop.
      */
      uintmax delta = (uintmax) time1 - (uintmax) time0;
      uintmax hdelta = delta / 2;
      time_t ht1 = time1 / 2;
      time_t ht0 = time0 / 2;
      time_t dht = ht1 - ht0;
      /*
      ** "h" here means half.  By simple range analysis, we have:
      **     -0.5 <= ht1 - time1/2 <= 0.5
      **     -0.5 <= ht0 - time0/2 <= 0.5
      **     -1.0 <= dht - (time1 - time0)/2 <= 1.0
      ** If overflow has not occurred, we also have:
      **     -0.5 <= hdelta - (time1 - time0)/2 <= 0
      **     -1.0 <= dht - hdelta <= 1.5
      ** and since dht - hdelta is an integer, we also have:
      **     -1 <= dht - hdelta <= 1
      ** or equivalently:
      **     0 <= dht - hdelta + 1 <= 2
      ** In the above analysis, all the operators have
      ** their exact mathematical semantics, not C semantics.
      ** However, dht - hdelta + 1 is unsigned in C,
      ** so it need not be compared to zero.
      */
      if (dht - hdelta + 1 <= 2)
  return delta;
You've basically done the same as the code you said you didn't understand!
* I used /4 instead of /2 because I wanted to ensure we never went near the
  maximum values of types.
* If there *is* an overflow, dht - hdelta is going to be around
  UINTMAX_MAX/2. So I used the eqivalent of UINTMAX_MAX/4 as the gateway;
  I'd certainly use a value bigger than 2 just in case I'd made an error
  in the analysis.

In fact, it might be a good idea to insert, at this point:

    assert (dht - hdelta + 1 >= UINTMAX_MAX/2 - 2 &&
            dht - hdelta + 1 <= UINTMAX_MAX/2 + 2);

to ensure nothing's gone wrong (I *think* +/-2 suffices).
...
else
  {
    /*
    ** Repair delta overflow.
    **
    ** The following expression contains a second rounding,
    ** so the result may not be the closest to the true answer.
    ** This problem occurs only with very large differences,
    ** It's too painful to fix this portably.
    ** We are not alone in this problem;
    ** some C compilers round twice when converting
    ** large unsigned types to small floating types,
    ** so if time_t is unsigned the "return time1 - time0" above
    ** has the same double-rounding problem with those compilers.
    */
    long_double hibit = ~(UINTMAX_MAX / 2);
    return delta + 2 * hibit;
  }
    }
}
double
difftime (time_t time1, time_t time0)
{
  if (TYPE_BIT (time_t) <= DBL_MANT_DIG
      || (TYPE_FLOATING (time_t) && sizeof (time_t) != sizeof (long_double)))
    return (double) time1 - (double) time0;
  if (TYPE_BIT (time_t) <= LDBL_MANT_DIG || TYPE_FLOATING (time_t))
    return (long_double) time1 - (long_double) time0;
if (time1 < time0)
    return -simple_difftime (time0, time1);
  else
    return simple_difftime (time1, time0);
}
-- 
Clive D.W. Feather  | Work:  <clive@demon.net>   | Tel:    +44 20 8495 6138
Internet Expert     | Home:  <clive@davros.org>  | Fax:    +44 870 051 9937
Demon Internet      | WWW: http://www.davros.org | Mobile: +44 7973 377646
Thus plc            |                            |