
Paul Eggert wrote on 1998-10-05 22:23 UTC:
It sounds like you're trying to allow for another possibility:
(3) Support just some leap seconds, without having a complete leap second table.
For example, it sounds like you want to cater to implementations that know only the leap second nearest to the present, or something like that.
Yes. Option (3) is by far the most frequent and most realistic possibility in real life. There are hundreds of thousands of NTP-controlled workstations out there that do exactly this. Most time services provide only an announcement of the next leap second. For instance, DCF77 (the German 77.5 kHz time service, for which you can buy 20-dollar serial-port receivers and 50-dollar radio-controlled digital wrist watches that work all over Central Europe) sets a leap second warning bit during the 60 minutes preceding the leap second. GPS announces a few weeks ahead of time when the next leap second occurs, and NTP does something similar. If you want to call a system that knows only about the next leap second, just enough to roll correctly through it, a system that "has a leap second table", well, ok. All that these systems know is that there will be one leap second and no other leap second until then. The existing time distribution infrastructure announces the next leap second in order to prevent a loss of synchronization right after it, but it was not designed to distribute information about the history of the TAI-versus-UTC relationship. That is why I have repeated several times that UTC is extremely widely available but TAI is not, and that this is the reason why I consider it difficult to propose a robust implementation that is entirely based on TAI. It is important to understand that most time services (GPS excepted) announce only that there will be a leap second, not what the current and coming UTC-TAI offsets are!
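To make (3) concrete: on such a machine, the entire "leap second table" is a single status bit that the NTP daemon hands to the kernel once it has seen the announcement. A minimal sketch of that hand-over, assuming the Linux adjtimex() interface (the same STA_INS bit that the kernel code quoted further below tests); the function name is mine and error handling is kept minimal:

-------------------------------------------------------------------------
#include <stdio.h>
#include <sys/timex.h>

/* Tell the kernel that a leap second will be inserted at the end of
 * the current UTC day.  On such a system this one status bit is the
 * entire "leap second table".  A real NTP daemon does this as part of
 * its normal clock-discipline calls. */
int announce_leap_insertion(void)
{
	struct timex tx;

	tx.modes = 0;                    /* first read the current state     */
	if (adjtimex(&tx) < 0) {
		perror("adjtimex");
		return -1;
	}
	tx.modes = ADJ_STATUS;           /* now update only the status bits  */
	tx.status |= STA_INS;            /* "insert 23:59:60 at end of day"  */
	if (adjtimex(&tx) < 0) {
		perror("adjtimex");
		return -1;
	}
	return 0;
}
-------------------------------------------------------------------------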
The leap second table is _always_ incomplete, of course, since we don't know all future leap seconds; so this problem of incomplete knowledge of leap seconds is inherent to any high-quality interface. So I suggest that we give the programmer a way to access the entire leap second table known to the implementation.
My API does this in form of tz_jump.
E.g. if the implementation knows only the next leap second, then it would return ``I don't know'' for questions about later (or previous) leap seconds.
Done.
This seems to me to be the most natural way to model implementations of type (1), (2) and (3).
Done.
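To illustrate the intended behaviour (this is only a sketch; the names next_leap_after, LEAP_KNOWN and LEAP_DONT_KNOW are hypothetical stand-ins, not the actual tz_jump interface of my proposal):

-------------------------------------------------------------------------
#include <stdio.h>
#include <time.h>

/* Hypothetical result codes: the implementation either knows the next
 * leap second after 't', or it cannot say anything about times that
 * far out ("I don't know"). */
enum leap_answer { LEAP_KNOWN, LEAP_DONT_KNOW };

/* On a type-(3) system the whole "table" is at most one entry: the
 * leap second that has been announced (0 = nothing announced).  It
 * would be filled in from the NTP/DCF77/GPS announcement. */
static time_t announced_leap = 0;

static enum leap_answer next_leap_after(time_t t, time_t *when)
{
	if (announced_leap != 0 && t < announced_leap) {
		*when = announced_leap;
		return LEAP_KNOWN;
	}
	return LEAP_DONT_KNOW;          /* beyond what this system knows */
}

int main(void)
{
	time_t leap;

	if (next_leap_after(time(NULL), &leap) == LEAP_KNOWN)
		printf("next known leap second at %s", ctime(&leap));
	else
		printf("no leap second known beyond this point\n");
	return 0;
}
-------------------------------------------------------------------------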
By the way, I'm not familiar with type (3) C implementations in practice -- which ones are you thinking of?
Some example code from the standard Red Hat Linux kernel running on my machine here right now (and running on around 5 million other PCs, active on those hundreds of thousands of PCs that are NTP synchronized):

/usr/src/linux/kernel/sched.c:

-------------------------------------------------------------------------
/*
 * this routine handles the overflow of the microsecond field
 *
 * The tricky bits of code to handle the accurate clock support
 * were provided by Dave Mills (Mills@UDEL.EDU) of NTP fame.
 * They were originally developed for SUN and DEC kernels.
 * All the kudos should go to Dave for this stuff.
 *
 */
static void second_overflow(void)
{
	long ltemp;

	/* Bump the maxerror field */
	time_maxerror += time_tolerance >> SHIFT_USEC;
	if ( time_maxerror > MAXPHASE )
		time_maxerror = MAXPHASE;

	/*
	 * Leap second processing. If in leap-insert state at
	 * the end of the day, the system clock is set back one
	 * second; if in leap-delete state, the system clock is
	 * set ahead one second. The microtime() routine or
	 * external clock driver will insure that reported time
	 * is always monotonic. The ugly divides should be
	 * replaced.
	 */
	switch (time_state) {
	case TIME_OK:
		if (time_status & STA_INS)
			time_state = TIME_INS;
		else if (time_status & STA_DEL)
			time_state = TIME_DEL;
		break;

	case TIME_INS:
		if (xtime.tv_sec % 86400 == 0) {
			xtime.tv_sec--;
			time_state = TIME_OOP;
			printk("Clock: inserting leap second 23:59:60 UTC\n");
		}
		break;

	case TIME_DEL:
		if ((xtime.tv_sec + 1) % 86400 == 0) {
			xtime.tv_sec++;
			time_state = TIME_WAIT;
			printk("Clock: deleting leap second 23:59:59 UTC\n");
		}
		break;

	case TIME_OOP:
		time_state = TIME_WAIT;
		break;

	case TIME_WAIT:
		if (!(time_status & (STA_INS | STA_DEL)))
			time_state = TIME_OK;
	}

	/*
	 * Compute the phase adjustment for the next second. In
	 * PLL mode, the offset is reduced by a fixed factor
	 * times the time constant. In FLL mode the offset is
	 * used directly. In either mode, the maximum phase
	 * adjustment for each second is clamped so as to spread
	 * the adjustment over not more than the number of
	 * seconds between updates.
	 */
	if (time_offset < 0) {
		ltemp = -time_offset;
		if (!(time_status & STA_FLL))
			ltemp >>= SHIFT_KG + time_constant;
		if (ltemp > (MAXPHASE / MINSEC) << SHIFT_UPDATE)
			ltemp = (MAXPHASE / MINSEC) << SHIFT_UPDATE;
		time_offset += ltemp;
		time_adj = -ltemp << (SHIFT_SCALE - SHIFT_HZ - SHIFT_UPDATE);
	} else {
		ltemp = time_offset;
		if (!(time_status & STA_FLL))
			ltemp >>= SHIFT_KG + time_constant;
		if (ltemp > (MAXPHASE / MINSEC) << SHIFT_UPDATE)
			ltemp = (MAXPHASE / MINSEC) << SHIFT_UPDATE;
		time_offset -= ltemp;
		time_adj = ltemp << (SHIFT_SCALE - SHIFT_HZ - SHIFT_UPDATE);
	}

	/*
	 * Compute the frequency estimate and additional phase
	 * adjustment due to frequency error for the next
	 * second. When the PPS signal is engaged, gnaw on the
	 * watchdog counter and update the frequency computed by
	 * the pll and the PPS signal.
	 */
	pps_valid++;
	if (pps_valid == PPS_VALID) {
		pps_jitter = MAXTIME;
		pps_stabil = MAXFREQ;
		time_status &= ~(STA_PPSSIGNAL | STA_PPSJITTER |
				 STA_PPSWANDER | STA_PPSERROR);
	}
	ltemp = time_freq + pps_freq;
	if (ltemp < 0)
		time_adj -= -ltemp >> (SHIFT_USEC + SHIFT_HZ - SHIFT_SCALE);
	else
		time_adj += ltemp >> (SHIFT_USEC + SHIFT_HZ - SHIFT_SCALE);
}

/* in the NTP reference this is called "hardclock()" */
static void update_wall_time_one_tick(void)
{
	/*
	 * Advance the phase, once it gets to one microsecond, then
	 * advance the tick more.
	 */
	time_phase += time_adj;
	if (time_phase <= -FINEUSEC) {
		long ltemp = -time_phase >> SHIFT_SCALE;
		time_phase += ltemp << SHIFT_SCALE;
		xtime.tv_usec += tick + time_adjust_step - ltemp;
	}
	else if (time_phase >= FINEUSEC) {
		long ltemp = time_phase >> SHIFT_SCALE;
		time_phase -= ltemp << SHIFT_SCALE;
		xtime.tv_usec += tick + time_adjust_step + ltemp;
	}
	else
		xtime.tv_usec += tick + time_adjust_step;

	if (time_adjust) {
		/* We are doing an adjtime thing.
		 *
		 * Modify the value of the tick for next time.
		 * Note that a positive delta means we want the clock
		 * to run fast. This means that the tick should be bigger
		 *
		 * Limit the amount of the step for *next* tick to be
		 * in the range -tickadj .. +tickadj
		 */
		if (time_adjust > tickadj)
			time_adjust_step = tickadj;
		else if (time_adjust < -tickadj)
			time_adjust_step = -tickadj;
		else
			time_adjust_step = time_adjust;

		/* Reduce by this step the amount of time left */
		time_adjust -= time_adjust_step;
	}
	else
		time_adjust_step = 0;
}
-------------------------------------------------------------------------
It would be helpful to have a bit more familiarity with the real-world issues here.
(Also, this issue needs to be discussed better in the rationale!)
I have provided references to papers by Dave Mills at the end of the document that discuss the kernel treatment of leap seconds in some detail. Linux is one system where you can see the mechanics yourself in the source code. I think DEC does something similar, but I don't have access to source or documentation. All the ideas that I presented are not just academic gobbledygook; they are based on half a decade of operational implementation experience. I was a student of Dr. Frank Kardel at the University of Erlangen, who was one of the xntpd maintainers in the early 1990s and who demonstrated successful NTP-based microsecond kernel PLL clock synchronization of SunOS servers. I "grew up" in a research lab where we routinely observed the cycles of the air conditioner in the server room by measuring the temperature-induced phase drifts of the clock generators in Sun servers via suitable software and time protocols. Accusations made by others about my possible ignorance with regard to software timing are a bit insulting. It is just difficult to communicate my background in this field in a formal proposal. You are probably right if you mean that the ISO C committee members are unlikely to be familiar with mechanisms that have been common knowledge among NTP hackers for half a decade. It would be fortunate if I could personally attend a meeting and present these issues myself, but I don't have the funding for such a trip (unless they hold one of their next meetings in London). If I wrote down all the background information that I took into consideration, it would probably become a book (maybe it will, one day ... ;-).
leap second tables are a very dubious and extremely dangerous concept.
An implementation must have a leap-second table of _some_ sort, if only a partial one, if it wants to support leap seconds; otherwise, xtime_get with TIME_UTC can't return leap seconds.
Ok, let me formulate this more clearly: My concern with proposals such as libtai is the following: Systems receive UTC from the outside (a fact of life). Then Dan Bernstein's systems transform the received UTC into TAI and use TAI for all further processing (including in communication protocols). In order to do the UTC->TAI conversion correctly, you have to know not only when the next leap second is, but also the CORRECT NUMBER of leap seconds since 1972, since the current TAI offset is NOT announced by almost all time services (and I have been complaining about this for half a decade now to the operators of time services, so please don't blame me that it is this way!). Each time one of Dan Bernstein's systems misses a leap second announcement, the TAI value that this system uses from then on will be off by one more second. The TAI values used in a distributed system will start to drift apart, until someone uploads fresh TAI-UTC offsets into all machines (and I certainly don't have to tell people in this forum how "eager" normal users are about keeping their configuration tables up to date).
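A minimal sketch of that failure mode, with a hypothetical offset table and function names of my own choosing (this is not Dan Bernstein's code):

-------------------------------------------------------------------------
#include <time.h>

/* Sketch of the UTC->TAI conversion problem described above; the data
 * structure and function names are illustrative, and the single table
 * entry is only a placeholder. */
struct leap_entry {
	time_t utc_after;    /* UTC time from which this offset applies */
	int    tai_offset;   /* TAI - UTC in seconds from then on       */
};

/* A correct conversion needs EVERY leap second since 1972 in here.
 * If one announcement is missed, the most recent offset is too small
 * by one second, and so is every TAI value derived after that point,
 * until someone reloads the table by hand. */
static const struct leap_entry table[] = {
	{ 0, 10 },   /* placeholder; TAI-UTC was 10 s at the start of 1972 */
	/* ... one entry per leap second since then ... */
};

int tai_minus_utc(time_t utc)
{
	int i, offset = table[0].tai_offset;

	for (i = 0; i < (int)(sizeof table / sizeof table[0]); i++)
		if (utc >= table[i].utc_after)
			offset = table[i].tai_offset;
	return offset;
}

/* What a libtai-style system does with every received UTC timestamp:
 * correct only as long as 'table' is complete and up to date. */
time_t utc_to_tai(time_t utc)
{
	return utc + tai_minus_utc(utc);
}
-------------------------------------------------------------------------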
Here's an encoding that uses about .00001 bit per timestamp, assuming an integer counter is used for timestamps:
If T is a TIME_UTC timestamp, it identifies the point of time that is T%D xtime intervals after T/D UTC days after the epoch, where D is 86401*XTIMES_PER_SEC.
Ok, so you reserve values for a leap second at the end of every UTC day (in effect making every day 86401 SI seconds long on that scale). If you go through the comp.protocols.time.ntp archives, you will find that I proposed exactly this encoding (plus a few variants) years ago. I am not that enthusiastic about it any more, because it complicates many algorithms significantly; it was intended more as a compatibility hack for the old C89 spec. It would also completely mess up the elegance of my API, in that suddenly strfxtime and xtime_make/breakup would have to know from what type of clock a timestamp came. Note that in my encoding, the two times 1972-01-01 00:00:00 UTC and 1972-01-01 00:00:00 TAI (which are 10 s apart) are represented by the same xtime value. An xtime value is just seen as an encoding of a year/month/day/hour/minute/second representation, and any routine that wants to convert such xtime values into broken-down values does not have to know whether it is dealing with a UTC or TAI (or even a local time) value. With your alternative, this would not work any more, unless you use 86401 seconds per day for both UTC and TAI (which your clause "If T is a TIME_UTC timestamp" does not suggest).
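For concreteness, a sketch of the decoding step under your scheme; the XTIMES_PER_SEC value and the function name are illustrative only:

-------------------------------------------------------------------------
#include <stdio.h>

#define XTIMES_PER_SEC 1000000000LL   /* illustrative resolution (1 ns)   */
#define D (86401LL * XTIMES_PER_SEC)  /* code points reserved per UTC day */

/* Decode a timestamp T into day number, second of day (0..86400, where
 * 86400 encodes an inserted leap second 23:59:60) and the sub-second
 * fraction.  Negative T is not handled in this sketch. */
static void decode(long long T, long long *day, long long *sec, long long *frac)
{
	long long within = T % D;     /* xtime intervals into the day */

	*day  = T / D;
	*sec  = within / XTIMES_PER_SEC;
	*frac = within % XTIMES_PER_SEC;
}

int main(void)
{
	long long day, sec, frac;

	/* The last code point of day 2 is that day's (potential) leap second. */
	decode(3 * D - XTIMES_PER_SEC, &day, &sec, &frac);
	printf("day %lld, second %lld (86400 means 23:59:60)\n", day, sec);

	/* On days without an inserted leap second this code point must never
	 * be produced or accepted, which is part of the extra complexity in
	 * every conversion routine mentioned above. */
	return 0;
}
-------------------------------------------------------------------------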
For space-saving this beats all other encodings proposed so far.
Saving a couple of bits at the cost of more complex algorithms (and thus a higher likelihood of user misunderstandings) does not sound attractive to me. I think the POSIX timespec got it almost right, and I see no reason why we shouldn't simply build on that encoding. The only doubt that I have about xtime is that, at 96 bits, it is a bit too long; I would feel slightly better with 64 bits. I am of course also aware of the DCE and CORBA encodings (see the reference section of my proposal), which use a 64-bit 0.1 µs counter counting from 1601-01-01. It is a nice idea, and 100 ns resolution is probably still more than sufficient for practically all applications, but like the old POSIX time_t it is not broken down and therefore does not provide for a nice leap second encoding (unless we use the above 1-day-plus-1-second hack). So 96 bits still seems to me the better compromise, and in the end I think that having a cosmological range won't hurt.

Markus

--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>