64-bit time_t must go--this is non-negotiable
Given that time_t has been a 32-bit entity since day one of UNIX, there are a slew of binary executables out there that are built on the premise, as well as plenty of values stored in disk files (take UNIX accounting logs, please). That makes it an extremely bad idea to redefine time_t as a 64-bit value. This does not, of course, mean that it's a bad idea to have 64-bit time stamps available--as long as they're available under a different name (along with differently named functions to manipulate them). The historic precedent in the UNIX realm is the addition of the "lseek" system call once files got larger than good old "seek" could handle (even with the provision of seeks where the short-integer seek argument specified a block number rather than a byte number). So somewhere down the road we can expect to see entities such as ltime_t and lgmtime. Bottom line: we should probably not be doing a lot of work to support 64-bit time_t implementations since we don't want to encourage such implementations. --ado
"Olson, Arthur David (NIH/NCI)" <olsona@dc37a.nci.nih.gov> writes:
time_t has been a 32-bit entity since day one of UNIX,
No, actually, it was a 'long' entity. The very first UNIX ran on the PDP-7, which was an 18-bit host, so I suspect (though I can't easily check this :-) that time_t was a 36-bit quantity on the very first UNIX host. Certainly the circa-1978 Honeywell 6000 UNIX port used a 36-bit time_t, so there's longstanding precedent that time_t is not always exactly 32 bits.
The historic precedent in the UNIX realm is the addition of the "lseek" system call once files got larger
That predated C's typedef feature. A closer precedent is what happened to lseek when file offsets grew from 32 to 64 bits. By this time, people were supposed to use "off_t", and "off_t" grew from 32 to 64 bits without changing the name of "lseek". (The story is a bit more complicated than this, but that's the general idea.) The changes to time/ctime/etc. are similar.
we should probably not be doing a lot of work to support 64-bit time_t implementations since we don't want to encourage such implementations.
I'm afraid the ship has already sailed on this issue, and it's too late to reverse course, even assuming everyone agreed on the general principle. The vast majority of 64-bit hosts use 64-bit time_t. This includes GNU/Linux, the BSDs, HP-UX, Microsoft Windows, Solaris, and probably others. There are one or two holdouts (Tru64 comes to mind) but they're a clear minority. I wouldn't ask you to do a lot of work in this area, since you have other things to do. But if someone else volunteers to fix the porting problems it seems like a fairly easy call. zdump is intended to be used with other time_t implementations, and so it's more important that it support non-32-bit time_t. (zic is also important, but I don't know of any problems that it has with non-32-bit time_t.) The other code is less important, since the non-32-bit-time_t guys have mostly rewritten all that stuff anyway; but the fixes there are relatively small.
Date: Thu, 17 Jun 2004 12:56:47 -0700
From: Paul Eggert <eggert@CS.UCLA.EDU>
Message-ID: <87d63yj6v4.fsf@penguin.cs.ucla.edu>

| "Olson, Arthur David (NIH/NCI)" <olsona@dc37a.nci.nih.gov> writes:
|
| > time_t has been a 32-bit entity since day one of UNIX,
|
| No, actually, it was a 'long' entity.

No, it wasn't.

| The very first UNIX ran on the PDP-7, which was an 18-bit host, so I
| suspect (though I can't easily check this :-) that time_t was a 36-bit
| quantity on the very first UNIX host.

It probably was, but it wasn't "long". Back in those days there was no C to have the modern concept of "long" at all, and even when C first existed there was no "typedef" to create new types (like time_t). Those are all relatively recent additions (well, 7th Edition, 1979 or so). In the earliest UNIX, time() operated on an int [2] array, which is why time() and all the other time-related functions take their time arguments as pointers; having time() return the value as well is another recent addition. The first UNIX on a 32-bit processor actually represented times as 64 bits, as it kept on using int[2], except with 32-bit ints. (This was the University of Wollongong (Richard Miller) port of v6 UNIX to the Interdata (later Perkin-Elmer) systems, which preceded the Bell Labs Interdata port that much influenced v7.) When that port switched to v7 it went back to 32 bits (using "long").

| > The historic precedent in the UNIX realm is the addition of the
| > "lseek" system call once files got larger
|
| That predated C's typedef feature. A closer precedent is what
| happened to lseek when file offsets grew from 32 to 64 bits. By this
| time, people were supposed to use "off_t", and "off_t" grew from 32 to
| 64 bits without changing the name of "lseek". (The story is a bit
| more complicated than this, but that's the general idea.) The changes
| to time/ctime/etc. are similar.
yes, though ado still has a point: one still does need to handle existing applications that know time_t as a 32-bit quantity. While the name of lseek wasn't changed (as known by the C compiler) to move from 32- to 64-bit offsets, the actual system call certainly was changed (the 32-bit version still exists on any architecture old enough to have had applications from when lseek took 32-bit offsets). Other changes of a similar nature have actually had source-visible modifications; which is the better approach for any particular change requires careful thought and study. Certainly it isn't correct to simply cite the precedent of some other (different) change and act as if that were sufficient justification for any new change.

Personally, at the minute, I have no idea whether making time_t a 64-bit type or introducing a new type for bigger times is the better approach. If FreeBSD is trying it in some limited sense, that's great, and will provide feedback on how well things work (though doing FreeBSD sparc64 probably isn't a great test for existing applications...).

I'm also by no means convinced that a 64-bit time_t that is just twice as many bits, but otherwise unchanged from our current 32-bit time_t, is a rational thing to do. I can't imagine wanting to know about times that stretch from 1970 to 1970 + 2.6 * 10^11; that's just absurd. When more bits are added, the time base should probably be moved (a long way), and lots of the extra bits (say 20 of the added 32) should be used for extra precision rather than extra years. Put the 0 value at about 10000 years into the future (say year 10000 itself), allow + and - time values, and the 44 bits of second values that remain stretch from before year -270K to almost year 300K, which is plenty of range, with microsecond precision. But this is just one possibility - not necessarily a good one, but certainly better than 2^64 seconds.

kre
Robert Elz <kre@munnari.oz.au> writes:
Certainly it isn't correct to simply cite the precedent of some other (different) change, and act as if that was sufficient justification for any new change.
Sure. But here we've already seen precedent for this exact same change. Lots of operating systems have decided to use 64-bit time_t on 64-bit hosts. They all had their inevitable shakeout period. One (Tru64) distinguishes between time_t and time64_t, the way that ado proposed, but as far as I can tell, nobody else thought that the extra complexity is worth the hassle. These folks have already made their decision, in some cases many years ago, and they're unlikely to change their minds now no matter how much huffing and puffing we do. If we want to write code that is portable to 64-bit GNU/Linux, Solaris, FreeBSD 6, Microsoft Windows, etc., then we have to make sure it works with 64-bit time_t.
I can't imagine wanting to know about times that stretch from 1970 to 1970 + 2.6 * 10^11, that's just absurd.
I agree, and if we had to do it over again I'd design time_t differently, but it's too late to change this for the systems mentioned above.
On Thu, Jun 17, 2004 at 01:57:36PM -0400, "Olson, Arthur David (NIH/NCI)" <olsona@dc37a.nci.nih.gov> wrote:
Given that time_t has been a 32-bit entity since day one of UNIX, there are
It's not 32-bit on a lot of existing UNIX systems, where it already is 64 bit.
a slew of binary executables out there that are built on the premise, as
Binary executables are not a problem, as ABI changes are rare. It's no problem to support both 32 and 64 bit time() calls on the same system, even within the same executable. This is already standard practise.
well as plenty of values stored in disk files (take UNIX accounting logs, please).
The example is mostly a non-issue, as it has been dealt with in existing implementations (quota databases are another - bad - example). Databases are a bigger concern, but there are very few (if any) database systems that directly store time_t as it is. One of the reasons is that portable programming has always required allowing for different time_t representations.
That makes it an extremely bad idea to redefine time_t as a 64-bit
I very much disagree with you here. It might be helpful to some people to keep it a 32-bit value, but the problem needs to be dealt with before 2038, whether time_t is 32 bits or not, because at that point it will wrap, so any databases still based on 32-bit time_t's WILL have to be reworked. In essence, I think your proposal to force it to 32 bits is very harmful, as it encourages problems similar to Y2K. Most existing programs don't need changes to work with a 64-bit time_t now; they just need to be recompiled. Forcing extra (and non-standard) calls means every program must be changed.
with differently named functions to manipulate them). The historic precedent in the UNIX realm is the addition of the "lseek" system call once files got larger than good old "seek" could handle
The reason was that the units were changed, so the filesystem call was incompatible. Even early UNIX had no problem with having two seek calls, one for 16 bits and one for 32 bits, so they could have kept the name; it was changed because the semantics had changed. By contrast, the 32->64 bit transition many years ago involved no such addition of extra syscalls or functions: all you need to do is recompile your program under the new environment and it can do 63-bit I/O, except when it has bugs (which would in turn result in a program capable of only 32 or 31 bits, which wouldn't be worse).
Bottom line: we should probably not be doing a lot of work to support 64-bit time_t implementations since we don't want to encourage such implementations.
I think this is a very bad decision. It's exactly the same bad decision as the decision that "two digits are enough to represent a year, so don't encourage the support of 4 or more digit year values in this datatype".

-- Marc Lehmann <pcg@goof.com> http://schmorp.de/
<<On Thu, 17 Jun 2004 13:57:36 -0400, "Olson, Arthur David (NIH/NCI)" <olsona@dc37a.nci.nih.gov> said:
Given that time_t has been a 32-bit entity since day one of UNIX, there are a slew of binary executables out there that are built on the premise, as well as plenty of values stored in disk files (take UNIX accounting logs, please). That makes it an extremely bad idea to redefine time_t as a 64-bit value.
Too late. Many if not most 64-bit operating systems have already decided, irrevocably, that time_t will be 64 bits. -GAWollman
participants (5)
- Garrett Wollman
- Olson, Arthur David (NIH/NCI)
- Paul Eggert
- pcg@goof.com
- Robert Elz