tzfiles contain Unix epoch for the first transition time
I am working on enabling the .NET TimeZoneInfo class to read time zone information from tzfiles. I've hit a snag with the latest tzdata 2015f. (I'm not sure when this change started, but the problem doesn't occur with the tzfiles that are shipped with an Ubuntu 14.04 distribution.) The problem is that the 2015f version of the tzdata contains an initial "Transition Time" that is out of order. The beginning of the America/Chicago tzfile looks like the following: Transition Time Transition Offset 01/01/1970 00:00:00 -05:50:36 11/18/1883 18:00:00 -06:00:00 03/31/1918 08:00:00 -05:00:00 DST 10/27/1918 07:00:00 -06:00:00 Notice the first entry is for 1970, and then the next entry is for 1883. This breaks the documentation in 'man tzfile': The above header is followed by tzh_timecnt four-byte values of type long, sorted in ascending order. These values are written in "standard" byte order. Each is used as a transition time (as returned by time(2)) at which the rules for computing local time change. This causes the TimeZoneInfo parsing code to throw an exception because it is assuming these transitions are sorted in ascending order. Is this an intentional change in the tzfiles? If so, will the tzfile man page be updated for this change? Eric Erhardt
Just as an aid to verifying this, could you tell us which copy of the data you're using (as each file contains two or three copies of the information)? A hex dump of the relevant section would be really handy, too.

Jon

On 13 August 2015 at 17:21, Eric Erhardt <Eric.Erhardt@microsoft.com> wrote:
I am working on enabling the .NET TimeZoneInfo class to read time zone information from tzfiles.
I’ve hit a snag with the latest tzdata 2015f. (I’m not sure when this change started, but the problem doesn’t occur with the tzfiles that are shipped with an Ubuntu 14.04 distribution.)
The problem is that the 2015f version of the tzdata contains an initial "Transition Time" that is out of order. The beginning of the America/Chicago tzfile looks like the following:
    Transition Time        Transition Offset
    01/01/1970 00:00:00    -05:50:36
    11/18/1883 18:00:00    -06:00:00
    03/31/1918 08:00:00    -05:00:00 DST
    10/27/1918 07:00:00    -06:00:00
Notice the first entry is for 1970, and then the next entry is for 1883. This breaks the documentation in 'man tzfile':
The above header is followed by tzh_timecnt four-byte values of type long, *sorted in ascending order*. These values are written in "standard" byte order. Each is used as a transition time (as returned by time(2)) at which the rules for computing local time change.
This causes the TimeZoneInfo parsing code to throw an exception because it is assuming these transitions are sorted in ascending order.
Is this an intentional change in the tzfiles? If so, will the tzfile man page be updated for this change?
Eric Erhardt
Eric Erhardt wrote:
I've hit a snag with the latest tzdata 2015f. (I'm not sure when this change started, but the problem doesn't occur with the tzfiles that are shipped with an Ubuntu 14.04 distribution.) The problem is that the 2015f version of the tzdata contains an initial "Transition Time" that is out of order. The beginning of the America/Chicago tzfile looks like the following:

    Transition Time        Transition Offset
    01/01/1970 00:00:00    -05:50:36
    11/18/1883 18:00:00    -06:00:00
That's not what I'm seeing. I assume you're talking about the 64-bit part of the file, since the 1883 time stamp does not fit in 32 bits. The first transition I see, at offset 1348 of the America/Chicago file, is for -576460752303423488 (0xf800000000000000), which is the BIG_BANG time (see zic.c). The second, at file offset 1356, is for -2717650800 (0xffffffff5e03f090), which is 1883-11-18 17:00:00 UTC. Neither of these transition times agree with the times you're showing.
the problem doesn't occur with the tzfiles that are shipped with an Ubuntu 14.04 distribution.)
For what it's worth, the America/Chicago file that I generate by typing 'make install' with the tz distribution is byte-for-byte identical to /usr/share/zoneinfo/America/Chicago on my 64-bit Ubuntu 15.04 host. If I had to guess, my guess is that your software is mishandling the BIG_BANG time because the time stamp is so far in the past. Perhaps Ubuntu 14.04 didn't do the Big Bang?
Paul Eggert wrote:
The second, at file offset 1356, is for -2717650800 (0xffffffff5e03f090), which is 1883-11-18 17:00:00 UTC.
Sorry, I misinterpreted that one. The second one is actually for -2717647200 (0xffffffff5e03fea0), which is 1883-11-18 18:00:00 UTC, and this agrees with your program. So the problem is only with the first transition; the second one looks OK.
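The two raw values Paul mentions can be decoded directly with any language's big-endian 64-bit integer reader. A minimal Python sketch for anyone double-checking the bytes (illustrative only; the variable names are mine, not from any parser in this thread):

```python
import struct

# Decode the two 8-byte transition times from the TZif v2 section as
# big-endian signed 64-bit integers (">q"), the format zic writes.
first = struct.unpack(">q", bytes.fromhex("f800000000000000"))[0]
second = struct.unpack(">q", bytes.fromhex("ffffffff5e03fea0"))[0]
print(first)   # -576460752303423488, the BIG_BANG sentinel from zic.c
print(second)  # -2717647200, i.e. 1883-11-18 18:00:00 UTC
```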
Date: Thu, 13 Aug 2015 16:21:47 +0000
From: Eric Erhardt <Eric.Erhardt@microsoft.com>
Message-ID: <CY1PR0301MB1530174F8E3E98052AD684708E7D0@CY1PR0301MB1530.namprd03.prod.outlook.com>

  | I've hit a snag with the latest tzdata 2015f.

Aside from what Jon Skeet asked, you should also indicate what you used to generate the tz binary files (tzdata only has the source for the info, not the binary versions you're obviously looking at - and quite properly I think.) Was it the zic that is with the 2015f sources, or did you use some other version, and if so what? What platform was that running on?

  | This causes the TimeZoneInfo parsing code to throw an exception because it
  | is assuming these transitions are sorted in ascending order.

That's reasonable, they should be.

  | Is this an intentional change in the tzfiles?

No, what you're seeing is definitely a bug. The issue is how that happened.

kre
On 14/08/15 11:25, Robert Elz wrote:
Date: Thu, 13 Aug 2015 16:21:47 +0000
From: Eric Erhardt <Eric.Erhardt@microsoft.com>
Message-ID: <CY1PR0301MB1530174F8E3E98052AD684708E7D0@CY1PR0301MB1530.namprd03.prod.outlook.com>
| I've hit a snag with the latest tzdata 2015f.
Aside from what Jon Skeet asked, you should also indicate what you used to generate the tz binary files (tzdata only has the source for the info, not the binary versions you're obviously looking at - and quite properly I think.) Was it the zic that is with the 2015f sources, or did you use some other version, and if so what? What platform was that running on?
For example, the Debian tzdata maintainer seems to be using an old version of zic to generate their tzdata files, as they don't seem to have the initial transitions in them (at least in their tzdata-2015f-1 packages for Debian stretch/sid).
| This causes the TimeZoneInfo parsing code to throw an exception because it
| is assuming these transitions are sorted in ascending order.
That's reasonable, they should be.
| Is this an intentional change in the tzfiles?
No, what you're seeing is definitely a bug. The issue is how that happened.
If the initial transitions are also missing in Eric's tzfiles, perhaps the bug is related to that.

--
-=( Ian Abbott @ MEV Ltd.    E-mail: <abbotti@mev.co.uk> )=-
-=( Web: http://www.mev.co.uk/                           )=-
Date: Fri, 14 Aug 2015 12:55:20 +0100
From: Ian Abbott <abbotti@mev.co.uk>
Message-ID: <55CDD728.7040401@mev.co.uk>

  | If the initial transitions are also missing in Eric's tzfiles, perhaps
  | the bug is related to that.

Actually, given the values that Paul reported, I suspect the bug might be in the code that's reading the files - the epoch result reported looks like something is not using all 64 bits of the values - somehow dropping the top 8 (or more) bits, which would make the big bang timestamp look like 0. It must be using more than 32 bits though, otherwise it couldn't get the 1883 value - but it only needs to be using 33 bits for that one to work.

Perhaps some range check is happening, to keep the years within 32 bit signed numbers? Or maybe the top 32 bits are being used only to set the sign for an unsigned bottom 32 bit value - that would produce the results indicated. (For the big bang value, the sign would be negative, but the value is 0, so it wouldn't affect anything.)

kre
Date: Fri, 14 Aug 2015 12:55:20 +0100
From: Ian Abbott <abbotti@mev.co.uk>
Message-ID: <55CDD728.7040401@mev.co.uk>

  | If the initial transitions are also missing in Eric's tzfiles, perhaps
  | the bug is related to that.

After I sent the last message, I wondered if perhaps the system Eric used represented times in (fixed point) Java type notation, in milliseconds since the epoch, rather than seconds - but even with that form, the big bang timestamp doesn't wrap all the way to 0 in 64 bits (though it gets very close - after truncation just 2 non-zero bits remain, the sign, and the most significant bit). Of course so many meaningful bits are lost the value would be nonsense, but not 0.

Even moving the epoch back to 1900 (which some systems do) doesn't affect anything (if my back of the envelope calculation is right, the epoch would need to move back over 9 billion years - 2/3 of the way to the big bang - for a millisecond counter to produce 0 for the unix style big bang timestamp). [Do not rely upon, nor quote off this list, that value - I did not verify.]

Of course, if the internal representation is microseconds (or any more precise unit) since the epoch (1970 or anything else plausible) then the big bang would (if overflow protection isn't perfect) turn into 0. In any of those representations, very recent times, like 1883, fit in 64 bits just fine.

kre
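kre's millisecond and microsecond scenarios are easy to check with wraparound arithmetic. A Python sketch of the 64-bit truncation he describes (illustrative; Python integers don't overflow, so a mask simulates unchecked 64-bit multiplication):

```python
# Wrap the BIG_BANG timestamp into 64 bits under millisecond and
# microsecond representations, as kre's back-of-the-envelope check does.
BIG_BANG = -576460752303423488        # 0xf800000000000000 as signed 64-bit
MASK = 0xFFFFFFFFFFFFFFFF             # keep only the low 64 bits

ms = (BIG_BANG * 1_000) & MASK        # milliseconds since the epoch
us = (BIG_BANG * 1_000_000) & MASK    # microseconds since the epoch
print(hex(ms))  # 0xc000000000000000 -- only the top two bits survive
print(us)       # 0 -- a microsecond counter collapses it to zero
```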
Robert Elz wrote:
After I sent the last message, I wondered if perhaps the system Eric used represented times in (fixed point) Java type notation, in milliseconds since the epoch, rather than seconds - but even with that form, the big bang timestamp doesn't overflow 64 bits
Microsoft file times are unsigned 64-bit quantities that count the number of 100ns intervals since 1601-01-01 00:00:00 universal time. If his system is using this format, that'd explain why it overflows for the Big Bang -- though it wouldn't explain why the result was dated 1970.
Date: Fri, 14 Aug 2015 09:26:59 -0700
From: Paul Eggert <eggert@cs.ucla.edu>
Message-ID: <55CE16D3.3030803@cs.ucla.edu>

  | Microsoft file times are unsigned 64-bit quantities that count the number
  | of 100ns intervals

that's 0.1 us. Amazing.

  | since 1601-01-01 00:00:00 universal time.

That's "recent enough" that it is probably not material.

  | If his system is using this format, that'd explain why it overflows
  | for the Big Bang

Yes.

  | though it wouldn't explain why the result was dated 1970.

It could, depending upon how the conversion is done - we know that in tzdata files, 0 == 1970-01-01. So, take the big bang 0xF800000000000000 and multiply by 10*1000*1000; the result is 0x93D1CC0000000000000000. Truncate that to 64 bits (leaving 0), then add the constant conversion factor to adjust the unix epoch based time to the windows one. Then print that, and you get 1970, just as you would have if you'd started with a true 1970-01-01 timestamp (ie: 0).

This seems very likely to be the problem. The bug is that whatever is doing the conversion isn't range checking the input - if the unix time_t value is smaller (or bigger) than their format can represent, it should be either generating an error, or limiting it to the earliest (or latest) times that the format can represent.

kre
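The arithmetic kre walks through can be replayed step by step. A Python sketch (illustrative; the mask simulates unchecked 64-bit multiplication, and the constant names are mine):

```python
# Convert the BIG_BANG transition to a Windows-style 100ns FILETIME,
# truncating the product to 64 bits the way unchecked arithmetic would.
BIG_BANG = 0xF800000000000000      # the sentinel, as an unsigned 64-bit pattern
UNIX_TO_WINDOWS = 11644473600      # seconds from 1601-01-01 to 1970-01-01
MASK = 0xFFFFFFFFFFFFFFFF

product = BIG_BANG * 10_000_000    # seconds -> 100ns ticks
print(hex(product))                # 0x93d1cc0000000000000000, kre's value
truncated = product & MASK         # keep only the low 64 bits
print(truncated)                   # 0 -- indistinguishable from 1970-01-01
filetime = truncated + UNIX_TO_WINDOWS * 10_000_000
print(filetime)                    # 116444736000000000, the FILETIME of 1970-01-01
```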
Here are the steps I took. I walked through them again just to make sure I got them all. Using an Ubuntu 14.04 machine:

1. mkdir tzdata
2. cd tzdata
3. curl -O http://www.iana.org/time-zones/repository/releases/tzdata2015f.tar.gz
4. curl -O http://www.iana.org/time-zones/repository/releases/tzcode2015f.tar.gz
5. tar -xzf tzcode2015f.tar.gz
6. tar -xzf tzdata2015f.tar.gz
7. make TOPDIR=/home/eerhardt/tzdata/install install

I am reading the "V2" information of the tzfile - i.e. the 64-bit timestamp section. Here is the byte dump of the America/Chicago file created with the above steps:

    [1304-1307]  0x54 0x5a 0x69 0x66                        "TZif"
    [1308]       0x32                                       "2"
    [1309-1323]  0x00 ... 0x00                              15 unused bytes
    [1324-1327]  0x00 0x00 0x00 0x07                        tzh_ttisgmtcnt
    [1328-1331]  0x00 0x00 0x00 0x07                        tzh_ttisstdcnt
    [1332-1335]  0x00 0x00 0x00 0x00                        tzh_leapcnt
    [1336-1339]  0x00 0x00 0x00 0xed                        tzh_timecnt
    [1340-1343]  0x00 0x00 0x00 0x07                        tzh_typecnt
    [1344-1347]  0x00 0x00 0x00 0x18                        tzh_charcnt
    [1348-1355]  0xf8 0x00 0x00 0x00 0x00 0x00 0x00 0x00    first transition time
    [1356-1363]  0xff 0xff 0xff 0xff 0x5e 0x03 0xfe 0xa0    second transition time

And when reading the file from /usr/share/zoneinfo/America/Chicago, I get the following bytes:

    [1287-1290]  0x54 0x5a 0x69 0x66                        "TZif"
    [1291]       0x32                                       "2"
    [1292-1306]  0x00 ... 0x00                              15 unused bytes
    [1307-1310]  0x00 0x00 0x00 0x07                        tzh_ttisgmtcnt
    [1311-1314]  0x00 0x00 0x00 0x07                        tzh_ttisstdcnt
    [1315-1318]  0x00 0x00 0x00 0x00                        tzh_leapcnt
    [1319-1322]  0x00 0x00 0x00 0xec                        tzh_timecnt
    [1323-1326]  0x00 0x00 0x00 0x07                        tzh_typecnt
    [1327-1330]  0x00 0x00 0x00 0x18                        tzh_charcnt
    [1331-1338]  0xff 0xff 0xff 0xff 0x5e 0x03 0xfe 0xa0    first transition time

So it looks like my issue is the "big bang" transition that didn't appear in the older tz files I was testing with; I didn't know this transition time existed. This time value isn't possible to represent in .NET (since DateTime.MinValue is 00:00:00.0000000 UTC, January 1, 0001, in the Gregorian calendar). The behavior I was seeing (getting a Unix epoch) was my own logic overflowing with this value, which is my mistake. Here was the incorrect logic:

    // Windows NT time is specified as the number of 100 nanosecond intervals since January 1, 1601.
    // UNIX time is specified as the number of seconds since January 1, 1970. There are 134,774 days
    // (or 11,644,473,600 seconds) between these dates.
    //
    private static DateTime TZif_UnixTimeToWindowsTime(long unixTime)
    {
        // Add 11,644,473,600 and multiply by 10,000,000.
        Int64 ntTime = (((Int64)unixTime) + 11644473600) * 10000000; // overflows for large positive or negative unixTime values
        return FromFileTimeUtc(ntTime);
    }

    private static DateTime FromFileTimeUtc(long fileTime)
    {
        // This is the ticks in Universal time for this fileTime.
        long universalTicks = fileTime + FileTimeOffset;
        return new DateTime(universalTicks, DateTimeKind.Utc);
    }

    private const int DaysTo1601 = DaysPer400Years * 4; // 584388
    private const long TicksPerMillisecond = 10000;
    private const long TicksPerSecond = TicksPerMillisecond * 1000;
    private const long TicksPerMinute = TicksPerSecond * 60;
    private const long TicksPerHour = TicksPerMinute * 60;
    private const long TicksPerDay = TicksPerHour * 24;
    private const long FileTimeOffset = DaysTo1601 * TicksPerDay;

A couple questions I have about this "big bang" transition:

1. I shouldn't be checking explicitly for this value (0xf800000000000000), right? I saw some code comments in zic.c that say it could potentially change in the future.
BIG_BANG is approximate, and may change in future versions. Please do not rely on its exact value. */
2. Will there ever be more than one transition time that is before January 1, 0001? Or will the "big bang" transition be the only one?

I'm thinking the way I'll handle the "big bang" transition is to represent it as DateTime.MinValue in .NET. I'll check to see if the UnixTime is less than the minimum Unix Time possible, and use DateTime.MinValue instead.

Thanks Robert, Paul and Ian, for helping me out on this.

Eric
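Eric's proposed handling - clamp anything below the representable minimum to DateTime.MinValue - can be sketched in Python terms (hypothetical helper, not the actual .NET code):

```python
from datetime import datetime, timedelta

# Clamp out-of-range tzfile transition times instead of letting the
# conversion overflow; mirrors the DateTime.MinValue plan described above.
MIN_DATETIME = datetime(1, 1, 1)                   # DateTime.MinValue analogue
MAX_DATETIME = datetime(9999, 12, 31, 23, 59, 59)  # MaxValue analogue (to the second)
EPOCH = datetime(1970, 1, 1)
MIN_UNIX = int((MIN_DATETIME - EPOCH).total_seconds())  # -62135596800
MAX_UNIX = int((MAX_DATETIME - EPOCH).total_seconds())

def unix_time_to_datetime(unix_time: int) -> datetime:
    """Convert a transition time, clamping values outside the range."""
    if unix_time < MIN_UNIX:
        return MIN_DATETIME       # the BIG_BANG sentinel lands here
    if unix_time > MAX_UNIX:
        return MAX_DATETIME
    return EPOCH + timedelta(seconds=unix_time)
```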
Date: Fri, 14 Aug 2015 19:16:29 +0000
From: Eric Erhardt <Eric.Erhardt@microsoft.com>
Message-ID: <CY1PR0301MB153029A652C0DE9D9D21B86B8E7C0@CY1PR0301MB1530.namprd03.prod.outlook.com>

  | the "big bang" transition that didn't appear in the older tz files

No, it is relatively new.

  | This time value isn't possible to represent in .NET

No, it is way too far back in time for that representation. But ...

  | (since DateTime.MinValue is 00:00:00.0000000 UTC, January 1, 0001,

That's not really the reason - again, a very rough calculation (and assuming I did it correctly) means that the format you described should be able to represent +/- (almost) 30,000 years from the epoch - that's something more than 27000 BC.

The only reason I can see for picking that particular minimum value is that it means avoiding the question of what year came before year 1 (some say it was 1 BC, and there was no year 0; others disagree - there is of course no correct answer, as back then years weren't counted this way, and even if they had been, no-one then would have considered the year we now call year 1 as being in any way significant enough to warrant starting counting from then.) It also means avoiding the question of how to represent negative years.

I just tried it on my NetBSD system, and managed to get ...

    Sat May 19 01:22:04 LMT -7537

(that was from "date -r -300000000000" - the -r option on NetBSD allows providing the time_t value to use, rather than getting it from the clock; a linux-like -d also exists, but the formats for that are just too weird). I have no idea if that -7537 is 7537 BC or 7538 BC (ie: whether it is assumed that there was a year 0 or not).

I suspect that this all happens just by accident, and no-one really ever considered the possibility of negative years - it is only since time_t's became 64 bits (the last few years) that it even became possible; before then the range was about 1901..2038. Simply claiming that years before year 1 don't exist avoids both problems, so it is kind of an elegant solution.

  | in the Gregorian calendar).

We all do it, but of course, there was no Gregorian calendar then; Pope Gregory didn't exist yet, nor did his great-great-great grandparents. Nor were there even any popes, the job hadn't been invented yet...

  | 1. I shouldn't be checking explicitly for this value (0xf800000000000000),
  | right? I saw some code comments in zic.c that says it could potentially
  | change in the future.

Just check for values too small (or large) to represent in the format you're using. That one is a LONG way out of that range.

  | 2. Will there ever be more than one transition time that is before
  | January 1, 0001? Or will the "big bang" transition be the only one?

It is kind of unlikely - it's hard getting people to actually include transitions before 1970 .. but back then there was no standard time (no railways, planes, or computer networks that need consistent timekeeping) so it is hard to imagine a reason for anything before about the 16th century ever being meaningful enough to include.

kre
Robert Elz wrote:
a very rough calculation (and assuming I did it correctly) means that the format you described should be able to represent +/- (almost) 30,000 years from the epoch - that's something more than 27000 BC.
I think MS-Windows DateTime is unsigned internally, so it can't represent any times before 0001-01-01 00:00:00 UTC. It's a bit confusing, as MS-Windows has several time types each with their own epoch and tick size and range.
I have no idea if that -7537 is 7537 BC or 7538 BC (ie: whether it is assumed that there was a year 0 or not). I suspect that this all happens just by accident,
No accident. NetBSD assumes year 0. tzcode is the same, as is GNU/Linux and Solaris. There is also year -1, etc. For example, the tzcode 'date' command does this:

    $ date -u -r -62135596800
    Mon Jan  1 00:00:00 GMT 0001
    $ date -u -r -62135596801
    Sun Dec 31 23:59:59 GMT 0000
    $ date -u -r -62167219200
    Sat Jan  1 00:00:00 GMT 0000
    $ date -u -r -62167219201
    Fri Dec 31 23:59:59 GMT -001
    $ date -u -r -67767978442512096
    Tue Jan  1 02:38:24 GMT -2147479778

GNU/Linux 'date' is similar except it says 'UTC' rather than 'GMT' (of course neither abbreviation is correct for these old time stamps).
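Paul's first example can be cross-checked against any proleptic-Gregorian date library; Python's datetime agrees (though it cannot represent year 0 or earlier, so the remaining examples can't be checked the same way):

```python
from datetime import datetime

# Unix time of 0001-01-01 00:00:00 UTC in the proleptic Gregorian calendar.
epoch = datetime(1970, 1, 1)
year_one = datetime(1, 1, 1)
unix_time = int((year_one - epoch).total_seconds())
print(unix_time)  # -62135596800, matching 'date -u -r -62135596800'
```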
On Fri 2015-08-14T20:16:52 -0700, Paul Eggert hath writ:
    $ date -u -r -67767978442512096
    Tue Jan  1 02:38:24 GMT -2147479778
GNU/Linux 'date' is similar except it says 'UTC' rather than 'GMT' (of course neither abbreviation is correct for these old time stamps).
Of course. Williams studied tidal rhythmites and found a nearly constant number of about 410 solar days per year from 2 billion to 1 billion years before present, which is about 77000 SI seconds in one day.

http://onlinelibrary.wiley.com/doi/10.1029/1999RG900016/abstract

That doesn't work with a calendar that supposes 365.25 days per year. Any date before human record keeping should decide whether it is counting seconds, days, or years (and which kind of each) because using the modern relationships does not correspond to anything.

--
Steve Allen <sla@ucolick.org>    Natural Sciences II, Room 165    WGS-84 (GPS)
UCO/Lick Observatory--ISB        Voice: +1 831 459 3046           Lat +36.99855
1156 High Street                 http://www.ucolick.org/~sla/     Lng -122.06015
Santa Cruz, CA 95064                                              Hgt +250 m
Steve Allen wrote:
Williams studied tidal rhythmites and found a nearly constant number of about 410 solar days per year from 2 billion to 1 billion years before present, which is about 77000 SI seconds in one day. http://onlinelibrary.wiley.com/doi/10.1029/1999RG900016/abstract
Thanks for mentioning that; I wasn't aware of this work. It appears, though, that there's still considerable uncertainty about how long the day was way back when. A recent review says that although tidal rhythmite analysis may help estimate ancient lunar orbital periods in terms of lunar days/month, estimating the length of the ancient Earth day remains uncertain because we don't know the length of the ancient lunar sidereal month.

This is in contrast to something else I think you mentioned a while ago, namely the length of the day going back to about 750 BC, for which Richard Stephenson and coworkers have amassed historical eclipse records showing that our UTC-based clocks would be off by about three hours if we naively took them back to the year 0. See, for example, Sauter et al's reconstruction of the total solar eclipse of 0319-05-06, which legend says converted Mirian III of Iberia to Christianity.

Longhitano SG, Mellere D, Steel RJ, Ainsworth RB. Tidal depositional systems in the rock record: a review and new insights. Sedimentary Geology 279, 2-22 (2012-11-20). http://dx.doi.org/10.1016/j.sedgeo.2012.03.024

Morrison L. The length of the day: Richard Stephenson's contribution. Astrophysics and Space Science Proceedings 43 (2015) 3-10. http://dx.doi.org/10.1007/978-3-319-07614-0_1

Sauter J, Simonia I, Stephenson FR, Orchiston W. The legendary fourth-century total solar eclipse in Georgia: fact or fantasy? Astrophysics and Space Science Proceedings 43 (2015) 25-45. http://dx.doi.org/10.1007/978-3-319-07614-0_3
On Fri, Aug 14, 2015, at 21:36, Robert Elz wrote:
I have no idea if that -7537 is 7537 BC or 7538 BC (ie: whether it is assumed that there was a year 0 or not).
It is 7538 BC. The same value works on OSX, and -62150000000 gives the year 0000. This is specified by ISO 8601, but neither the C standard nor (as far as I know) POSIX provides any guidance other than saying the meaning of years less than 1 is unspecified.

I do notice that if I attempt to enter the actual big bang time, I get the error message "date: localtime: Value too large to be stored in data type" - the actual limit being run into seems to be a 32-bit value for tm_year (the most negative year it can represent is -2147481748: -2**31 + 1900). Interestingly, strftime apparently has no trouble formatting a year of 2147485547 (2**31 + 1899) despite that being beyond the 32-bit limit.
I suspect that this all happens just by accident, and no-one really ever considered the possibility of negative years - it is only since time_t's became 64 bits (the last few years) that it even became possible, before then the range was about 1901..2038)
Simply claiming that years before year 1 don't exist avoids both problems, so it is kind of an elegant solution.
Artificially limiting the year range also allows you to use a fixed-size broken-down time format (python datetime uses a 16-bit year) or a floating-point format (MS Excel and therefore COM use a floating-point format measured in days) - both limit the year to 1 through 9999 - without having to contend with different formats having different real limits.
random832@fastmail.us wrote:
strftime apparently has no trouble formatting a year of 2147485547 (2**31+1899) despite that being beyond the 32-bit limit.
tzcode strftime has special code to format years correctly even if tm_year + 1900 exceeds INT_MAX. See the _yconv code involving DIVISOR in strftime.c. As I understand it POSIX requires this sort of thing. That is, although POSIX doesn't require support for UTC years before 1970, POSIX does require that localtime and strftime support UTC years through INT_MAX + 1900 if time_t is wide enough (e.g., the common case of 64-bit time_t and 32-bit int).
Eric Erhardt wrote:
2. Will there ever be more than one transition time that is before January 1, 0001? Or will the "big bang" transition be the only one?
It's unlikely that we'll see more than one transition before that cutoff in the published data. That being said, the binary file format does allow more than one such transition, and it should be easy enough to ignore all but the last one. We were toying with the idea of having zic put one transition at -2**63 and a later transition at the Big Bang, for example, and I'd rather not rule that out in future versions.
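The "ignore all but the last one" behavior Paul suggests is straightforward for a parser to implement. A Python sketch (hypothetical helper; min_time would be whatever the consumer's earliest representable timestamp is):

```python
# Keep only the last transition that falls before the parser's minimum
# representable time, dropping any earlier ones.
def collapse_early_transitions(transitions, min_time):
    early = [t for t in transitions if t < min_time]
    late = [t for t in transitions if t >= min_time]
    return early[-1:] + late

# e.g. a hypothetical file with a -2**63 transition followed by BIG_BANG:
ts = [-2**63, -576460752303423488, -2717647200, 0]
print(collapse_early_transitions(ts, -62135596800))
# [-576460752303423488, -2717647200, 0]
```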
On 13/08/15 17:21, Eric Erhardt wrote:
Notice the first entry is for 1970, and then the next entry is for 1883. This breaks the documentation in 'man tzfile':
The first entry was probably a 'null' timestamp? And the software has to display a valid date, for which the current 'default' is used. I see this sort of problem often when looking at genealogical data where '0' has been used as an unknown date. The results depend on just which date processing software is being used ...

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk
Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
participants (8)
- Eric Erhardt
- Ian Abbott
- Jon Skeet
- Lester Caine
- Paul Eggert
- random832@fastmail.us
- Robert Elz
- Steve Allen