zdump with version 1 files
Howdy,

As part of the implementation of the new zoneinfo module for the Python standard library, I have been attempting to generate some version 1 files for test purposes (in the sadly likely event that some people only have version 1 TZif files deployed). I have done this by taking existing TZif files, truncating them at the second TZif header, and changing the version number in the truncated file.

I notice, however, that with the 2019c version of zdump, the files generated this way are considered invalid:

    $ zdump --version
    zdump (tzcode) 2019c
    $ zdump -i -c2038,2039 $(realpath .)/America/Los_Angeles
    /tmp/zoneinfon3c1bcna/v1/America/Los_Angeles: Invalid argument
    $ zdump -i -c2010,2011 $(realpath .)/America/Los_Angeles
    /tmp/zoneinfon3c1bcna/v1/America/Los_Angeles: Invalid argument

Interestingly, if I /do not/ truncate the file and just change the version to "1" (in both the first *and* second header), it will read the file /and/ it will clearly use the version 2/3 data:

    $ file Australia/Sydney
    Australia/Sydney: timezone data, version 1, no gmt time flags, 5 std time flags, no leap seconds, 142 transition times, 5 abbreviation chars
    $ zdump -i -c2200,2201 $(realpath .)/Australia/Sydney
    TZ="/tmp/zoneinfot3h5_eyf/v1/Australia/Sydney"
    - - +11 AEDT 1
    2200-04-06 02 +10 AEST
    2200-10-05 03 +11 AEDT 1

So I guess my question is: does zdump support well-formed version 1 files, and I am failing to create well-formed version 1 files, or does 2019c's zdump have no support for version 1 files?

I'll note that this whole thing came up because I was wondering what zdump does for times after the last transition in a version 1 file. RFC 8536 indicates that this behavior is undefined - I was planning to hold the value of the offset after the last transition, but I figured I might as well check what zdump does.

Thanks, Paul
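The "hold the value of the offset after the last transition" plan mentioned in the message above could look roughly like the following Python sketch. It is only an illustration, not what zdump or the zoneinfo module actually does; the function name `utcoffset_at` and its parameters are hypothetical, and the transition table is assumed to have been parsed already.

```python
from bisect import bisect_right


def utcoffset_at(timestamp, transitions, offsets_after, initial_offset):
    """Return the UTC offset (in seconds) in effect at a POSIX timestamp.

    transitions    -- sorted transition times, seconds since the epoch
    offsets_after  -- offsets_after[i] applies from transitions[i] onward
    initial_offset -- offset assumed before the first transition
    """
    # Number of transitions at or before the timestamp.
    idx = bisect_right(transitions, timestamp)
    if idx == 0:
        return initial_offset
    # Past the final transition idx == len(transitions), so the last
    # known offset is simply held forever (an assumption, since the
    # behavior there is not pinned down for version 1 data).
    return offsets_after[idx - 1]
```

With two transitions at 100 and 200, any timestamp from 200 onward, however large, gets the offset attached to the second transition.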
Classic files with only 32-bit data start out with "TZif\0" while files with 64-bit data but no TZ string at the end start with "TZif1" and the latest, greatest files start out with "TZif2"--so if you truncate and change the "version number" byte to '\0' rather than '1' all should be well (or at least it is on the system I use).

@dashdashado

On Fri, Mar 20, 2020 at 8:49 AM Paul Ganssle <paul@ganssle.io> wrote:
So I guess my question is: does zdump support well-formed version 1 files, and I am failing to create well-formed version 1 files, or does 2019c's zdump have no support for version 1 files?
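A minimal Python sketch of the truncate-and-patch approach described in this exchange, assuming the header layout given in RFC 8536 (a 44-byte header holding six big-endian 32-bit counts, followed by a 32-bit data block). The names `HEADER` and `truncate_to_v1` are made up for illustration. Instead of searching for the second "TZif" magic, it computes where the version 1 data block ends from the counts in the first header.

```python
import struct

# magic (4 bytes), version (1 byte), 15 reserved bytes, six 32-bit counts
HEADER = struct.Struct(">4sc15x6L")


def truncate_to_v1(data: bytes) -> bytes:
    """Keep only the 32-bit block of a TZif file and mark it as version 1."""
    (magic, _version, isutcnt, isstdcnt,
     leapcnt, timecnt, typecnt, charcnt) = HEADER.unpack_from(data)
    if magic != b"TZif":
        raise ValueError("not a TZif file")
    body_len = (
        timecnt * 4    # 32-bit transition times
        + timecnt      # one type index per transition
        + typecnt * 6  # local time type records
        + charcnt      # abbreviation characters
        + leapcnt * 8  # leap-second records (4-byte time + 4-byte correction)
        + isstdcnt     # standard/wall indicators
        + isutcnt      # UT/local indicators
    )
    end = HEADER.size + body_len
    if len(data) < end:
        raise ValueError("file is shorter than its header claims")
    # The version byte for version 1 is NUL, not the character '1'.
    return b"TZif" + b"\x00" + data[5:end]
```

The output can then be written to a scratch directory and checked with zdump, as in the original report.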
Ah, thank you! I definitely should have realized at least the part about using the NUL byte - and I see that in RFC 8536 now.

However, I note that the RFC 8536 Section 3.1 makes no mention of a '1' version with the 32-bit data: https://tools.ietf.org/html/rfc8536#section-3.1

Are the "TZif1" files non-standard, or are they prevalent enough that I should be aiming to support them?

Thanks, Paul

On 3/20/20 9:43 AM, Arthur David Olson wrote:
Classic files with only 32-bit data start out with "TZif\0" while files with 64-bit data but no TZ string at the end start with "TZif1" and the latest, greatest files start out with "TZif2"--so if you truncate and change the "version number" byte to '\0' rather than '1' all should be well (or at least it is on the system I use).
I believe Arthur meant to say "TZif2" and "TZif3", respectively.

-- Tim Parenti

On Fri, 20 Mar 2020 at 10:05, Paul Ganssle <paul@ganssle.io> wrote:
Ah, thank you! I definitely should have realized at least the part about using the NUL byte - and I see that in RFC 8536 now.
However, I note that the RFC 8536 Section 3.1 makes no mention of a '1' version with the 32-bit data: https://tools.ietf.org/html/rfc8536#section-3.1
Are the "TZif1" files non-standard, or are they prevalent enough that I should be aiming to support them?
Thanks, Paul

On 3/20/20 9:43 AM, Arthur David Olson wrote:
Classic files with only 32-bit data start out with "TZif\0" while files with 64-bit data but no TZ string at the end start with "TZif1" and the latest, greatest files start out with "TZif2"--so if you truncate and change the "version number" byte to '\0' rather than '1' all should be well (or at least it is on the system I use).
https://tools.ietf.org/html/rfc8536#section-3.1 gets it right.

A visit to...
https://ftp.iana.org/tz/releases/
...and a bit of spelunking indicates that as of tzcode2006a.tar.gz, zic was producing "TZif\0" files and as of tzcode2006c.tar.gz it was producing "TZif2" files (with both 64-bit data and a TZ string at the end), so it was a direct move from "TZif\0" to "TZif2" (with no intermediate "TZif1"). There should be no TZif1 files in the wild.

As RFC8536 indicates, the only difference between "TZif3" and "TZif2" is that "TZif3" allows non-POSIX extensions to the TZ string at the end of the file ("TZif2" only uses POSIX-blessed TZ strings).

@dashdashado

On Fri, Mar 20, 2020 at 10:08 AM Tim Parenti <tim@timtimeonline.com> wrote:
I believe Arthur meant to say "TZif2" and "TZif3", respectively.
-- Tim Parenti
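The version bytes settled on in this part of the thread ("TZif\0", "TZif2", "TZif3", and no "TZif1") can be summarized in a short Python sketch. The function name `tzif_version` is hypothetical, and rejecting unknown bytes is a choice made for this example rather than something any particular reader is known to do.

```python
def tzif_version(data: bytes) -> int:
    """Map the fifth byte of a TZif file to its format version."""
    if data[:4] != b"TZif":
        raise ValueError("missing TZif magic")
    versions = {
        b"\x00": 1,  # classic files, 32-bit data only
        b"2": 2,     # adds 64-bit data and a POSIX TZ string footer
        b"3": 3,     # as version 2, but the TZ string may use extensions
    }
    try:
        return versions[data[4:5]]
    except KeyError:
        # Includes b"1", which per the thread zic never produced.
        raise ValueError(f"unrecognized version byte {data[4:5]!r}") from None
```

A stricter or more lenient policy for unknown bytes is equally possible; this sketch only makes the mapping explicit.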
On 3/20/20 7:04 AM, Paul Ganssle wrote:
Are the "TZif1" files non-standard, or are they prevalent enough that I should be aiming to support them?
For what it's worth, tzcode ignores both the version number and the magic "TZif" string when deciphering a TZif file. That is, tzcode merely looks at the file's payload and does the best that it can do. I suppose that tzcode could be pickier about parsing TZif files, but I expect that in practice the pickiness would not matter much and there is some benefit to being generous about what tzcode accepts.
On 3/20/20 5:48 AM, Paul Ganssle wrote:
I have been attempting to generate some version 1 files for test purposes (in the sadly likely event that some people only have version 1 TZif files deployed).
Do you have reason to believe that people still use version-1 TZif files? As far as I can recall, only Android limits itself to 32-bit timestamps (expiring in 2039) among current operating systems, even the 32-bit OSes. However, Android no longer uses separate TZif files and instead bundles up tzdata as part of a single larger file somehow. I don't know where this stuff is written down; it's merely part of my vague recollection and could be incorrect.
On 2020-03-21 14:17, Paul Eggert wrote:
I don't know where this stuff is written down; it's merely part of my vague recollection and could be incorrect.
This appears to be the process doc:
https://source.android.com/devices/tech/config/timezone-rules
and this appears to be the main code:
https://android.googlesource.com/platform/system/timezone/+/refs/heads/maste...
which appears to build and use the distributed zic code to build the rearguard data, including backward, and package it into an Android zip archive format including xml info for ICU to use to locate the tzdata.

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
This email may be disturbing to some readers as it contains too much technical detail. Reader discretion is advised.
On 3/21/20 2:08 PM, Brian Inglis wrote:
This appears to be the process doc:
https://source.android.com/devices/tech/config/timezone-rules
and this appears to be the main code:
https://android.googlesource.com/platform/system/timezone/+/refs/heads/maste...
Thanks for the links. They talk about how TZif files are assembled together into a single tzdata file. However, as I vaguely recall the Android folks modified zic to generate 32-bit only TZif files. I don't offhand know whether that is still true, or how to verify whether it's true. Perhaps it was true long ago but became false when Android started supporting 64-bit CPUs; a bit of Googling suggests this occurred with Android 5.0 Lollipop (2014).
On 2020-03-21 16:24, Paul Eggert wrote:
Thanks for the links. They talk about how TZif files are assembled together into a single tzdata file. However, as I vaguely recall the Android folks modified zic to generate 32-bit only TZif files. I don't offhand know whether that is still true, or how to verify whether it's true. Perhaps it was true long ago but became false when Android started supporting 64-bit CPUs; a bit of Googling suggests this occurred with Android 5.0 Lollipop (2014).
At linked code line 53:
https://android.googlesource.com/platform/system/timezone/+/refs/heads/maste...

    def GenerateZicInputFile(extracted_iana_data_dir):
      # Android APIs assume DST means "summer time" so we follow the rearguard format
      # introduced in 2018e.
      zic_input_file_name = 'rearguard.zi'

[now appears rearguard.zi is the only input]

      # 'NDATA=' is used to remove unnecessary rules files.
      subprocess.check_call(['make', '-C', extracted_iana_data_dir, 'NDATA=', zic_input_file_name])

At linked code line 134:
https://android.googlesource.com/platform/system/timezone/+/refs/heads/maste...

[everything is untarred to a zic build dir and make built]

    zic_build_dir = '%s/zic' % tmp_dir
    ExtractTarFile(iana_zic_code_tar_file, zic_build_dir)
    ExtractTarFile(iana_zic_data_tar_file, zic_build_dir)
    # zic
    print('Building zic...')
    # VERSION_DEPS= is to stop the build process looking for files that might not
    # be present across different versions.
    subprocess.check_call(['make', '-C', zic_build_dir, 'zic'])
    zic_binary_file = '%s/zic' % zic_build_dir
    ...
    print('Generating zic input file...')
    zic_input_file = GenerateZicInputFile(extracted_iana_data_dir)
    print('Calling zic...')
    zic_output_dir = '%s/data' % tmp_dir
    os.mkdir(zic_output_dir)

[everything is generated from rearguard]

    zic_cmd = [zic_binary_file, '-d', zic_output_dir, zic_input_file]

--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
On 3/22/20 9:52 AM, Brian Inglis wrote:
Perhaps it was true long ago but became false when Android started supporting 64-bit CPUs; a bit of Googling suggests this occurred with Android 5.0 Lollipop (2014).
https://android.googlesource.com/platform/system/timezone/+/refs/heads/maste...
    # zic
    print('Building zic...')
    # VERSION_DEPS= is to stop the build process looking for files that might not
    # be present across different versions.
    subprocess.check_call(['make', '-C', zic_build_dir, 'zic'])
Thanks for the pointer. If I understand the code aright, they now generate version 2 or 3 TZif data. It is a little odd, though, since the comment about VERSION_DEPS does not match the code. Perhaps someone who's actually built AOSP can verify. Anyway, for now I'll guess that Android 5.0 Lollipop (2014) and later do not use version 1 format. statcounter.com estimates that 2.08% of Android devices are still running KitKat (2013), the release before Lollipop. If you add in the even-older releases I'd guess about 3% of Android devices predate Lollipop. As far as I know nobody supports these older devices and they are not getting tzdata updates. So it appears that version 1 TZif format is truly dead, and we needn't worry about it.
participants (5)
- Arthur David Olson
- Brian Inglis
- Paul Eggert
- Paul Ganssle
- Tim Parenti