[PROPOSED] Add interoperability sections to tzfile.5
This change consists of text that I contributed to
https://tools.ietf.org/html/draft-murchison-tzdist-tzif-15
and that should be useful generally.
* NEWS, tzfile.5: New sections.
---
 NEWS     |   4 ++
 tzfile.5 | 192 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 195 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 943db3e..68f826e 100644
--- a/NEWS
+++ b/NEWS
@@ -16,6 +16,10 @@ Unreleased, experimental changes
     This reverts to 2011h, as the abbreviation change in 2011i was
     likely inadvertent.
 
+  Changes to documentation
+
+    tzfile.5 has new sections on interoperability issues.
+
 
 Release 2018f - 2018-10-18 00:14:18 -0700
 
diff --git a/tzfile.5 b/tzfile.5
index 79b19bf..bbdccfc 100644
--- a/tzfile.5
+++ b/tzfile.5
@@ -9,6 +9,8 @@ tzfile \- timezone information
 .de q
 \\$3\*(lq\\$1\*(rq\\$2
 ..
+.ie \n(.g .ds - \f(CW-\fP
+.el .ds - \-
 The timezone information files used by
 .BR tzset (3)
 are typically found under a directory with a name like
@@ -186,8 +188,196 @@ from 0 through 24.
 Second, DST is in effect all year if it starts
 January 1 at 00:00 and ends December 31 at 24:00 plus the difference
 between daylight saving and standard time.
-.PP
+.SS Interoperability considerations
 Future changes to the format may append more data.
+.PP
+Version 1 files are considered a legacy format and
+should be avoided, as they do not support transition
+times after the year 2038.
+Readers that only understand Version 1 must ignore
+any data that extends beyond the calculated end of the version
+1 data block.
+.PP
+Writers should generate a version 3 file if
+TZ string extensions are necessary to accurately
+model transition times.
+Otherwise, version 2 files should be generated.
+.PP
+The sequence of time changes defined by the version 1
+header and data block should be a contiguous subsequence
+of the time changes defined by the version 2+ header and data
+block, and by the footer.
+This guideline helps obsolescent version 1 readers
+agree with current readers about timestamps within the
+contiguous subsequence. It also lets writers not
+supporting obsolescent readers use a
+.I tzh_timecnt
+of zero
+in the version 1 data block to save space.
+.PP
+Time zone designations should consist of at least three (3)
+and no more than six (6) ASCII characters from the set of
+alphanumerics,
+.q "-",
+and
+.q "+".
+This is for compatibility with POSIX requirements for
+time zone abbreviations.
+.PP
+When reading a version 2 or 3 file, readers
+should ignore the version 1 header and data block except for
+the purpose of skipping over them.
+.PP
+Readers should calculate the total lengths of the
+headers and data blocks and check that they all fit within
+the actual file size, as part of a validity check for the file.
+.SS Common interoperability issues
+This section documents common problems in reading or writing TZif files.
+Most of these are problems in generating TZif files for use by
+older readers.
+The goals of this section are:
+.IP * 2
+to help TZif writers output files that avoid common
+pitfalls in older or buggy TZif readers,
+.IP *
+to help TZif readers avoid common pitfalls when reading
+files generated by future TZif writers, and
+.IP *
+to help any future specification authors see what sort of
+problems arise when the TZif format is changed.
+.PP
+When new versions of the TZif format have been defined, a
+design goal has been that a reader can successfully use a TZif
+file even if the file is of a later TZif version than what the
+reader was designed for.
+When complete compatibility was not achieved, an attempt was
+made to limit glitches to rarely-used timestamps, and to allow
+simple partial workarounds in writers designed to generate
+new-version data useful even for older-version readers.
+This section attempts to document these compatibility issues and
+workarounds, as well as to document other common bugs in
+readers.
+.PP
+Interoperability problems with TZif include the following:
+.IP * 2
+Some readers examine only version 1 data.
+As a partial workaround, a writer can output as much version 1
+data as possible.
+However, a reader should ignore version 1 data, and should use
+version 2+ data even if the reader's native timestamps have only
+32 bits.
+.IP *
+Some readers designed for version 2 might mishandle
+timestamps after a version 3 file's last transition, because
+they cannot parse extensions to POSIX in the TZ-like string.
+As a partial workaround, a writer can output more transitions
+than necessary, so that only far-future timestamps are
+mishandled by version 2 readers.
+.IP *
+Some readers designed for version 2 do not support
+permanent daylight saving time, e.g., a TZ string
+.q "EST5EDT,0/0,J365/25"
+denoting permanent Eastern Daylight Time (\*-04).
+As a partial workaround, a writer can substitute standard time
+for the next time zone east, e.g.,
+.q "AST4"
+for permanent Atlantic Standard Time (\*-04).
+.IP *
+Some readers ignore the footer, and instead predict future
+timestamps from the time type of the last transition.
+As a partial workaround, a writer can output more transitions
+than necessary.
+.IP *
+Some readers do not use time type 0 for timestamps before
+the first transition, in that they infer a time type using a
+heuristic that does not always select time type 0.
+As a partial workaround, a writer can output a dummy (no-op)
+first transition at an early time.
+.IP *
+Some readers mishandle timestamps before the first
+transition that has a timestamp not less than \*-2**31.
+Readers that support only 32-bit timestamps are likely to be
+more prone to this problem, for example, when they process
+64-bit transitions only some of which are representable in 32
+bits.
+As a partial workaround, a writer can output a dummy
+transition at timestamp \*-2**31.
+.IP *
+Some readers mishandle a transition if its timestamp has
+the minimum possible signed 64-bit value.
+Timestamps less than \*-2**59 are not recommended.
+.IP *
+Some readers mishandle POSIX-style TZ strings that
+contain
+.q "<"
+or
+.q ">".
+As a partial workaround, a writer can avoid using
+.q "<"
+or
+.q ">"
+for time zone abbreviations containing only alphabetic
+characters.
+.IP *
+Many readers mishandle time zone abbreviations that contain
+non-ASCII characters.
+These characters are not recommended.
+.IP *
+Some readers may mishandle time zone abbreviations that
+contain fewer than 3 or more than 6 characters, or that
+contain ASCII characters other than alphanumerics,
+.q "-",
+and
+.q "+".
+These abbreviations are not recommended.
+.IP *
+Some readers mishandle TZif files that specify
+daylight-saving time UT offsets that are less than the UT
+offsets for the corresponding standard time.
+These readers do not support locations like Ireland, which
+uses the equivalent of the POSIX TZ string
+.q "IST\*-1GMT0,M10.5.0,M3.5.0/1",
+observing standard time
+(IST, +01) in summer and daylight saving time (GMT, +00) in winter.
+As a partial workaround, a writer can output data for the
+equivalent of the POSIX TZ string
+.q "GMT0IST,M3.5.0/1,M10.5.0",
+thus swapping standard and daylight saving time.
+Although this workaround misidentifies which part of the year
+uses daylight saving time, it records UT offsets and time zone
+abbreviations correctly.
+.PP
+Some interoperability problems are reader bugs that
+are listed here mostly as warnings to developers of readers.
+.IP * 2
+Some readers do not support negative timestamps.
+Developers of distributed applications should keep this
+in mind if they need to deal with pre-1970 data.
+.IP *
+Some readers mishandle timestamps before the first
+transition that has a nonnegative timestamp.
+Readers that do not support negative timestamps are likely to
+be more prone to this problem.
+.IP *
+Some readers mishandle time zone abbreviations like
+.q "-08"
+that contain
+.q "+",
+.q "-",
+or digits.
+.IP *
+Some readers mishandle UT offsets that are out of the
+traditional range of \*-12 through +12 hours, and so do not
+support locations like Kiritimati that are outside this
+range.
+.IP *
+Some readers mishandle UT offsets in the range [\*-3599, \*-1]
+seconds from UT, because they integer-divide the offset by
+3600 to get 0 and then display the hour part as
+.q "+00".
+.IP *
+Some readers mishandle UT offsets that are not a multiple
+of one hour, or of 15 minutes, or of 1 minute.
 .SH SEE ALSO
 .BR time (2),
 .BR localtime (3),
-- 
2.17.2
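
For readers of this patch, the length check recommended at the end of
"Interoperability considerations" could be sketched in C roughly as
follows. This is an illustration only, not code from tzcode: the names
be32, tzif_block_len and tzif_lengths_fit are hypothetical, and the
layout (a 44-byte header whose last 24 bytes are six big-endian 32-bit
counts, with 4-byte times in the version 1 block and 8-byte times in
the version 2+ block) is assumed from the TZif format description
rather than stated in the patch itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TZIF_HEADER_LEN = 44 };  /* assumed fixed header size */

/* Decode a big-endian 32-bit unsigned count. */
static uint32_t be32(const unsigned char *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16
         | (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

/*
 * Return the combined length of the header at buf[off] and the data
 * block that follows it, or 0 if the header does not fit in the file
 * or lacks the "TZif" magic.  time_size is 4 for the version 1 block
 * and 8 for the version 2+ block.
 */
static uint64_t tzif_block_len(const unsigned char *buf, size_t size,
                               size_t off, unsigned time_size)
{
    if (size < off || size - off < TZIF_HEADER_LEN)
        return 0;
    const unsigned char *h = buf + off;
    if (memcmp(h, "TZif", 4) != 0)
        return 0;
    /* Counts are widened to 64 bits so the sum below cannot overflow. */
    uint64_t isutcnt = be32(h + 20), isstdcnt = be32(h + 24);
    uint64_t leapcnt = be32(h + 28), timecnt = be32(h + 32);
    uint64_t typecnt = be32(h + 36), charcnt = be32(h + 40);
    return TZIF_HEADER_LEN
         + timecnt * time_size      /* transition times */
         + timecnt                  /* transition type indexes */
         + typecnt * 6              /* local time type records */
         + charcnt                  /* designation characters */
         + leapcnt * (time_size + 4) /* leap second records */
         + isstdcnt + isutcnt;      /* indicator bytes */
}

/* Return 1 if all headers and data blocks fit within size bytes. */
int tzif_lengths_fit(const unsigned char *buf, size_t size)
{
    uint64_t v1 = tzif_block_len(buf, size, 0, 4);
    if (v1 == 0 || v1 > size)
        return 0;
    if (buf[4] == '\0')             /* version 1 file: one block only */
        return 1;
    uint64_t v2 = tzif_block_len(buf, size, (size_t)v1, 8);
    return v2 != 0 && v2 <= size - v1;
}
```

The sketch stops at the end of the version 2+ data block; it does not
inspect the footer, and a real reader would still need to validate the
individual fields after this size check passes.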
participants (1)
-
Paul Eggert