
I'd say those cons are pretty significant - I find it significantly harder to read than the format I've proposed. I'm also confused by your "pro" that it doesn't depend on any implementation details: it really *exposes* the implementation details in naming ("isdst" and "gmtoff" for example, along with the mysterious huge numeric values).

The aim is to have a format which is easy and natural to generate on *multiple* platforms, so that authors of other code parsing the data can validate against it. That would be a *very* unnatural format to generate from .NET with either Noda Time or TimeZoneInfo, or from Java 7, Java 8 or Joda Time.

I'd certainly be in favour of my "zicdump" C# code being rewritten in C (and maybe even offered as a different format for zdump, based on a command-line flag) but I see very little benefit in adopting a format which doesn't seem to have been designed for the same purpose as the tzvalidate one.

Now having said all that, I'm very happy to tweak the format (and have already done so based on earlier suggestions) - but I wouldn't want to use the zdump -v format just because it already exists, if it's not really fit for purpose.

Jon

On 27 April 2016 at 16:51, Random832 <random832@fastmail.com> wrote:
I missed this the first time around...
On 11 July 2015 at 11:35, Jon Skeet <skeet@pobox.com> wrote:
Background: I'm the primary developer for Noda Time <http://nodatime.org>, which consumes the tz data. I'm currently refactoring the code to do this... and I've come across some code (originally ported from Joda Time) which I now understand in terms of what it's doing, but not exactly why.
For a little while now, the Noda Time source repo has included a text dump file <https://github.com/nodatime/nodatime/blob/master/src/NodaTime.Test/TestData/...>, containing a text dump of every transition (up to 2100, at the moment) for every time zone. It looks like this, picking just one example:
Zone: Africa/Maseru
LMT: [StartOfTime, 1892-02-07T22:08:00Z) +01:52 (+00)
SAST: [1892-02-07T22:08:00Z, 1903-02-28T22:30:00Z) +01:30 (+00)
SAST: [1903-02-28T22:30:00Z, 1942-09-20T00:00:00Z) +02 (+00)
SAST: [1942-09-20T00:00:00Z, 1943-03-20T23:00:00Z) +03 (+01)
SAST: [1943-03-20T23:00:00Z, 1943-09-19T00:00:00Z) +02 (+00)
SAST: [1943-09-19T00:00:00Z, 1944-03-18T23:00:00Z) +03 (+01)
SAST: [1944-03-18T23:00:00Z, EndOfTime) +02 (+00)
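To give a feel for how little code a consumer needs, here is a rough Python sketch that parses one interval line of the sample above. The regular expression is only my reading of that sample (abbreviation, half-open UTC interval, total offset, then the saving in parentheses); it is an assumption rather than an agreed grammar, and the names `Interval`, `parse_interval` and `offset_to_seconds` are made up for illustration.

```python
import re
from typing import NamedTuple

# Pattern inferred from the sample dump; an assumption, not a published grammar.
_OFFSET = r"[+-]\d{2}(?::\d{2}){0,2}"
LINE = re.compile(
    r"^(?P<abbr>\S+): \[(?P<start>\S+), (?P<end>\S+)\) "
    rf"(?P<offset>{_OFFSET}) \((?P<saving>{_OFFSET})\)$"
)


class Interval(NamedTuple):
    abbr: str             # e.g. "SAST"
    start: str            # ISO-8601 UTC instant or the literal "StartOfTime"
    end: str              # ISO-8601 UTC instant or the literal "EndOfTime"
    offset_seconds: int   # total offset from UTC
    saving_seconds: int   # daylight saving component of that offset


def offset_to_seconds(text: str) -> int:
    """Convert '+02', '+01:52' or '-03:30:00' to a signed number of seconds."""
    sign = -1 if text[0] == "-" else 1
    parts = [int(p) for p in text[1:].split(":")]
    parts += [0] * (3 - len(parts))  # pad missing minutes/seconds with zero
    hours, minutes, seconds = parts
    return sign * (hours * 3600 + minutes * 60 + seconds)


def parse_interval(line: str) -> Interval:
    """Parse one interval line; raise ValueError if it does not match."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised interval line: {line!r}")
    return Interval(
        match["abbr"],
        match["start"],
        match["end"],
        offset_to_seconds(match["offset"]),
        offset_to_seconds(match["saving"]),
    )
```

Calling `parse_interval` on each non-"Zone:" line of a dump gives tuples that can be compared directly against whatever another implementation produces.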
...
Any thoughts? If the feeling is broadly positive, the next step would be to nail down the text format, then find a willing victim/volunteer to write the C code. (You really don't want me writing C...)
What's wrong with zdump's output format?
$ zdump -v Africa/Maseru
Africa/Maseru -9223372036854775808 = NULL
Africa/Maseru -9223372036854689408 = NULL
Africa/Maseru Sun Feb 7 22:07:59 1892 UTC = Sun Feb 7 23:59:59 1892 LMT isdst=0 gmtoff=6720
Africa/Maseru Sun Feb 7 22:08:00 1892 UTC = Sun Feb 7 23:38:00 1892 SAST isdst=0 gmtoff=5400
Africa/Maseru Sat Feb 28 22:29:59 1903 UTC = Sat Feb 28 23:59:59 1903 SAST isdst=0 gmtoff=5400
Africa/Maseru Sat Feb 28 22:30:00 1903 UTC = Sun Mar 1 00:30:00 1903 SAST isdst=0 gmtoff=7200
Africa/Maseru Sat Sep 19 23:59:59 1942 UTC = Sun Sep 20 01:59:59 1942 SAST isdst=0 gmtoff=7200
Africa/Maseru Sun Sep 20 00:00:00 1942 UTC = Sun Sep 20 03:00:00 1942 SAST isdst=1 gmtoff=10800
Africa/Maseru Sat Mar 20 22:59:59 1943 UTC = Sun Mar 21 01:59:59 1943 SAST isdst=1 gmtoff=10800
Africa/Maseru Sat Mar 20 23:00:00 1943 UTC = Sun Mar 21 01:00:00 1943 SAST isdst=0 gmtoff=7200
Africa/Maseru Sat Sep 18 23:59:59 1943 UTC = Sun Sep 19 01:59:59 1943 SAST isdst=0 gmtoff=7200
Africa/Maseru Sun Sep 19 00:00:00 1943 UTC = Sun Sep 19 03:00:00 1943 SAST isdst=1 gmtoff=10800
Africa/Maseru Sat Mar 18 22:59:59 1944 UTC = Sun Mar 19 01:59:59 1944 SAST isdst=1 gmtoff=10800
Africa/Maseru Sat Mar 18 23:00:00 1944 UTC = Sun Mar 19 01:00:00 1944 SAST isdst=0 gmtoff=7200
Africa/Maseru 9223372036854689407 = NULL
Africa/Maseru 9223372036854775807 = NULL
Cons:
- A bit verbose - technically uses instants (from before and on each transition) rather than spans.
- The NULLs are a bit mysterious
- I'm personally not sure *exactly* how it finds the transitions, and in particular I'm not sure if it will reliably find multiple transitions per day
Pros:
- Already exists
- Is already written in C, and already installed on many systems
- Does not depend on any implementation internals
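For what it's worth, the "instants rather than spans" point can be bridged mechanically. The Python sketch below folds `zdump -v` lines into half-open spans, under several assumptions of mine: each data line has the 16 whitespace-separated fields seen in the output above (including `gmtoff=`), NULL lines are skipped, and two consecutive samples exactly one second apart mark a transition. The function names are hypothetical. Note that `isdst` is only a flag, so the saving amount is not recovered here.

```python
from datetime import datetime, timedelta

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_sample(line):
    """Return (utc, abbr, isdst, gmtoff) for a data line, or None otherwise."""
    tok = line.split()
    if len(tok) != 16 or tok[6] != "UTC":
        return None  # NULL boundary lines and anything unexpected
    hour, minute, second = (int(p) for p in tok[4].split(":"))
    # Naive datetime, interpreted as UTC throughout this sketch.
    utc = datetime(int(tok[5]), MONTHS[tok[2]], int(tok[3]), hour, minute, second)
    isdst = tok[14].split("=")[1] == "1"
    gmtoff = int(tok[15].split("=")[1])
    return utc, tok[13], isdst, gmtoff


def to_spans(lines):
    """Fold before/at sample pairs into (start, end, abbr, isdst, gmtoff) spans.

    A start of None stands for StartOfTime, an end of None for EndOfTime.
    """
    samples = [s for s in map(parse_sample, lines) if s is not None]
    if not samples:
        return []
    spans = []
    start, state = None, samples[0][1:]
    for prev, cur in zip(samples, samples[1:]):
        if cur[0] - prev[0] == timedelta(seconds=1):  # assumed transition marker
            spans.append((start, cur[0], *state))
            start, state = cur[0], cur[1:]
    spans.append((start, None, *state))
    return spans


def format_offset(seconds):
    """Render 6720 as '+01:52' and 7200 as '+02', dropping zero components."""
    sign = "-" if seconds < 0 else "+"
    hours, rem = divmod(abs(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    text = f"{sign}{hours:02d}"
    if minutes or secs:
        text += f":{minutes:02d}"
    if secs:
        text += f":{secs:02d}"
    return text


def format_span(span):
    """Render a span in a style close to the interval dump quoted earlier."""
    start, end, abbr, isdst, gmtoff = span
    lo = "StartOfTime" if start is None else start.isoformat() + "Z"
    hi = "EndOfTime" if end is None else end.isoformat() + "Z"
    return f"{abbr}: [{lo}, {hi}) {format_offset(gmtoff)} isdst={int(isdst)}"
```

Feeding it the lines of the Africa/Maseru output should give seven spans, one per interval in the earlier dump. Whether the one-second heuristic holds for every zone is exactly the kind of thing I'm unsure about.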