I'd say those cons are pretty significant - I find it much harder to read than the format I've proposed. I'm also confused by your "pro" that it doesn't depend on any implementation internals, when it really exposes implementation details in its naming ("isdst" and "gmtoff" for example, along with the mysterious huge numeric values).
The aim is to have a format which is easy and natural to generate on multiple platforms, so that authors of other code parsing the data can validate against it. The zdump -v format would be very unnatural to generate from .NET with either Noda Time or TimeZoneInfo, or from Java 7, Java 8 or Joda Time.

I'd certainly be in favour of my "zicdump" C# code being rewritten in C (and maybe even offered as a different format for zdump, based on a command-line flag) but I see very little benefit in adopting a format which doesn't seem to have been designed for the same purpose as the tzvalidate one.

Now having said all that, I'm very happy to tweak the format (and have already done so based on earlier suggestions) - but I wouldn't want to use the zdump -v format just because it already exists, if it's not really fit for purpose.

Jon


On 27 April 2016 at 16:51, Random832 <random832@fastmail.com> wrote:
I missed this the first time around...

> On 11 July 2015 at 11:35, Jon Skeet <skeet@pobox.com> wrote:
>
> > Background: I'm the primary developer for Noda Time <http://nodatime.org> which
> > consumes the tz data. I'm currently refactoring the code to do this... and
> > I've come across some code (originally ported from Joda Time) which I now
> > understand in terms of what it's doing, but not exactly why.
> >
> > For a little while now, the Noda Time source repo has included a text
> > dump file
> > <https://github.com/nodatime/nodatime/blob/master/src/NodaTime.Test/TestData/tzdb-dump.txt>,
> > containing a text dump of every transition (up to 2100, at the moment) for
> > every time zone. It looks like this, picking just one example:
> >
> > Zone: Africa/Maseru
> > LMT: [StartOfTime, 1892-02-07T22:08:00Z) +01:52 (+00)
> > SAST: [1892-02-07T22:08:00Z, 1903-02-28T22:30:00Z) +01:30 (+00)
> > SAST: [1903-02-28T22:30:00Z, 1942-09-20T00:00:00Z) +02 (+00)
> > SAST: [1942-09-20T00:00:00Z, 1943-03-20T23:00:00Z) +03 (+01)
> > SAST: [1943-03-20T23:00:00Z, 1943-09-19T00:00:00Z) +02 (+00)
> > SAST: [1943-09-19T00:00:00Z, 1944-03-18T23:00:00Z) +03 (+01)
> > SAST: [1944-03-18T23:00:00Z, EndOfTime) +02 (+00)

...

> > Any thoughts? If the feeling is broadly positive, the next step would be
> > to nail down the text format, then find a willing victim/volunteer to write
> > the C code. (You really don't want me writing C...)

What's wrong with zdump's output format?

$ zdump -v Africa/Maseru
Africa/Maseru  -9223372036854775808 = NULL
Africa/Maseru  -9223372036854689408 = NULL
Africa/Maseru  Sun Feb  7 22:07:59 1892 UTC = Sun Feb  7 23:59:59 1892 LMT isdst=0 gmtoff=6720
Africa/Maseru  Sun Feb  7 22:08:00 1892 UTC = Sun Feb  7 23:38:00 1892 SAST isdst=0 gmtoff=5400
Africa/Maseru  Sat Feb 28 22:29:59 1903 UTC = Sat Feb 28 23:59:59 1903 SAST isdst=0 gmtoff=5400
Africa/Maseru  Sat Feb 28 22:30:00 1903 UTC = Sun Mar  1 00:30:00 1903 SAST isdst=0 gmtoff=7200
Africa/Maseru  Sat Sep 19 23:59:59 1942 UTC = Sun Sep 20 01:59:59 1942 SAST isdst=0 gmtoff=7200
Africa/Maseru  Sun Sep 20 00:00:00 1942 UTC = Sun Sep 20 03:00:00 1942 SAST isdst=1 gmtoff=10800
Africa/Maseru  Sat Mar 20 22:59:59 1943 UTC = Sun Mar 21 01:59:59 1943 SAST isdst=1 gmtoff=10800
Africa/Maseru  Sat Mar 20 23:00:00 1943 UTC = Sun Mar 21 01:00:00 1943 SAST isdst=0 gmtoff=7200
Africa/Maseru  Sat Sep 18 23:59:59 1943 UTC = Sun Sep 19 01:59:59 1943 SAST isdst=0 gmtoff=7200
Africa/Maseru  Sun Sep 19 00:00:00 1943 UTC = Sun Sep 19 03:00:00 1943 SAST isdst=1 gmtoff=10800
Africa/Maseru  Sat Mar 18 22:59:59 1944 UTC = Sun Mar 19 01:59:59 1944 SAST isdst=1 gmtoff=10800
Africa/Maseru  Sat Mar 18 23:00:00 1944 UTC = Sun Mar 19 01:00:00 1944 SAST isdst=0 gmtoff=7200
Africa/Maseru  9223372036854689407 = NULL
Africa/Maseru  9223372036854775807 = NULL

Cons:
- A bit verbose
- technically uses instants (from before and on each transition) rather
than spans.
- The NULLs are a bit mysterious
- I'm personally not sure *exactly* how it finds the transitions, and in
particular I'm not sure if it will reliably find multiple transitions
per day
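
On the NULL lines: the four huge numbers in the output above are exactly the limits of a signed 64-bit integer, plus the same limits moved inwards by one day (86400 seconds). Presumably they are raw time_t values that could not be converted to a calendar time, hence "NULL", though that reading is an assumption rather than something the output states. A small Python sketch of the arithmetic (the helper name describe_boundary is made up for illustration):

```python
# Assumption: the huge values are raw 64-bit signed time_t values.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
SECONDS_PER_DAY = 24 * 60 * 60


def describe_boundary(value: int) -> str:
    """Describe a raw value from a "= NULL" line relative to the int64 limits."""
    known = {
        INT64_MIN: "INT64_MIN",
        INT64_MIN + SECONDS_PER_DAY: "INT64_MIN + 1 day",
        INT64_MAX - SECONDS_PER_DAY: "INT64_MAX - 1 day",
        INT64_MAX: "INT64_MAX",
    }
    return known.get(value, "not an int64 boundary value")


# The four values that appear in the Africa/Maseru output.
for raw in (-9223372036854775808, -9223372036854689408,
            9223372036854689407, 9223372036854775807):
    print(raw, "->", describe_boundary(raw))
```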

Pros:
- Already exists
- Is already written in C, and already installed on many systems
- Does not depend on any implementation internals
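
To make the comparison between the two formats concrete, here is a rough Python sketch that folds zdump -v records into span lines resembling the proposed dump format. It rests on several assumptions: records have the shape shown above (possibly wrapped before the abbreviation), any change in abbreviation, isdst or gmtoff between consecutive records marks a transition, and the offset formatting rule is guessed from the Maseru sample. zdump reports only an isdst flag, not the amount of savings, so the sketch prints the flag rather than a "(+01)" style column. The names zdump_to_spans and format_offset are hypothetical.

```python
import re

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches "<Mon> <day> <hh:mm:ss> <year> UTC = <local time> <abbr> isdst=N gmtoff=N".
# \s+ before the abbreviation tolerates a line wrap at that point.
RECORD = re.compile(
    r"(\w{3}) +(\d+) (\d\d:\d\d:\d\d) (\d{4}) UTC = .*?\s+(\S+) "
    r"isdst=(\d) gmtoff=(-?\d+)")


def format_offset(seconds: int) -> str:
    """Render a UTC offset as +HH, +HH:MM or +HH:MM:SS (rule assumed from the sample)."""
    sign = "-" if seconds < 0 else "+"
    hours, rest = divmod(abs(seconds), 3600)
    minutes, secs = divmod(rest, 60)
    text = f"{sign}{hours:02d}"
    if minutes or secs:
        text += f":{minutes:02d}"
    if secs:
        text += f":{secs:02d}"
    return text


def zdump_to_spans(text: str) -> list[str]:
    """Turn zdump -v output into half-open [start, end) span lines."""
    records = []
    for match in RECORD.finditer(text):
        month, day, time, year, abbr, isdst, gmtoff = match.groups()
        utc = f"{int(year):04d}-{MONTHS[month]:02d}-{int(day):02d}T{time}Z"
        records.append((utc, abbr, int(isdst), int(gmtoff)))
    if not records:
        return []

    spans = []
    start, current = "StartOfTime", records[0][1:]
    for utc, abbr, isdst, gmtoff in records[1:]:
        info = (abbr, isdst, gmtoff)
        if info != current:  # first record carrying new values marks the transition
            spans.append((current, start, utc))
            start, current = utc, info
    spans.append((current, start, "EndOfTime"))

    return [f"{abbr}: [{begin}, {end}) {format_offset(gmtoff)} isdst={isdst}"
            for (abbr, isdst, gmtoff), begin, end in spans]


sample = """\
Africa/Maseru  Sun Feb  7 22:07:59 1892 UTC = Sun Feb  7 23:59:59 1892 LMT isdst=0 gmtoff=6720
Africa/Maseru  Sun Feb  7 22:08:00 1892 UTC = Sun Feb  7 23:38:00 1892 SAST isdst=0 gmtoff=5400
"""
print("\n".join(zdump_to_spans(sample)))
# LMT: [StartOfTime, 1892-02-07T22:08:00Z) +01:52 isdst=0
# SAST: [1892-02-07T22:08:00Z, EndOfTime) +01:30 isdst=0
```

The NULL lines are simply skipped, since they carry no "UTC =" part for the pattern to match.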