RE: Possible corrections?

July 6, 2006

      Just to clarify...

I'm not advocating changing the format of the datastore.  You'll notice I
refer to it as a datastore, not a database, because calling it a database
implies, IMHO, that it is structured in a dedicated database format, which
of course it is not.  Easy editing by humans is certainly the goal; what I
suggested is merely avoiding the ambiguity associated with some of the
data, not changing the structure itself.

As I get more comfortable with my new Linux test machine, I would
probably agree that the zic output may be a better choice for
machine exchange, and my efforts would have taken less time if I had had
a system on which to run zic when I began.  However, when the original
parsing language and programs are not used with the native datastore,
the TZINFO datastore itself is what serves as the base.  The portability
of the C language is not the issue when converting to a datastore that
integrates easily with MS .Net compliant apps.  The MS O/S keeps its
timezone information in the registry, and that is not exactly a good
thing, given its lack of support for historical data.  Others on the list
have created apps using Ruby, Perl, and other languages to support MS O/S
apps and MS .Net integration, so it's just a matter of providing the
datastore as a base for whatever integration is required.  The apps
would then deal with the data and timezone conversion in whatever way is
needed.

I think the learning curve here for most newer folks is understanding
the legacy of the TZ datastore.  Many newbies are not familiar with C or
POSIX, and really need to understand that the datastore is formatted
just as Ken says: 'The tzdata source format is intended for easy editing
by humans...' and 'the primary format really needs to be oriented for
the ease of human editing and proofreading.'

So, mentioning this kind of information in the download
would be very beneficial, and may lessen the frustration for folks
wanting to use the datastore for their own development efforts.  I would
not change anything; after all is said, my development work depends
on the datastore staying the same as it is now.  I can handle the
cross-machine irregularities after reading the zic doc, and that's the
way it should be.  Of course, developers need to actually 'read' the
doc.

-----Original Message-----
From: Ken Pizzini [mailto:tz.@explicate.org] 
Sent: Wednesday, July 05, 2006 3:43 PM
To: tz@lecserver.nci.nih.gov
Cc: Phillip Guerra
Subject: Re: Possible corrections?

On Wed, Jul 05, 2006 at 10:59:43AM -0500, Phillip Guerra wrote:
> Yes, it would be a help to provide the documentation note.  As one of
> the new 'independent' TZ database users (actually developer), I
> struggled with the format, until I found the zic doc by accident.  I
> didn't have a linux or Unix system available for a resource, so was
> trying to use the TZ datastore and convert it for use with MS ASP.Net.
> The tab formatting of the file was not consistent, and I remedied that
> by parsing it with custom programs that converted it.

It sounds like what is missing is documentation (presumably in the
tzdata file comments?) that independent developers are encouraged to
download the reference tzcode tarball.  This way they get both the
zic.8.txt documentation of the zone file format, and the reference
implementation of zic.c to use both as a starting point for their own
conversion software and to clarify any ambiguities in the zic.8 document
(ideally, in this latter case, with a bug report about the ambiguity
sent to the TZ list).
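
As an illustration of the tab-formatting problem described above, here is
a minimal Python sketch of whitespace-tolerant field splitting.  It relies
on the zic.8 rule that fields are separated by any number of white space
characters and that an unquoted '#' starts a comment.  The function name
split_fields and the sample line are made up for this example, and fields
enclosed in double quotes are deliberately not handled.

```python
def split_fields(line):
    """Split one tzdata source line into fields.

    Assumption of this sketch: no field contains a quoted '#' or
    quoted white space, so plain string splitting is enough.
    """
    # Drop everything from the first '#' to the end of the line.
    content = line.split("#", 1)[0]
    # str.split() with no argument treats any run of tabs and
    # spaces as a single separator and ignores leading/trailing
    # white space, so inconsistent tabbing does not matter.
    return content.split()


# Illustrative line with deliberately mixed tabs and spaces.
sample = "Rule  US\t1967 \t1973\t-\tApr\tlastSun\t2:00\t1:00\tD  # example"
fields = split_fields(sample)
print(fields)
```

Blank lines and comment-only lines come back as an empty list, so a
caller can simply skip them.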

> I don't really have a preference for which operating system is a
> developer or user's choice,

While the heritage is Unix, the tzcode source is mostly portable ANSI C
(ignoring the code for the "date" target, which has highly
system-specific code, and the tendency to use POSIX I/O even when stdio
would be fine).  The bit that parses the tzdata files is highly system
agnostic.

> but find it really handy to opt for some kind of datastore structure
> that is easily translated from one system to another.  It's just a
> nice feature to provide.  In this case, the TZ datastore is the de
> facto standard for this type of information, not only for Unix /Linux
> systems but becoming a standard reference tool for other systems as
> well, like it or not.

The tzdata source format is intended for easy editing by humans, not
interchange between machines; if one is interested in machine-oriented
data, the zic output is probably a better first choice, though creating
a third format that is more database-like (transformed from tzdata by a
zic-like tool) might be a reasonable idea.  But the primary format
really needs to be oriented for the ease of human editing and
proofreading.
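
One hypothetical shape for such a database-like third format is a keyed
record per Rule line.  The Python sketch below is only an assumption
about what a converter might emit, not an existing tool: the key names
are adapted from the Rule column headings in zic.8 (NAME FROM TO TYPE IN
ON AT SAVE LETTER/S), and the function rule_to_record, the JSON output,
and the sample line are invented for illustration.

```python
import json

# Key names adapted from the zic.8 Rule line headings (an assumption
# of this sketch; a real converter may choose different names).
RULE_FIELDS = ["name", "from", "to", "type", "in", "on", "at", "save", "letters"]


def rule_to_record(line):
    """Turn one Rule line into a dict keyed by column name."""
    # Strip any trailing comment, then split on runs of white space.
    fields = line.split("#", 1)[0].split()
    if len(fields) != 10 or fields[0] != "Rule":
        raise ValueError("not a 10-field Rule line: %r" % line)
    return dict(zip(RULE_FIELDS, fields[1:]))


# Illustrative input line.
record = rule_to_record("Rule\tUS\t1967\t1973\t-\tApr\tlastSun\t2:00\t1:00\tD")
print(json.dumps(record))
```

The human-edited source stays the primary format; records like this
would be generated from it and never edited by hand.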

> The only critique I have is the
> somewhat confusing aspect of the use of the AT field:
[...]
>              and hour 24 is midnight at the end of the day.  Any
>              of these forms may be followed by the letter w if
>              the given time is local "wall clock" time, s if the
>              given time is local "standard" time, or u (or g or
>              z) if the given time is universal time; in the
>              absence of an indicator, wall clock time is assumed.
> It would lessen the confusion over how to translate this field if
> the data were always presented in wall clock time, for us independent
> types, anyway.

Again, the source file is oriented towards human editing.  The suffixes
are provided to allow the tzdata entry to mimic the official legislation
or decrees, to allow easier eyeball cross-checking.

		--Ken Pizzini
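
For anyone translating the AT field, the suffix rule quoted above from
the zic documentation can be sketched in a few lines of Python.  This is
a minimal illustration, not the reference zic.c logic: the function name
parse_at and the returned labels are made up here, only the lowercase
suffixes named in the quoted text are recognized, and only the h, h:mm
and h:mm:ss forms are parsed.

```python
# Suffix letters from the quoted zic text; the labels are this
# sketch's own naming.
SUFFIX_KINDS = {
    "w": "wall",
    "s": "standard",
    "u": "universal",
    "g": "universal",
    "z": "universal",
}


def parse_at(field):
    """Return (seconds_after_midnight, kind) for an AT field."""
    kind = "wall"  # no indicator means wall clock time
    if field and field[-1] in SUFFIX_KINDS:
        kind = SUFFIX_KINDS[field[-1]]
        field = field[:-1]
    # Accept "2", "2:00" or "1:28:14" by padding missing parts with 0.
    parts = [int(p) for p in field.split(":")]
    parts += [0] * (3 - len(parts))
    hours, minutes, seconds = parts
    return hours * 3600 + minutes * 60 + seconds, kind


print(parse_at("2:00"))
print(parse_at("2:00s"))
print(parse_at("1:00u"))
```

Converting a standard or universal time to wall clock time still needs
the zone's offset and the savings in effect at that moment, which is why
the suffix has to be carried along rather than dropped.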
