Re: [tz] [Patch] Make it slightly easier to parse tzdata
On Sat, Nov 1, 2014, at 16:24, Paul Eggert wrote:
Brian Inglis wrote:
This would require a tradeoff between the compiled sizes of the decompressors on your platform and the compressed data: 94K tz.tar.xz
We can shrink it even further than that, as follows:
This requires embedding zic, though. I got the impression his post was about compressing the compiled zoneinfo files (which are, after all, what is required by the library functions)
On 2014-11-01 16:35, random832@fastmail.us wrote:
On Sat, Nov 1, 2014, at 16:24, Paul Eggert wrote:
Brian Inglis wrote:
This would require a tradeoff between the compiled sizes of the decompressors on your platform and the compressed data: 94K tz.tar.xz
We can shrink it even further than that, as follows:
This requires embedding zic, though. I got the impression his post was about compressing the compiled zoneinfo files (which are, after all, what is required by the library functions)
He said he was unable to use zic and zoneinfo, so he was talking about simplifying the source data to run against his Python parser, which could not handle constructs like last weekday of month, or 24:00 time.
If he does that, he is reverting the data to an earlier state that did not properly reflect the DST rules. He loses the benefit of using the widely distributed and tested code and data, has to maintain his own patches against the distributed source, and is left relying on his weak parser, which may have other bugs in handling the existing source data.
The alternative, as others have pointed out, is to find some way to use zic precompiled binary data.
Many other projects have chosen to use compressed binary archive formats: that approach may meet his unstated needs better, and would be generally useful in the embedded space and for network updates to similar devices. That approach could be added to the source code as an alternative to the standard hosted directory structure, and could justify adding another compressed binary archive to the public distribution.
-- Take care. Thanks, Brian Inglis
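For readers wondering what handling those two constructs involves, here is a minimal Python sketch. It is not the parser discussed in this thread and not zic's own logic; the function names resolve_on and resolve_at are hypothetical. It assumes three-letter capitalised weekday names and ignores negative or "-" AT values.

```python
import calendar
import datetime as dt

WEEKDAYS = {"Mon": 0, "Tue": 1, "Wed": 2, "Thu": 3, "Fri": 4, "Sat": 5, "Sun": 6}


def resolve_on(year, month, on):
    """Resolve a Rule line's ON field to a date (hypothetical helper)."""
    if on.startswith("last"):
        # "lastSun": step back from the last day of the month to that weekday.
        target = WEEKDAYS[on[4:7]]
        last = dt.date(year, month, calendar.monthrange(year, month)[1])
        return last - dt.timedelta(days=(last.weekday() - target) % 7)
    if ">=" in on:
        # "Sun>=8": first matching weekday on or after the given day.
        name, day = on.split(">=")
        start = dt.date(year, month, int(day))
        return start + dt.timedelta(days=(WEEKDAYS[name[:3]] - start.weekday()) % 7)
    if "<=" in on:
        # "Sun<=25": last matching weekday on or before the given day.
        name, day = on.split("<=")
        start = dt.date(year, month, int(day))
        return start - dt.timedelta(days=(start.weekday() - WEEKDAYS[name[:3]]) % 7)
    # Plain day of month, e.g. "15".
    return dt.date(year, month, int(on))


def resolve_at(day, at):
    """Combine a date with an AT field such as "2:00", "2:00s" or "24:00"."""
    clock = at.rstrip("wsugz")  # drop the wall/standard/universal suffix
    parts = [int(p) for p in clock.split(":")]
    parts += [0] * (3 - len(parts))
    hours, minutes, seconds = parts
    # Adding a timedelta lets 24:00 roll over to midnight of the next day.
    midnight = dt.datetime.combine(day, dt.time())
    return midnight + dt.timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

Treating the AT value as an offset from midnight, rather than as a clock time, is what makes 24:00 unremarkable: it simply lands on the following day.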
On 02/11/14 08:41, Brian Inglis wrote:
He said he was unable to use zic and zoneinfo, so he was talking about simplifying the source data to run against his Python parser, which could not handle constructs like last weekday of month, or 24:00 time.
If he does that, he is reverting the data to an earlier state that did not properly reflect the DST rules. He loses the benefit of using the widely distributed and tested code and data, has to maintain his own patches against the distributed source, and is left relying on his weak parser, which may have other bugs in handling the existing source data.
The alternative as others have pointed out is to find some way to use zic precompiled binary data.
Many other projects have chosen to use compressed binary archive formats: that approach may meet his unstated needs better, would be generally useful in the embedded space, and for network updates to similar devices. That approach could be added to the source code as an alternative to the standard hosted directory structure, and justify adding another compressed binary archive to the public distribution.
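As a rough illustration of the compressed binary archive idea, the following Python sketch packs already compiled zone files into a single .tar.xz and reads one zone back without unpacking to disk. The function names pack_zoneinfo and read_zone are hypothetical, and the archive layout (one member per zone name) is an assumption, not a format that tz distributes.

```python
import io
import tarfile


def pack_zoneinfo(files):
    """Pack {zone name: compiled zone bytes} into an in-memory .tar.xz archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_zone(archive, name):
    """Return the compiled bytes of one zone; raises KeyError if it is absent."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:xz") as tar:
        member = tar.extractfile(name)
        if member is None:  # directory or other non-file member
            raise KeyError(name)
        return member.read()
```

On Python 3.9 or later the returned bytes could, in principle, be wrapped in io.BytesIO and handed to zoneinfo.ZoneInfo.from_file, so the device never needs zic or the source files.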
It is probably worth adding to that nice summary that iCalendar has its own method of describing rules, and the tzdist discussions are based on transforming tz data into THAT format before transmission. Since the result cannot then be transformed BACK into a tz rule set, that makes a lot of existing code obsolete anyway?
Paul - I have my own reasons for wanting to maintain the existing tz rules after transmission, but is this another one dictating that a proper transparent transmission format is a requirement earlier rather than later?
-- Lester Caine - G8HFL ----------------------------- Contact - http://lsces.co.uk/wiki/?page=contact L.S.Caine Electronic Services - http://lsces.co.uk EnquirySolve - http://enquirysolve.com/ Model Engineers Digital Workshop - http://medw.co.uk Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
Lester Caine wrote:
I have my own reasons for wanting to maintain the existing tz rules after transmission, but is this another one dictating that a proper transparent transmission format is a requirement earlier rather than later?
It appears to be further support for that sort of thing, yes. This is basically a tzdist issue, not a tz issue, right? Unless the idea is that tz also start distributing binaries?
On 03/11/14 06:42, Paul Eggert wrote:
Lester Caine wrote:
I have my own reasons for wanting to maintain the existing tz rules after transmission, but is this another one dictating that a proper transparent transmission format is a requirement earlier rather than later?
It appears to be further support for that sort of thing, yes. This is basically a tzdist issue, not a tz issue, right? Unless the idea is that tz also start distributing binaries?
The basic problem is trying to compartmentalise elements of a system that needs to work as a whole. tz provides a set of tools and a set of data that - apart from our different views on historic details - does a good job of producing a single standard set of material. tzdist fails at the first hurdle in plugging the very hole it's trying to plug, simply because it is targeting a different end platform. Most operating systems don't use iCalendar internally?
If tzdist had not started to realise that it is never going to provide a single clean source of tz data, I think there would be a case for tz plugging that hole itself. All the tools are in place to simply provide each update as a diff, which to my mind is all that is missing to allow a cleaner update mechanism that would eliminate the need for tzdist?
tzdist has to identify properly just what it is providing, and for the majority of end users it is simply maintaining a clean, up-to-date copy of tz to use with existing tools? Providing a cut-down version of that as a service for simple devices does not need iCalendar, but is now complicated since one has to know what set of data your 'desktop' devices are using so that the simple devices use the same rules?
The original draft proposal for tzdist seemed to be based on the idea that there WAS a single set of data to work with, and that everybody would always have the current version. That was only ever going to work if there WAS only one source of published data, but as a lot of the traffic here shows ... like Mike's last night ... users need to know which version they are using so they know that they may not be seeing the right results. And we still need an authoritative pre-1970 source of data.
:)
-- Lester Caine - G8HFL
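The "each update as a diff" idea can be sketched with Python's standard difflib. This is only an illustration of the mechanism: release_diff is a hypothetical helper, and the version labels and file name passed to it are whatever the caller chooses, not an agreed naming scheme.

```python
import difflib


def release_diff(old_text, new_text, filename, old_version, new_version):
    """Return a unified diff of one tz source file between two releases."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines,
            new_lines,
            # Labelling both sides with the release lets a client check
            # which version it is patching from and to.
            fromfile=f"{old_version}/{filename}",
            tofile=f"{new_version}/{filename}",
        )
    )
```

Because the header names both releases, a client applying such a patch can tell which version it started from and which it ends up on, which is the version-awareness point made above.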
On 2014-11-02 02:08, Lester Caine wrote:
On 02/11/14 08:41, Brian Inglis wrote:
Many other projects have chosen to use compressed binary archive formats: that approach may meet his unstated needs better, would be generally useful in the embedded space, and for network updates to similar devices. That approach could be added to the source code as an alternative to the standard hosted directory structure, and justify adding another compressed binary archive to the public distribution.
It is probably worth adding to that nice summary that iCalendar has its own method of describing rules, and the tzdist discussions are based on transforming tz data into THAT format before transmission. Since the result cannot then be transformed BACK into a tz rule set, that makes a lot of existing code obsolete anyway?
Paul - I have my own reasons for wanting to maintain the existing tz rules after transmission, but is this another one dictating that a proper transparent transmission format is a requirement earlier rather than later?
Looked at what tzdist now proposes.
Seems to want to keep (one or more?) change date(s) per zone and provide roughly equivalent output to zdump, returning what if anything was affected since the last zone change time, since a user specified change time (last user update timestamp), or current data, for a given UTC time range (like "zdump -c" but to the second), or the whole historical range of the zone.
Requested via REST URI parameters and data returned in JSON or VCALENDAR format, with selectivity as optional server capabilities which may be queried.
Basic capabilities could be satisfied by a thin shim around zdump in your favourite web language.
Does that appear to be a reasonable summary or have I missed some subtleties?
It was unclear to me if MS could be a publisher of Windows "zones"?
-- Take care. Thanks, Brian Inglis
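A thin shim of that kind might look like the Python sketch below. It assumes a zdump binary on the PATH and the usual "zdump -v" line layout (zone, UT time, "=", local time, abbreviation, isdst, optional gmtoff). The JSON field names are invented for illustration and are not the tzdist format; parse_zdump_line and zone_transitions are hypothetical names.

```python
import json
import re
import subprocess

# One verbose zdump line, e.g.
# America/New_York  Sun Mar  9 06:59:59 2014 UT = Sun Mar  9 01:59:59 2014 EST isdst=0 gmtoff=-18000
STAMP = r"\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d+"
LINE = re.compile(
    rf"^(?P<zone>\S+)\s+(?P<utc>{STAMP}) UTC? = (?P<local>{STAMP}) "
    r"(?P<abbr>\S+) isdst=(?P<isdst>\d+)(?: gmtoff=(?P<gmtoff>-?\d+))?$"
)


def parse_zdump_line(line):
    """Turn one 'zdump -v' line into a dict, or None if it does not match."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["isdst"] = int(record["isdst"]) > 0
    if record["gmtoff"] is not None:
        record["gmtoff"] = int(record["gmtoff"])
    return record


def zone_transitions(zone, lo_year, hi_year):
    """Run zdump for a year range and return the parsed lines as JSON text."""
    output = subprocess.run(
        ["zdump", "-v", "-c", f"{lo_year},{hi_year}", zone],
        check=True, capture_output=True, text=True,
    ).stdout
    records = (parse_zdump_line(line) for line in output.splitlines())
    return json.dumps([r for r in records if r is not None], indent=2)
```

Lines that do not fit the pattern, such as the "= NULL" entries zdump prints at the extremes of the time range, are dropped rather than guessed at.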
On 04/11/14 02:40, Brian Inglis wrote:
Looked at what tzdist now proposes.
Seems to want to keep (one or more?) change date(s) per zone and provide roughly equivalent output to zdump, returning what if anything was affected since the last zone change time, since a user specified change time (last user update timestamp), or current data, for a given UTC time range (like "zdump -c" but to the second), or the whole historical range of the zone.
Two basic modes ... One provides the rules for a zone ... currently only VTIMEZONE documented. tz generic rules could be passed via this method, but no means of combining rules to make a complete zone set. Second provides an 'extract' as a series of events in JSON format which can be used for simple devices that just need the next couple of transitions. What is currently missing is accessing a known state of data so one can synchronise what is read from tzdist and what was used for the calendar you want to use it with.
Requested via REST URI parameters and data returned in JSON or VCALENDAR format, with selectivity as optional server capabilities which may be queried.
The URI based format has not yet been incorporated in the draft, just waiting for the current lock down on IETF before that can be uploaded.
Basic capabilities could be satisfied by a thin shim around zdump in your favourite web language.
The draft proposal allows for expansion in any direction. It is only concerned with 'transmitting data' ... not what is actually transmitted.
Does that appear to be a reasonable summary or have I missed some subtleties?
There are still ongoing discussions on how to combine for example tz rules with additional historic material, but that may well not make the first release.
It was unclear to me if MS could be a publisher of Windows "zones"?
The charter for the tzdist group specifically has no jurisdiction over who publishes the data, what TZIDs they use, or even whether a TZID from two different sources produces the same data :(
-- Lester Caine - G8HFL
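The second mode, an extract of the next couple of transitions for a simple device, can be sketched in Python as follows. The record fields ("onset", "utc_offset") and the function name next_transitions are hypothetical and do not come from the tzdist draft; the input list is assumed to be sorted by onset.

```python
import bisect
import datetime as dt
import json


def next_transitions(transitions, now, count=2):
    """Return the next `count` transitions after `now` as JSON text.

    `transitions` is a list of dicts with an aware UTC datetime under
    "onset" and the new offset in seconds under "utc_offset".
    """
    onsets = [t["onset"] for t in transitions]
    start = bisect.bisect_right(onsets, now)  # first onset strictly after now
    upcoming = transitions[start:start + count]
    return json.dumps(
        [
            {"onset": t["onset"].isoformat(), "utc_offset": t["utc_offset"]}
            for t in upcoming
        ]
    )
```

Note that nothing in such a packet records which release of the data it was cut from, which is the synchronisation gap described above.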
On 2014-11-04 02:51, Lester Caine wrote:
On 04/11/14 02:40, Brian Inglis wrote:
Two basic modes ... One provides the rules for a zone ... currently only VTIMEZONE documented. tz generic rules could be passed via this method, but no means of combining rules to make a complete zone set.
Second provides an 'extract' as a series of events in JSON format which can be used for simple devices that just need the next couple of transitions.
What is currently missing is accessing a known state of data so one can synchronise what is read from tzdist and what was used for the calendar you want to use it with.
Calendar and JSON data formats appear to be alternatives in the latest release.
Requested via REST URI parameters and data returned in JSON or VCALENDAR format, with selectivity as optional server capabilities which may be queried.
The URI based format has not yet been incorporated in the draft, just waiting for the current lock down on IETF before that can be uploaded.
I was looking at draft 2 from Oct 17 and there were a lot of URI examples.
-- Take care. Thanks, Brian Inglis
On 04/11/14 13:10, Brian Inglis wrote:
On 2014-11-04 02:51, Lester Caine wrote:
On 04/11/14 02:40, Brian Inglis wrote:
Two basic modes ... One provides the rules for a zone ... currently only VTIMEZONE documented. tz generic rules could be passed via this method, but no means of combining rules to make a complete zone set.
Second provides an 'extract' as a series of events in JSON format which can be used for simple devices that just need the next couple of transitions.
What is currently missing is accessing a known state of data so one can synchronise what is read from tzdist and what was used for the calendar you want to use it with.
Calendar and JSON data formats appear to be alternatives in the latest release.
There is no documented format for a JSON rule set just as there is no matching mechanism in iCalendar to use the JSON expanded format packet. It would be nice at least to outline a JSON format that can transparently handle the tz format even if it is not included in the standard on the first release.
Requested via REST URI parameters and data returned in JSON or VCALENDAR format, with selectivity as optional server capabilities which may be queried.
The URI based format has not yet been incorporated in the draft, just waiting for the current lock down on IETF before that can be uploaded.
I was looking at draft 2 from Oct 17 and there were a lot of URI examples.
Draft 3 will replace the ?action=xxx with a tidier RESTful style URI and will add publisher and version facilities ... but we have not yet seen Cyrus's proposals other than some outlines on the tzdist list.
-- Lester Caine - G8HFL
participants (4)
- Brian Inglis
- Lester Caine
- Paul Eggert
- random832@fastmail.us