> Actually, I suggest that we should really be discouraging people from creating alternate parsers for the zone input files than zic - those two should remain closely tied together. From time to time we discover the need to add some new feature to the input language, doing that is really hard if all kinds of other implementations will suddenly break.
That makes it even harder? Because presumably the code behavior changes too, which would be harder to adapt to. IMO this seems like a further argument for XML as it's pretty easy to extend some things w/o breaking other stuff
> Further, the input file format is rather quirky, and hard to explain completely in a way that makes a lot of sense (though it is perfectly fine for zic).
I don't find that comforting or reassuring. -Shawn
Date: Fri, 3 May 2013 17:30:34 +0000
From: Shawn Steele <Shawn.Steele@microsoft.com>
Message-ID: <a5598fbf7f2f46dd818bd0dcf00e7564@BLUPR03MB168.namprd03.prod.outlook.com>

| That makes it even harder? Because presumably the code behavior
| changes too, which would be harder to adapt to.

I'm not sure what you're trying to say there, but of course, if the file (input file) format changes, the code changes too - its behaviour changes in that it processes the changed input, whatever that is - but the output format remains the same (generally) - it is actually a quite simple format.

| IMO this seems like a further argument for XML as it's pretty easy
| to extend some things w/o breaking other stuff

But that turns out to be useless for anything more complex than changing rules about separators between fields, which isn't the kind of change we are likely to want to make - for the content that matters, the parser has to understand all of it, in order to correctly make sense of the input.

Unlike other uses where it makes sense to process some (or the relevant parts) of the data and simply ignore anything unknown, for converting the rules into the transition points, all of the data needs to be understood.

The most likely reason for change is to add a new specification mechanism, to allow specification of something that we cannot currently say (which has not needed saying in the past). A parser that cannot handle the new specification cannot process the data - whether the input format is XML, JSON, the current zic format, or anything else.

kre
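As an illustration of the point above, here is a minimal sketch, not taken from zic, of a parser for the ON (day) field of a Rule line. The names `DaySpec`, `DayKind` and `parseOnField` are hypothetical, and the sketch assumes three-letter weekday abbreviations only. It accepts the forms the input language has today (a day of month, `lastSun`, `Sun>=8`, `Sun<=25`) and has to reject anything else, since it cannot compute a transition date from a form it does not understand.

```cpp
#include <cctype>
#include <string>

// Hypothetical representation of the ON field of a Rule line.
enum class DayKind { Fixed, LastWeekday, WeekdayOnOrAfter, WeekdayOnOrBefore };

struct DaySpec {
    DayKind kind;
    int weekday;  // 0 = Sun ... 6 = Sat; -1 when kind == Fixed
    int day;      // day of month; 0 when kind == LastWeekday
};

// Map a three-letter weekday abbreviation to 0..6, or -1 if unknown.
// (Assumption of this sketch: only three-letter abbreviations are accepted.)
static int weekdayIndex(const std::string &s) {
    static const char *const names[] = {"Sun", "Mon", "Tue", "Wed",
                                        "Thu", "Fri", "Sat"};
    for (int i = 0; i < 7; ++i)
        if (s == names[i]) return i;
    return -1;
}

// Parse a day of month in 1..31; returns -1 for anything else.
static int dayOfMonth(const std::string &s) {
    if (s.empty() || s.size() > 2) return -1;
    int v = 0;
    for (char c : s) {
        if (!std::isdigit(static_cast<unsigned char>(c))) return -1;
        v = v * 10 + (c - '0');
    }
    return (v >= 1 && v <= 31) ? v : -1;
}

// Returns false for any form this parser does not understand.
// There is no "skip the unknown part" option: without the full
// specification no transition date can be computed.
bool parseOnField(const std::string &f, DaySpec &out) {
    int d = dayOfMonth(f);
    if (d > 0) {
        out = DaySpec{DayKind::Fixed, -1, d};
        return true;
    }
    if (f.size() == 7 && f.compare(0, 4, "last") == 0) {
        int w = weekdayIndex(f.substr(4));
        if (w < 0) return false;
        out = DaySpec{DayKind::LastWeekday, w, 0};
        return true;
    }
    if (f.size() >= 6 && (f[3] == '>' || f[3] == '<') && f[4] == '=') {
        int w = weekdayIndex(f.substr(0, 3));
        int day = dayOfMonth(f.substr(5));
        if (w < 0 || day < 0) return false;
        out = DaySpec{f[3] == '>' ? DayKind::WeekdayOnOrAfter
                                  : DayKind::WeekdayOnOrBefore,
                      w, day};
        return true;
    }
    return false;
}
```

A made-up future form such as `Sun>=8@noon` fails here, and it would fail in the same way if the same field arrived wrapped in an XML element or a JSON string.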
Shawn Steele <Shawn.Steele@microsoft.com> wrote:
|> Actually, I suggest that we should really be discouraging people from \
| creating alternate parsers for the zone input files than zic - those two \
| should remain closely tied together. From time to time we discover the \
| need to add some new feature to the input language, doing that is really \
| hard if all kinds of other implementations will suddenly break.
|
|That makes it even harder? Because presumably the code behavior changes \
|too, which would be harder to adapt to.
|
|IMO this seems like a further argument for XML as it's pretty easy to \
|extend some things w/o breaking other stuff

Whereas i never had a problem writing a parser for the data (first in Perl, then in C++), i just can't imagine just about any argument for XML, even if i'm soaped in the bath. It's almost unreadable for a human on the one hand (especially if you „attribute-it-up“ to ship a lot of information, though this doesn't seem to be an issue with TZ ,)) and terribly expensive to parse for a machine on the other, even if it's a SAX parser.

If done right, a JSON / YAML format can remain human readable while also being very easy to parse, even without any library except a C library with POSIX regular expressions. I hope ICU does this, too. Besides, a lot of people seem to go Lua today, which possibly would also be an option, but i don't know; i wouldn't do that (i've heard that there is an efficient JSON parser for Lua).

In the end it's nice not to understand a thread. :-)

|> Further, the input file format is rather quirky, and hard to explain \
| completely in a way that makes a lot of sense (though it is perfectly fine \
| for zic).
|
|I don't find that comforting or reassuring.

Yeah, i never understood why this nice input data has to be broken down into all these little files; VM space doesn't really count (our single file DB was ~140 KB, i had to look), but just imagine how much space is wasted on old 32 KB VFAT!
*That* is crazy!

… and, we did (in 2005):

    auto CStringList fields, xfields;
    _Nydin;
    (void)fields.splitWS(*_curr->data(), fal0);

     * \brief Fill list by splitting a string at whitespace
     * \param _template CString to split.
     * \param _empty_ok Should empty substrings be added?
     * \param _lc
     * Locale::Handle to use for whitespace detection.
     * The currently active one is used if this is ::NIL.
    pub CStringList &splitWS(const CString &_template,
          boolean _empty_ok=tru1, Locale::Handle _lc=NIL) {
       return(splitWS(_template.data(), _template.length(), _empty_ok, _lc));
    }

Worked!

|-Shawn

--steffen
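For readers without the library quoted above, the same whitespace-splitting idea can be sketched in standard C++. This is an illustration only, not the code from the message: `splitFields` is a hypothetical name, and the sketch assumes no field is quoted (zic itself allows double-quoted fields that contain whitespace or `#`). Everything from an unquoted `#` to the end of the line is treated as a comment.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split one line of zone input into whitespace-separated fields.
// Text from the first '#' onward is dropped as a comment.
// Assumption of this sketch: no quoted fields occur on the line.
std::vector<std::string> splitFields(const std::string &line) {
    // find() returns npos when there is no '#', and substr(0, npos)
    // then yields the whole line.
    const std::string body = line.substr(0, line.find('#'));

    std::istringstream in(body);
    std::vector<std::string> fields;
    std::string field;
    while (in >> field)  // operator>> skips runs of blanks and tabs
        fields.push_back(field);
    return fields;
}
```

Called on a line such as `Rule US 1967 2006 - Oct lastSun 2:00 0 S # comment`, this returns ten fields, with `lastSun` at index 6. Splitting the line is the easy part in any format; interpreting a field such as `lastSun` is where the real parsing work lies.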
On Fri, 03 May 2013, Shawn Steele wrote:
> IMO this seems like a further argument for XML as it's pretty easy to extend some things w/o breaking other stuff
I think that adding an optional XML output format to zic would be fine. However, I think that the information used as input to zic should remain easily readable and editable, which rules out XML. The binary output format probably also needs to be retained, or be extended only in compatible ways.

--apb (Alan Barrett)
participants (4)
- Alan Barrett
- Robert Elz
- Shawn Steele
- Steffen Daode Nurpmeso