On Wed, May 1, 2013 at 1:35 PM, <random832@fastmail.us> wrote:
> On Wed, May 1, 2013, at 1:42, David Muir Sharnoff wrote:
>> I wanted to extract some of the data from the timezone files so I
>> wrote a quick parser for them. In the process I discovered 1,477
>> lines that have spaces where they should have tabs.
> Tokens in the timezone files are separated by _any whitespace_.

I don't see the word "token" in ftp://ftp.iana.org/tz/code/Theory
And I assume that not every kind of whitespace separates tokens.
> In most languages, splitting up by 'any whitespace' is the simplest
> thing in the world.

Evidence? I assume in most languages there are things simpler than
that, e.g. splitting by space.
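To make the comparison concrete, here is a minimal Python sketch. It is only an illustration, not code from zic or the tz distribution. The sample line is made up, and the sketch ignores '#' comments and quoted fields, which a real parser for these files would presumably have to handle.

```python
# Hypothetical sample line mixing a double space, tabs and single spaces.
line = "Zone  Example/City\t-5:00 US\tE%sT"

# Split on any run of whitespace (spaces, tabs, newlines, ...).
fields_any = line.split()

# Split on a single space character only; tabs stay inside the fields
# and the double space produces an empty field.
fields_space = line.split(" ")

print(fields_any)
print(fields_space)
```

In this one language both calls are about equally short, but they give different results on the same line, so which one is "right" depends on what the file format actually promises.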
> In C (where nothing is simple), you could reuse the code from zic
> itself.

Or not.
--
Tobias Conradi
Rheinsberger Str. 18
10115 Berlin
Germany

http://tobiasconradi.com