Thanks for the message.
Yes, you were correct in thinking that
If the FORMAT field contains either a "%s" or a '/' then the RULES field must contain a named rule.
was a trial rule of my own invention for my own parser implementation. I saw a pattern in the historic data and was curious as to whether it applied everywhere. I have now modified my checks so that all files pass.
To be clear, I am not suggesting that the zic compiler mishandles any old data files. Neither am I suggesting that there are any errors in the zic documentation.
When I referred to data being at slight variance with the documentation, the documentation I meant was:
I now recognise that I would have been better off using the zic documentation as my primary source.
Nonetheless, here are a few things I have found:
1. tz_link.html states that:
Sources for the tz database are UTF-8 text files...
Some of the comments in some of the old files contain non-UTF-8 single-byte representations of accented letters. Since these occur only in comments, they do not affect the compiled output.
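A check along the following lines would locate such bytes. It is only a minimal Python sketch: the function name find_non_utf8_lines and the sample bytes are made up for illustration and are not taken from any release.

```python
def find_non_utf8_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines that are not valid UTF-8."""
    bad = []
    for number, raw in enumerate(data.split(b"\n"), start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            # Keep the raw bytes so the offending character can be inspected.
            bad.append((number, raw))
    return bad


# Hypothetical sample: line 2 holds a Latin-1 encoded "a with tilde" (0xE3).
sample = b"# Zone data\n# S\xe3o Paulo comment in Latin-1\nZone X 0 - UTC\n"
print(find_non_utf8_lines(sample))
```

Decoding line by line, rather than the whole file at once, makes it easy to confirm that every offending byte sits on a comment line.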
2. The tz_how-to.html states that:
Prior to the 2020b release, it was called the TYPE field, though it was never used in the main data ...
However, some of the old data in
https://data.iana.org/time-zones/releases/ contains "even" and "odd" to account for the Adelaide festival. (I got round this by excluding the versions of Australia/Adelaide that use "even" and "odd".)
3. The tz_how-to.html states that:
The FORMAT column specifies the usual abbreviation of the time zone name. It can have one of three forms:
a string of three or more characters that are either ASCII alphanumerics, “+”, or “-”, in which case that’s the abbreviation ...
I had to allow an underscore and a space for all the files to pass. In the case of St. Helena I also had to allow a '?' as the first character. Further, I had to allow an abbreviation in a '/' separated format to be only two characters. (I recognise that this is not technically in violation of the statement above.)
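The difference between the documented form and the relaxed check can be sketched roughly as below. This is an illustrative Python sketch only, not the actual parser: the names DOCUMENTED, relaxed_pattern and format_ok are hypothetical, the sample strings are made up, the "%s" form is not handled, and it is an assumption that a leading '?' does not count towards the minimum length.

```python
import re

# Documented form: three or more ASCII alphanumerics, "+" or "-".
DOCUMENTED = re.compile(r"[A-Za-z0-9+-]{3,}")


def relaxed_pattern(min_len: int):
    # Assumed relaxations: optional leading "?", plus "_" and " " as extra characters.
    return re.compile(r"\??[A-Za-z0-9+_ -]{%d,}" % min_len)


def format_ok(fmt: str, relaxed: bool = False) -> bool:
    """Check a literal or '/'-separated FORMAT value (the "%s" form is out of scope)."""
    parts = fmt.split("/")
    if len(parts) > 2:
        return False
    if not relaxed:
        return all(DOCUMENTED.fullmatch(p) for p in parts)
    # In the relaxed check, each half of a '/' pair may be as short as two characters.
    min_len = 2 if len(parts) == 2 else 3
    pattern = relaxed_pattern(min_len)
    return all(pattern.fullmatch(p) for p in parts)


# Made-up sample values, not taken from any release.
for value in ["GMT", "GMT/BST", "AB/CD", "A_B", "?XYZ"]:
    print(value, format_ok(value), format_ok(value, relaxed=True))
```

Keeping the documented pattern separate from the relaxed one makes it straightforward to report which historic files depend on each relaxation.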
4. I can see that some of the older files use a '?' where the more modern files use '%s'. This is not mentioned in the tz_how-to.html documentation, though I recognise that putting such obscurities in the document may not be a good idea.
As you can see, these are all very minor things. I appreciate your quick responses.
Regards
Nick