
2014-10-31 21:18 GMT+01:00 Paul Eggert <eggert@cs.ucla.edu>:
On 10/31/2014 07:56 AM, Ed Schouten wrote:
there is a very small number of directives that would require quite a lot of additional code to parse properly. For example, "lastSun" makes a lot of sense as a special keyword in "Rule" directives, but it provides no functional gain in "Zone" directives.
The same holds for the use of the "Dec 29 24:00" time used in Samoa's timezone. We should be able to use "Dec 30" instead, right?
We could, but if the original announcement said the equivalent of "Dec 29 24:00" it's helpful if the corresponding zone line matches the announcement. Similarly, if the Zone transition is supposed to be at the same time and date as a Rule transition, it simplifies maintenance a bit to use the same string for both.
As I understand it, in zic the same code is used to parse dates regardless of whether they appear in Rule or Zone lines. I assume the same thing could be done in a Python parser....
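As a rough illustration of that idea (not zic's actual code, and not the script under discussion), here is a minimal Python sketch of one date resolver that could serve both Rule and Zone lines once a concrete year is known. The names resolve_day and resolve_until are hypothetical. It only handles plain day numbers, "lastXxx" forms and HH:MM times, and it assumes there are no time suffixes, no seconds, and no "Sun>=8" style specs.

```python
from datetime import datetime, timedelta
import calendar

# Hard-coded English abbreviations, so the lookup does not depend on the locale.
MONTHS = {name: i + 1 for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split())}
WEEKDAYS = {name: i for i, name in enumerate(
    "Mon Tue Wed Thu Fri Sat Sun".split())}  # Monday=0, as in the calendar module


def resolve_day(year, month, day_spec):
    """Return the day of the month for a plain number or a 'lastXxx' spec."""
    if day_spec.startswith("last"):
        wanted = WEEKDAYS[day_spec[4:]]
        last_day = calendar.monthrange(year, month)[1]
        # Step back from the last day of the month to the wanted weekday.
        offset = (calendar.weekday(year, month, last_day) - wanted) % 7
        return last_day - offset
    return int(day_spec)


def resolve_until(year, month_name, day_spec="1", time="0:00"):
    """Resolve a Zone-style end date to an absolute datetime (hypothetical helper)."""
    month = MONTHS[month_name]
    day = resolve_day(year, month, day_spec)
    hours, minutes = (int(part) for part in time.split(":"))
    # Adding a timedelta lets "24:00" roll over to the next day.
    return datetime(year, month, day) + timedelta(hours=hours, minutes=minutes)
```

With this sketch, resolve_day(1980, 3, "lastSun") gives 30, and resolve_until(2011, "Dec", "29", "24:00") gives midnight at the start of Dec 30, so "Dec 29 24:00" and "Dec 30" resolve to the same instant.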
My idea was just to make things easier for the next person. Adding support for it to my script is of course not infeasible. In fact, given that it's only used in a couple of places, it would even be shorter to have a fixed map for these irregularly shaped entries:

    last_sun_map = {('1980', 'Mar'): 30, ...}

You could argue that the dates in Zone and Rule entries are simply not the same thing. Dates in Zone entries are absolute: they indicate the end date of a ruleset, so "lastSun" would need to be applied to the year used in the statement itself. "lastSun" in Rule entries is not applied to a specific year; it is merely copied into the compiled timezone. I'd say that requiring the same parsing logic may be too demanding.

Though I agree that it would be nice to have the definitions match the original announcements, in the end they will need to be processed by machines. If we are afraid that people will confuse "Dec 29 24:00" and "Dec 30", there is still the possibility of adding a comment to clarify.

If a feature is used so rarely in the datasets that it's easier to use a lookup table to translate it to the proper value than it is to actually parse it properly, we might be sacrificing reusability.

Best regards,
-- 
Ed Schouten <ed@80386.nl>