    Date:        Thu, 1 Mar 2012 12:17:15 +0000
    From:        Zefram <zefram@fysh.org>
    Message-ID:  <20120301121715.GE3007@lake.fysh.org>

  | The comments at the top of the file include "Columns are separated by a
  | single tab.".

The extra tab is gone from 2012b, so the immediate problem is gone.

However, that said, I believe your parser might be incorrect. The
comments in zone.tab also say ...

# This file contains a table with the following columns:
# 1.  ISO 3166 2-character country code.  See the file `iso3166.tab'.
# 2.  Latitude and longitude of the zone's principal location
#     in ISO 6709 sign-degrees-minutes-seconds format,
#     either +-DDMM+-DDDMM or +-DDMMSS+-DDDMMSS,
#     first latitude (+ is north), then longitude (+ is east).
# 3.  Zone name used in value of TZ environment variable.
# 4.  Comments; present if and only if the country has multiple rows.

That is, there are exactly 4 columns, each separated from the next by a
single tab (3 tabs in all). The final column is "comments" - there's no
stated restriction on the characters that can be used in comments (the
first three columns all have a format that implicitly defines their
content: the first is always 2 alpha chars, the second signs and digits,
the third something that is reasonable as an environment var value
(letters, digits, underscores, hyphens, slashes ... but no white space
generally), but the last is just comments).

I'd typically allow anything in comments, including white space,
including tabs, wherever they may occur, including at the start of
the field.

So, while it was certainly not intentional, it is reasonable to argue
that the entry

    CA +4906-11631 America/Creston Mountain Standard Time - Creston, British Columbia

that was in 2012a contained

    column 1  "CA"
    column 2  "+4906-11631"
    column 3  "America/Creston"
    column 4  " Mountain Standard Time - Creston, British Columbia"

and I'd suggest that's how a parser should interpret it. It might
not make much sense, but I don't actually see it as a syntax error.

kre
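
The interpretation suggested above can be sketched in a few lines of Python. This is only an illustration, not the parser under discussion: the names `ZoneTabRow` and `parse_zone_tab_line` are hypothetical, and the sample line assumes that the 2012a entry had two tabs between the zone name and the comment (the "extra tab" mentioned above). The idea is to split on the first three tabs only, so that whatever follows the third tab, including further tabs, is kept as the comments column.

```python
from typing import NamedTuple, Optional


class ZoneTabRow(NamedTuple):
    """One data row of zone.tab (hypothetical type, for illustration only)."""
    country: str             # column 1: ISO 3166 2-character country code
    coordinates: str         # column 2: ISO 6709 latitude and longitude
    zone: str                # column 3: zone name usable as a TZ value
    comments: Optional[str]  # column 4: free text, or None if absent


def parse_zone_tab_line(line: str) -> Optional[ZoneTabRow]:
    """Parse one line; return None for blank lines and '#' comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    # Split on at most three tabs: any later tab belongs to the comments.
    fields = line.split("\t", 3)
    if len(fields) < 3:
        raise ValueError("expected at least 3 tab-separated columns: %r" % line)
    comments = fields[3] if len(fields) == 4 else None
    return ZoneTabRow(fields[0], fields[1], fields[2], comments)


# Assumed shape of the 2012a entry: an extra tab before the comment text.
sample = ("CA\t+4906-11631\tAmerica/Creston\t"
          "\tMountain Standard Time - Creston, British Columbia")
row = parse_zone_tab_line(sample)
```

With this approach the extra tab simply becomes the first character of `row.comments` rather than producing a fifth column or a parse error; a caller that wants tidy text can strip leading white space from the comments afterwards.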