Paul Eggert wrote:
> I would, yes. What sort of thing did you have in mind?
At the core, a new program "tzwinnow" that reduces a list of timezones to inequivalent ones, comparing equivalence over the user's choice of date range. It compares tzfiles, not source, partly because that's easier and partly so that it can be used at tzselect time. Supporting tzwinnow, a new .tab file containing population data, derived from new magic comments in the data source files.

Once that's available, lots of things can be winnowed. time.tab is actually a winnowed version of zone.tab. (Resolving links is the minimal form of winnowing.) At build time, we initially build the maximalist set of tzfiles, for every zone with distinct data. A second stage of building winnows the tzfiles and zone.tab according to the installer's choice of date range.

I'm wondering about the time.tab/zone.tab distinction. We need to start from the existing maximalist zone.tab, of course. Applying winnowing to the full set of zone names in the file would cause cross-country links of the type that time.tab has, which some find objectionable. We can get a winnowed zone.tab that avoids this if each country's zone set is winnowed separately. We can do both, of course, but the regularity of the process may cast doubt on the value of one or the other.

There's been some mention of winnowing the source files. That wouldn't be part of the standard build process, but it'd be easy enough to have a program to do that. The amount of source parsing required is not very great; far less than would be required to have tzwinnow work from source.

Actually I had such a solid design in my head that I went ahead and started work. I have population figures for all the zones added to the source, generation of the .tab, the tzwinnow program, and documentation for all of this. Next up is use of tzwinnow in tzselect, which is the point at which we start to get a tangible benefit, so that's probably when I should start actually posting the patches.
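To make the winnowing idea concrete, here is a rough Python sketch. It is not the tzwinnow code. It works on hand-written transition lists rather than real tzfiles, and every name in it (behaviour, winnow, the Toy/* zones, their timestamps and populations) is invented for illustration. It also assumes that the population figures are used to pick which zone of an equivalence class survives, and that passing a country map restricts equivalence to zones within the same country.

```python
from bisect import bisect_right

def behaviour(initial, transitions, start, end):
    """Canonical description of a zone's behaviour within [start, end).

    transitions is a sorted list of (utc_time, state) pairs; initial is the
    state in effect before the first transition. A state is any hashable
    value, here an (abbreviation, utc_offset_seconds) tuple.
    """
    times = [t for t, _ in transitions]
    i = bisect_right(times, start)          # transitions at or before start
    state = transitions[i - 1][1] if i else initial
    segments = [(start, state)]
    for t, new_state in transitions[i:]:
        if t >= end:
            break
        if new_state != segments[-1][1]:    # ignore no-op transitions
            segments.append((t, new_state))
    return tuple(segments)

def winnow(zones, population, start, end, country=None):
    """Keep one zone per equivalence class over [start, end).

    If country (zone name -> country code) is given, zones are only merged
    with other zones of the same country, which avoids cross-country links.
    The most populous zone of each class is kept (an assumed policy).
    """
    groups = {}
    for name, (initial, transitions) in zones.items():
        scope = country[name] if country else None
        key = (scope, behaviour(initial, transitions, start, end))
        groups.setdefault(key, []).append(name)
    keep = [min(names, key=lambda n: (-population.get(n, 0), n))
            for names in groups.values()]
    return sorted(keep)

# Entirely made-up data: toy timestamps, offsets and populations.
ZONES = {
    "Toy/Alpha": (("LMT", 3000), [(100, ("STD", 3600)), (500, ("DST", 7200)),
                                  (600, ("STD", 3600))]),
    "Toy/Beta":  (("LMT", 3500), [(200, ("STD", 3600)), (500, ("DST", 7200)),
                                  (600, ("STD", 3600))]),
    "Toy/Gamma": (("LMT", 3000), [(100, ("STD", 3600))]),
}
POPULATION = {"Toy/Alpha": 10, "Toy/Beta": 50, "Toy/Gamma": 5}
COUNTRY = {"Toy/Alpha": "AA", "Toy/Beta": "BB", "Toy/Gamma": "AA"}

if __name__ == "__main__":
    print(winnow(ZONES, POPULATION, 0, 1000))             # all three differ
    print(winnow(ZONES, POPULATION, 300, 1000))           # Alpha folds into Beta
    print(winnow(ZONES, POPULATION, 700, 1000))           # everything folds into Beta
    print(winnow(ZONES, POPULATION, 700, 1000, COUNTRY))  # one zone per country
```

Shrinking the date range merges more zones, and passing the country map gives the per-country variant that keeps at least one zone for each country.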
Some early output from tzwinnow has shown that you weren't entirely consistent about removing pre-1970 distinctions. For example, you retained Europe/Copenhagen, Europe/Oslo, and Europe/Stockholm despite the fact that they've all matched Europe/Berlin since 1970. I guess that's because their source descriptions refer to different sets of DST rules for the 1970s, with those rule sets happening to agree in that range (none having any DST transitions since 1965).

-zefram