I kept my implementation for 'follow' very simple:

- 'follow' is allowed only at the end of a zone
- not recursive, i.e., the zone to which 'follow' points must not contain a 'follow' itself.
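
A minimal sketch of how these two restrictions could be checked. This is not my actual implementation and not zic code: the data model (a dict of zone names, each holding a list of (kind, year, value) entries) and the function name check_follow are assumptions made only for illustration.

```python
# Hypothetical model: zone name -> ordered list of (kind, year, value).
# kind is "line" for an ordinary zone line or "follow" for a follow entry,
# in which case value is the name of the target zone.

def check_follow(zones):
    """Return a list of error strings; an empty list means both rules hold."""
    errors = []
    for name, entries in zones.items():
        for i, (kind, _year, value) in enumerate(entries):
            if kind != "follow":
                continue
            # Rule 1: 'follow' may appear only as the last entry of a zone.
            if i != len(entries) - 1:
                errors.append(f"{name}: follow is not the last entry")
            target = zones.get(value)
            if target is None:
                errors.append(f"{name}: unknown follow target {value}")
            # Rule 2: the target zone must not contain a follow itself.
            elif any(k == "follow" for k, _, _ in target):
                errors.append(f"{name}: follow target {value} has a follow itself")
    return errors
```

Because the target may never contain a 'follow', a chain is at most one step long, so a cycle such as A follows B follows A is rejected by the same check and no separate cycle detection is needed.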

The reason is my experience with the Shanks database in its software version (not the book variant, which is nearly flat).
That one is full of jumps from one table to another and back. It is extremely complicated for a human to follow, which makes it extremely hard to maintain. Shanks chose this structure because the database was implemented in the late 1970s in Fortran on machines with very limited RAM. Data volume had to be kept to a minimum.
In the book publication, there was no space limit. This is why he used flat tables there.

It is better to have a structure as simple and flat as possible. Readability for humans has to be kept in mind.

I designed my 'follow' exactly for the purpose of extending the TZ database backwards. All sub-areas which exist pre-1970, for example 128 areas in the state of Illinois, 340 in Indiana, 222 in New York state, 114 in Ohio, end up in the single zone which covers post-1970 Illinois, Indiana, New York or Ohio.

That gives a simple, readable and maintainable 'follow' syntax.

Any more complex follow structures are not worth the trouble.

In my implementation, each zone of course gets its own complete stand-alone binary file. So any space saved in the source files by more complex jump structures is lost in the binary files anyway.
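
To illustrate why the binary output does not shrink, here is a rough sketch of expanding a zone with a trailing 'follow' into a complete stand-alone list of entries. It uses an assumed model (a dict of zone names, each holding (kind, year, value) entries sorted by year) and a hypothetical function name, expand; it is not taken from my converter or from zic.

```python
# Hypothetical model: zone name -> list of (kind, year, value), sorted by year.
# kind is "line" for an ordinary zone line, or "follow" as the last entry,
# where year is the cutover and value is the name of the target zone.

def expand(zones, name):
    """Return the full, follow-free list of entries for one zone."""
    entries = zones[name]
    if not entries or entries[-1][0] != "follow":
        return list(entries)  # nothing to expand
    _, cutover, target_name = entries[-1]
    own = entries[:-1]
    target = zones[target_name]  # assumed to contain no follow itself
    # The target line already in effect at the cutover starts at the cutover.
    active = [e for e in target if e[1] <= cutover]
    bridge = [("line", cutover, active[-1][2])] if active else []
    # Target lines that begin after the cutover are copied unchanged.
    tail = [e for e in target if e[1] > cutover]
    return own + bridge + tail
```

Every sub-area ends up carrying its own copy of the target's post-cutover data, which is why the saving exists only in the source files.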

A major problem is that zic.c is not the only converter from source files to binary. If the source syntax is changed, every converter that is not patched will fail.


On 14.06.21 09:27, Paul Eggert wrote:
Adding a "follow"-like syntax is a fine idea. I think it's been proposed before, but not implemented. I would hope that one could use "follow" not only at the end of a Zone, but also within a zone, such as "location X follows location Y from 1950 to 1990". Also, the code should defend itself against follow cycles (e.g., A follows B follows C follows A).

Any changes along these lines would be complicated by backward-compatibility concerns, though. That is, we could use a "follow"-like syntax only in vanguard format at first. To give you a feel of how long this can take, we added support for %z in zone abbreviations in 2015f, and we're not using this feature even now in the vanguard data, much less the main data (this is mostly due to my lack of time...).