Date: Thu, 16 May 2019 13:42:57 -0700
From: Alan Mintz <alan.mintz@gmail.com>
Message-ID: <CAMLM5-XApfetNyjyG2wwuT6Br6oi0rOvuxoxGxTzM1sOtq3gGg@mail.gmail.com>

| With all due respect for the volunteer nature of the project,

Really, that affects validating and dealing with the data more than anything else - the "make a release" part of this is (by comparison) trivial.

| putting a specific timeframe on it is a good idea,
| even if it's just a loose goal.

That may be, but not because of:

| Policy makers are much more likely to co-operate when they have a specific
| target (instead of just "as early as possible") that they use to push their
| colleagues to a decision.

It would likely have exactly the wrong effect ... "we know we only need to announce our decision a month in advance, as they guarantee a new release within a month" - what we want is for the policy makers to make their decisions as early as possible, and for that, it would be better for them to believe it might take us a year to make their new rules available!

Not that I would suggest that we go that way ... a specific timeframe is more of a push upon us (well, upon Paul currently) to make the releases happen. At least tzdata releases - nothing says that we have to continue the recent practice of doing source releases in sync with data, every time.

The only real issue with doing "too many" releases (aside from using up the alphabet for release names - which really is very unlikely to happen) is concern for the workload imposed upon downstream maintainers, who have to continually import, test, and distribute updated versions (that is real work that is part of a release). But nothing says that they are required to do that for every release, or to do it without waiting a while to see if a new release appears first.
But on that point (being one of those people myself, for now) I know I'd much rather have a new release available, and ignore it for a while in case something new appears to save myself some work, than not have it available because someone upstream is waiting for the same reason, and thus miss my release deadline and end up distributing known bad data (or have to go install "experimental" patches - perhaps only installed that day, and not yet really known to be correct - and thus take the risk of breaking what worked before, rather than just failing to keep up).

Re earlier messages:

Tim Parenti <tim@timtimeonline.com> said:

| For as low-volume as our changes are, I don't think cron is necessarily
| the best strategy,

Agreed, that was just a suggestion to require less manual work. But:

| though, and obviously, manually turning the crank on a release is
| non-trivial,

Not obviously at all - it ought to be trivial. What's more, since we're telling people (downstream distributors) that they should be fetching the data from git, and distributing that, every git commit (push?) is effectively making a new release (just in a different sequence of release names). All of the hard work has happened before that. What follows, to convert the data in the git repo (assuming you have a git command!) into the tarballs with 2019d names, is just bookkeeping - the only slightly tricky part is generating the signature, which needs access to the key (and which is the real reason that using cron might not be the best idea).

| For something ~6 months out, perhaps waiting ~4-6 weeks to catch straggling
| changes is warranted,

Even that's too long, and there is no real point. Better to get the data out to as many people as possible, so we get better feedback, and can make corrections (and a new release) on the odd occasion that it is needed.
I think something between a week and ten days is about right to wait for list readers and very early adopters to find errors (which is all we really should be waiting for - if we pick up some other jurisdiction's changes in the same update, just by fluke, that's fine - but we should never be waiting in the hope that someone else will change their rules before we are forced to release the previous changes - never).

So, pick a day of the week (weekend days included) as the regular release day, and then release everything that was in the repo 4 days earlier than that (ie: make the tarballs then, but only actually distribute them if there are no reported problems). So, release Friday, with everything done up to the end of the previous Monday (or whatever). (Or if manual work is needed at IANA, release Monday, with the cutoff for the data being the end of the previous Thursday - the actual days don't matter.)

If there is a problem, simply remove the pending tarballs and that week's release doesn't happen. (Similarly, if nothing has changed, no new tarballs are made, and no release happens.)

Then downstream can adopt their own strategy for updating their releases in the knowledge that they have new data available, and they can wait until the next tzday, if that is ahead of their deadline, to see if there are more changes.

Finally, for anyone concerned that we might run out of alphabet, don't be ... first because the a..z za..zz sequence contains 52 identifiers, sufficient for one for every week of the year, but more rationally because there simply never are that many changes. Even the one year when we "came close" to exhausting the a..z sequence wasn't really that close: in 2009 we used 21 letters (up to 'u'), which left 5 still available - that's about 20% - lots of margin there. And that year was extraordinary: there were changes every month; 2 months had 3, 5 more had 2, and the other 5 months had just 1 change.
It is hard to imagine anything messier than that year - most years don't even get close.

kre

ps: Even if we keep doing combined source/data releases, nothing says that we need to make new source tarballs from the repo every week - those can be generated manually when it is decided that the sources should be updated in the release. Deciding to release sources is a more complex process, as those really need to have been verified to work in different environments, whereas the data either is correct, or is not.
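
For illustration only, the weekly cutoff scheme and the a..z, za..zz naming described above could be sketched roughly as follows, in Python. Nothing here is an existing tz tool: the function names (release_names, next_release, should_release) and the defaults (Friday release, 4-day hold) are assumptions taken from the examples in the text, and the real process would still need the signing step done by hand.

```python
from datetime import date, timedelta
from string import ascii_lowercase


def release_names(year):
    """Yield the 52 names in the a..z, za..zz sequence for one year.

    Hypothetical helper: 2019a .. 2019z, then 2019za .. 2019zz.
    """
    for letter in ascii_lowercase:
        yield f"{year}{letter}"
    for letter in ascii_lowercase:
        yield f"{year}z{letter}"


def next_release(today, release_weekday=4, hold_days=4):
    """Return (release_day, cutoff_day) for the first release day on or after today.

    release_weekday uses date.weekday() numbering (Monday=0 ... Sunday=6).
    Everything committed up to the end of cutoff_day would go into the
    tarballs; the default values are only the example from the text.
    """
    days_ahead = (release_weekday - today.weekday()) % 7
    release_day = today + timedelta(days=days_ahead)
    cutoff_day = release_day - timedelta(days=hold_days)
    return release_day, cutoff_day


def should_release(changed_since_last, problems_reported):
    """A week's release happens only if data changed and nobody reported a problem."""
    return changed_since_last and not problems_reported


if __name__ == "__main__":
    names = list(release_names(2019))
    print(len(names), names[0], names[25], names[26], names[-1])

    release_day, cutoff_day = next_release(date(2019, 5, 16))
    print("release:", release_day, "cutoff:", cutoff_day)

    # The 2009 tally from the text: 2 months with 3 changes,
    # 5 months with 2, and 5 months with 1.
    used = 2 * 3 + 5 * 2 + 5 * 1
    print(used, ascii_lowercase[used - 1], 26 - used)
```

The Monday-release variant mentioned above would be next_release(today, release_weekday=0), which puts the cutoff on the previous Thursday.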