Estimate for a new tzdb release with Brazil rules.
Robert Elz kre at munnari.OZ.AU on Tue May 14 07:35:10 UTC 2019 wrote:
When I once thought I might try that, I got told "git: command not found". I have no interest whatever in installing that thing, just for this.
kre
Wow. Perhaps one might check if "git" was installed. As we've no information as to what kind of system you're running on, it's hard to help you. (It might be as simple as it isn't in your PATH.) After checking the obvious, I usually start with Google: how to install "git" on .... If you're not the sysadmin, perhaps that would be a good place to start, too. She/he may have more motivation, too.
Robert Elz kre at munnari.OZ.AU on Tue May 14 07:35:10 UTC 2019 wrote:
When I once thought I might try that, I got told "git: command not found". I have no interest whatever in installing that thing, just for this.
kre
Wow. Perhaps one might check if "git" was installed.
I'll simply note that kre was one of the nine recipients of the first message sent out to the time zone mailing list in 1986. https://mm.icann.org/pipermail/tz/1986-November/008946.html @dashdashado
On 5/15/19 12:25 PM, Chris Woodbury via tz wrote:
Perhaps one might check if "git" was installed.
I'm sure Robert could do that, but I assume he doesn't want to bother. I don't know which version-control system he used (if any) when he maintained tzdata (2011l through 2012c). Arthur David Olson used SCCS before that. tzdb has always been distributed as a tarball that requires only standard POSIX utilities like 'make', 'awk' and a C compiler to build, and there are no plans to change this. However, maintainers can be expected to have a better set of development tools than what POSIX requires. Nowadays a good repository system is invaluable for any nontrivial software project, and I didn't have any qualms about using Git for current maintenance, as it's by far the most-popular such system and it's free. If you don't want to use Git that's fine; we'll take plain POSIX 'diff -u' patches instead.
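For anyone taking the non-Git route, producing a 'diff -u' patch might look roughly like the sketch below. The helper name make_patch and the directory and file names in the usage comment are hypothetical, chosen only for illustration; nothing here is part of tzdb itself.

```shell
#!/bin/sh
# make_patch ORIG NEW: write a unified diff of two files or trees to stdout.
# Hypothetical helper for illustration; not part of tzdb.
make_patch() {
    mp_status=0
    # -u: unified format; -r: recurse when the arguments are directories.
    diff -ur "$1" "$2" || mp_status=$?
    # diff exits 0 (no differences) or 1 (differences found); both are fine.
    # Anything above 1 means diff itself ran into trouble.
    [ "$mp_status" -le 1 ]
}

# Example (assumed layout: a pristine unpacked release next to an edited copy):
#   make_patch tzdb-2019a tzdb-work > brazil.patch
```

The resulting file can then be sent to the list as a plain-text attachment.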
Date: Wed, 15 May 2019 13:52:15 -0700 From: Paul Eggert <eggert@cs.ucla.edu> Message-ID: <342b30cb-efa9-82cc-7cd9-81d67d528387@cs.ucla.edu>
| I'm sure Robert could do that, but I assume he doesn't want to bother.
That's right - I have something of an aversion to git. (And for Chris, no, it is not just not in my path, it is not installed, and is not going to be.)
| I don't know which version-control system he used (if any) when he maintained tzdata (2011l through 2012c).
"maintained" is a stretch, I was just acting as temporary caretaker.
RCS (I still have those files if they're of any use). There's no real need for anything distributed, it was, and still is, maintained by one person. All use of a DVCS is giving you is a way to avoid making releases as many people can simply fetch it that way.
Doing real releases when needed is still the right thing to do - the current aversion to that, in all kinds of projects, is annoying.
We ought to have had a new release a while ago so we can stop distributing bad data for Palestine.
Further, since we got lots of advance notice of the Brazil change, which is what we always say we want, we should be providing positive feedback for that good behaviour by making sure that all distributors have the best possible chance of making the updated rules available long before they are to take effect, so when that happens it will be as seamless as possible, and perhaps act as an example for others how to manage things to avoid needless disruption.
Doing a release should be trivial (at least compared to verifying data updates and then making sure they are translated into zoneinfo rule format correctly, and then verifying the result is what is expected - which all happens now, and very quickly).
Releases could even be automated, to happen once a week (or whatever), if the data has altered (as demonstrated by zic producing a set of zoneinfo files that are not identical to the last release). That could be run out of cron, and so add no workload at all.
kre
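The "has the data altered" test described above could be sketched as follows: compile the current data with zic into a scratch directory and compare it with the tree compiled from the last release. The function names, the crontab entry and the script path are assumptions made for illustration, and signing and publishing are deliberately left out.

```shell
#!/bin/sh
# Hypothetical helpers; names and paths are illustrative, not part of tzdb.

# compile_data OUTDIR FILE...: compile the given tzdata source files with zic.
compile_data() {
    cd_outdir=$1
    shift
    mkdir -p "$cd_outdir" && zic -d "$cd_outdir" "$@"
}

# trees_differ OLD NEW: succeed (exit 0) if the two compiled zoneinfo trees
# differ in any file, fail (exit 1) if they are identical.
trees_differ() {
    if diff -r "$1" "$2" >/dev/null 2>&1; then
        return 1    # identical: nothing new to release
    fi
    return 0        # differences, or a tree diff could not read
}

# Example weekly crontab entry (assumed script path):
#   0 3 * * 1  /usr/local/bin/tz-release-check
```

A wrapper script would call compile_data on the current sources, then build release tarballs only when trees_differ succeeds.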
On Thu, 16 May 2019 at 06:57, Robert Elz <kre@munnari.oz.au> wrote:
Further, since we got lots of advance notice of the Brazil change, which is what we always say we want, we should be providing positive feedback for that good behaviour by making sure that all distributors have the best possible chance of making the updated rules available long before they are to take effect, so when that happens it will be as seamless as possible, and perhaps act as an example for others how to manage things to avoid needless disruption.
+1 to providing positive feedback. Our strategy for releases has more or less been "only when there's something that has become urgent", which sends an undesirable mixed message: Why should one provide ample notice for their changes if their timely dissemination relies on someone else *failing* to provide ample notice for their own?
For as low-volume as our changes are, I don't think cron is necessarily the best strategy, though, and obviously, manually turning the crank on a release is non-trivial, so there's a desire to batch changes together when possible. However, the strategy must be more nuanced than "sit on things until we *have* to deal with them". (I think it already is, but I can see how a perception that it isn't might arise.)
For something ~6 months out, perhaps waiting ~4–6 weeks to catch straggling changes is warranted, but waiting much longer seems self-defeating. Perhaps it also depends on time-of-year: We're now out of the traditional "silly season" and are therefore less likely to get too many more changes before Brazil's would become urgent, which tips the balance more toward releasing sooner, rather than waiting.
-- Tim Parenti
On May 16, 2019, at 12:53 PM, Tim Parenti <tim@timtimeonline.com> wrote:
On Thu, 16 May 2019 at 06:57, Robert Elz <kre@munnari.oz.au> wrote: Further, since we got lots of advance notice of the Brazil change, which is what we always say we want, we should be providing positive feedback for that good behaviour by making sure that all distributors have the best possible chance of making the updated rules available long before they are to take effect, so when that happens it will be as seamless as possible, and perhaps act as an example for others how to manage things to avoid needless disruption.
+1 to providing positive feedback. Our strategy for releases has more or less been "only when there's something that has become urgent", which sends an undesirable mixed message: Why should one provide ample notice for their changes if their timely dissemination relies on someone else failing to provide ample notice for their own?
That's a good point. I would suggest a goal that any change properly submitted will appear in a tzdata release within a month. "Properly submitted" means there is an authoritative official statement of the rule change, with the detail necessary to turn it into a tzdata rule.
Such a guideline would reward those who use the system properly. And it also implicitly says that we don't necessarily deliver changes any sooner than a month, so if you get it to us with one week lead time you may be out of luck for three weeks. This is as it should be.
The "properly" rule also means that it's clear that newspaper articles, blog entries, and rumors are not submissions and don't start the one-month clock.
paul
On 5/16/19 10:57 AM, Paul.Koning@dell.com wrote:
That's a good point. I would suggest a goal that any change properly submitted will appear in a tzdata release within a month. "Properly submitted" means there is an authoritative official statement of the rule change, with the detail necessary to turn it into a tzdata rule.
I could go along with something like this, though I don't want to guarantee any specific schedule. It could be that a change (like Brazil's) is not urgent and that later changes need some time to percolate through the system, for example. With that in mind, I'll try to generate other low-priority changes that are consequences of email to this mailing list that haven't been acted on, or are other changes I've been meaning to do. We might as well batch them in while we're putting out the Brazil change.
On May 16, 2019, at 2:59 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
On 5/16/19 10:57 AM, Paul.Koning@dell.com wrote:
That's a good point. I would suggest a goal that any change properly submitted will appear in a tzdata release within a month. "Properly submitted" means there is an authoritative official statement of the rule change, with the detail necessary to turn it into a tzdata rule.
I could go along with something like this, though I don't want to guarantee any specific schedule. It could be that a change (like Brazil's) is not urgent and that later changes need some time to percolate through the system, for example. With that in mind, I'll try to generate other low-priority changes that are consequences of email to this mailing list that haven't been acted on, or are other changes I've been meaning to do. We might as well batch them in while we're putting out the Brazil change.
I specifically said "goal" to make it clear this is no guarantee, and you could make the wording even more plain and explicit. But it does seem useful to have a loose upper bound. The notion of rewarding those who plan ahead is a good one.
The other benefit in getting information out reasonably soon is that it helps with situations where the back end processes take a long time. While some software streams may have weekly updates and can deliver a new tzdata dump to end users within a month or so, other products may have much longer delivery cycles. If you tailor your delivery to how quickly Android releases are cycled, things get tight for embedded system products that don't move so fast.
paul
On Thu, May 16, 2019 at 10:58 AM <Paul.Koning@dell.com> wrote:
... I would suggest a goal that any change properly submitted will appear in a tzdata release within a month. "Properly submitted" means there is an authoritative official statement of the rule change, with the detail necessary to turn it into a tzdata rule. ...
With all due respect for the volunteer nature of the project, putting a specific timeframe on it is a good idea, even if it's just a loose goal. Policy makers are much more likely to co-operate when they have a specific target (instead of just "as early as possible") that they use to push their colleagues to a decision. -- Alan Mintz <Alan.Mintz@gMail.com>
Date: Thu, 16 May 2019 13:42:57 -0700 From: Alan Mintz <alan.mintz@gmail.com> Message-ID: <CAMLM5-XApfetNyjyG2wwuT6Br6oi0rOvuxoxGxTzM1sOtq3gGg@mail.gmail.com>
| With all due respect for the volunteer nature of the project,
Really, that affects validating and dealing with the data more than anything else - the "make a release" part of this is (by comparison) trivial.
| putting a specific timeframe on it is a good idea, even if it's just a loose goal.
That may be, but not because of:
| Policy makers are much more likely to co-operate when they have a specific target (instead of just "as early as possible") that they use to push their colleagues to a decision.
It would likely have exactly the wrong effect ... "we know we only need to announce our decision a month in advance, as they guarantee a new release within a month" - what we want is for the policy makers to make their decisions as early as possible, and for that, it would be better for them to believe it might take us a year to make their new rules available!
Not that I would suggest that we go that way ... a specific timeframe is more as a push upon us (well, upon Paul currently) to make the releases happen. At least tzdata releases - nothing says that we have to continue the recent practice of doing source releases in sync with data, every time.
The only real issue with doing "too many" releases (aside from using up the alphabet for release names - which really, is very unlikely to happen) is concern for the workload imposed upon downstream maintainers, having to continually import, test, and distribute updated versions (that is real work that is part of a release). But nothing says that they are required to do that for every release, or to do it without waiting a while to see if a new release appears first.
But for that (as also being one of those people for now) I know I'd much rather have a new release available, and ignore it for a while in case something new appears to save myself some work, than to not have it available, because someone else is waiting for the same reason, upstream, and thus miss my release deadline and end up distributing known bad data (or have to go install "experimental" patches - perhaps only installed that day, and not yet really known correct, and thus take the risk of breaking what worked before, rather than just failing to keep up).
Re earlier messages: Tim Parenti <tim@timtimeonline.com> said:
| For as low-volume as our changes are, I don't think cron is necessarily the best strategy,
Agreed, that was just a suggestion to require less manual work. But:
| though, and obviously, manually turning the crank on a release is non-trivial,
Not obviously at all; it ought to be trivial. What's more, since we're telling people (downstream distributors) that they should be fetching the data from git, and distributing that, every git commit (push?) is effectively making a new release (just in a different sequence of release names). All of the hard work has happened before that. What follows to convert the data in the git repo (assuming you have a git command!) into the tarballs with 2019d names is just bookkeeping - the only slightly tricky part is generating the signature, which needs access to the key (and which is the real reason that using cron might not be the best idea).
| For something ~6 months out, perhaps waiting ~4–6 weeks to catch straggling changes is warranted,
Even that's too long, and there is no real point. Better to get the data out to as many people as possible, so we get better feedback, and can make corrections (and a new release) on the odd occasion that it is needed.
I think something between a week and ten days is about right to wait for list readers and very early adopters to find errors (which is all we really should be waiting for - if we pick up some other jurisdiction's changes in the same update, just by fluke, that's fine - but we should never be waiting in the hope that someone else will change their rules before we are forced to release the previous changes - never).
So, pick a weekday (which includes weekend days) as the regular release day, and then release everything that was in the repo 4 days earlier than that (ie: make the tarballs then, but only actually distribute them if there are no reported problems). So, release Friday, with everything done up to the end of the previous Monday (or whatever). (Or if manual work is needed at IANA, release Monday, with the cutoff for the data being the end of the previous Thursday - the actual days don't matter). If there is a problem, simply remove the pending tarballs and that week's release doesn't happen. (Similarly, if nothing has changed, no new tarballs are made, and no release happens.)
Then downstream can adopt their own strategy for updating their releases in the knowledge that they have new data available, and they can wait until the next tzday if that is ahead of their deadline, to see if there are more changes.
Finally, for anyone concerned that we might run out of alphabet, don't be ... first because the a..z za..zz sequence contains 52 identifiers, sufficient for one for every week of the year, but more rationally because there simply never are that many changes. Even in the one year when we "came close" to exhausting the a..z sequence, it wasn't really that close: in 2009 we used 21 letters (up to 'u'), which left 5 still available - that's about 20% - lots of margin there. And that year was extraordinary: there were changes every month; 2 months had 3, 5 more had 2, and the other 5 months just 1 change. It is hard to imagine anything more messy than that year - most years don't even get close.
kre
PS: Even if we keep doing combined source/data releases, nothing says that we need to make new source tarballs from the repo every week - those can be generated manually when it is decided that the sources should be updated in the release - deciding to release sources is a more complex process, as those really need to have been verified to work in different environments, whereas the data either is correct, or is not.
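The a..z za..zz naming sequence mentioned above can be illustrated with a small sketch that maps the Nth release of a year to its suffix. The function name release_suffix is hypothetical, and the mapping simply follows the 52-identifier sequence as described in the message; it is not taken from any tzdb tooling.

```shell
#!/bin/sh
# release_suffix N: print the suffix for the Nth release of a year,
# following the a..z za..zz sequence (1 -> a, 26 -> z, 27 -> za, 52 -> zz).
# Hypothetical helper for illustration only.
release_suffix() {
    rs_n=$1
    rs_letters=abcdefghijklmnopqrstuvwxyz
    if [ "$rs_n" -ge 1 ] && [ "$rs_n" -le 26 ]; then
        # First pass through the alphabet: a single letter.
        printf '%s\n' "$(printf '%s' "$rs_letters" | cut -c "$rs_n")"
    elif [ "$rs_n" -ge 27 ] && [ "$rs_n" -le 52 ]; then
        # Second pass: 'z' followed by a letter.
        printf 'z%s\n' "$(printf '%s' "$rs_letters" | cut -c "$((rs_n - 26))")"
    else
        echo "release_suffix: $rs_n is outside 1..52" >&2
        return 1
    fi
}

# Example: the 21st release of 2009 would be named
#   echo "2009$(release_suffix 21)"
```

With one release per week at most, the index never needs to exceed 52.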
On Thu, 16 May 2019 at 20:55, Robert Elz <kre@munnari.oz.au> wrote:
| For something ~6 months out, perhaps waiting ~4–6 weeks to catch straggling changes is warranted,
Even that's too long, and there is no real point. … I think something between a week and ten days is about right
My rationale for ~4 weeks or so was more in terms of proposed changes that aren't well-sourced, or in cases where it's not obvious what the "Right Thing" should be, as has been known to happen. But, yes, for "properly submitted" cases such as this one, something closer to a week or two from when the "Right Thing" becomes known should certainly be sufficient.
we should never be waiting in the hope that someone else will change their rules before we are forced to release the previous changes - never
In most cases, and certainly at this time of year, I would agree. But trying to reduce unnecessary release churn during the silly season has its benefits, up to the point where it begins to be at odds with the relative urgency of typical silly season changes. I do agree that a somewhat more regularized approach, though, would be of more help to downstream maintainers. -- Tim Parenti
On 2019-05-16 04:55, Robert Elz wrote:
Date: Wed, 15 May 2019 13:52:15 -0700 From: Paul Eggert <eggert@cs.ucla.edu> Message-ID: <342b30cb-efa9-82cc-7cd9-81d67d528387@cs.ucla.edu>
| I'm sure Robert could do that, but I assume he doesn't want to bother.
That's right - I have something of an aversion to git. (And for Chris, no, it is not just not in my path, it is not installed, and is not going to be.)
| I don't know which version-control system he used (if any) when he | maintained tzdata (2011l through 2012c).
"maintained" is a stretch, I was just acting as temporary caretaker.
RCS (I still have those files if they're of any use). There's no real need for anything distributed, it was, and still is, maintained by one person. All use of a DVCS is giving you is a way to avoid making releases as many people can simply fetch it that way.
You can download github automatic archives and tz versioned archives with curl or wget, as below: wget returns exit codes on problems. The automatic archives have different bits of git hashes appended to both tar file names and tarred directory names, whereas the versioned archives have only tz versions in both tar file names and tarred directory names.
# curl save with remote name from github api owner repo type ref
$ curl -sSLOJR https://api.github.com/repos/eggert/tz/tarball/2019a
$ echo $?
0
$ ls -glo eggert-tz-2019a*.tar.gz
-rw-r--r-- 1 515408 May 17 03:31 eggert-tz-2019a-0-g14c7338.tar.gz
$ tar -tf eggert-tz-2019a*.tar.gz | wc
56 56 1595
# wget save with remote name from github api owner repo type ref
$ wget -nv -N -P /tmp/wget/ --content-disposition --trust-server-names \
    https://api.github.com/repos/eggert/tz/tarball/2019a
Last-modified header missing -- time-stamps turned off.
2019-05-17 03:31:56 URL:https://codeload.github.com/eggert/tz/legacy.tar.gz/2019a [515408/515408] -> "/tmp/wget/eggert-tz-2019a-0-g14c7338.tar.gz" [1]
$ echo $?
0
$ ls -glo /tmp/wget/eggert-tz-2019a*.tar.gz
-rw-r--r-- 1 515408 May 17 03:31 /tmp/wget/eggert-tz-2019a-0-g14c7338.tar.gz
$ tar -tf /tmp/wget/eggert-tz-2019a*.tar.gz | wc
56 56 1595
$ tar -tf /tmp/wget/eggert-tz-2019a*.tar.gz | head
eggert-tz-1e06373/
eggert-tz-1e06373/.gitignore
eggert-tz-1e06373/CONTRIBUTING
eggert-tz-1e06373/LICENSE
eggert-tz-1e06373/Makefile
eggert-tz-1e06373/NEWS
eggert-tz-1e06373/README
eggert-tz-1e06373/africa
eggert-tz-1e06373/antarctica
eggert-tz-1e06373/asctime.c
# wget gives exit code 8 on 404
$ wget -nv -N -P /tmp/wget/ --content-disposition --trust-server-names \
    https://api.github.com/repos/eggert/tz/tarball/2019z
https://codeload.github.com/eggert/tz/legacy.tar.gz/2019z:
2019-05-17 03:31:57 ERROR 404: Not Found.
$ echo $?
8
# wget save from versioned archive
$ wget -nv -N -P /tmp/wget/ https://github.com/eggert/tz/archive/2019a.tar.gz
Last-modified header missing -- time-stamps turned off.
2019-05-17 03:31:59 URL:https://codeload.github.com/eggert/tz/tar.gz/2019a [515298/515298] -> "/tmp/wget/2019a.tar.gz" [1]
$ echo $?
0
$ ls -glo /tmp/wget/2019a.tar.gz
-rw-r--r-- 1 515298 May 17 03:31 /tmp/wget/2019a.tar.gz
$ tar -tf /tmp/wget/2019a.tar.gz | wc
56 56 1091
$ tar -tf /tmp/wget/2019a.tar.gz | head
tz-2019a/
tz-2019a/.gitignore
tz-2019a/CONTRIBUTING
tz-2019a/LICENSE
tz-2019a/Makefile
tz-2019a/NEWS
tz-2019a/README
tz-2019a/africa
tz-2019a/antarctica
tz-2019a/asctime.c
# wget gives exit code 8 on 404
$ wget -nv -N -P /tmp/wget/ https://github.com/eggert/tz/archive/2019z.tar.gz
https://codeload.github.com/eggert/tz/tar.gz/2019z:
2019-05-17 03:32:00 ERROR 404: Not Found.
$ echo $?
8
-- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada
This email may be disturbing to some readers as it contains too much technical detail. Reader discretion is advised.
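The exit-code checks in the session above could be wrapped in a small script along the following lines. This is only a sketch: the function names tz_archive_url and fetch_tz are hypothetical, and the URL pattern is the versioned-archive one shown in the session.

```shell
#!/bin/sh
# Hypothetical helpers built around the versioned-archive URL pattern.

# tz_archive_url VERSION: print the versioned-archive URL for a tz tag.
tz_archive_url() {
    printf 'https://github.com/eggert/tz/archive/%s.tar.gz\n' "$1"
}

# fetch_tz VERSION DIR: download the archive into DIR and report the outcome,
# returning wget's own exit status to the caller.
fetch_tz() {
    ft_status=0
    wget -nv -P "$2" "$(tz_archive_url "$1")" || ft_status=$?
    case $ft_status in
        0) echo "fetched $1" ;;
        8) echo "server returned an error for $1 (for example 404)" >&2 ;;
        *) echo "wget failed with exit code $ft_status" >&2 ;;
    esac
    return "$ft_status"
}

# Example:
#   fetch_tz 2019a /tmp/wget/ || echo "no such release"
```

Note that curl, unlike wget, exits 0 on an HTTP 404 unless it is given -f (--fail), in which case it exits 22; a curl-based variant of fetch_tz would need that flag for the status check to mean anything.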
participants (8)
- Alan Mintz
- Arthur David Olson
- Brian Inglis
- Chris Woodbury
- Paul Eggert
- Paul.Koning@dell.com
- Robert Elz
- Tim Parenti