Were these principles to be adopted, it would be hugely beneficial to the project. There is no reason that I can see why tzdb could not be managed using the best practice git techniques described above.

Stephen

On 26 July 2014 19:38, Matt Johnson <mj1856@hotmail.com> wrote:
I think it's great that we're using Git and GitHub as the experimental / unofficial repository at https://github.com/eggert/tz. It's much easier to track change history by looking through the commit log and seeing the changes than by reading through emails with patch attachments. However, we're not currently taking advantage of all that this environment has to offer.
-- Item 1 -- We should be making better use of branches. We currently have a single "master" branch that everything gets committed to. This is problematic, because it doesn't separate things that are certain to be released from things that are proposed changes. For example, the recent time.tab file, and the other large-scale proposed changes that are currently being debated, could have been created on feature branches. This would have given the tz list members a place to look at the proposed changes and make additional suggestions (via pull requests) before things are finalized.
As it stands today, since everything is in master, if the proposal is ultimately defeated then new commits will have to be made to master to revert these changes. The danger comes if, say, we needed to issue an emergency release sometime in between. Since master isn't in a state of positive agreement, one would have to branch from an earlier point in history to build a hotfix release, then merge that hotfix back to master later. It's much easier if we can just trust that master always consists of things that are certain to be released.
See also:
https://www.atlassian.com/git/workflows#!workflow-feature-branch
http://www.git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging
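As a rough illustration of Item 1, the feature-branch flow could be sketched as the command sequences below. This is only a sketch under stated assumptions: the branch name `proposal/time-tab` and the remote name `origin` are hypothetical, and the commands are built as plain strings rather than executed, so nothing here touches a real repository.

```python
from typing import List


def feature_branch_commands(branch: str) -> List[str]:
    """Commands that start a proposal on its own branch, leaving master untouched."""
    return [
        "git checkout master",
        f"git checkout -b {branch}",      # create the feature branch and switch to it
        # ... commit the proposed changes on the branch here ...
        f"git push -u origin {branch}",   # publish the branch so list members can review it
    ]


def accept_commands(branch: str) -> List[str]:
    """Commands that bring an agreed proposal into master."""
    return [
        "git checkout master",
        f"git merge --no-ff {branch}",    # a merge commit keeps the proposal visible in history
    ]


def abandon_commands(branch: str) -> List[str]:
    """Commands that drop a defeated proposal; master needs no revert commits."""
    return [
        "git checkout master",
        f"git branch -D {branch}",             # delete the local branch
        f"git push origin --delete {branch}",  # delete the published branch
    ]


if __name__ == "__main__":
    # Hypothetical branch name, used only for illustration.
    for command in feature_branch_commands("proposal/time-tab"):
        print(command)
```

The point of the sketch is the last function: a defeated proposal is removed by deleting its branch, so master never has to carry the change and its revert.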
-- Item 2 -- I think that we should all make better use of forking and pull requests for submitting proposed changes. Instead of submitting a patch file to the mailing list, contributors should fork the GitHub repo, make their changes, then create a pull request. This provides a place for discussion of proposals where the code can be referenced much more easily. It also ensures that the author of each and every change is tracked in the commit log. And finally, it makes it much clearer which proposals were adopted and which were not. Presently, looking through the mailing list archives, it's quite difficult to tell whether any given patch was actually applied.
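The fork-and-pull-request flow in Item 2 might look roughly like this. It is a sketch only: the user name and branch name passed in are hypothetical, the fork is assumed to live at `https://github.com/<user>/tz.git`, and the commands are returned as strings rather than run. The pull request itself is then opened from the pushed branch on the GitHub web site.

```python
from typing import List

# The unofficial repository named in this thread.
UPSTREAM = "https://github.com/eggert/tz.git"


def fork_contribution_commands(user: str, branch: str) -> List[str]:
    """Commands a contributor might run after forking the repo on GitHub."""
    fork = f"https://github.com/{user}/tz.git"  # assumed location of the fork
    return [
        f"git clone {fork}",
        "cd tz",
        f"git remote add upstream {UPSTREAM}",       # track the main repo alongside the fork
        "git fetch upstream",
        f"git checkout -b {branch} upstream/master",  # start the change from current master
        # ... edit files and commit here ...
        f"git push -u origin {branch}",               # publish the branch to the fork
    ]


if __name__ == "__main__":
    # Hypothetical user and branch names, used only for illustration.
    for command in fork_contribution_commands("alice", "fix-typo"):
        print(command)
```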
-- Item 3 -- We should decide how the GitHub issue tracker fits into the ecosystem. I see that a few issues have been reported via the issue tracker in the past, but most things have come through the mailing list. If we adopt the conventions used by other modern projects, then we should be reporting bugs through the issue tracker so their history can be more easily found. Another benefit is that you can reference issue numbers in commits, and you can reference commits in the comments of an issue. This linking makes it quite easy to find the code or data that was changed in response to an issue. The mailing list should probably be used for extended discussion, rather than as a place to report issues. Though there may be some blend of both, I personally think that an issue tracker is much more palatable than a mailing list for many of these kinds of things. There should probably be some guidance document on the IANA tz page about what goes where.
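To make the commit-to-issue linking in Item 3 concrete, here is a minimal sketch of how `#N` references could be pulled out of a commit message. GitHub does this linking itself; the regular expression, the function name `issue_refs`, and the sample commit message are all hypothetical and only approximate that behaviour.

```python
import re
from typing import List

# A "#" followed by digits, not glued to a preceding word character or "&"
# (so "abc#5" and HTML entities such as "&#39;" are ignored).
ISSUE_REF = re.compile(r"(?<![\w&])#(\d+)\b")


def issue_refs(message: str) -> List[int]:
    """Return the issue numbers mentioned in a commit message, in first-seen order."""
    seen: List[int] = []
    for match in ISSUE_REF.finditer(message):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


if __name__ == "__main__":
    # Hypothetical commit message, used only for illustration.
    print(issue_refs("Correct a zone comment (fixes #12, see #7 and #12)"))
```

A helper along these lines could also be used to cross-check the mailing list archive against the commit log, although that use is an assumption rather than anything proposed in the original message.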
-- Item 4 -- While Paul Eggert is the tz maintainer, and I appreciate his efforts greatly, I personally don't feel that it's appropriate for the GitHub repo to be in his personal "eggert" account. There should instead be a common "organizational account" for the project, such as github.com/tzdb or similar. ("iana" is taken, but appears to be unused or abandoned. Someone may want to inquire about obtaining it, as "github.com/iana/tz" would be quite appropriate IMHO). Though Paul would be the administrator of this account, his own personal account would no longer be authoritative.
That also ties back to the idea of pull requests. Since Paul makes the majority of changes, he would first make them in his own account, and then send a pull request to the main account. Then a link could be sent to the mailing list for discussion on the pull request before it is merged in.
As a side note, I've found that several third-party projects are linking to the unofficial sources using git submodules. While this isn't officially sanctioned, it would be much better if they could link to iana/tz instead of eggert/tz.
-- Item 5 -- While code and data often go hand-in-hand, there are quite a lot of projects these days that rely only on the tz data. There are also a lot of releases of code changes that don't require data changes. Having both code and data in a single project seems rather inefficient. I propose that they be split back into separate projects, and maintained in separate GitHub repos (tzdata / tzcode).
Also, consider that perhaps there are too many merged projects just within the code. For example, tzselect, zic, zdump, etc. might be broken out for better visibility of changes and for clarity of dependent files.
I look forward to feedback on these items. I'm sure not all will be in agreement, but I think it's important that we look forward to new and better ways to manage this project - rather than just sticking with the ways of the past.
-Matt