Date: Fri, 14 Sep 2012 16:33:03 -0700
From: Paul Eggert <eggert@cs.ucla.edu>
Message-ID: <5053BEAF.2010607@cs.ucla.edu>

  | That "slightly" and "some of" are bothersome.

The "some of" shouldn't be, that was just meant to indicate that we
(or you) get to decide which files get the version number in them,
which might be some of the files, or all of them, a choice we make -
after that it's irrelevant to this issue.

The "slightly modified" part sounds not so good, but in reality, it is
what we're doing now anyway, and what just about everyone does ... the
real master files are the ones in the git repository (or when ado was
doing it, his sccs repository, or for the short while I did it, RCS)
and those we don't distribute - to truly distribute unmodified master
files, allowing everyone to do everything, we'd need to distribute via
"git clone" or whatever git has to make repository copies (I'm not a
git user, if that isn't obvious). We could do that, but for 99% of the
users it is useless overhead and unnecessary extra work to deal with.

Given that we're modifying the files anyway, does it really matter
whether or not the distribution contains exactly the same bits as
whatever git's equivalent of the sccs/rcs "checkout" commands produce?
That is, we do a checkout, then one small extra post-processing step,
and the results of that are the distribution.

  | The distributed files should
  | have all the info needed to make the distribution, so that the
  | whole procedure is self-documenting and automatic.

Self documenting and automatic, absolutely. All the info, just almost,
and just because there's no safe way to make that actually happen the
way we have historically done the distributions (as the master copy has
a single version number that applies to everything, but everyone else
gets two potentially different versions, one for the code, and one for
the data). If we abolished the code/data distinction, then we could
distribute everything.
Otherwise we distribute everything, except one 5 byte (well, 6
including the \n) file that contains "2012f" (or whatever version we're
up to).

  | Yes, and that's the advantage of putting the data version number
  | into (say) zone.tab.

Where it goes isn't as important as that it is in some file distributed
in tzdataNNNNx.tgz and not distributed in tzcodeNNNNx.tgz (and vice
versa for the code version number). It could be in any single file that
qualifies in each (like Makefile and zone.tab) or it could be in every
file. This should really be based upon what is best for the users, as
it makes almost no difference to anything else.

  | So perhaps we should do it that way, even if it's a bit more
  | error-prone because the two version numbers are in different files.

Please don't jumble the decisions on what the results should look like
with how they should be constructed - while those are related, they
don't have to be even close to the same.

It is trivially easy to avoid the "bit more error-prone" by making sure
that there is exactly one version number to manage, and then using
automation to apply that version number to the distribution files in
the proper way. Given that it is so easy, I think we should do that --
but as this affects operations that only you ever perform, it really
comes down to what works best for you (as long as the end result looks
OK, none of the rest of us will care much how much work you need to do
to get it to that state...)

I haven't just written the few lines of shell script (that would be
implemented as Makefile recipe lines) and sent them, as it isn't clear
to me that the simpler solution of just abandoning the split code &
data distributions, and going back to a single tgz file, with a single
trivial version number (the way it was originally), isn't also the
better solution. There's almost no-one concerned with 1200bps uucp
data transfers any more...

kre
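
ps: purely to illustrate how small the "one version number, applied by
automation" step could be, here is a rough sketch. It is not a proposed
patch. The function name stamp and the "# version" line format are
invented for the example. A leading "#" line is assumed to be harmless
because both Makefile and zone.tab treat it as a comment.

```shell
#!/bin/sh
# Hypothetical sketch only: stamp one master version string into a
# file that is about to go into a distribution tarball.
# The function name and the "# version" line format are assumptions.

# stamp VERSION INFILE OUTFILE
# Write OUTFILE as INFILE with a "# version VERSION" line prepended.
# OUTFILE must not be the same file as INFILE, since the redirection
# truncates OUTFILE before INFILE is read.
stamp() {
    {
        printf '# version %s\n' "$1" &&
        cat "$2"
    } > "$3"
}
```

With the version kept in a single small file (called version here, also
an assumption), a data release might run something like
stamp "$(cat version)" zone.tab dist/zone.tab
and a code release would do the same for Makefile, so each tarball
carries the version that was current when it was built.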