On 2016-09-01 12:56, Deborah Goldsmith wrote:
What is the goal for making these changes to the distribution format?
On Aug 30, 2016, at 5:21 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
Deborah Goldsmith wrote:
2. Shrinking the distribution tarball. This matters less, but while we're doing (1) we might as well do (2). Not everyone is as well-off and well-connected as Apple and UCLA.
Is there any evidence anyone cares about the minor differences we’re discussing? The difference between bz2 (widely supported) and lz (not as widely supported) is 43K. Why put the onus on consumers of the data to find an implementation of a less-known compression scheme their platform doesn’t support?
GNU now provides .lz for new packages, but it also provides .gz to support older systems. Lzip appears to have advantages where only poor and/or slow connections are available. Thus there would also be a size advantage in distributing the test data as a separate archive in all the formats desired. A case could be made for also providing .zip archives for those who deal only with .NET and Java on Windows (one of those is now MS for UWP and Store apps).
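For anyone who wants to check the size differences on their own copy of the data, here is a rough sketch using only the Python standard library. Assumptions: Python has no lzip module in its standard library, so xz stands in for .lz (both use LZMA, though the container overhead differs), and the sample bytes are a made-up placeholder rather than real tz data. The function name compressed_sizes is mine, not from any tz tooling.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes of `data` for each stdlib codec."""
    return {
        "gz": len(gzip.compress(data, compresslevel=9)),
        "bz2": len(bz2.compress(data, compresslevel=9)),
        # Assumption: xz is only a stand-in for lzip; both are LZMA-based.
        "xz": len(lzma.compress(data, preset=9)),
    }


if __name__ == "__main__":
    # Hypothetical placeholder input; substitute the bytes of a real tarball.
    sample = b"# Rule NAME FROM TO TYPE IN ON AT SAVE LETTER/S\n" * 2000
    # Print the formats from smallest to largest result.
    for name, size in sorted(compressed_sizes(sample).items(), key=lambda kv: kv[1]):
        print(f"{name}: {size} bytes")
```

Reading an uncompressed tarball with open(path, "rb").read() and passing the bytes to compressed_sizes should give numbers comparable to the ones being discussed, though the exact figures will depend on the release and the compressor settings.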
It’s going to cause a lot of work for a lot of people.
It'll be a bit of work at the start, to change unpacking scripts. But the changes are small, and it should save some work in the long run. And there's no rush, as the old-format tarballs will continue to be distributed.
Alexander Belopolsky wrote:
If the size of data distribution is a concern, it looks like one can achieve a much better compression by simply discarding comments
But the comments are the best part! :-)
+1 junk the test data used by few, over the comments enjoyed by many!
-- Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada