Re: [tz] [PROPOSED PATCH 2/2] Use lz format for new tarball

Aug. 30, 2016

      On Tue, Aug 30, 2016 at 5:18 PM, Paul Eggert <eggert@cs.ucla.edu> wrote:
...
$ ls -l tz*.tar.*z*
  -rw-r--r-- 1 eggert eggert 202609 Aug 30 14:00 tzcode2016X.tar.gz
  -rw-r--r-- 1 eggert eggert 394169 Aug 30 14:00 tzdata2016X.tar.gz
  -rw-r--r-- 1 eggert eggert 426667 Aug 30 14:10 tzdb-2016X.tar.bz2
  -rw-r--r-- 1 eggert eggert 382991 Aug 30 14:00 tzdb-2016X.tar.lz

If the size of the data distribution is a concern, it looks like one can
achieve much better compression by simply discarding the comments in the
data files:

$ cat africa antarctica asia australasia \
    europe northamerica southamerica | wc -c
  647830
$ cat africa antarctica asia australasia \
     europe northamerica southamerica | egrep -v '^\w*(#.*|$)' | wc -c
  151231
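
As a rough sketch of the filtering step, assuming the intent is to drop
blank lines and comment-only lines (optionally indented), the filter could
be wrapped in a small shell function. The name strip_tz_comments and the
sample input are made up for illustration, and the pattern uses
[[:space:]] rather than the \w in the command above, so byte counts from
it will not necessarily match the figures quoted here.

```shell
# strip_tz_comments: hypothetical helper, not part of tzcode or tzdata.
# Reads tz source text on stdin and drops every line that is empty,
# whitespace-only, or holds nothing but a '#' comment (optionally
# preceded by whitespace). Data lines pass through unchanged, including
# any trailing comment they carry.
strip_tz_comments() {
    grep -Ev '^[[:space:]]*(#.*)?$'
}

# Made-up sample input (not real tz data): two comment lines, one blank
# line, and two data lines. Only the two data lines survive the filter.
printf '# comment\nRule\tExample\t2016\n\n\t# indented comment\nZone\tExample/City\n' \
    | strip_tz_comments
```

Against the real files, the same function would slot into the pipeline in
place of the egrep call, for example
"cat africa ... southamerica | strip_tz_comments | wc -c".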

Given the structured (low entropy) nature of the resulting stream, it
compresses very well:

$ cat africa antarctica asia australasia \
     europe northamerica southamerica | egrep -v '^\w*(#.*|$)' \
     | xz -c | wc -c
   24600

Alexander Belopolsky