A new OSS compression: XZ
XZ compression has actually been out for a little while, but it has only recently begun to gain traction in Linux distributions.
As of version 1.22 of GNU tar, the short option -J has been reassigned as a shortcut for xz. That means that instead of the usual tar czvf, you replace the z (for gzip) with J (for xz): tar cJvf.
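If you build archives from a script rather than from the shell, roughly the same thing can be sketched with Python's standard tarfile module, which accepts an xz mode. This is only an illustration and not part of GNU tar; the function names make_xz_tarball and list_xz_tarball are made up for the example.

```python
import tarfile
from pathlib import Path


def make_xz_tarball(source_dir, archive_path):
    """Pack source_dir into an xz-compressed tarball (roughly `tar cJf`)."""
    source = Path(source_dir)
    # "w:xz" selects xz compression; "w:gz" would be the gzip equivalent.
    with tarfile.open(str(archive_path), "w:xz") as archive:
        archive.add(str(source), arcname=source.name)
    return Path(archive_path)


def list_xz_tarball(archive_path):
    """Return the member names of an xz-compressed tarball (roughly `tar tJf`)."""
    with tarfile.open(str(archive_path), "r:xz") as archive:
        return archive.getnames()
```

Calling make_xz_tarball("mydir", "mydir.tar.xz") should produce an archive that a tar with xz support can unpack with tar xJf.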
Here are a few benchmarks I lifted from the Arch Linux mailing list.
Did some testing with openoffice-base 3.2.0-1-x86_64.tar:
XZ allows choosing the level of compression, from 1 to 9 (1 being the least compression, 9 the most, and 6 the default).
compression speed:
gzip:    0m28.945s
bzip2:   1m21.876s
xz -1:   0m49.244s
xz -2:   1m18.444s
xz -3:   3m34.208s
xz -6:   4m41.148s

decompression speed:
gzip:    0m 5.772s
bzip2:   0m29.433s
xz -1:   0m13.983s
xz -2:   0m12.949s
xz -3:   0m12.706s
xz -6:   0m11.462s
Interesting, right? Obviously, the more you compress a file, the longer it takes, but the interesting part is the decompression speed. Decompression gets faster with a higher compression ratio! With xz -6 you only need to read and process 124MB; with xz -1 you have to read 150MB. The decompression algorithm is the same for both levels; the only changes are the archive size and the dictionary used. The downside is that the higher the level, the bigger the dictionary becomes and the more memory you’ll need for decompression.
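To get a feel for the size side of this trade-off without the xz command line, here is a rough sketch using Python's standard lzma module, whose preset argument corresponds to the xz levels. The helper name compare_presets and the sample data are invented for illustration, and real-world numbers depend heavily on the input.

```python
import lzma


def compare_presets(data, presets=(1, 2, 3, 6)):
    """Compress data at several xz presets and report the resulting sizes.

    Returns a dict mapping preset -> compressed size in bytes.
    """
    sizes = {}
    for preset in presets:
        compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=preset)
        # The same decoder handles every preset; check the round trip.
        if lzma.decompress(compressed) != data:
            raise ValueError(f"round trip failed at preset {preset}")
        sizes[preset] = len(compressed)
    return sizes


if __name__ == "__main__":
    # Hypothetical sample input: repetitive, log-like text.
    sample = b"".join(
        b"line %d of some repetitive log output\n" % i for i in range(20000)
    )
    for preset, size in compare_presets(sample).items():
        print(f"xz -{preset}: {size} bytes ({size / len(sample):.1%} of original)")
```

Timing each call with time.perf_counter() would give a similar picture for speed, though a tiny synthetic sample will not behave like a 100MB+ package.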
Here are some more benchmarks comparing file size, using the default xz -6 (lifted from the Arch Linux forums).
The kernel compressed extremely well, to 27.5% of its original size, versus 35% with gzip. It may not be worth it for many applications, though, as xz takes over 3x as long to compress.
86M   kernel26-22.214.171.124-1-i686.pkg.tar
30M   kernel26-126.96.36.199-1-i686.pkg.tar.gz
22M   kernel26-188.8.131.52-1-i686.pkg.tar.xz
287M  wesnoth-1.6.1-1-i686.pkg.tar
220M  wesnoth-1.6.1-1-i686.pkg.tar.gz
202M  wesnoth-1.6.1-1-i686.pkg.tar.xz
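The ratios can be worked out directly from the listing. The short sketch below does the arithmetic in Python. The sizes are rounded to whole megabytes, so the percentages are approximate and may not match figures quoted from the original threads exactly. The names SIZES_MB and ratio exist only for this example.

```python
# Sizes in MB as shown in the listing (rounded, so results are approximate).
SIZES_MB = {
    "kernel26": {"tar": 86, "gz": 30, "xz": 22},
    "wesnoth": {"tar": 287, "gz": 220, "xz": 202},
}


def ratio(compressed_mb, original_mb):
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed_mb / original_mb


for package, sizes in SIZES_MB.items():
    gz = ratio(sizes["gz"], sizes["tar"])
    xz = ratio(sizes["xz"], sizes["tar"])
    print(f"{package}: gzip {gz:.1f}%  xz {xz:.1f}%  (xz saves {gz - xz:.1f} points)")
```

Packages like wesnoth, which presumably ship a lot of already-compressed assets, leave much less room for any compressor than the kernel does.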
xz (in its default configuration) takes 3-4 times longer than gzip to compress, in exchange for an extra 10-15% of compression. It also decompresses at half the speed.
So for the average user, this won’t be of huge interest. If you want the best combination of size and speed, gzip is still king.
The real benefit here is for website mirrors and people who value size more than speed. Imagine hosting a file for millions of people to download (like being a mirror for Firefox, the kernel or OpenOffice.org): shaving 10% off your entire bandwidth can be huge. For this reason, Arch Linux and Slackware have switched their repositories to xz. If you use bzip2, it’s certainly worth switching, since xz -1 beat it on both compression and decompression time in the test above. For the average gzip user, however, it probably is not worth it.