Monday, October 3, 2011

pigz: the parallel implementation of gzip

Backups are an important task these days. I back up my whole systems regularly, mostly with LVM + tar.
Recently I started playing around with btrfs and its snapshot features. Since this partition holds only virtual machines, and since btrfs is still under development, I regularly back up those images. This brings me to gzip. My usual way of copying an image to another partition and compressing it on the fly is:

dd if=/path/to/image | gzip -c > /path/to/backupfolder/image.gz

Now the problem: gzip is single-threaded, so it uses only one core and is slow, even on a multicore system like mine, which has 8 cores.
Searching Google, I found a nice replacement for gzip called pigz. The really cool thing is that its syntax is exactly the same as gzip's, which means I didn't have to change my scripts; I just had to swap the command.
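
To illustrate the swap, here is a minimal sketch of how such a backup step could look. The function name backup_image and the fallback to gzip when pigz is not installed are my own assumptions for illustration, not part of the original scripts:

```shell
#!/bin/sh
# Hypothetical helper: copy an image and compress it on the fly.
# Uses pigz when it is installed, otherwise falls back to plain gzip
# (the fallback is an assumption for this sketch).
backup_image() {
    src=$1
    dest=$2
    if command -v pigz >/dev/null 2>&1; then
        compressor=pigz
    else
        compressor=gzip
    fi
    # Same pipeline as before; only the compressor command changes.
    # dd prints its transfer statistics on stderr, so silence them.
    dd if="$src" 2>/dev/null | "$compressor" -c > "$dest"
}
```

It would be called like: backup_image /path/to/image /path/to/backupfolder/image.gz. By default pigz uses as many threads as there are cores; if the machine should stay responsive during a backup, adding something like -p 4 to the pigz call should limit it to four.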

The improvement is enormous (a 2.9 GB image, simply measured with "time"; gzip first, then pigz):
real    3m50.361s
user    3m43.210s
sys     0m10.480s

real    0m37.080s
user    4m2.370s
sys     0m16.070s
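
To put a number on it, the two "real" values above can be turned into a speedup factor. This is just a small sketch; the helper name to_seconds is made up for this example, and it assumes the values have the 3m50.361s form that "time" prints here:

```shell
#!/bin/sh
# Hypothetical helper: convert a "time"-style value such as 3m50.361s
# into plain seconds by splitting on the letters m and s.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

gzip_real=$(to_seconds 3m50.361s)   # wall-clock time of the first run
pigz_real=$(to_seconds 0m37.080s)   # wall-clock time of the second run

# Ratio of the two wall-clock times, rounded to one decimal place.
speedup=$(awk -v a="$gzip_real" -v b="$pigz_real" 'BEGIN { printf "%.1f", a / b }')
echo "speedup: ${speedup}x"   # prints "speedup: 6.2x"
```

Note that the "user" time is about the same in both runs (it is even slightly higher in the second one): the total amount of CPU work does not shrink, it just gets spread over all cores.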

Really nice, isn't it? Combined with the backup scripts for my Gentoo systems, it's a really neat upgrade.