Monday, October 3, 2011

pigz: the parallel implementation of gzip

Backups are very important these days. I constantly back up my whole systems too, mostly with LVM + tar.
Recently I started playing around with btrfs and its snapshot features. Since this partition only holds virtual machines, and since btrfs is still under development, I regularly back up those images. This leads me to gzip. My usual way of copying an image to another partition and compressing it on the fly is:

dd if=/path/to/image | gzip -c > /path/to/backupfolder/image.gz

Now the problem is: gzip is single-threaded, which means it's slow and leaves most of a multicore system idle, like mine with its 8 cores.
After asking Google I found a nice replacement for gzip, called pigz. The really cool thing is that its syntax is the same as gzip's, which means I didn't have to change my code, I just had to change the command.
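Here is a minimal sketch of the swapped pipeline. It is not my actual backup script: backup_image is a hypothetical helper name, the paths in the example call are placeholders, and the fallback to gzip when pigz is missing is an extra assumption added for safety.

```shell
#!/bin/sh
# Use pigz when it is installed, otherwise fall back to plain gzip.
if command -v pigz >/dev/null 2>&1; then
    compressor=pigz
else
    compressor=gzip
fi

# backup_image SRC DEST (hypothetical helper):
# stream SRC through the chosen compressor and write the result to DEST.
backup_image() {
    dd if="$1" 2>/dev/null | "$compressor" -c > "$2"
}

# Example call with placeholder paths:
# backup_image /path/to/image /path/to/backupfolder/image.gz
```

The output is a normal gzip file, so it can be unpacked with either gzip -d or pigz -d. By default pigz uses all available cores; if the machine should stay responsive during a backup, pigz -p N limits it to N threads.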

The improvement in wall-clock time is enormous (a 2.9 GB image, simply measured with "time"):
gzip:
real    3m50.361s
user    3m43.210s
sys     0m10.480s

pigz:
real    0m37.080s
user    4m2.370s
sys     0m16.070s
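To put numbers on it, here is a small sketch that turns the timings above into a speedup factor. The variable names are just illustrative; the only inputs are the "real", "user" and "sys" values from the two runs.

```shell
#!/bin/sh
# Wall-clock speedup: gzip "real" time divided by pigz "real" time.
speedup=$(awk 'BEGIN { printf "%.1f", (3*60 + 50.361) / 37.080 }')

# Average number of busy cores with pigz: (user + sys) divided by "real".
busy_cores=$(awk 'BEGIN { printf "%.1f", (4*60 + 2.370 + 16.070) / 37.080 }')

echo "speedup: ${speedup}x, average busy cores with pigz: ${busy_cores}"
# prints "speedup: 6.2x, average busy cores with pigz: 7.0"
```

So the total CPU time is about the same (even slightly higher with pigz), it is just spread over roughly 7 of the 8 cores instead of one.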

Really nice, isn't it? In combination with the backup scripts for my Gentoo systems, it's a really neat upgrade.
