


When copying a large number of files, I found that tools like tar and rsync are more inefficient than they need to be because of the overhead of opening and closing many files. I wrote an open source tool called fast-archiver that is faster than tar for these scenarios; it works faster by performing multiple concurrent file operations.

Here is an example of fast-archiver vs. tar on a backup of over two million files; fast-archiver takes 27 minutes to archive:

    $ time fast-archiver -c -o /dev/null /db/data
    1008.92user 663.00system 27:38.27elapsed 100%CPU (0avgtext+0avgdata 24352maxresident)k
    0inputs+0outputs (0major+1732minor)pagefaults 0swaps

    $ time tar -cf - /db/data | cat > /dev/null
    tar: Removing leading `/' from member names
    tar: /db/data/base/16408/12445.2: file changed as we read it

To transfer files between servers, you can use fast-archiver with ssh, like this:

    ssh [user]@[host] "cd /db; fast-archiver -c data --exclude=data/\*.pid" | fast-archiver -x

Thanks to Scott Pack's wonderful answer (I didn't know how to do this with ssh before), I can offer this improvement (if bash is your shell). It adds parallel compression, a progress indicator, and an integrity check across the network link; see the pipeline sketch below. pv is a nice progress viewer program for your pipe, and pigz is a parallel gzip that by default uses as many threads as your CPU has (up to 8 max, I believe). You can tune the compression level to better fit your ratio of CPU to network bandwidth, or swap it out for pxz -9e and pxz -d if you have much more CPU than bandwidth. Upon completion, you only have to verify that the two sums match.
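The exact pipeline did not survive here, so this is a minimal sketch of what it can look like. It assumes sha512sum for the integrity check, placeholder [user]@[host] and destination directory, the checksum file names sent.sha512 and received.sha512, and bash on both ends (the >( ) process substitution needs it); the original may have differed in detail:

    # archive, show progress, compress in parallel, checksum the compressed
    # stream on both sides, and unpack on the remote end
    tar cf - file_list \
        | pv \
        | pigz \
        | tee >(sha512sum > sent.sha512) \
        | ssh [user]@[host] "tee >(sha512sum > received.sha512) | gunzip | tar xf - -C /directory/to/extract/to"

    # if you have much more CPU than bandwidth, swap pigz for pxz -9e
    # and gunzip for pxz -d

When the transfer finishes, compare the local sent.sha512 with the remote received.sha512; if the two sums match, the stream arrived intact.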
