MySQL database backup file compression: gzip vs bzip2

In one of my previous posts: MySQL backups, I talked about using a script for automating backups. I show that we can use gzip to compress backup file to compress and save. Since then, our backup file has been growing meg or two a day which is causing our backup files to get bigger and bigger every day. We keep hourly backups for a week so you can imagine space usage is quite high.

I went and did few tests to see if its beneficial for us to use Bzip2 instead of gzip. I could’ve tried 7z also but that is not something installed on most linux machines I work on and I didn’t want use a solution which will require me to add more software. Here are the results:

Original sql file size: 584M
Using gzip: 166M
Time spent on compressing 1 minute 25 secs
Using bzip2: 125M
Time spent on compressing 1 minute 50 secs

As we can see that it takes longer to compress but file size is much smaller (adds up fast for multiple backups). As our database grows bigger and bigger, the size difference will matter quite a bit. Since we also ftp the file off the server to onsite/offsite location hourly as well.

Does anybody know of any backup techniques which we can use for free and we can do incrementals with? comments are always welcome.  To learn more about gzip or bzip2, see man gzip or man bzip2 respectively.

15 thoughts on “MySQL database backup file compression: gzip vs bzip2

  1. dudus

    You could keep a backup copy remote and locally. Then everytime you do a new backup you binary diff both and just submit the patch. So onsite you keep only the last one. And off site you keep an old one with several patches.
    Then you can apply some of this patches just to clenup once in a while.

    Just an idea

  2. Steven Haryanto

    There’s always rsync 🙂

    I’ve used rdiff-backup in the past, it’s pretty good albeit not very robust (fails with fatal error a lot in situations where it should have just retried or ignored the error). i’m now using rsync combined with some simple homebrewn script.

    Unix’s “cp -la” is also useful 🙂

  3. CrazyToon Post author

    Steven: what am I rsync’ing? are you talking about rsync’ing the db files themselves? if so, it does make it hard for us to go back to certain date ad pull data out if needed.

    dudus: intresting idea. I never heard anybody do this… I will look into it. Thanks!

  4. Martin

    For incremental backup I’d suggest using binary logs. Set max_binlog_size to something small and every time a new binlog file is started, copy the last one (I’m not sure if it is safe to copy the currently used binlog file as well).

    Even better would be a replication. Then you get a real-time copy of all transactions. That would also use less resources on the DB server, as all further copying and comression (hourly snapshots etc.) can take place on the backup server.

  5. CrazyToon Post author

    Shin: -9 creates a same size compressed file. -s creates 135 meg file. -s is used so bzip doesn’t use too much ram. therefore its not worth using since ram is not an issue here.

    Martin: we will be looking at replication in near future. Thanks for binlog comment. another thing for me to try.

  6. Casey

    Crazytoon: I believe Steven is saying you can use rsync on the actual dumps. So, the basic process would be:

    1. Create a new SQL file using MySQLdump
    2. Rsync the SQL file to your backup server.

    Martin does have good advice for incremental backups. The nice thing about that is when you do a full-backup (once a week?), you can delete the old binary logs to save space.

  7. Pingback: MySQL: How do I dump all tables in a database into separate files? | Technology: Learn and Share

  8. kindDBA

    which one is the fastest way to compress database backup. I have a backup of 30G and is taking more than an hour to finish compressing

  9. Bill

    Hey. Using MySQL’s archive storage engine can be good for things like logs I believe it uses gzip compression as well.

  10. Killer Bees

    Hey Everyone! Yeah using MySQL’s archive storage engine makes things small without the need to dump the .sql all the time. Tradeoff there is that the queries run slowly (Im not sure if it supports indexing)

  11. nmd

    I just wanted to send a quick word to say thanks to you for some of the stunning tips and hints you are giving at this site. My rather long internet investigation has finally been recognized with excellent content to talk about with my family and friends. I ‘d repeat that we readers are very fortunate to live in a fantastic community with so many outstanding professionals with great guidelines. I feel rather blessed to have seen the weblog and look forward to really more fabulous moments reading here. Thank you again for everything.

  12. golden goose

    I want to show my passion for your kind-heartedness giving support to those individuals that require guidance on in this area. Your very own dedication to passing the solution up and down came to be astonishingly advantageous and have regularly encouraged somebody much like me to achieve their goals. Your entire insightful guide indicates this much a person like me and even further to my colleagues. Regards; from each one of us.

Leave a Reply

Your email address will not be published. Required fields are marked *