Backups revisited

I spent most of last weekend on home IT tasks. That included upgrading my main desktop machine from a Pentium III to an Athlon XP. Welcome to 7 years ago! But most of the time went to reorganizing my data and working out a better backup regime.

Now that hard drives are so cheap, and we rent a storage space anyway, spending $1/GB-month on off-site network backup just isn’t worth it any more. Besides, my off-site backup kept only a single full copy, which is not terribly useful if a few weeks elapse before you notice something is missing. So I have been playing around with incremental backups using rsync and hard links, similar to the way Apple’s Time Machine supposedly works. Then I stumbled across ‘gibak,’ a set of shell scripts that use the git version control system as the backup tool.
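For the curious, the rsync-and-hard-links trick hinges on `--link-dest`: files unchanged since the previous snapshot become hard links into it, so every snapshot reads as a full backup while only the deltas consume new disk space. A minimal sketch, with made-up demo paths (a real setup would back up `$HOME` and date-stamp the snapshots):

```shell
#!/bin/sh
# Sketch of incremental snapshots via rsync hard links.
# 'demo-src' and 'demo-backups' are illustrative stand-ins.
SRC="demo-src"
DEST="demo-backups"
mkdir -p "$SRC" "$DEST"
echo "important data" > "$SRC/notes.txt"

snapshot() {
    stamp="$1"
    # --link-dest is resolved relative to the destination dir, so
    # ../latest points at the previous snapshot. Unchanged files
    # become hard links into it; only changed files are copied.
    rsync -a --delete --link-dest="../latest" \
          "$SRC/" "$DEST/$stamp/"
    # Repoint 'latest' at the snapshot we just made.
    ln -sfn "$stamp" "$DEST/latest"
}

snapshot day1
snapshot day2   # notes.txt unchanged, so day2's copy is a hard link
```

On the first run `--link-dest` warns that `../latest` does not exist yet, which is harmless; rsync just copies everything. In real use, `$stamp` would be something like `$(date +%Y-%m-%d)`.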

In the end, I went with my own dozen-liner script using git and metastore, with rsync over CIFS to collect the stuff in Windows-land into separate repositories. A cron job does a daily commit and push from the checked-out repo in my home directory. So far, the result is pretty nice. If I screw something up, a ‘git reset’ gets me back to any earlier date. It also solves a minor annoyance with keeping files in sync across multiple machines: each machine holds a clone of the repo, and syncing is as easy as a push from one and a pull on the other. I can rotate portable hard drives to the storage area to cover the ‘apartment burning down’ scenario, though I’m admittedly still vulnerable to the ‘global thermonuclear war’ scenario.
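My actual dozen-liner isn’t reproduced here, but the daily commit-and-push cycle it performs looks roughly like the sketch below, with a scratch repo and a local bare ‘backup’ remote standing in for the real thing (all paths and names are made up). In real use the working repo is the home directory, the remote is a bare repo on the backup drive, and a cron line such as `15 3 * * * $HOME/bin/daily-backup.sh` drives it nightly:

```shell
#!/bin/sh
set -e

# Scratch stand-ins for the home dir and the backup drive.
mkdir -p demo-home demo-drive
git init -q -b master demo-home 2>/dev/null || git init -q demo-home
git init -q --bare demo-drive/backup.git

cd demo-home
git config user.email backup@example.invalid
git config user.name "backup job"
git remote add backup ../demo-drive/backup.git 2>/dev/null || true
echo "some file" > notes.txt

# git does not record owners, permissions, or mtimes; metastore
# serializes them into ./.metadata ('metastore -s' to save,
# 'metastore -a' to replay on restore). Guarded because
# metastore may not be installed here.
if command -v metastore >/dev/null; then metastore -s; fi

git add -A                                   # stage adds, edits, deletions
git commit -q -m "backup $(date +%F)" || true  # a no-change day is not an error
git push -q backup master                    # mirror to the backup drive
```

The same clone-push-pull mechanics handle the multi-machine sync mentioned above: each machine clones from the bare repo and pushes or pulls as needed.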

I’ve already used this scheme to rebuild a machine’s home dir and it worked flawlessly. Hopefully the same will hold when I move my laptop from Ubuntu 8.04 to Fedora 10. Anyway, this should keep me satisfied until btrfs is everywhere and I can just use filesystem snapshots.
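The rebuild described above boils down to a clone plus a metadata replay. A sketch under the same assumptions (a scratch bare repo stands in for the backup drive; `metastore -a` reapplies the owners, permissions, and mtimes that `metastore -s` saved):

```shell
#!/bin/sh
set -e

# Stand-in for the bare backup repo on the drive, seeded with
# one commit so there is something to restore.
git init -q --bare restore-drive.git
git clone -q restore-drive.git seed 2>/dev/null
( cd seed
  git config user.email backup@example.invalid
  git config user.name "backup job"
  echo "restored content" > notes.txt
  git add .
  git commit -q -m "seed"
  git push -q origin HEAD )

# The rebuild itself: clone the backup repo into the "new" home...
git clone -q restore-drive.git new-home
cd new-home
# ...then replay the saved file metadata, if metastore recorded any
# (guarded because metastore may not be installed here).
if command -v metastore >/dev/null && [ -f .metadata ]; then
    metastore -a
fi
```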