Backups and rdiff-backup

I have always thought nearlyfreespeech.net's advice on backups was sound:

You should adopt a backup policy that assumes we are storing crates of sweaty dynamite on top of the servers that hold your important data. (Even though we aren't.) [NFSN FAQs]

If you only have one copy of something, stop what you are doing, obtain a disk, and replicate it.

Here are some brief notes on my backup setup, including some things I've learned since I last wrote about backups. (I had a disk failure last December and restored everything from backup. No tears, no sweat. In fact the exact same thing seems to happen about once every year, which I suppose is a good testimonial. I'm probably due for a disk failure real soon now.)

  • My first line of defense is backing up to a secondary HDD in my machine. I mostly use rdiff-backup now (and only rsync for huge files, like disk images). This system seems to work well. rdiff-backup creates reverse diffs on each backup so you can retrieve old versions. All the diffs go in the rdiff-backup-data subdir; if you remove that you just get a plain mirror, like what rsync would do.
  • I wrote a FUSE filesystem, rdiff-snapshot-fs, that displays rdiff-backup's repository format as a series of mirrors in order to make it easier to browse historical snapshots. Restoring individual files from time to time is key to knowing your backups will actually work when you really need them.
  • Rather than scheduling backups with cron and having to leave my computer on at night or, alternatively, having backups happen while I'm working, I bound a hotkey to a script that backs up and then puts the computer into suspend. I run it when I leave the computer for the day, every day.
  • I also rsync to other backup locations, including a portable HDD that stays in a safe place when I'm not using it.
  • When restoring from a mirror, the -c flag to rsync is useful. It makes rsync decide which files differ by comparing checksums of their contents, rather than just modification time and size. Then if you have multiple backups of the same stuff you can easily identify and reconcile any differences between them.
  • I did try rsnapshot. Unfortunately it caused my system load average to shoot through the roof, making the system unresponsive while backups were being made. I have no idea why, but a few other people have reported the same thing.
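
For the curious, here is a rough sketch of how the pieces above could fit together in Python. It is not my actual script. The paths are made up, the rdiff-backup call uses the classic "source destination" form (newer releases also accept an explicit "backup" subcommand), and suspending via systemctl assumes a systemd machine. The helper names are only for illustration.

```python
import subprocess

# Hypothetical paths, for illustration only; adjust to your own layout.
SOURCE = "/home/me"
LOCAL_REPO = "/mnt/backup/home-rdiff"


def rdiff_backup_cmd(source, repo):
    """Incremental backup: repo holds a mirror plus reverse diffs
    (under rdiff-backup-data) for retrieving older versions."""
    return ["rdiff-backup", source, repo]


def rsync_mirror_cmd(source, dest):
    """Plain mirror. The trailing slash on the source makes rsync copy
    the directory's contents rather than the directory itself."""
    return ["rsync", "-a", "--delete", source.rstrip("/") + "/", dest]


def rsync_compare_cmd(a, b):
    """Dry run (-n) that itemizes (-i) what rsync would copy from a to b
    when files are compared by checksum (-c). Nothing is modified.
    Note this is one-directional: files that exist only in b are not listed."""
    return ["rsync", "-a", "-c", "-n", "-i", a.rstrip("/") + "/", b]


def suspend_cmd():
    """Assumes systemd; other setups need a different command."""
    return ["systemctl", "suspend"]


def backup_then_suspend(run=subprocess.run):
    """Back up first, and only suspend if the backup succeeded.
    check=True makes a failed backup raise instead of silently suspending."""
    run(rdiff_backup_cmd(SOURCE, LOCAL_REPO), check=True)
    run(suspend_cmd(), check=True)
```

Binding a hotkey to a small script that calls backup_then_suspend() gives the "back up, then sleep" behaviour described above. Running the command from rsync_compare_cmd against two backups of the same data prints the files whose contents differ, without touching either copy.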
