Incremental Server Backup Using rsync

Using rsync As a Backup Tool

While managing a few servers, it became clear that - in addition to our usual incremental and full backups - some kind of "live backup" would be a handy addition: a backup copy that regularly stores a snapshot of the live system "as is" by replicating the remote file system into a local backup directory. This makes it easy to access individual files of the mirrored system as they were at the time of the last backup run. An additional idea was to keep a limited history by storing a predefined number of snapshot copies and deleting the oldest whenever a new copy is created. rsync lets you do all this in a very simple way, using only a fraction of the transmission bandwidth and storage space you probably thought were necessary.

Note: If you're trying to keep a complete history of changes of a whole system or some directory hierarchy ("/home/yourself", anyone? ;), you should definitely have a look at fsvs instead. fsvs uses the well-known Subversion version control system as its backend storage to efficiently record the complete history of changes of a selected directory hierarchy, including the files' metadata such as access permissions and file ownership, and without the storage space penalty usually associated with a Subversion working copy.

I'm using fsvs to keep a versioned backup of my whole /home directory and to synchronize it between my laptop and my desktop computer.

rsync is widely renowned for its ability to efficiently copy and synchronize files via (low-bandwidth) network connections. If an older version of a copied file already exists on the target system, rsync can update it by copying only the differences, which often greatly reduces the amount of data to be transferred and speeds up the synchronization significantly.

It seems to be less commonly known that rsync can also use an existing copy of files, or even of complete directory hierarchies, to create a new and independent local copy. It still transfers only the differences via the network and takes all unchanged data from the existing local copy without modifying it. rsync can even be told to copy only changed files and hard link all the others, so that unchanged files do not take up any additional storage space.

I'm using these capabilities to create versioned short-term backups of quite a few servers, and it works really well. However, there's still a catch or two which you may not think of until something bad happens (at least, I didn't. ;) I therefore decided to publish the scripts I'm currently using, which have evolved over the last years, as examples on which you can base your own rsync-powered backup scripts.
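The hard-linking behaviour described above is available through rsync's --link-dest option. The following is a minimal sketch and not taken from the published scripts: the function name make_snapshot and all paths are assumptions made up for illustration.

```shell
#!/bin/sh
# make_snapshot SRC PREV NEW   (hypothetical helper, for illustration only)
#   SRC   source directory; end it with "/" to copy its contents,
#         e.g. a local path or "user@host:/path/"
#   PREV  previous snapshot directory; use an absolute path, because a
#         relative --link-dest is resolved relative to the destination
#   NEW   snapshot directory to create
make_snapshot() {
    src=$1
    prev=$2
    new=$3
    # -a           archive mode: recurse, keep permissions, times, symlinks
    # --delete     do not carry over files that vanished from the source
    # --link-dest  files identical to their counterpart in PREV become
    #              hard links in NEW instead of fresh copies
    rsync -a --delete --link-dest="$prev" "$src" "$new"
}
```

A call such as `make_snapshot /srv/data/ /backup/snap.1 /backup/snap.0/` leaves snap.1 untouched. Only files that differ from snap.1 occupy new disk space in snap.0.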

Besides that, publishing the scripts at this place finally gets them into my version control system as a side effect... ;)

IMPORTANT DISCLAIMER: The scripts provided for download on this page are mere examples which should be seen solely as templates for your own backup solutions. These scripts may not provide the bullet-proof, high-security backup solution you're looking for. There's no guarantee that these scripts will even work for anyone but me. (Although it's likely they will... ;) Anyone who wants to use these scripts should have read them and should have understood what they are doing, how they are doing it and why.

Download

To increase the flexibility of the backup scripts, I split them into a rather static main part, which does all the dirty work, and a tiny start stub, which contains a bunch of configurable parameters that are likely to differ from backup server to backup server.

Conceptual Overview

The Big Picture

The provided backup script uses rsync to mirror (a part of) a remote server's file system. An already existing local copy will not be overwritten, but will instead be used as a template on which a new copy is based. This new copy will in fact only contain copies of files which actually changed compared to the previous mirror copy, and will simply reference any unchanged files using hard links.

The number of backup copy generations kept this way is freely configurable. I use settings of 7 and 2 on different systems (corresponding to one week and two days, respectively, if one backup run is performed each day).

Additionally, the script provides the possibility to keep monthly snapshots of the directories backed up. These also reference any unchanged files in the normal backups, so the monthly snapshots only start to take up extra disk space once the files on the live system change during daily operations.
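Keeping a fixed number of generations boils down to a small rotation step before each run. The sketch below is one way to do it, under assumptions of my own: snapshots are directories named BASE.0 (newest) to BASE.(KEEP-1) (oldest), and the function name rotate_snapshots is hypothetical. The published scripts may organize this differently.

```shell
#!/bin/sh
# rotate_snapshots BASE KEEP   (hypothetical helper, for illustration only)
# Drops the oldest generation and shifts all others up by one, so that
# BASE.0 is free for the next backup run and BASE.1 holds the most
# recent existing snapshot.
rotate_snapshots() {
    base=$1
    keep=$2
    i=$((keep - 1))
    # Delete the oldest generation, if it exists.
    if [ -d "$base.$i" ]; then
        rm -rf -- "$base.$i"
    fi
    # Shift the remaining generations, oldest first: .5 -> .6, .4 -> .5, ...
    while [ "$i" -gt 0 ]; do
        younger=$((i - 1))
        if [ -d "$base.$younger" ]; then
            mv -- "$base.$younger" "$base.$i"
        fi
        i=$younger
    done
}
```

With a setting of 7, `rotate_snapshots /backup/server-a 7` removes /backup/server-a.6 and renames .5 to .6, .4 to .5 and so on. Because unchanged files are hard links, deleting the oldest generation only frees the space of files no other generation references any more.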
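A monthly snapshot that shares its files with the daily backups can be created purely locally with hard links. The following sketch assumes GNU cp (the -l option is not available in every cp implementation). The function name monthly_snapshot, the directory layout and the "YYYY-MM" naming are assumptions for illustration only.

```shell
#!/bin/sh
# monthly_snapshot LATEST MONTHLY_DIR   (hypothetical helper)
# Creates MONTHLY_DIR/YYYY-MM as a hard-linked copy of the snapshot
# directory LATEST, unless a snapshot for this month already exists.
monthly_snapshot() {
    latest=$1
    monthly_dir=$2
    target="$monthly_dir/$(date +%Y-%m)"
    # Keep only the first snapshot taken in any given month.
    if [ -e "$target" ]; then
        return 0
    fi
    mkdir -p -- "$monthly_dir"
    # GNU cp: -a preserves attributes and recurses, -l creates hard
    # links instead of copying file contents.
    cp -al -- "$latest" "$target"
}
```

Since only directory entries are created, such a snapshot costs almost no space at first. It keeps its files alive after the daily generations referencing them have been rotated away.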

Managing Rsync Access

To keep the following explanations more concise, I will refer to the server which should be backed up simply as "Server A" in all following paragraphs. The server performing the backup will be referred to as the "backup server".

The provided example scripts do not require an rsync server process to be running on Server A. Instead, rsync is accessed and controlled solely via Secure Shell (SSH), which is achieved by using a special administrative user account - called "rsyncslave" in my scripts - created for this purpose.

"rsyncslave" should be used for this purpose exclusively. It should have an invalid password and should only be able to execute a single command - the rsync command for mirroring the desired file systems. For this purpose, I use sudo to allow rsyncslave root access to all files for backup purposes:

XXXXXXXX

(If only parts of the remote file system should be mirrored, a more restricted user account will probably suffice to access all required files. If this is the case, you should take this opportunity to increase your backup system's security even more.)

The backup server must have key-authenticated SSH access to Server A. Using a corresponding rsyncd.conf, it is able to access the part of the file system it should back up.

You can automate the backup by periodically invoking the script using the cron daemon. This way, you'll also get the script's output messages via email and can easily monitor the backup's result.
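To illustrate how a key-authenticated pull via SSH can look from the backup server's side, here is a simplified sketch. It is not the setup used by the published scripts: instead of an rsyncd.conf it lets the remote side run "sudo rsync" via rsync's --rsync-path option, which assumes that rsyncslave may run rsync through sudo without a password. The function name, the key path and the BACKUP_SSH_KEY and DRY_RUN variables are made up for this example.

```shell
#!/bin/sh
# pull_from_server HOST REMOTE_PATH DEST   (hypothetical helper)
# Mirrors REMOTE_PATH from HOST into DEST as user "rsyncslave".
# Set DRY_RUN=1 to print the rsync command instead of running it.
pull_from_server() {
    host=$1
    remote_path=$2
    dest=$3
    # Assumed location of the private key; override via BACKUP_SSH_KEY.
    key=${BACKUP_SSH_KEY:-/root/.ssh/id_backup}
    # ${DRY_RUN:+echo} expands to "echo" only if DRY_RUN is non-empty.
    # -e            use ssh with the dedicated key, never ask for input
    # --rsync-path  start rsync on the remote side through sudo
    # --numeric-ids keep numeric uid/gid instead of mapping user names
    ${DRY_RUN:+echo} rsync -a --delete --numeric-ids \
        -e "ssh -i $key -o BatchMode=yes" \
        --rsync-path="sudo rsync" \
        "rsyncslave@$host:$remote_path" "$dest"
}
```

Running `DRY_RUN=1 pull_from_server server-a.example.org /etc/ /backup/etc.0/` prints the command that would be executed, which is handy for checking the configuration before the first real run.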

Important Security Considerations

Performing a "live backup" as described should only be an additional means to secure your data and improve its accessibility in case of a server failure, ALONGSIDE some reliable traditional backup.

Compared to the usual tape backups or archived backups to special storage arrays, a "live backup" is much more fragile. All files are mirrored one by one into the backup file system, and if that file system fails, it may take your whole backup with it. Additionally, unchanged files are physically stored only once on the backup machine, which was an explicit design decision of the described backup procedure. However, this also means that a disk surface failure in such a file's storage area loses that file in every snapshot at once!

The fact that the rsync process on the backup server must usually run with root permissions, to be able to correctly copy all file ownerships and attributes, may pose a risk to the backup system. A security-critical bug in rsync could be used to gain root access to the backup server. However, an attacker would have to take over Server A first in order to exploit such a bug.

The opposite holds for Server A: rsync also has to run with root permissions there to be able to read all data which should be backed up. If someone has already compromised your backup server and rsync is exploitable, the attacker could try to break into Server A this way. This problem is not specific to the backup mechanism presented here, as most backup programs have to run with root permissions to read their data. Programs which do not use a split-privileges model suffer from just the same potential problem.

© Gunter Ohrner 2007-2013