Incremental Backup using rsync with Hardlinks
Backup, Linux, Sysadmin, Ubuntu, rsync
This method involves an incremental backup bash script that runs on a production server. The aim is to automatically build a local backup which can then be synchronised with a remote backup server.
Production Server: Backup User
Create a backup user - without sudo privileges:
adduser servernamebackup
Production Server: Source Directory
Create a directory called /home/backupuser/source (backupuser stands for the backup user created above). This will be used as the source for the local rsync, so it forms a target for:
- A MySQL backup: mysqldump dumps database copies to /home/backupuser/source/sql
- Symlinks to config files (in /home/backupuser/source/config)
- Symlinks to site files (from the Apache root directory, /var/www/html)
Set up directories and add symlinks to /source:
sudo mkdir -p /home/backupuser/source/config # make the config directories
sudo mkdir /home/backupuser/source/sql # target for mysqldump
sudo ln -s /var/www/html /home/backupuser/source # Symlink site files
sudo ln -s /etc/apache2 /home/backupuser/source/config # Symlink Apache config files (including vhosts setup)
sudo ln -s /etc/mysql /home/backupuser/source/config # Symlink MySQL config files
Production Server: Incremental Backup
Incremental backup script:
#!/bin/bash
#
# Website Backup Script for SERVER
#
# Add this script as a cronjob to run daily. It creates an archive of incremental
# backups, with the current date set as the name of the backup directory.
# Assumes the script is run from /usr/local/sbin.
# ------------------------------------------------------------------------------
# Today's date in ISO-8601 format:
# ------------------------------------------------------------------------------
DAY0=$(date -I)
TIMESTAMP=$(date '+%Y-%m-%d at %H:%M:%S')
# Yesterday's date in ISO-8601 format:
# ------------------------------------------------------------------------------
DAY1=$(date -I -d "1 day ago")
# The source directory: this should contain a symlink to Apache doc root.
# ------------------------------------------------------------------------------
SRC="/home/servernamebackup/source/"
# The target directory
# ------------------------------------------------------------------------------
TRG="/home/servernamebackup/backup/$DAY0"
# The link destination directory
# ------------------------------------------------------------------------------
LNK="/home/servernamebackup/backup/$DAY1"
# Backup databases
# ------------------------------------------------------------------------------
USER="XXXXXXXXXXXXXXXXXXX"
PASSWORD="XXXXXXXXXXXXXXXXXXXXXXXXXX"
HOST="localhost"
DB_BACKUP_PATH="${SRC}sql"
# Get list of databases, but not 'Database' or 'information_schema'
# ------------------------------------------------------------------------------
DATABASES=$(mysql --user=$USER --password=$PASSWORD -e "SHOW DATABASES;" | grep -Ev "(Database|information_schema)")
DUMPFAIL=false
# Remove previous dumped databases
# ------------------------------------------------------------------------------
rm -f "$DB_BACKUP_PATH"/*
# Set up log
# ------------------------------------------------------------------------------
echo "Database backup report. ${TIMESTAMP}" > $DB_BACKUP_PATH/DB_LOG
echo "=======================================" >> $DB_BACKUP_PATH/DB_LOG
# Create dumps for each database
# ------------------------------------------------------------------------------
for DB in $DATABASES
do
  mysqldump -v --user="$USER" --password="$PASSWORD" --single-transaction --log-error="$DB_BACKUP_PATH/$DB.log" --host="$HOST" "$DB" > "$DB_BACKUP_PATH/$DB.sql"
  # Reportage - log result of mysqldump
  # ----------------------------------------------------------------------------
  if [[ $? -eq 0 ]]
  then
    echo -e "Mysqldump created ${DB}.sql\n" >> "$DB_BACKUP_PATH/DB_LOG"
  else
    echo -e "Mysqldump encountered a problem backing up ${DB}. Look in ${DB_BACKUP_PATH}/${DB}.log for information.\n" >> "$DB_BACKUP_PATH/DB_LOG"
    # Flag the failure for the final report (an assignment takes no leading $)
    DUMPFAIL=true
  fi
done
# The rsync options: follow the symlinks to make a hard backup
# ------------------------------------------------------------------------------
OPT=(-avL --progress --delete "--link-dest=$LNK")
# Execute the backup
# ------------------------------------------------------------------------------
rsync "${OPT[@]}" "$SRC" "$TRG"
# Log Results
# ------------------------------------------------------------------------------
if [[ $? -gt 0 ]]
then
  # rsync Failure
  # ----------------------------------------------------------------------------
  echo "ERROR. rsync didn't complete the nightly backup: ${TIMESTAMP}" >> /var/log/server-backup.log
  echo "There was an error in the nightly backup for <servername>: ${TIMESTAMP}"| mail -s "Backup Error, <servername>" info@yourdomain.com
else
  # rsync Success
  # ----------------------------------------------------------------------------
  if [[ false == $DUMPFAIL ]]
  # rsync & mysqldump worked OK
  # --------------------------------------------------------------------------
  then
    echo "SUCCESS. Backup made on: ${TIMESTAMP}" >> /var/log/server-backup.log
    # email the report
    # ------------------------------------------------------------------------
    echo -e "${TIMESTAMP}: Server <servername> successfully ran a local backup.\nBoth rsync & mysqldump report success."| mail -s "Backup Success, <servername>" info@yourdomain.com
  # rsync worked but there was at least one mysqldump error
  # --------------------------------------------------------------------------
  else
    echo "PARTIAL SUCCESS. File backup (rsync) was successful, but mysqldump reports errors: ${TIMESTAMP}" >> /var/log/server-backup.log
    # email the report
    # ------------------------------------------------------------------------
    echo -e "${TIMESTAMP}: Server <servername> ran a local backup.\nFile backup reports success, however mysqldump reports at least one problem.\nCheck "| mail -s "Backup: Partial Success, <servername>" info@yourdomain.com
  fi
fi
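The script links against yesterday's directory, so if a nightly run is skipped the --link-dest directory does not exist and rsync has nothing to hard-link against. One possible refinement is to link against the most recent snapshot that actually exists. The following is only a sketch under the assumption that snapshots are named by ISO date directly under the backup root; latest_snapshot is a hypothetical helper that is not part of the original script, and the demo paths are temporary.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): print the newest dated
# snapshot directory under $1, ignoring the one named $2 (today's target).
# Prints nothing when no earlier snapshot exists.
latest_snapshot() {
  local root="$1" today="$2" dir latest=""
  # Glob results are sorted, and ISO-8601 dates sort chronologically,
  # so the last match that is not today's directory is the newest one.
  for dir in "$root"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
    [ -d "$dir" ] || continue
    [ "${dir##*/}" = "$today" ] && continue
    latest="$dir"
  done
  if [ -n "$latest" ]; then
    printf '%s\n' "$latest"
  fi
}

# Demo in a throwaway directory (illustrative dates, with gaps between runs)
BACKUP_ROOT=$(mktemp -d)
mkdir "$BACKUP_ROOT/2024-01-01" "$BACKUP_ROOT/2024-01-03" "$BACKUP_ROOT/2024-01-05"
DAY0=2024-01-05
LNK=$(latest_snapshot "$BACKUP_ROOT" "$DAY0")
echo "link-dest: $LNK"

# In the backup script, the option would only be added when a snapshot was found:
#   [ -n "$LNK" ] && OPT+=("--link-dest=$LNK")
```

In the demo, the snapshot for 2024-01-04 is missing, so the helper falls back to 2024-01-03 rather than pointing at a directory that is not there.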
- Add the incremental backup script to /usr/local/sbin
- Make executable
- Set up cronjob
# Upload script
rsync --progress -a -v -rz -e "ssh -p 1234" ~/bash-projects/remote-backup-scripts/servername/backup-servername daviduser@123.456.789.0:~/
# Move into position
sudo mv ~/backup-servername /usr/local/sbin
# Make executable
sudo chmod u+x /usr/local/sbin/backup-servername
# Open crontable
sudo crontab -e
# Add the following to the crontab, save and exit
# Send an email report from this cronjob (MAILTO only applies to jobs listed below it)
MAILTO=info@yourdomain.com
# Run backup script every day at 3 am
00 03 * * * /usr/local/sbin/backup-servername
--delete
The delete option causes files that are not in source to be deleted from the target - and ONLY the target:
This tells rsync to delete extraneous files from the receiving side (ones that aren’t on the sending side), but only for the directories that are being synchronized. You must have asked rsync to send the whole directory … Files that are excluded from the transfer are also excluded from being deleted unless you use the --delete-excluded option or mark the rules as only matching on the sending side.
I originally used the --delete flag on this script, thinking that it would be necessary to keep the latest incremental backup synchronised with the current source filesystem.
However, --delete has no effect when syncing into an empty directory, which is basically what happens when using the --link-dest method of rsync. The directory specified by --link-dest is used as a reference point, and any files in source that are unchanged relative to this reference point are hardlinked to their existing inode in the reference directory.
Files that are not in the source directory (because they have been deleted) are therefore not hardlinked in the backup directory.
However, there is a possibility that rsync is backing up to an incomplete directory that is left over from a previous failed run, and may contain things to delete.
--delete is probably superfluous when using --link-dest. The backup directory accurately reflects the source directory. However, we leave it in place in case of backing up to an incomplete directory.
Note: Be careful deleting old snapshots. The old directory in its entirety should be properly archived and a new full backup snapshot should be taken to kick off the next round of incremental backups.
This warning is not necessary, as the --link-dest option creates hard links: every snapshot directory holds its own link to each file, so removing an old snapshot does not remove data that newer snapshots still reference. Thanks to aselinux for pointing this out in the comments.
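Why an old snapshot can be removed safely comes down to how hard links work, and this can be seen without rsync at all. The sketch below uses ln to stand in for what --link-dest does for an unchanged file; the dated directory names and the file are illustrative only.

```shell
#!/bin/bash
# Throwaway demo of hard-link semantics (temporary paths only)
SNAP=$(mktemp -d)
mkdir "$SNAP/2024-01-01" "$SNAP/2024-01-02"
echo "site data" > "$SNAP/2024-01-01/index.html"

# Stand-in for --link-dest on an unchanged file: a second name for the same inode
ln "$SNAP/2024-01-01/index.html" "$SNAP/2024-01-02/index.html"

# Remove the older snapshot entirely
rm -r "$SNAP/2024-01-01"

# The newer snapshot still holds the full file content
cat "$SNAP/2024-01-02/index.html"   # prints: site data
```

The data blocks are only freed once the last link to them is removed, so each dated directory behaves like a full backup even though unchanged files are stored once.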
Result
This will result in a backup directory on the production server, /home/backupuser/backup, that contains dated incremental backups.
The Production backup directory can then be targeted by a Backup server, again via rsync. Typically, this might involve:
- Access to the backup directory by means of SSH public/private key pair
- rrsync access for the backupuser - only rsync can run on the given SSH key, restricted to read-only
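As a rough illustration of the second point, the restriction is normally expressed as a forced command in the backup user's authorized_keys file. The sketch below only builds and prints such a line. It assumes rrsync is installed at /usr/bin/rrsync (the location varies between distributions), that the OpenSSH version supports the restrict option, and it uses a placeholder in place of the backup server's real public key.

```shell
#!/bin/bash
# Assumptions (not from the article): the rrsync path, and a placeholder key
# standing in for the backup server's real public key.
RRSYNC=/usr/bin/rrsync
BACKUP_DIR=/home/backupuser/backup
PUBKEY='ssh-ed25519 AAAA...placeholder... backup@backupserver'

# -ro makes rrsync refuse writes; "restrict" disables port forwarding,
# agent forwarding, PTY allocation and similar features for this key.
ENTRY="command=\"$RRSYNC -ro $BACKUP_DIR\",restrict $PUBKEY"

# Print the line; on the production server it would be appended to
# /home/backupuser/.ssh/authorized_keys
printf '%s\n' "$ENTRY"
```

With an entry of this kind, the key can only be used to run rsync in read-only mode, and paths requested by the backup server are interpreted relative to the backup directory.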
Next Steps
- Set up a secure rsync link between servers
- Run a cronjob on the backup server to read and rsync the backup directory on the Production server
Resources
- Excellent readable article on rsync
- The original article by Mike Rubel
- Incrementally numbered backups, using rsync
- Database backup
- Example bash script using rsync for local and remote backups
- rsync & cp for hardlinks
- Snapshot backup - rsync with hardlinks - with detailed examples
- Basic rsync examples
- rsync man page - surprisingly helpful!
- Time machine/rsync with good illustration of hard links - good sample “instant backup” script - backup every minute!
- mysqldump status
- Link Dest behaviour explanation: answer in the rsync mailing list archive
- Link Dest and deleting old snapshots