Tuesday, 7 July 2020

Rsync Backup System [6]: Latest Progress / Using The Rsync Daemon

Because of issues with rsync over ssh, I am looking into using the rsync daemon instead; my notes on that are further down. Until I can get that working properly, I am backing up each computer individually using its own removable drive bay. To do this, I change the ownership of the backup folder on the removable backup drive to backupuser, log into a virtual terminal as backupuser, and run the backup under that account. I use that account rather than root or my regular user account because it has only read and execute permissions on the computer being backed up, so it cannot in any way risk deleting files from that computer. This is very important because I use the --delete flag on rsync so that files removed from the source are also removed from the backup. For an incremental backup that flag matters, because otherwise the disk gradually fills up with removed files and runs out of space. It is also a useful safeguard that you can't accidentally copy files from the backup disk over the source files, especially with incrementals, where the backup disk is already full of files.
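The effect of --delete can be checked on throwaway directories before trusting it with a real backup disk. The following is only a sketch and not my actual backup command: the two temporary directories made with mktemp stand in for the source folder and the folder on the removable drive, and the file names are made up.

```shell
#!/bin/sh
# Demonstration of rsync --delete on throwaway directories (not real backup paths).
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync is not installed; skipping demonstration"
    exit 0
fi

src=$(mktemp -d)    # stands in for the folder being backed up
dst=$(mktemp -d)    # stands in for the folder on the removable backup drive

echo "keep me" > "$src/keep.txt"
echo "delete me later" > "$src/old.txt"

# First run: full copy. The trailing slash on "$src/" copies the folder's
# contents rather than the folder itself.
rsync -a --delete "$src/" "$dst/"

# Remove a file from the source, then run exactly the same command again.
rm "$src/old.txt"
rsync -a --delete "$src/" "$dst/"

# old.txt has now been removed from the backup as well; keep.txt remains.
remaining=$(ls "$dst")
echo "$remaining"

# Tidy up the throwaway directories.
rm -rf "$src" "$dst"
```

Adding --dry-run to the rsync command makes it report what would be deleted without touching anything, which is a sensible first step on a real disk.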

The simplest form of incremental backup that rsync supports is to reuse a backup disk that already holds a full backup. Rsync compares the source directly with the existing backup, so it needs no separate system recording file details in order to work out what has changed. In other words, it compares each source file with the copy in the existing backup and then backs up only the changes. This is quite different from other backup solutions I have used, which keep incrementals on a separate disk.
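To see that a second run against an existing backup only picks up what changed, the --itemize-changes option can be used to list the files rsync acts on. Again this is a sketch on temporary directories with made-up file names, not the real backup paths.

```shell
#!/bin/sh
# A second rsync run against an existing backup only transfers changed files.
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync is not installed; skipping demonstration"
    exit 0
fi

src=$(mktemp -d)    # stands in for the source folder
dst=$(mktemp -d)    # stands in for the existing full backup

echo "unchanged" > "$src/a.txt"
echo "version 1" > "$src/b.txt"

# Initial full backup.
rsync -a "$src/" "$dst/"

# Change one file. Its size changes, so rsync's default size-and-time
# comparison will notice it on the next run.
echo "version 2, now longer" >> "$src/b.txt"

# Second run: capture the list of items rsync updates.
changes=$(rsync -a --itemize-changes "$src/" "$dst/")

# b.txt appears in the list; a.txt does not, because it was not touched.
echo "$changes"

rm -rf "$src" "$dst"
```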

To implement this system I will need at least 3 disks for each computer if I want to keep one full backup and two incrementals. In other words each disk will start off as a full backup, then two of them will be alternating incrementals, and a full backup from scratch will be done every few months and kept separately from the incrementals.

I would still have preferred separate incrementals, and in fact this is possible with two removable drive bays: one holds the full backup disk and the other holds an incremental disk, and the incremental is then written to the incremental disk instead of the full backup. The problem with this solution is that in this setup rsync cannot produce progressive incrementals, because it is always comparing with the last full backup. Each incremental is therefore not a snapshot of the changes since the previous incremental, but always contains everything changed since the date of the last full backup. It is also messy working with the extra paths to the different disks in the backup command, which makes mistakes such as backing up to the wrong disk more likely.
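One way to get this two-disk behaviour from rsync is its --compare-dest option, which skips any file that is identical to the copy in a reference directory. That this is the right option for the setup described above is my assumption; the sketch below uses temporary directories in place of the full backup disk and the incremental disk.

```shell
#!/bin/sh
# Separate incremental using --compare-dest, on throwaway directories.
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync is not installed; skipping demonstration"
    exit 0
fi

src=$(mktemp -d)     # stands in for the source folder
full=$(mktemp -d)    # stands in for the full backup disk
incr=$(mktemp -d)    # stands in for the incremental disk

echo "unchanged" > "$src/a.txt"
echo "version 1" > "$src/b.txt"

# Full backup.
rsync -a "$src/" "$full/"

# Change one file after the full backup was taken.
echo "version 2, now longer" >> "$src/b.txt"

# Incremental: files identical to those under $full are skipped.
# An absolute path is used because a relative --compare-dest path is
# interpreted relative to the destination directory.
rsync -a --compare-dest="$full" "$src/" "$incr/"

# Only the changed file lands on the incremental disk.
incremental_files=$(ls "$incr")
echo "$incremental_files"

rm -rf "$src" "$full" "$incr"
```

Because the comparison is always against the full backup, running this again later would copy b.txt again along with anything else changed since the full backup, which is the non-progressive behaviour described above.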

So in summary, the best solution is a separate disk for each backup: two of the disks hold full backups that are updated incrementally, and one disk holds a separate full-only backup.

At the moment I am down to 1 backup disk for each computer and so I need to buy some more disks along with the removable bay caddies and storage cases. I am also working on secure storage for the disks in the storage location which is away from the house. At the moment I have only been doing one backup about every 3 months because the RAID-1 array system in the computer is so reliable for day to day backup that I don't need backups too often. But I will attempt to schedule a monthly backup cycle in future.

Since I started using rsync as a backup solution it has been very reliable, but there are occasional difficulties. The one I am having at the moment is the backup terminating partway through with rsync error 12 (error in rsync protocol data stream). I have done some fault finding but haven't been able to narrow down what causes this termination. It probably has something to do with the ssh server on the source computer, which rsync connects to. This means I have to either find out what is happening with ssh on the source, or try the alternative, which is running the rsync daemon (service) on the source and connecting rsync to that from the backup server.
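When fault finding, it helps to record the exit status of each run along with its meaning. The helper below is a hypothetical function of my own naming, not part of rsync; it covers only a few of the codes listed in the EXIT VALUES section of the rsync man page.

```shell
#!/bin/sh
# Hypothetical helper: translate a few rsync exit codes into their meanings.
describe_rsync_exit() {
    case "$1" in
        0)  echo "success" ;;
        12) echo "error in rsync protocol data stream" ;;
        23) echo "partial transfer due to error" ;;
        24) echo "partial transfer due to vanished source files" ;;
        30) echo "timeout in data send/receive" ;;
        *)  echo "other error, see EXIT VALUES in the rsync man page" ;;
    esac
}

# In a real backup script the status would come from "$?" straight after
# the rsync command. Here it is set by hand for illustration.
status=12
echo "rsync exited with $status: $(describe_rsync_exit "$status")"
```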


Firstly, as I am on Debian, the daemon is already installed as part of the rsync package. The next step is to configure the /etc/rsyncd.conf file, which I will base on this sample provided in that article:


uid = 1001
gid = 1001
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsync.log

[backupuser]
path = /home/patrick/
comment = backup patrick
read only = true
timeout = 300
     
 
I have made some changes from their sample, the most notable being the uid and gid, which are set to the values for backupuser, the user that has permission to do the backup on the source computer. Assuming rsyncd is started as root, it will switch to that uid and gid for transfers, which ensures the backup runs with backupuser's permissions. I have also named the module backupuser, and it maps to the backup path for my home folder.

Start the daemon with systemctl start rsync and set it to start at boot with systemctl enable rsync.
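Once the daemon is running, the backup server should be able to pull from the module using an rsync:// URL. I have not tested this yet, so the following is only a sketch: the hostname and the destination path are placeholders I have made up, and the script prints the commands rather than running them.

```shell
#!/bin/sh
# Placeholder values: replace with the real source hostname and backup mount point.
SOURCE_HOST="sourcepc"         # assumption: hostname of the computer running the daemon
MODULE="backupuser"            # module name from /etc/rsyncd.conf
DEST="/mnt/backup/patrick/"    # assumption: folder on the mounted backup disk

# rsync:// URL for the module; the daemon listens on TCP port 873 by default.
url="rsync://${SOURCE_HOST}/${MODULE}/"

# Printed rather than executed, because the host above is made up.
echo "rsync ${url}"                                                   # list the files in the module
echo "rsync -a --delete --dry-run --itemize-changes ${url} ${DEST}"   # rehearsal, changes nothing
echo "rsync -a --delete ${url} ${DEST}"                               # the actual pull from the daemon
```

Running the rehearsal command first shows what would be copied or deleted on the backup disk before the real run.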

Next time I will report the results of testing with the daemon and whether it worked.