Monday, 21 October 2019

Rsync Backup System [2]: Using ZFS For Compressed Backup Disks

So last time I talked briefly about ideas for using Rsync to do my backups. Over time this will gel into a whole lot of stuff like specific scripts and so on. Right now there will be a few different setup steps to go through. The first stage is to come up with a filesystem for backup disks, which are removable, and this time around I am having a look at ZFS.

Based on some material I found on the web, the first step is to install ZFS on the computer that does the backups. First the contrib repository needs to be enabled in Debian, and then we need to install the following packages:

apt install dpkg-dev linux-headers-$(uname -r) linux-image-amd64
apt-get install zfs-dkms zfsutils-linux

After completing these steps I also had to run modprobe zfs to load the kernel module (this was flagged by an error message from systemd). This only applies to the currently running session. To ensure there are no "ZFS modules are not loaded" errors after rebooting, we need to edit /etc/modules and add zfs to the list of modules to be loaded at startup.
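The /etc/modules edit can be made safe to repeat. This is only a minimal sketch, assuming a POSIX shell run as root; ensure_module_listed is a hypothetical helper name of my own, not something shipped with Debian or ZFS.

```shell
# Hypothetical helper: append a module name to a modules file,
# but only if that exact line is not already present.
ensure_module_listed() {
    module="$1"
    file="$2"
    # grep -x matches whole lines only, -q suppresses output.
    grep -qx -- "$module" "$file" 2>/dev/null || printf '%s\n' "$module" >> "$file"
}

# Intended usage, as root:
#   modprobe zfs
#   ensure_module_listed zfs /etc/modules
```

Running it twice leaves a single zfs line in the file, so it can sit in a setup script without piling up duplicates.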

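Since Debian Buster, a lot of commands that previously could be run in a terminal window have to be run in a virtual console, reached by pressing Ctrl-Alt-F1. This is certainly the case for the zfsutils commands, so the following steps need to be run in such a console. (A likely cause is that, from Buster onward, plain su no longer puts the sbin directories on the PATH, so su - may be enough in a terminal window.)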

The next step after installation is to create a filesystem on a disk and set up compression on it. In this case the existing disk is called spcbkup1, and the ext4 partition currently on it appears as /dev/sdb1, which means the disk itself is /dev/sdb.

After deleting the existing filesystem from /dev/sdb the following command creates the storage pool:
zpool create -m /mnt/backup/spcbkup1 spcbkup1 /dev/sdb

whereby the new storage "pool" (just a single disk, which is not really what ZFS is made for, but it can be done) is called spcbkup1, uses the physical disk /dev/sdb, and is set up to mount at /mnt/backup/spcbkup1.

Issuing the blkid command at this point tells me I have two new partitions. Apart from /dev/sdb1, which is of type "zfs_member", there is also a small partition called /dev/sdb9, which ZFS reserves for its own use when it is given a whole disk. The pool is mounted at this point and can be used like a normal disk.

To turn on compression we then use this command:
zfs set compression=lz4 spcbkup1
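To check that the setting took effect, the compression and compressratio properties can be read back. A small sketch, assuming the pool name spcbkup1 from above; show_compression is a hypothetical wrapper name used only for illustration.

```shell
# Hypothetical wrapper: print the compression algorithm and the achieved
# compression ratio for a pool or dataset.
# -H gives script-friendly output, -o picks the columns to print.
show_compression() {
    zfs get -H -o property,value compression,compressratio "$1"
}

# Intended usage, as root:
#   show_compression spcbkup1
```

The ratio only becomes meaningful once some data has been written to the pool after compression was turned on.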

There is one more question, and that is automounting. When you put an entry into /etc/fstab, the default is for it to be mounted automatically at boot. For a removable disk we don't want this, so for an ext4 disk the entries in /etc/fstab end up looking like this:
UUID=....     /mnt/backup/spcbkup1 ext4 noauto 0 2
where the noauto option means it will not be mounted automatically. Instead you have to mount it manually before use, obviously after inserting the disk. In my setup I hibernate or turn off the computer before inserting or removing the disk.

ZFS is a little different. We use zfs set mountpoint=none spcbkup1 to remove the mount point before removing the disk. Later on, when we want to put the disk back in, we use zfs set mountpoint=/mnt/backup/spcbkup1 spcbkup1 to restore the mountpoint. As we can change mountpoints on the fly without a config file or needing to know a long UUID, this lends itself to mounting more than one disk, one at a time, to the same mountpoint. In other words, I can dispense with a different mountpoint for each of the disks and mount them all to the same path. So for this backup scheme I will have two mountpoints for spcbkup: one for the backup disk(s) holding the full backup, and the other for the backup disk(s) holding the incremental backup. These will look like:
/mnt/backup/spcbkup-full
/mnt/backup/spcbkup-incr

where the suffix is hopefully self-explanatory, and as long as I swap the incremental disks in and out, the backups proceed automagically.
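The swap procedure described above could be wrapped up roughly as follows. This is a sketch under assumptions: detach_pool, attach_pool, run and the DRY_RUN variable are hypothetical names of my own, and the only ZFS command used is the zfs set mountpoint=... form already shown.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Before removing a disk: clear the pool's mountpoint.
# Usage: detach_pool POOL
detach_pool() {
    run zfs set mountpoint=none "$1"
}

# After inserting a disk: point the pool at the shared mountpoint.
# Usage: attach_pool POOL MOUNTPOINT
attach_pool() {
    run zfs set "mountpoint=$2" "$1"
}

# Intended usage, as root:
#   detach_pool spcbkup1
#   attach_pool spcbkup1 /mnt/backup/spcbkup-incr
```

Setting DRY_RUN=1 first prints the commands without touching any pool, which is handy for checking a script before trusting it. ZFS also provides zpool export and zpool import for moving a pool between sessions or machines; they are not part of the procedure described here.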

For this scheme we have just one backup disk for the full backup, and all sources share one backup disk for incrementals, so the paths will look like
/mnt/backup/fullbackup
/mnt/backup/incrbackup

and the two incremental disk pools will be called incrbackup-odd and incrbackup-even, which refers to the week number. The backup script will work out the week number itself and act accordingly. The plan is that each incremental disk will contain two generations of backups, so with two disks there will be four incremental generations at any one time. In practice we take the week number and find the remainder after dividing by 4, which gives a number from 0 to 3. The backup for that week is then done into a folder on the disk called mainpc-x, serverpc-x or mediapc-x, where x is that remainder.

rsync will be set to delete any redundant files from the target at each sync. The week number is also used to calculate the date range for the command, so that it can use file modification times to identify the files that have changed within that range. The script runs automatically on the same day and at the same time each week for the incrementals. About every three months a full backup of each computer is done, on a different day from the scripted day so that the script is not disturbed; with three computers this amounts to one full backup each month.
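The week arithmetic can be sketched like this. It assumes the ISO week number from date +%V (the scheme above only says "week number", so that choice is mine), and week_slot, week_pool and backup_folder are hypothetical function names rather than part of the eventual script.

```shell
# Generation slot (0-3) for a given week number.
week_slot() {
    week=${1#0}    # strip a leading zero so "08" is not read as octal
    echo $(( week % 4 ))
}

# Which incremental pool belongs to a given week number.
week_pool() {
    week=${1#0}
    if [ $(( week % 2 )) -eq 0 ]; then
        echo incrbackup-even
    else
        echo incrbackup-odd
    fi
}

# Folder name for a source computer in a given week.
# Usage: backup_folder HOST WEEK
backup_folder() {
    echo "$1-$(week_slot "$2")"
}

# Show the pool and target folder for mainpc in the current week.
week=$(date +%V)
echo "$(week_pool "$week") /mnt/backup/incrbackup/$(backup_folder mainpc "$week")"
```

Odd weeks land in slots 1 and 3 and even weeks in slots 0 and 2, which matches two generations per disk. One caveat is that in a year with 53 ISO weeks, week 53 and the following week 1 are both odd and both map to slot 1, so the rotation skips a beat at that year boundary.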

It is important to unmount using the above commands before removing the disk, otherwise the system will behave badly at next startup. (Although the Raidon disk caddies are theoretically hot-swappable, in practice I am using them as non-hot-swap for various reasons, so the system is shut down or hibernated before changing disks.)

So having worked out how to set up the disks, the next step is to work out how to use rsync, and I will be running a full backup of serverpc first of all.