Wednesday, 23 October 2019

Rsync Backup System [4]: Linux File and Directory Security Using ACLs

As per our series on Rsync backups, we want the backups to run as a different user from the one that owns the home directory, so that the user logging in remotely has only the permissions it needs. By default the backup user could access many things on the computer, but it received permission denied errors for some files.

Whilst there is a simple file and directory permission scheme built into Linux, the ACL extensions are very worthwhile as they give a lot more control, especially the all-important ability to set inheritable permissions on a parent folder. In this case we want to give the backup user read-only permissions on the home directory that are inherited by all subdirectories and files as they are created.

Getting ACLs working is pretty simple. Firstly you need to install the acl package using apt. The second requirement is to check whether the volume is mounted with ACL support. Run tune2fs -l /dev/xxx (part of the e2fsprogs package) and look for "Default mount options". If acl is listed there, the volume will be mounted with ACLs enabled by default. If it is not, add acl to the mount options in the volume's entry in /etc/fstab. In this case, because I used "defaults" as the mount options in fstab and acl is among the volume's default mount options, ACLs are turned on automatically at mount time. If I needed to specify acl explicitly, I could change the word "defaults" to "defaults,acl" in the fstab entry.

ZFS also has its own (more extensive) ACL support, which is not covered by this post because the existing filesystems on the computers being backed up are ext4. I am not planning to change the existing RAID arrays from mdadm to ZFS, but it could be an interesting consideration for the future should a disk array be replaced in any system.
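
The tune2fs check described above can be scripted. This is a minimal sketch, assuming the tune2fs output contains a "Default mount options:" line as described; the function name has_default_acl is made up for illustration and /dev/xxx remains a placeholder for the real device.

```shell
#!/bin/sh
# Hypothetical helper: reads `tune2fs -l` output on stdin and succeeds
# if "acl" is listed among the default mount options.
has_default_acl() {
    grep '^Default mount options:' | grep -qw 'acl'
}

# Usage (needs root; /dev/xxx is a placeholder for the real device):
#   tune2fs -l /dev/xxx | has_default_acl && echo "acl is on by default"
```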

Now how to configure ACLs for our requirement? Commands used to set and get ACLs are setfacl and getfacl respectively. To set up read access for backupuser on /home/patrick we need to use a command that looks like this:
setfacl -dR -m u:backupuser:rx /home/patrick

If we then run ls -l (or ls -al) in /home/patrick, a + sign next to the standard permissions tells us that ACLs are set. In this case I could see that whilst all directories had ACLs set, the files in each directory did not. That is because a default ACL can only be set on a directory, and it only affects files and directories created afterwards. So I needed to run another command to set ACLs on what already exists:
setfacl -R -m u:backupuser:rx /home/patrick
The two commands break down as follows:

The first one uses
-d to set the default ACL for each directory (the default ACL to be applied to any new file or directory)
-R to operate recursively on all subdirectories
-m to modify the ACLs
u:backupuser:rx to grant user backupuser read and execute permissions.

The second one is practically the same except that there is no -d, because it applies to the files and directories that already exist. Passing the directory itself with -R covers everything beneath it, hidden files included, and avoids shell filespecs such as /home/patrick/.* which in some shells also match .. and would make -R climb into /home.
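
Both steps can be wrapped in one small script. This is only a sketch: the function name grant_backup_read and the DRY_RUN switch are made up for illustration, and it passes the directory itself to setfacl rather than a list of filespecs.

```shell
#!/bin/sh
# Hypothetical wrapper around the two setfacl steps.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = user to grant access to, $2 = directory tree
grant_backup_read() {
    user=$1
    dir=$2
    # Default ACL on every directory, inherited by anything created later.
    run setfacl -dR -m "u:${user}:rx" "$dir"
    # Access ACL for everything that already exists under $dir.
    run setfacl -R -m "u:${user}:rx" "$dir"
}

# Usage:
#   (DRY_RUN=1; grant_backup_read backupuser /home/patrick)   # preview only
#   grant_backup_read backupuser /home/patrick                # apply
#   getfacl /home/patrick                                     # inspect the result
```

setfacl also accepts a capital X in place of x, which grants execute only on directories and on files that are already executable by someone. That may suit a read-only backup user better, but it is a variation rather than what was used here.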

Whenever a new file or directory is created in future it will inherit the default ACLs defined by the first command, which means backupuser will get the permissions it needs. Multiple paths can be given to one setfacl command, and several ACL entries can be passed to a single -m by separating them with commas.

ACLs work alongside the standard permissions system and, for the users they name, take the place of the standard group and other permissions, so you need a good knowledge of how Linux permissions work. In particular, for directories you need to give your user x (execute) as well as r (read) permission in the ACL so that it can traverse them, otherwise you will keep getting permission denied errors. It took me a bit of work to get the ACLs set up, but once they are done properly the advantage is that permissions are inherited, so the backup user will automatically have rights to new files and folders created in future. In fact, for any computer the way to make it happen is to set a default ACL on /home that gives the backup user rx permissions on everything created below it in future.
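
One way to confirm that the r and x requirements are met is to walk the tree as the backup user and list anything that cannot be read or entered. This is a minimal sketch, assuming GNU find (for the -readable and -executable tests); the function name list_inaccessible is made up for illustration.

```shell
#!/bin/sh
# Hypothetical check: print every path under $1 that the current user
# cannot read, plus every directory the current user cannot traverse.
list_inaccessible() {
    find "$1" \( ! -readable -o \( -type d ! -executable \) \) -print 2>/dev/null
}

# Usage: run the same find expression as the backup user, for example
#   sudo -u backupuser find /home/patrick \( ! -readable -o \( -type d ! -executable \) \) -print
# Any path that is printed still needs its ACL fixed.
```

Broken symbolic links may also show up in the output, since there is nothing readable behind them.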

With the ACLs properly set the test backup was able to access all the files it needed to and complete the backup. Since I knew there were roughly 114,000 items waiting for backup, it was interesting to watch rsync at work discovering new files (a big increase from the roughly 7000 it had access to the first time). 

The means of doing the incrementals is still being worked out as it is not as straightforward as I had hoped. The format of the log file is not consistent (at least not so far), which makes it hard to determine which files transferred successfully. The option I am tending towards is to get a list of files that have changed since the last backup, have rsync copy them one at a time, look at the result code it returns, and if there is no error write an extended attribute on the source file to record the successful backup. The extended attributes will probably be backup.full = <date> and backup.incr = <date>, each written by the appropriate backup script. A full backup simply updates its attribute at backup completion without checking it. For an incremental, we compare the file's last modified date with the stored date (backup.incr if set, otherwise backup.full) and back the file up if it has been modified since then. So I expect to start more testing reasonably soon, developing scripts to perform these tasks.
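
The per-file idea described above could look roughly like the following. This is a sketch under several assumptions, not the finished scripts: all function names are made up, dates are stored as seconds since the epoch, getfattr and setfattr come from the attr package, and the attributes carry the user. prefix that Linux requires for user-defined extended attributes. Note also that writing the attribute needs write access to the source file, which a read-only backup user would not have.

```shell
#!/bin/sh
# Sketch of the per-file incremental idea. All names are hypothetical.

# Succeeds if a file modified at time $1 needs backing up, given the last
# backup time $2 (empty if the file has never been backed up).
needs_backup() {
    mtime=$1
    last=$2
    [ -z "$last" ] || [ "$mtime" -gt "$last" ]
}

# Print the stored backup time: backup.incr if set, otherwise backup.full.
last_backup_time() {
    getfattr --only-values -n user.backup.incr "$1" 2>/dev/null ||
        getfattr --only-values -n user.backup.full "$1" 2>/dev/null
}

# Copy one file with rsync and record success in an extended attribute.
# $1 = source file, $2 = rsync destination
backup_one() {
    src=$1
    dest=$2
    # Record the start time so a change made during the copy is not missed.
    started=$(date +%s)
    mtime=$(stat -c %Y "$src") || return 1
    last=$(last_backup_time "$src")
    needs_backup "$mtime" "$last" || return 0
    # rsync exits with 0 only if the transfer succeeded.
    rsync -a "$src" "$dest" &&
        setfattr -n user.backup.incr -v "$started" "$src"
}
```

A full backup script would write user.backup.full in the same way once its run completes.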

But so far as the full backups go, it's going pretty well. I have just set up another full backup and it's working very well. The second time setting it all up was so much smoother. It also looks like ZFS disk compression works very well and has achieved some useful savings, but I will check exactly what sort of improvement I am getting on these disks.