Automatic backup of users’ files on a NAS device to an external USB HDD

One of my Linux machines is a 4-bay server that performs various roles, one of which is as a NAS (network-attached storage) for family and visitors’ devices connected to my home network. I had configured each pair of HDDs in a RAID 1 array in order to provide some internal redundancy, but I was nervous about not having an external backup for users’ shares. Therefore I recently purchased a 6 TB external USB 3.0 HDD (Western Digital Elements Desktop WDBWLG0060HBK-EESN) to connect permanently to one of the server’s USB 3.0 ports for backup purposes. I created a Bash script ~/backup_to_usbhdd.sh to perform the backup, plus a cron job to launch it automatically at 05:01 daily:

user $ sudo crontab -e
user $ sudo crontab -l | grep -v ^# | grep backup
01 05 * * * sudo /home/fitzcarraldo/backup_to_usbhdd.sh

The use of ‘sudo’ in the crontab command may appear superfluous because the cron job was created for the root user (i.e. by using ‘sudo crontab -e’ rather than ‘crontab -e’). However, this is done to make cron use the root user’s environment rather than the minimal set of environment variables cron would otherwise use [1].
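
To see how a command behaves with a reduced environment before adding it to a crontab, something like the following sketch could be used. The helper name run_like_cron and the variable values are assumptions for illustration only; the environment that cron actually provides differs between cron implementations and distributions.

```shell
#!/bin/bash
# Hypothetical helper: run a command string with a minimal, cron-like
# environment. The values below are assumptions, not what every cron uses.
run_like_cron() {
    env -i HOME="${HOME:-/}" LOGNAME="$(id -un)" PATH=/usr/bin:/bin SHELL=/bin/sh \
        /bin/sh -c "$1"
}

# Show the PATH the command sees, and whether blkid can be found on it.
run_like_cron 'echo "PATH=$PATH"'
run_like_cron 'command -v blkid || echo "blkid not found on this PATH"'
```

Whether a reduced PATH affects a given script depends on where the commands it calls are installed on that system.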

#!/bin/bash
#
# This script backs up to an external USB HDD (NTFS) labelled "Elements" the contents
# of the directories /nas/shares/ on my server.
# It can be launched from the server either manually using sudo or as a root-user cron
# job (Use 'sudo crontab -e' to configure the job).
#
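# Clean up if the backup did not complete last time. Only empty the mount point
# directory once the USB HDD is confirmed unmounted, so the backup itself is never deleted:
umount /media/usbhdd 2>/dev/null
if ! mountpoint -q /media/usbhdd; then
    rm -rf /media/usbhdd/*
fi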
# Unmount the external USB HDD if mounted by udisks2 with the logged-in username in the path:
umount /media/*/Elements 2>/dev/null
# Find out the USB HDD device:
DEVICE=$( blkid | grep "Elements" | cut -d ":" -f1 )
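# Create a suitable mount point if it does not already exist, and mount the device on it.
# Abort if the mount fails, otherwise the copy would land on the server's root filesystem:
mkdir -p /media/usbhdd
if ! mount -t ntfs-3g "$DEVICE" /media/usbhdd 2>/dev/null; then
    echo "Could not mount the USB HDD; backup aborted" >> /home/fitzcarraldo/backup_to_usbhdd.log
    exit 1
fi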
sleep 10s
# Create the backup directories on the USB HDD if they do not already exist:
mkdir -p /media/usbhdd/nas 2>/dev/null
# Backup recursively the directories and add a time-stamped summary to the log file:
echo "********** Backing up nas shares directory **********" >> /home/fitzcarraldo/backup_to_usbhdd.log
date >> /home/fitzcarraldo/backup_to_usbhdd.log
# Need to use rsync rather than cp, so that I can rate-limit the copying to the USB HDD:
rsync --recursive --times --perms --links --protect-args --bwlimit=22500 /nas/shares /media/usbhdd/nas/ 2>> /home/fitzcarraldo/backup_to_usbhdd.log
# No --delete option is used, so that any backed-up files deleted on the server are not deleted from the USB HDD.
echo "Copying completed" >> /home/fitzcarraldo/backup_to_usbhdd.log
date >> /home/fitzcarraldo/backup_to_usbhdd.log
df -h | grep Filesystem >> /home/fitzcarraldo/backup_to_usbhdd.log
df -h | grep usbhdd >> /home/fitzcarraldo/backup_to_usbhdd.log
echo "********** Backup completed **********" >> /home/fitzcarraldo/backup_to_usbhdd.log
cp /home/fitzcarraldo/backup_to_usbhdd.log /media/usbhdd/
# Unmount the USB HDD:
umount /media/usbhdd
exit 0
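
The script writes "Copying completed" to the log whether or not rsync succeeded. A minimal sketch of one way to record the actual outcome is shown below. The helper name run_and_log is hypothetical and is not part of the script above; it takes the log file path as its first argument and the command to run as the remaining arguments.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): run a command and
# append its outcome to a log file, based on the command's exit status.
run_and_log() {
    local logfile=$1
    shift
    local status=0
    # Run the remaining arguments as a command; send its stderr to the log.
    "$@" 2>> "$logfile" || status=$?
    if [ "$status" -eq 0 ]; then
        echo "Copying completed" >> "$logfile"
    else
        echo "Copying FAILED (exit status $status)" >> "$logfile"
    fi
    return "$status"
}
```

It could be called as run_and_log /home/fitzcarraldo/backup_to_usbhdd.log rsync ... with the same rsync options and paths as in the script.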

The initial version of the above script used ‘cp’ rather than ‘rsync’, which worked fine when I launched the script manually:

user $ sudo ./backup_to_usbhdd.sh

However, the script always failed when launched as a cron job. In this case the command ‘df -h’ showed the root directory on the server was ‘100% used’ (full). Also, the mount point directory /media/usbhdd/ had not been unmounted. The log file had twenty or so lines similar to the following, indicating the script had failed due to the root filesystem becoming full:

cp: failed to extend ‘/media/usbhdd/nas/user1/Videos/20130822_101433.mp4’: No space left on device

Apparently data was being read from the server’s HDD into the RAM buffer/cache faster than it could be written to the external HDD. The bottleneck in this case is not USB 3.0 but the USB HDD itself. The specifications for the USB HDD do not mention the drive’s write speed, but a quick search of the Web indicated that an external USB HDD might have a write speed of around 25 to 30 MB/s (megabytes per second). I do not know why the problem happened only when the script was launched as a cron job, but I clearly needed to throttle the rate of writing to the external HDD. Unfortunately the ‘cp’ command does not have such an option, but the ‘rsync’ command does:

--bwlimit=RATE          limit socket I/O bandwidth

where RATE is in KiB per second if no units are specified. To be safe I opted for a limit of 22500 KiB/s (about 23 MB/s), which is not far below the lower figure of 25 MB/s mentioned above. With this limit the script runs to completion when launched by cron:

user $ cat backup_to_usbhdd.log
********** Backing up nas shares directory **********
Thu Sep 13 05:01:26 BST 2018
Copying completed
Thu Sep 13 11:41:31 BST 2018
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf1       5.5T  386G  5.1T   7% /media/usbhdd
********** Backup completed **********
********** Backing up nas shares directory **********
Fri Sep 14 05:01:26 BST 2018
Copying completed
Fri Sep 14 05:20:08 BST 2018
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf1       5.5T  403G  5.1T   8% /media/usbhdd
********** Backup completed **********
********** Backing up nas shares directory **********
Sat Sep 15 05:01:26 BST 2018
Copying completed
Sat Sep 15 05:04:58 BST 2018
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf1       5.5T  404G  5.1T   8% /media/usbhdd
********** Backup completed **********
********** Backing up nas shares directory **********
Sun Sep 16 05:01:26 BST 2018
Copying completed
Sun Sep 16 05:15:14 BST 2018
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf1       5.5T  416G  5.1T   8% /media/usbhdd
********** Backup completed **********
********** Backing up nas shares directory **********
Mon Sep 17 05:01:26 BST 2018
Copying completed
Mon Sep 17 05:04:15 BST 2018
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdf1       5.5T  416G  5.1T   8% /media/usbhdd
********** Backup completed **********
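
For reference, the rsync limit of 22500 KiB/s can be converted to MB/s, and used to estimate the shortest possible copy time, with a few lines of shell. This is only a sketch: the function names are made up for illustration, and the 100 GiB figure is a hypothetical amount of data, not a value from my server.

```shell
#!/bin/bash
# Convert an rsync --bwlimit value in KiB/s (1 KiB = 1024 bytes)
# to MB/s (1 MB = 1000000 bytes).
kib_to_mb_per_s() {
    awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib * 1024 / 1000000 }'
}

# Whole minutes (rounded down) needed to copy a number of GiB at a given
# limit in KiB/s, assuming the limit is sustained for the entire copy.
min_minutes_at_limit() {
    local gib=$1 kib_per_s=$2
    echo $(( gib * 1024 * 1024 / kib_per_s / 60 ))
}

kib_to_mb_per_s 22500            # prints 23.04
min_minutes_at_limit 100 22500   # prints 77
```

A real run takes longer than this estimate whenever the average transfer rate is below the limit.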

Notice that the first job listed in the log file took much longer than subsequent jobs. This was because rsync had to copy every file to the external USB HDD. In subsequent runs it only had to copy new files and files that had changed since they were last copied.
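
The duration of each run can be worked out from the pair of timestamps in the log file. The sketch below does the arithmetic for the first two runs; the function names are hypothetical, and it assumes both timestamps fall on the same day.

```shell
#!/bin/bash
# Convert HH:MM:SS to seconds since midnight. The 10# prefix forces base 10
# so that values such as "08" are not rejected as invalid octal numbers.
to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Print the elapsed time between two timestamps on the same day.
elapsed() {
    local d=$(( $(to_seconds "$2") - $(to_seconds "$1") ))
    printf '%dh %02dm %02ds\n' $(( d / 3600 )) $(( d % 3600 / 60 )) $(( d % 60 ))
}

elapsed 05:01:26 11:41:31   # first run (13 Sep): 6h 40m 05s
elapsed 05:01:26 05:20:08   # second run (14 Sep): 0h 18m 42s
```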

The disk in the external USB HDD spins down after 10 minutes of inactivity and the drive goes into Power Saver Mode. Its LED blinks to indicate the drive is in this mode. Therefore the cron job only spins up and down the external HDD once per day.

Reference
1. Why does root cron job script need ‘sudo’ to run properly?

About Fitzcarraldo
A Linux user with an interest in all things technical.
