Recently I managed to acquire a few surplus hard drives, so I decided to get a RAID device. As I already own a Mac mini that I use as a server, I wanted to use directly attached storage, but the device I got (a Sonnet Fusion QR) was pretty terrible and didn't work properly, and their support was pretty rubbish too. I ended up getting a Synology DS1815+, an 8-slot NAS. We use Synologys a lot for clients at work and I've been unimpressed with their performance in the past - they are just powered by Intel Atom processors after all - but since the x15+ models now have quad-core Atoms I decided to get one, and I've been pleased with the performance so far. It's able to transcode 1080p video at 10Mbps (with Plex) smoothly whilst also running a virus scan, so I'm happy with it. I've also moved my Mac mini (and the Synology) into my kitchen (I may write a little post about my current IT setup because I'm quite pleased with it), and between these two I have plenty of horsepower for running plenty of things. Which begs the question - what's the point of having my Raspberry Pi running any more? Well, there is one thing I can do with it, which is using it as an off-site backup location.
I was messing around with my Raspberry Pi preparing stuff for this, and the SD card corrupted. Apparently, this is quite a common thing for these but it's the first time it's happened to me. I wiped it and reinstalled Raspbian, following some of the steps in my doing useful things guide (I'm glad I wrote it now!) to set it up, remove unnecessary packages and the like.
For the backups, I wanted to use a simple tool that I've been using a bunch at work recently, rsync. rsync is a command-line tool that will synchronise two folders, which can either be local folders or remote folders, connected either via SSH or to an rsync daemon. It's fast and reliable, and is the basis for many other applications - for instance, Carbon Copy Cloner on OS X uses rsync. It's perfect for my situation, as I can have rsync running on the Raspberry Pi and connecting to my Synology via SSH. That way I don't need to have any ports open wherever my Raspberry Pi is, as long as the Raspberry Pi can get onto the Internet it'll be able to connect, and I don't need to worry about it attempting to maintain an SMB connection or the like.
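If you want to get a feel for rsync before pointing it at a real server, a local sandbox is a safe way to do it. The following is only a sketch: the scratch directories and file names are made up for illustration, and -a is rsync's archive mode, which recurses into folders and preserves permissions and timestamps.

```shell
# Scratch directories standing in for a real source and destination
# (hypothetical paths created just for this test).
work="$(mktemp -d)"
mkdir -p "$work/src/photos" "$work/dst"
echo "hello" > "$work/src/photos/a.txt"

# The trailing slash on the source means "the contents of src",
# so photos/ ends up directly inside dst/ rather than in dst/src/.
rsync -a "$work/src/" "$work/dst/"

# Show what arrived at the destination.
ls -R "$work/dst"
```

Swapping either path for something in the form server:/path/ turns the same command into a remote sync over SSH.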
rsync, SSH configs and public keys
rsync has a whole bunch of options, and fortunately there's one modifier, --archive (or -a), that sums up most of the ones I wanted. The other one I've used, -z, compresses the data in transit. The whole command is as follows:
rsync -az isonoeroot:/volume1/Bethany/ /crap/Backups/Bethany/
That's modifiers, source, destination. This backs up anything I put in the shared folder called Bethany on my Synology, to a folder on my Raspberry Pi in the mount point "crap" which is a 500GB external hard drive. The colon symbol in the source denotes that this is an SSH connection, in the form serveraddress:/path/to/source/on/server/. It's connecting to "isonoeroot" as this is a config saved in ~/.ssh/config - just as how in the screenshot header at the top I was able to type "ssh euporie" and it used the information on Ananke (my Ubuntu laptop) in the ~/.ssh/config file to connect to Euporie (my Raspberry Pi). You can find more information about SSH config files at this page. In the case of how this interacts with rsync, I need to use the SSH config as I need to specify a different port for the SSH connection, and this is the simplest way of doing it.
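For reference, an SSH config entry of the kind described above might look roughly like this. Only the alias "isonoeroot" comes from my setup as written here; the HostName, Port and User values are placeholders I've invented for the example, so substitute your own. The sketch writes to a scratch file rather than touching the real ~/.ssh/config.

```shell
# Write a sample entry to a scratch file so nothing real is overwritten.
sample_config="$(mktemp)"
cat > "$sample_config" <<'EOF'
# Hypothetical values: replace HostName, Port and User with your own.
Host isonoeroot
    HostName nas.example.com
    Port 2222
    User root
EOF

# Print the entry so it can be checked before copying it anywhere.
cat "$sample_config"
```

Once an entry like this is in ~/.ssh/config, both "ssh isonoeroot" and an rsync source of isonoeroot:/path/ pick up the port and username without any extra flags.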
You'll notice in the screenshot that it also didn't ask for a password - this is important for unattended use of rsync, as it would normally ask for the password when connecting via SSH or to an rsync daemon. The best way of getting passwordless SSH access for rsync is to set up a public key, then add it to the server that you're connecting to with the ssh-copy-id command. A nice little guide for this can be found here. Most Linux distros have ssh-copy-id built in, but OS X doesn't. There's a version of it for OS X that's a bit less featured (for instance, I couldn't find a way to add a port number into the connection command) but it does the job at least.
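The key setup can be sketched roughly as follows. This is an outline under a few assumptions rather than the exact steps I ran: the key path and the KEY_FILE override are my own choices for the example, the key has an empty passphrase so that cron can use it unattended, and install_key is just a hypothetical helper wrapping ssh-copy-id. I've picked an ed25519 key; an older server may need -t rsa instead.

```shell
# Assumed key location; set KEY_FILE to use a different path.
key_file="${KEY_FILE:-$HOME/.ssh/id_ed25519}"

# Make sure the directory exists and is private to this user.
mkdir -p "$(dirname "$key_file")"
chmod 700 "$(dirname "$key_file")"

# Create a key pair with an empty passphrase (-N ""), but only if one is
# not already there, so an existing key is never overwritten.
if [ ! -f "$key_file" ]; then
    ssh-keygen -t ed25519 -N "" -f "$key_file" -q
fi

# Copy the public half to the server. Wrapped in a function because it
# needs the server to be reachable and asks for the password one last time.
install_key() {
    ssh-copy-id -i "$key_file.pub" "$1"
}
```

Running "install_key isonoeroot" afterwards should install the key, since ssh-copy-id connects through ssh and so honours the alias in ~/.ssh/config. After that, "ssh isonoeroot" should log in without a prompt.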
Once the rsync command has been decided on, and it can be run as a single command without needing to put in a password, we need to find a way of scheduling it. Fortunately, Linux has a scheduling system built into it called cron that we can use. The easiest way to set this up (that I know of!) is to edit the crontab file, which in Raspbian is located at /etc/crontab. This allows you to set a schedule and a command to run on that schedule. The entire line I've used is:
# m h dom mon dow user command
0 2 * * * root rsync -az isonoeroot:/volume1/Bethany/ /crap/Backups/Bethany/
The commented line doesn't need to be added, I've just added it to show some of the syntax. This means it'll run at minute 0 of hour 2 (so, 2am) on every day of month, every month, every day of the week, and it'll run as root. Once the file is saved it'll automatically start working. And that's the basic setup!
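If the columns are hard to keep straight, this little sketch splits the schedule line into its named fields. The variable names are mine, chosen to match the comment header. Note that the user column only exists in system-wide files such as /etc/crontab; a per-user crontab edited with "crontab -e" leaves it out.

```shell
# The schedule line from /etc/crontab, quoted so the asterisks stay literal.
entry='0 2 * * * root rsync -az isonoeroot:/volume1/Bethany/ /crap/Backups/Bethany/'

# read splits on whitespace; the last variable soaks up the rest of the line.
read -r minute hour dom month dow user cmd <<EOF
$entry
EOF

# Print each field next to its name.
printf 'minute=%s hour=%s dom=%s month=%s dow=%s\n' "$minute" "$hour" "$dom" "$month" "$dow"
printf 'user=%s\n' "$user"
printf 'command=%s\n' "$cmd"
```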
There are two things I want to add to this. Firstly, I want some way of SSHing into the Raspberry Pi without needing an open port at the destination (I'm hoping to put this either at a friend's place or at work, so that it acts as an off-site backup) - I'm thinking some form of VPN may be suitable, if a bit unwieldy. Secondly, I want to set up some form of automatic log emailing, but only in the event of an error. However, I've not yet had the energy to figure out either of these things. When I do, I'll be able to have this set up off-site knowing that I can connect to it as needed and be alerted to any issues. I'll write an extra blog post once I get around to doing that!