Like many of you, I’ve been a long-time subscriber to CrashPlan Home Family. But recently I received an e-mail that this is soon going to end. Like many before it, CrashPlan is cancelling their Unlimited Family subscription, so no more multi-PC cloud backup! So I started thinking, how hard would it be to “build” a self-hosted replacement?
This will be a multi-article guide on how to build your own multi-tenant CrashPlan replacement, for yourself or for you and your friends and family.
No true cloud replacement available
Looking around online, I found lots of cloud vendors offering backup plans or storage plans, but none that come close to what CrashPlan Home Family offered for 150$ per year.
Either they are way more expensive, or they are intended for a single home computer, don’t allow a NAS to be backed up, don’t automatically select video files or files larger than 4GB, only keep one version or have a max retention of 30 days, etc.
What features would a backup replacement need to have to truly replace CrashPlan for me?
Features desired for the replacement service
- Multi-tenant and for those tenants, multi-client
  - I want to give accounts to my friends and family to share the costs of hardware and allow everyone multi-PC backup in the same style as CrashPlan Home Family offered
- “Unlimited” in size or at least, quite big
  - It needs to easily hold 10TB, 20TB or even 30TB of data
  - Setting soft quotas per account would be a bonus feature and prevent “abuse”
- “Incremental forever” backups
  - One long first full backup and after that I want to run quick incremental backups, preferably forever
  - Deduplication and compression on the client, not the server
- File versioning
  - Each backup should hold a new copy of a changed file so I can choose which version to restore
- Encryption key set on client, not shared with server side
  - Whoever the admin is going to be, I don’t want to have to trust this person. I want to be 100% sure they cannot access or view my data: a trust-no-one setup
I think that’s most of it. That list started out simple when I began looking for a replacement, but it turns out CrashPlan was actually quite feature-rich in what it provided!
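To make “incremental forever”, deduplication and file versioning a bit more concrete, here is a toy sketch of the general idea: files are cut into blocks, each block is stored once under its hash, and every backup is just a list of hashes. This is only an illustration under my own assumptions (the `BlockStore` class, the tiny block size and the file names are made up), not how any particular backup tool implements it, and it leaves out compression and encryption entirely.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose so the example is readable; real tools use far larger blocks


class BlockStore:
    """Toy content-addressed store: every unique block is kept exactly once."""

    def __init__(self):
        self.blocks = {}    # sha256 hex digest -> block bytes
        self.versions = []  # one entry per backup run: {path: [digest, ...]}

    def backup(self, files):
        """Snapshot files (path -> bytes); return the number of newly stored blocks."""
        new_blocks = 0
        snapshot = {}
        for path, data in files.items():
            digests = []
            for i in range(0, len(data), BLOCK_SIZE):
                block = data[i:i + BLOCK_SIZE]
                digest = hashlib.sha256(block).hexdigest()
                if digest not in self.blocks:  # only unseen blocks cost storage/upload
                    self.blocks[digest] = block
                    new_blocks += 1
                digests.append(digest)
            snapshot[path] = digests
        self.versions.append(snapshot)
        return new_blocks

    def restore(self, version, path):
        """Rebuild a file as it was in the given backup run."""
        return b"".join(self.blocks[d] for d in self.versions[version][path])


store = BlockStore()
first = store.backup({"notes.txt": b"aaaabbbbcccc"})   # full backup: 3 new blocks
second = store.backup({"notes.txt": b"aaaaXXXXcccc"})  # incremental: 1 new block
print(first, second)
print(store.restore(0, "notes.txt"), store.restore(1, "notes.txt"))
```

The second run only stores the one block that changed, yet both versions of the file can still be restored in full.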
Cost calculations: cloud vs. DIY
Cheapest cloud solution
Let’s say I want to back up 6 PCs (2 desktops and 4 laptops: mine, my girlfriend’s, my mother’s and my father’s). Using Backblaze, which seems to have the best cloud offering at this moment, this would cost me 300$ a year. Over a period of 3 years that would be 900$, and over 5 years it would come to 1500$. And for that price, I’m not even allowed to back up my server!
Looking at Backblaze B2, we can do the calculations again. Let’s say we want to store 12TB of backups. Backblaze B2 charges 0.005$ per GB per month. That means it would cost 60$ a month, but we can back up unlimited computers (desktops, laptops) and servers/NAS devices. That works out to 720$ a year to store all the data. For 3 years this would be 2160$, and in 5 years you would pay 3600$!
One thing I would also like to note: most of the “unlimited backup/storage” cloud vendors have stopped those plans this year. I don’t know if Backblaze is going to stop offering their unlimited plans anytime soon, but betting on it is at least a bit risky at this point.
Self-built DIY cloud backup
When doing some rough calculations, you quickly find out that setting up a self-hosted solution such as the one I’m proposing will easily cost you around 1300$ upfront. That would include a server PC, CPU, memory, 20TB of disk space, etc. This hardware allows scaling too, so scaling it to 30TB or 40TB would be quite easy.
The DIY cloud backup solution would have almost no limits regarding how many clients you can back up or what type of clients. Also, the way we’re going to set it up, it’s going to be very easy to share this solution with friends and family. If they can pitch in a little bit, it soon becomes (much) cheaper than a cloud setup.
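The numbers above are easy to replay in a few lines of Python. This is a rough sketch using only the figures from this post; the 50$ per PC is simply derived from 300$ a year for 6 PCs, 1TB is counted as 1000GB, and running costs of the DIY box (electricity, replacement disks) are deliberately ignored.

```python
# Figures from the text above; everything else is ignored on purpose.
PC_COUNT = 6
PER_PC_PER_YEAR = 50      # USD, derived from 300$ a year for 6 PCs
B2_PER_GB_MONTH = 0.005   # USD per GB per month
STORED_GB = 12_000        # 12TB, counting 1TB as 1000GB
DIY_UPFRONT = 1300        # USD, rough one-time hardware estimate


def per_pc_plan(years):
    """Total cost of a per-computer cloud backup plan."""
    return PC_COUNT * PER_PC_PER_YEAR * years


def b2(years):
    """Total cost of storing STORED_GB in pay-per-GB object storage."""
    return round(STORED_GB * B2_PER_GB_MONTH * 12 * years)


def diy_share(people):
    """Upfront hardware cost per person when the box is shared."""
    return DIY_UPFRONT / people


for years in (1, 3, 5):
    print(f"{years} year(s): per-PC plan {per_pc_plan(years)}$, "
          f"B2 {b2(years)}$, DIY {DIY_UPFRONT}$ once")

print(f"DIY split between 4 people: {diy_share(4):.0f}$ each")
```

With four people pitching in, the hardware comes to 325$ each, which is about what the per-PC plan costs for a single year.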
Sharing the DIY cloud backup solution
So I adjusted my requirements and looked into building a solution based on the above list, where the most important feature would be the “Encryption key set on client, not shared with server side” part. With that, parts of the environment can be shared. Because of the encryption and the way we’ll set up tenants, everyone can manage their own data and it becomes a true trust-no-one setup.
And that’s an important fact: when sharing things with friends and/or family, you don’t ever want to be in the situation where you need to discuss whether they trust you or not. Better to avoid that situation altogether by making it impossible in the first place.
You need a remote location
One of the few prerequisites for the self-hosted setup is a remote location with an internet connection where you can put the hardware.
In my case, my parents have a 100/100 fiber connection, so they are the ideal candidates to host the server. But even with an asymmetric connection like 150/15, most data traffic at the server side will be ingested through the 150Mbps downlink. The 15Mbps uplink is only needed when restores are required. So during normal operations, when backups are made, the server’s download speed will almost always be more than sufficient. Most often the client side will be the bottleneck, limited by its own upload speed.
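To get a feeling for what those line speeds mean in practice, here is a small back-of-the-envelope sketch. The slowest link in the path sets the pace; protocol overhead, compression and deduplication are ignored, 1GB is counted as 8000 megabits, and the 500GB data set and the client speeds are hypothetical numbers of my own.

```python
def transfer_hours(size_gb, link_mbps):
    """Hours needed to move size_gb gigabytes over a link of link_mbps megabit/s."""
    megabits = size_gb * 8000  # 1GB = 8000 megabits
    return megabits / link_mbps / 3600


def backup_hours(size_gb, client_up_mbps, server_down_mbps):
    """A backup is limited by the client's upload or the server's download."""
    return transfer_hours(size_gb, min(client_up_mbps, server_down_mbps))


def restore_hours(size_gb, client_down_mbps, server_up_mbps):
    """A restore is limited by the server's upload or the client's download."""
    return transfer_hours(size_gb, min(client_down_mbps, server_up_mbps))


# Hypothetical client with 20Mbps up / 100Mbps down, server on a 150/15 line.
print(f"First full backup of 500GB: {backup_hours(500, 20, 150):.1f} hours")
print(f"Full restore of 500GB:      {restore_hours(500, 100, 15):.1f} hours")
```

In this example the backup takes roughly 56 hours and is held back by the client’s own upload, while a full restore over the server’s 15Mbps uplink takes roughly 74 hours.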
Running this server at home
You can run this server at home, but in my opinion you want backups to live outside the location where you keep all your original data. A backup should survive a fire or flood that destroys your home. So maybe you could run your server at a friend’s or relative’s home? I’m going to be placing this server at my parents’ home, and in return they can make their backups to my server. As a bonus, they can use this same box as a NAS locally!
Software evaluated and the road to Minio with Duplicati 2
UrBackup fell off the list after building and testing it for 2 days. It does a lot of things right but sadly does not offer any form of client-side encryption, a key factor in being able to share the same environment with a “trust-no-one” setup.
After that I moved on to BURP and it showed a lot of potential! Especially BURP2 with protocol version 2 worked quite well. All the warnings that this was not production ready and some other inefficiencies still left me wanting though. Also, the interface was not very intuitive for non-tech people.
I even tried to build my own variants using Opendedup and combining other software packages to alleviate these inefficiencies, but never quite succeeded. I learned a lot evaluating these software packages, and they all have their own strong points.
And this also changed my perspective a lot! I was trying to find one piece of software which would have a client and a server module and, as it turns out, that narrows down what’s available quite a bit. As soon as I realized that I could use 2 different packages for that and just needed a compatible method of communication, the new plan was born!
Introducing Duplicati 2
After a while, and after resetting my vision of how I wanted things to work a couple of times, I stumbled onto Duplicati 2. This backup software package does a lot of things right!
It has a client for Linux and Windows and even Synology, it does client-side encryption with no key on the server and on top of that also deduplicates and compresses! It also keeps versions of all files in all backups and has a nice and clear interface to manage. And on Windows you can even enable VSS snapshots to protect open files such as PST files, etc.
After running some tests against an SFTP/SSH server I liked it quite a bit, but it left me wanting in the multi-tenant multi-client experience. Setting up shell accounts and directories for a lot of people would be quite a pain, and I wasn’t going to let other people log into the server, so that would make the solution too complex.
But as it turns out Duplicati 2 can use a very wide variety of storage backends! Next to normal FTP or SFTP/SSH it can also use a variety of cloud vendors and general storage vendors which offer S3 storage.
Different client software
If you don’t like Duplicati, there are several other open-source backup tools available like Duplicity or Restic which can also talk to an S3 backend and do client side encryption, compression and deduplication, making them all good candidates.
Or if, for instance, you have a Synology NAS, their Hyper Backup software also natively supports backing up to S3 storage! So it’s all compatible with what we’re building!
After searching for a storage backend that can run in combination with Duplicati to offer me more functionality, I found Minio. Minio is a lightweight S3 storage backend you can run on Linux or Windows.
Each Minio instance also only uses a single TCP port so firewall configurations can remain simple.
While I am familiar with Ceph and large object stores like it, those are often way too resource intensive and, like Ceph, designed for a completely different scale.
Minio, on the other hand, is a very simple S3 backend server with some very nice features and, best of all, really low resource utilization! After setting it up and configuring it, I liked it a lot. By starting a Minio instance per tenant (officially supported), each tenant can arrange their own buckets into which to direct separate backup clients. Each client, or rather even each backup, can use its own encryption key which is only kept client-side, making all data stored on the server unviewable by the server admin or any other user with access to the system or disks.
Combine this with ZFS datasets and quotas and my ideal DIY backup solution was born!
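As a rough idea of what “one Minio instance plus one ZFS dataset with a quota per tenant” could look like, here is a small planning sketch that only prints the commands and does not run anything. The pool name `tank/backup`, the starting port 9001, the tenant names and the quotas are all hypothetical, it assumes the datasets get the default ZFS mountpoint, and credentials are left out. The actual server configuration is the subject of a later part of this series and may well differ.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    quota: str  # ZFS quota value, e.g. "2T" or "500G"


POOL = "tank/backup"  # hypothetical parent dataset, assumed to exist already
BASE_PORT = 9001      # hypothetical first port; each tenant gets the next one


def plan(tenants):
    """Return the shell commands for one dataset and one Minio instance per tenant."""
    commands = []
    for offset, tenant in enumerate(tenants):
        dataset = f"{POOL}/{tenant.name}"
        port = BASE_PORT + offset
        # One dataset per tenant, capped by a ZFS quota.
        commands.append(f"zfs create -o quota={tenant.quota} {dataset}")
        # One Minio instance per tenant on its own TCP port, serving that dataset
        # (assumes the default mountpoint, i.e. "/" followed by the dataset name).
        commands.append(f"minio server --address :{port} /{dataset}")
    return commands


for command in plan([Tenant("alice", "2T"), Tenant("bob", "500G")]):
    print(command)
```

Because every tenant ends up with their own port and their own dataset, the firewall rules stay simple and a full quota for one tenant never affects the others.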
I’ve been running the combined setup in my VM environment for a few days now and it’s been perfectly stable and I’ve been completely satisfied with performance/resource utilization and the way Duplicati 2 works in combination with Minio.
One of the big advantages over CrashPlan, for instance, is that the Duplicati client uses about 80MB of memory instead of the 800MB or more CrashPlan’s Java client would often use.
Remote manageability isn’t up to par with CrashPlan at this point, but configuring status e-mails on all clients gives you the ability to easily log all activity and verify that all clients are making their configured backups, just like I used to do with CrashPlan.
Multi-Part Blog Posts
Since this is quite a lot to explain, this is going to be a multi-part post! Please see the index below to continue on to the next one!
- (This Post) DIY cloud backup: Replacing CrashPlan Home Family DIY style
- DIY cloud backup: Server and storage hardware
- DIY cloud backup: OS and Storage configuration
- DIY cloud backup: Installing and configuring the server
- DIY cloud backup: Installing and configuring a client
- DIY cloud backup: How to perform a restore and Tips and Tricks
*Currently unlinked articles are not yet complete, but I’m working on them!