In my previous post I examined how to use BlobSync to create a tool that not only uploads/downloads deltas to Azure Blob Storage (saving LOTS of bandwidth), but also keeps multiple versions in the cloud easily.
As a sample file for uploading/downloading I’ve picked the entire Sherlock Holmes collection. Big enough that it can show the benefits of dealing with deltas for bandwidth savings, but small enough that it can be easily edited (text).
Firstly, I perform the original upload.
Here you can see that the original sherlock file is about 3.6M, and for the initial upload the entire file is uploaded (indicated by the “Uploaded 3868221 bytes” message).
Then I list the blobs and it shows I only have 1 version (called “sherlock” as expected).
Now, I edit the sherlock file and modify a few lines here and there, and reupload it.
We can instantly see that this time the upload only transferred 100003 bytes, which is about 2.6% of the original file size. That’s a nice saving.
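As a quick sanity check on that percentage, here is the arithmetic in Python. The two byte counts are the ones quoted above; the variable names are just mine for illustration.

```python
# Byte counts taken from the two upload messages quoted in this post.
full_upload_bytes = 3868221   # initial upload of the whole file
delta_upload_bytes = 100003   # re-upload after editing a few lines

# Fraction of the full file that had to be sent the second time.
ratio = delta_upload_bytes / full_upload_bytes

# Bytes that did not need to be transferred at all.
saved = full_upload_bytes - delta_upload_bytes

print(f"Delta is {ratio:.1%} of the full file")
print(f"Bytes not transferred: {saved}")
```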
Then we list the blobs associated with “sherlock” again. This time we see 2 versions:
- sherlock 8/01/2015 11:36:09 AM +00:00
- sherlock.v1 8/01/2015 11:36:01 AM +00:00
Here we see sherlock and sherlock.v1. The original sherlock blob that was uploaded was renamed to sherlock.v1. The new sherlock uploaded is now the vanilla “sherlock” blob.
Note: The timestamps still need a little work. The ones displayed are when blobs were copied/uploaded. This means that sherlock.v1 doesn’t have the original timestamp when sherlock was originally uploaded but when it was copied from sherlock to sherlock.v1. But I can live with that for the moment.
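The copy-then-upload behaviour described above can be sketched in a few lines of Python. This is only a model using an in-memory dict as a stand-in for a blob container, not the tool's actual code. The function name `upload_with_backup` is hypothetical, and the numbering rule for later backups (next unused number) is my assumption, since only the first rename to sherlock.v1 is shown here.

```python
from datetime import datetime, timezone

def upload_with_backup(container, name, data, now=None):
    """Store data under `name`, first copying any existing blob to name.vN."""
    now = now or datetime.now(timezone.utc)
    if name in container:
        # Assumption: each new backup takes the next unused version number.
        n = 1
        while f"{name}.v{n}" in container:
            n += 1
        old_data, _old_time = container[name]
        # The backup is stamped with the copy time, not the original upload
        # time, which mirrors the timestamp quirk mentioned in the note.
        container[f"{name}.v{n}"] = (old_data, now)
    container[name] = (data, now)

# Illustrative timestamps only; these are not the ones from the listing.
t_first = datetime(2015, 1, 1, 9, 0, tzinfo=timezone.utc)
t_second = datetime(2015, 1, 1, 9, 5, tzinfo=timezone.utc)

container = {}
upload_with_backup(container, "sherlock", b"original text", now=t_first)
upload_with_backup(container, "sherlock", b"edited text", now=t_second)

for blob_name in sorted(container):
    print(blob_name, container[blob_name][1].isoformat())
```

Both entries end up carrying the second timestamp, which is exactly why sherlock.v1 doesn't show when the original was first uploaded.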
Now, say I realise that I really want to have a copy of the original sherlock. The problem is that my local version has been modified. No problem: I can now tell it to update my local file with the contents of sherlock.v1 (remember, that’s the original one I uploaded).
The download was 99k (again, not the 3.6M of the full file). In my case c:\temp\sherlock is now updated to be the same as the blob sherlock.v1 (i.e. the original file). How can I be sure?
Well, I happen to have a spare copy of the original sherlock file on my machine (c:\temp\sherlock-orig), and you can see from my file compare (fc.exe) that the original sherlock and my newly updated local copy are the same.
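If you would rather script that check than run fc.exe, Python's standard `filecmp` module can do a byte-by-byte comparison. The sketch below uses throwaway temp files so it runs anywhere; the helper name `same_contents` and the file contents are made up for the demo, and you would point it at your own paths instead.

```python
import filecmp
import os
import tempfile

def same_contents(path_a, path_b):
    # shallow=False forces a comparison of the file contents rather than
    # trusting matching size/mtime metadata.
    return filecmp.cmp(path_a, path_b, shallow=False)

# Demo with throwaway files standing in for the restored and spare copies.
with tempfile.TemporaryDirectory() as tmp:
    restored = os.path.join(tmp, "sherlock")
    original = os.path.join(tmp, "sherlock-orig")
    edited = os.path.join(tmp, "sherlock-edited")
    samples = [
        (restored, "Elementary."),
        (original, "Elementary."),
        (edited, "Elementary!"),
    ]
    for path, text in samples:
        with open(path, "w") as f:
            f.write(text)

    match_result = same_contents(restored, original)
    differ_result = same_contents(restored, edited)

print("restored == original:", match_result)
print("restored == edited:", differ_result)
```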
Now I can upload/download deltas AND have multiple versions available to me for future reference.
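For anyone wondering why a few edited lines translate into such a small transfer, here is a deliberately simplified sketch of the delta idea. It is not BlobSync's algorithm (this post doesn't describe it). It just hashes fixed-size blocks at fixed offsets and reports which ones changed. Note that this naive version falls apart if you insert bytes and shift everything after them, which is the problem rsync-style rolling checksums are designed to handle. All names and the tiny block size are assumptions for the demo.

```python
import hashlib

BLOCK_SIZE = 4  # tiny so the demo is readable; real tools use much larger blocks

def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of `data`."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return (block_index, block_bytes) for each block of `new` that differs from `old`."""
    old_hashes = block_hashes(old, block_size)
    delta = []
    for idx, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        # A block must be sent if it is beyond the old data or its hash changed.
        if idx >= len(old_hashes) or old_hashes[idx] != digest:
            delta.append((idx, block))
    return delta

old = b"AAAABBBBCCCCDDDD"
new = b"AAAABxBBCCCCDDDD"  # one byte edited in place inside the second block

delta = changed_blocks(old, new)
delta_bytes = sum(len(block) for _, block in delta)

print(delta)
print(f"{delta_bytes} of {len(new)} bytes need to be sent")
```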
So, what happens with all the backups I don’t want? Well, you can always load up any Azure Storage Explorer program and delete the blobs you don’t want. Or you can use Iceberg to prune them for you.
Say I’ve created a few more versions of sherlock.
But I’ve decided that I only want to keep the latest 2 backups (not counting the current blob itself), i.e. I want to keep sherlock, sherlock.v2 and sherlock.v3.
I can issue the prune command as such:
Here I tell it to prune all but the latest 2 backups of the sherlock blob. I list the blobs afterwards and you can indeed see that apart from the latest (sherlock) there are only the 2 latest backups.
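The selection logic behind a prune like this can be sketched as a small pure function. This is a guess at the idea rather than the tool's real implementation. The function name `prune` is hypothetical, the starting list of three backups is an assumption (the post doesn't say how many existed), and so is the rule that a higher version number means a newer backup.

```python
import re

def prune(blob_names, base, keep):
    """Split the backups of `base` into (kept, deleted), keeping the `keep` newest."""
    pattern = re.compile(re.escape(base) + r"\.v(\d+)")
    versions = []
    for name in blob_names:
        m = pattern.fullmatch(name)
        if m:
            versions.append((int(m.group(1)), name))
    # Assumption: a higher version number means a more recent backup.
    versions.sort(reverse=True)
    kept = [name for _, name in versions[:keep]]
    deleted = [name for _, name in versions[keep:]]
    return kept, deleted

# Hypothetical starting state: the current blob plus three backups.
blobs = ["sherlock", "sherlock.v1", "sherlock.v2", "sherlock.v3"]
kept, deleted = prune(blobs, "sherlock", keep=2)

print("keep:  ", kept)
print("delete:", deleted)
```

The current blob (plain "sherlock") never matches the version pattern, so it is never a candidate for deletion.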
I’m starting to look at using this for more of my own personal backups. Hopefully this may be of use to others.