Virtual Directories

Over the last few days I’ve been expanding AzureCopy to handle S3 and Azure Virtual Directories (and I’m sure the concept will be useful with other cloud storage providers as well).

The idea behind Virtual Directories (I just can’t bring myself to say I’ve been dabbling with VD) is to imitate the appearance of a regular filesystem structure, i.e. many levels of directories and files.

Both S3 and Azure have a fairly flat way of looking at the world. S3 has its “bucket”, into which you upload all your blobs: no “sub-buckets”, no directories, everything simply lives in the bucket.

Azure is a little more refined with its “container” concept, in that you can make many containers, each with its own set of blobs, but you can’t have containers inside containers, i.e. all containers must sit off the root.

Fortunately both Azure and S3 provide an out-of-the-box way to create the illusion of a complex tree structure. Bring on Virtual Directories (that’s the Azure terminology, but the idea is the same for S3). All that is really happening is that the blob name itself is allowed to contain the ‘/’ character.

Simple…

So if I have the URL https://myazureacct.blob.core.windows.net/mycontainer/dir1/dir2/myfile.txt then what’s really happening is that I have a container called “mycontainer” but the blob name is really “dir1/dir2/myfile.txt”.
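To make that split concrete, here is a minimal sketch that pulls the container and blob name out of a blob URL. `BlobUrl.Split` is a hypothetical helper of my own, not part of the Azure Storage library, and it assumes the usual endpoint style where the first path segment is the container (so not the local emulator or the special root container).

```csharp
using System;

static class BlobUrl
{
    // Hypothetical helper (not an Azure SDK call): the first path segment is
    // treated as the container, everything after it as the blob name.
    public static (string Container, string BlobName) Split(string url)
    {
        // AbsolutePath is "/mycontainer/dir1/dir2/myfile.txt"; drop the leading '/'.
        string path = new Uri(url).AbsolutePath.TrimStart('/');
        int slash = path.IndexOf('/');
        if (slash < 0) return (path, "");   // URL points at the container itself
        return (path.Substring(0, slash), path.Substring(slash + 1));
    }

    static void Main()
    {
        var parts = Split("https://myazureacct.blob.core.windows.net/mycontainer/dir1/dir2/myfile.txt");
        Console.WriteLine(parts.Container);  // mycontainer
        Console.WriteLine(parts.BlobName);   // dir1/dir2/myfile.txt
    }
}
```

Note that `AbsolutePath` is percent-encoded, so a blob name containing spaces or other escaped characters would need decoding before use.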

If I wanted to list all the blobs in the container I’d use the usual process of:

var container = client.GetContainerReference("mycontainer");

var blobList = container.ListBlobs();

But if I wanted to list only the blobs in “mycontainer/dir1”, we have a couple of options. The first would be to run the code above and then filter the results ourselves, with code like:

// The trailing '/' stops "dir1" from also matching "dir10".
var newBlobs = from blob in blobList where blob.Uri.AbsoluteUri.StartsWith("…./mycontainer/dir1/") select blob;
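One catch with filtering by hand: the prefix needs to end in ‘/’, otherwise “dir1” would also match “dir10/…”. A minimal sketch of the idea over plain blob names rather than real storage objects; `PrefixFilter.Under` and the sample names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PrefixFilter
{
    // Hypothetical helper: keeps the blob names that sit under a virtual directory.
    public static List<string> Under(IEnumerable<string> blobNames, string directory)
    {
        // Normalise to exactly one trailing '/', so "dir1" cannot match "dir10/...".
        string prefix = directory.TrimEnd('/') + "/";
        return blobNames
            .Where(name => name.StartsWith(prefix, StringComparison.Ordinal))
            .ToList();
    }

    static void Main()
    {
        var names = new[] { "dir1/a.txt", "dir1/dir2/myfile.txt", "dir10/b.txt", "root.txt" };
        foreach (var name in Under(names, "dir1"))
            Console.WriteLine(name);   // dir1/a.txt, then dir1/dir2/myfile.txt
    }
}
```

The comparison is ordinal (and therefore case-sensitive) on purpose, since blob names are case-sensitive.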

This would work fine, but fortunately the Azure Storage library provides a built-in alternative. What we can do instead is get the container, then ask it for a Virtual Directory within that container.

var container = client.GetContainerReference("mycontainer");

var vd = container.GetDirectoryReference("dir1");
var blobList = vd.ListBlobs();

This now lists only what sits inside the Virtual Directory “dir1”.
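As far as I understand it, both services build this kind of listing from a name prefix plus a ‘/’ delimiter, and depending on the listing options you may get back only the immediate children: the blobs directly under the prefix, plus the deeper virtual directories collapsed into one entry each. Here is a rough sketch of that grouping over plain strings. `HierarchicalListing.List` and the sample names are hypothetical; this imitates the behaviour and does not call the storage library.

```csharp
using System;
using System.Collections.Generic;

static class HierarchicalListing
{
    // Hypothetical helper imitating a prefix + '/' delimiter listing.
    // Returns the blobs directly under the prefix and the virtual sub-directories.
    public static (List<string> Blobs, List<string> Directories) List(
        IEnumerable<string> blobNames, string prefix)
    {
        var blobs = new List<string>();
        var dirs = new SortedSet<string>(StringComparer.Ordinal);
        foreach (var name in blobNames)
        {
            if (!name.StartsWith(prefix, StringComparison.Ordinal)) continue;
            // Look for the next '/' after the prefix.
            int slash = name.IndexOf('/', prefix.Length);
            if (slash < 0) blobs.Add(name);                  // direct child blob
            else dirs.Add(name.Substring(0, slash + 1));     // collapsed sub-directory
        }
        return (blobs, new List<string>(dirs));
    }

    static void Main()
    {
        var names = new[] { "dir1/a.txt", "dir1/dir2/myfile.txt", "dir1/dir2/other.txt", "dir10/b.txt" };
        var result = List(names, "dir1/");
        Console.WriteLine(string.Join(", ", result.Blobs));        // dir1/a.txt
        Console.WriteLine(string.Join(", ", result.Directories));  // dir1/dir2/
    }
}
```

With the sample names above, “dir1/dir2/myfile.txt” does not show up as a blob under “dir1/”; it is represented by the “dir1/dir2/” entry, and you would list again with that prefix to reach it.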

Obviously this post has had an Azure slant to it, but S3 provides a very similar piece of functionality.

My only question now is… when the AzureCopy APIs return a blob name, what do we say? “myfile.txt” or “dir1/dir2/myfile.txt”?

Am still trying to figure that out…
