Azure and Go

Although I'm a fan of the official Azure SDK for Go, I've found myself writing more and more small utilities directly against the Azure REST APIs. Maybe it's just me never quite getting to grips with the SDK (I use it for a few weeks, leave it for months, then come back again, so it's never fresh in my mind), but I always end up going back to REST.

Whenever I go back to REST I also go through the usual dance (although a fairly quick one) of working out how to authenticate, what the endpoints are, and so on. This time I want to document what I'm currently doing (as of the end of 2019) in the hope it helps me (and maybe others) in the future.

Firstly, let’s authenticate and see about an operation or two against Azure.

Prerequisites

Firstly, I'm assuming that you've already made a Service Principal (SP) to use against your specific Azure resource (see here). Once you've got an SP you'll have 4 pieces of information you'll need for authentication:

  • Client ID
  • Client Secret
  • Tenant ID
  • Subscription ID

Once you have those 4 pieces of info, you’re good to go.

Authentication

All Azure REST calls require the Authorization header to be set in the HTTP request. To generate the correct token to insert, you simply make an OAuth2 client-credentials request to the Microsoft login URL.

 urlTemplate := "https://login.microsoftonline.com/%s/oauth2/token"
 bodyTemplate := "grant_type=client_credentials&resource=https://management.core.windows.net/&client_id=%s&client_secret=%s"
 url := fmt.Sprintf(urlTemplate, tenantID)
 body := fmt.Sprintf(bodyTemplate, clientID, clientSecret)
 request, _ := http.NewRequest("POST", url, strings.NewReader(body))

The above is the basics of generating a token that can then be used for all subsequent calls. The response to the POST contains 2 key pieces of information: the actual token string itself and the expiry of the token. This way we can reuse the same token for a certain duration without having to go through token generation again and again.

Once we have the token, we simply set an "Authorization: Bearer <token>" header on each request.
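
To round that out, here's a minimal sketch of the whole exchange wrapped in a hypothetical getToken helper (not a library function, just something I'd write). The response field names (access_token, expires_on) are the standard OAuth2 ones; I've used url.Values rather than the template above purely so the client secret gets form-encoded properly.

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// tokenResponse captures the two fields we care about from the login response.
type tokenResponse struct {
	AccessToken string `json:"access_token"`
	ExpiresOn   string `json:"expires_on"` // unix timestamp, returned as a string
}

func getToken(tenantID, clientID, clientSecret string) (tokenResponse, error) {
	var token tokenResponse
	tokenURL := fmt.Sprintf("https://login.microsoftonline.com/%s/oauth2/token", tenantID)

	form := url.Values{}
	form.Set("grant_type", "client_credentials")
	form.Set("resource", "https://management.core.windows.net/")
	form.Set("client_id", clientID)
	form.Set("client_secret", clientSecret)

	resp, err := http.Post(tokenURL, "application/x-www-form-urlencoded", strings.NewReader(form.Encode()))
	if err != nil {
		return token, err
	}
	defer resp.Body.Close()

	// the access_token goes straight into the Authorization header of later calls,
	// eg req.Header.Set("Authorization", "Bearer "+token.AccessToken)
	err = json.NewDecoder(resp.Body).Decode(&token)
	return token, err
}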

Before going any further, it would be remiss of me not to mention the UTTERLY BRILLIANT site https://docs.microsoft.com/en-us/rest/api/appservice/ . This is basically the Azure REST API documentation site, but most importantly it lets you literally try the API out (using your own account). Being able to see the HTTP requests (headers, body etc) and watch the responses come in is invaluable.

For example, to list the app settings for an App Service:

[screenshot: the "Try it" form for listing app settings]

generates the request

[screenshot: the generated HTTP request]

which produces the results

[screenshot: the JSON response]

Now, I won't go through the unmarshalling etc that's required (but trust me, JSON to Go is your friend) but you get the idea. In this case we have 2 appsettings key/value pairs set, highlighted in red.

So, now that we know what the query should look like and what response we should get, let's code it. Firstly, the query.

template := "https://management.azure.com/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Web/sites/%s/config/appsettings/list?api-version=2019-08-01"
url := fmt.Sprintf(template, subscriptionID, resourceGroup, appServerName)

req, _ := http.NewRequest("POST", url, nil)
req.Header.Set("Content-Type", "application/json")
req.Header.Set("Authorization", "Bearer "+accessToken)

client := http.Client{}
resp, _ := client.Do(req)


Note, yes, I find it a bit weird that to retrieve (ie GET) the app settings we need to do an HTTP POST, go figure :/

Anyway, if we've got our subscription, resource group etc then generating the required query is very basic. Once again we just get back raw JSON and unmarshal it into a suitable struct.
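
And here's a rough sketch of the unmarshalling side, assuming (as the Try It output above suggests) that the settings come back as key/value pairs under a properties object; the struct only covers the fields I care about.

// appSettingsResponse only mirrors the bits of the response we need;
// the real payload has more fields (id, type, location etc).
type appSettingsResponse struct {
	Name       string            `json:"name"`
	Properties map[string]string `json:"properties"`
}

// resp is the *http.Response from the POST above.
defer resp.Body.Close()

var settings appSettingsResponse
if err := json.NewDecoder(resp.Body).Decode(&settings); err != nil {
	log.Fatalf("unable to decode app settings response: %v", err)
}

for key, value := range settings.Properties {
	fmt.Printf("%s = %s\n", key, value)
}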

The most important thing I think you need to remember is to go to the documentation page and use the "Try it" button. Seeing the requests/responses against your own services makes everything so much easier before you write a single line of code!


Azure Storage Tools

After quite a number of comments asking for a consistent set of tools to perform basic operations on Azure Storage resources (blobs, queues and tables), I've decided to write a new suite called "Azure Storage Tools" (man, I'm original with naming, I should be in marketing).

The primary aims of AST are:

  • Cross platform set of tools so there is a consistent tool across a bunch of platforms
  • Be able to do all the common operations for blobs/queues/tables.
    • eg. for blobs we should be able to create container, upload/download blobs, list blobs etc.  You get the idea.
  • Be completely easy to use: a simple tool with a simple set of parameters, where commands should be fairly "guessable"

Instead of a single tool that provides blobs, queues and tables in one binary, I’ve decided to split this into 3, one for each type of resource. First cab off the rank is for blobs!  The tool/binary name is astblob (again, marketing GENIUS at work!!).

If a picture is worth a thousand words, here’s 3 thousand words worth.


[screenshot: astblob listing containers on Windows, Linux and OSX]

Firstly, in this image we have 3 machines all running astblob: a Windows machine, a Linux machine and an OSX machine. Each of them is connected to the same Azure Storage account (configured through environment variables), and we're simply asking for the containers in that account.

Easy enough. Now, for some more details.

[screenshot: astblob listing the contents of the "temp" container]

Here we're asking for the blob contents of the "temp" container. Remember, Azure (like S3) doesn't really have the concept of directories within containers/buckets. The blobs have '/' in their names to "fake" directories, but really those are just part of the blob name.

So now what happens if we download the temp container?


[screenshot: astblob downloading the "temp" container on each platform]

Here we download the temp container to somewhere on the local filesystem. You'll see that the blob whose name had "fake" directories in it has actually had those directories created for real (executive decision made there… by FAR I believe this is what is wanted). In both Windows and the *nix environments the "ken1" directory was made, and within it (although not shown in the screenshot) the files are there.
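
For anyone curious how the "fake" directories become real ones on disk, the gist is just treating the '/' separated blob name as a relative path and creating the directories before writing the file. This isn't the actual astblob code, just a minimal sketch of the idea:

import (
	"io/ioutil"
	"os"
	"path/filepath"
)

// saveBlob writes a blob to downloadDir, creating any "directories" that are
// embedded in the blob name (eg "ken1/test1" ends up as downloadDir/ken1/test1).
func saveBlob(downloadDir, blobName string, contents []byte) error {
	// convert the '/' separators in the blob name to the local OS separator.
	fullPath := filepath.Join(downloadDir, filepath.FromSlash(blobName))

	// make the directory portion of the path for real.
	if err := os.MkdirAll(filepath.Dir(fullPath), 0755); err != nil {
		return err
	}

	return ioutil.WriteFile(fullPath, contents, 0644)
}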

astblob is just the first tool from the AST suite to be released. The plan is for Queues and Tables to follow shortly, along with more Blob functionality.

The download for AST is at Github and binary releases are under the usual releases link there. Binaries are generated for Windows, OSX, Linux, FreeBSD, OpenBSD and NetBSD (all with 386 and AMD64 variants) although only Windows, Linux and OSX have been tested by me personally.

The AST tools (although 3 separate binaries) will all be self-contained. The astblob binary is literally a single binary; no associated libs need to be copied along with it.

Before anyone comments: yes, the official Azure CLI 2 handles all of the above and more, but it has more dependencies (rather than being a single binary) and is also a lot more complex. AST is just aimed at simple/common tasks…

Hopefully more people will find a consistent tool across multiple platforms useful!

Go embedded structs and pointer receivers

While working on a PR for the Azure Storage Go library I discovered (yet another) corner of Go that confused the hell out of me. I know enough about embedded types/structs to use them effectively (so I thought) and I know the difference between value receivers and pointer receivers. But when using them together you really need to be careful.

The specifics of the problem I hit: in the Azure Go library we have a struct called "Entity" which is used for table storage. Now, for reasons I won't go into, I needed to create another struct called BatchEntity, declared as such:

type BatchEntity struct {
	Entity
	otherstuff string
}

Now, the Entity struct has a very useful JSON formatting method with the signature:

func (e *Entity) MarshalJSON() ([]byte, error)

Please please PLEASE note that this is a pointer receiver! I spent an embarrassingly long time trying to figure out why, when I marshalled my BatchEntity struct, it didn't call this MarshalJSON method as I expected. It turned out that when I passed the embedded BatchEntity.Entity value to the marshalling code it was treated as a plain value, and the MarshalJSON method wasn't called since it specifically has a pointer receiver (pointer-receiver methods simply aren't in the value's method set). I still need to dig into the details, but this is a big warning to me: when dealing with embedded types, make sure you check the receivers of the base structs. Would have saved me hours!!!
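
To show (rather than tell) what tripped me up, here's a stripped-down repro, not the actual library code. Because MarshalJSON has a pointer receiver, it only lives in the method set of *BatchEntity (promoted from *Entity), so marshalling a BatchEntity value falls back to the default struct encoding while marshalling a pointer picks up the custom method:

package main

import (
	"encoding/json"
	"fmt"
)

type Entity struct {
	Name string
}

// pointer receiver, just like the real Entity.MarshalJSON
func (e *Entity) MarshalJSON() ([]byte, error) {
	return []byte(`{"custom":"` + e.Name + `"}`), nil
}

type BatchEntity struct {
	Entity
	otherstuff string
}

func main() {
	b := BatchEntity{Entity: Entity{Name: "ken"}, otherstuff: "ignored"}

	asValue, _ := json.Marshal(b)    // BatchEntity's method set does NOT include MarshalJSON
	asPointer, _ := json.Marshal(&b) // *BatchEntity's method set does

	fmt.Println(string(asValue))   // {"Name":"ken"}   - default encoding, custom method skipped
	fmt.Println(string(asPointer)) // {"custom":"ken"} - pointer-receiver method is used
}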

Kind of surprised there wasn’t a compile warning/error around that….   but I need to dig further.

AzureCopy Go now with added CopyBlob flag

Azurecopy (Go version) 0.2.2 has now been released. The major benefit is that when copying to Azure we can now use the absolutely AWESOME CopyBlob functionality Azure provides. This allows blobs to be copied from S3 (for example) to Azure without having to go via the machine executing the instructions (and use my bandwidth!)

An example of copying from S3 to Azure is as simple as:

azurecopycommand_windows_amd64.exe -S3DefaultAccessID="S3 Access ID" -S3DefaultAccessSecret="S3 Access Secret" -S3DefaultRegion="us-west-2" -dest="https://myaccount.blob.core.windows.net/mycontainer/" -AzureDefaultAccountName="myaccount" -AzureDefaultAccountKey="Azure key" -source=https://s3.amazonaws.com/mybucket/ -copyblob

The key thing is the -copyblob flag. This tells AzureCopy to do its magic!

By default AzureCopy-Go will copy 5 blobs concurrently (so as not to overload your own bandwidth), but with the Azure CopyBlob feature feel free to crank that setting up using the -cc flag (eg add -cc=20) :)
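
For the curious, limiting concurrent copies like this boils down to the usual Go pattern of a buffered channel acting as a semaphore. This isn't the actual AzureCopy-Go code, just a minimal sketch with a hypothetical copyBlob function standing in for the real copy:

import (
	"log"
	"sync"
)

// copyAll copies blobs with at most concurrentCopies in flight at any one time.
func copyAll(blobs []string, concurrentCopies int, copyBlob func(string) error) {
	sem := make(chan struct{}, concurrentCopies) // buffered channel used as a semaphore
	var wg sync.WaitGroup

	for _, blob := range blobs {
		wg.Add(1)
		sem <- struct{}{} // blocks once concurrentCopies copies are already running
		go func(b string) {
			defer wg.Done()
			defer func() { <-sem }()
			if err := copyBlob(b); err != nil {
				log.Printf("failed to copy %s: %v", b, err)
			}
		}(blob)
	}
	wg.Wait()
}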

Azure Functions (and Go)

UPDATE:

Ignore this post and check out this instead!


I'm a HUGE Azure Webjob fan; they're so useful for a wide variety of scenarios. Being able to set up a Webjob listening to an input queue and just throw messages at it is such a nice way to do adhoc data processing. Azure Functions are a nice improvement over Webjobs (but basically do the same thing). Whereas Webjobs are constantly running in the context of your App Service (and potentially using up your App Service quotas), Azure Functions aren't tied to App Service plans and, best of all, you're only charged for when they're actually processing something. Set them up and if they don't process anything for a month, you get charged nothing. You don't need to tear them down and restart them; they're just there, ready and waiting.

There are plenty of tutorials/pages on Azure Functions, so I won't go into the general details of how they work, but what I will touch on is how to run Go in them. Now, Go within Azure Functions isn't a first class citizen. Javascript and C# are first class citizens; Azure Functions knows how to execute those and it does that well. One nice feature Microsoft has provided is the ability to run batch files. Batch files can in turn run anything <queue evil laugh/>

What I'm describing isn't specific to Go; in fact I blatantly stole it from an Azure website describing how to do this with Java. But I'm not a Java fan-boy…

Firstly, setting up an Azure Function is pretty straightforward (under a minute to complete). Essentially go to the Azure Functions Portal (you can also do it via the regular Azure Portal), give your function a name and region, then hit "create". It's THAT easy!

You’ll get redirected to the Azure Portal, select “Create your own custom function”

[screenshot: the "Create your own custom function" option]

Then you have a LOT of options for triggering the processing of data. Expand to see all the options, then select QueueTrigger-Batch (if you want to use queues that is, which I do).

[screenshot: selecting the QueueTrigger-Batch template]

Now that we've configured the function to be triggered via a batch file, we can make that batch file in turn call our compiled Go binary (yes, I keep talking about Go here, but in reality this will work for any executable regardless of the language it was coded in).

[screenshot: the batch file contents in the portal editor]

The input queue message is converted into an environment variable (see line 2), then we call azurefunction.exe. How does Azure Functions know about azurefunction.exe? Click on "view files" then "upload", and you can upload any files required for processing the messages.

One last thing to do in the portal is to specify how the Azure Function returns its results. In my case I also want the output to be via Azure Queues. To enable this, select Integrate on the left menu, select New Output and then Azure Queue Storage.

[screenshot: configuring Azure Queue Storage as the output]

Now, EVERYTHING so far has had nothing to do with Go. So now we’ve done the Azure plumbing, let’s actually Go do some work (BOOM BOOM!)

In this example my super sophisticated Azure Function will count the number of characters in the input message and send that result to the output message. Rocket Surgery for the win!

The entire source for my Go application is:

package main

import (
	"log"
	"os"
	"strconv"
)

func main() {
	// the queue message arrives as an environment variable
	inputMessage := os.Getenv("inputMessage")

	// the path we write the result to; Azure Functions turns this file into the output queue message
	outputMessageLocation := os.Getenv("outputQueueItem")

	if inputMessage != "" {
		// just write the length of the input to the output location. Just to prove this works.
		cacheFile, err := os.OpenFile(outputMessageLocation, os.O_WRONLY|os.O_CREATE, 0644)
		if err != nil {
			log.Fatalf("Unable to open file %s %s", outputMessageLocation, err)
		}
		defer cacheFile.Close()

		cacheFile.WriteString(strconv.Itoa(len(inputMessage)))
	}
}


The code is extremely simple. The input message comes from the "inputMessage" environment variable. We get the location of the output file via the "outputQueueItem" environment variable. We create a file at that location, write the result (the length of the input) and hey presto, one usable Azure Function!

Now, we compile this and upload via the Azure portal (mentioned earlier). Now for testing!

Select "Develop" on the left hand menu of the Azure Portal (see screenshot below) and then the "Test" button on the right hand side. This will allow you to put some test inputs on the queue and check the results.

[screenshot: the Develop tab with test input and log output]

Enter some test data. In the above case I just entered "abcde" then hit RUN. You can see some of the debugging in the lower window. The input has been read, the data processed and the result thrown onto a queue (check your Azure queue with your favourite Azure viewing tool).

Now, given this is all just a batch file wrapper, you can see that the time to execute isn't brilliant. The entire process was about 750ms, and around 600ms of that seems to be just firing up the azurefunction.exe executable from the batch file. Still, I think that being able to set these up and "almost" fire and forget about them is really nice. You're only charged for the time they're running.

I’ll definitely be tinkering with these more, I can see them being hugely useful!

Let's GO OS crazy with AzureCopy-Go

Being able to copy from one cloud provider to another is useful, but if everything is purely serial (ie one blob at a time) the time taken to copy everything might be less than stellar. I've now released a new version of AzureCopy-Go (0.2.1) which allows concurrent copying of blobs. The default is 5, but using the -cc flag (concurrent copying) it can be increased up to 1000 (an arbitrary max limit). So far, so good!

Also, for this release I've built the binaries using the AMAZING Gox project. This allows for easy cross compiling for Go. So we now have Linux, FreeBSD, NetBSD, OpenBSD, Darwin (MacOS) and Windows binaries. For the most part we have 3 variations of each platform binary: one each for ARM, AMD64 and 386.

I knew how to get cross compiling with Go on Linux/MacOS but could never get it working on Windows (current main OS). Gox is definitely a time saver and is so damned easy to use.

Please give AzureCopy-Go 0.2.1 a try if you have any S3 <-> Azure migration needs. More features are being worked on every few days.

Azurecopy (GO version) pre-release

As mentioned in previous posts, I've been writing a Go version of AzureCopy so people would have something that works cross platform (Linux, MacOS and Windows). Today I've released the first pre-release just to test the waters. It supports Windows only (simply because I haven't compiled the other platforms yet), and only supports the local filesystem, S3 and Azure Blob Storage.

Baby steps.

The plan is to build for Linux and OSX, then start adding other cloud platforms. Meanwhile the original Azurecopy (Windows only, full .NET Framework) will still be developed (mainly from a Nuget/library point of view). If you just need an executable to perform copying, then I suggest using this newer version.

Some examples of using this newer version:

[screenshot: listing the contents of the testken123 bucket]

In this case we're just listing the contents of my testken123 (super secret) bucket. My AccessID and AccessSecret are passed in via command line options. The output format is a basic tree structure (will add in a bog-standard list soon). In the above case, the top of the tree is "testken123", which is the bucket name. Under that we have 2 virtual directories (remember, Azure/S3 etc don't really have directories but fake it by using / as a delimiter). In this case we see there is a blob called "ken1/test1", which treats the "ken1" part as a directory and "test1" as the blob name. The same applies for all the other results. Simple enough.

Then we have:

[screenshot: copying from the local filesystem to S3]

In this case we’re copying from my local filesystem (c:\temp\data\s3\) into the S3 bucket testken123. The console output is just to show what is going to be copied. Output will be modified to show progress.

Finally we have:

[screenshot: copying from Azure Blob Storage to S3]

That's copying from Azure Blob Storage to S3. Same deal, basic output.

For every command it is possible to pass the “-debug” flag. This makes things VERY verbose but is extremely useful for figuring out issues.

This is just a first step, pre-release, uber new version. Please give it a go and let me know if there are any issues. The plan is to start cranking out changes pretty frequently.

0.1.0 version

AzureCopy GO

The Go version of AzureCopy is slowly making progress. So far I’ve just been focusing on local filesystem and Azure (since I can do those while offline on the train commute thanks to the Azure Storage Emulator). The next plan is for S3 integration, primarily because S3 –> Azure seems to be the big use case for the original AzureCopy.

I'm planning on frequent releases once the basic S3 code is added (hopefully within the next few days). Not all features from the original AzureCopy will be available; I'll simply be focusing on 1) listing content and 2) copying content. There will be a few new additions such as a "don't overwrite" flag so copies can be continued after being stopped (this has been requested by a few people).

Of course, the original AzureCopy will still be developed (mainly from a Nuget packaging point of view) but if you just need a command line tool to copy (and maybe need it on multiple platforms) then this new version is probably the way to go.

Hopefully the S3 code will drop in a few days then I’ll have a first binary release for Linux, MacOS and Windows, and see how things proceed from there.

Adventures in GO!

I've dabbled (ok ok, writing and rewriting "hello world" many times) in Go for a few years but have never really given it a serious Go (boom boom!). But after buying Go in Action and going through a number of great Pluralsight courses (particularly by Nigel Poulton and Mike Van Sickle) I've decided to give it another crack.

Instead of going through various tutorials I've decided to try porting (well, more likely rewriting from scratch) my AzureCopy project. The original AzureCopy is all C# running on the .NET Framework 4.*. Although I DO (well, did until recently) want to get it migrated to .NET Core, I thought it would be a good chance to learn Go PROPERLY.

I’m still trying to get my head around OO in a “kinda-is, kinda-isn’t, sorta, maybe” OO language like Go. Going back to structs (ahh glory days of C/C++), interfaces and having the magic of pointers back is really giving me a nostalgia kick.

The rough outline for this AzureCopy rewrite is basically as follows:

  • Get my dev environment sorted out (currently VSCode)
  • Basic solution structure sorted, rough architecture
  • Be able to copy to/from the local filesystem to Azure Blob Storage
  • List blobs/containers in Azure
  • Add S3
  • Add DropBox
  • Add OneDrive

Really don't think I'll bother with Sharepoint this time around; it was a bitch to maintain in the existing version.

I'm unsure what the Go support is like with those cloud providers. I know the Azure one seems mostly there (well, for the stuff I need) but I get the distinct impression it's the poor cousin to .NET, Java, Python etc. I've yet to investigate S3's Go offerings. Hopefully if these libs aren't in great shape I might get a chance to finally get my name on a contributors list somewhere. :)

I'm sure my Go will suck… but am hoping it will get better. The new version of AzureCopy is of course on Github.

Dropbox and direct links

During some refactoring of AzureCopy I've decided to finally add Azure CopyBlob support for Dropbox. This means that locally you can run a command to copy from Dropbox to Azure Blob Storage and none of the traffic actually goes through the machine where AzureCopy is running: huge bandwidth/speed savings!

The catch is that it appears (I've NOT fully confirmed this yet) that Azure CopyBlob doesn't like redirection URLs, which is what I was receiving from Dropbox. I was generating a "shared" URL for a particular Dropbox file, which in turn returns an HTTP 302 redirect that then gives me the real URL. Azure CopyBlob doesn't play friendly with this. The trick is to NOT generate a "shared" URL but to generate a "media" URL. Quoting from the Dropbox API documentation: "Similar to /shares. The difference is that this bypasses the Dropbox webserver, used to provide a preview of the file, so that you can effectively stream the contents of your media."
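
If you want to see the redirect for yourself (or resolve it to the final URL before handing it to CopyBlob), the Go standard library makes it easy to stop following redirects and read the Location header instead. This is just a generic sketch, not what AzureCopy actually does internally:

import "net/http"

// resolveRedirect performs a single request without following redirects and
// returns the Location header if a 3xx comes back, otherwise the original link.
func resolveRedirect(link string) (string, error) {
	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse // hand back the 302 rather than following it
		},
	}

	resp, err := client.Head(link)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode >= 300 && resp.StatusCode < 400 {
		return resp.Header.Get("Location"), nil
	}
	return link, nil
}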

Once I made that change, hey presto, no more redirects and Azure CopyBlob is now a happy little ummm “thing”.

Upshot is now I can migrate a tonne of data from Dropbox to Azure without using up any of my own bandwidth.

woohoo :)