Converting Wheatley (Slack bot) to Azure Functions!

I first created the basis of Wheatley (my Slack bot) a few years ago during a hackathon. Since then I’ve used it at multiple companies, helping me with various ops processes as well as adding in any little bits of functionality that will help my co-workers. Normally I run it in a VM (both Windows and Linux; being Go based I can compile it for most things), but I’ve recently decided to add the option of running it as an Azure Function.

I was impressed by how trivially easy this was. Kudos both to Microsoft for finally making it easy to write a Go based Azure Function, and to Slack for providing alternate APIs to integrate against.

For details about making an Azure Function in Go, please see my previous post. What I’d like to highlight here are the Slack specifics I had to deal with to get Azure Functions working properly. The initial hurdle was that Wheatley only used the RTM (Real Time Messaging) protocol, which is basically a fancy way of saying websockets. Now, my main aim for using Azure Functions is that I only want compute running when I need it, rather than having hosted compute always running and always holding connections to the various clients. Fortunately, Slack has an alternative called the Events API, which is basically just falling back to the good old REST protocol… and given how infrequent the messages really are in Slack (in the big scheme of things), REST works nicely for me.

Jumping across to my Slack library of choice, slack-go provides Events API functionality as well as the RTM functions I was already using. Cool… no switching libraries!

Basically, very little of Wheatley’s design is Slack specific. Sure, receiving and sending messages are obviously Slack specific, but those are just a few very small touch points. Most of the code is integration with other systems, which has absolutely nothing to do with Slack. So, let’s look at the older RTM based code.

api := slack.New(slackKey)
rtm := api.NewRTM()
go rtm.ManageConnection()

for msg := range rtm.IncomingEvents {
  switch ev := msg.Data.(type) {
  case *slack.MessageEvent:
    originalMessage := ev.Text
    sender := ev.User
    _ = sender // the real bot uses the sender; not needed for a simple echo

    // lets just echo back to the user.
    rtm.SendMessage(rtm.NewOutgoingMessage(originalMessage, ev.Channel))
  default:
  }
}

So, we have some pre-defined slackKey (no, not going to tell you mine), we establish a new RTM connection and basically just sit in a loop getting the latest message and replying. Obviously Wheatley does a lot more; see GitHub for the exact details.

So effectively we just need something similar without the websockets shenanigans.

There’s a bit more handshake ceremony going on, but really not much. Instead of one token (above called slackKey) there are two: the one already mentioned and another called the verification token. The verification token is used to confirm that the messages you’re receiving are actually intended for this particular instance of the bot.
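Where those two tokens live is up to you; I just pull them from the environment. A minimal sketch (the environment variable names are hypothetical, but the resulting token and verificationToken variables are what the handler below expects):

// Hypothetical env var names; use whatever config mechanism you prefer.
// token is the bot/OAuth token used to post messages (the same one RTM used),
// verificationToken is the value from the Slack portal used to validate
// incoming events.
token := os.Getenv("SLACK_BOT_TOKEN")
verificationToken := os.Getenv("SLACK_VERIFICATION_TOKEN")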

Fortunately our HTTP handler func is the same type we’re all used to in Go. The highlights of the function are as follows:

var slackApi = slack.New(token)   // same token as RTM...

func slackHttp(w http.ResponseWriter, r *http.Request) {

  // read the request details.
  buf := new(bytes.Buffer)
  buf.ReadFrom(r.Body)
  body := buf.String()

  // Use the slack-go events API to parse the event we've received. This also
  // confirms that the verification token in the request matches the one we
  // already have from the Slack portal (variable verificationToken).
  eventsAPIEvent, e := slackevents.ParseEvent(json.RawMessage(body),
    slackevents.OptionVerifyToken(
      &slackevents.TokenComparator{VerificationToken: verificationToken}))
  if e != nil {

    // verification token not matching... bugger off.
    w.WriteHeader(http.StatusUnauthorized)
    return
  }

  // Check that we're the bot for this acct.
  // Taken directly from the slack-go event api example :)
  if eventsAPIEvent.Type == slackevents.URLVerification {
    var cr *slackevents.ChallengeResponse
    err := json.Unmarshal([]byte(body), &cr)
    if err != nil {
      w.WriteHeader(http.StatusInternalServerError)
      return
    }
    w.Header().Set("Content-Type", "text")
    w.Write([]byte(cr.Challenge))
    return
  }

  // Dealing with the message itself.
  if eventsAPIEvent.Type == slackevents.CallbackEvent {
    innerEvent := eventsAPIEvent.InnerEvent
    switch ev := innerEvent.Data.(type) {
    case *slackevents.MessageEvent:

      // Return 200 immediately... according to
      // https://api.slack.com/events-api#prepare if we don't respond within
      // 3 seconds the delivery is considered to have failed and we'll get the
      // message again. So return 200 straight away and let the code that
      // processes the message post its results later on.
      w.WriteHeader(http.StatusOK)

      originalMessage := ev.Text
      sender := ev.User
      _ = sender // the real bot uses the sender; not needed for a simple echo

      // again, we'll just echo it back to the channel it came from.
      slackApi.PostMessage(ev.Channel, slack.MsgOptionText(
        originalMessage, false))
    }
  }
}

If you’re interested in the real Wheatley version (that’s wrapped in the Azure Function finery) then check it out on GitHub.
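For context on that “Azure Function finery”: the custom handler side really is just a main() that starts a web server on whichever port the Functions host hands us (more details in my Azure Functions and Go post below). A rough sketch, assuming the slackHttp handler above and a hypothetical function name of wheatley:

package main

import (
	"log"
	"net/http"
	"os"
)

func main() {
	// When hosted as an Azure Functions custom handler the host supplies the
	// port; when running locally we just fall back to 8080.
	port, exists := os.LookupEnv("FUNCTIONS_HTTPWORKER_PORT")
	if !exists {
		port = "8080"
	}

	// "wheatley" is a hypothetical function name; it has to match the
	// subdirectory under wwwroot that contains the function.json binding.
	http.HandleFunc("/wheatley", slackHttp)
	log.Fatal(http.ListenAndServe(":"+port, nil))
}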

The most awkward part is getting the bot permissions correct in the Slack portal. So far, for the basic messaging, the permissions I’ve needed are: users:read, app_mentions:read, channels:history, chat:write, im:history, mpim:history and mpim:read. These are set in both the Events API part of the portal and the OAuth section.

After a few more days of testing this out on my private Slack group I think Slack + Wheatley + Azure Functions are ready to be unleashed on my co-workers 🙂

Remote execution of Powershell on Azure VM

From an operations point of view, remotely executing commands on a machine is critical for anything beyond a few machines. In the Windows world the way I’ve usually done this is by allowing remote PowerShell… but I’ve recently realised (I’m slow on the uptake) that I can do this with the Azure CLI. If I can do it with the Azure CLI (az), it means there is a REST API… and if there is a REST API, it means I can tinker.

Proceed with the tinkering!!

First things first. The az command to achieve this is:

az vm run-command invoke --command-id RunPowerShellScript --name my-vm-name -g my-resourcegroup  --scripts 'echo \"hello there\" > c:\temp\ken'

Now the fun bit. To run some arbitrary bit of PowerShell (which is scary enough), the REST endpoint is:

https://management.azure.com/subscriptions/xxxx/resourceGroups/xxxxx/providers/Microsoft.Compute/virtualMachines/xxxxxx/runCommand?api-version=2018-04-01

with the usual substitutions of subscription ID, resource group name and VM name.

You POST to the above URL with a body in the format of:

{"commandId":"RunPowerShellScript","script":['<powershell>']}

So the powershell could be the REAL commands…. format a drive, start the bitcoin miner etc etc…. OR… in my case I simply want to execute the powershell that has already been installed on the remote machine and has been verified as safe 🙂

I’m going to incorporate this remote execution into some tools I maintain for my own use (all in Go), so the entire program boils down to something pretty small. First, auth against a service principal in Azure, then execute the POST with the generated token. Auth against Azure is simplified by using my tiny AzureAuth project:

azureAuth := azureauth.NewAzureAuth(subscriptionID, tenantID, clientID, clientSecret)
azureAuth.RefreshToken()
template := "https://management.azure.com/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Compute/virtualMachines/%s/runCommand?api-version=2018-04-01"

url := fmt.Sprintf(template, subscriptionID, resourceGroup, vmName)

bodyTemplate := "{\"commandId\":\"RunPowerShellScript\",\"script\":['%s']}"
script := "c:/temp/myscript.ps1"
body := fmt.Sprintf(bodyTemplate, script)

req, _ := http.NewRequest("POST", url, strings.NewReader(body))
req.Header.Add("Authorization", "Bearer "+azureAuth.CurrentToken().AccessToken)
req.Header.Add("Content-Type", "application/json")
client := http.Client{}
client.Do(req)

This is obviously with all the error checking removed etc (just to reduce the clutter here).

Now, one important thing to remember: if you want immediate (or even just vaguely quick-ish) execution of your scripts, executing PowerShell via the Azure REST API is NOT the way to achieve it. Even a simple PowerShell script that writes a hello world file might take 20 seconds or so.
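On that note, the POST above doesn’t hand back the script output directly. In my experience the call follows the usual Azure async pattern: a 202 Accepted response with an Azure-AsyncOperation (or Location) header that you poll until the operation finishes. A rough sketch, picking up from the snippet above and keeping the response instead of discarding it:

resp, err := client.Do(req)
if err != nil {
  log.Fatal(err)
}
defer resp.Body.Close()

// the polling URL comes back in the Azure-AsyncOperation header
// (falling back to Location).
pollURL := resp.Header.Get("Azure-AsyncOperation")
if pollURL == "" {
  pollURL = resp.Header.Get("Location")
}

for {
  pollReq, _ := http.NewRequest("GET", pollURL, nil)
  pollReq.Header.Add("Authorization", "Bearer "+azureAuth.CurrentToken().AccessToken)
  pollResp, err := client.Do(pollReq)
  if err != nil {
    log.Fatal(err)
  }
  pollBody, _ := ioutil.ReadAll(pollResp.Body)
  pollResp.Body.Close()

  // crude check: the operation status JSON reports Succeeded/Failed once
  // it's done. A real implementation would unmarshal the JSON properly.
  if strings.Contains(string(pollBody), "Succeeded") || strings.Contains(string(pollBody), "Failed") {
    fmt.Println(string(pollBody))
    break
  }
  time.Sleep(5 * time.Second)
}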

The benefit of this (to me at least) is not having to enable general remoting to the Azure VM, no fiddling with firewall rules etc. It’s using the Azure REST API (which I use for SOOO many other reasons), the same type of authentication and the same way of integrating into my other tools. It mightn’t fit everyone’s needs, but I think this will definitely be my remoting process going forward (for Azure VMs).

See here for a simple implementation. Don’t forget to create a service principal and assign it VM Contributor rights to the VM you’re trying to remote to!

Azure Functions and GO UPDATED!

FINALLY

Note: I forgot to mention that so far, due to a bug in the Go runtime, I can only get binaries built with Go 1.13 to work. I have filed a bug and will see how things go.

Azure has finally made it possible to write Azure Functions in (virtually) any language you like. Yes, it’s just a redirection layer to your pre-compiled executable/script, but hey, it works… and it works with Go 🙂

Firstly, you’ll want to read about Azure Custom Handlers, and if you’re interested in Go, check out the samples. The samples include triggers for HTTP, Queues, Blobs etc. For now I just want to focus on the HTTP trigger. They have seriously made this so easy; in particular, switching between running locally and running as an Azure Function is literally a line or two of changes.

The diagram on the Azure Custom Handlers page is worth studying before we go anywhere.

Basically the Functions host is just a redirection layer to our REAL function, which is essentially a web server. Yes, this is a hack… no question about it… BUT… it’s exactly what I’m after. We can use ANY language we want, as long as it handles HTTP requests. Yes, there are overheads compared to not having this indirection layer, but really, I’m WAY more than satisfied with this approach. I’m just glad they didn’t insist we all use Docker containers for everything.

So, as long as we can run a tiny web server we’re good. Fortunately, Go (like most languages out there these days) comes with a half decent HTTP server built in.

For a simple Go example, I’m using:

package main

import (
	"fmt"
	"log"
	"net/http"
	"os"
)

func doProcessing(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "testing testing 1,2,3")
}

func main() {
	port, exists := os.LookupEnv("FUNCTIONS_HTTPWORKER_PORT")
	if !exists {
		port = "8080"
	}
	http.HandleFunc("/testfunction", doProcessing)
	log.Fatal(http.ListenAndServe(":"+port,nil))
}

It simply responds to a GET request to /testfunction with a string. Not exciting, but it will do for now. You can see that the only change between the local and Azure Function versions is the port: if the environment variable FUNCTIONS_HTTPWORKER_PORT exists it is used as the port number, otherwise we default to 8080 for the local environment.

Next there are two required files. The first is host.json, which basically tells the Functions host how to run the function, i.e. which executable is the web server. Mine is:

{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[1.*, 2.0.0)"
  },
  "httpWorker": {
    "description": {
      "defaultExecutablePath": "azurefunctiongo.exe"
    }
  }
}

Where azurefunctiongo.exe is the executable generated from the above Go code.
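For completeness, the exe itself is just a regular Go build; assuming a Windows based plan you can cross compile it from wherever you like:

GOOS=windows GOARCH=amd64 go build -o azurefunctiongo.exe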

Finally there is function.json, which describes the bindings for a particular function. In my case I was interested in an HTTP trigger, so my function.json looks like:

{
  "bindings": [
    {
      "authLevel": "anonymous",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": ["get", "post"]
    },
    {
      "type": "http",
      "direction": "out",
      "name": "res"
    }
  ]
}

It can handle GET and POST (although my code currently only handles GET). The incoming trigger is HTTP and the output is also HTTP. You might want situations where the input (ie the trigger) is HTTP but the output is putting a message onto an Azure Queue, for example. The Azure Custom Handlers page linked above covers all of this.
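As a sketch of what that might look like (not something my function uses, and the queue/connection names here are made up): an HTTP trigger with a storage queue output binding would swap the http output for a queue one, along the lines of:

{
  "bindings": [
    {
      "authLevel": "anonymous",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": ["post"]
    },
    {
      "type": "queue",
      "direction": "out",
      "name": "msg",
      "queueName": "myqueue",
      "connection": "AzureWebJobsStorage"
    }
  ]
}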

Now, all of these just get uploaded to the usual wwwroot of an App Service plan (I’ll automate that soon) and away you go! (Note: make sure the exe and host.json are in the wwwroot directory, and that function.json is in a subdirectory of wwwroot named after your endpoint, in my case testfunction.)
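To make the layout concrete, for my testfunction example the wwwroot ends up looking like this:

wwwroot/
  azurefunctiongo.exe
  host.json
  testfunction/
    function.json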

Now that Azure has this going, I can see SOOOOO many use cases for my projects. Thankyou thankyou thankyou Azure!!!

Alternate Azure Webjob Deploy….

Update:

Since the original post, webjobdeploy can now deploy Azure App Services as well (using the -deploy appservice flag). Definitely becoming very handy 🙂 Although now I might have to consider a rename 😉

———

I’ve recently found myself needing to deploy Azure webjobs fairly frequently. Now, once I have everything rigged up to Azure Devops everything will be easy, but for now as I’m tinkering on my local machine I’m after an easy way to do a quick deploy of one (or more) webjobs.

Now, of course I could just use the Powershell commands or the brilliantly useful AZ command provided by Microsoft…… but…… I wanted something that suits my rather awesome level of laziness. Enter webjobdeploy (or wjd, man I’m good at naming things).

The idea (whether good or bad) for webjobdeploy is to minimize how much you need to type to get your binaries deployed to Azure. Yeah, I know it shouldn’t be a concern… but it is.

Say you have the binaries for a webjob (here I’m talking about an executable that will run as a webjob) in c:\temp\mywebjob. The way you’d deploy it using webjobdeploy is:

webjobdeploy.exe -uploadpath c:\temp\mywebjob\ -appServiceName myappservice -webjobname mywebjob -webjobexename mywebjob.exe -username neveryoumind -password ditto

Hopefully the parameters are self explanatory, with the possible exception of webjobexename. The API I use to upload the binaries needs to know the name of the executable (whether exe, bat, cmd etc), hence that parameter.

So, you’ll end up with a webjob called “mywebjob” running under the context of the application service “myappservice” (which needs to be already created). So all is good and simple.

Then I modify my webjob and want to redeploy. Yes, I can “up arrow” and rerun that command, but I’d prefer an alternate method.

Webjobdeploy can store many of the parameters you provide, tied to the app service name you provide. For example, say I want webjobdeploy to store various parameters such as webjobname, webjobexename, username, password etc etc… I’d modify the above command to look like:

webjobdeploy.exe -uploadpath c:\temp\mywebjob\ -appServiceName myappservice -webjobname mywebjob -webjobexename mywebjob.exe -username neveryoumind -password ditto -store

Then for the second deployment I could use the command:

webjobdeploy.exe -uploadpath c:\temp\mywebjob\ -appServiceName myappservice

Here I’m telling webjobdeploy where to find the binaries (c:\temp\mywebjob) and which appservice to use. What happens in the background is that wjd checks to see if it has saved any config data for the app service specified (since deployments are really fairly specific to app services). If it finds credentials it will use those but any parameters passed in on the command line take precedence.

For example, say I’d already deployed the webjob as “mywebjob” but now wanted to deploy an alternate copy so I could compare them side by side. I could simply execute the command:

webjobdeploy.exe -uploadpath c:\temp\mywebjob\ -appServiceName myappservice -webjobname mywebjob2

Now I have the second instance of the webjob running almost instantly.

Yes, this really doesn’t give you anything PowerShell or az doesn’t already provide… but I do find it useful for rapid webjob turnarounds.

IMPORTANT NOTE: The configuration that is saved is currently NOT encrypted. This means your username/password are stored in plain text (within your home directory). Yes this is bad… yes, this is hopefully only temporary. But if your local home directory is already compromised, you probably have bigger issues at hand. Not an excuse, just an observation 🙂

Dropbox and direct links

During some refactoring of AzureCopy I’ve decided to finally add Azure CopyBlob support for Dropbox. This means you can run a command locally to copy from Dropbox to Azure Blob Storage and none of the traffic actually goes through the machine where AzureCopy is running. Huge bandwidth/speed savings!

The catch is that it appears (I’ve NOT fully confirmed this yet) that Azure CopyBlob doesn’t like redirection URLs, which is what I was getting from Dropbox. I was generating a “shared” URL for a particular Dropbox file, which in turn generates an HTTP 302 redirect and then gives me the real URL. Azure CopyBlob doesn’t play friendly with this. The trick is NOT to generate a “shared” URL but to generate a “media” URL. Quoting from the Dropbox API documentation: “Similar to /shares. The difference is that this bypasses the Dropbox webserver, used to provide a preview of the file, so that you can effectively stream the contents of your media.”
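To make the idea concrete, here’s a rough Go sketch of the underlying Copy Blob REST call (this is not AzureCopy’s actual code; the URLs are made up and the destination is assumed to carry a SAS token with write permission so the example can skip the shared key signing):

// direct (non redirecting) Dropbox media link and a destination blob URL
// that already includes a SAS token. Both are hypothetical.
dropboxMediaURL := "https://dl.dropboxusercontent.com/some/direct/link"
destBlobURL := "https://myaccount.blob.core.windows.net/mycontainer/myblob?sv=...sas..."

req, err := http.NewRequest("PUT", destBlobURL, nil)
if err != nil {
  log.Fatal(err)
}

// x-ms-copy-source tells blob storage to pull the data itself, so no bytes
// flow through the machine running this code.
req.Header.Add("x-ms-copy-source", dropboxMediaURL)

resp, err := http.DefaultClient.Do(req)
if err != nil {
  log.Fatal(err)
}
defer resp.Body.Close()

// 202 Accepted means the copy has been queued; progress can be checked via
// the x-ms-copy-status header on a later Get Blob Properties call.
fmt.Println(resp.Status, resp.Header.Get("x-ms-copy-id"))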

Once I made that change, hey presto, no more redirects and Azure CopyBlob is now a happy little ummm “thing”.

Upshot is now I can migrate a tonne of data from Dropbox to Azure without using up any of my own bandwidth.

woohoo 🙂

DocumentDB, Node.js, CoffeeScript and Hubot

For anyone that doesn’t already know, Hubot is GitHub’s ever-present “bot” that can be customised to respond to all sorts of commands on a number of different messaging platforms. From what I understand (I don’t work at GitHub, so I’m just going by what I’ve read) it is used for build/deploy to production (and all other environments), determining employee locations (distributed teams) and a million other things. Fortunately GitHub has made Hubot open source and anyone can download and integrate it into Skype, HipChat, Campfire, Slack etc. I’ve decided to have a crack at integrating it into my workplace, specifically against the Slack messaging system.

I utterly love it.
During a 24 hour “hackday”, I integrated it into Slack (see details) and grabbed a number of pre-existing scripts to start me off. Some obvious ones (for a dev team) are TeamCity integration, statistics and statuses of various 3rd party services that we use, and information retrieval from our own production system. This last one will be particularly useful for support, giving us an easy way to retrieve information about a customer without having to write new UIs for every change we make. *very* dev friendly 🙂

One thing I’ve been tinkering with is having Hubot communicate directly with the Azure DocumentDB service. Although I’ve only put the proverbial toe in the water, I see LOTS of potential here. Hubot can run anywhere (behind a corporate firewall, out on an Azure Website or anywhere in between). Having it access DocumentDB (which can be reached from anywhere with a net connection) means we do not need to modify production services/firewalls etc for Hubot to work. Hubot can then perform these queries and get the statistics/details with ease. This (to me) is a big win: I can provide a useful information retrieval system without having to modify our existing production platform.

Fortunately the DocumentDB team have provided a nice Node.js npm package (see here for some examples). This made things trivially easy to do. The only recommendation I’d make is that for tools/services/hubots that are read-only, just use the read-only DocumentDB key available in the Azure portal. I honestly didn’t realise that read-only keys were available until I recently did some snooping about, and although I’m always confident in my code, having a read-only key gives me a safety net against production data.

Oh yes, CoffeeScript. I’m not a JavaScript person (I’m backend as much as possible, C# these days) and Hubot’s default language is CoffeeScript. So first I had to deal with JS and THEN deal with CoffeeScript. Yes, this last part is just my personal failing (being dragged kicking and screaming into the JS era).

An example of performing a query against DocumentDB from Node.js (in CoffeeScript) follows. First you need to get a database reference, then a collection reference (from the DB), then perform the real query you want.

DocumentClient = require("documentdb").DocumentClient
client = new DocumentClient(process.env.HUBOT_DOCUMENTDB_ENDPOINT, {masterKey: process.env.HUBOT_DOCUMENTDB_READONLY_KEY})

GetDatabase = (client, databaseName, callback) ->
  dbQuery = { query: 'SELECT * FROM root r WHERE r.id="mydatabase"' }
  client.queryDatabases(dbQuery).toArray (err, results) ->
    if !err
      if results.length > 0
        callback(results[0])

GetCollection = (client, databaseLink, callback) ->
  collectionQuery = { query: 'SELECT * FROM root r WHERE r.id="mycollection"' }
  client.queryCollections(databaseLink, collectionQuery).toArray (err, results) ->
    if !err
      if results.length > 0
        callback(results[0])

GetDatabase client, "mydatabase", (database) ->
  GetCollection client, database._self, (collection) ->
    client.queryDocuments(collection._self, "select * from docs d where d.id = 'testid'").toArray (err, res) ->
      if res && res.length > 0
        console.log(res[0].SomeData)

Given CoffeeScript is whitespace sensitive and blog editors have a habit of mangling indentation, double check the whitespace if you copy this code.

End result is Hubot, Node.js and DocumentDB are really easy to integrate together. Thanks for a great service/library Microsoft!

Azure DocumentDB performance thoughts

Updated: Typos and clarifying collections.

I’ve been developing against Azure DocumentDB for over 6 months now and have to say, overall I’m impressed. It gives me more than Azure Table storage (great key/value lookups but no searching via other properties) yet isn’t the 800 pound gorilla that a full Azure database is. For me it sits nicely between the two, giving me easy development/deployment while also letting me index whichever fields I like (admittedly I’m sticking with the default of “all”) and query against them.

Now, my development hasn’t just been idle curiosity with a bit of tinkering here and there; it’s a commercial application that is out in the wild (although in beta) currently. It is critical that language support, tooling, performance and documentation quality are up to scratch. For the most part they have been. I’m personally very happy with it and will push for us to continue using it where appropriate.

Initially DocumentDB was NOT available in the region where my Azure Web Roles/VMs were running (during development we had Web Roles running out of Singapore but DocumentDB out of west-us). This was fine for development purposes, but there was a niggling concern about *when* DocumentDB would appear in Singapore. Well, finally it did, and the performance “felt” like it improved.

Felt… tricky word. I swear sometimes when I tinker with my machine it “feels” faster… but it’s probably just mind over matter. (Personally I’d love to be involved in some medical trial where I end up with a placebo. I swear it would cure me of virtually anything… or at least I feel it would) 🙂

Ahem, I digress. So it “felt” faster once DocumentDB appeared in Singapore, but I know others didn’t really notice any difference. Admittedly there are LOTS of moving parts in the application and DocumentDB is just one small cog in a big machine. Maybe I was biased, maybe I was the only one paying attention, maybe I was fooling myself? Time to crank out Visual Studio and see what lies, statistics and benchmarks tell me.

One of our development accounts had enough data to make it mostly realistic (ie not just a tiny tiny sample of data which wouldn’t prove anything). But that was sitting in west-us…   so the benchmarks I took were slightly the reverse of what production was.

In production we have the VM/WebRole and DocumentDB in Singapore, whereas previously we had the VM/WebRole in Singapore and DocumentDB in west-us. For the purposes of my benchmarking I’ve kept the DocumentDB in west-us (test data) and set up 2 VMs to do the testing: one in west-us and one in Singapore.

First, some notes about the setup. Originally we had 4 collections set up within a given DocumentDB account (for an explanation of a collection, see here). The query was through the LINQ provider (using SQL syntax) with a couple of simple where conditions (company = x and userid = y type of thing). Very simple, very straightforward. The query was also only executed against one of the collections. The other collections had data but were not relevant for this query.

So, what did I find?

When the test was run on a VM in Singapore against DocumentDB in west-us, the runtime results were:

3916ms, 3899ms, 3904ms, 3962ms, 3928ms, 3881ms

Giving an average of 3915ms.

Whereas running the same test in west-us resulted in:

431ms, 456ms, 684ms, 494ms, 422ms, 425ms

With an average of 485ms.

That’s an improvement of 88%. This really shouldn’t be a surprise; the Pacific Ocean is a tad large. I bet all those packets got very soggy and slowed down 😉

Another change that I’ve been working on is merging our 4 collections into a single collection. It has been stressed by the DocumentDB team that collections are not tables. Regardless of this, when we set up our collections originally we did make them as if they were tables, i.e. a single type of entity would be stored in a single collection. Although I’ll eventually end up with just the single combined collection, during these tests all 5 collections co-existed within the same DocumentDB account.

I’ve been modifying/copying the data from the 4 collections into a single “uber collection”, which really is the way it should have been done in the first place. My only real source of confusion was: when querying this combined collection, how do we know what to deserialize the response objects as?

i.e. if I perform a query and get a mix of results (class A and class B), how do I deal with it? This really was an artificial problem. The reality is that my queries didn’t really change (that much). If I was originally querying collection 1, I’d always get back results serialized as a list of Class A objects. If I do the same query against the combined collection, I should still get the same results. The only change I made to the objects (and the query) was that each document stored in this combined collection gets a “DocType” property which is assigned some number (really an enum). This way I could modify my query to be something like: “….. original query…..  AND e.DocType=1” etc.
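As a concrete (made-up) illustration using the company/userid style query mentioned earlier, the per-collection query and its combined-collection equivalent differ only by that trailing DocType filter (the property names are just examples):

select * from docs e where e.Company = 'x' and e.UserId = 'y'
select * from docs e where e.Company = 'x' and e.UserId = 'y' and e.DocType = 1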

This just gave me a little peace of mind that my queries would only return a single document type and that I wouldn’t have to “worry my pretty little head” over some serialization trickery later on.

So… what happened? Is a combined collection better or worse performance wise? A resounding BETTER is the answer. For the *exact* same data (just copying the documents from the 4 collections into the combined collection) and adding the DocType property I got the following results:

WebRole in Singapore with DocumentDB in west-us:

3598ms, 3614ms, 3624ms, 3641ms, 3563ms, 3616ms

Giving an average of 3609ms. This is an 8% improvement.

For everything in west-us I then got:

144ms, 155ms, 136ms, 185ms, 159ms, 136ms

With the average being 152ms. This is an improvement of 69%!!!! HOW??? WHY???? (not that I’m complaining, mind you). What appears to have happened is that, regardless of compute vs storage location, approximately 300ms has been shaved off the query time. i.e. the average for compute/storage in different locations went from 3915ms to 3609ms, a difference of 306ms. When we have compute and storage in the same location the averages went from 485ms to 152ms, a difference of 333ms.

I’ll be asking the DocumentDB production team for any advice/reasoning around this merely to satisfy my own curiosity but hey, not going to look a gift horse in the mouth.

When I get some time I’ll do some more tests to see if the DocType property I added somehow improves performance. If I added it to the scenario where I had the 4 collections, would it speed things up? I can’t see how, since I’m just using it to filter document entity types, and in the multiple-collection test I’m really only querying one of them (which has a single entity type in it). More investigations to follow…..