Building a Cloud Native Task Execution Framework on Azure

In this post we will look at how to build a consumption-based task execution framework that scales elastically as demand rises and falls.

The architecture diagram for the complete framework is shown below:

Architecture Diagram

This post explores only the core components of the framework, namely:

  1. Task Runners – Docker containers running on Azure Container Instances
  2. Task Orchestrator – Azure Functions App

Task Runners

At a very high level, a task runner is a Docker container that downloads and executes a task.

What is a task?

A task can be anything ranging from powering off a virtual machine to deploying a new service to a Kubernetes cluster.

For our framework, a task is any *nix executable that is stored in an Azure Blob Storage container. The executable reads its input from environment variables.

To keep things simple, let us take the example of turning off a virtual machine as our task. The bash script for this task is shown below:

#! /bin/bash
az login --identity
az vm stop --resource-group "${ResourceGroupName}" --name "${VMName}"

The script logs in to Azure using the managed identity assigned to the container and uses the “az vm stop” command to stop the virtual machine. For this script to run, we need the Azure CLI.
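Because the task takes its input from environment variables, a missing variable would only surface as an error from the Azure CLI. As a minimal sketch, a task script could check its inputs first. The helper name require_env is hypothetical and not part of the original script.

```shell
# Hypothetical helper: report every required variable that is unset or empty.
# printenv only sees exported variables, which is how a container receives them.
require_env() {
  missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

In the stop-VM script this could be called as `require_env ResourceGroupName VMName || exit 1` before `az login --identity`.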

What are Task Runners?

A task runner is a Docker container that contains all the dependencies a task needs in order to execute, and that downloads and launches the task executable.

For our example of turning off a virtual machine, the only dependency we have is the Azure CLI. Let us create a docker image that contains the Azure CLI and call it “lalitadithya/az-cli-image”.

The Dockerfile for this image is shown below:

FROM mcr.microsoft.com/azure-cli
WORKDIR /scripts
COPY script-runner.sh .
ENV PATH="/scripts:${PATH}"
RUN chmod a+x script-runner.sh
CMD ["./script-runner.sh", ""]

The Dockerfile copies a bash script called “script-runner.sh” from the build context, makes it executable, and sets it as the command the container runs by default. The script is shown below:

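#! /bin/bash
# Stop at the first failure so that a failed download is never executed
set -e
# Download the task executable from the URL passed as the first argument
wget -O script.sh "$1"
chmod a+x script.sh
./script.sh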

The script runner downloads the task executable from the URL passed as its first command-line argument and then runs it.
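The runner above trusts its first argument. As a minimal sketch, a guard like the following could run before the download so that a missing or malformed URL fails with a clear message. The function name validate_task_url is hypothetical, and insisting on an https URL is an assumption based on the task executables living in Azure storage.

```shell
# Hypothetical guard: accept only a non-empty https URL as the task location.
validate_task_url() {
  case "$1" in
    https://*)
      return 0
      ;;
    *)
      echo "usage: script-runner.sh <https-url-of-task-executable>" >&2
      return 1
      ;;
  esac
}
```

Inside the script runner this would be used as `validate_task_url "$1" || exit 1` ahead of the wget call.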

What are Task Definitions?

A task definition is a JSON document that contains metadata associated with a task such as the name of the executable for the task, the docker image that should be used to execute the task and the list of input parameters for the task. The task definitions will be stored along with the task executables.

The task definition for our task of turning off a virtual machine is shown below:

{
  "taskName": "StopVM",
  "dockerImage": "lalitadithya/az-cli-image:latest",
  "scriptName": "stop-vm-azure.sh",
  "scriptParameters": [
    {
      "parameterName": "VMName",
      "parameterType": "string",
      "required": true
    },
    {
      "parameterName": "ResourceGroupName",
      "parameterType": "string",
      "required": true
    }
  ]
}

Task Orchestrator

Let us now look at how we can make use of an Azure Functions app to automatically spawn Docker containers to execute tasks.

For every request, the task orchestrator needs to accomplish the following:

  1. Determine the container that needs to be spawned using the task definition
  2. Create a SAS for the task executable
  3. Spawn the docker container

The task orchestrator exposes a single POST endpoint that performs the steps listed above. The request body contains the name of the task and the input parameters for the task, as shown below:

{
  "taskName": "StopVM",
  "parameters": [
    {
      "name": "VMName",
      "value": "a3etest"
    },
    {
      "name": "ResourceGroupName",
      "value": "rg-a3e-test_resources-dev"
    }
  ]
}
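To exercise the endpoint from a shell, a request body of this shape can be assembled from name=value pairs. This is only a sketch: build_task_request is a hypothetical helper, it assumes the values need no JSON escaping, and the endpoint URL is not given in this post, so the curl call is left as a commented placeholder.

```shell
# Hypothetical helper: print a request body for a task name followed by
# name=value parameter pairs. Values are assumed to contain no characters
# that need JSON escaping (quotes, backslashes, control characters).
build_task_request() {
  task_name=$1
  shift
  printf '{"taskName":"%s","parameters":[' "$task_name"
  separator=''
  for pair in "$@"; do
    # Text before the first "=" is the name, everything after it is the value
    printf '%s{"name":"%s","value":"%s"}' "$separator" "${pair%%=*}" "${pair#*=}"
    separator=','
  done
  printf ']}\n'
}

# Example (ORCHESTRATOR_URL is a placeholder for your own function endpoint):
# build_task_request StopVM VMName=a3etest ResourceGroupName=rg-a3e-test_resources-dev |
#   curl -X POST -H 'Content-Type: application/json' --data @- "$ORCHESTRATOR_URL"
```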

Determine the container that needs to be spawned using the task definition

To determine the container that needs to be spawned we will have to pull the task definition from the storage account.

The code for pulling the task definition from the storage account is shown below:

// CloudStorageAccount and the blob types come from the legacy Azure Storage SDK; JsonConvert comes from Newtonsoft.Json
var cloudStorageAccount = CloudStorageAccount.Parse(configuration["StorageConnetionString"]);
var blobClient = cloudStorageAccount.CreateCloudBlobClient();
var storageContainer = blobClient.GetContainerReference("task-container");
var blob = storageContainer.GetBlobReference("stop-vm.json");

// Download the task definition and decode it as UTF-8 JSON
using MemoryStream blobContents = new MemoryStream();
await blob.DownloadToStreamAsync(blobContents);
string jsonTaskDefinition = Encoding.UTF8.GetString(blobContents.ToArray());

TaskDefinition taskDefinition = JsonConvert.DeserializeObject<TaskDefinition>(jsonTaskDefinition);
string containerImage = taskDefinition.DockerImage;
string taskExecutableName = taskDefinition.ScriptName;

Create a SAS for the task executable

In the previous step we found the name of the task executable. Now let us create a shared access signature (SAS) for it, so that we can pass the task runner a URI from which it can download the executable.

The code for generating a SAS is shown below:

var cloudStorageAccount = CloudStorageAccount.Parse(configuration["StorageConnetionString"]);
var blobClient = cloudStorageAccount.CreateCloudBlobClient();
var storageContainer = blobClient.GetContainerReference("task-container");
var blob = storageContainer.GetBlobReference(taskDefinition.ScriptName);

// Grant read-only access that expires after one hour
SharedAccessBlobPolicy sharedAccessBlobPolicy = new SharedAccessBlobPolicy
{
    SharedAccessExpiryTime = DateTime.UtcNow.AddHours(1),
    Permissions = SharedAccessBlobPermissions.Read
};
var blobToken = blob.GetSharedAccessSignature(sharedAccessBlobPolicy);
// The token already starts with "?", so it can be appended to the blob URI directly
string scriptLocation = blob.Uri.AbsoluteUri + blobToken;

Spawn the docker container

Now that we have a SAS URI for the task executable, let us spawn a Docker container on Azure Container Instances.

The code for this is shown below:

var environmentVariables = taskDefinition.ScriptParameters
    .ToDictionary(x => x.ParameterName,
                  x => task.Parameters.FirstOrDefault(y => x.ParameterName == y.Name)?.Value);

await azure.ContainerGroups
    .Define(SdkContext.RandomResourceName(configuration["ContainerGroupNamePrefix"], 20))
    .WithRegion(Region.USWest)
    .WithExistingResourceGroup(configuration["ResourceGroupName"])
    .WithLinux()
    .WithPublicImageRegistryOnly()
    .WithoutVolume()
    .DefineContainerInstance(SdkContext.RandomResourceName(configuration["ContainerInstanceNamePrefix"], 20))
        .WithImage(containerImage)
        .WithoutPorts()
        .WithCpuCoreCount(1)
        .WithMemorySizeInGB(1)
        .WithEnvironmentVariables(environmentVariables)
        .WithStartingCommandLine("script-runner.sh", scriptLocation)
        .Attach()
    .WithExistingUserAssignedManagedServiceIdentity(await azure.Identities.GetByIdAsync(configuration["UserAssignedManagedServiceIdentityId"]))
    .WithRestartPolicy(ContainerGroupRestartPolicy.Never)
    .CreateAsync();

Recap

In this post we looked at how we can build a consumption-based task execution framework on Azure using Azure Container Instances and Azure Functions.

You can view the source for all the components explained in this post in my GitHub repository.


About Lalit Adithya

Lalit is a coder, blogger, architect, and a photographer. He has been coding since 2010 and he has taken business critical websites and desktop apps from inception to production by working in/leading cross functional teams with an Agile focus. He currently focuses on developing & securing cloud native applications.

Bangalore, India https://lalitadithya.com