💸 Deploy low cost ECS tasks based on SQS queue size with AWS CDK

In this post we use AWS CDK to build a system that scales up and down based on the number of messages in an SQS queue. Users make REST API calls to an Amazon API Gateway endpoint from their applications or computers, which adds a new message to the SQS queue. In turn, this triggers your task on ECS. After your task finishes, it deletes the message from the SQS queue, which automatically scales your ECS service back down.

Voilà: a low-cost autoscaling solution for your compute-intensive jobs.

🎓 At the end of this post

When you have finished this post, you will have built the following:

  • A CDK project that deploys your resources to AWS
  • A CDK deployment that bundles your Python code into a Docker image and pushes it to ECR
  • An SQS queue that spins up ECS tasks based on the number of messages in the queue
  • An API Gateway endpoint that users can call to put new messages in the queue
  • Python code that prints a Hello World message

The above items result in the following architecture:

Example architecture of what we are building

📝 Requirements – Level 200

This post assumes fundamental knowledge of the AWS platform and Docker-based development. We will not describe in detail how Docker or ECS work, so some prior experience with both is advised.

💼 Use cases

Compute-heavy tasks, such as automated video generation, at minimal cost

This architecture lets users trigger compute-heavy tasks without you having to keep resources running constantly. Generating videos, for example, takes a lot of computing power, and keeping that capacity running all the time becomes very expensive very fast. With this setup, users can simply trigger an endpoint and forget about it until the job is done.

🧠 Tutorial

In this chapter we walk through the steps required to create this architecture.

1 Creating a new CDK project

The very first thing we will need to do is generate a new CDK project for us to work in. We will use the CDK sample application for now, to get a head start.

To begin, open a terminal window wherever you want to store your project and create a new folder in which to generate a new CDK project:

mkdir docker-ecs-sqs-auto-scaling
cd docker-ecs-sqs-auto-scaling
cdk init sample-app --language typescript
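
Note: if this is the first time you deploy CDK resources to this account and region, you may also need to bootstrap the environment once, since the Docker image we build later is uploaded as a CDK asset. If cdk deploy complains about a missing bootstrap stack, run:

cdk bootstrap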

2 Adding the SQS queue

You will notice that there is already an SQS queue defined in your CDK application under ~/docker-ecs-sqs-auto-scaling/lib/cdk-demo-stack.ts, which we can use.
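
If you are curious, the generated stack looks roughly like the sketch below – the exact class and construct names depend on your project name and CDK version, so treat it as an approximation:

// Rough sketch of the generated sample-app stack (names may differ)
import * as sns from "@aws-cdk/aws-sns";
import * as subs from "@aws-cdk/aws-sns-subscriptions";
import * as sqs from "@aws-cdk/aws-sqs";
import * as cdk from "@aws-cdk/core";

export class CdkDemoStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    // The queue we will keep (renamed to messageQueue below)
    const queue = new sqs.Queue(this, 'CdkDemoQueue', {
      visibilityTimeout: cdk.Duration.seconds(300)
    });

    // The SNS topic and subscription we will remove
    const topic = new sns.Topic(this, 'CdkDemoTopic');
    topic.addSubscription(new subs.SqsSubscription(queue));
  }
}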

You will find that it also added an SNS topic and subscription, which you can remove for now. All we need here is the queue itself – rename it to messageQueue so we can reference it later; the scaling that monitors the number of available messages will be added in step 7:


    const messageQueue = new sqs.Queue(this, 'DockerEcsSqsAutoScalingQueue', {
      visibilityTimeout: cdk.Duration.seconds(300)
    });
    

As always, verify your work by deploying your resources with cdk deploy.

3 Creating the API Gateway that will add messages to the SQS queue

The API Gateway will be the entry point for our users to add messages to the SQS queue. You can add multiple integrations to an API Gateway, but in our case we want it to direct requests to our SQS queue.

In order for us to create the API Gateway, we need the following:

  1. An IAM role that can be assumed by API Gateway
  2. An inline policy, attached to the role above, that allows API Gateway to send messages to our SQS queue
  3. The REST API
  4. A GET method that passes the message to the SQS queue (i.e. the endpoint)

When you have CDK’s help on your side, the above becomes the following:

// import * as iam from "@aws-cdk/aws-iam";
// import * as apigateway from "@aws-cdk/aws-apigateway";
// import the above at the top of your file

    const credentialsRole = new iam.Role(this, "Role", {
      assumedBy: new iam.ServicePrincipal("apigateway.amazonaws.com"),
    });

    credentialsRole.attachInlinePolicy(
      new iam.Policy(this, "SendMessagePolicy", {
        statements: [
          new iam.PolicyStatement({
            actions: ["sqs:SendMessage"],
            effect: iam.Effect.ALLOW,
            resources: [messageQueue.queueArn],
          }),
        ],
      })
    );

    const api = new apigateway.RestApi(this, "Endpoint", {
      deployOptions: {
        stageName: "run",
        tracingEnabled: true,
      },
    });

    const queue = api.root.addResource("queue");
    queue.addMethod(
      "GET",
      new apigateway.AwsIntegration({
        service: "sqs",
        path: `${cdk.Aws.ACCOUNT_ID}/${messageQueue.queueName}`,
        integrationHttpMethod: "POST",
        options: {
          credentialsRole,
          passthroughBehavior: apigateway.PassthroughBehavior.NEVER,
          requestParameters: {
            "integration.request.header.Content-Type": `'application/x-www-form-urlencoded'`,
          },
          requestTemplates: {
            "application/json": `Action=SendMessage&MessageBody=$util.urlEncode("$method.request.querystring.message")`,
          },
          integrationResponses: [
            {
              statusCode: "200",
              responseTemplates: {
                "application/json": `{"done": true}`,
              },
            },
          ],
        },
      }),
      { methodResponses: [{ statusCode: "200" }] }
    );

As always, verify your work by deploying your resources with cdk deploy.

4 Creating the ECS cluster

Now that we have created the SQS queue and the API Gateway that we will use to scale our ECS service up and down, we need to actually create the cluster that hosts our tasks.

Unfortunately, it isn’t as simple as just creating the cluster, as we also need some extra foundational AWS resources. For example, in order for our tasks to pull the Docker image from ECR (the container registry), we need a NAT gateway. This is one of the parts of this solution that costs money consistently throughout the month.

 

// import * as ec2 from "@aws-cdk/aws-ec2";
// import * as ecs from "@aws-cdk/aws-ecs";
// import the above at the top of your file

    const natGatewayProvider = ec2.NatProvider.instance({
      instanceType: new ec2.InstanceType("t3.nano"),
    });

    const vpc = new ec2.Vpc(this, "FargateVPC", {
      natGatewayProvider,
      natGateways: 1,
    });

    const cluster = new ecs.Cluster(this, "Cluster", { vpc });

As always, verify your work by deploying your resources with cdk deploy.

5 Adding the ECR repository and automatically deploying your Docker image

You’re doing great and this setup is almost finished already. It’s one of the great advantages of CDK that you don’t need many lines of code to declare your resources.

In order to deploy our code on the ECS cluster we need to create a Task Definition, which describes the task that will run on ECS. This contains configuration for the usual things like CPU and memory limits, but also which containers to run from which registry. Lastly, we will add a Service to the cluster which in turn references the Task Definition.

// Create a task role that will be used within the container
    const EcsTaskRole = new iam.Role(this, "EcsTaskRole", {
      assumedBy: new iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
    });

    EcsTaskRole.attachInlinePolicy(
      new iam.Policy(this, "SQSAdminAccess", {
        statements: [
          new iam.PolicyStatement({
            actions: ["sqs:*"],
            effect: iam.Effect.ALLOW,
            resources: [messageQueue.queueArn],
          }),
        ],
      })
    );    

    // Create task definition
    const fargateTaskDefinition = new ecs.FargateTaskDefinition(
      this,
      "FargateTaskDef",
      {
        memoryLimitMiB: 4096,
        cpu: 2048,
        taskRole: EcsTaskRole
      }
    );

    // create a task definition with CloudWatch Logs
    const logging = new ecs.AwsLogDriver({
      streamPrefix: "myapp",
    });

    // Create container from local `Dockerfile`
    const appContainer = fargateTaskDefinition.addContainer("Container", {
      image: ecs.ContainerImage.fromAsset("./python-app", {}),
      logging,
    });

    // Create service
    const service = new ecs.FargateService(this, "Service", {
      cluster,
      taskDefinition: fargateTaskDefinition,
      desiredCount: 0,
    });
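
As a side note – this is an optional tweak and not part of the setup above – you could extend the addContainer call to expose the queue URL to the container as an environment variable, so the Python app we write in the next step would not need a hard-coded URL. The QUEUE_URL name below is just an example:

    // Optional: expose the queue URL to the container as an environment variable
    const appContainer = fargateTaskDefinition.addContainer("Container", {
      image: ecs.ContainerImage.fromAsset("./python-app", {}),
      logging,
      environment: {
        QUEUE_URL: messageQueue.queueUrl,
      },
    });

The Python app could then read it with os.environ["QUEUE_URL"] instead of hard-coding the URL.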

As always, verify your work by deploying your resources with cdk deploy.

6 Creating a local Python app that will get deployed to ECS using Docker

In the ECS Task Definition above we specified that CDK should find a folder called ./python-app that contains our Docker definition. This Docker image is what will be deployed as our ECS task – so let’s create it:

mkdir python-app
cd python-app
touch Dockerfile
touch app.py
touch requirements.txt

Open up the Dockerfile in your favorite IDE and let’s add some contents to it. The only thing we need is a simple Python image that runs one file, so we can keep it short:

FROM python:3.6

USER root

WORKDIR /app

ADD . /app

RUN pip install --trusted-host pypi.python.org -r requirements.txt

CMD ["python", "app.py"]

Next, you can fill the app.py file in this folder with anything you like.

If you want the easiest approach, you can simply go for a print("Hello World.") for now. Alternatively, you can read the messages from SQS and delete them from the queue accordingly.

Secondly, if you ever need extra dependencies you can add those to the requirements.txt file – for now (with the first option) we don’t have any. As an example of the alternative option, you can use the following app.py (and add boto3 to your requirements.txt):

import boto3

sqs = boto3.client('sqs')

# Add your SQS queue URL from the AWS console here
queue_url = 'https://sqs.eu-central-1.amazonaws.com/.....'


def read_sqs():
    # Fetch up to 10 available messages from the queue
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=5
    )
    return response.get('Messages', [])


def delete_sqs_message(receipt_handle):
    print(f"Deleting message {receipt_handle}")
    # Delete the received message from the queue
    sqs.delete_message(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle
    )


# Read SQS
messages = read_sqs()
print(f"Found messages {messages}")

for message in messages:
    # Take custom actions based on the message contents
    print(f"Activating {message}")
    print("Said Hello")

    # Delete message
    delete_sqs_message(message['ReceiptHandle'])
    print(f"Finished for {message}")

7 Creating the step scaling so your ECS tasks scale based on SQS available messages

In this final step we will add the CDK code that creates the step scaling of your ECS tasks based on the SQS queue size. We will create both a scale-out rule (increasing the number of tasks) and a scale-in rule (decreasing the number of tasks) for our CloudWatch alarm.

In order to do so, we define two scaling steps for our ECS service based on the ApproximateNumberOfMessagesVisible metric of the queue created above. It is actually really simple:

// import * as autoscaling from "@aws-cdk/aws-applicationautoscaling";
// import the above at the top of your file

    // Configure task auto-scaling
    const scaling = service.autoScaleTaskCount({
      minCapacity: 0,
      maxCapacity: 1,
    });

    // Setup scaling metric and cooldown period
    scaling.scaleOnMetric("QueueMessagesVisibleScaling", {
      metric: messageQueue.metricApproximateNumberOfMessagesVisible(),
      adjustmentType: autoscaling.AdjustmentType.CHANGE_IN_CAPACITY,
      cooldown: cdk.Duration.seconds(300),
      scalingSteps: [
        { upper: 0, change: -1 },
        { lower: 1, change: +1 },
      ],
    });

And that is it, all done! For the very last time, verify your changes by running cdk deploy in your terminal to deploy your latest resources to the cloud.

8 Calling your API Gateway endpoint to trigger your ECS task

Now that all the infrastructure is in place, it is time to call the API Gateway and see the magic in action. You can find your API Gateway endpoint URL either in your terminal output after running cdk deploy or in the AWS console under the API Gateway service.

Successful deployment of CDK

With that, you can add a message to your SQS queue by making a GET request to the queue endpoint, for example: https://thmjsgwe1l.execute-api.eu-central-1.amazonaws.com/run/queue?message=test
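
For example with curl (replace the URL with your own endpoint):

curl "https://thmjsgwe1l.execute-api.eu-central-1.amazonaws.com/run/queue?message=test"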

After this, you should see the number of available messages in your SQS queue increase to 1 (or more) in the AWS console.

SQS Showing received message available
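
If you prefer the terminal over the console, you can check the same thing with the AWS CLI (assuming it is installed and configured; replace the placeholder with your own queue URL):

aws sqs get-queue-attributes --queue-url <your-queue-url> --attribute-names ApproximateNumberOfMessages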

Within a couple of minutes, this should create an ECS task on your new ECS cluster.

Voila, a new active ECS task!

✅ Conclusion

I hope you’ve been able to follow along with this post and that you’ve now successfully deployed your resources. You can give the API Gateway URL to anyone so they can trigger new ECS tasks on your cluster.

Next steps

If you think this is interesting and want to build this project out a little more, there are plenty of directions to take it.

You can always find more solutions and builds on our not build to last blog.

By Mart

Tutorials at nbtl.blog
