In this post we use AWS CDK to build a system that scales up and down based on the number of messages in an SQS queue. Users make REST API calls to an Amazon API Gateway endpoint from their applications or computers, which adds a new item to the SQS queue. In turn, this triggers your task on ECS. After your task is finished it deletes the item from the SQS queue, which automatically scales your ECS tasks back down.
Voila, a low cost autoscaling solution for your high intensity compute jobs.
🎓 At the end of this post
When you have finished this post, you will have built the following:
- A CDK project that deploys your resources to AWS
- A deployment step that bundles your Python code into a Docker image and pushes it to ECR
- An SQS queue that spins up ECS tasks based on the number of messages in the queue
- An API Gateway endpoint that users can call to put new messages in the queue
- Python code that prints a Hello World message
Together, these items result in the following architecture: API Gateway adds messages to the SQS queue, a CloudWatch alarm on the queue size scales the ECS service, and the ECS task runs the Docker image stored in ECR.
📝 Requirements – Level 200
This post requires some fundamental knowledge of the AWS platform and Dockerized development. We will not describe in detail how Docker or ECS works. We therefore advise that you have:
- Installed and configured Docker
- Activated your AWS account
- Used CDK before, or walked through the beginner workshop
💼 Use cases
High-compute tasks like automated video generation at minimal cost
This architecture lets users trigger high-compute tasks without keeping resources running continuously. Generating videos, for example, takes a lot of computing resources, and keeping that running all the time becomes very expensive very fast. With this setup, users can just trigger an endpoint and forget about it until the job is done.
🧠 Tutorial
In this chapter we are walking through the different steps required for creating this architecture.
1 Creating a new CDK project
The very first thing we will need to do is generate a new CDK project for us to work in. We will use the CDK sample application for now, to get a head start.
To begin, open a terminal window wherever you want to store your project and create a new folder in which you generate a new CDK project:
mkdir docker-ecs-sqs-auto-scaling
cd docker-ecs-sqs-auto-scaling
cdk init sample-app --language typescript
2 Adding the SQS queue
You will notice that there is already an SQS queue defined in your CDK application, in the stack file under ~/docker-ecs-sqs-auto-scaling/lib/ (cdk init names this file after your project folder), which we can use.
You will find that it also added an SNS topic, which you can remove for now. Keep only the queue and define it as follows; the scaling alarm that monitors the number of available messages in the queue is added in step 7:
const messageQueue = new sqs.Queue(this, 'DockerEcsSqsAutoScalingQueue', {
  visibilityTimeout: cdk.Duration.seconds(300)
});
As always, verify your changes by deploying your resources with cdk deploy.
3 Creating the API Gateway that will add messages to the SQS queue
The API Gateway will be the entry point for our users to add messages to the SQS queue. You can add multiple destinations for your API Gateway but in our case we want it to direct requests to our SQS queue.
In order for us to create an API Gateway, we need to have the following:
- An IAM role that can be assumed by the API Gateway
- An inline policy that will be added to the role above that allows API Gateway to send messages to our SQS queue
- The REST API
- A GET method on the /queue resource that passes the message to the SQS queue (i.e. the endpoint)
When you have CDK’s help on your side, the above becomes the following:
// import * as iam from "@aws-cdk/aws-iam";
// import * as apigateway from "@aws-cdk/aws-apigateway";
// import the above in the top of your file
const credentialsRole = new iam.Role(this, "Role", {
  assumedBy: new iam.ServicePrincipal("apigateway.amazonaws.com"),
});
credentialsRole.attachInlinePolicy(
  new iam.Policy(this, "SendMessagePolicy", {
    statements: [
      new iam.PolicyStatement({
        actions: ["sqs:SendMessage"],
        effect: iam.Effect.ALLOW,
        resources: [messageQueue.queueArn],
      }),
    ],
  })
);
const api = new apigateway.RestApi(this, "Endpoint", {
  deployOptions: {
    stageName: "run",
    tracingEnabled: true,
  },
});
const queue = api.root.addResource("queue");
queue.addMethod(
  "GET",
  new apigateway.AwsIntegration({
    service: "sqs",
    path: `${cdk.Aws.ACCOUNT_ID}/${messageQueue.queueName}`,
    integrationHttpMethod: "POST",
    options: {
      credentialsRole,
      passthroughBehavior: apigateway.PassthroughBehavior.NEVER,
      requestParameters: {
        "integration.request.header.Content-Type": `'application/x-www-form-urlencoded'`,
      },
      requestTemplates: {
        "application/json": `Action=SendMessage&MessageBody=$util.urlEncode("$method.request.querystring.message")`,
      },
      integrationResponses: [
        {
          statusCode: "200",
          responseTemplates: {
            "application/json": `{"done": true}`,
          },
        },
      ],
    },
  }),
  { methodResponses: [{ statusCode: "200" }] }
);
As always, verify your changes by deploying your resources with cdk deploy.
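To make the request template in this step more concrete, here is a minimal Python sketch of the form body it builds from the message query parameter. The function name build_send_message_body is hypothetical, and quote_plus is only a stand-in for $util.urlEncode, so the exact encoding of some characters may differ from what API Gateway sends.

```python
from urllib.parse import quote_plus


def build_send_message_body(message: str) -> str:
    """Approximate the form-encoded body that the integration posts to SQS."""
    # Assumption: quote_plus stands in for API Gateway's $util.urlEncode.
    return f"Action=SendMessage&MessageBody={quote_plus(message)}"


# Example: the body for a request to /queue?message=render video 42
body = build_send_message_body("render video 42")
print(body)
```

The point of the template is that the caller only supplies a message; API Gateway turns it into the SendMessage call, so clients never need SQS credentials.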
4 Creating the ECS cluster
Now that we have created the SQS queue and the API Gateway that feeds it, we need to create the cluster that hosts our tasks.
Unfortunately, it isn’t as simple as just creating the cluster, as we also need some extra foundational AWS resources. For example, for tasks in the VPC's private subnets to pull the Docker image from the ECR repository, they need a route out of the VPC, which we provide with NAT. The code below uses a t3.nano NAT instance for this. This is one of the parts of this solution that will cost money continuously throughout the month.
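// import * as ec2 from "@aws-cdk/aws-ec2";
// import * as ecs from "@aws-cdk/aws-ecs";
// import the above in the top of your file
const natGatewayProvider = ec2.NatProvider.instance({
  instanceType: new ec2.InstanceType("t3.nano"),
});
const vpc = new ec2.Vpc(this, "FargateVPC", {
  natGatewayProvider,
  natGateways: 1,
});
const cluster = new ecs.Cluster(this, "Cluster", { vpc });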
As always, verify your changes by deploying your resources with cdk deploy.
5 Adding the ECR repository and automatically deploying your Docker image
You’re doing great and this setup is almost finished already. It’s one of the great advantages of CDK that you don’t need that many lines of code to declare your resources.
To deploy our code on the ECS cluster we need to create a Task Definition, which contains the information for the task that will run on ECS. This covers the regular settings like CPU and memory limits, but also which containers to run from which registry. Lastly, we will add a Service to the cluster, which in turn references the Task Definition.
// Create a task role that will be used within the container
const EcsTaskRole = new iam.Role(this, "EcsTaskRole", {
  assumedBy: new iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
});
EcsTaskRole.attachInlinePolicy(
  new iam.Policy(this, "SQSAdminAccess", {
    statements: [
      new iam.PolicyStatement({
        actions: ["sqs:*"],
        effect: iam.Effect.ALLOW,
        resources: [messageQueue.queueArn],
      }),
    ],
  })
);
// Create task definition
const fargateTaskDefinition = new ecs.FargateTaskDefinition(
  this,
  "FargateTaskDef",
  {
    memoryLimitMiB: 4096,
    cpu: 2048,
    taskRole: EcsTaskRole
  }
);
// Create a log driver that sends container output to CloudWatch Logs
const logging = new ecs.AwsLogDriver({
  streamPrefix: "myapp",
});
// Create container from local `Dockerfile`
const appContainer = fargateTaskDefinition.addContainer("Container", {
  image: ecs.ContainerImage.fromAsset("./python-app", {}),
  logging,
});
// Create service; it starts with zero tasks and is scaled up in step 7
const service = new ecs.FargateService(this, "Service", {
  cluster,
  taskDefinition: fargateTaskDefinition,
  desiredCount: 0,
});
As always, verify your changes by deploying your resources with cdk deploy.
6 Creating a local Python app that will get deployed to ECS using Docker
The ECS Task Definition we created above expects a folder called ./python-app which contains our Docker definition. This Docker image is what will get deployed as our ECS task, so let’s create it:
mkdir python-app
cd python-app
touch Dockerfile
touch app.py
touch requirements.txt
Open up the Dockerfile in your favorite IDE and let’s add some contents to it. The only thing we need is a simple Python image that runs one file, so we can keep it as short as the following:
FROM python:3.6
USER root
WORKDIR /app
ADD . /app
RUN pip install --trusted-host pypi.python.org -r requirements.txt
CMD ["python", "app.py"]
Next, you can fill the app.py file in this folder with anything you like.
For the easiest approach, a single print("Hello World.") is enough for now. Alternatively, you can read the messages from SQS and delete them from the queue accordingly.
If you ever need extra dependencies, you can add them to the requirements.txt file; for now (with the first option) we don’t have any. As an example of the alternative option, you can use the following app.py (and add boto3 to your requirements.txt):
import boto3

sqs = boto3.client('sqs')
queue_url = 'https://sqs.eu-central-1.amazonaws.com/.....'  # <-- Add your SQS url from the AWS console


def read_sqs():
    # Receive up to 10 messages; returns an empty list when the queue is empty
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=10
    )
    return response.get('Messages', [])


def delete_sqs_message(receipt_handle):
    print(f"Deleting message {receipt_handle}")
    # Delete received message from queue
    sqs.delete_message(
        QueueUrl=queue_url,
        ReceiptHandle=receipt_handle
    )


# Read SQS
messages = read_sqs()
print(f"Found messages {messages}")
for message in messages:
    # Take custom actions based on the message contents
    print(f"Activating {message}")
    print("Said Hello")
    # Delete Message
    delete_sqs_message(message['ReceiptHandle'])
    print(f"Finished for {message}")
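If you want to try the receive, handle and delete cycle locally before deploying, the following sketch may help. It is not part of the app above: drain_queue and InMemoryQueue are hypothetical names, and InMemoryQueue is only a stand-in that mimics the two boto3 SQS client calls the app uses, so no AWS account is needed to run it.

```python
def drain_queue(client, queue_url, handle):
    """Receive, handle and delete messages until no more are returned."""
    processed = 0
    while True:
        response = client.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=10)
        messages = response.get("Messages", [])
        if not messages:
            return processed
        for message in messages:
            handle(message["Body"])
            # Only delete after the work succeeded, so failed work is retried
            client.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
            processed += 1


class InMemoryQueue:
    """Hypothetical local stand-in for the boto3 SQS client (testing only)."""

    def __init__(self, bodies):
        self._messages = {f"handle-{i}": body for i, body in enumerate(bodies)}

    def receive_message(self, QueueUrl, MaxNumberOfMessages=1):
        batch = list(self._messages.items())[:MaxNumberOfMessages]
        if not batch:
            return {}  # like SQS, omit the Messages key when the queue is empty
        return {"Messages": [{"Body": b, "ReceiptHandle": h} for h, b in batch]}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self._messages.pop(ReceiptHandle, None)


seen = []
count = drain_queue(InMemoryQueue(["a", "b", "c"]), "local-test", seen.append)
print(count, seen)
```

Passing the client in as a parameter is a design choice for this sketch: in the container you would pass boto3.client('sqs') and the real queue URL instead of the stand-in.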
7 Create the Step scaling so your ECS tasks scale based on SQS available messages
In this final step we add the CDK code that creates the step scaling of your ECS tasks based on the SQS queue size. We will create both a scale-out rule (increasing the number of tasks) and a scale-in rule (decreasing the number of tasks) for our CloudWatch alarms.
To do so, we create two scaling steps for our ECS service in a policy called QueueMessagesVisibleScaling, which scales on the ApproximateNumberOfMessagesVisible metric of the queue created above. It is actually really simple:
// import * as autoscaling from "@aws-cdk/aws-applicationautoscaling";
// import the above in the top of your file
// Configure task auto-scaling
const scaling = service.autoScaleTaskCount({
  minCapacity: 0,
  maxCapacity: 1,
});
// Setup scaling metric and cooldown period
scaling.scaleOnMetric("QueueMessagesVisibleScaling", {
  metric: messageQueue.metricApproximateNumberOfMessagesVisible(),
  adjustmentType: autoscaling.AdjustmentType.CHANGE_IN_CAPACITY,
  cooldown: cdk.Duration.seconds(300),
  scalingSteps: [
    { upper: 0, change: -1 }, // no visible messages: remove a task
    { lower: 1, change: +1 }, // one or more visible messages: add a task
  ],
});
And that is it, all done! For the very last time, verify your changes by running cdk deploy in your terminal to deploy your latest resources to the cloud.
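To reason about how the two scaling steps from this step interact with the capacity limits, here is a simplified Python model. The function next_task_count is hypothetical, and the model deliberately ignores the cooldown period and CloudWatch alarm evaluation timing, so treat it as a sketch of the rule rather than of the service's real-time behavior.

```python
def next_task_count(current: int, visible_messages: float,
                    min_capacity: int = 0, max_capacity: int = 1) -> int:
    """Apply the two scaling steps once and clamp to the capacity limits."""
    if visible_messages <= 0:
        change = -1  # step { upper: 0, change: -1 }
    elif visible_messages >= 1:
        change = +1  # step { lower: 1, change: +1 }
    else:
        change = 0   # between the two steps nothing happens
    return max(min_capacity, min(max_capacity, current + change))


# Walk through a queue that fills up and then empties again
tasks = 0
history = []
for visible in [0, 2, 1, 0, 0]:
    tasks = next_task_count(tasks, visible)
    history.append(tasks)
print(history)
```

With maxCapacity set to 1 the service never runs more than one task, however many messages are waiting; raising that limit is how you would let several tasks work through the queue in parallel.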
8 Calling your API Gateway endpoint to trigger your ECS task
Now that all the infrastructure is in place it is time to call the API Gateway to see the magic in action. You can find your API Gateway endpoint URL either in your terminal after running cdk deploy or in the AWS console under the API Gateway service.
With that, you can add a message to your SQS queue by sending a GET request to the queue resource with a message query parameter, for example: https://thmjsgwe1l.execute-api.eu-central-1.amazonaws.com/run/queue?message=test
After this, you should see the messages in your SQS queue increase on the AWS console to 1 (or more) available messages.
In a couple of minutes, this should create an ECS task on your new ECS cluster.
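If you would rather trigger the endpoint from a script than from a browser, a minimal Python sketch could look like this. The helper names build_trigger_url and trigger are hypothetical, and the abc123 host is a placeholder, so replace the stage URL with the one printed by cdk deploy.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_trigger_url(stage_url: str, message: str) -> str:
    """Build the /queue URL with a URL-encoded message query parameter."""
    query = urlencode({"message": message})
    return f"{stage_url.rstrip('/')}/queue?{query}"


def trigger(stage_url: str, message: str) -> str:
    """Send the GET request and return the response body as text."""
    with urlopen(build_trigger_url(stage_url, message), timeout=10) as response:
        return response.read().decode("utf-8")


# Placeholder stage URL (assumption); only the URL is built here, nothing is sent
url = build_trigger_url("https://abc123.execute-api.eu-central-1.amazonaws.com/run", "test")
print(url)
```

Calling trigger with your real stage URL should return the {"done": true} body defined in the integration response.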
✅ Conclusion
I hope you’ve been able to follow along with this post and that you’ve now successfully deployed your resources. You can give that API Gateway URL to anyone for them to be able to trigger new ECS tasks on your cluster.
Next steps
If you think this is interesting and want to build out this project a little more, you can do any of the following:
- Build out a React project using Amplify so that you can let users trigger that endpoint from a URL
- Create a low cost automatic video generation platform based on this solution (to be released)
You can always find more solutions and builds on our not build to last blog.