Generating text from audio using Amazon Transcribe and AWS Lambda

S3 - Lambda - Transcribe architecture

In this example, we will build an AWS Lambda function in Python that listens to an S3 bucket for audio uploads and automatically transcribes them using Amazon Transcribe. The whole application will be built using the AWS CDK.

Requirements

Make sure that your AWS CLI is set up correctly with the profile that you want to use. The example code for Amazon Transcribe with the Lambda function can be found in the linked GitHub repository.

Architecture

Once finished, we will have deployed the following architecture to your AWS account.

We will have deployed:

  • 2 Amazon S3 buckets (billed by storage used)
  • 1 AWS Lambda function (billed per invocation)

AWS Architecture Example

Getting started

This is where the actual fun begins: let’s start the project.

Creating the CDK project and setting up the stack

First, we need to set up the CDK project and add all the required resources to the stack.

Initializing new CDK project

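To begin developing our architecture, set up a new CDK project: create an empty folder and run cdk init sample-app inside it, passing --language with the language you want to write the stack in.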

Create a folder and generate the CDK sample app

Adding the S3 buckets

Next up, we want to add the two S3 buckets to the stack. The default Bucket construct is enough for this:

Adding bucket for our audio and transcript files
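
As a minimal sketch, assuming CDK v2 with the Python bindings, the two buckets could be declared as shown here. The stack name TranscribeStack and the construct IDs AudioBucket and TranscriptsBucket are placeholder names chosen for illustration, not names taken from the repository.

```python
from aws_cdk import Stack, aws_s3 as s3
from constructs import Construct


class TranscribeStack(Stack):
    """Hypothetical stack holding the two buckets used in this example."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Bucket that receives the uploaded audio files.
        self.audio_bucket = s3.Bucket(self, "AudioBucket")

        # Bucket that will hold the finished transcripts.
        self.transcripts_bucket = s3.Bucket(self, "TranscriptsBucket")
```

Both buckets use the construct's defaults here; settings such as encryption or a removal policy can be added through keyword arguments on s3.Bucket.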

Note: You could also use a single S3 bucket for both the audio and the transcripts. In that case, restrict the trigger to a prefix or suffix so that an uploaded transcript does not invoke the function again. For illustration purposes, we use two buckets.

Creating the AWS Lambda function and adding the S3 trigger

The Lambda function passes our audio file to Amazon Transcribe through the AWS SDK and receives a transcript in return. We’ll program the Lambda function in the next section, so let’s first create the CDK resources for the function and the S3 trigger:

We’re using the experimental PythonFunction construct to easily allow external dependencies.

Here, PythonFunction lets us bundle the requests library, which we use to download the file. In case you haven’t used this construct before, you can find more information in this post: ☁️ Using external libraries in your Python AWS Lambda in AWS CDK
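
A sketch of what the function and trigger could look like, assuming CDK v2 in Python with the aws-cdk.aws-lambda-python-alpha package installed. The helper name add_transcribe_function, the lambda directory, the index.py file, the TRANSCRIPTS_BUCKET environment variable and the five-minute timeout are all assumptions made for this illustration.

```python
from aws_cdk import Duration, aws_lambda as lambda_, aws_s3 as s3
from aws_cdk import aws_s3_notifications as s3n
from aws_cdk.aws_lambda_python_alpha import PythonFunction
from constructs import Construct


def add_transcribe_function(
    scope: Construct,
    audio_bucket: s3.IBucket,
    transcripts_bucket: s3.IBucket,
) -> PythonFunction:
    """Create the Lambda function and trigger it on uploads to the audio bucket."""
    fn = PythonFunction(
        scope,
        "TranscribeFunction",  # hypothetical construct ID
        entry="lambda",  # assumed folder containing index.py and requirements.txt
        index="index.py",
        handler="handler",
        runtime=lambda_.Runtime.PYTHON_3_12,
        # Lambda's default timeout is 3 seconds, too short to wait for a job.
        timeout=Duration.minutes(5),
        environment={"TRANSCRIPTS_BUCKET": transcripts_bucket.bucket_name},
    )

    # Invoke the function for every object created in the audio bucket.
    audio_bucket.add_event_notification(
        s3.EventType.OBJECT_CREATED, s3n.LambdaDestination(fn)
    )
    return fn
```

PythonFunction installs whatever is listed in the requirements.txt inside the entry folder, which is how a library like requests ends up in the deployment package.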

Additional IAM permissions

We will need to add some extra permissions to our Lambda function’s IAM role. This is because reading from and writing to S3 buckets is not allowed by default, and neither is calling Transcribe. Let’s create the required permissions and add them to the Lambda function like so:

Creating and adding the required IAM permissions for our Lambda function
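
One way to express these permissions, again sketched for CDK v2 in Python. The helper name grant_transcribe_permissions is hypothetical, and the exact set of actions is an assumption based on the calls this tutorial makes (starting a job and reading its status); the original stack may grant more or fewer.

```python
from aws_cdk import aws_iam as iam, aws_lambda as lambda_, aws_s3 as s3


def grant_transcribe_permissions(
    fn: lambda_.IFunction,
    audio_bucket: s3.IBucket,
    transcripts_bucket: s3.IBucket,
) -> None:
    """Give the function's role access to both buckets and to Transcribe."""
    # Transcribe reads the audio with the caller's credentials,
    # so the function's role needs read access to the audio bucket.
    audio_bucket.grant_read(fn)

    # The function writes the finished transcript to the second bucket.
    transcripts_bucket.grant_write(fn)

    # Allow starting a transcription job and checking its status.
    fn.add_to_role_policy(
        iam.PolicyStatement(
            actions=[
                "transcribe:StartTranscriptionJob",
                "transcribe:GetTranscriptionJob",
            ],
            resources=["*"],
        )
    )
```

Using the buckets' grant methods keeps the S3 permissions scoped to these two buckets instead of all buckets in the account.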

Programming the AWS Lambda function in Python

We now have all the infrastructure required. This means that we can get to actually programming this thing.

Reading the file from S3

First of all, we will need to read the file location. We don’t actually need to download the file, as Transcribe can read it directly from S3. This is great because it means the size of the audio file is not limited by the Lambda function’s memory or temporary storage.

Reading the bucket and key from the event source
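
A minimal sketch of this step, using only the standard library. The function name extract_s3_location is an assumption for illustration; the event layout is the standard S3 notification format, in which object keys arrive URL-encoded.

```python
import urllib.parse


def extract_s3_location(event: dict) -> tuple[str, str]:
    """Return (bucket, key) for the first record of an S3 event notification."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Keys are URL-encoded in the event, e.g. a space arrives as "+".
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    return bucket, key
```

Inside the Lambda handler this would be called with the event argument, for example bucket, key = extract_s3_location(event).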

Calling to Amazon Transcribe

Next up is the Amazon Transcribe SDK invocation. This tells Amazon Transcribe to get to work and start a job. Transcription jobs run asynchronously, so the function has to wait for the job to finish. Once it has, you get an overview of the complete job returned. The most interesting part for us is the Amazon S3 presigned URL that contains the transcription in JSON format.

Sending the transcription job to Amazon Transcribe
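
The following sketch starts a job and polls until it is done. It takes the Transcribe client as a parameter (in the Lambda function that would be boto3.client("transcribe")); the function name transcribe_audio, the en-US default language, the generated job name and the polling limits are assumptions for illustration.

```python
import time
import uuid


def transcribe_audio(transcribe_client, bucket, key, language_code="en-US",
                     poll_seconds=5.0, max_polls=120):
    """Start a transcription job and return the transcript file URI once done."""
    # Job names must be unique within the account, so a random suffix is used.
    job_name = f"transcription-{uuid.uuid4()}"
    transcribe_client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": f"s3://{bucket}/{key}"},
        LanguageCode=language_code,
    )

    # The job runs asynchronously, so poll its status until it finishes.
    for _ in range(max_polls):
        response = transcribe_client.get_transcription_job(
            TranscriptionJobName=job_name
        )
        job = response["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job["Transcript"]["TranscriptFileUri"]
        if status == "FAILED":
            raise RuntimeError(f"Job {job_name} failed: {job.get('FailureReason')}")
        time.sleep(poll_seconds)

    raise TimeoutError(f"Job {job_name} did not finish after {max_polls} polls")
```

Polling inside the function keeps the example in one place, but it means the Lambda timeout has to cover the whole transcription; for long recordings a separate trigger on job completion may be a better fit.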

Download the job results

Because we are using the experimental PythonFunction construct, we can use the requests library to download the file.

Downloading the Job results from S3 and saving them locally
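
A sketch of the download step. Note that it swaps in urllib.request from the standard library in place of requests, so that it runs without extra dependencies; the function name download_job_results and the default path under /tmp (the writable directory in Lambda) are assumptions for illustration.

```python
import shutil
import urllib.request


def download_job_results(transcript_uri: str,
                         local_path: str = "/tmp/transcript.json") -> str:
    """Download the transcript JSON from the given URI and save it locally."""
    # Stream the response body straight into the local file.
    with urllib.request.urlopen(transcript_uri) as response, \
            open(local_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return local_path
```

With requests, the same step would fetch the URI with requests.get and write the response's content attribute to the file.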

Parse JSON file and fetch first transcript

With the job results stored locally, we can now take a look at the transcript. To do so, we need to parse the JSON file and read the first entry of its transcripts list:

Parse the results and fetch the transcript from the JSON file
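
A minimal sketch of the parsing step, assuming the standard Transcribe output layout in which the text lives under results, then transcripts. The function name read_first_transcript is hypothetical.

```python
import json


def read_first_transcript(local_path: str) -> str:
    """Return the first transcript text from a Transcribe result file."""
    with open(local_path, encoding="utf-8") as json_file:
        job_results = json.load(json_file)
    # "transcripts" is a list; the first entry holds the full text.
    return job_results["results"]["transcripts"][0]["transcript"]
```

The same file also contains an items list with per-word timings and confidence scores, which this sketch ignores.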

Upload the transcript to the transcripts bucket

The final step is to upload the transcript to your transcripts bucket.

Creating a file and storing it in S3
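
The upload could be sketched as follows. The S3 client is passed in as a parameter (in the Lambda function that would be boto3.client("s3")); the function name upload_transcript and the choice to name the object after the audio key with a .txt suffix are assumptions for illustration.

```python
def upload_transcript(s3_client, bucket: str, audio_key: str, transcript: str) -> str:
    """Store the transcript text in the given bucket and return its object key."""
    # Derive the transcript's key from the audio file's key (naming is an assumption).
    transcript_key = f"{audio_key}.txt"
    s3_client.put_object(
        Bucket=bucket,
        Key=transcript_key,
        Body=transcript.encode("utf-8"),
        ContentType="text/plain; charset=utf-8",
    )
    return transcript_key
```

Writing the text directly with put_object avoids creating a temporary file first; uploading a local file instead would work equally well.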


Conclusion

I hope you’ve been able to follow along and that you’ve just received your first Amazon Transcribe transcription. This example can be found on my personal GitHub, which should be linked above. If you have any questions, feel free to drop a comment and I’ll get to them when I can.

By Mart

Tutorials at nbtl.blog