Our Experience with AWS Lambda

We were very excited to learn about Apple’s new tvOS platform, bundled together with the new Apple TV. As soon as we found some time, we got our hands dirty and started working on a simple but useful idea: a movie trailer app that shows random trailers from new releases on IMDb. The project was simple enough, but it still required a backend service for the trailers: a single-endpoint API that would serve random trailers to the tvOS app.

The obvious approach for such a small API was to build a simple Flask app and deploy it on one of our existing servers. However, that would still mean dealing with database settings, supervisor settings, monitoring, and so on. All of that work for a single-endpoint API seemed like overkill. It was the perfect occasion to try out Amazon’s new rapid development tool: Lambda.

What is Lambda?

AWS Lambda is a platform where you create functions that run in their own managed environment. They don’t need a dedicated server; they run only when they are invoked, and you pay only for the computation time your code actually uses.

Why is Lambda remarkably innovative?

In order to see Lambda’s potential, we should analyse the lifecycle of a web-based API. It consists of two main tasks:

1 - Building the API: Creating the architecture and coding the project

2 - Running the API: Making sure that the code runs on a computer for a given time period. This includes lots of small details like:

  • installing required packages,
  • ensuring there is sufficient disk space,
  • monitoring CPU and memory usage levels,
  • logging errors, etc...

In our opinion, real cloud technology should remove every task that is not related to the creation of the API. Today, Amazon’s EC2 service is like renting a car: you do not care about the retail price of the car, taxes, paperwork, or maintenance, but you still have to make sure the car has enough fuel and deal with every issue you face while driving it. Sound familiar? We do not care about the server's purchase price, the electricity bill, or hardware failures, but we are responsible for everything else.

We should not have to deal with these problems in the future. We shouldn’t worry about setting up RAID arrays to protect our data from a disk failure. We shouldn’t worry about spinning up new instances when the existing ones are overloaded. We should spend all our time creating better software architecture, not responding to load-spike alerts at 3 in the morning and instance-retirement warning emails. Lambda is a great milestone towards that dream. We don’t want to rent a car, we simply want to "Uber it" everywhere.

How does it work?

Under the hood, Amazon uses container technology to run functions as fast as possible. From the user’s perspective, though, everything is magic. You just write a function in the dashboard or upload it as a zip file (including the required libraries), then sit back and relax. Your function will run whenever it is invoked.

Basically you can invoke a Lambda function in two ways:

1 - Amazon API Gateway

You can create an API endpoint to trigger a function using Amazon API Gateway. When someone makes a request, it will invoke a Lambda function.

2 - Amazon SNS

You can send notifications from other AWS services (S3, Elastic Transcoder, etc.) or from your own servers to trigger a function. Lambda can act as a smart hub between different services and orchestrate them.

For example, an iOS app can upload videos directly to an S3 bucket. S3 then sends a notification with the file details to a Lambda function, which can update the database or start the video encoding process by submitting a job to Elastic Transcoder.
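
As a rough sketch of that flow, the handler below reacts to an S3 upload notification and submits a transcoding job. The pipeline and preset IDs are hypothetical placeholders, not values from our project:

import boto3

transcoder = boto3.client('elastictranscoder')

PIPELINE_ID = '0000000000000-abcdef'  # hypothetical pipeline ID
PRESET_ID = '1351620000001-000010'    # a system preset (check the IDs in your console)

def lambda_handler(event, context):
    # S3 event notifications carry one or more records describing the uploaded object
    for record in event['Records']:
        key = record['s3']['object']['key']
        # Kick off an Elastic Transcoder job for the uploaded video
        transcoder.create_job(
            PipelineId=PIPELINE_ID,
            Input={'Key': key},
            Output={'Key': 'encoded/' + key, 'PresetId': PRESET_ID}
        )
    return 'Started %d job(s)' % len(event['Records'])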

Hands-on example

Imagine that we already have a database of trailers in DynamoDB, with the following fields (a sample item is sketched after the list):

  • imdb_id
  • poster_url
  • movie_name
  • genres
  • video_url
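
To make the structure concrete, here is how a single item could be written into that table with boto3. The field values are made up for illustration:

import boto3

movies = boto3.resource('dynamodb').Table('movies')

# A hypothetical trailer record matching the fields listed above
movies.put_item(Item={
    'imdb_id': 'tt1392190',
    'poster_url': 'https://example.com/posters/mad-max.jpg',
    'movie_name': 'Mad Max: Fury Road',
    'genres': ['Action', 'Adventure'],
    'video_url': 'https://example.com/trailers/mad-max.mp4'
})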

Let’s make an API endpoint that serves those movies in a RESTful way.

Assuming the data table is ready in DynamoDB, we will create a function that connects to it and returns the results in a sensible format. There are many blueprints available in the Amazon console which are quite useful for understanding different scenarios. However, for the sake of this blog post we will skip the blueprints and start from scratch.

We start by giving a name and a description to our Lambda function. Then we select our coding language. In our example, we will be using Python.

This is where we will code the Lambda function. You can also write the code in your favourite editor and upload it as a zip file. You will notice that, although we didn’t choose a blueprint, there is already some code in place: Amazon tries to simplify things by displaying a "hello world" function by default. Let's leave it as it is for now and focus on the rest of the screen.

Here is the important part: we have to give our Lambda function access permissions. As we will connect to DynamoDB, we have to create a new role based on "Basic with DynamoDB". When you select that, a new wizard opens that lets you create a role with DynamoDB access.
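
For reference, the kind of access such a role ends up granting looks roughly like the sketch below, expressed here as a Python dict. The exact policy document Amazon generates may differ:

# Rough sketch of the permissions behind a "Basic with DynamoDB" role:
# CloudWatch Logs for the function's own logging, plus DynamoDB access.
ASSUMED_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Allow the function to write its logs
            "Effect": "Allow",
            "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {   # Allow the function to read and write DynamoDB tables
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:Scan",
                       "dynamodb:PutItem", "dynamodb:UpdateItem", "dynamodb:DeleteItem"],
            "Resource": "*"
        }
    ]
}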

You can also change the memory size and the timeout period. You can allocate up to 1,536 MB of memory and set the timeout as high as 5 minutes. Since you pay by computation time and memory, these two settings can seriously impact your bill. If you set the timeout too high, erroneous code (e.g. an infinite loop) can keep your function running for minutes while you are billed for all of that time.

The last setting is the VPC setting, which lets your Lambda function access your virtual private cloud. For example, you can connect your Lambda function to a database server running on an EC2 instance.

Our function is ready to be tested. Clicking the Test button opens a pre-filled test data screen; just click Next to run it. If everything goes smoothly, we will see the results below:

The upper box displays the value that the function returned, and the lower box displays the console logs that we can analyse. Everything we printed during execution can be seen here.

Noticed something in the log output?

Duration: 0.33 ms

Billed Duration: 100 ms

Although our little function took just 0.33 ms to run, it is billed as 100 ms, because the smallest billing increment is 100 ms. Even if your function finishes in less than 100 ms, it is charged as a full 100 ms.
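
As a back-of-the-envelope illustration, the billed duration cost of a single invocation can be estimated as below. The per-GB-second price is an assumption for illustration only; check the current Lambda pricing page (there is also a small per-request charge we ignore here):

import math

PRICE_PER_GB_SECOND = 0.00001667  # assumed price, for illustration only
MEMORY_MB = 128

def estimate_cost(duration_ms):
    # Lambda rounds the duration up to the next 100 ms before billing
    billed_ms = math.ceil(duration_ms / 100.0) * 100
    gb_seconds = (MEMORY_MB / 1024.0) * (billed_ms / 1000.0)
    return gb_seconds * PRICE_PER_GB_SECOND

print(estimate_cost(0.33))     # billed as 100 ms
print(estimate_cost(1119.91))  # billed as 1200 ms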

Let’s focus on the function.

from __future__ import print_function  
import json

print('Loading function')

def lambda_handler(event, context):  
    #print("Received event: " + json.dumps(event, indent=2))
    print("value1 = " + event['key1'])
    print("value2 = " + event['key2'])
    print("value3 = " + event['key3'])
    return event['key1'] # Echo back the first key value
    #raise Exception('Something went wrong')

Lambda will look for the lambda_handler function within the Python file and execute it with two arguments (event and context).

event is the data payload that is sent to the function when it is triggered. context carries information about the AWS environment and the invocation, such as the Cognito identity of the caller. For now, our API will not require any input, so we can clean up the function a bit.

import json

def lambda_handler(event, context):
    return "Hello world!"

There we go! Much simpler than the previous example.
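
As an aside, the context object we just stopped using does expose some useful metadata. A small sketch of what you could log from it (we did not need any of this for the trailers API):

def lambda_handler(event, context):
    # A few of the attributes exposed by the context object
    print("Function name: " + context.function_name)
    print("Request ID: " + context.aws_request_id)
    print("Memory limit (MB): " + str(context.memory_limit_in_mb))
    print("Time remaining (ms): " + str(context.get_remaining_time_in_millis()))
    return "ok"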

Let’s connect to the DB and get some data! The boto3 library is always included within the environment, so we can directly start using it by importing. We don’t need to set up another environment for it.

import boto3
import json

def lambda_handler(event, context):
    # Connect to the "movies" table and fetch its contents
    dynamo = boto3.resource('dynamodb').Table("movies")
    db_response = dynamo.scan()
    # scan() returns a dict; the matching records live under the "Items" key
    return db_response['Items']

Duration: 1119.91 ms

Billed Duration: 1200 ms

The response coming from DynamoDB is already quite usable, so we will not reshape it for now. It is time to let others reach our Lambda function! We have to create an API endpoint that triggers it; in other words, when someone calls that endpoint, our function will run and return the data. Let's save our function and move to the API Gateway interface.

Creating an API endpoint

API Gateway is a bit more complex than the Lambda interface.

Let's create a new API.

We should choose a name and a short description; neither will be visible to users, so they are only for your own reference.

First, we start by creating a resource: The resource will be the name of your endpoint. We are using "trailers" as the resource name.

Now we need to add a method to this resource. The method can be GET, POST or another common HTTP verb. As we are making a list page, our method will obviously be GET.

Here comes the tricky part: do not forget to click on that little checkmark in order to create the method.

Now we have to connect this endpoint to a Lambda function. First, you need to choose a region, and then enter the name of the function. Surprisingly, the function name autocompletes here. Thanks Amazon!

Click "Save" and you are done. Now we just need to deploy the API.

Now we have a base API URL!

https://opvv77i806.execute-api.us-east-1.amazonaws.com/prod

Let's add the resource name to the end:

https://opvv77i806.execute-api.us-east-1.amazonaws.com/prod/trailers

And call this URL
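
If you prefer code to a browser, here is a quick way to poke at it from your own machine (Python 3 standard library only; the URL is the one we just deployed):

import json
from urllib.request import urlopen

url = 'https://opvv77i806.execute-api.us-east-1.amazonaws.com/prod/trailers'

# Call the endpoint and parse the JSON list returned by the Lambda function
trailers = json.loads(urlopen(url).read())
print("Got %d trailers" % len(trailers))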

Enjoy your new API that works without a server.

Final Words

Although the dashboard and documentation are still a bit immature, we really liked both the philosophy behind Lambda and its performance. We started using it both for serving simple APIs and for orchestrating other AWS services. It is a great tool for production projects as well as hack-day projects. We definitely recommend giving it a try and seeing for yourself.