Deep Dive Into AWS Lambda – Part II
Welcome to the second part of this mini-series, which focuses on how to design Lambda functions and how to optimize your code. This article builds on concepts explained in earlier posts. Have a look at „Introduction into serverless computing with AWS Lambda“ and „Deep Dive Into AWS Lambda – Part I“ for a solid overview of the topic.
The Handler Method
When creating a Lambda function, you specify a handler method. This method is generated for you by default and is the entry point for every request served by your function. As parameters, the function receives two objects:
The event object contains details of the event that invoked the Lambda function. For example, the function might receive the data of an HTTP POST request. In Python, the event object is usually a dictionary. Sometimes the type differs and is a list, str, int, float, or None. The content and structure of the event object are determined by how the function is invoked.
def handler_name(event, context):
    ...
    return some_value
The context object provides information about the invocation, the function, and its execution environment. You might need to know how much time is left until the function is terminated by its timeout, or how much memory is allocated to the function. This kind of information and more can be retrieved from the context object. For a full list, consult the official docs.
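As a minimal sketch of how the two parameters might be used, the handler below reads one field from the event and two values from the context. The `name` key and the `FakeContext` class are assumptions made for illustration only; `get_remaining_time_in_millis()` and `memory_limit_in_mb` are members of the Python context object.

```python
def handler(event, context):
    # The event is usually a dict; "name" is a hypothetical key for this sketch.
    name = event.get("name", "world") if isinstance(event, dict) else "world"

    return {
        "message": f"Hello, {name}!",
        # Milliseconds left before the function hits its timeout.
        "remaining_ms": context.get_remaining_time_in_millis(),
        # Memory configured for the function.
        "memory_limit_mb": context.memory_limit_in_mb,
    }


class FakeContext:
    """Stand-in for the real context object, for local runs only."""

    memory_limit_in_mb = 128

    def get_remaining_time_in_millis(self):
        return 3000


result = handler({"name": "Lambda"}, FakeContext())
```

Passing a stand-in context like this is one way to exercise a handler locally without deploying it.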
Whether a Lambda function needs a return statement depends on the invocation model, which was discussed in the first part of the deep dive. If the function is invoked synchronously, the invoking service expects a response to its request to AWS Lambda, so the function has to return a value. Asynchronous invocations, on the other hand, do not wait for a response from Lambda, so a return statement is unnecessary.
Best Practice: Design
A lot of what will be described in this section might sound familiar since it is closely related to the microservice architecture. This is no surprise since AWS Lambda itself is part of a microservice environment and at the same time an integral part of many microservice architectures.
The first important rule is to separate business logic from Lambda logic. The handler method belongs to the latter. You should only use it to extract information from the event and, if needed, from the context. Other functions in your code then process the extracted information. Once processing finishes, a response can be passed back to the handler method, which returns it to the invoking service. This separation enables developers to run unit tests and integration tests against the business logic.
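The separation described above could look roughly like the following sketch. The function name `calculate_total`, the item fields, and the API-Gateway-style event with a JSON `body` are assumptions chosen for illustration.

```python
import json


def calculate_total(items):
    """Business logic: knows nothing about Lambda, so it is easy to unit test."""
    return sum(item["price"] * item["quantity"] for item in items)


def handler(event, context):
    """Lambda logic: extract input from the event and shape the response."""
    # Assumes an event whose "body" is a JSON string (hypothetical structure).
    body = json.loads(event.get("body") or "{}")
    total = calculate_total(body.get("items", []))
    return {"statusCode": 200, "body": json.dumps({"total": total})}
```

Because `calculate_total` takes plain Python data, it can be tested without constructing an event or a context object.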
The next rule is a general clean-code best practice: write single-purpose functions. Do not write functions that do more than one thing. This helps other developers understand and maintain your code.
Lambda functions need to be stateless: no state should be saved within the Lambda function itself. Remember that the code only runs when there is work to be done. If something needs to be persisted, you should use S3 or DynamoDB. Both services scale horizontally and are easy to use. As a rule of thumb, use DynamoDB when you need millisecond latency and your data changes rapidly. If latency and throughput are not critical and the data does not change much, use S3.
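A rough sketch of keeping state outside the function: the helper below writes an item through an object that behaves like a boto3 DynamoDB Table resource. The names `save_order`, `InMemoryTable`, and the `orders` table are hypothetical; only `put_item(Item=...)` mirrors the real boto3 call.

```python
def save_order(table, order):
    """Persist state in an external store instead of in module-level variables."""
    # `table` is expected to behave like a boto3 DynamoDB Table resource,
    # e.g. boto3.resource("dynamodb").Table("orders") ("orders" is hypothetical).
    table.put_item(Item=order)
    return order["order_id"]


class InMemoryTable:
    """Local stand-in for a DynamoDB table, used only in this sketch."""

    def __init__(self):
        self.items = []

    def put_item(self, Item):
        self.items.append(Item)


table = InMemoryTable()
order_id = save_order(table, {"order_id": "42", "status": "new"})
```

Passing the table in as a parameter keeps the function free of hidden state and makes it straightforward to test locally.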
Finally, you should only include the dependencies you need. This is easier said than done but has a significant impact on deployment and on performance in production. Including whole SDKs in your code results in a bigger deployment package and thus longer deployments. Additionally, the function's runtime takes longer to become available (see also cold and warm starts in the first part of the Deep Dive). To control dependencies in your code, consider another one of our blog articles: Dependency Management for AWS Lambda.
Best Practice: Code
When consulting the AWS developer documentation, you will find sections called „Working with …“. Each of these sections gives tips specific to one supported programming language, but they all cover the same six topics:
- Deployment package
The following part will address „Logging, Errors and Tracing“, „Environment Variables“ and „Recursive Code“. If you want to find out more about the other topics, make sure to read the other AWS Lambda related articles.
Logging, Errors, and Tracing
Logging is equally important for developers and members of operating teams. It helps to develop code and also to debug errors. When running a Lambda function, a developer can include output statements in the code which are automatically logged by the service.
For Python, you can include print() statements. But the larger the project, the more modules you have, and the more you need a better-structured way to log information. Also, the project eventually transitions from development, debugging, and testing to production. For these stages, you might want to use different logging levels.
For structured logging in Python, you can always use the logging module, which allows you to:
- Control message level to log only required ones
- Control where to show or save the logs
- Control how to format the logs with built-in message templates
- Know which module the messages are coming from
This module is part of the Python standard library.
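The points above might be put into practice as in the following sketch. The logger name `order_service`, the `LOG_LEVEL` environment variable, and the `records` key are assumptions for illustration; the `logging` calls themselves are standard library.

```python
import logging
import os

# A named logger makes it clear which module a message comes from.
logger = logging.getLogger("order_service")
# The level is read from a hypothetical LOG_LEVEL variable, defaulting to INFO,
# so development and production can log at different levels.
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO"))


def handler(event, context):
    records = event.get("records", [])
    logger.debug("Raw event: %s", event)  # only emitted at DEBUG level
    logger.info("Processing %d record(s)", len(records))
    if not records:
        logger.warning("Event contained no records")
    return {"processed": len(records)}
```

With this setup, switching from verbose debugging output to quieter production logs only requires changing the level, not the code.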
An even better tool (for Python) might be „AWS Lambda Powertools“, which is hosted on GitHub (see the link at the end of this article). It enables you to log information just like the logging module but also includes the libraries for AWS X-Ray. AWS X-Ray is a tracing service that enables you to trace API calls between different services and helps you find bottlenecks within your workflow.
Environment Variables
Another best practice is to use environment variables instead of hardcoding configuration values. Environment variables allow you to modify the function’s behavior without updating your code. By default, the variables are encrypted at rest, which means that you can store sensitive parameters in the environment variables rather than in the code. It is possible to use a key other than the default one, and it is also possible to encrypt the variables on the client before entering them into AWS Lambda.
There are also limitations on environment variables (besides naming requirements):
- Keys must not be reserved by Lambda
- The total size of all environment variables must not exceed 4 KB
A great use case for environment variables is testing. Assume you want to test a function that connects to a production database. For the test, you want it to connect to the testing database instead. Updating the function’s code every time you test is error-prone: imagine you forget to change the parameter back before the function handles production traffic. It is safer to deploy two Lambda functions with identical code but different values for the connection string in the environment variable.
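As a small sketch of this setup, the code below reads the connection string from the environment instead of hardcoding it. The variable name `DB_CONNECTION_STRING` is an assumption for this example, and the actual database connection is left out.

```python
import os


def get_connection_string():
    # DB_CONNECTION_STRING is a hypothetical variable name. Each deployed
    # function (test and production) would carry its own value.
    return os.environ["DB_CONNECTION_STRING"]


def handler(event, context):
    connection_string = get_connection_string()
    # A real function would open a database connection here.
    return {"configured": bool(connection_string)}
```

The same code can then be deployed twice, with only the environment variable differing between the test and production functions.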
As the limitations above indicate, certain environment variable keys are reserved by Lambda. These contain information about the runtime and the function itself. For more information, please visit the developer documentation.
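To illustrate, the sketch below reads three of the reserved variables that Lambda sets (`AWS_REGION`, `AWS_LAMBDA_FUNCTION_NAME`, and `AWS_LAMBDA_FUNCTION_MEMORY_SIZE`). The helper name `describe_runtime` and the `"unknown"` fallback for local runs are assumptions of this example.

```python
import os


def describe_runtime():
    """Collect a few values that Lambda exposes through reserved variables."""
    return {
        "region": os.environ.get("AWS_REGION", "unknown"),
        "function_name": os.environ.get("AWS_LAMBDA_FUNCTION_NAME", "unknown"),
        "memory_mb": os.environ.get("AWS_LAMBDA_FUNCTION_MEMORY_SIZE", "unknown"),
    }
```

Outside of Lambda these variables are usually not set, which is why the sketch falls back to a default instead of raising an error.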
Recursive Code
The shortest and probably most important advice: avoid recursive code. Recursion within a function increases its execution time and can therefore delay your workflow. This is especially dangerous when the function is invoked synchronously. Another form of recursive code is a function that invokes itself through an SDK. This increases the number of concurrent executions very quickly.
If you cannot make your function work without recursive code, include a stopping criterion to avoid accidental blocking or too many concurrent invocations.
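One possible stopping criterion is a depth counter carried in the event, as in the sketch below. The `depth` key, the `MAX_DEPTH` value, and the helper `next_payload` are assumptions for illustration; the actual self-invocation is only indicated in a comment.

```python
MAX_DEPTH = 3  # assumed upper bound on chained self-invocations


def next_payload(event):
    """Return the payload for the next self-invocation, or None to stop."""
    depth = event.get("depth", 0)
    if depth >= MAX_DEPTH:
        return None
    return {**event, "depth": depth + 1}


def handler(event, context):
    payload = next_payload(event)
    if payload is None:
        # Stopping criterion reached: do not invoke again.
        return {"status": "stopped", "depth": event.get("depth", 0)}
    # A real function would re-invoke itself here, for example with
    # boto3.client("lambda").invoke(FunctionName=context.function_name,
    #     InvocationType="Event", Payload=json.dumps(payload)).
    return {"status": "continue", "next": payload}
```

Because the counter travels with the payload, every invocation in the chain can decide on its own whether to stop.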
Thank you for taking the time to learn about best practices in function design and how to organize your code. The upcoming article will focus on concurrency in AWS Lambda and how to monitor your functions.
https://github.com/awslabs/aws-lambda-powertools-python