AWS Tips 08/2020

A great global user experience requires a content delivery network (CDN) to minimize latency. CloudFront is the CDN offered by AWS. It features programmable edge compute provided by Lambda@Edge.

As I mentioned in the previous AWS Tips 06/2020, Lambda@Edge functions are interesting. Let me show you a basic use case for edge compute functions.

S3 and Origin Access Identity

Content can be served from S3 to the CloudFront CDN either via the traditional S3 HTTP endpoint or using Origin Access Identity. Origin Access Identity provides better access control: the bucket can stay private while CloudFront accesses the S3 objects directly.

There is one edge case that must be handled when using Origin Access Identity. Requests going to a folder path such as /example/ won't automatically serve the /example/index.html S3 object, because no such redirect or alias exists. The result of such a request is a 404 error returned from S3 to CloudFront and on to the client.

This problem can be solved with a simple Lambda@Edge function triggered by the origin request. CloudFront allows you to attach Lambda@Edge functions to four stages of request and response processing:

  1. viewer-request - triggered when CloudFront receives the HTTP request from the client.
  2. origin-request - triggered on a cache miss, when CloudFront forwards the client request to the distribution's origin.
  3. origin-response - triggered when the origin returns the HTTP response to CloudFront.
  4. viewer-response - triggered when CloudFront returns the HTTP response to the client.
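
To make the four stages more concrete, here is a minimal sketch of how a function could tell which stage invoked it. It reads the eventType field from the config part of the event. The two events are hypothetical and heavily trimmed for illustration; real Lambda@Edge events carry many more fields, and describe_stage is a helper name made up for this example.

```python
def describe_stage(event):
    """Return the trigger stage and the objects available at that stage."""
    cf = event['Records'][0]['cf']
    stage = cf['config']['eventType']  # e.g. 'origin-request'
    # Request events carry only the request; response events carry both.
    available = [key for key in ('request', 'response') if key in cf]
    return stage, available


# Hypothetical, trimmed events for illustration only.
origin_request_event = {'Records': [{'cf': {
    'config': {'eventType': 'origin-request'},
    'request': {'uri': '/example/', 'method': 'GET'},
}}]}
origin_response_event = {'Records': [{'cf': {
    'config': {'eventType': 'origin-response'},
    'request': {'uri': '/example/', 'method': 'GET'},
    'response': {'status': '200'},
}}]}

print(describe_stage(origin_request_event))
print(describe_stage(origin_response_event))
```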

Back to the /example/ folder 404 problem. The origin-request stage is ideal here: it runs only on a cache miss, and it lets us detect a URI path ending with / and rewrite it from the non-existent /example/ to /example/index.html before the HTTP request hits S3.

Basic Lambda@Edge rewrite

Consider the following Lambda@Edge Python 3 code that rewrites the URI as described. The handler function takes the CloudFront request object, checks the last character of the URI path and conditionally appends index.html to it.

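def handler(event, context):
    # CloudFront passes the request object inside the first event record.
    request = event['Records'][0]['cf']['request']
    # Folder-style paths end with '/'; point them at the folder's index.html.
    if request['uri'].endswith('/'):
        request['uri'] += 'index.html'
    # Returning the request lets CloudFront continue on to the origin.
    return request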
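
The handler can be exercised locally before deployment. The sketch below repeats the same rewrite logic and feeds it a few hand-built events. The make_event helper is hypothetical and builds only the fields the handler reads; a real origin-request event is much larger.

```python
def handler(event, context):
    request = event['Records'][0]['cf']['request']
    if request['uri'].endswith('/'):
        request['uri'] += 'index.html'
    return request


def make_event(uri):
    """Build a trimmed, hypothetical origin-request event for local testing."""
    return {'Records': [{'cf': {'request': {'uri': uri, 'method': 'GET'}}}]}


# Folder paths get index.html appended; object paths pass through untouched.
for uri in ('/example/', '/example/index.html', '/'):
    rewritten = handler(make_event(uri), None)['uri']
    print(uri, '->', rewritten)
```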

Regex Lambda@Edge rewrite

Another version of this function, which may not be as fast, uses a regular expression to rewrite the URI path in a similar fashion. I've included this version because it is also useful for other types of URI rewrites.

import re

def handler(event, context):
    request = event['Records'][0]['cf']['request']
    # Replace a trailing '/' with '/index.html'; other paths are unchanged.
    request['uri'] = re.sub(r'/$', '/index.html', request['uri'])
    return request
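
As one illustration of such other rewrites, the sketch below also handles folder requests without the trailing slash, such as /example. It assumes that any path whose last segment contains no dot names a folder. That assumption is mine, not part of the function above, and it would break for object keys that have no file extension. The rewrite_uri helper name is also made up for this example.

```python
import re

# Assumption: a last path segment without a dot (e.g. /example) names a folder.
EXTENSIONLESS = re.compile(r'/[^/.]+$')


def rewrite_uri(uri):
    """Map folder-style URIs to their index.html object key."""
    if uri.endswith('/'):
        return uri + 'index.html'
    if EXTENSIONLESS.search(uri):
        return uri + '/index.html'
    return uri


def handler(event, context):
    request = event['Records'][0]['cf']['request']
    request['uri'] = rewrite_uri(request['uri'])
    return request
```

Keep in mind that the browser still sees /example as the page URL, so relative links inside the page may resolve differently than they would under /example/. Redirecting to the trailing-slash form may be the better choice in that case.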

The easiest way to deploy this function is through the SAM tools.

Here's a useful serverless stack snippet for Lambda@Edge functions. The Lambda@Edge function resource is deployed with AutoPublishAlias, which publishes the numbered function Version that Lambda@Edge requires. Place the above rewrite function code in code_directory/app.py. SAM will find it, package it and deploy it.

YourLambdaFunction:
   Type: AWS::Serverless::Function
   Properties:
      CodeUri: code_directory
      AutoPublishAlias: YourFunctionAlias
      Role: !GetAtt FunctionRole.Arn
      Runtime: python3.7
      Handler: app.handler
      Timeout: 5

And finally, the default cache behavior of the CloudFront distribution resource is configured for origin-request, triggering the function before the request is sent to the origin. The association can be placed on a non-default cache behavior instead if you have multiple CloudFront distribution origins.

DefaultCacheBehavior:
   LambdaFunctionAssociations:
   - EventType: origin-request
     LambdaFunctionARN: !Ref YourLambdaFunction.Version

CloudWatch Logs Expiration

CloudWatch Logs have a 5GB free tier. Many customers want to stay within this limit to keep their AWS bill low. Speaking from experience, it is easy to max out the free tier with a few Lambda functions logging their messages. By default, AWS Lambda functions write to CloudWatch Log Groups with no expiration, so these logs are never deleted. The trick to staying in the free tier is to set the log expiration directly in your CloudFormation or SAM stack with the following snippet. It creates an AWS::Logs::LogGroup resource and configures the log retention period.

LambdaLogGroup:
  Type: AWS::Logs::LogGroup
  Properties:
    LogGroupName: !Sub '/aws/lambda/${YourLambdaFunction}'
    RetentionInDays: 30
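
Log groups that Lambda already created outside of a stack keep their "never expire" setting. As a rough sketch, the same 30-day retention could be applied to them with the CloudWatch Logs API. The function below expects a boto3 CloudWatch Logs client passed in by the caller; the set_default_retention name, the prefix and the default of 30 days are my own choices for illustration.

```python
def set_default_retention(logs_client, prefix='/aws/lambda/', days=30):
    """Apply a retention policy to log groups that currently never expire."""
    updated = []
    paginator = logs_client.get_paginator('describe_log_groups')
    for page in paginator.paginate(logGroupNamePrefix=prefix):
        for group in page['logGroups']:
            # Groups without an expiration have no 'retentionInDays' key.
            if 'retentionInDays' not in group:
                logs_client.put_retention_policy(
                    logGroupName=group['logGroupName'],
                    retentionInDays=days,
                )
                updated.append(group['logGroupName'])
    return updated
```

Assuming boto3 is installed and credentials are configured, it could be called as set_default_retention(boto3.client('logs')). It returns the names of the log groups it changed and leaves groups that already have a retention policy alone.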