Using one-time upload URLs in AWS with Memcached

Frederique Retsema

In this blog I will show how you can use the SAM (Serverless Application Model) to get a presigned upload URL to AWS S3 that can be used exactly once [1]. In AWS it is possible to use a presigned URL to upload files, but the URL is valid for a specified duration and can be used multiple times. This doesn’t mean, however, that one-time upload URLs cannot be implemented in AWS.

Memcached

In the previous blog I presented a solution that used DynamoDB [3]. Its advantage is that it can be set up very quickly. The downside is that when you upload many files, DynamoDB can become quite expensive. It can even become more expensive than paying for an in-memory database [2].

DynamoDB stores its data on persistent storage. We only need to store the data for a few seconds, so we can just as well keep it in memory.

This brings me to ElastiCache Memcached.

The architecture is very similar to the one in the previous blog, which used DynamoDB [3]. In this blog I will mostly talk about the differences between DynamoDB and ElastiCache. The most important difference is that DynamoDB is serverless and ElastiCache isn’t. This means that we have to create a VPC and that the two Lambda functions that use ElastiCache also have to run in this VPC. The VPC doesn’t need a connection to the internet: the first Lambda function is called by the API Gateway and the second Lambda function is called from within AWS S3. We do need a VPC Endpoint to get access to S3 from the VPC.

As before, I use the Serverless Application Model (SAM) as an improvement on CloudFormation. To set up the network, I use a separate CloudFormation template for the VPC, the subnets and the security group.

In SAM it is impossible to refer to a public S3 URL to get the template from. This is different from CloudFormation, where this is possible. The resource type is now AWS::Serverless::Application and the template is kept in the AWS Serverless Application Repository, which also makes it easier to share this solution with you.

S3 Endpoint

I used an S3 endpoint service name to connect the Lambda function to S3. The name is different for each region. To get the correct service name for your region, you can use the following command:

Windows: aws ec2 describe-vpc-endpoint-services --region eu-west-1 | findstr /i "s3"
Linux: aws ec2 describe-vpc-endpoint-services --region eu-west-1 | grep -i "s3"

You will see the service name in the results, in my case com.amazonaws.eu-west-1.s3.
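The same lookup can be sketched in Python. This is a minimal sketch, not code from the repository: the function names find_s3_service_names and fetch_service_names are made up for this illustration, and the sample list only stands in for the real response. The boto3 call describe_vpc_endpoint_services needs boto3 and AWS credentials, so it is kept in a separate function that is not called here.

```python
def find_s3_service_names(service_names):
    """Return the names that end in '.s3' (the S3 endpoint service)."""
    return [name for name in service_names if name.endswith(".s3")]


def fetch_service_names(region):
    """Ask EC2 for the endpoint service names (needs boto3 and credentials)."""
    import boto3  # imported here so the filter above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.describe_vpc_endpoint_services()["ServiceNames"]


# Sample data standing in for fetch_service_names("eu-west-1")
sample = [
    "com.amazonaws.eu-west-1.dynamodb",
    "com.amazonaws.eu-west-1.s3",
    "com.amazonaws.eu-west-1.sqs",
]
print(find_s3_service_names(sample))  # ['com.amazonaws.eu-west-1.s3']
```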

Connecting to Memcached

When you search for documentation about Memcached, you will find that its interface is a plain-text protocol that you can try out via telnet [4]. When you connect via

telnet memcached-dns-name 11211

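and then issue commands like the following (the line after add is the value itself, which must be exactly 5 bytes long; hello is just an example value)

add newkey 0 60 5
hello
quit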

you can then quit the session, start a new session and retrieve the same key (and a nonexistent key) by using the commands:

get newkey
get nonexistent
quit
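To make the fields of the add command more concrete, here is a minimal Python sketch that builds the same bytes that telnet would send. The function names and the value hello are assumptions for this illustration. In "add newkey 0 60 5", the 0 is the flags field, 60 is the expiry time in seconds and 5 is the length of the value in bytes. The value itself follows on its own line.

```python
import socket


def build_add(key, value, ttl_seconds, flags=0):
    # Layout: add <key> <flags> <exptime> <bytes> CRLF <data> CRLF
    header = f"add {key} {flags} {ttl_seconds} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key):
    # Layout: get <key> CRLF
    return f"get {key}\r\n".encode("ascii")


def send_command(host, command, port=11211):
    """Send one command and return the first chunk of the reply.

    Not called in this sketch, because it needs a reachable Memcached node.
    """
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(command)
        return conn.recv(4096)


print(build_add("newkey", b"hello", 60))  # b'add newkey 0 60 5\r\nhello\r\n'
print(build_get("nonexistent"))           # b'get nonexistent\r\n'
```

Memcached answers STORED when add succeeds and NOT_STORED when the key already exists. A get for an existing key returns a VALUE line, the data and END. A get for a nonexistent key returns only END.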

Fortunately, there is a code example for Lambda functions that gets the same result with a Python library [5]. A full description of memcache_client can be found at [6].
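How the one-time behaviour can follow from add and delete is sketched below. This is a minimal sketch under assumptions, and the code in the repository may differ: the names register_upload and claim_upload, the class InMemoryMemcached and the 30 second expiry are all made up for this illustration. InMemoryMemcached is only a stand-in so that the example runs without a cluster. It mimics the add(key, value, expire, noreply) and delete(key, noreply) calls as they exist in, for example, the pymemcache client.

```python
import time


class InMemoryMemcached:
    """Stand-in for a real client; mimics only add() and delete()."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expiry time or None)

    def _alive(self, key):
        item = self._items.get(key)
        if item is None:
            return False
        if item[1] is not None and item[1] <= self._clock():
            del self._items[key]  # expired: behave as if it never existed
            return False
        return True

    def add(self, key, value, expire=0, noreply=None):
        if self._alive(key):
            return False  # Memcached would answer NOT_STORED
        expiry = self._clock() + expire if expire else None  # 0 = no expiry
        self._items[key] = (value, expiry)
        return True

    def delete(self, key, noreply=None):
        if not self._alive(key):
            return False  # Memcached would answer NOT_FOUND
        del self._items[key]
        return True


URL_TTL_SECONDS = 30  # assumption; the real value is defined in the repository


def register_upload(cache, filename):
    """First Lambda: remember that an upload URL was handed out."""
    return cache.add(filename, "1", expire=URL_TTL_SECONDS, noreply=False)


def claim_upload(cache, filename):
    """Second Lambda: only the first upload within the expiry time wins."""
    return cache.delete(filename, noreply=False)


now = [0.0]  # fake clock, so the example does not have to sleep
cache = InMemoryMemcached(clock=lambda: now[0])

register_upload(cache, "myfile1.txt")
print(claim_upload(cache, "myfile1.txt"))  # True: first upload is accepted
print(claim_upload(cache, "myfile1.txt"))  # False: second upload is rejected

register_upload(cache, "myfile2.txt")
now[0] += 40                               # wait too long before uploading
print(claim_upload(cache, "myfile2.txt"))  # False: the key has expired
```

The sketch uses delete instead of a get followed by a delete, because a single Memcached command is atomic: when the same file is uploaded twice at almost the same time, only one of the two deletes can succeed.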

Play along

As usual, you can find all code in a GitHub repository [1]. You can use package.yaml to deploy the application. Mind that this time the template can only be deployed in the Ireland region (eu-west-1). This is because the network stack refers to the AWS Serverless Application Repository, which is a regional service within AWS. After deploying the template, go to CloudFormation: you can find the one-time signed URL in the Outputs section of the deployed stack. The following scripts are available in the client directory to try out the different situations:

  • upload.py: normal upload, will succeed
  • upload2x.py: will upload the same file twice
  • upload3x.py: will upload the same file three times
  • upload_with_delay.py: will wait a little bit too long (40 seconds) between getting the upload URL and uploading the file

All these scripts take the same parameters. The first parameter is the URL to get the signed upload URL. The second parameter is the API key (see the previous blog for how to get it [3]). The last parameter is the file you want to upload. For example:

python upload.py https://2lmht3jedj.execute-api.eu-west-1.amazonaws.com/Prod/getpresignedurl r2aVK769WH8IQse0cs5A17hbNKkqUEVK1tJ4hKFr myfile1.txt
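How a client could handle these three parameters is sketched below. This is a minimal sketch and not the code of upload.py: the function names are made up, and it assumes that the signed URL is requested with a GET request. What is certain is that API Gateway expects the API key in the x-api-key header. The format of the response is not assumed here, so the sketch stops after building the request.

```python
import urllib.request


def parse_args(argv):
    """Split the command line into API URL, API key and filename."""
    if len(argv) != 4:
        raise SystemExit("usage: python upload.py <api-url> <api-key> <file>")
    return argv[1], argv[2], argv[3]


def build_url_request(api_url, api_key):
    """Build the request for a signed URL (assumption: a GET request)."""
    # API Gateway reads the API key from the x-api-key header
    return urllib.request.Request(api_url, headers={"x-api-key": api_key})


# Placeholder values, only to show how the pieces fit together
api_url, api_key, filename = parse_args(
    ["upload.py", "https://example.com/Prod/getpresignedurl", "my-api-key", "myfile1.txt"]
)
request = build_url_request(api_url, api_key)
print(request.get_method(), request.full_url, filename)
```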

Next time…

You might have wondered why we use DynamoDB and Memcached in the first place. Why not use S3 with versioning? When the version is one, the file is processed. When the version is anything else, it isn’t processed.

In the next blog, I will show a solution based on this idea. In this solution we don’t need to store the filename between the first two Lambda functions. There are, however, also some caveats in this solution.

I will also look back on this series and explain the advantages and disadvantages of each of the three solutions.

This series…

This is the second blog in a series of three. I solved the problem by using the following AWS services:

  • DynamoDB (previous blog, see [3])
  • Memcached (this blog)
  • S3 versioning (next blog)

Links

[1] GitHub repository: https://github.com/FrederiqueRetsema/AMIS-Blog-AWS , directory OneTimeUploadUrlMemcached

[2] ElastiCache pricing: https://aws.amazon.com/elasticache/pricing/, DynamoDB pricing: https://aws.amazon.com/dynamodb/pricing/

[3] Previous blog “Using one-time upload URLs in AWS using DynamoDB”: https://technology.amis.nl/aws/using-one-time-upload-urls-in-aws-using-dynamodb/

[4] Memcached cheat sheet: https://lzone.de/cheat-sheet/memcached

[5] Example code for ElastiCache memcached: https://docs.aws.amazon.com/lambda/latest/dg/services-elasticache-tutorial.html

[6] Description of memcache_client: http://mixpanel.github.io/memcache_client/

[7] This article describes how you can deploy your SAM template: https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-template-publishing-applications.html
