Can Serverless work with AWS Lambda, DAX and VPCs?

Hi All,

So far I have successfully used Serverless to configure and deploy AWS Lambdas with DynamoDB. I’m now trying to ‘upgrade’ DynamoDB to use DAX. The problem is that DAX needs to run within a VPC, and my Lambda cannot communicate with my DAX cluster.

In short I’m trying to implement something like this - Use Amazon DynamoDB Accelerator (DAX) from AWS Lambda to increase performance while reducing costs | AWS Database Blog - but using Serverless to manage and configure CloudFormation instead.

I searched as best as I could but could not find any directly useful information. Has anyone used Serverless to deploy AWS Lambdas with DAX before?

These links might be useful for others.

Any help will be appreciated!

I haven’t done it myself yet but do plan to eventually. It looks like what you need to do is use the CloudFormation from those articles and convert into the correct yaml you can use in the resources section of your serverless.yml to create your DAX cluster etc.

Maybe someone else has some more complete advice but that’s where I would start.
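As a rough sketch of that approach, the DAX parts of a CloudFormation template could sit under the resources key like this. It is written as a serverless.js object (the framework accepts a JavaScript config as well as serverless.yml) so each key maps one-to-one onto the YAML. The service name, logical IDs, subnet IDs, security group ID, node type and the DaxRole reference are all placeholders I made up for illustration; they are not taken from the linked articles.

```javascript
// serverless.js: sketch only. Every name and ID below is a placeholder.
const serverlessConfig = {
  service: "dax-example", // hypothetical service name
  provider: { name: "aws" },
  resources: {
    Resources: {
      // Tells DAX which VPC subnets it may place cluster nodes in.
      DaxSubnetGroup: {
        Type: "AWS::DAX::SubnetGroup",
        Properties: {
          SubnetGroupName: "dax-example-subnets",
          SubnetIds: ["subnet-REPLACE_ME_A", "subnet-REPLACE_ME_B"],
        },
      },
      DaxCluster: {
        Type: "AWS::DAX::Cluster",
        Properties: {
          ClusterName: "dax-example",
          NodeType: "dax.r4.large",
          ReplicationFactor: 1,
          // Assumes an IAM role with logical ID DaxRole is defined elsewhere
          // in Resources and allows DAX to access the DynamoDB tables.
          IAMRoleARN: { "Fn::GetAtt": ["DaxRole", "Arn"] },
          SubnetGroupName: { Ref: "DaxSubnetGroup" },
          SecurityGroupIds: ["sg-REPLACE_ME"],
        },
      },
    },
  },
};

module.exports = serverlessConfig;
```

This would not deploy as written, since the role and the real subnet and security group IDs still need to be filled in.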

Yup, though the CloudFormation is only one part of the puzzle. Apparently Lambdas don’t like to be confined to a VPC, and additional config is needed to allow them to communicate with the rest of the Internet.

I’ve found separate guides as linked above, and I was wondering whether the Serverless framework had any implementation.

The Serverless Framework is a deployment tool. It just reduces the amount of CloudFormation template you need to write. Anything you can do with CloudFormation you can do with it.

Your biggest problem is that accessing a DAX cluster requires you to deploy your Lambda inside a VPC. That introduces all of the VPC limitations, plus your Lambda will have approximately 10 seconds of additional latency during cold start while the ENI (elastic network interface) is attached.
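For reference, putting the functions inside a VPC is a small piece of framework config. A minimal sketch, again as a serverless.js object mirroring serverless.yml, with a made-up service name, function name and placeholder IDs:

```javascript
// serverless.js: sketch only. Replace the placeholder IDs with real ones.
const serverlessConfig = {
  service: "dax-example", // hypothetical service name
  provider: {
    name: "aws",
    // Set at provider level this applies to every function in the service;
    // the same vpc block can be set on an individual function instead.
    vpc: {
      securityGroupIds: ["sg-REPLACE_ME"],
      subnetIds: ["subnet-REPLACE_ME_A", "subnet-REPLACE_ME_B"],
    },
  },
  functions: {
    getItem: { handler: "handler.getItem" }, // hypothetical function
  },
};

module.exports = serverlessConfig;
```

The security group used here would also need to be allowed to reach the DAX cluster's security group, which this sketch does not set up.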

Hi @laboro18,
I’ve created a VPC to use with my RDS and ES instances. I’ve shared a gist; for any questions, let’s communicate through the gist.

That introduces all of the VPC limitations, plus your Lambda will have approximately 10 seconds of additional latency during cold start while the ENI (elastic network interface) is attached.

Does this mean that using Serverless/Lambda inside a VPC renders Lambdas fairly useless? We do all of our work inside VPCs and I really want to move our next project to Serverless. Do any of the other cloud providers have a better solution for this?

“Useless” is a very strong word. There are definitely uses for Lambda inside a VPC. It really comes down to your use case. I believe that Bustle runs Lambda inside a VPC and it works for them. It wouldn’t work for me because my traffic levels are too sporadic and almost every user would suffer through a cold start.

I have had to deploy Lambda functions within a VPC before, and while it does add a little latency, it’s not on the order of 10 seconds. Usually when people suffer the 10 second latency issue, it’s because context.callbackWaitsForEmptyEventLoop is set to true, which makes the Lambda wait until the database connection times out before it returns.

Being in a VPC can add some time to the cold start, but it’s not 10 seconds. At worst I saw an additional second.

Keep in mind that Lambda != APIG.
The use case that forced me to put “a” Lambda in a VPC was communicating with a financial institution that requires SFTP requests to come from a whitelisted IP (or block of IPs). Since they were unwilling to whitelist every IP that AWS could possibly apply to an outbound connection from RandomLambda_01, I had to put it in a VPC.
But since this particular Lambda isn’t called interactively by an end user, and is instead scheduled or triggered by some other event, it doesn’t really affect anybody if it takes an extra 10 seconds to run.

So yes, “useless” is way too myopic as a blanket statement about a tool. It just may not be a good tool for your use case.

Thank you for this. That makes me feel much better about moving forward on Serverless with a VPC.

I’m thinking my use case for a VPC would be to communicate with a database (RDS) instance that is not publicly accessible. From my experience with AWS the only way to do that is to put everything in a VPC. I would like to move to a more modern architecture with ephemeral workloads on demand, so I don’t know if the same requirements exist yet.

Do you connect to private databases in AWS, and can that be done without a VPC?

I’ve managed to keep everything in this platform in DynamoDB, CloudSearch, and S3, so I haven’t needed to access RDS or EC2 resources.

@garethmcc The 8-10 second additional delay on cold start inside a VPC is well documented and even AWS admits it. The delay is caused by the time it takes to attach an ENI to the Lambda before it can start running your code and has nothing to do with context.callbackWaitsForEmptyEventLoop.

There is a delay. But it is not 10 seconds, and I have not seen any AWS documentation that says as much. From personal experience with Lambdas inside a VPC, called by API Gateway and then communicating with an RDS database, I think the worst delay I saw was 1.5 seconds. This isn’t from reading a blog post but from actually running it. Without a VPC it’s < 1 second, so there is a difference, but it is not 10 seconds.

@garethmcc Can you share how you’re configuring your Lambda and how you’re measuring cold starts?

I no longer work at the company where I did this, so I don’t have access to the code I wrote there. However, it was not very complicated. We had multiple services with Lambdas inside a VPC making SQL requests to multiple RDS databases using the Sequelize ORM. I didn’t need to check whether a specific instance of a Lambda was a cold start because we had no Lambda executions exceed 2 seconds. I do recall these 2 second executions occurred immediately after redeploying changes to a Lambda, which forces a cold start as the new code replaces the old. But our general metrics showed no Lambda executing beyond 2 seconds as a maximum.
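For anyone who does want to tag individual invocations as cold or warm, one common pattern is a module-scope flag. This is only a sketch of that pattern, not what was running at that company, and the handler body is a placeholder.

```javascript
// Module scope runs once per container, so this flag is true only for
// the first invocation handled by a freshly started container.
let isColdStart = true;

async function handler(event, context) {
  const coldStart = isColdStart;
  isColdStart = false;

  const startedAt = Date.now();
  // ... real work (for example a database query) would go here ...
  const durationMs = Date.now() - startedAt;

  // Structured log line so cold and warm invocations can be compared later.
  console.log(JSON.stringify({ coldStart, durationMs }));
  return { coldStart, durationMs };
}

module.exports = { handler };
```

Note that the timer only covers time spent inside the handler. Anything that happens before the handler is invoked, including network interface setup, would have to be measured from the caller's side.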

What we did find was a 10 second stall caused by the previously mentioned context.callbackWaitsForEmptyEventLoop being set to true by default, which makes the Lambda wait for the database connection to drop before executing the callback. This is where I see most people run into a 10 second wait when dealing with an RDS database.

The problem you’re describing does exist and can be fixed as you described, but this isn’t the cold start problem.

This article does a good job of explaining the additional latency for Lambdas inside a VPC.

Interesting read, but I have never seen an effect like that on my Lambdas running in a VPC. Nothing like 10 seconds of latency, and this is a backend service serving thousands of requests a day. All I can say is none of my experience thus far has shown these results. It doesn’t matter what articles you might point to; it doesn’t change the reality I saw on a daily basis for over a year.