I've been using Serverless for about a year now, but I'm building my first app that runs inside a VPC so it can access an RDS database. When I run sls remove, CloudFormation hangs on deleting the Lambda function for a long time (40 minutes) with a message saying:
CloudFormation is waiting for NetworkInterfaces associated with the Lambda Function to be cleaned up..
Is this normal? My non-VPC Serverless projects are removed in under a minute. I’ve tested this several times with the same result. I’m using sls version 1.19.0.
VPC-based Lambda functions used to be removed immediately, but the ENI that had been (automatically) allocated so the function could communicate inside the VPC would be orphaned unless you had waited a certain (unspecified) amount of time after the function last received traffic. This resulted in lots of orphaned ENIs, and it was easy to quickly hit the default soft limit on ENIs in an account.
The recent change in behaviour (which I haven't seen mentioned anywhere officially, but had heard others report) means that your stack clean-up will take as long as it takes to clean up the associated resources (in this case the ENI).
I would hope that this gets faster in the future, but I don’t think AWS will commit to any timeline or durations (it’s just not their style).
Just to be clear, this has nothing to do with Serverless, and everything to do with VPC-based Lambda functions.
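If you want to see what CloudFormation is actually waiting on, you can list the ENIs that Lambda allocated in your account. Here's a rough boto3 sketch; the assumption that Lambda-managed ENIs carry a description starting with "AWS Lambda VPC ENI" is just what I'd check first, so verify the filter against your own account.

```python
# Sketch: list the ENIs Lambda created in this account/region, to see what
# CloudFormation is waiting to have cleaned up.
# Assumption: Lambda-managed ENIs have a description beginning with
# "AWS Lambda VPC ENI" -- verify this in your own account.
import boto3

ec2 = boto3.client("ec2")  # uses your default region/credentials

paginator = ec2.get_paginator("describe_network_interfaces")
pages = paginator.paginate(
    Filters=[{"Name": "description", "Values": ["AWS Lambda VPC ENI*"]}]
)

for page in pages:
    for eni in page["NetworkInterfaces"]:
        print(eni["NetworkInterfaceId"], eni["Status"], eni["Description"])
```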
That's a bummer. I was pretty sure this was an AWS problem since the delay was happening during the CloudFormation stack delete. Thanks for confirming. I guess we will just have to work around this.
I recently had an e-mail exchange with Chris Munns, who is the Senior Developer Advocate for the AWS Lambda team, and he confirmed there definitely is a 40 minute delay when cleaning up ENIs on Lambda functions in VPCs. He also told me "the 40 minute time issue is being worked on". So for now we have to live with it, but it appears they are working on getting rid of that delay. He indicated they have helped customers with workarounds, so if you have an AWS support plan that might be an option.
Unfortunately our project requires using a VPC because the Lambda functions need access to an RDS database.
Yeah, I read up on how Terraform handles it. I believe what they do is tear down the whole VPC subnet and recreate it, which has the side effect of deleting the Lambda ENIs.
From the PR associated with the issue in Terraform, they are not deleting the subnet, only the ENIs attached to a Lambda function whose name matches a predefined rule.
I think Serverless could do the exact same thing; that would save a lot of time for all the devs who are forced to use these fuck**g VPCs inside a Lambda to access RDS.
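For anyone who wants to try that workaround by hand today, below is a rough boto3 sketch of the same idea: find the ENIs Lambda created for a given function and delete them so the stack delete doesn't stall. The description pattern and the function name used here are assumptions (I haven't checked the exact rule in the Terraform PR), so inspect what the filter returns before deleting anything.

```python
# Rough sketch of the Terraform-style clean-up: delete the ENIs that Lambda
# created for a specific function. The description filter and the function
# name below are assumptions -- check what the filter matches before deleting.
import boto3

ec2 = boto3.client("ec2")
function_name = "my-service-dev-hello"  # hypothetical function name

enis = ec2.describe_network_interfaces(
    Filters=[{"Name": "description", "Values": [f"AWS Lambda VPC ENI*{function_name}*"]}]
)["NetworkInterfaces"]

for eni in enis:
    attachment = eni.get("Attachment")
    if attachment:
        # An attached ENI has to be detached before it can be deleted.
        ec2.detach_network_interface(AttachmentId=attachment["AttachmentId"], Force=True)
        # Detaching isn't instant; wait until the ENI reports "available".
        ec2.get_waiter("network_interface_available").wait(
            NetworkInterfaceIds=[eni["NetworkInterfaceId"]]
        )
    ec2.delete_network_interface(NetworkInterfaceId=eni["NetworkInterfaceId"])
    print("deleted", eni["NetworkInterfaceId"])
```

Even then, EC2 can refuse the detach or delete while Lambda still considers the ENI in use, in which case you're back to waiting it out.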
This is also very painful for our team. We're using GitLab CI to deploy our Serverless functions, and if we don't coordinate our deploys to our integration environment we sometimes run into this problem: the CI job will either run for hours or time out. As a workaround, I was hoping we could somehow disable the removal of API methods and force them to be cleaned up manually later, or have Serverless fail to deploy without a force flag if the removal of an API method would occur.
As a follow-up to this issue, it seems that the recent overhaul of VPCs in relation to Lambda should fix it. They have started rolling out the changes and a handful of regions should already see the improvement, but they apparently plan to have the full rollout completed across all regions some time in December.