Very long delay when doing sls remove of Lambda in a VPC

I’ve been using Serverless for about a year now, but this is the first app I’m building that runs inside a VPC so it can access an RDS database. When I run sls remove, CloudFormation hangs for a long time (40 minutes) while deleting the Lambda function, with a message saying:

CloudFormation is waiting for NetworkInterfaces associated with the Lambda Function to be cleaned up..

Is this normal? My non-VPC Serverless projects are removed in under a minute. I’ve tested this several times with the same result. I’m using sls version 1.19.0.


Yes, this is the “new normal”.

VPC-based Lambda functions used to be removed immediately, but the ENI that had been (automatically) allocated so the function could communicate inside the VPC would be orphaned unless you waited a certain (unspecified) amount of time after the function last received traffic. This resulted in lots of orphaned ENIs, and it was easy to quickly hit the default soft limit for ENIs in an account.

The recent change in behaviour (which I didn’t see mentioned anywhere officially, but have heard others report) means that your stack clean-up will take as long as it takes to clean up the associated resources (in this case the ENI).

I would hope that this gets faster in the future, but I don’t think AWS will commit to any timeline or durations (it’s just not their style).

Just to be clear, this has nothing to do with Serverless, and everything to do with VPC-based Lambda functions.

That’s a bummer. I was pretty sure this was an AWS problem, since the delay was happening during the CloudFormation stack delete. Thanks for confirming. I guess we will just have to work around this.

I found a Stack Overflow answer https://stackoverflow.com/questions/35990747/lambda-creating-eni-everytime-it-is-invoked-hitting-limit and an AWS Developer Forums answer https://forums.aws.amazon.com/message.jspa?messageID=734756 that seem to be relevant.
They both say it’s due to the Lambda execution policy lacking the ec2:DeleteNetworkInterface permission.

But my Lambdas have the following permissions and the issue still happens randomly:

- ec2:CreateNetworkInterface
- ec2:DescribeNetworkInterfaces
- ec2:DeleteNetworkInterface

I mitigated the problem by moving all Lambdas that don’t require VPC access out of the VPC.
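
In case it helps anyone, here is a minimal sketch of what that looks like in serverless.yml (the function names, handlers and IDs below are just placeholders): define vpc only on the functions that actually need to reach RDS, so every other function stays outside the VPC.

functions:
  queryDatabase:                     # needs RDS, so it gets VPC config
    handler: handler.queryDatabase
    vpc:
      securityGroupIds:
        - sg-00000000                # placeholder security group ID
      subnetIds:
        - subnet-00000000            # placeholder subnet IDs
        - subnet-11111111
  sendEmail:
    handler: handler.sendEmail       # no vpc block, so no ENIs and no removal delay

Only the functions with a vpc block carry the ENI clean-up cost when the stack is removed.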

I recently had an e-mail exchange with Chris Munns, the Senior Developer Advocate for the AWS Lambda team, and he confirmed there definitely is a 40-minute delay when cleaning up ENIs on Lambda functions in VPCs. He also told me “the 40 minute time issue is being worked on”. So for now we have to live with it, but it appears they are working on getting rid of that delay. He indicated they have helped customers with workarounds, so if you have support that might be an option.

Unfortunately our project requires using a VPC because the Lambda functions need access to an RDS database.

Ironically, Terraform has fixed this issue in their implementation (https://github.com/hashicorp/terraform/issues/5767), while CloudFormation, in its dumbness, has no way of doing it :man_facepalming:.

I wonder if splitting the stack as suggested by https://forums.aws.amazon.com/message.jspa?messageID=734756#jive-message-734756 would help.
It doesn’t make it any easier to delete the whole stack, but it should allow redeploying the stack containing the Lambdas, and adding and removing functions at any time.
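
If anyone tries that route, a rough sketch of what it could look like (the stack and output names below are made up): keep the VPC, subnets and security groups in a separate, long-lived stack that exports their IDs as outputs, and have the Serverless service read them with the ${cf:...} variable syntax, so the function stack never owns the networking resources itself.

provider:
  name: aws
  vpc:
    securityGroupIds:
      - ${cf:shared-network.LambdaSecurityGroupId}   # hypothetical stack and output names
    subnetIds:
      - ${cf:shared-network.PrivateSubnetAId}
      - ${cf:shared-network.PrivateSubnetBId}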

Yeah, I read up on how Terraform handles it. I believe what they do is tear down the whole VPC subnet and recreate it, which has the side effect of deleting the Lambda ENIs.

From the PR associated with the issue in Terraform, they are not deleting the subnet, only the ENIs attached to a Lambda function whose name matches a predefined rule :slight_smile:

Have a look at: https://github.com/hashicorp/terraform/pull/8033/files

I think Serverless could do the exact same thing: that would save a lot of time for all the devs who are forced to put their Lambdas inside these fuck**g VPCs to access RDS.

@cblin

Also very interested in this. We opened a ticket here: https://github.com/serverless/serverless/issues/5008

This is also very painful for our team. We’re using GitLab CI to deploy our Serverless functions, and if we don’t coordinate our deploys to our integration environment we sometimes run into this problem. The CI job will either run for hours or time out. As a workaround I was hoping we could somehow disable the removal of API methods and force them to be cleaned up manually later, or have Serverless fail to deploy without a force flag if removing an API method would occur.

The same for me, I have:

iamRoleStatements:   # under provider in serverless.yml
    - Effect: "Allow"
      Action:
        - "ec2:CreateNetworkInterface"
        - "ec2:DescribeNetworkInterfaces"
        - "ec2:DeleteNetworkInterface"
      Resource: "*"

But it still takes ~40 minutes to delete every time.

As a follow-up, it seems that the recent overhaul of VPC networking for Lambda should fix this. They have started rolling out the changes, and a handful of regions should already see the improvement, but they apparently plan to have the full rollout completed across all regions some time in December.

This should be out in all regions now, but I’m still seeing this issue.

Seconded. I’m still seeing it. Even with really small stacks, if there’s a Lambda in a VPC, it takes FOREVER to delete.

Still seeing this issue on 2020-06-24.

affirmative … 2020-08-25

still seeing the issue on 14 Dec, 2020

Still seeing the issue 08.06.2021.

And now this issue is why we are dropping Serverless altogether in 11/2021.

Way to go dumbs!

Ignore more things.

Nobody has 40 mins to wait for no good reason.

We never see this any more with updated Serverless.

And remove is rare enough that it wasn’t too big a deal anyway.