As I understand it, there is a limit of 1000 concurrently executing Lambda functions per region at any given point in time.
Based on my current application (not serverless), I expect to scale to an average of 100-400 web req/sec.
In my mind, each web request will involve maybe 5-10 Lambda calls through API Gateway.
That way, I will reach the concurrency limit pretty soon, and I will not be able to scale any further.
So either I don’t understand the concurrency limit well, or my design is flawed, i.e. I should drastically reduce the number of functions needed to serve a web request (but then maybe each one will execute for longer?).
I’m trying to migrate an application to serverless and I’m struggling to understand whether it’s the right choice.
I’m not sure what you mean by this: “In my mind, each web request will have maybe 5-10 lambda calls through api gateway.” What are these 5-10 calls?
In a typical serverless web app, each web request would be served by exactly 1 synchronous Lambda function invocation (triggered by API Gateway).
So if you have 400 reqs/sec hitting APIGW and let’s say your average Lambda execution time is 0.1 seconds, you would on average have 400 × 0.1 = 40 concurrent Lambda executions under this peak load.
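To make that arithmetic easy to replay with your own numbers, here is a rough sketch in Python. The function name and the invocations_per_request parameter are made up for illustration, and it assumes a steady request rate with invocations that do not overlap in any special way, so treat the result as a back-of-the-envelope average and not a guarantee about bursts.

```python
def estimated_concurrency(requests_per_second: float,
                          avg_duration_seconds: float,
                          invocations_per_request: int = 1) -> float:
    """Average number of Lambda executions in flight at once.

    This is just arrival rate multiplied by average duration.
    invocations_per_request is a hypothetical knob for designs where one
    web request fans out into several Lambda calls.
    """
    invocations_per_second = requests_per_second * invocations_per_request
    return invocations_per_second * avg_duration_seconds


if __name__ == "__main__":
    # The case described above: 400 reqs/sec, 0.1 s each, 1 call per request.
    print(estimated_concurrency(400, 0.1))

    # The pessimistic case from the question: 10 Lambda calls per web request.
    print(estimated_concurrency(400, 0.1, invocations_per_request=10))
```

Even with 10 calls per web request, the estimate comes to about 400 concurrent executions at 0.1 s each, which is still under the 1000 figure mentioned in the question. Average duration matters as much as request rate here: the same load with 1-second executions would need roughly ten times the concurrency.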