Entire serverless API?

I’m investigating the idea of building a product with Lambda and the serverless model, but I’m confused about concurrency. Do people actually build entire serverless APIs? How does this make sense from a concurrent-performance perspective? (See below)

Say I’m building a traditional startup app that responds to CRUD requests in front of a database. Maybe it does some auth and fancy queries too. If I were using last-gen tools, I would write an HTTP web server in my application language, and it could handle thousands of concurrent connections per instance, because my application is mostly IO-bound (database access and web proxying).

In the serverless model, I might throw this web server into a Lambda function, or split it up into multiple Lambdas behind API Gateway. AWS will spin up as many Lambdas as I need and shut them down when I’m not using them, awesome.

But it’s my understanding that AWS only allows one request per Lambda instance at a time. Isn’t this grossly inefficient / expensive? If my Lambdas are running, handling one request after another, spinning their wheels while they wait for IO to complete, isn’t that equivalent to going way back in time, before we had green-thread servers?

What am I missing here?

The requests don’t wait until the previous one is finished. Additional instances of the lambda functions are spun up on demand to cope with the extra load.
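To make that concrete, here’s a toy model (not AWS code, just an illustration): since each instance serves exactly one request at a time, the number of instances Lambda needs at any moment is simply the peak number of overlapping in-flight requests.

```python
def instances_needed(requests):
    """Peak number of overlapping requests, given (start_time, duration)
    pairs. Because one Lambda instance handles one request at a time,
    this is the number of instances AWS would spin up."""
    events = []
    for start, duration in requests:
        events.append((start, 1))                # request starts
        events.append((start + duration, -1))    # request finishes
    events.sort()  # at equal times, -1 sorts first, so an instance frees up before a new one is counted
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

# Three requests whose lifetimes all overlap need three instances:
reqs = [(0.0, 0.3), (0.1, 0.3), (0.2, 0.3)]
print(instances_needed(reqs))  # → 3
```

The key point: an instance blocked on IO doesn’t block other requests, it just means another instance exists in parallel, and you pay for both.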

I imagine there would be a crossover point where it makes more financial sense to have a dedicated web server handling all the requests rather than using Lambda, but AWS has pricing calculators you can use to work this out.
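A back-of-the-envelope version of that calculation might look like the sketch below. The rates are assumptions (roughly the published us-east-1 x86 prices at one point in time); check the AWS pricing calculator for current numbers, and note this ignores free tier, API Gateway, and data transfer.

```python
# Assumed Lambda rates -- verify against the AWS pricing calculator:
PRICE_PER_REQUEST = 0.20 / 1_000_000   # $ per request
PRICE_PER_GB_SECOND = 0.0000166667     # $ per GB-second of compute

def lambda_monthly_cost(requests_per_month, avg_duration_s, memory_gb):
    """Estimate monthly Lambda cost: per-request fee plus billed GB-seconds."""
    request_cost = requests_per_month * PRICE_PER_REQUEST
    compute_cost = (requests_per_month * avg_duration_s
                    * memory_gb * PRICE_PER_GB_SECOND)
    return request_cost + compute_cost

# Example: 5M requests/month, 100 ms average duration, 512 MB functions --
# comes out to a few dollars, far below a dedicated server.
cost = lambda_monthly_cost(5_000_000, 0.1, 0.5)
```

The crossover appears when traffic is high and steady: at that point the per-request compute you’re paying for (including IO wait time) exceeds the flat price of an always-on server.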

What Bearclawstu said is correct. AWS Lambda starts an instance and keeps it alive to accept further requests (pre-warmed). If the existing instances can’t accept more requests, it will start additional instances to handle the load, up to your concurrency limits.