Best pattern for working with remote APIs with rate limits

There are a lot of applications out there that are built on top of querying remote APIs with their own rate limits. This makes scaling a bit more problematic, and choosing the right architecture pattern can be rather hard.
I will try to describe the problem as best I can.
Imagine we have some API with:
a daily limit of 1000 calls per client (per API key, for example)
a concurrent limit of around 100 calls per second (we will never know the exact number)

What architecture should one choose to make 3000 requests? The daily limit is straightforward: we just store that state somewhere and that’s it. The question is how to deal with the concurrent limit. Use a queue? Resort to a monolithic app, launched once, so that it can always monitor the requests it has sent?
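To make the two limits concrete, here is a minimal sketch of how they could be tracked. Everything in it is an assumption for illustration: the class names `DailyQuota` and `TokenBucket` are made up, the state lives in process memory (separate serverless invocations would need a shared store instead), and the token bucket is just one common way to approximate a per-second limit.

```python
import time


class DailyQuota:
    """Counts calls per calendar day (in memory here; a shared store in practice)."""

    def __init__(self, limit, today=lambda: time.strftime("%Y-%m-%d", time.gmtime())):
        self.limit = limit
        self.today = today  # injectable so the day can be faked in tests
        self.day = today()
        self.used = 0

    def try_consume(self):
        current = self.today()
        if current != self.day:  # a new day started: reset the counter
            self.day, self.used = current, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


class TokenBucket:
    """Allows about `rate` calls per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so time can be faked in tests
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller would check `quota.try_consume()` and `bucket.try_acquire()` before each request and wait or re-queue the work when either returns `False`. Since the real per-second limit is unknown, the `rate` would have to be a conservative guess.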

Basically I want to understand the best practice for building something like but on a bigger scale.

The best (and recommended) way is to get API rate limits raised from the provider.

If you can’t, this lib can help work around some of those limits

Hi David!

Thanks for the answer.
So you are saying that I either have to get the limits raised (which is cool) or, basically, have several API keys and use them with node-token-dealer, again raising the limits in a cheeky way, right?
But this does not tackle the core problem: however large my limits are, I can always end up in a situation where several of my serverless functions hit the temporary limit. This is more a question of control and management. I guess your idea is to have one serverless function (one for each ‘API key’) that does all the communication, right?

The thing is, I thought that serverless really shines when each function performs some kind of atomic action.
Like function = “fetch one request” vs. function = “fetch all requests”.
This is what I want to discuss :slight_smile:

As an example, think of a queue architecture where a serverless function takes one item off the queue and makes an API call based on it. Or is this a bad idea?
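Roughly, the "one item per function" idea could look like the sketch below. It is only an illustration under assumptions: `handle_one`, `call_api` and the `RateLimited` exception are hypothetical names, and an in-memory `queue.Queue` stands in for whatever managed queue would actually feed the functions.

```python
import queue


class RateLimited(Exception):
    """Hypothetical error raised when the remote API rejects a call."""


def handle_one(jobs, call_api):
    """Process exactly one queued job; put it back if the API is throttling."""
    try:
        job = jobs.get_nowait()
    except queue.Empty:
        return None  # nothing to do
    try:
        return call_api(job)
    except RateLimited:
        jobs.put(job)  # leave the job for a later invocation
        return None
```

With this shape each function stays atomic, and the overall request rate is controlled by how quickly jobs are released from the queue rather than by the functions themselves.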

You gotta bump up your API limits with whatever the API provider is to avoid rate limiting.

Lambda (by default) is limited to 1000 concurrent invocations, unless you ask AWS to raise your limits.

This means 1000 functions can all call the same API with the same key and you could see rate limit exceptions.

You need to handle these exceptions in your function code and add in retry/failover logic.
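A minimal sketch of such retry logic, assuming the API client raises some exception when it is throttled. `RateLimitError` and `call_with_retry` are made-up names for illustration, and exponential backoff with random jitter is one common choice, not the only one.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error raised when the API reports that the limit was hit."""


def call_with_retry(call, max_attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Run `call`, retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller fail over
            # Double the ceiling each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random wait so that many functions do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

The jitter matters when many functions hit the limit at the same moment: without it they would all retry at the same moment too.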

Or be “sneaky” and try rotating keys. Even then you will want retry/failover logic.
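Key rotation with failover could be sketched like this. It is a hand-rolled illustration, not how node-token-dealer works internally; `call_with_key_rotation` and `RateLimitError` are hypothetical names, and `call` is assumed to take the API key as its only argument.

```python
class RateLimitError(Exception):
    """Hypothetical error raised when a key has hit its limit."""


def call_with_key_rotation(call, api_keys):
    """Try each key in turn; re-raise the last error if every key is throttled."""
    last_error = None
    for key in api_keys:
        try:
            return call(key)
        except RateLimitError as error:
            last_error = error  # this key is exhausted, fail over to the next
    if last_error is None:
        raise ValueError("api_keys must not be empty")
    raise last_error
```

If every key is throttled the error still reaches the caller, which is why retry logic is needed on top of the rotation.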

Hi David!

OK, gotcha. I was hoping there would be some magic to remove the step where the code has to use retry/failover logic, but it is good to know that there is no such trick.