I have a ‘post registration process’ that does some offline processing.
There, it pages 70,000 JSON items from an external API in batches of 250 and puts them on a Kinesis stream. (I'm using Kinesis as it's the fastest way of putting large amounts of data onto some type of queue; SNS and SQS have smaller limits when pushing messages onto them.)
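For concreteness, the producer side amounts to something like the sketch below. The stream name and the item shape are hypothetical, and the Kinesis client is passed in (in practice it would be `boto3.client("kinesis")`); `put_records` is the standard batch-write call, which accepts up to 500 records per request, so a 250-item page fits in one call:

```python
import json

def chunk(items, size=250):
    """Split a list of items into batches of at most `size`."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def publish_items(kinesis, items, stream_name="post-registration-items"):
    # `kinesis` is a boto3 Kinesis client; `stream_name` is illustrative.
    for batch in chunk(items, 250):
        kinesis.put_records(
            StreamName=stream_name,
            Records=[
                {"Data": json.dumps(item), "PartitionKey": str(item["id"])}
                for item in batch
            ],
        )
```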
Anyway, I have a consumer on the other end of the Kinesis stream which saves the items to a database table.
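As a reference point, the consumer side looks roughly like this. Lambda delivers each Kinesis record's payload base64-encoded; `save_item` here is a hypothetical placeholder for the actual database write, not something from my real code:

```python
import base64
import json

def decode_records(event):
    """Extract the JSON items from a Kinesis Lambda event.
    Lambda hands over each record's data base64-encoded."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event["Records"]
    ]

def handler(event, context, save_item=None):
    # `save_item` stands in for the real database insert.
    for item in decode_records(event):
        save_item(item)
```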
If there is a failure, Kinesis will just keep retrying for up to 7 days, which could end up costing a lot of money in terms of Lambda invocations.
Sure, I could keep a record of how many times a particular message has been attempted, using a database table with a Kinesis message ID and a counter column, but if something goes wrong with the database then I am back to the same issue again.
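For illustration, that counter idea boils down to something like this, with `store` standing in for the tracking table (a plain dict here; in practice it would be a database lookup, which is exactly the dependency I'm worried about):

```python
MAX_ATTEMPTS = 3

def should_process(store, message_id, max_attempts=MAX_ATTEMPTS):
    """Record an attempt for this Kinesis message and decide whether
    to process it or give up. `store` maps message id -> attempt count."""
    attempts = store.get(message_id, 0) + 1
    store[message_id] = attempts
    return attempts <= max_attempts
```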
So I was thinking of the following, and I'm wondering if it's a good idea:
- set the Kinesis batch size to 1
- in the Kinesis consumer, do nothing other than put that 1 message onto an SNS topic
- the SNS topic has a consumer which does the DB operation to save the item
If there is any issue (other than with SNS itself, obviously), then at least Kinesis will not keep retrying, and the most retries will happen on the SNS consumer side, which defaults to 3 retries.
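The forwarding consumer in step 2 could be as small as this sketch. The SNS client and topic ARN are injected (in practice `boto3.client("sns")` and the real ARN); `publish` is the standard boto3 SNS call:

```python
import base64

def forward_to_sns(event, sns_client, topic_arn):
    """Kinesis consumer that only republishes each record to SNS.
    With a batch size of 1, event["Records"] holds a single record."""
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        sns_client.publish(TopicArn=topic_arn, Message=payload)
```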