From the documentation I understand that when using SNS as an event source for a Lambda, the function is invoked for every message that arrives in SNS. With Kinesis we can get up to 10,000 messages per Lambda invocation, depending on the configured batch size.
Is the above statement correct?
If yes, why does the model that SNS sends to a Lambda contain an array of records? http://docs.aws.amazon.com/lambda/latest/dg/eventsources.html#eventsources-sns
With SNS, the event you get in Lambda is based on the AWS SNS notification types (similar to S3 events), which predate the Lambda integration. I think AWS has used the same record model across all services with notifications, which could be why we get an array that only has one record.
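Since the model is an array, one cautious option is to loop over `Records` rather than index `[0]`, so the handler keeps working whether it gets one record or several. A minimal Python sketch, assuming the documented `Records[].Sns.Message` layout; the helper name `extract_sns_messages` and the sample event are made up for illustration:

```python
def extract_sns_messages(event):
    """Return the Message body of every SNS record in a Lambda event."""
    return [record["Sns"]["Message"] for record in event.get("Records", [])]


def handler(event, context):
    messages = extract_sns_messages(event)
    for message in messages:
        # Replace this with the real per-message work.
        print(message)
    return len(messages)


# Hand-built sample event (illustrative values, not captured from AWS).
sample_event = {
    "Records": [
        {
            "EventSource": "aws:sns",
            "Sns": {"MessageId": "example-id", "Message": "hello"},
        }
    ]
}
```

Calling `handler(sample_event, None)` locally processes the single record; nothing in the loop depends on the array having exactly one entry.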
I can’t find any AWS docs that explain the SNS message structure in detail, so it could be worth posting in the AWS forums to get some more information. I have added this S3 notification event doc for reference: http://docs.aws.amazon.com/AmazonS3/latest/dev/notification-content-structure.html
The link I provided explains the message a little. I’m just looking to confirm that SNS will send one message at a time, even though the array in the model could hold more. This is critical because it will change my design: I would use Kinesis instead, where I can get up to 10,000 messages per Lambda execution.
I’m sure that I have never got more than 1 message from SNS per Lambda invoke; however, this might be down to the low frequency of SNS dispatches (around 1 per minute) in my experience.
From your stated volume of messages (10,000+) per invoke, I’m not 100% sure what you are looking to implement (message bus or message dispatch), but I think Kinesis might be what you are after:
Kinesis works with your lambda function in a few different ways -> http://docs.aws.amazon.com/lambda/latest/dg/with-kinesis.html
It sounds like you are after the batch model (pulling down multiple messages in one invoke). Kinesis appends messages to the stream, and you read as many as you need per invoke (tracking your position in the stream, etc.), 10k in this use case.
The thing to watch out for here is how long it takes your Lambda to process the 10,000 messages. If it takes a while to complete, you might max out your invoke time before you finish (and you will pay more $ for the long run time). Also, do you want to process fewer than 10k messages at a time, or are you going to wait until you have 10k messages waiting to be processed?
An SNS-triggered Lambda is invoked per message dispatch, so as you push your 10k messages into SNS it will dispatch them to a listening Lambda (eventually). I don’t think this really fits your use case, but it offers a benefit if you change your design idea a bit. Instead of processing in 10k batches you can process single messages at a time; this will have a shorter invoke time and will hopefully ensure your Lambda doesn’t max out its invoke time.
The bottom line in design is to weigh up whether a batch processing design is really what you need or whether stream processing works for you. Kinesis offers both options (possibly with higher costs and setup complexity), while SNS will only really serve a message-per-invoke design (but is a lot easier to set up).
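For comparison, here is a rough Python sketch of a Kinesis-triggered handler that receives a whole batch in one invoke. It assumes the documented event layout, where each record carries its payload base64-encoded under `kinesis.data`; the helper name and the sample event are made up for illustration, and the batch size itself is set on the event source mapping rather than in the code:

```python
import base64


def decode_kinesis_records(event):
    """Decode the base64 payload of every Kinesis record in the batch."""
    return [
        base64.b64decode(record["kinesis"]["data"]).decode("utf-8")
        for record in event.get("Records", [])
    ]


def handler(event, context):
    payloads = decode_kinesis_records(event)
    for payload in payloads:
        # Replace this with the real per-record work; the whole batch
        # has to finish inside the function's configured timeout.
        pass
    return len(payloads)


# Hand-built sample batch (illustrative values, not captured from AWS).
sample_event = {
    "Records": [
        {
            "eventSource": "aws:kinesis",
            "kinesis": {
                "partitionKey": "pk-1",
                "data": base64.b64encode(b"first").decode("ascii"),
            },
        },
        {
            "eventSource": "aws:kinesis",
            "kinesis": {
                "partitionKey": "pk-1",
                "data": base64.b64encode(b"second").decode("ascii"),
            },
        },
    ]
}
```

Timing `handler` against a realistic batch locally is probably the easiest way to check whether a full batch fits comfortably inside the timeout.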
Sorry if this is still a bit vague. Hope this helps a little.
Thanks for the info.
I wanted to use SNS for two reasons.
- Low Cost
- Almost infinite scalability
Kinesis comes with a high upfront cost even when you don’t use it to the max, and scalability is not automatic. You have to create a second or third shard, use a partitioning key, etc.
Processing 10,000 messages is actually very fast (~1-2 sec), so there are no issues with Lambda times.
I will go with Kinesis just for its batch message processing. If SNS could send more messages per invocation, then the choice would be SNS, purely from a price and scalability perspective.
Thanks for the info about the SNS.
I think SNS follows a push-based mechanism while Kinesis is pull-based. So with Kinesis you have the freedom to read in batches, but SNS, being push-based, sends a notification as soon as it gets one.
There is a good intermediate option here that has not been mentioned. SQS can be used as an event trigger for Lambda functions as well (https://serverless.com/framework/docs/providers/aws/events/sqs/), and it batches messages too. The problem with using SNS in your use case is that if your send rate is very bursty and 10k messages get pushed at SNS rapidly, you will probably reach your concurrent Lambda limits very quickly. SQS, on the other hand, can batch messages together so you reduce the number of concurrent Lambdas; it “autoscales”, unlike Kinesis as you mentioned, and has no fixed cost.
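To put rough numbers on the batching argument, here is a small Python sketch. The batch size of 10 is only an assumed example value, `invocations_needed` is a made-up helper, and the handler assumes the documented SQS event layout where each record carries its payload under `body`:

```python
import math


def invocations_needed(message_count, batch_size):
    """Lower bound on invocations if every batch is completely full."""
    return math.ceil(message_count / batch_size)


def handler(event, context):
    bodies = [record["body"] for record in event.get("Records", [])]
    for body in bodies:
        # Replace this with the real per-message work.
        pass
    return len(bodies)


# One message per invoke (SNS style) versus an assumed batch size of 10.
per_message = invocations_needed(10_000, 1)
batched = invocations_needed(10_000, 10)

# Hand-built sample batch (illustrative values, not captured from AWS).
sample_event = {
    "Records": [
        {"eventSource": "aws:sqs", "messageId": "id-1", "body": "first"},
        {"eventSource": "aws:sqs", "messageId": "id-2", "body": "second"},
    ]
}
```

In practice batches are not always full, so the real number of invocations will sit somewhere between the two figures, but the burst of concurrent executions should still be noticeably smaller than with one invoke per message.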