I have a Python script I’ve been running at home. It queries ~200 web pages to check for updates in any of my various work groups. The system only provides hourly emails - I want notification on 10 minute intervals - so I wrote a polling script.
It uses aiohttp and asyncio pretty heavily to do the initial session login and set up credentials, and then farms out and harvests all 200 requests into a list of groups with activity - if any.
When I ported this to Lambda - I had to cut out all the asyncio and use straight synchronous requests. The Lambda function timed out after 6s… So I’m not doing this right.
Should I be thinking about this by:
having the initial trigger check for a valid login session, and create one otherwise
having it loop over all active teams and trigger a different Lambda function to check each group
having some kind of harvesting function that is triggered once all the groups are finished
1.) How long is the overall execution time when you run your script at home?
2.) What is the error message when it fails on AWS Lambda?
3.) Do you use any non-standard python libraries, for example requests?
4.) Which runtime did you choose?
5.) How did you trigger your lambda function? schedule? api gateway?
1.) The asyncio stuff should run on AWS Lambda, because it's plain Python. But of course you have to use runtime: python3.6
2.) The timeout property for the function should be set appropriately - otherwise timeout can occur before the function finishes. Timeout can be set up to 5 minutes for a lambda function.
3.) If you use non-standard Python libraries, maybe you forgot to deploy them?
I have no idea how to encode that loop in a lambda handler. I’m very new to Lambda
I have not set the timeout property - so I’m guessing the default is 6s - since that’s the error I get.
I’m using sls deploy at the moment and together with docker it’s uploading a 6 MB .zip file each time (why is it so big???) - which makes it really hard for me to test using my unstable rural internet connection.
I think there should be no problem running the asyncio stuff if you select the python3.6 runtime, because it's native Python.
Just make sure all your external (non-standard) libraries are asyncio-compatible.
AFAIK the requests library is not compatible with asyncio, because its calls block the event loop.
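To show how the loop could be encoded in a handler, here is a minimal sketch. It assumes a synchronous Lambda handler that creates its own event loop and runs all the checks to completion before returning. The names `check_group`, `check_all` and the `group_ids` event key are made up for illustration, and `check_group` only simulates a request; in the real script it would be the aiohttp call.

```python
import asyncio


async def check_group(group_id):
    # Stand-in for the real aiohttp request (hypothetical logic):
    # here every even-numbered group is reported as having activity.
    await asyncio.sleep(0)
    return {"group": group_id, "active": group_id % 2 == 0}


async def check_all(group_ids):
    # Run all checks concurrently; gather returns results in input order.
    results = await asyncio.gather(*(check_group(g) for g in group_ids))
    return [r["group"] for r in results if r["active"]]


def handler(event, context):
    # Lambda calls this plain function, so the event loop is driven here.
    group_ids = event.get("group_ids", [])
    loop = asyncio.new_event_loop()
    try:
        active = loop.run_until_complete(check_all(group_ids))
    finally:
        loop.close()
    return {"active_groups": active}
```

Calling `handler({"group_ids": [1, 2, 3, 4]}, None)` locally is enough to try it out before deploying, which should help with the slow upload problem.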
However, if you come close to the 300 sec limit with the total execution time of your lambda, you have to rethink. A possible architecture may be:
A lambda function that initially feeds all the jobs into an Amazon SQS queue or Amazon SNS topic.
A worker lambda function then grabs one job from the queue/topic and runs it.
This can be scaled up if you allow parallel execution of the worker lambda function.
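A rough sketch of that queue-based layout, under a few assumptions: the jobs go through SQS, the SQS client is passed in (in Lambda it would come from `boto3.client("sqs")`), and the worker is triggered by the queue so it receives messages under `event["Records"]`. The function names, the `group_id` message field and the queue URL are placeholders, and the actual page check is left out.

```python
import json


def enqueue_groups(sqs_client, queue_url, group_ids, batch_size=10):
    # Feeder side: one message per group. send_message_batch accepts
    # at most 10 entries per call, hence the chunking.
    sent = 0
    for start in range(0, len(group_ids), batch_size):
        batch = group_ids[start:start + batch_size]
        entries = [
            {"Id": str(i), "MessageBody": json.dumps({"group_id": g})}
            for i, g in enumerate(batch)
        ]
        response = sqs_client.send_message_batch(
            QueueUrl=queue_url, Entries=entries
        )
        sent += len(response.get("Successful", []))
    return sent


def worker_handler(event, context):
    # Worker side: each record carries one job in its "body" field.
    checked = []
    for record in event.get("Records", []):
        job = json.loads(record["body"])
        # The real check for job["group_id"] would go here (omitted).
        checked.append(job["group_id"])
    return {"checked": checked}
```

Passing the client in as an argument keeps the feeder testable with a fake object, without touching AWS.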