I have authored a few serverless (AWS Lambda) applications to date, and one thing I have always struggled with is finding a way to manage / monitor a live environment of many functions. With container-based applications, for example, we have tools such as Rancher / Kubernetes and the like, but in the serverless space production management involves a good bit of flicking between cloud platform consoles / CLIs / hosted logs.
My question is: has anyone got any suggestions on a way to wrangle all the information provided by a production serverless application (for example: invoke metrics, log streams, current versions, linked services, function configurations, function relationships)?
Apologies as this is a very open-ended question, but any insight into what others are doing would be awesome.
I think what you’re looking for is CloudWatch Alarms. You can set up alarms on metrics for API Gateway/Lambda and have any alerts posted to an SNS topic. You can then subscribe to the topic with email or set up another Lambda to post to a Slack channel. Hope that helps.
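As a rough illustration of the alarm-to-SNS setup described above, here is a minimal Python sketch, assuming boto3 is available and credentials are configured. The function name `orders-handler` and the topic ARN in the usage note are made up for the example; the threshold and period are arbitrary choices, not recommendations.

```python
def lambda_error_alarm_params(function_name, topic_arn, threshold=1):
    """Build keyword arguments for CloudWatch's put_metric_alarm call.

    The alarm fires when the Lambda function reports at least `threshold`
    errors within a one-minute period and notifies the given SNS topic.
    """
    return {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 60,               # seconds per evaluation window (assumed value)
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no invocations means no alarm
        "AlarmActions": [topic_arn],         # SNS topic that receives the alert
    }


def create_alarm(params):
    """Send the alarm definition to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Something like `create_alarm(lambda_error_alarm_params("orders-handler", "arn:aws:sns:eu-west-1:123456789012:alerts"))` would then register the alarm; email or Slack delivery is handled by whatever is subscribed to the topic.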
Yeah, CloudWatch alarms are something I already use for alerting on events. But really, what I think I’m after is more of an orchestration tool. Going back to the container example, I guess it would be a Kubernetes-type thing, only for serverless applications.
I can’t find anything that really covers this, so maybe it’s a good idea for a new project.
I know it’s not an exact answer to your question, but it may be helpful as a general strategy for deployments. It ensures a working deployment before going live:
Thanks for the link, but it’s not really what I’m thinking about: blue / green is more of a deployment strategy than a post-deployment monitoring tool.
I’ve not tried blue/green deployment with a serverless application. How did you set this up? It would be really interesting to hear how you got it working.
If I were to implement this on an API Gateway + Lambda stack, I’m struggling to see how to handle the switchover between blue / green, connection draining, ramp up / down, etc.
You’d need at least a separate DNS record in front of your service to allow you to blue/green it, but that wouldn’t explicitly take care of connection draining. In practice the draining might “just work”, because clients that have resolved the old site might not re-resolve the DNS name (and therefore would not get the new service’s endpoint), but I’m not sure how comfortable you are relying on that: DNS client implementations are notoriously varied…
I find a more successful strategy for service deployments is the Canary model (which is also mentioned on Fowler’s blog). Again with an additional layer of DNS, you direct a portion of your traffic to the new service (using Route 53 weighted records) and monitor the results.
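To make the weighted-record idea a bit more concrete, here is a minimal Python sketch, assuming boto3 and plain CNAME records. The hostnames and the hosted zone ID in the usage note are made up, and whether CNAME records suit your setup (rather than alias records, for instance) is an assumption you would need to check. Route 53 sends each record a share of traffic equal to its weight divided by the sum of the weights, so weights of 90 and 10 give the new service roughly 10% of DNS responses.

```python
def canary_change_batch(record_name, stable_target, canary_target,
                        canary_percent, ttl=60):
    """Build a Route 53 ChangeBatch with two weighted CNAME records.

    `canary_percent` (0-100) is the share of DNS responses that should
    point at the new service; the rest keep pointing at the stable one.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")

    def weighted(set_id, target, weight):
        # One weighted record; SetIdentifier distinguishes records sharing a name.
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": record_name,
                "Type": "CNAME",
                "SetIdentifier": set_id,
                "Weight": weight,
                "TTL": ttl,  # short TTL so weight changes take effect sooner
                "ResourceRecords": [{"Value": target}],
            },
        }

    return {
        "Comment": f"canary at {canary_percent}%",
        "Changes": [
            weighted("stable", stable_target, 100 - canary_percent),
            weighted("canary", canary_target, canary_percent),
        ],
    }


def apply_change(hosted_zone_id, change_batch):
    """Submit the change to Route 53 (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("route53").change_resource_record_sets(
        HostedZoneId=hosted_zone_id, ChangeBatch=change_batch
    )
```

For example, `apply_change("Z123EXAMPLE", canary_change_batch("api.example.com", "old.example.com", "new.example.com", 10))` would start with about 10% on the new service; calling it again with a higher percentage ramps up, and 0 rolls back.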
@tomwilderspin sorry, I have to admit I haven’t set it up myself so far. I am interested in a solution to your problem as well… Blue/green deployment just popped into my mind as a possible way to solve part of your problem. It should be possible somehow with a green and a red production stage at API Gateway and a custom domain. My guess would be to automate removing the custom domain from the old (e.g. green) stage and adding it to the new red stage with CloudFront. That would be the route I would try, but I have never done it. I would also be really interested to know whether this is a feasible approach.
@rowanu thanks for your input! Indeed, the change of DNS entries could/would cause connection draining. I did not know about the Canary model. Very interesting! Do you know if it is possible to set this up in front of API Gateway as well?