In Part 1, we took the serverless.AspNetCoreWebAPI blueprint and began implementing some best-practice changes to make it a production-ready workload. In Part 2, I’ll show you some more things I like to do to ensure your serverless application is ready for prime time.
Add Alarms
Once your workload is deployed into production, you’ll want to be able to track its health. Eventually you’ll want to create workload-specific metrics and use these for alarms, but as a first pass, it doesn’t hurt to set up some basic alarms for API Gateway (i.e. 5XX and 4XX HTTP status codes) and Lambda errors. For the Lambda function, the CloudFormation looks like this:
AspNetCoreFunctionAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: Alarm for API lambda errors
    Namespace: "AWS/Lambda"
    MetricName: Errors
    Dimensions:
      - Name: FunctionName
        Value: !Ref AspNetCoreFunction
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 1
    Threshold: 1
    ComparisonOperator: GreaterThanOrEqualToThreshold
For API Gateway, it looks like this:
ApiGateway5XXAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: Alarm for API gateway 5XX errors
    Namespace: "AWS/ApiGateway"
    MetricName: 5XXError
    Dimensions:
      - Name: ApiName
        Value: !Ref AWS::StackName
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 1
    Threshold: 1
    ComparisonOperator: GreaterThanOrEqualToThreshold
And similar for 4XX errors.
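As a rough sketch, the 4XX alarm can mirror the 5XX one. The resource name ApiGateway4XXAlarm and the threshold of 10 are assumptions for illustration only; client errors tend to be noisier than server errors, so you will probably want to tune the threshold for your own traffic.

```yaml
ApiGateway4XXAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: Alarm for API gateway 4XX errors
    Namespace: "AWS/ApiGateway"
    MetricName: 4XXError
    Dimensions:
      - Name: ApiName
        Value: !Ref AWS::StackName
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 1
    # Assumed value: tune to the level of client errors you consider abnormal
    Threshold: 10
    ComparisonOperator: GreaterThanOrEqualToThreshold
```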
You should also consider setting up alarms on other resources as you add them to your serverless application. For example, Step Functions state machines should alarm on the ExecutionsFailed metric, and SQS queues should have a dead-letter queue with an alarm set up for the queue length being greater than 1.
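The two examples above might look something like the following. This is only a sketch: OrderStateMachine and WorkDeadLetterQueue are hypothetical resources assumed to be defined elsewhere in the same template, and the thresholds are assumptions (here the dead-letter alarm fires as soon as any message is visible, so raise it if that is too sensitive for you).

```yaml
StateMachineFailedAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: Alarm for failed state machine executions
    Namespace: "AWS/States"
    MetricName: ExecutionsFailed
    Dimensions:
      - Name: StateMachineArn
        # Hypothetical state machine; !Ref returns its ARN
        Value: !Ref OrderStateMachine
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 1
    Threshold: 1
    ComparisonOperator: GreaterThanOrEqualToThreshold

DeadLetterQueueAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: Alarm for messages arriving in the dead-letter queue
    Namespace: "AWS/SQS"
    MetricName: ApproximateNumberOfMessagesVisible
    Dimensions:
      - Name: QueueName
        # Hypothetical dead-letter queue defined elsewhere in the template
        Value: !GetAtt WorkDeadLetterQueue.QueueName
    Statistic: Maximum
    Period: 60
    EvaluationPeriods: 1
    # Assumed threshold: alert on the first message that lands in the queue
    Threshold: 1
    ComparisonOperator: GreaterThanOrEqualToThreshold
```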
Your alarm strategy will need to evolve over time as you begin to understand your workload under production use. It’s important to find a balance between being alerted when something has gone wrong and needs remedying, and being alerted so often that you get notification fatigue and stop paying attention. This will ultimately give you some clues as to how best to write your Lambda functions. First, you should let any serious issues, such as exceptions and other unexpected failure states, bubble up to the Lambda runtime so that they are counted as errors. Second, you should handle any likely failure states (even if they are caused by exceptions) and respond with an appropriate fallback. This basic first pass will ensure that you have the infrastructure in place to expand on as your application evolves.
A good alarm strategy will become critical if you want to use any of the advanced CodeDeploy zero-downtime deployment strategies such as Linear or Canary, as I demonstrated at a Melbourne AWS User Group a while back.
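To give a feel for how the alarms feed into those deployment strategies, here is a minimal sketch of a SAM deployment preference on the function. The alias name live and the choice of Canary10Percent5Minutes are assumptions for illustration; if the referenced alarm fires during the traffic shift, CodeDeploy rolls the deployment back.

```yaml
AspNetCoreFunction:
  Type: AWS::Serverless::Function
  Properties:
    # ...existing properties from the blueprint stay as they are...
    # Publish a new version on every deploy and point the alias at it
    AutoPublishAlias: live
    DeploymentPreference:
      # Shift 10% of traffic first, then the rest after 5 minutes
      Type: Canary10Percent5Minutes
      Alarms:
        # Roll back automatically if this alarm goes off during the shift
        - !Ref AspNetCoreFunctionAlarm
```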
Add an SNS Topic Triggered by The Alarms
Nobody should be expected to monitor the AWS Console for alarms. Ideally, team members should have limited, if any, access to the AWS Console for your production account. To ensure operations staff are notified of any relevant issues in production, an SNS topic should be employed. A single SNS topic will suffice in the early stages of development, and people can be subscribed to it using their email address. Again, this is about setting up the infrastructure so you can evolve it as your workload evolves.
Given that you will probably be deploying to multiple environments (e.g. dev, test, staging, prod), it is at this point that I recommend explicitly calling out the environment in the SNS topic name, so that if you are subscribed to alarms in multiple environments you can immediately see which environment is problematic. This requires an Environment parameter to be passed into your SAM template. The SNS topic can now be defined as follows:
AlarmTopic:
  Type: AWS::SNS::Topic
  Properties:
    DisplayName: !Sub 'Application Alarms Topic for ${Environment}'
    TopicName: !Sub '${Environment}-Application-Alarm-Topic'
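For completeness, a sketch of the Environment parameter this topic depends on, together with one way of subscribing an email address. The allowed values, the AlarmEmail parameter and the AlarmEmailSubscription resource are assumptions for illustration; note that email subscriptions have to be confirmed by the recipient before any notifications arrive.

```yaml
Parameters:
  Environment:
    Type: String
    # Assumed set of environments; adjust to match your own
    AllowedValues: [dev, test, staging, prod]
  # Hypothetical parameter: the address that should receive alarm emails
  AlarmEmail:
    Type: String

Resources:
  AlarmEmailSubscription:
    Type: AWS::SNS::Subscription
    Properties:
      # !Ref on an SNS topic returns its ARN
      TopicArn: !Ref AlarmTopic
      Protocol: email
      Endpoint: !Ref AlarmEmail
```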
To hook up the alarms to the SNS topic, simply add an AlarmActions property to each alarm like so:
    AlarmActions:
      - !Ref AlarmTopic
Create a CI/CD Pipeline for Deploying Your Serverless Application
While using the dotnet lambda deploy-serverless … CLI command is an easy way to get started testing your ideas, if you want your code to be fit for end users, you need to be able to reliably and repeatably build and deploy your application in a secure way. This is best achieved using a dedicated machine with consistent configuration and tooling, aka a “build server”. While I have written extensively about AWS CI/CD tooling, and even have an open source project demonstrating some of the techniques, there are many other options that will suffice. Just pick the one you and your team are most comfortable using. The key, though, is that you need to separate the build phase from the deployment phase. To do this, you can use the following command to package your built dotnet binaries, augment your SAM template, and store everything in S3 ready for later deployment:
dotnet lambda package-ci …
A full example of this can be seen using AWS CodeBuild in my previously mentioned open source project here. This will leave you with your deployable artefact. The next step is to use your chosen tool’s CloudFormation deployment plugin (or, if your tool doesn’t have such a plugin, the AWS CLI) in a deployment stage to deploy that artefact through all of the environments, changing only its configuration. This can be seen using AWS CodePipeline deploying to dev here, and to prod here. There is a slight difference in the way I deploy to prod in this example: for prod, I first create a change set and then have a manual approval gate. This allows a release manager to verify the contents of the change set before it goes to production. Most importantly though, the exact same artefacts that were deployed and tested in the dev environment are deployed to production.
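To illustrate the change set plus approval gate idea, here is a sketch of what a prod stage could look like inside an AWS::CodePipeline::Pipeline resource. It is not copied from the project mentioned above: the stack name, change set name, artifact name, template file name and CloudFormationDeployRole are all hypothetical placeholders.

```yaml
- Name: DeployProd
  Actions:
    - Name: CreateChangeSet
      ActionTypeId:
        Category: Deploy
        Owner: AWS
        Provider: CloudFormation
        Version: "1"
      InputArtifacts:
        # Hypothetical artifact produced by the build stage
        - Name: BuildOutput
      Configuration:
        # Create (or replace) the change set without executing it
        ActionMode: CHANGE_SET_REPLACE
        StackName: my-app-prod
        ChangeSetName: my-app-prod-changeset
        TemplatePath: BuildOutput::packaged-template.yaml
        # SAM templates use a transform, so both capabilities are needed
        Capabilities: CAPABILITY_IAM,CAPABILITY_AUTO_EXPAND
        RoleArn: !GetAtt CloudFormationDeployRole.Arn
        # Only the configuration changes between environments
        ParameterOverrides: '{"Environment": "prod"}'
      RunOrder: 1
    - Name: ApproveChangeSet
      ActionTypeId:
        Category: Approval
        Owner: AWS
        Provider: Manual
        Version: "1"
      RunOrder: 2
    - Name: ExecuteChangeSet
      ActionTypeId:
        Category: Deploy
        Owner: AWS
        Provider: CloudFormation
        Version: "1"
      Configuration:
        # Apply the change set the release manager has just reviewed
        ActionMode: CHANGE_SET_EXECUTE
        StackName: my-app-prod
        ChangeSetName: my-app-prod-changeset
      RunOrder: 3
```

The dev stage would use the same input artifact with a different StackName and ParameterOverrides, which is what keeps the deployed binaries identical across environments.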
Version Stamping
Another thing I like to do is implement some form of version stamping of my binaries so that I can very easily see what version of my application is running. A lot of people seem to like semantic versioning for this, which works fine, but I feel this kind of versioning is more suited to end-user software packages where users have access to the binaries on their machine. For serverless-style applications, the thing I care about most is being able to trace the running code back to my source code, which is why I like to include the git sha1 somewhere in the version string. One way of doing this is to add a version suffix to your .NET assembly.
In the csproj file, add the following to a property group:
…
<VersionPrefix>1.0.0</VersionPrefix>
…
Then use an msbuild parameter, VersionSuffix, setting it to the first 7 characters of the current HEAD sha1. In CodeBuild you can use this approach:
dotnet lambda package-ci
…
--msbuild-parameters "/p:VersionSuffix=${CODEBUILD_RESOLVED_SOURCE_VERSION:0:7}"
Then you should ensure you log this version when your Lambda first starts:
using System;
using System.Reflection;
// Read the informational version (VersionPrefix-VersionSuffix) stamped at build time
var assembly = Assembly.GetExecutingAssembly();
var assemblyVersion = assembly.GetCustomAttribute<AssemblyInformationalVersionAttribute>()?.InformationalVersion;
Console.WriteLine($"AssemblyInfo - name: {assembly.GetName().Name}, Version: {assemblyVersion}");
See an example commit here.
Warnings and Linting
As a general rule, the first thing I do when starting a new project is switch on “Warnings as Errors” in the csproj file:
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
If you start with this, you will avoid ever accumulating 300+ warnings and then missing the one important warning that points to an obviously unintended issue.
The same applies to your CloudFormation templates. I like to use cfn-lint to ensure my SAM template is as hygienic as possible. You can add a cfn-lint plugin to VSCode and get immediate feedback as you develop, but you should also fail your build if there are any cfn-lint errors, just in case others changing your SAM template are not using cfn-lint.
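One way to fail the build on lint problems, assuming you build with CodeBuild, is a buildspec along these lines. This is a sketch only: the template file name template.yaml is an assumption, as is pip being available on the build image.

```yaml
version: 0.2
phases:
  install:
    commands:
      # Assumes Python and pip are available on the build image
      - pip install cfn-lint
  pre_build:
    commands:
      # cfn-lint exits non-zero when it finds problems, which fails the build
      # before anything is packaged (template.yaml is an assumed file name)
      - cfn-lint template.yaml
```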
Tagging
Tags can really help you identify and select resources in your AWS account. Many of the resources you create with SAM/CloudFormation will automagically have a default set of tags, such as the aws:cloudformation:stack-name tag that CloudFormation applies.
Unfortunately, not all resources have a default set of tags. As such, you should put in place a tagging strategy to apply to all of your resources. This might include things like “Application Name”, “Stack Name”, “Cost Centre”, etc. This should evolve naturally over time as your application and development processes solidify.
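As a small sketch of what applying such a strategy can look like in the template, note that SAM resources take tags as a simple map, while most plain CloudFormation resources take a list of key/value pairs. The tag keys and values below are hypothetical examples, not a recommended scheme.

```yaml
AspNetCoreFunction:
  Type: AWS::Serverless::Function
  Properties:
    # ...existing properties from the blueprint stay as they are...
    # SAM resources take tags as a map of key to value
    Tags:
      ApplicationName: my-serverless-api
      StackName: !Ref AWS::StackName
      CostCentre: "1234"

AlarmTopic:
  Type: AWS::SNS::Topic
  Properties:
    # ...existing properties stay as they are...
    # Plain CloudFormation resources take a list of Key/Value pairs
    Tags:
      - Key: ApplicationName
        Value: my-serverless-api
      - Key: StackName
        Value: !Ref AWS::StackName
      - Key: CostCentre
        Value: "1234"
```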
Quick Fire Tips
There is still so much more to ensuring your serverless application is ready for prime time, but these start to get very specific to your application, so it is difficult to give generic guidance. Just for completeness, here is a list of other things you should consider if your application is API based.
Securing your API
Only a very small number of APIs should be exposed publicly to the internet without any form of authentication. Consider:
- Securing with IAM
- Using an OAuth provider
- Authenticating with Mutual TLS
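Taking the first option as an example, a minimal sketch of IAM authorisation in SAM could look like this. It assumes you replace the blueprint's implicit API with an explicit AWS::Serverless::Api resource; the name AspNetCoreApi and the stage name are hypothetical, and the function's Api events would then need to point at it via RestApiId.

```yaml
AspNetCoreApi:
  Type: AWS::Serverless::Api
  Properties:
    StageName: Prod
    Auth:
      # Every method requires SigV4-signed requests unless overridden
      DefaultAuthorizer: AWS_IAM
```

Callers then need IAM credentials with permission to invoke the API, so this option tends to suit service-to-service calls better than browser clients.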
Throttling
You should consider implementing some throttling, either on your whole API or based on usage plans for customers using an API key. See here for details.
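For throttling the whole API, a sketch using stage-level method settings is shown below. It again assumes an explicit AWS::Serverless::Api resource with the hypothetical name AspNetCoreApi, and the rate and burst numbers are placeholders rather than recommendations.

```yaml
AspNetCoreApi:
  Type: AWS::Serverless::Api
  Properties:
    StageName: Prod
    MethodSettings:
      # Apply to every resource and method in the stage
      - ResourcePath: "/*"
        HttpMethod: "*"
        # Assumed limits: steady-state requests per second and burst size
        ThrottlingRateLimit: 100
        ThrottlingBurstLimit: 50
```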
Web Application Firewall
For public-facing APIs you should consider protection against DDoS and other malicious traffic, using AWS WAF or some equivalent offering.
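A minimal sketch of attaching AWS WAF to the API is shown below, using a single rate-based rule. The resource names, metric names and the limit of 2000 requests per IP are assumptions; the association assumes the blueprint's implicit SAM API, whose generated logical IDs are ServerlessRestApi and ServerlessRestApiProdStage with a stage called Prod.

```yaml
ApiWebAcl:
  Type: AWS::WAFv2::WebACL
  Properties:
    # REGIONAL scope is what API Gateway stages use
    Scope: REGIONAL
    DefaultAction:
      Allow: {}
    VisibilityConfig:
      SampledRequestsEnabled: true
      CloudWatchMetricsEnabled: true
      MetricName: ApiWebAcl
    Rules:
      - Name: RateLimitPerIp
        Priority: 0
        Action:
          Block: {}
        Statement:
          RateBasedStatement:
            # Assumed limit on requests per source IP in the rate window
            Limit: 2000
            AggregateKeyType: IP
        VisibilityConfig:
          SampledRequestsEnabled: true
          CloudWatchMetricsEnabled: true
          MetricName: RateLimitPerIp

ApiWebAclAssociation:
  Type: AWS::WAFv2::WebACLAssociation
  # Wait for the implicit stage to exist before associating the web ACL
  DependsOn: ServerlessRestApiProdStage
  Properties:
    ResourceArn: !Sub 'arn:aws:apigateway:${AWS::Region}::/restapis/${ServerlessRestApi}/stages/Prod'
    WebACLArn: !GetAtt ApiWebAcl.Arn
```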
Conclusion
I hope you found this list helpful. Please put in the comments anything else you think might be worth adding to the list.