AWS Serverless Isolated Stacks
Give every developer their own silo


Setting up Isolated Stacks

One thing I love about writing serverless software is the idea of scale-to-zero. Basically, when you are not using a serverless application, you are not paying for any compute resources whatsoever. While this is a boon for apps that have spiky and unpredictable loads, it also has a very real benefit during development: it means that developers can realistically spin up their own isolated environments to validate their work.

There is still value in being able to write and test code locally. However, this means isolating the code you're working on, using locally running versions of dependencies (see Local .Net development with Amazon S3 as an example), and/or connecting to dependencies via the internet. At some point, to avoid "works on my machine" syndrome, you need to integrate your code with the real environment it will actually run in.

The difference between running code locally and in a "real" environment can be huge. This may mean different OS types/patches, different networking constraints, differences between emulated APIs and real APIs, etc… Ultimately the only environment that is ever truly like production is… production. That said, the more "like" production you can make your integration environment, the more you are rewarded with early detection of issues.

The serverless approach allows you to build complete environments using the exact same infrastructure-as-code scripts and the same compute sizes that your build pipeline uses to deploy to production. These isolated stacks should definitely not live in the same account as your production environment, but can happily share a non-production account where developers have privileged access. Let's take a look at how we can achieve this by modifying my own AWS CodePipeline Example to support isolated stacks. This consists of 4 steps, and everything you need to do can be conveniently seen in this commit.

Before we get into these 4 steps, a few hot tips.

Tip 1 - Ensure that platform-centric things like VPCs, subnets, VPC endpoints, even KMS keys are separated into their own CloudFormation stacks and imported into the application stack. You probably don't want 15 separate VPCs and their associated subnets, endpoints etc… (and the subsequent costs) in the one account.
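As a rough sketch of this pattern (the export names and logical IDs here are hypothetical, not from my example repo), the platform stack exports the shared values, and every application stack, isolated or otherwise, imports them:

# Platform stack - deployed once per account, exports the shared networking
Outputs:
  PrivateSubnetIds:
    Value: !Join [",", [!Ref PrivateSubnetA, !Ref PrivateSubnetB]]
    Export:
      Name: platform-PrivateSubnetIds
  LambdaSecurityGroupId:
    Value: !Ref LambdaSecurityGroup
    Export:
      Name: platform-LambdaSecurityGroupId

# Application stack - imports the shared values instead of creating its own
      VpcConfig:
        SubnetIds: !Split [",", !ImportValue platform-PrivateSubnetIds]
        SecurityGroupIds:
          - !ImportValue platform-LambdaSecurityGroupId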

Tip 2 - If you don't explicitly need to name a resource… don't. Most resources require replacement if the name changes, which is why I tend to avoid giving my resources a name if I don't have to. CloudFormation will usually do a pretty good job of giving the resource a vaguely meaningful (and unique) name based on the resource's logical ID and the stack name. That said, sometimes a resource name is required/desired. I will show you how we handle these cases.
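For example, leaving the name off an SQS queue (a hypothetical OrderQueue for illustration) lets CloudFormation pick one:

  OrderQueue:
    Type: AWS::SQS::Queue
    # No QueueName specified - CloudFormation generates something like
    # {stack-name}-OrderQueue-1A2B3C4D5E6F, unique per stack
    Properties:
      VisibilityTimeout: 60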

Tip 3 - Ensure you are using !Ref or !GetAtt to reference the ARNs of resources created in the stack, rather than attempting to build them by convention. This makes you less likely to misconfigure a resource by accidentally missing a name substitution in the following steps.
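To illustrate, assuming the table's logical ID is CovidAPITable (a name I'm using for illustration):

        # Fragile - a hand-built ARN silently breaks once the prefix is added:
        # Resource: !Sub arn:aws:dynamodb:${AWS::Region}:${AWS::AccountId}:table/CovidAPI
        # Robust - always points at the table this stack actually created:
        Resource: !GetAtt CovidAPITable.Arn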

Step 1: Introduce an IsolatedStackPrefix parameter to your CloudFormation template

This allows multiple isolated stacks to share the same account. It will even allow you to deploy one of the pipeline versions of the stack to that same environment; I usually choose an environment called "dev" for this. The parameter is then also used to create a condition, e.g. PipelineStack, which will allow you to switch various things in or out depending on whether you are deploying from the pipeline or as an isolated stack.
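Here is a minimal sketch of what that might look like; I'm assuming pipeline deployments simply leave the prefix empty, which is what the condition keys off:

Parameters:
  IsolatedStackPrefix:
    Type: String
    Default: ""
    Description: Prefix for per-developer isolated stacks (empty when deploying from the pipeline)

Conditions:
  # True when deploying from the pipeline, false for isolated stacks
  PipelineStack: !Equals [!Ref IsolatedStackPrefix, ""]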

Step 2: Conditionally Exclude Pipeline-Specific Resources

In my example I had defined some pre- and post-traffic lifecycle hooks, along with supporting roles etc… Technically these resources could even live in their own stack and be imported where required, but it may make sense to include them with the application itself. Simply use the Condition property on the resource with the PipelineStack condition described above.

  PostTrafficLifecycleFunction:
    Type: AWS::Serverless::Function
    Condition: PipelineStack

This can get a little tricky when referencing those resources later in your template, but you can use the Fn::If function along with the !Ref AWS::NoValue pseudo parameter, as shown here, to solve these types of problems.

  FunctionXRayPolicy:
    Type: AWS::IAM::Policy
    Properties:
      Roles:
        - !Ref AspNetCoreFunctionRole
        # Only attach the lifecycle hook role when the pipeline created it
        - !If
          - PipelineStack
          - !Ref LifecycleEventHookRole
          - !Ref AWS::NoValue
      # ...

Step 3: Ensure any named resources are prefixed with the isolated stack prefix

This is as simple as using !Sub to prepend ${IsolatedStackPrefix} to the resource name.

    TableName: !Sub ${IsolatedStackPrefix}CovidAPI

Step 4: Ensure DynamoDB tables and code that references them use the prefix

In my example I use Lambda functions to access a DynamoDB table, but a similar approach should work for any other code accessing the table.

To do this, I first add an environment variable called TablePrefix to the Lambda function and set it to the IsolatedStackPrefix parameter.

      Environment:
        Variables:
          env: !Ref Environment
          TablePrefix: !Ref IsolatedStackPrefix

I then ensure that whenever I perform an operation on the table, I check the environment variable and apply the prefix first. If your code is structured well, this should be reasonably easy to achieve. In my case, I simply created a DynamoDBHelper class that creates a DynamoDBContext with the prefix applied, and then replaced every place I needed a DynamoDBContext with a call to this helper.

using System;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.DataModel;

public static class DynamoDBHelper
{
    // Creates a DynamoDBContext that transparently prefixes every table name
    // with the TablePrefix environment variable (when one is set).
    public static DynamoDBContext CreateDynamoDBContext(this IAmazonDynamoDB client)
    {
        var prefix = Environment.GetEnvironmentVariable("TablePrefix");
        var config = new DynamoDBContextConfig();
        if (!string.IsNullOrEmpty(prefix))
        {
            config.TableNamePrefix = prefix;
        }

        return new DynamoDBContext(client, config);
    }
}

Wherever a context is needed, the helper is then used like this:

using (var context = _dynamoDBClient.CreateDynamoDBContext())
{
    // use context to access the (now prefixed) table...
}

And… you’re done!

Testing Your IAM Roles' Permissions

One nice advantage of this approach is that if you are doing the right thing from a permissions point of view (i.e. scoping your permissions down to explicit resources), you can easily validate this with your isolated stacks. For instance, suppose I have a Lambda role that allows reading and writing to my DynamoDB table, and I deploy 2 separate isolated stacks (call them "bob" and "sarah"). If I then changed the TablePrefix environment variable in sarah's stack to "bob", I should expect to see a whole heap of "not authorized" errors showing up in my CloudWatch logs when I ran sarah's Lambdas.
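That only holds if the policy is scoped to the stack's own table rather than a wildcard. A sketch of what such a policy might look like (again assuming a hypothetical CovidAPITable logical ID):

  FunctionTablePolicy:
    Type: AWS::IAM::Policy
    Properties:
      PolicyName: table-access
      Roles:
        - !Ref AspNetCoreFunctionRole
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Action:
              - dynamodb:GetItem
              - dynamodb:PutItem
              - dynamodb:Query
            # Scoped to this stack's table only - bob's table is implicitly denied
            Resource: !GetAtt CovidAPITable.Arn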

Final Suggestion

I like to make it really simple for developers to deploy their stacks, so I usually create a one-line deployment script, something like this one https://github.com/scottjbaldwin/AWSCodePipelineExample/blob/main/tools/Deploy-IsolatedStack.ps1, that does the grunt work and makes life that little bit easier.

Conclusion

Trust me, if you implement isolated stacks for your serverless application, everyone on the team will thank you, and you will see the improvement in your development cycle. A small word of warning: it may become so popular and easy to do that you may well start to see developers spinning up 3, 4 or even more of their own personal stacks as they experiment with different ideas.

*****
Written by Scott Baldwin on 09 July 2022