Restructuring AWS - Proper way to configure AWS Accounts, Organisations and Profiles when using Serverless

Vadorequest-SL · July 8, 2018, 11:54am

Hello everyone.

After playing around for about 6 months with GCP/AWS and Serverless Framework, we have decided to rethink our AWS Organisation structure entirely.

Why reorganise AWS?

Because our current way of doing things actually limits us, and we need to lift those limits.
Here is the most annoying thing at the moment:

Only I can use the Serverless framework, because we created a serverless-admin IAM user with AdministratorAccess permissions.
This user is used to manage everything through Serverless, from staging to production applications.

If I give this user to any collaborator, I take the risk that any of them can destroy anything on the production environments.

If I don’t give it, then none of them can access the staging environment.

Of course, at the beginning, it didn’t matter much because I was mostly playing around, doing a bunch of R&D and testing all those amazing things like Lambdas and the like. But now, I need to be able to let other people into the playground and get dirty. #scalability

That particular limitation is what I aim to fix, by restructuring our usage of AWS.

How to reorganise AWS?

That’s what I’m currently wondering. I’ve talked with several people about the Serverless Framework’s particular case, where you basically need to give an “admin” role to your collaborators so that they can actually use it, especially at the beginning, because it just makes it easier to get started (and because there is nothing critical online at that point).
Also, we all noticed how setting up IAM properly is a pain: so many permissions to deal with, and it is so difficult to figure out what you actually need. Using an “admin” role is just so much simpler.

But this practice doesn’t scale: you can’t keep adding new people and giving them “admin” permissions. That’s where we are: trying to change our AWS structure and looking for a proper solution that’ll hold for years to come and scale with the upcoming people and projects.

That’s where AWS Organisations and AWS Accounts come into action. The first thing is to understand how those two should be used.

I’ve had a long discussion with other SLS members (thanks to Franklin and Rob) on Slack, see Slack

AWS Organisation

An Organisation is the top-level block of your company on AWS; you should only have one. It’s where you manage AWS Accounts, consolidated billing (overview for all Accounts) and top-level DNS.

AWS Account

An Account can represent different things, depending on how you decide to use it.
It can be a person, an environment, a product+environment, and many other things. It totally depends on your design.

An Account has its own billing unit and permission unit.

When you first sign up on AWS, you automatically create an Account. Quite often, that same account becomes the Master Account of your AWS Organisation.

The Master Account shouldn’t be an environment or a product, and it shouldn’t contain any deployed service. It should just handle top-level configuration like top-level DNS (your main domain name). The email linked to the Master account shouldn’t belong to an individual, but should rather be an alias.

Then, depending on how you want to manage your company, you may use AWS Accounts in different ways, but in my case, it’s as follows:

AWS accounts:
	Master account (has access to AWS Organisation configuration)
		Consolidated Billing (for all other AWS Accounts)
		Top-level DNS and Route 53 domain management
		IAM Users (with cross-account access when needed)
	Account production
	Account staging
	Account development

Production, staging and development Accounts have their own AWS Services, unit billing, etc.
When a User needs access to multiple accounts (dev + staging for instance), it’s handled through “cross-account” configuration.

This setup provides enough flexibility:

You have both consolidated billing for all your accounts and per-account billing. You know exactly how much your staging and production environments cost, separately. Even if you don’t use tags, it gives a good and reliable cost overview.
Security is enhanced: each IAM User can be set up in a way that allows access to the AWS Console and programmatic access. In the AWS Console, the “switch role” feature allows for a smooth transition between the different roles.

Configuring which environment to work with, locally

Alright, now that there is a proper separation between environments, let’s talk about how it changes the way developers manipulate SLS on their machine.

Since every environment (production, staging, development) has its own Account, a User who has access to development will not have access to staging. (I’m not sure, but I don’t think cross-account access applies to programmatic access; it may only work for Console access.) One simple way to properly configure your multiple AWS IAM credentials is through the use of Profiles.

There are several ways of doing it, I prefer the automated way which chooses the right profile based on the environment you’re deploying to.
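
A minimal sketch of that automated approach, assuming one named profile per stage in ~/.aws/config. The profile names (tfp-production and so on), the account ID and the role name used in the example are placeholders made up for illustration. On the doubt raised above: role_arn and source_profile are standard keys of the AWS CLI shared config for assuming a role in another account, so cross-account access does not seem to be limited to the Console; whether your Serverless version picks up such role-based profiles is worth checking.

```shell
# Hypothetical mapping from a Serverless stage to a local AWS profile name.
# The profile names are placeholders; use whatever exists in ~/.aws/config.
profile_for_stage() {
  case "$1" in
    production)  echo "tfp-production" ;;
    staging)     echo "tfp-staging" ;;
    development) echo "tfp-development" ;;
    *) echo "unknown stage: $1" >&2; return 1 ;;
  esac
}

# Print a ~/.aws/config stanza for a profile that assumes a role in another
# account. role_arn and source_profile are shared-config keys; the account ID
# and role name are supplied by the caller.
cross_account_profile() {
  profile="$1"; account_id="$2"; role_name="$3"; source_profile="$4"
  printf '[profile %s]\nrole_arn = arn:aws:iam::%s:role/%s\nsource_profile = %s\n' \
    "$profile" "$account_id" "$role_name" "$source_profile"
}

# Example (not executed here):
#   cross_account_profile tfp-staging 111111111111 OrganizationAccountAccessRole default >> ~/.aws/config
#   sls deploy -s staging --aws-profile "$(profile_for_stage staging)"
```

Keeping the mapping in one function means a developer who only has the development profile configured simply cannot deploy to staging or production by accident.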

Resources:

This post is a work in progress. It’s a struggle to set up AWS properly and to anticipate how your company will grow and how you should organise it. The goal here is to build some kind of “AWS Setup Guide” from the experience and feedback of the community. Don’t hesitate to ask questions and share your own struggles! I may also misunderstand some parts of AWS, so don’t hesitate to tell me if so!

buggy · July 9, 2018, 5:26am

The current best practice is one account per service per stage. This is designed to limit the blast radius when something goes wrong.

I’m the first to admit that I don’t normally follow that advice for smaller projects. If I’m only working with a couple of services then I’m more likely to use one account per stage with all services in that stage deployed to the same account. This limits the blast radius but there are still times (especially during deployment) when things can go very wrong.

AWS goes even further with SAM. They recommend one SAM project for each event source.

I think what you’re missing here is automated deployments. Developers shouldn’t be executing deployment scripts. Instead you should look at automating this using your CI/CD pipeline.
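
As a rough sketch of what such a pipeline step could look like, assuming the CI job receives the branch name and holds the credentials for the matching AWS account. The branch names and the branch-to-stage mapping are assumptions for illustration, not something Serverless or AWS prescribes.

```shell
# Hypothetical branch-to-stage mapping for a CI job.
stage_for_branch() {
  case "$1" in
    master)  echo "production" ;;
    develop) echo "staging" ;;
    *)       return 1 ;;   # any other branch is not deployed
  esac
}

# Dry-run deploy step: prints the command the CI job would run.
# Replace the final echo with the real command inside the pipeline.
ci_deploy() {
  stage="$(stage_for_branch "$1")" || { echo "skip: $1 is not deployed"; return 0; }
  echo "sls deploy -s $stage"
}
```

With this shape, only the CI system needs deployment credentials, so no developer has to hold the production ones.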

Vadorequest-SL · July 9, 2018, 8:42am

Interesting! The “one account per service per stage” makes sense, but my main issue with this rule so far is defining what’s a service, because when building micro-services architecture many services are related to each other and deciding what’s a service and what’s inside a service can be hard. Also, we all eventually start with a small project, which then grows, and grows, until it’s not small anymore. Anticipating growth can be hazardous.

I don’t use SAM, but I’ve heard about it a lot. Does SLS generate SAM templates on its own? How/why use SAM when using SLS, aren’t they the same thing?

You’re totally right about automated deployments, I don’t currently have any. It’s not that I don’t want to, but I don’t really know what to use (third party? Bitbucket CI/CD? AWS CI/CD? …).
The SLS world is fairly new.

buggy · July 9, 2018, 2:49pm

Both Serverless and SAM transform into CloudFormation for deployment. They do a similar thing but very differently.

You could achieve the same result with Serverless that the AWS SA was recommending for SAM by only having one event source per Serverless project.

Defining a microservice can be difficult and I don’t believe there is a single correct answer. But that’s why they pay you the big bucks. I wouldn’t stress too much about getting it 100% right day one. Like any code you write your architecture will need to be refactored over time.

For example: There’s nothing wrong with building user notifications into a service but if you discover that multiple services are sending notifications to users then you might want to look at moving that into a notifications service. Equally, it may be obvious from day one that you’re going to have user notifications sent from multiple services so you just build it that way from the beginning.

It might help to ask questions like:

Is this something that could be used by multiple other services?
Does this implement a discrete business or technical function?
Can I build this in a reusable manner?

For automated deployments I would start with the CI/CD solution that you already understand. If you don’t have one then maybe give CodePipeline a go?

sylnsr · July 14, 2018, 1:32pm

So, I’m not new to Lambda, but I am new to SLS. I’ll be using it for production very soon and I’ve had the same considerations and issues. For example, when using Lambda with SQS triggers, you’ll find there are additional SQS permissions needed, which are not necessarily documented in the most obvious places and the whole process becomes somewhat trial-and-error … (see How to grant access to SQS in Serverless.yml).

In any event since these are micro-services we are pushing out with SLS, and … on their own should be simple, non-proprietary functions … I also maintain the philosophy that each of these (non-proprietary) functions should be sharable … or portable in the sense that I can give it to any 3rd party to use or develop … on their own AWS account … using whatever liberal or conservative permission scheme they desire. So in other words, just as I expect a developer to have their own GitHub account, I also expect them to have their own AWS account, which shouldn’t be an issue since there is a free tier available. It also means that I don’t have to manage other accounts for other environments. My SLS project will always include unit tests for the http endpoints, and a deployment script which executes these tests the moment sls deploy completes successfully. Your situation may be a little different … but hopefully this gives you some ideas.
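
A deployment script along those lines could be sketched as follows. This is only an illustration: the commands are passed in as strings, and "npm test" in the example is an assumption about how the endpoint tests are started.

```shell
# Run the deploy command, then the test command only if the deploy succeeded.
deploy_then_test() {
  deploy_cmd="$1"; test_cmd="$2"
  if ! sh -c "$deploy_cmd"; then
    echo "deploy failed, tests skipped" >&2
    return 1
  fi
  sh -c "$test_cmd"
}

# Example (not executed here):
#   deploy_then_test "sls deploy -s dev" "npm test"
```

The function returns the exit status of the tests, so a calling script or CI job can fail when the freshly deployed endpoints do not behave as expected.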

Regarding the idea of what a micro-service is … yes that can be tricky, but the first thing I consider is … is this a function I can share publicly or is it proprietary. If it’s proprietary (e.g. contains specific SQL which I don’t want to share with anyone) … then most likely the service/function is not designed properly in the first place and the proprietary part should be moved back to the calling application. Thus when these services are stripped of proprietary logic, even domain-specific functionality, and in turn become more generic … they naturally end up looking like proper, generic micro-services.

bfieber · September 18, 2018, 6:46pm

Sorry for summoning an old post from the grave, but it’s a great topic with what appears to be great information.
Our environment has finally grown enough that we keep hitting various AWS limits. The latest being the Max Deployed Code size for Lambda. So I’ve decided to re-structure by moving dev/test/sandbox environments into their own accounts as children of an Org.

I’ve created an Org in our original account so it’s the Master.
I’ve also created an account as my personal dev environment as a child account.
My core stack, which contains S3 buckets, DynamoDB tables, Lambda Functions for some core functionality that are exported for use by API stacks, etc., deployed fine.

Following the advice in the main post here, I still have my Route53 DNS in the Org Master account.
Now as I go to deploy my first API stack, and create the Custom Domain, I wonder:

How can I use the same domain (and same ACM Certificate) for my dev stack in my child account?
Or should I just create a new ACM Cert for my child account?

Creating a new ACM Cert in each child account isn’t too annoying, but that still leaves me wondering how I use the Master account’s DNS/Domain from the child accounts.

My long-term goal being to spin up a separate account for each developer, multiple accounts for QA, an Integration Test account, Sandbox account, and possibly others.

Vadorequest-SL · October 4, 2018, 10:39pm

Honestly I should write a blog post about what we did to our AWS setup. But basically we have 2 accounts for every product: Production and Staging.

I tried keeping my Route53 config on the main account (org root). It works as long as you don’t have multiple sub-domain levels like a.b.c.com. Actually, it can work either way, but since we use staging.product-name.com for instance, the “staging” DNS config must be configured through the root org account, and that gets complicated over time. To be honest, it’s just easier if the Staging AWS account deals with its own DNS setup: fewer headaches and more agility (you avoid waiting on an operation that can only be done by a high-clearance operative).

@bfieber about your question on your ACM Certificate, I have the following policy (because I ran into the same issue):
All my certificates ask for at least 2 domain names: the concerned domain and a wildcard domain. For instance, my ACM certificate for “product-name” will be:
product-name.com
*.product-name.com

This way, I can create as many sub domains as I want without generating/configuring additional ACM, helped me recently with a demo.product-name.com and a v2.product-name.com for instance. Nothing more to do on the ACM side with this policy.
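
To make the coverage rule concrete: a wildcard name matches exactly one extra label, so *.product-name.com covers demo.product-name.com but neither the apex product-name.com nor a.b.product-name.com, which is why both names are requested. A small sketch of that check (cert_covers is a hypothetical helper, not part of any AWS tool):

```shell
# Return 0 if the host is covered by one of the certificate names.
# Usage: cert_covers HOST NAME [NAME...]
cert_covers() {
  host="$1"; shift
  for name in "$@"; do
    # Exact match, e.g. product-name.com
    if [ "$name" = "$host" ]; then return 0; fi
    case "$name" in
      '*.'*)
        suffix="${name#\*}"        # ".product-name.com"
        label=${host%"$suffix"}    # part of the host left of the suffix
        # Covered only if the suffix matched and exactly one label remains.
        if [ "$label" != "$host" ] && [ -n "$label" ]; then
          case "$label" in
            *.*) ;;                # more than one label: not covered
            *) return 0 ;;
          esac
        fi
        ;;
    esac
  done
  return 1
}

# Example:
#   cert_covers demo.product-name.com product-name.com '*.product-name.com' && echo covered
```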

But, if you split your domain names between accounts, like:
production aws account: product-name.com
staging aws account: staging.product-name.com

Then you’ll need to generate two ACM certificates anyway because they can’t be shared across AWS accounts.

Also, I made a rule to give all my developers READ-ONLY access to Route53 settings, so that they can fetch the DNS configuration of production systems. That’s usually needed when you’re doing CNAME or NS rules from one domain to another.

One additional piece of advice before you decide to change your setup: lower your NS TTL value from 2 days to 5 minutes, then wait 2 days after changing it before changing the whole thing, if that’s your plan. That way you’ll save yourself lots of headaches, because you won’t need to wait 2 days to see if your DNS changes are applied (it basically reduces the cache time, perfect during development).

Also, with this kind of setup, you’ll hit another great AWS limit: an IAM user can’t belong to more than 10 groups at once, yup, sucks. A trick is to use “policies”, which allows 10 more policies, but that’s it. Doesn’t help when you create one group per product/stage. My IAM user (as CTO) has reached that limit already and I think I’ll have to create a special group just for myself that allows access to multiple AWS accounts, and use that group instead of one group per product/stage. (But that’s another issue that you’ll face later, an annoying one.)
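
One possible shape for such a “special group” is a single policy that lets its members assume a role in each account, instead of one group per product/stage. A sketch, assuming every account exposes a role with the same name; the role name DeveloperAccess, the group and policy names, and the account IDs below are all placeholders.

```shell
# Print an IAM policy document allowing sts:AssumeRole on the same role
# in several accounts.
# Usage: assume_role_policy ROLE_NAME ACCOUNT_ID [ACCOUNT_ID...]
assume_role_policy() {
  role_name="$1"; shift
  resources=""
  for account_id in "$@"; do
    resources="${resources:+$resources, }\"arn:aws:iam::${account_id}:role/${role_name}\""
  done
  printf '{"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Action": "sts:AssumeRole", "Resource": [%s]}]}\n' "$resources"
}

# Example (not executed here):
#   aws iam put-group-policy --group-name multi-account-admins \
#     --policy-name AssumeAllAccounts \
#     --policy-document "$(assume_role_policy DeveloperAccess 111111111111 222222222222)"
```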

Hope it helps, wanted to summarize it all but hell, that’s AWS we’re talking about…

niraj-bpsoftware · February 18, 2019, 9:46pm

Hi,
I am in exactly the same situation and looking for answers to the questions you raised. Just wanted to know how you went about this?
Thanks,
Niraj

bfieber · February 19, 2019, 3:45pm

I haven’t.
Made a few starts but keep getting pulled off for other dev work.
Hoping I can get the time to focus on it at some point.

niraj-bpsoftware · February 20, 2019, 12:39am

Hi,
Thank you for the detailed reply. I have already set up my ACM certificate with a wildcard, so I have product.com and *.product.com for that ACM certificate. It is in my root account in AWS Organizations.

Now how should I use that certificate when creating an API Gateway and custom domain in another account (the production child account)? In the console, I don’t get the option to select the root account’s ACM certificate.
Thanks,
Niraj

niraj-bpsoftware · February 20, 2019, 9:53pm

Hi Everyone.
I was able to do this by creating a sub-domain in the production account, like prod.company.com, and then creating an NS record in the master/root account pointing to this sub-domain in the production account.
In the production account’s sub-domain, I created an Alias record to point to the production custom domain.
But I had to create another ACM certificate in the production account, as mentioned earlier by Vadorequest-SL.

Vadorequest-SL · February 21, 2019, 6:49pm

Indeed, you need to create the ACM certificate in the AWS Account where you’ll use it. You cannot create it in another account; it won’t be usable. It must be within the same account. (You can create an ACM certificate with the same domain names in separate accounts though; there is no conflict about that.)

I am not quite sure if you fixed all your issues, but I eventually documented the whole thing internally. Here it is, hope it helps.

Configure an application, with each stage running in a different AWS Account

The AWS configuration to run an app with each stage living in a different AWS Account is a bit tedious, to say the least.

ACM Certificate limitation

For instance, it is not possible for an AWS Account to access another AWS Account’s ACM Certificates.
This means that each certificate must be created in the proper AWS Account.
Therefore, there is a certificate for each stage.

Also, when configuring an additional custom domain (like my-budget-advisor.com, which acts as the end-user production website for hep.the-funding-place.org),
the certificate must contain all domain names, like my-budget-advisor.com and hep.the-funding-place.org, otherwise there will be an SSL handshake failure.

This means you either need to know the final end-user domain name when creating the production certificate,
or have to create another certificate later and replace the certificate used by the Custom Domain once it is issued.

Steps overview

The domain should have been bought from the Root AWS Account.
Since we don’t want to deploy anything on the-funding-place.org itself but only on sub-domains then we don’t have to change anything on the apex domain.

Instead, we’ll create a Hosted Zone for each domain you need to create, in the appropriate AWS Account (staging, production, etc.)

Then, we’ll create a “NS” Record Set for each of those domains, so that the requests going to a domain are handled by this domain.
This allows each stage to handle its own DNS properly (unlike what was done with unly.org and staging.unly.org)
and therefore allows developers to deal with DNS of the stage they have access to, instead of configuring all DNS through the AWS Root Account.
(which is neither developer-friendly, nor practical, nor a good design but rather a noobie mistake [that one’s on me!])

Steps to follow - Example with demo.the-funding-place.org

Now that we know a bit more about the limitations we have to deal with, let’s look at the steps to set up the whole thing:

Connect to the AWS Console with your usual AWS Account, which has cross-accounts privileges (ability to go to other AWS Accounts)
Switch to the production account of your application, for instance [production] TFP in our case (cross account)
1. Go to the Route 53 service → Hosted Zones
2. Create a new Domain Name demo.the-funding-place.org
3. Change its NS TTL value to 60 instead of 172800, so that if you ever need to change the NS records in the future it takes 1 min instead of 2 days to propagate
4. Copy its NS value (4 rows of domain names starting with ns-)
Switch to the AWS Root Account (you need to leave the cross-account mode by clicking on “Back to YOUR_EMAIL”) [Only an admin can do that, because it requires write access to the AWS Account “Root”]
1. Go to the Route 53 service → Hosted Zones
2. Select your hosted zone, for instance the-funding-place.org in our case
3. Create a new Record Set, which will link the sub-domain to the DNS configuration defined in the [production] TFP AWS Account
  - Type: “NS”
  - Name: corresponding to the sub-domain you’re configuring, demo in our case (full name becomes demo.the-funding-place.org)
  - Value: The NS value you previously copied
Nice! Now, the DNS settings defined in the [production] TFP AWS Account are applied to demo.the-funding-place.org and we can now create the ACM certificate
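
For those who prefer the AWS CLI over the Console, the NS record from step 3 can be sketched as a Route 53 change batch. The helper name ns_change_batch, the name server values and the hosted zone ID in the example are placeholders; the JSON shape is the one expected by aws route53 change-resource-record-sets.

```shell
# Print a Route 53 change batch that delegates a sub-domain to name servers.
# Usage: ns_change_batch RECORD_NAME TTL_SECONDS NAME_SERVER [NAME_SERVER...]
ns_change_batch() {
  name="$1"; ttl="$2"; shift 2
  records=""
  for ns in "$@"; do
    records="${records:+$records, }{\"Value\": \"$ns\"}"
  done
  printf '{"Changes": [{"Action": "UPSERT", "ResourceRecordSet": {"Name": "%s", "Type": "NS", "TTL": %s, "ResourceRecords": [%s]}}]}\n' \
    "$name" "$ttl" "$records"
}

# Example, run against the parent hosted zone in the Root account (not executed here):
#   aws route53 change-resource-record-sets --hosted-zone-id ZXXXXXXXXXXXXX \
#     --change-batch "$(ns_change_batch demo.the-funding-place.org 60 ns-1.example.org ns-2.example.com)"
```

Pass the four name servers copied from the sub-domain’s hosted zone as the trailing arguments.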
Switch to the production account of your application, for instance [production] TFP in our case (cross account)
1. Go to Certificate Manager service
2. Select the N. Virginia region (us-east-1) (all certificates must always be created in this region to benefit from edge endpoints)
3. Request a Certificate
  - Set the most global name as first name (important for the serverless-domain-manager plugin to resolve your certificate correctly), for instance demo.the-funding-place.org
  - Set all other names you may need, usually *.demo.the-funding-place.org is enough and covers potential future use-cases
  - Set all end-user production domain names (such as my-budget-advisor.com) if you know them (you can still create a new certificate later, update the domain name and remove the now-unused certificate, but that sounds like a pain, doesn’t it?)
4. Validate and wait for the button Create record in Route 53 (if you added external domain names such as my-budget-advisor.com, then you may need to manually add those records in the appropriate Hosted Zone, if there is no button to do it automatically)
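
The same certificate request can be sketched with the AWS CLI. The helper below (acm_request_command, a made-up name) only prints the command so it can be reviewed first; it keeps the most global name as the primary domain and assumes DNS validation and the us-east-1 region, as in the steps above.

```shell
# Print the AWS CLI command that requests a DNS-validated certificate.
# Usage: acm_request_command PRIMARY_DOMAIN [ALTERNATIVE_NAME...]
acm_request_command() {
  primary="$1"; shift
  cmd="aws acm request-certificate --region us-east-1 --validation-method DNS --domain-name $primary"
  if [ "$#" -gt 0 ]; then
    cmd="$cmd --subject-alternative-names"
    for name in "$@"; do
      cmd="$cmd '$name'"   # quoted so the wildcard is not expanded by the shell
    done
  fi
  printf '%s\n' "$cmd"
}

# Example:
#   acm_request_command demo.the-funding-place.org '*.demo.the-funding-place.org' my-budget-advisor.com
```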
Nice! Your production setup is ready. Let’s see if we can generate the custom domain by running sls create_domain. You must configure NODE_ENV and the stage properly or it won’t work:
GROUP_NAME=demo NODE_ENV=production sls create_domain -s demo
You should get a message like

‘demo.the-funding-place.org’ was created/updated. New domains may take up to 40 minutes to be initialized.

and then have to wait a good 30 min for the process to finish.
You can check its status in API Gateway → Custom Domains (make sure to select your region, like “Ireland”!)
Now that the custom domain is being created, let’s set up the staging environment
Switch to the staging account of your application, for instance [staging] TFP in our case (cross account)
1. Go to the Route 53 service → Hosted Zones
2. Create a new Domain Name staging.demo.the-funding-place.org
3. Change its NS TTL value to 60 instead of 172800, so that if you ever need to change the NS records in the future it takes 1 min instead of 2 days to propagate
4. Copy its NS value (4 rows of domain names starting with ns-)
Switch to the production account of your application, for instance [production] TFP in our case (cross account)
1. Go to the Route 53 service → Hosted Zones
2. Select your hosted zone, for instance demo.the-funding-place.org in our case
3. Create a new Record Set, which will link the sub-domain to the DNS configuration defined in the [staging] TFP AWS Account
  - Type: “NS”
  - Name: corresponding to the sub-domain you’re configuring, staging in our case (full name becomes staging.demo.the-funding-place.org)
  - Value: The NS value you previously copied
Nice! Now, the DNS settings defined in the [staging] TFP AWS Account are applied to staging.demo.the-funding-place.org and we can now create the ACM certificate
Switch to the staging account of your application, for instance [staging] TFP in our case (cross account)
1. Go to Certificate Manager service
2. Select the N. Virginia region (us-east-1) (all certificates must always be created in this region to benefit from edge endpoints)
3. Request a Certificate
  - Set the most global name as first name (important for the serverless-domain-manager plugin to resolve your certificate correctly), for instance staging.demo.the-funding-place.org
  - Set all other names you may need, usually *.staging.demo.the-funding-place.org is enough and covers potential future use-cases
4. Validate and wait for the button Create record in Route 53
Nice! Your staging setup is ready. Let’s see if we can generate the custom domain by running sls create_domain. You must configure NODE_ENV and the stage properly or it won’t work:
GROUP_NAME=staging.demo NODE_ENV=staging sls create_domain -s demoStaging
You should get a message like

‘staging.demo.the-funding-place.org’ was created/updated. New domains may take up to 40 minutes to be initialized.

and then have to wait a good 30 min for the process to finish.
You can check its status in API Gateway → Custom Domains (make sure to select your region, like “Ireland”!)

Piece of cake, right? Yeah, that’s what documentation is for.
I bet you wish you had to figure all this out by yourselves

I did.

niraj-bpsoftware · February 21, 2019, 11:18pm

That’s really helpful, thank you very much. It helped me validate what I have done.
I also like the trick of treating the PRODUCTION domain as the root for the STAGING account, which makes sense as you want both of those accounts to be identical (as much as possible).
