The DevOps methodology has grown in importance with the advent of agile movements and new technology stacks, which demand more focus on configuration management, provisioning of agile infrastructure and continuous deployment of applications. This extension of the agile methodologies brought software and systems engineers closer together. However, some companies, mainly startups, decided to give developers the responsibilities of a systems engineer to avoid having dedicated people in the role, in an attempt to reduce costs. Naturally, this bold move came with consequences.
Despite all the benefits of multi-skilled individuals (each discipline requires years of training, and people like Leonardo da Vinci are rare to find), the added responsibility took time away from what the developers were trained to do: writing code. To remove that burden, products like Elastic Beanstalk, Heroku and Google App Engine became available. However, those products work fine for simple applications but don't provide the flexibility that more complex applications demand.
AWS, one of the most mature cloud providers to date, has built some useful tools that solve some of the most time-consuming tasks of an automation engineer, shifting the focus of those overloaded developers back to writing code.
This post is the first of a series of three that explains how to build an infrastructure with multiple environments, a Continuous Delivery (CD) pipeline and zero-downtime deployments on AWS. The main objective is to build it in an easy and modular way, to reduce costs and decrease the time spent bootstrapping application and delivery infrastructure.
To design the application infrastructure, we're going to use CloudFormation, which allows systems administrators to manage AWS resources declaratively. It helps provision and manage those resources predictably by using text-based templates, written in JSON or YAML. One of the advantages of declaring the infrastructure this way is the ability to version-control the changes and, optionally, to automate the provisioning of the resources, managing them as we typically do with code. Automating infrastructure changes is not in the scope of this article, but if you're interested in doing so, please read this blog post from AWS.
The easiest way to design a CloudFormation template is to use the CloudFormation Designer. This tool is best used to define the skeleton of the AWS resources in the infrastructure, but it doesn't eliminate the need to know the configuration details and to edit the template file manually. However, I found the documentation easy to read, and the learning curve is relatively gentle if you're familiar with AWS.
To maximize reuse, let's design the application template to be as modular as possible by using CloudFormation's nested stacks capability and by defining input parameters, which will be used for inter-stack communication at creation time.
The following image depicts the application infrastructure, designed with CloudFormation Designer.
The first thing we're going to define are the template input parameters, in a way that avoids overly specific configuration. Writing the template as generically as possible allows us to replicate stacks and, for example, create multiple environments from a single template. Our template will be used to create the QA and Production environments inside a single VPC. A dedicated VPC for each environment would be the right choice, since it would separate the environments logically and satisfy the production environment's stricter security requirements, and the template could be changed to fit that ideal scenario. However, for simplicity and to keep the budget as low as possible, we're going to place all the resources inside the same VPC.
The template input parameters are defined in the Parameters block, which should be placed at the top of the template for better readability. Here's an example of the ApplicationName parameter:
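A minimal YAML sketch of the Parameters block; the description and default value are illustrative:

```yaml
Parameters:
  ApplicationName:
    Type: String
    Description: Name of the application, used to name and tag resources
    Default: sample-app
```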
The following list defines all the parameters needed to make this template generic enough to be reusable:
To achieve a fault-tolerant configuration, we will configure two private subnetworks in the same region but in different Availability Zones (AZs), which gives us another layer of logical separation, since a subnetwork doesn't span AZs. This configuration can't prevent the application from going down after a catastrophe in the deployment region, but two AZs guarantee a minimum of fault tolerance: if one datacenter goes down, the instances in the other AZ are kept in service by the Load Balancer (LB). This guards against extremely improbable situations, like the cleaning person accidentally unplugging the rack to plug in a vacuum cleaner and sweep the datacenter room.
There are three properties we need to configure in each private subnetwork: VPC ID, CIDR block and AZ. Since the VPC ID and the subnet CIDR block are both template input parameters, we will reference them inside each AWS::EC2::Subnet resource for CloudFormation to resolve at runtime. Note that each subnet CIDR block must be valid within the VPC CIDR block. The AZ name depends on the region, so we need to construct it from the input region name. One way of doing this is with CloudFormation's Fn::Join intrinsic function. For example, for the region us-west-2, joining the region name with the suffix a yields the AZ name us-west-2a.
The following code block shows the private subnet 1 configuration:
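A YAML sketch of that resource; the parameter names (VpcId, PrivateSubnet1CidrBlock, Region) are assumptions, and !Ref/!Join are the YAML short forms of the Ref and Fn::Join intrinsic functions:

```yaml
PrivateSubnet1:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref VpcId                        # template input parameter
    CidrBlock: !Ref PrivateSubnet1CidrBlock  # must fall within the VPC CIDR block
    AvailabilityZone: !Join ["", [!Ref Region, "a"]]  # e.g. us-west-2 + a -> us-west-2a
```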
For our private subnetworks we need to configure the routing table. We're going to define a single route for the application subnetworks to reach the Internet, using the AWS::EC2::Route resource type. For now, we're going to use the VPC's internet gateway ID as the route's target, as a placeholder for the NAT instance we're going to configure in Part II. The EC2 instances inside these two private subnetworks won't be exposed directly to the Internet; configuring a NAT to translate outbound requests is how we're going to secure these networks. Otherwise, we would have to configure an Internet Gateway and assign an Elastic IP (EIP) to each instance, exposing them to the outside world.
The following snippet refers to the private subnet’s route to the Internet. The internet gateway ID destination is just a placeholder for now:
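A sketch of such a route; the PrivateRouteTable resource and InternetGatewayId parameter names are assumptions:

```yaml
PrivateRouteToInternet:
  Type: AWS::EC2::Route
  Properties:
    RouteTableId: !Ref PrivateRouteTable
    DestinationCidrBlock: 0.0.0.0/0
    GatewayId: !Ref InternetGatewayId  # placeholder, to be replaced by a NAT in Part II
```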
Now it's time to configure the LB. By default, the application LB will forward outside requests to the application nodes using a round-robin algorithm. This design helps us achieve two requirements we defined for our application stack: fault tolerance and horizontal scalability. With this setup, the LB balances requests between the healthy instances across the two AZs. If the application needs to scale out, the Auto Scaling Group (ASG) will instantiate new nodes inside those AZs, distributing the load across more instances and reducing the pressure on the original set.
The LB is responsible for keeping only healthy nodes in service. If a node becomes unhealthy, the LB will remove it from rotation automatically. This behaviour must be configured in the LB properties, and every instance node must expose a health check endpoint for the LB to query.
The following code shows the LB's HealthCheck property pointing to the input parameter HealthCheckTarget. An example of a HealthCheckTarget value is HTTP:8080/health, which means the instances must expose an HTTP endpoint on port 8080 at the /health path. A healthy node responds with HTTP 200 OK. The LB's connection listeners must also be configured with the front-end (LB) port and the back-end (instance) port. For more information, please check the user guide.
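A sketch of a classic Elastic Load Balancer with such a health check; the resource and parameter names other than HealthCheckTarget are assumptions, and the thresholds are illustrative:

```yaml
ApplicationLoadBalancer:
  Type: AWS::ElasticLoadBalancing::LoadBalancer
  Properties:
    Subnets:
      - !Ref PublicSubnet1
      - !Ref PublicSubnet2
    Listeners:
      - LoadBalancerPort: "80"  # front-end (LB) port
        InstancePort: "8080"    # back-end (instance) port
        Protocol: HTTP
    HealthCheck:
      Target: !Ref HealthCheckTarget  # e.g. HTTP:8080/health
      HealthyThreshold: "3"
      UnhealthyThreshold: "5"
      Interval: "30"
      Timeout: "5"
```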
With the LB properly configured, we now need to set up the networking that exposes it to the Internet so it can handle outside requests. For that, public subnetworks for the LB nodes need to be configured: one per AZ, mirroring what we did for the application's private subnetworks. A VPC ID, a CIDR block and an AZ will be passed as references, as the following block shows:
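A sketch of the first public subnet; the parameter names are assumptions:

```yaml
PublicSubnet1:
  Type: AWS::EC2::Subnet
  Properties:
    VpcId: !Ref VpcId
    CidrBlock: !Ref PublicSubnet1CidrBlock
    AvailabilityZone: !Join ["", [!Ref Region, "a"]]
```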
A single route will be configured in the LB networking route table for accessing the Internet. In this case, the Internet Gateway (IGW) will be the route's target, making the LB reachable through its assigned public IP address.
Here’s the route configuration:
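A sketch, assuming a PublicRouteTable resource and an InternetGatewayId input parameter:

```yaml
PublicRouteToInternet:
  Type: AWS::EC2::Route
  Properties:
    RouteTableId: !Ref PublicRouteTable
    DestinationCidrBlock: 0.0.0.0/0
    GatewayId: !Ref InternetGatewayId
```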
An Auto Scaling Group (ASG) is a group of EC2 instances that share the same characteristics, aimed at automatic scaling and easier management. The stack we're building will have just a simple ASG that keeps a fixed number of healthy instances up and running. If you want to configure an ASG with more complex policies, like CPU percentage or memory usage thresholds, please check the quick reference. For demonstration purposes, we're configuring the ASG with a single instance using the AWS::AutoScaling::AutoScalingGroup resource type, as follows:
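A sketch of such an ASG; the referenced resource names are assumptions:

```yaml
AutoScalingGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    AvailabilityZones:
      - !Join ["", [!Ref Region, "a"]]
      - !Join ["", [!Ref Region, "b"]]
    VPCZoneIdentifier:            # subnets must reside in the AZs above
      - !Ref PrivateSubnet1
      - !Ref PrivateSubnet2
    LaunchConfigurationName: !Ref LaunchConfiguration
    LoadBalancerNames:
      - !Ref ApplicationLoadBalancer
    MinSize: "1"                  # a single instance, for demonstration purposes
    MaxSize: "1"
    DesiredCapacity: "1"
```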
Note the AvailabilityZones and VPCZoneIdentifier properties, which define where the application nodes will be instantiated. The subnetworks configured in the VPC zone identifier must reside in the configured AZs, and ApplicationLoadBalancer references the previously configured LB.
LaunchConfigurationName points to a Launch Configuration (LC), which the ASG uses to launch EC2 instances. A simple LC defines an instance type, an Amazon Machine Image (AMI) ID, a key pair for logging into the instances and a Security Group (SG), which we'll cover in the next section. You can review the LC properties in the following resource block:
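A sketch of the LC, assuming InstanceType and KeyName input parameters and an ApplicationSecurityGroup resource:

```yaml
LaunchConfiguration:
  Type: AWS::AutoScaling::LaunchConfiguration
  Properties:
    ImageId: ami-f173cc91       # pre-baked Linux AMI from Amazon
    InstanceType: !Ref InstanceType
    KeyName: !Ref KeyName       # key pair for logging into the instances
    SecurityGroups:
      - !Ref ApplicationSecurityGroup
```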
The AMI ID ami-f173cc91 is just a pre-baked Linux AMI from Amazon, but you can use others, such as your own custom-built AMIs.
To control the traffic to the LB and the application instances, we're going to configure two Security Groups (SGs), attached to the VPC ID input parameter. We could also use Network Access Control Lists (ACLs) in conjunction with SGs, or even as an alternative, but SGs give us protection at the instance level, which is perfect for creating an application-scoped security context.
The table below summarizes the differences between SGs and ACLs, but you can read more details in the VPC security user guide.
| Security Group | Access Control List |
|---|---|
| Return traffic is always allowed (stateful) | Return traffic must be explicitly allowed (stateless) |
| Applied at the instance level | Applied at the subnetwork level |
| Rules to allow traffic only | Rules to allow and deny traffic |
The following resource block is the LB security group configuration, allowing ingress Internet traffic only on port 80:
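A sketch of that SG; the resource name and description are assumptions:

```yaml
LoadBalancerSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Allow HTTP traffic from the Internet to the LB
    VpcId: !Ref VpcId
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: "80"
        ToPort: "80"
        CidrIp: 0.0.0.0/0  # any source
```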
The following SG rules limit ingress traffic to the application instances to the VPC's network and allow outbound traffic to the Internet:
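A sketch, assuming a VpcCidrBlock input parameter and the instance port 8080 from the health check example:

```yaml
ApplicationSecurityGroup:
  Type: AWS::EC2::SecurityGroup
  Properties:
    GroupDescription: Allow traffic from inside the VPC to the application instances
    VpcId: !Ref VpcId
    SecurityGroupIngress:
      - IpProtocol: tcp
        FromPort: "8080"
        ToPort: "8080"
        CidrIp: !Ref VpcCidrBlock  # only sources inside the VPC
    SecurityGroupEgress:
      - IpProtocol: "-1"           # all protocols
        CidrIp: 0.0.0.0/0
```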
To test the stack, one can use the AWS CLI as shown below and watch the stack being created in the AWS Console:
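A sketch of the create-stack call; the stack name, template file name and parameter values are illustrative:

```shell
aws cloudformation create-stack \
  --stack-name sample-app-qa \
  --template-body file://application-template.json \
  --parameters ParameterKey=ApplicationName,ParameterValue=sample-app \
               ParameterKey=KeyName,ParameterValue=my-key-pair \
  --disable-rollback
```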
If something goes wrong, the disable-rollback flag is useful for debugging: it freezes the state of the stack upon error so you can check error messages and logs. Please read the documentation reference for more details.
This concludes Part I. We created a template for our application infrastructure using CloudFormation Designer. This tool doesn't exempt the designer from knowing how to create a template manually, since there's no such thing as a free lunch, but it reduces the effort of creating one from scratch. Besides all the other benefits, you also end up with documented infrastructure. How cool is that?
You can check out the complete template in BytePitch's GitHub account.
So far, we have addressed three of our goals:
In the next post of this series, we’re going to:
See you next week.