
Terraform – Managing Auto Scaling Groups & Load Balancers


By Andrei Maksimov

March 16, 2021




This article continues the Terraform article series and covers how to use Terraform to create Auto Scaling Groups in the AWS cloud – a collection of EC2 instances that share similar characteristics and are treated as a logical grouping for the purposes of scaling and management.

Update: 2020 Oct

Terraform code updated to support newer syntax.

As soon as you learn how to manage basic network infrastructure in AWS using Terraform (see “Terraform recipe – Managing AWS VPC – Creating Public Subnet” and “Terraform recipe – Managing AWS VPC – Creating Private Subnets”), you want to start creating auto-scalable infrastructures.

Auto Scaling Groups

Usually, Auto Scaling Groups, a core part of Amazon EC2 Auto Scaling, are used to control the number of instances executing the same task, such as rendering dynamic web pages for your website, encoding videos and images, or computing machine learning models. With scaling policies, you can ensure your fleet always runs the right number of EC2 instances.

Auto Scaling Groups also allow you to control your server pool size dynamically: increase it when your web servers are processing more traffic or tasks than usual, or decrease it when things get quieter.

Either way, this feature helps you save your budget and makes your infrastructure significantly more fault-tolerant.

Let’s build a simple infrastructure, which consists of several web servers for serving website traffic. In the following article, we’ll add RDS DB to our infrastructure.

Our infrastructure will be the following:

You may find the complete .tf file source code in my GitHub repository.

Setting up the VPC

Let’s assemble it in a new infrastructure.tf file. First of all, let’s declare a VPC, two Public Subnets, an Internet Gateway, and a Route Table (we may take this example as a base):

resource "aws_vpc" "my_vpc" {
  cidr_block       = "10.0.0.0/16"
  enable_dns_hostnames = true
  tags = {
    Name = "My VPC"
  }
}
resource "aws_subnet" "public_us_east_1a" {
  vpc_id     = aws_vpc.my_vpc.id
  cidr_block = "10.0.0.0/24"
  availability_zone = "us-east-1a"
  tags = {
    Name = "Public Subnet us-east-1a"
  }
}
resource "aws_subnet" "public_us_east_1b" {
  vpc_id     = aws_vpc.my_vpc.id
  cidr_block = "10.0.1.0/24"
  availability_zone = "us-east-1b"
  tags = {
    Name = "Public Subnet us-east-1b"
  }
}
resource "aws_internet_gateway" "my_vpc_igw" {
  vpc_id = aws_vpc.my_vpc.id
  tags = {
    Name = "My VPC - Internet Gateway"
  }
}
resource "aws_route_table" "my_vpc_public" {
    vpc_id = aws_vpc.my_vpc.id
    route {
        cidr_block = "0.0.0.0/0"
        gateway_id = aws_internet_gateway.my_vpc_igw.id
    }
    tags = {
        Name = "Public Subnets Route Table for My VPC"
    }
}
resource "aws_route_table_association" "my_vpc_us_east_1a_public" {
    subnet_id = aws_subnet.public_us_east_1a.id
    route_table_id = aws_route_table.my_vpc_public.id
}
resource "aws_route_table_association" "my_vpc_us_east_1b_public" {
    subnet_id = aws_subnet.public_us_east_1b.id
    route_table_id = aws_route_table.my_vpc_public.id
}

Everything here should look familiar to you. If not, we recommend checking the Terraform AWS VPC – Complete Tutorial article.

Next, we need to describe the Security Group for our web servers, which will allow HTTP connections to our instances:

resource "aws_security_group" "allow_http" {
  name        = "allow_http"
  description = "Allow HTTP inbound connections"
  vpc_id = aws_vpc.my_vpc.id
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    cidr_blocks     = ["0.0.0.0/0"]
  }
  tags = {
    Name = "Allow HTTP Security Group"
  }
}

Launch configuration

As soon as we have a Security Group, we may describe a Launch Configuration. Think of it as a template containing all the instance settings to apply to each new instance launched by the Auto Scaling Group. We’re using the aws_launch_configuration resource in Terraform to describe it.

Most of the parameters should be familiar to you, as we already used them in the aws_instance resource.

resource "aws_launch_configuration" "web" {
  name_prefix = "web-"
  image_id = "ami-0947d2ba12ee1ff75" # Amazon Linux 2 AMI (HVM), SSD Volume Type
  instance_type = "t2.micro"
  key_name = "Lenovo T410"
  security_groups = [ aws_security_group.allow_http.id ]
  associate_public_ip_address = true
  user_data = <<USER_DATA
#!/bin/bash
yum update -y
yum -y install nginx
echo "$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)" > /usr/share/nginx/html/index.html
chkconfig nginx on
service nginx start
USER_DATA
  lifecycle {
    create_before_destroy = true
  }
}

The new ones are user_data and lifecycle:

  • user_data – a special interface created by AWS for EC2 instance automation. It is usually filled with scripted instructions for the instance, which need to be executed at instance boot time. On most OSes this is handled by cloud-init.
  • lifecycle – a special instruction declaring how new launch configuration rules are applied during an update. We’re using create_before_destroy here to create new instances from the new launch configuration before destroying the old ones. This option is commonly used during rolling deployments.

The user_data option is filled with a simple Bash script, which installs the Nginx web server and writes the instance’s local IP address to the index.html file, so we can see it after the instance is up and running.

Note: the user_data is limited to 16 KB. If you need to do complex EC2 instance configuration during the startup process, consider using Ansible, Chef, or Puppet.
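If the bootstrap script outgrows an inline heredoc, you can keep it in a separate file and render it with Terraform’s templatefile() function. A minimal sketch of that variant of the launch configuration (the template file name web_userdata.sh.tpl and its nginx_port variable are hypothetical examples, not part of this article’s repository):

```hcl
# Alternative launch configuration: user_data rendered from an external
# template file instead of an inline heredoc. "web_userdata.sh.tpl" is a
# hypothetical file living next to the .tf files; ${nginx_port} inside it
# would be replaced with the value passed below.
resource "aws_launch_configuration" "web_templated" {
  name_prefix                 = "web-"
  image_id                    = "ami-0947d2ba12ee1ff75"
  instance_type               = "t2.micro"
  security_groups             = [aws_security_group.allow_http.id]
  associate_public_ip_address = true

  user_data = templatefile("${path.module}/web_userdata.sh.tpl", {
    nginx_port = 80
  })

  lifecycle {
    create_before_destroy = true
  }
}
```

This keeps shell logic out of your HCL and lets the same script template serve several launch configurations with different variables.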

Load balancer

Before we create an Auto Scaling Group, we need to declare a Load Balancer. There are three types of Load Balancers available in AWS right now: Classic Load Balancers, Application Load Balancers, and Network Load Balancers.

For simplicity, let’s create a Classic Elastic Load Balancer in front of our EC2 instances (I’ll show how to use the other types in future articles). To do that, we need to declare an aws_elb resource.

resource "aws_security_group" "elb_http" {
  name        = "elb_http"
  description = "Allow HTTP traffic to instances through Elastic Load Balancer"
  vpc_id = aws_vpc.my_vpc.id
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "-1"
    cidr_blocks     = ["0.0.0.0/0"]
  }
  tags = {
    Name = "Allow HTTP through ELB Security Group"
  }
}
resource "aws_elb" "web_elb" {
  name = "web-elb"
  security_groups = [
    aws_security_group.elb_http.id
  ]
  subnets = [
    aws_subnet.public_us_east_1a.id,
    aws_subnet.public_us_east_1b.id
  ]
  cross_zone_load_balancing   = true
  health_check {
    healthy_threshold = 2
    unhealthy_threshold = 2
    timeout = 3
    interval = 30
    target = "HTTP:80/"
  }
  listener {
    lb_port = 80
    lb_protocol = "http"
    instance_port = "80"
    instance_protocol = "http"
  }
}

Here we’re setting up the Load Balancer name and its own Security Group, so we can make traffic rules more restrictive later if we want to.

We’re specifying the two subnets where our Load Balancer will look for launched instances, and we’ve turned on the cross_zone_load_balancing feature so we can have instances in different Availability Zones.

Finally, we’ve specified the health_check configuration, which determines when the Load Balancer should transition instances between the healthy and unhealthy states depending on its ability to reach HTTP port 80 on the target instance.

If the ELB cannot reach an instance on the specified port, it stops sending traffic to it.
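For comparison, the same HTTP front end could be built with an Application Load Balancer instead of the Classic ELB. A minimal sketch using the aws_lb family of resources (the resource names web_alb, web, and web_http are illustrative, not from this article’s repository):

```hcl
# Application Load Balancer alternative to the Classic ELB above.
resource "aws_lb" "web_alb" {
  name               = "web-alb"
  load_balancer_type = "application"
  security_groups    = [aws_security_group.elb_http.id]
  subnets = [
    aws_subnet.public_us_east_1a.id,
    aws_subnet.public_us_east_1b.id
  ]
}

# Target group performing the same HTTP:80 health check.
resource "aws_lb_target_group" "web" {
  name     = "web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.my_vpc.id

  health_check {
    path                = "/"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 3
    interval            = 30
  }
}

# Forward all port 80 traffic to the target group.
resource "aws_lb_listener" "web_http" {
  load_balancer_arn = aws_lb.web_alb.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}
```

With an ALB, the Auto Scaling Group would reference target_group_arns instead of the load_balancers argument used later in this article.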

Auto Scaling group

Now we’re ready to create an Auto Scaling Group by describing it using aws_autoscaling_group resource:

resource "aws_autoscaling_group" "web" {
  name = "${aws_launch_configuration.web.name}-asg"
  min_size             = 1
  desired_capacity     = 2
  max_size             = 4
  
  health_check_type    = "ELB"
  load_balancers = [
    aws_elb.web_elb.id
  ]
  launch_configuration = aws_launch_configuration.web.name
  enabled_metrics = [
    "GroupMinSize",
    "GroupMaxSize",
    "GroupDesiredCapacity",
    "GroupInServiceInstances",
    "GroupTotalInstances"
  ]
  metrics_granularity = "1Minute"
  vpc_zone_identifier  = [
    aws_subnet.public_us_east_1a.id,
    aws_subnet.public_us_east_1b.id
  ]
  # Required to redeploy without an outage.
  lifecycle {
    create_before_destroy = true
  }
  tag {
    key                 = "Name"
    value               = "web"
    propagate_at_launch = true
  }
}

Here we have the following configuration:

  • There will be a minimum of one instance to serve the traffic.
  • The Auto Scaling Group will be launched with 2 instances, each placed in a separate Availability Zone in a different subnet.
  • The Auto Scaling Group will get information about instance availability from the ELB.
  • We’ve set up the collection of several CloudWatch metrics to monitor the Auto Scaling Group’s state.
  • Each instance launched from this Auto Scaling Group will have its Name tag set to web.
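As an aside, AWS has since deprecated launch configurations in favor of launch templates. A hedged sketch of the equivalent wiring (resource names web and web_lt are illustrative; attribute choices mirror the launch configuration above):

```hcl
# Launch template equivalent of the launch configuration above.
resource "aws_launch_template" "web" {
  name_prefix   = "web-"
  image_id      = "ami-0947d2ba12ee1ff75"
  instance_type = "t2.micro"

  network_interfaces {
    associate_public_ip_address = true
    security_groups             = [aws_security_group.allow_http.id]
  }
}

# The ASG references the template (and a version) via a launch_template
# block instead of the launch_configuration argument.
resource "aws_autoscaling_group" "web_lt" {
  min_size         = 1
  desired_capacity = 2
  max_size         = 4

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }

  vpc_zone_identifier = [
    aws_subnet.public_us_east_1a.id,
    aws_subnet.public_us_east_1b.id
  ]
}
```

Launch templates also support versioning, so you can roll the group forward or back between template versions without recreating the resource.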

Now we’re almost ready. Let’s output the Load Balancer DNS name from the Terraform infrastructure description:

output "elb_dns_name" {
  value = aws_elb.web_elb.dns_name
}

And try to deploy the infrastructure using the commands below:

terraform init
terraform plan
terraform apply

[Screenshot: Auto Scaling Group and Load Balancer launch]

From this point on, you can open the provided ELB URL in your browser and refresh the page several times to see the different local IP addresses of your newly launched instances.

Auto Scaling policies

But this configuration is static: we haven’t yet defined any rules that add or remove instances in response to load. AWS supports several kinds of scaling policies, including target tracking policies (with predefined or customized metric specifications) and predictive scaling; in this article, we’ll use simple scaling policies driven by CloudWatch alarms.

To make our infrastructure dynamic, we need to create several Auto Scaling Policies and CloudWatch Alarms.

First, let’s define how AWS should scale our group UP by declaring aws_autoscaling_policy and aws_cloudwatch_metric_alarm resources:

resource "aws_autoscaling_policy" "web_policy_up" {
  name = "web_policy_up"
  scaling_adjustment = 1
  adjustment_type = "ChangeInCapacity"
  cooldown = 300
  autoscaling_group_name = aws_autoscaling_group.web.name
}
resource "aws_cloudwatch_metric_alarm" "web_cpu_alarm_up" {
  alarm_name = "web_cpu_alarm_up"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  evaluation_periods = "2"
  metric_name = "CPUUtilization"
  namespace = "AWS/EC2"
  period = "120"
  statistic = "Average"
  threshold = "60"
  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }
  alarm_description = "This metric monitors EC2 instance CPU utilization"
  alarm_actions = [ aws_autoscaling_policy.web_policy_up.arn ]
}

aws_autoscaling_policy defines how AWS should change the Auto Scaling Group’s desired capacity: here, a positive scaling_adjustment of 1 adds one instance whenever the associated CloudWatch alarm fires.

The cooldown option gives our infrastructure some time (300 seconds) after a scaling adjustment before the Auto Scaling Group can be increased again.

aws_cloudwatch_metric_alarm is a straightforward alarm that fires when the average CPU utilization across all instances in our Auto Scaling Group is greater than or equal to the 60% threshold for two consecutive 120-second periods.

Next, we declare almost the same pair of resources to scale our Auto Scaling Group down:

resource "aws_autoscaling_policy" "web_policy_down" {
  name = "web_policy_down"
  scaling_adjustment = -1
  adjustment_type = "ChangeInCapacity"
  cooldown = 300
  autoscaling_group_name = aws_autoscaling_group.web.name
}
resource "aws_cloudwatch_metric_alarm" "web_cpu_alarm_down" {
  alarm_name = "web_cpu_alarm_down"
  comparison_operator = "LessThanOrEqualToThreshold"
  evaluation_periods = "2"
  metric_name = "CPUUtilization"
  namespace = "AWS/EC2"
  period = "120"
  statistic = "Average"
  threshold = "10"
  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.web.name
  }
  alarm_description = "This metric monitors EC2 instance CPU utilization"
  alarm_actions = [ aws_autoscaling_policy.web_policy_down.arn ]
}

Here we’re decreasing the Auto Scaling Group size by one instance, no more than once every 300 seconds, whenever its average CPU utilization drops to 10% or below.
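As an aside, the same elasticity can often be achieved with a single target tracking policy, which lets AWS create and manage the CloudWatch alarms itself instead of the two hand-written alarm/policy pairs above. A hedged sketch (the resource name web_cpu_target is illustrative):

```hcl
# Target tracking keeps average CPU near the target value, replacing
# both simple scaling policies and their hand-written CloudWatch alarms.
resource "aws_autoscaling_policy" "web_cpu_target" {
  name                   = "web-cpu-target"
  policy_type            = "TargetTrackingScaling"
  autoscaling_group_name = aws_autoscaling_group.web.name

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 60.0
  }
}
```

Target tracking scales both up and down around the target value, so you would use it instead of, not alongside, the web_policy_up and web_policy_down resources.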

Apply these rules by running the following commands:

terraform plan
terraform apply

In a couple of minutes, you’ll see a fired alarm in CloudWatch:

[Screenshot: CloudWatch alarm fired]

This will cause one of the two instances to be terminated:

[Screenshot: CloudWatch alarm result – instance termination]

Summary

In this article, you’ve learned how to set up a dynamic AWS Auto Scaling Group and Load Balancer, with scaling policies that adjust capacity and distribute traffic to your instances across several Availability Zones.

I hope this article was helpful. If so, please help us spread it to the world!

Stay tuned!

Andrei Maksimov

I’m a passionate Cloud Infrastructure Architect with more than 20 years of experience in IT. In addition to the tech, I'm covering Personal Finance topics at https://amaksimov.com.

Any of my posts represent my personal experience and opinion about the topic.
