AWS CloudFormation ECS Stack JSON config with Launch Config, Auto Scaling Group and Load Balancer

A while back, my company switched to Docker and ECS for our application. I wanted a structured way to create AWS resources and found that AWS CloudFormation is a great way to do this. It took a lot of trial and error to figure everything out, so I thought posting a rough tutorial might help others trying to do the same thing. Here we go…

There are a few things which need to be generated manually before you can bring up the stack:
1. Create a VPC with public subnets (or use an existing one) and a VPC key pair (or use an existing one). In this example, the VPC key pair name is VPC-Example-Key-Pair
2. Create an ECS Cluster. In this example, the ECS Cluster name is example-ecs-cluster
3. Create an S3 bucket and upload the initialization script shown below. In this example, the file is uploaded to this S3 path: s3://example-bucket/ecs/userdata/example-ecs-init-script.sh
4. Create an IAM Role and give it the “AmazonEC2ContainerServiceforEC2Role” policy, S3 read access to the bucket where the EC2 initialization script lives, permission to register with an Elastic Load Balancer, and a few other things. I created a custom IAM policy for our ECS instances; see the policy below. You may also want to add policies for whatever other permissions your application needs. In this example, the IAM Role name is example-ecs-role
5. Create an EC2 Security Group for the Load Balancer and the EC2 Instances
6. Optional – Create an SNS Notification Topic for EC2 instance autoscaling life cycle events. In this example, the SNS Topic name is example-ecs-autoscale-topic
7. Optional – Upload an SSL certificate for the load balancer. In this example, the SSL Cert ARN is arn:aws:iam::1234567890:server-certificate/2016_wildcard.example.com
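
Some of these prerequisites can also be created from the AWS CLI instead of the console. The sketch below covers only steps 2, 3 and 6, using the example names from this post. The function name create_prereqs and the AWS override variable are my own additions for illustration, and it assumes the init script sits in the current directory.

```shell
#!/bin/sh
# Sketch: create a few of the prerequisites with the AWS CLI.
# AWS can be overridden (e.g. AWS=echo) to print the commands instead of running them.
AWS="${AWS:-aws}"

create_prereqs() {
    cluster="$1"   # e.g. example-ecs-cluster
    bucket="$2"    # e.g. example-bucket
    topic="$3"     # e.g. example-ecs-autoscale-topic

    # Step 2: the ECS cluster
    "$AWS" ecs create-cluster --cluster-name "$cluster"

    # Step 3: the bucket and the init script
    "$AWS" s3 mb "s3://$bucket"
    "$AWS" s3 cp example-ecs-init-script.sh \
        "s3://$bucket/ecs/userdata/example-ecs-init-script.sh"

    # Step 6 (optional): the SNS topic for autoscaling life cycle events
    "$AWS" sns create-topic --name "$topic"
}
```

It would be called as: create_prereqs example-ecs-cluster example-bucket example-ecs-autoscale-topic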

I decided to create these resources manually. I wanted them to exist outside the life cycle of the CloudFormation stack because they can be shared by more than one stack, and I felt more comfortable maintaining them by hand.

EC2 instance initialization script, which tells the ECS Agent which cluster the EC2 instance belongs to and provides authorization for a private Docker repo:

#!/bin/sh
 
# update host to latest packages
yum -y update
 
if [ -z "$1" ] || [ -z "$2" ];
then
    # missing arguments are an error, so report on stderr and exit non-zero
    echo "usage: ${0} [ECS Cluster Name] [Extended Volume Size in GB]" >&2
    exit 1
fi
 
ecsCluster=$1
extendLvmBy=$2
 
cat > /etc/ecs/ecs.config << END
ECS_CLUSTER=$ecsCluster
ECS_ENGINE_AUTH_TYPE=dockercfg
ECS_ENGINE_AUTH_DATA={"quay.io": {"auth": "BIG LONG NASTY AUTH STRING","email": ""}}
ECS_LOGLEVEL=warn
END
 
# uncomment this if you want the ECS Agent to clean up after itself once per minute, not recommended
# see http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html for more info
#echo "ECS_ENGINE_TASK_CLEANUP_WAIT_DURATION=1m" >> /etc/ecs/ecs.config
 
# do some daily docker cleanup which the ECS Agent does not seem to do
cat > /etc/cron.daily/docker-remove-dangling-images << END
#!/bin/sh
echo "Removing dangling images from docker host"
docker rmi \$(docker images -q -f "dangling=true")
END
# cron.daily only runs files that are executable
chmod +x /etc/cron.daily/docker-remove-dangling-images
 
# extend docker lvm with attached EBS volume
vgextend docker /dev/xvdcy
lvextend -L+${extendLvmBy}G /dev/docker/docker-pool
 
##
# Do other initialization stuff for your EC2 instances below
# such as running a logging container or installing custom packages
##

We use quay.io, but you could set up any private repo. Check out http://docs.aws.amazon.com/AmazonECS/latest/developerguide/private-auth.html for more info.
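
As far as I understand the dockercfg format, the "auth" value is simply the base64 encoding of username:password for the registry. A minimal sketch of generating it follows. The function name dockercfg_auth is made up for this example, and you would pass your own credentials.

```shell
#!/bin/sh
# Sketch: build the dockercfg "auth" value (base64 of "username:password").
dockercfg_auth() {
    # tr strips the line wraps some base64 implementations insert on long input
    printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}
```

Running dockercfg_auth with a username and password prints the string that goes in place of BIG LONG NASTY AUTH STRING in ECS_ENGINE_AUTH_DATA.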

Near the end of the script there is also configuration for extending Docker’s LVM volume group with an additional EBS volume, in this example by 80GB. We found that in our development cluster, where lots of images were being deployed, the default 22GB drive of the ECS Optimized image was not enough. The default 22GB is fine for most use cases, but I left this part in the example because it was difficult to figure out. If you do not need it, you can delete the extra volume from the JSON config and the related lines from the init script.
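
If you keep the extra volume, it may be worth guarding the LVM step so it is skipped cleanly when the volume is not attached. This is only a sketch, assuming the same device name and volume group as the init script above. The function name extend_docker_pool is hypothetical.

```shell
#!/bin/sh
# Sketch: extend the Docker LVM pool only if the extra EBS device is attached.
extend_docker_pool() {
    device="$1"    # e.g. /dev/xvdcy
    size_gb="$2"   # e.g. 80

    if [ ! -b "$device" ]; then
        echo "block device $device not found, skipping LVM extension" >&2
        return 1
    fi

    # same commands as the init script: add the device to the "docker"
    # volume group, then grow the pool by the requested size
    vgextend docker "$device" && lvextend -L+"${size_gb}"G /dev/docker/docker-pool
}
```

In the init script this would replace the two LVM lines with: extend_docker_pool /dev/xvdcy "$extendLvmBy"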

IAM policy for ECS instances:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt12345678900000",
            "Effect": "Allow",
            "Action": [
                "ecs:DeregisterContainerInstance",
                "ecs:DeregisterTaskDefinition",
                "ecs:DescribeClusters",
                "ecs:DescribeContainerInstances",
                "ecs:DescribeServices",
                "ecs:DescribeTaskDefinition",
                "ecs:DescribeTasks",
                "ecs:DiscoverPollEndpoint",
                "ecs:ListClusters",
                "ecs:ListContainerInstances",
                "ecs:ListServices",
                "ecs:ListTaskDefinitionFamilies",
                "ecs:ListTaskDefinitions",
                "ecs:ListTasks",
                "ecs:Poll",
                "ecs:RegisterContainerInstance",
                "ecs:RegisterTaskDefinition",
                "ecs:RunTask",
                "ecs:StartTask",
                "ecs:StopTask",
                "ecs:StartTelemetrySession",
                "ecs:SubmitContainerStateChange",
                "ecs:SubmitTaskStateChange",
                "ecs:UpdateContainerAgent",
                "ec2:Describe*",
                "ec2:AuthorizeSecurityGroupIngress",
                "elasticloadbalancing:Describe*",
                "elasticloadbalancing:DeregisterInstancesFromLoadBalancer",
                "elasticloadbalancing:RegisterInstancesWithLoadBalancer",
                "cloudwatch:ListMetrics",
                "cloudwatch:GetMetricStatistics",
                "cloudwatch:Describe*",
                "autoscaling:Describe*",
                "iam:PassRole",
                "iam:ListInstanceProfiles"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "Stmt12345678900001",
            "Effect": "Allow",
            "Action": [
                "s3:Get*",
                "s3:List*"
            ],
            "Resource": [
                "arn:aws:s3:::example-bucket/*",
                "arn:aws:s3:::example-bucket"
            ]
        }
    ]
}
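
For anyone who prefers the CLI to the console, here is a rough sketch of creating the role, attaching both policies, and creating the instance profile that the launch configuration refers to. It assumes the custom policy above is saved to a local JSON file. The function name, the AWS override variable and the "-custom" policy name suffix are my own choices, not anything AWS requires.

```shell
#!/bin/sh
# Sketch: create the instance role and profile with the AWS CLI.
# AWS can be overridden (e.g. AWS=echo) to print the commands instead of running them.
AWS="${AWS:-aws}"

create_ecs_role() {
    role="$1"          # e.g. example-ecs-role
    policy_file="$2"   # the custom policy above, saved as JSON

    # trust policy letting EC2 instances assume the role
    trust_file="$(mktemp)"
    cat > "$trust_file" << 'END'
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole"
        }
    ]
}
END

    "$AWS" iam create-role --role-name "$role" \
        --assume-role-policy-document "file://$trust_file"
    "$AWS" iam attach-role-policy --role-name "$role" \
        --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEC2ContainerServiceforEC2Role
    "$AWS" iam put-role-policy --role-name "$role" \
        --policy-name "${role}-custom" --policy-document "file://$policy_file"

    # the launch configuration references an instance profile, not the role itself
    "$AWS" iam create-instance-profile --instance-profile-name "$role"
    "$AWS" iam add-role-to-instance-profile \
        --instance-profile-name "$role" --role-name "$role"

    rm -f "$trust_file"
}
```

Giving the instance profile the same name as the role mirrors what the console does, which is why the example-ecs-role name can be used for the IamRoleInstanceProfile parameter.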

CloudFormation JSON config file:

{
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Example ECS Cluster - Creates a Load Balancer, AutoScaling Group and LaunchConfiguration against an EXISTING VPC and EXISTING ECS Cluster",
    "Parameters": {
        "EcsClusterName": {
            "Type": "String",
            "Description": "ECS Cluster Name",
            "Default": "example-ecs-cluster"
        },
        "Vpc": {
            "Type": "AWS::EC2::VPC::Id",
            "Description": "VPC for ECS Clusters",
            "Default": "vpc-abc123def"
        },
        "SubnetIds": {
            "Type": "List<AWS::EC2::Subnet::Id>",
            "Description": "Comma separated list of VPC Subnet Ids where ECS instances should run",
            "Default": "subnet-abc123,subnet-efg456,subnet-lmn789"
        },
        "AvailabilityZones": {
            "Type": "List<AWS::EC2::AvailabilityZone::Name>",
            "Description": "AutoScaling Group Availability Zones. MUST MATCH THE AZs OF THE SUBNETS",
            "Default": "us-east-1c,us-east-1d,us-east-1e"
        },
        "EcsAmiId": {
            "Type": "AWS::EC2::Image::Id",
            "Description": "Amazon ECS Optimized AMI for us-east-1 region - see http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-optimized_AMI.html",
            "Default": "ami-6df8fe7a"
        },
        "EcsInstanceType": {
            "Type": "String",
            "Description": "ECS EC2 instance type",
            "Default": "t2.nano",
            "AllowedValues": [
                "t2.nano",
                "t2.micro",
                "t2.small",
                "t2.medium",
                "t2.large",
                "m4.large",
                "m4.xlarge",
                "m4.2xlarge",
                "m4.4xlarge",
                "m4.10xlarge",
                "m3.medium",
                "m3.large",
                "m3.xlarge",
                "m3.2xlarge",
                "c4.large",
                "c4.xlarge",
                "c4.2xlarge",
                "c4.4xlarge",
                "c4.8xlarge",
                "c3.large",
                "c3.xlarge",
                "c3.2xlarge",
                "c3.4xlarge",
                "c3.8xlarge",
                "r3.large",
                "r3.xlarge",
                "r3.2xlarge",
                "r3.4xlarge",
                "r3.8xlarge",
                "i2.xlarge",
                "i2.2xlarge",
                "i2.4xlarge",
                "i2.8xlarge"
            ],
            "ConstraintDescription": "must be a valid EC2 instance type."
        },
        "KeyName": {
            "Type": "AWS::EC2::KeyPair::KeyName",
            "Description": "Name of an existing EC2 KeyPair to enable SSH access to the ECS instances",
            "Default": "VPC-Example-Key-Pair"
        },
        "IamRoleInstanceProfile": {
            "Type": "String",
            "Default": "example-ecs-role",
            "Description": "Name or the Amazon Resource Name (ARN) of the instance profile associated with the IAM role for the instance"
        },
        "AsgMinSize": {
            "Type": "Number",
            "Description": "Minimum Size Capacity of ECS Auto Scaling Group",
            "Default": "1"
        },
        "AsgMaxSize": {
            "Type": "Number",
            "Description": "Maximum Size Capacity of ECS Auto Scaling Group",
            "Default": "3"
        },
        "AsgDesiredCapacity": {
            "Type": "Number",
            "Description": "Initial Desired Size of ECS Auto Scaling Group",
            "Default": "2"
        },
        "AsgNotificationArn": {
            "Type": "String",
            "Description": "ECS Autoscale Notification SNS Topic ARN",
            "Default": "arn:aws:sns:us-east-1:1234567890:example-ecs-autoscale-topic"
        },
        "EcsClusterHostedZoneId": {
            "Type": "String",
            "Description": "Route53 Hosted Zone ID For ECS Cluster",
            "Default": "ABCDEFGHIJKLM"
        },
        "EcsClusterHostedZoneName": {
            "Type": "String",
            "Description": "Route53 Hosted Zone Domain Name For ECS Cluster",
            "Default": "myvpc.internal"
        },
        "EcsClusterHostedZoneInstanceName": {
            "Type": "String",
            "Description": "Route53 record name for instances in the ECS Cluster",
            "Default": "ecs-example"
        },
        "EcsPort": {
            "Type": "String",
            "Description": "Security Group port to open on ECS instances - defaults to port 80",
            "Default": "80"
        },
        "EcsHealthCheckEndpoint": {
            "Type": "String",
            "Description": "HealthCheck endpoint for application running on ECS cluster",
            "Default": "/healthcheck/endpoint/url"
        },
        "EcsSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup::Id",
            "Description": "ECS Instance Security Group",
            "Default": "sg-abc123def"
        },
        "LbSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup::Id",
            "Description": "Load Balancer Security Group",
            "Default": "sg-lmn456xyz"
        },
        "SslCertArn": {
            "Type": "String",
            "Description": "SSL Certificate ARN",
            "Default": "arn:aws:iam::1234567890:server-certificate/2016_wildcard.example.com"
        },
        "EC2InstanceInitScriptS3Path": {
            "Type": "String",
            "Description": "ECS Instance Init Script S3 path",
            "Default": "s3://example-bucket/ecs/userdata/example-ecs-init-script.sh"
        },
        "EcsEbsLvmVolumeSize": {
            "Type": "Number",
            "Description": "Size in GB of attached EBS volume for extending Docker's LVM disk space",
            "Default": "80"
        }
    },
    "Resources": {
        "EcsInstanceLc": {
            "Type": "AWS::AutoScaling::LaunchConfiguration",
            "Properties": {
                "ImageId": {
                    "Ref": "EcsAmiId"
                },
                "InstanceType": {
                    "Ref": "EcsInstanceType"
                },
                "AssociatePublicIpAddress": true,
                "IamInstanceProfile": {
                    "Ref": "IamRoleInstanceProfile"
                },
                "KeyName": {
                    "Ref": "KeyName"
                },
                "SecurityGroups": [
                    {
                        "Ref": "EcsSecurityGroup"
                    }
                ],
                "BlockDeviceMappings": [
                    {
                        "DeviceName": "xvdcy",
                        "Ebs": {
                            "DeleteOnTermination": "true",
                            "VolumeSize": {
                                "Ref": "EcsEbsLvmVolumeSize"
                            },
                            "VolumeType": "gp2"
                        }
                    }
                ],
                "UserData": {
                    "Fn::Base64": {
                        "Fn::Join": [
                            "",
                            [
                                "#!/bin/bash\n",
                                "yum install -y aws-cli\n",
                                "aws s3 cp ",
                                {"Ref": "EC2InstanceInitScriptS3Path"},
                                " /tmp/ecs-init.sh\n",
                                "chmod +x /tmp/ecs-init.sh\n",
                                "/tmp/ecs-init.sh ",
                                {"Ref": "EcsClusterName"},
                                " ",
                                {"Ref": "EcsEbsLvmVolumeSize"},
                                "\n"
                            ]
                        ]
                    }
                }
            }
        },
        "EcsInstanceAsg": {
            "Type": "AWS::AutoScaling::AutoScalingGroup",
            "Properties": {
                "AvailabilityZones": {
                    "Ref": "AvailabilityZones"
                },
                "VPCZoneIdentifier": {
                    "Ref": "SubnetIds"
                },
                "LaunchConfigurationName": {
                    "Ref": "EcsInstanceLc"
                },
                "MinSize": {
                    "Ref": "AsgMinSize"
                },
                "MaxSize": {
                    "Ref": "AsgMaxSize"
                },
                "DesiredCapacity": {
                    "Ref": "AsgDesiredCapacity"
                },
                "NotificationConfigurations": [
                    {
                        "NotificationTypes": [
                            "autoscaling:EC2_INSTANCE_LAUNCH",
                            "autoscaling:EC2_INSTANCE_LAUNCH_ERROR",
                            "autoscaling:EC2_INSTANCE_TERMINATE",
                            "autoscaling:EC2_INSTANCE_TERMINATE_ERROR"
                        ],
                        "TopicARN": {
                            "Ref": "AsgNotificationArn"
                        }
                    }
                ],
                "Tags": [
                    {
                        "Key": "Name",
                        "Value": {
                            "Fn::Join": [
                                "",
                                [
                                    {
                                        "Ref": "EcsClusterName"
                                    },
                                    "-auto"
                                ]
                            ]
                        },
                        "PropagateAtLaunch": "true"
                    },
                    {
                        "Key": "DomainMeta",
                        "Value": {
                            "Fn::Join": [
                                ":",
                                [
                                    {
                                        "Ref": "EcsClusterHostedZoneId"
                                    },
                                    {
                                        "Ref": "EcsClusterHostedZoneName"
                                    },
                                    {
                                        "Ref": "EcsClusterHostedZoneInstanceName"
                                    }
                                ]
                            ]
                        },
                        "PropagateAtLaunch": "true"
                    }
                ],
                "LoadBalancerNames": [
                    {
                        "Ref": "EcsLb"
                    }
                ]
            }
        },
        "EcsLb": {
            "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
            "Properties": {
                "Subnets": {
                    "Ref": "SubnetIds"
                },
                "SecurityGroups": [
                    {
                        "Ref": "LbSecurityGroup"
                    }
                ],
                "Instances": [],
                "Listeners": [
                    {
                        "LoadBalancerPort": "80",
                        "InstancePort": {
                            "Ref": "EcsPort"
                        },
                        "Protocol": "HTTP"
                    },
                    {
                        "LoadBalancerPort": "443",
                        "InstancePort": {
                            "Ref": "EcsPort"
                        },
                        "Protocol": "HTTPS",
                        "SSLCertificateId": {
                            "Ref": "SslCertArn"
                        }
                    }
                ],
                "HealthCheck": {
                    "Target": {
                        "Fn::Join": [
                            "",
                            [
                                "HTTP:",
                                {
                                    "Ref": "EcsPort"
                                },
                                {
                                    "Ref": "EcsHealthCheckEndpoint"
                                }
                            ]
                        ]
                    },
                    "HealthyThreshold": "2",
                    "UnhealthyThreshold": "2",
                    "Interval": "20",
                    "Timeout": "5"
                },
                "Tags": [
                    {
                        "Key": "Name",
                        "Value": {
                            "Fn::Join": [
                                "",
                                [
                                    {
                                        "Ref": "EcsClusterName"
                                    },
                                    "-lb"
                                ]
                            ]
                        }
                    }
                ]
            }
        }
    },
    "Outputs": {
        "EcsAutoScalingGroupName": {
            "Description": "AutoScaling Group Name which will manage creation of new ECS Instances",
            "Value": {
                "Ref": "EcsInstanceAsg"
            }
        },
        "EcsLaunchConfiguration": {
            "Description": "Launch Configuration the AutoScalingGroup will use when creating new ECS Instances",
            "Value": {
                "Ref": "EcsInstanceLc"
            }
        }
    }
}
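
The Fn::Join calls in the template are plain string concatenation, which is easy to lose track of in all that nesting. To make it concrete, here is a small sketch that prints what the UserData and the health check target should render to for given parameter values. The two function names are mine and exist only for illustration.

```shell
#!/bin/sh
# Sketch: what the template's Fn::Join expressions evaluate to.

# UserData: joined with "" (no separator), so spaces and newlines are explicit
render_user_data() {
    s3_path="$1"; cluster="$2"; volume_gb="$3"
    printf '#!/bin/bash\n'
    printf 'yum install -y aws-cli\n'
    printf 'aws s3 cp %s /tmp/ecs-init.sh\n' "$s3_path"
    printf 'chmod +x /tmp/ecs-init.sh\n'
    printf '/tmp/ecs-init.sh %s %s\n' "$cluster" "$volume_gb"
}

# HealthCheck Target: "HTTP:" + EcsPort + EcsHealthCheckEndpoint
render_health_target() {
    printf 'HTTP:%s%s\n' "$1" "$2"
}
```

With the default parameters, render_health_target 80 /healthcheck/endpoint/url prints HTTP:80/healthcheck/endpoint/url, which is the format the classic load balancer expects for an HTTP health check.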

Fill in all the parameters and then create the stack with the following AWS CLI command, assuming you have the AWS CLI installed and configured with a user that has CloudFormation permissions =)

aws cloudformation create-stack --stack-name example-ecs-stack \
  --template-body file:///Users/yourname/path/to/cloudformation/ecs_example_stack-cluster.cloudformation.json \
  --tags Key=stack,Value=example-ecs Key=vpc,Value=example-vpc
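
It can also help to validate the template first, override individual parameter defaults on the command line instead of editing the file, and wait for the stack to finish. Here is a hedged sketch of that flow. The function name deploy_stack and the AWS override variable are assumptions of mine, and the simple Key=Value handling does not cope with values containing commas or spaces (such as the subnet list).

```shell
#!/bin/sh
# Sketch: validate, create with parameter overrides, wait, then print outputs.
# AWS can be overridden (e.g. AWS=echo) to print the commands instead of running them.
AWS="${AWS:-aws}"

deploy_stack() {
    stack="$1"       # e.g. example-ecs-stack
    template="$2"    # absolute path to the template JSON
    shift 2          # remaining args: Key=Value parameter overrides

    # catch JSON and template syntax errors before creating anything
    "$AWS" cloudformation validate-template \
        --template-body "file://$template" || return 1

    # turn Key=Value pairs into the CLI's ParameterKey/ParameterValue form
    params=""
    for kv in "$@"; do
        params="$params ParameterKey=${kv%%=*},ParameterValue=${kv#*=}"
    done

    # $params is intentionally unquoted so each pair becomes its own argument
    "$AWS" cloudformation create-stack --stack-name "$stack" \
        --template-body "file://$template" ${params:+--parameters $params} || return 1

    "$AWS" cloudformation wait stack-create-complete --stack-name "$stack" || return 1
    "$AWS" cloudformation describe-stacks --stack-name "$stack" \
        --query 'Stacks[0].Outputs'
}
```

For example: deploy_stack example-ecs-stack /Users/yourname/path/to/cloudformation/ecs_example_stack-cluster.cloudformation.json EcsInstanceType=t2.small AsgMaxSize=5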

You’ll note that there are DomainMeta tags on the EC2 instances. In our system, these tags are used by a Lambda function which subscribes to the SNS Topic for autoscaling life cycle events and automatically sets Route53 DNS entries for ECS cluster instances. Very handy; I’ll put up that Lambda function in another post.
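
The DomainMeta tag value is just the three Route53 parameters from the template joined with colons. This is not the actual Lambda code, only a rough sketch of how a consumer could split the tag apart. The function name parse_domain_meta and the idea that the record name is the instance name followed by the zone name are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: split a DomainMeta tag value into its three colon-separated parts.
parse_domain_meta() {
    tag_value="$1"   # e.g. ABCDEFGHIJKLM:myvpc.internal:ecs-example

    zone_id="${tag_value%%:*}"        # before the first colon
    rest="${tag_value#*:}"
    zone_name="${rest%%:*}"           # between the two colons
    instance_name="${rest#*:}"        # after the second colon

    # assumption: the DNS record is <instance name>.<zone name>
    echo "zone_id=$zone_id fqdn=$instance_name.$zone_name"
}
```

With the template defaults, parse_domain_meta ABCDEFGHIJKLM:myvpc.internal:ecs-example prints zone_id=ABCDEFGHIJKLM fqdn=ecs-example.myvpc.internal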
