AWS Auto Scaling
AWS Auto Scaling provides a simple, powerful user interface that lets AWS customers build scaling plans for resources including Amazon EC2 instances and Spot Fleets, Amazon ECS tasks, Amazon DynamoDB tables and indexes, and Amazon Aurora Replicas. The AWS Auto Scaling console is a single place to use the automatic scaling features of multiple AWS services. With a scaling plan, customers configure and manage scaling for all of their resources. This lets them add the computing power needed to handle the load on the application and remove it when it is no longer required. A scaling plan scales the application’s resources automatically in two ways: dynamic scaling and predictive scaling.
- Dynamic scaling creates target tracking scaling policies for the scalable resources in your application. This lets your scaling plan add and remove capacity for each resource as required to maintain resource utilization at the specified target value.
- Predictive Scaling looks at historic traffic patterns and forecasts them into the future to schedule changes in the number of EC2 instances at the appropriate times going forward.
- Predictive Scaling uses machine learning models to forecast daily and weekly patterns.
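The idea behind target tracking can be illustrated with a simplified proportional model. The sketch below is not the algorithm AWS uses; it assumes that average utilization falls in proportion to added capacity, and the function name and parameters are hypothetical.

```python
import math


def target_tracking_capacity(current_capacity, metric_value, target_value,
                             min_capacity, max_capacity):
    """Estimate the capacity that brings an average utilization metric back to target.

    Simplified assumption: utilization is inversely proportional to capacity.
    """
    if current_capacity <= 0 or target_value <= 0:
        raise ValueError("current_capacity and target_value must be positive")
    # Scale capacity by the ratio of observed utilization to the target.
    desired = math.ceil(current_capacity * metric_value / target_value)
    # Never leave the configured capacity limits.
    return max(min_capacity, min(max_capacity, desired))


# 4 instances at 75% CPU with a 50% target: 4 * 75 / 50 = 6 instances.
print(target_tracking_capacity(4, 75.0, 50.0, 2, 10))
# 4 instances at 20% CPU: 1.6 rounds up to 2, which is also the minimum.
print(target_tracking_capacity(4, 20.0, 50.0, 2, 10))
```

Rounding up rather than down is a deliberate choice in this sketch: it errs toward keeping utilization at or below the target.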
Auto Scaling Features
AWS Auto Scaling continually calculates the appropriate scaling adjustments and immediately adds and removes capacity as needed to keep the metrics on target. AWS target tracking scaling policies are self-optimizing: they learn the customer’s actual load patterns to minimize fluctuations in resource capacity.
- AWS Auto Scaling allows customers to build scaling plans that automate how groups of different resources respond to changes in demand.
- AWS Auto Scaling automatically creates all of the scaling policies and sets targets for customers based on their preference.
- AWS Auto Scaling monitors customers’ applications and automatically adds or removes capacity from their resource groups in real time as demand changes.
Predictive Scaling predicts future traffic, including regularly occurring spikes, and provisions the right number of EC2 instances in advance of predicted changes. Predictive Scaling’s machine learning algorithms detect changes in daily and weekly patterns and automatically adjust their forecasts. Auto Scaling enhanced with Predictive Scaling delivers faster, simpler, and more accurate capacity provisioning, resulting in lower cost and more responsive applications.
- Load forecasting: AWS Auto Scaling analyzes up to 14 days of history for a specified load metric and forecasts the future demand for the next two days.
- Scheduled scaling actions: AWS Auto Scaling schedules the scaling actions that proactively add and remove resource capacity to reflect the load forecast. At the scheduled time, AWS Auto Scaling updates the resource’s minimum capacity with the value specified by the scheduled scaling action.
- Maximum capacity behavior: Each resource has a minimum and a maximum capacity limit, and the value specified by a scheduled scaling action is expected to lie between them.
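The three steps above can be sketched in a few lines. This is only an illustration: the forecast below is a plain per-hour average over past days, standing in for the machine learning models Predictive Scaling actually uses, and `load_per_unit` (the load one unit of capacity can serve) is a hypothetical parameter. The sketch also assumes the forecast capacity is capped at the maximum limit.

```python
import math
from statistics import mean


def forecast_hourly_load(history, hours_per_day=24):
    """Seasonal-naive forecast: average each hour of the day across all history days."""
    if not history or len(history) % hours_per_day != 0:
        raise ValueError("history must cover a whole number of days")
    days = len(history) // hours_per_day
    return [
        mean(history[d * hours_per_day + h] for d in range(days))
        for h in range(hours_per_day)
    ]


def scheduled_min_capacities(forecast, load_per_unit, min_capacity, max_capacity):
    """Convert forecast load into one minimum-capacity value per hour, within limits."""
    return [
        max(min_capacity, min(max_capacity, math.ceil(load / load_per_unit)))
        for load in forecast
    ]


# Two "days" of four periods each, to keep the example short.
history = [100, 400, 900, 200,
           140, 360, 1100, 200]
forecast = forecast_hourly_load(history, hours_per_day=4)
print(scheduled_min_capacities(forecast, load_per_unit=100,
                               min_capacity=2, max_capacity=8))
```

In the third period the forecast calls for 10 units, but the sketch schedules 8 because of the maximum capacity limit.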
AWS Auto Scaling automatically creates target tracking scaling policies for all of the resources in the scaling plan, using the customer-selected scaling strategy to set the target values for each metric.
- AWS Auto Scaling also creates and manages the Amazon CloudWatch alarms that trigger scaling adjustments for each of the resources.
- AWS Auto Scaling continually monitors customers’ applications to make sure that they are operating at the desired performance levels. When demand spikes, AWS Auto Scaling automatically increases the capacity of constrained resources.
Using AWS Auto Scaling, customers can select one of three predefined optimization strategies designed to optimize performance, optimize costs, or balance the two. AWS Auto Scaling is a good fit when the application’s load follows one of these patterns:
- The application has cyclical traffic, such as high use of resources during regular business hours and low use of resources overnight.
- The application has on-and-off workload patterns, such as batch processing, testing, or periodic analysis.
- The application has variable traffic patterns, such as marketing campaigns with periods of spiky growth.
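To show how an optimization strategy turns into a concrete target value, the sketch below builds one scaling instruction for an EC2 Auto Scaling group. The target percentages are illustrative assumptions, not values taken from this article, and the field names follow my understanding of the `ScalingInstructions` structure accepted by the AWS Auto Scaling plans API; check them against the API reference before use.

```python
# Illustrative CPU utilization targets per strategy (assumed values).
STRATEGY_TARGETS = {
    "optimize_for_availability": 40.0,
    "balance": 50.0,
    "optimize_for_cost": 70.0,
}


def build_scaling_instruction(asg_name, strategy, min_capacity, max_capacity):
    """Build one scaling instruction dict for an Auto Scaling group (hypothetical helper)."""
    if strategy not in STRATEGY_TARGETS:
        raise ValueError(f"unknown strategy: {strategy}")
    return {
        "ServiceNamespace": "autoscaling",
        "ResourceId": f"autoScalingGroup/{asg_name}",
        "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
        "TargetTrackingConfigurations": [
            {
                "PredefinedScalingMetricSpecification": {
                    "PredefinedScalingMetricType": "ASGAverageCPUUtilization"
                },
                "TargetValue": STRATEGY_TARGETS[strategy],
            }
        ],
    }


# "my-web-asg" is a placeholder group name.
instruction = build_scaling_instruction("my-web-asg", "optimize_for_cost", 2, 10)
print(instruction["TargetTrackingConfigurations"][0]["TargetValue"])
```

A lower target keeps more headroom for spikes at higher cost; a higher target runs resources hotter and cheaper.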
AWS Auto Scaling scans customers’ environments and automatically discovers the scalable cloud resources underlying their applications. Using AWS Auto Scaling, customers can set target utilization levels for multiple resources in a single, intuitive interface.
- Customers can quickly see the average utilization of all of their scalable resources without having to navigate to other consoles.
- For applications that use services such as Amazon EC2 and Amazon DynamoDB, AWS Auto Scaling manages resource provisioning for all of the EC2 Auto Scaling groups and database tables in the customer’s application.
EC2 AUTO SCALING
Amazon EC2 Auto Scaling lets customers launch or terminate EC2 instances in an Auto Scaling group automatically.
- Amazon EC2 Auto Scaling scales out the group (adds more instances) to deal with high demand at peak times, and scales in the group (runs fewer instances) to reduce costs during periods of low utilization.
- A scaling policy instructs Amazon EC2 Auto Scaling to track a specific CloudWatch metric, and it defines what action to take when the associated CloudWatch alarm is in the ALARM state.
- The metrics that are used to trigger an alarm are an aggregation of metrics coming from all of the instances in the Auto Scaling group.
Amazon EC2 Auto Scaling supports the following types of scaling policies:
- Target tracking scaling—Increase or decrease the current capacity of the group based on a target value for a specific metric.
- Step scaling—Increase or decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
- Simple scaling—Increase or decrease the current capacity of the group based on a single scaling adjustment.
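The difference between step scaling and simple scaling is easiest to see in code. The sketch below models only the scale-out side, assuming each step covers a range of breach sizes above the alarm threshold with an inclusive lower bound and an exclusive upper bound; the function and tuple layout are hypothetical.

```python
def pick_step_adjustment(metric_value, threshold, steps):
    """Return the capacity adjustment for the step that matches the alarm breach.

    steps: list of (lower_bound, upper_bound, adjustment), with bounds expressed
    relative to the threshold; None means unbounded on that side.
    """
    breach = metric_value - threshold
    for lower, upper, adjustment in steps:
        above_lower = lower is None or breach >= lower
        below_upper = upper is None or breach < upper
        if above_lower and below_upper:
            return adjustment
    return 0  # No step matches, so capacity stays the same.


# Step scaling: a larger breach triggers a larger adjustment.
step_policy = [(0, 10, 1), (10, 20, 2), (20, None, 4)]
for cpu in (50, 65, 75, 95):
    print(cpu, pick_step_adjustment(cpu, threshold=60, steps=step_policy))

# Simple scaling behaves like a single step covering every breach.
simple_policy = [(0, None, 1)]
print(pick_step_adjustment(95, threshold=60, steps=simple_policy))
```

With the step policy, a CPU reading of 95 against a threshold of 60 adds four instances, while the simple policy adds one regardless of how far the metric exceeds the threshold.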
ECS AUTO SCALING
Automatic scaling is the ability to increase or decrease the desired count of tasks in the customer’s Amazon ECS service automatically. Amazon ECS leverages the Application Auto Scaling service to provide this functionality. Amazon ECS publishes CloudWatch metrics with the service’s average CPU and memory usage, so customers can use these and other CloudWatch metrics to scale out the service to deal with high demand at peak times, and to scale in the service to reduce costs during periods of low utilization. Amazon ECS Service Auto Scaling supports the following types of automatic scaling:
- Target tracking scaling policies
- Step scaling policies
- Scheduled scaling
The Application Auto Scaling service needs permission to describe the customer’s Amazon ECS services and CloudWatch alarms, and to modify the service’s desired count on the customer’s behalf. Service Auto Scaling is a combination of the Amazon ECS, CloudWatch, and Application Auto Scaling APIs.
- Services are created and updated with Amazon ECS,
- Alarms are created with CloudWatch, and
- Scaling policies are created with Application Auto Scaling.
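As a rough sketch of the Application Auto Scaling side, the helper below builds the two request payloads needed for a target tracking policy on an ECS service: one to register the service as a scalable target and one to attach the policy. The helper, the policy name, and the default cooldown values are hypothetical. The payloads are only built and printed here; with boto3 installed and credentials configured, they could be passed to `register_scalable_target` and `put_scaling_policy` on the `application-autoscaling` client.

```python
def ecs_target_tracking_requests(cluster, service, min_tasks, max_tasks, cpu_target,
                                 scale_in_cooldown=300, scale_out_cooldown=60):
    """Build request payloads for scaling an ECS service on average CPU (hypothetical helper)."""
    common = {
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
    }
    # Payload that registers the service's desired count as a scalable target.
    register = {**common, "MinCapacity": min_tasks, "MaxCapacity": max_tasks}
    # Payload that attaches a target tracking policy to that scalable target.
    policy = {
        **common,
        "PolicyName": f"{service}-cpu-target-tracking",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": cpu_target,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
            },
            "ScaleInCooldown": scale_in_cooldown,
            "ScaleOutCooldown": scale_out_cooldown,
        },
    }
    return register, policy


# "prod-cluster" and "web" are placeholder names.
register, policy = ecs_target_tracking_requests("prod-cluster", "web", 2, 20, 60.0)
print(register["ResourceId"])
print(policy["PolicyName"])
```

With a target tracking policy, Application Auto Scaling creates and manages the CloudWatch alarms itself, so no separate alarm request appears in this sketch.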
AURORA AUTO SCALING
Aurora Auto Scaling dynamically adjusts the number of Aurora Replicas provisioned for an Aurora DB cluster using single-master replication. Aurora Auto Scaling is available for both Aurora MySQL and Aurora PostgreSQL. Aurora Auto Scaling enables the customer’s Aurora DB cluster to handle sudden increases in connectivity or workload.
- When the connectivity or workload decreases, Aurora Auto Scaling removes unnecessary Aurora Replicas.
- The scaling policy defines the minimum and maximum number of Aurora Replicas that Aurora Auto Scaling can manage.
- Customers define a scaling policy and apply it to an Aurora DB cluster.
Aurora Auto Scaling uses a scaling policy to adjust the number of Aurora Replicas in an Aurora DB cluster. Aurora Auto Scaling has the following components:
- A service-linked role
- Target metric: A target metric is a predefined or custom metric; the metric and its target value are specified in a target-tracking scaling policy configuration.
- Minimum and maximum capacity: Customers can specify the minimum and maximum number of Aurora Replicas (0 to 15) to be managed by Application Auto Scaling.
- A cooldown period: A cooldown period blocks subsequent scale-in or scale-out requests until the period expires. These blocks slow the deletion of Aurora Replicas in the Aurora DB cluster for scale-in requests and the creation of Aurora Replicas for scale-out requests.
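The interaction between capacity limits and cooldown periods can be illustrated with a small simulation. This is a simplified model, not the service's actual behavior: it tracks one cooldown timer per direction and ignores the time a replica takes to become available. All names are hypothetical.

```python
def apply_scaling_requests(requests, start_replicas, min_replicas, max_replicas,
                           scale_in_cooldown, scale_out_cooldown):
    """Return (timestamp, replica_count) for each request that actually changes capacity.

    requests: list of (timestamp_seconds, delta) in time order; a positive delta
    is a scale-out request and a negative delta is a scale-in request.
    """
    if not 0 <= min_replicas <= max_replicas <= 15:
        raise ValueError("replica limits must satisfy 0 <= min <= max <= 15")
    replicas = start_replicas
    last_scale_in = last_scale_out = None
    applied = []
    for timestamp, delta in requests:
        if delta == 0:
            continue
        scaling_out = delta > 0
        last = last_scale_out if scaling_out else last_scale_in
        cooldown = scale_out_cooldown if scaling_out else scale_in_cooldown
        # A request arriving inside the cooldown window is blocked.
        if last is not None and timestamp - last < cooldown:
            continue
        new_count = max(min_replicas, min(max_replicas, replicas + delta))
        if new_count == replicas:
            continue  # Already at a capacity limit, nothing to change.
        replicas = new_count
        if scaling_out:
            last_scale_out = timestamp
        else:
            last_scale_in = timestamp
        applied.append((timestamp, replicas))
    return applied


requests = [(0, +1), (30, +1), (400, -1), (500, -1), (800, -1)]
print(apply_scaling_requests(requests, start_replicas=2, min_replicas=1,
                             max_replicas=5, scale_in_cooldown=300,
                             scale_out_cooldown=60))
```

The requests at 30 and 500 seconds are dropped because they fall inside the scale-out and scale-in cooldown windows respectively, which is how a cooldown slows replica creation and deletion.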