AWS Auto Scaling
AWS Auto Scaling provides a simple, powerful user interface that lets AWS customers build scaling plans for resources including Amazon EC2 instances and Spot Fleets, Amazon ECS tasks, Amazon DynamoDB tables and indexes, and Amazon Aurora Replicas. The AWS Auto Scaling console provides a single user interface for the automatic scaling features of multiple AWS services. Using the scaling plans of AWS Auto Scaling, customers can configure and manage scaling of their resources. A scaling plan uses dynamic and predictive scaling to automatically scale an application's resources, adding the required computing power to handle the load on the application and removing it when it is no longer required. There are two ways to auto scale: dynamic scaling and predictive scaling.
- Dynamic scaling creates target tracking scaling policies for the scalable resources in your application. This lets your scaling plan add and remove capacity for each resource as required to maintain resource utilization at the specified target value.
- Predictive Scaling looks at historic traffic patterns and forecasts them into the future to schedule changes in the number of EC2 instances at the appropriate times going forward.
- Predictive Scaling uses machine learning models to forecast daily and weekly patterns.
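As a sketch of dynamic scaling, a target tracking policy for an EC2 Auto Scaling group can be expressed as a `put_scaling_policy` request. The group name and target value below are hypothetical examples, and the actual boto3 call is shown commented out.

```python
# Sketch of a target tracking scaling policy for an EC2 Auto Scaling group.
# The group name and target CPU value are hypothetical.

def target_tracking_policy(group_name, target_cpu):
    """Build the request parameters for the autoscaling put_scaling_policy call."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,  # keep average CPU near this percentage
        },
    }

policy = target_tracking_policy("web-asg", 50.0)
# With AWS credentials configured, the policy would be applied with:
# import boto3
# boto3.client("autoscaling").put_scaling_policy(**policy)
```

The target value works like a thermostat: Auto Scaling adds or removes instances so the metric stays near the target, with no alarms or thresholds to manage by hand.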
Auto Scaling Features
AWS Auto Scaling continually calculates the appropriate scaling adjustments and immediately adds and removes capacity as needed to keep the metrics on target. AWS target tracking scaling policies are self-optimizing: they learn customers' actual load patterns to minimize fluctuations in resource capacity.
- AWS Auto Scaling allows customers to build scaling plans that automate how groups of different resources respond to changes in demand.
- AWS Auto Scaling automatically creates all of the scaling policies and sets targets for customers based on their preference.
- AWS Auto Scaling monitors customers' applications and automatically adds or removes capacity from their resource groups in real time as demands change.
Predictive Scaling predicts future traffic, including regularly-occurring spikes, and provisions the right number of EC2 instances in advance of predicted changes. Predictive Scaling’s machine learning algorithms detect changes in daily and weekly patterns, automatically adjusting their forecasts. Auto Scaling enhanced with Predictive Scaling delivers faster, simpler, and more accurate capacity provisioning resulting in lower cost and more responsive applications.
- Load forecasting: AWS Auto Scaling analyzes up to 14 days of history for a specified load metric and forecasts the future demand for the next two days.
- Scheduled scaling actions: AWS Auto Scaling schedules the scaling actions that proactively add and remove resource capacity to reflect the load forecast. At the scheduled time, AWS Auto Scaling updates the resource’s minimum capacity with the value specified by the scheduled scaling action.
- Maximum capacity behavior: Each resource has a minimum and a maximum capacity limit, and the value specified by the scheduled scaling action is expected to lie between them.
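The load forecasting, scheduled scaling, and capacity limits above come together in a scaling plan. As a sketch (plan name, tag filter, and capacities are hypothetical), a scaling instruction enabling predictive scaling for an Auto Scaling group could be built for the `autoscaling-plans` `create_scaling_plan` call:

```python
# Sketch of a scaling plan instruction with predictive scaling enabled.
# The plan name, tag filter, resource ID, and capacities are hypothetical.

def scaling_plan_request(asg_name):
    """Build the request parameters for autoscaling-plans create_scaling_plan."""
    return {
        "ScalingPlanName": "demo-plan",
        # Resources are discovered via tags on the application's resources.
        "ApplicationSource": {"TagFilters": [{"Key": "app", "Values": ["demo"]}]},
        "ScalingInstructions": [{
            "ServiceNamespace": "autoscaling",
            "ResourceId": f"autoScalingGroup/{asg_name}",
            "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
            "MinCapacity": 1,
            "MaxCapacity": 10,   # scheduled actions stay within these limits
            "TargetTrackingConfigurations": [{
                "PredefinedScalingMetricSpecification": {
                    "PredefinedScalingMetricType": "ASGAverageCPUUtilization"
                },
                "TargetValue": 50.0,
            }],
            # Forecast load AND schedule minimum-capacity changes in advance.
            "PredictiveScalingMode": "ForecastAndScale",
        }],
    }

request = scaling_plan_request("web-asg")
# boto3.client("autoscaling-plans").create_scaling_plan(**request)
```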
AWS Auto Scaling automatically creates target tracking scaling policies for all of the resources in the scaling plan, using the customer selected scaling strategy to set the target values for each metric.
- AWS Auto Scaling also creates and manages the Amazon CloudWatch alarms that trigger scaling adjustments for each of the resources.
- AWS Auto Scaling continually monitors customers' applications to make sure that they are operating at the desired performance levels. When demand spikes, AWS Auto Scaling automatically increases the capacity of constrained resources.
Using AWS Auto Scaling, AWS customers can select one of three predefined optimization strategies, designed to optimize performance, optimize costs, or balance the two. Scaling plans suit applications with the following traffic patterns:
- Cyclical traffic, such as high use of resources during regular business hours and low use of resources overnight.
- On-and-off workload patterns, such as batch processing, testing, or periodic analysis.
- Variable traffic patterns, such as marketing campaigns with periods of spiky growth.
AWS Auto Scaling scans customers environments and automatically discovers the scalable cloud resources underlying their application. Using AWS Auto Scaling, customers can set target utilization levels for multiple resources in a single, intuitive interface.
- Customers can quickly see the average utilization of all of their scalable resources without having to navigate to other consoles.
- For services such as Amazon EC2 and Amazon DynamoDB, AWS Auto Scaling manages resource provisioning for all of the EC2 Auto Scaling groups and database tables in the customer's application.
Predictive Scaling is an AWS Auto Scaling feature that looks at historic traffic patterns and forecasts them into the future to schedule changes in the number of EC2 instances at the appropriate times going forward. Predictive Scaling uses machine learning models to forecast daily and weekly patterns, and it works in conjunction with target tracking to make EC2 capacity changes more responsive to incoming application traffic.
- Auto Scaling enhanced with Predictive Scaling delivers faster, simpler, and more accurate capacity provisioning resulting in lower cost and more responsive applications.
- By predicting traffic changes, Predictive Scaling provisions EC2 instances in advance of changing traffic, making Auto Scaling faster and more accurate.
- While Predictive Scaling sets the minimum capacity for the customer's application based on forecasted traffic, target tracking changes the actual capacity based on the actual traffic at the moment.
- Target tracking works to track the desired capacity utilization levels over varying traffic conditions and addresses unpredicted traffic spikes and other fluctuations.
Predictive Scaling and target tracking are configured together by a user to generate a scaling plan. A scaling plan is a collection of scaling instructions for multiple AWS resources. Customers can configure a scaling plan by first selecting all the EC2 resources underlying their application in AWS Auto Scaling.
- The resource utilization metric and the incoming traffic metric are the key parameters for the scaling plan.
- The incoming traffic metric is used by Predictive Scaling to generate traffic forecasts. Based on these forecasts, Predictive Scaling then schedules future scaling actions to configure minimum capacity.
A launch configuration is an instance configuration template that an Auto Scaling group uses to launch EC2 instances. When AWS customers create a launch configuration, they specify information for the instances, including the ID of the Amazon Machine Image (AMI), the instance type, a key pair, one or more security groups, and a block device mapping. When creating an Auto Scaling group, a launch configuration, a launch template, or an EC2 instance must be specified.
- During the creation of an Auto Scaling group using an EC2 instance, Amazon EC2 Auto Scaling automatically creates a launch configuration for them and associates it with the Auto Scaling group.
A launch template is similar to a launch configuration in that it specifies instance configuration information, including the ID of the Amazon Machine Image (AMI), the instance type, a key pair, security groups, and other parameters.
- Defining a launch template instead of a launch configuration allows you to have multiple versions of a template.
- With launch templates, customers are able to provision capacity across multiple instance types using both On-Demand Instances and Spot Instances to achieve the desired scale, performance, and cost.
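As a sketch, the configuration information a launch template carries can be expressed as a `create_launch_template` request; the template name, AMI ID, key pair, and security group ID below are placeholders.

```python
# Sketch of a launch template; all identifiers are placeholder values.

def launch_template_request():
    """Build the request parameters for the ec2 create_launch_template call."""
    return {
        "LaunchTemplateName": "web-template",
        "LaunchTemplateData": {
            "ImageId": "ami-0123456789abcdef0",    # placeholder AMI ID
            "InstanceType": "t3.micro",
            "KeyName": "my-key-pair",              # placeholder key pair
            "SecurityGroupIds": ["sg-0123456789abcdef0"],
        },
    }

request = launch_template_request()
# boto3.client("ec2").create_launch_template(**request)
```

Because templates are versioned, later changes (for example, a new AMI ID) become new versions rather than replacements, which is the main advantage over a launch configuration.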
An Auto Scaling group contains a collection of Amazon EC2 instances that are treated as a logical grouping for the purposes of automatic scaling and management. An Auto Scaling group also enables customers to use Amazon EC2 Auto Scaling features such as health check replacements and scaling policies. Both maintaining the number of instances in an Auto Scaling group and automatic scaling are the core functionality of the Amazon EC2 Auto Scaling service.
- The Auto Scaling group continues to maintain a fixed number of instances even if an instance becomes unhealthy. If an instance becomes unhealthy, the group terminates the unhealthy instance and launches another instance to replace it.
An Auto Scaling group can launch On-Demand Instances, Spot Instances, or both. Spot Instances provide customers with access to unused Amazon EC2 capacity at steep discounts relative to On-Demand prices. For more information, see Amazon EC2 Spot Instances. There are key differences between Spot Instances and On-Demand Instances:
- The price for Spot Instances varies based on demand.
- Amazon EC2 can terminate an individual Spot Instance as the availability of, or price for, Spot Instances changes.
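A group that mixes On-Demand and Spot capacity can be described with a mixed instances policy. The sketch below builds the request for `create_auto_scaling_group`; the group name, subnets, template name, instance types, and percentages are hypothetical.

```python
# Sketch of an Auto Scaling group combining On-Demand and Spot Instances.
# Group name, subnets, launch template, and distribution values are illustrative.

def mixed_asg_request():
    """Build the request parameters for autoscaling create_auto_scaling_group."""
    return {
        "AutoScalingGroupName": "web-asg",
        "MinSize": 2,
        "MaxSize": 10,
        "VPCZoneIdentifier": "subnet-aaaa,subnet-bbbb",  # placeholder subnets
        "MixedInstancesPolicy": {
            "LaunchTemplate": {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": "web-template",
                    "Version": "$Latest",
                },
                # Several instance types improve the odds of Spot capacity.
                "Overrides": [{"InstanceType": "t3.micro"},
                              {"InstanceType": "t3a.micro"}],
            },
            "InstancesDistribution": {
                "OnDemandBaseCapacity": 1,                 # always On-Demand
                "OnDemandPercentageAboveBaseCapacity": 50, # rest split with Spot
                "SpotAllocationStrategy": "capacity-optimized",
            },
        },
    }

request = mixed_asg_request()
# boto3.client("autoscaling").create_auto_scaling_group(**request)
```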
Auto Scaling Resources
AWS customers have multiple options for scaling resources. To configure automatic scaling for multiple resources across multiple services, they use AWS Auto Scaling to create a scaling plan for the resources underlying their application. AWS Auto Scaling is also used to configure predictive scaling for EC2 resources.
- Amazon EC2 Auto Scaling helps AWS clients ensure that they have the correct number of Amazon EC2 instances available to handle the load for their application.
- In addition, Application Auto Scaling can scale Amazon ECS services, Amazon EC2 Spot Fleets, Amazon EMR clusters, Amazon AppStream 2.0 fleets, provisioned read and write capacity for Amazon DynamoDB tables and global secondary indexes, Amazon Aurora Replicas, and Amazon SageMaker endpoint variants.
EC2 Spot Fleet requests
Amazon EC2 Spot Fleet requests: launch or terminate instances from a Spot Fleet request, or automatically replace instances that get interrupted for price or capacity reasons. Automatic scaling is the ability to increase or decrease the target capacity of a customer's Spot Fleet automatically based on demand. A Spot Fleet can either launch instances (scale out) or terminate instances (scale in), within the range that was specified, in response to one or more scaling policies. Spot Fleet supports the following types of automatic scaling:
- Target tracking scaling: increase or decrease the current capacity of the fleet based on a target value for a specific metric. This is similar to the way a thermostat maintains the temperature of your home: you select the temperature and the thermostat does the rest.
- Step scaling: increase or decrease the current capacity of the fleet based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
- Scheduled scaling: increase or decrease the current capacity of the fleet based on the date and time.
The scaling policies created for a Spot Fleet support a cooldown period, which is the number of seconds after a scaling activity completes during which previous trigger-related scaling activities can influence future scaling events.
- Scale based on instance metrics with a 1-minute frequency to ensure a faster response to utilization changes. Scaling on metrics with a 5-minute frequency can result in slower response times and scaling on stale metric data.
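Spot Fleet automatic scaling is configured through Application Auto Scaling: first register the fleet's target capacity as a scalable target, then attach a policy. In the sketch below, the fleet request ID and capacity limits are placeholders, and the boto3 calls are shown commented out.

```python
# Sketch of Spot Fleet target tracking via Application Auto Scaling.
# The fleet request ID and capacity bounds are hypothetical.

FLEET_ID = "sfr-11111111-2222-3333-4444-555555555555"  # placeholder

register = {
    "ServiceNamespace": "ec2",
    "ResourceId": f"spot-fleet-request/{FLEET_ID}",
    "ScalableDimension": "ec2:spot-fleet-request:TargetCapacity",
    "MinCapacity": 2,    # scale-in floor
    "MaxCapacity": 12,   # scale-out ceiling
}

policy = {
    "PolicyName": "fleet-cpu-target",
    "ServiceNamespace": "ec2",
    "ResourceId": f"spot-fleet-request/{FLEET_ID}",
    "ScalableDimension": "ec2:spot-fleet-request:TargetCapacity",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "EC2SpotFleetRequestAverageCPUUtilization"
        },
        "TargetValue": 60.0,  # keep average fleet CPU near 60%
    },
}
# client = boto3.client("application-autoscaling")
# client.register_scalable_target(**register)
# client.put_scaling_policy(**policy)
```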
DynamoDB Auto Scaling
Amazon DynamoDB auto scaling uses the AWS Application Auto Scaling service to dynamically adjust provisioned throughput capacity on customers' behalf, in response to actual traffic patterns. This enables a table or a global secondary index to increase its provisioned read and write capacity to handle sudden increases in traffic, without throttling. When the workload decreases, Application Auto Scaling decreases the throughput.
Enabling auto scaling on a DynamoDB table or a global secondary index lets it increase or decrease its provisioned read and write capacity to handle increases in traffic without throttling. With Application Auto Scaling, customers can create a scaling policy for a table or a global secondary index.
- The scaling policy contains a target utilization: the percentage of consumed provisioned throughput at a point in time. Application Auto Scaling uses a target tracking algorithm to adjust the provisioned throughput of the table (or index) upward or downward in response to actual workloads, so that the actual capacity utilization remains at or near the customer's target utilization.
- DynamoDB auto scaling also supports global secondary indexes. Every global secondary index has its own provisioned throughput capacity, separate from that of its base table.
- DynamoDB auto scaling modifies provisioned throughput settings only when the actual workload stays elevated (or depressed) for a sustained period of several minutes.
When AWS clients create a scaling policy, Application Auto Scaling creates a pair of Amazon CloudWatch alarms on their behalf. Each pair represents the upper and lower boundaries for the provisioned throughput settings. To enable DynamoDB auto scaling for the ProductCatalog table, clients need to create a scaling policy. This policy specifies:
- The table or global secondary index that the clients want to manage.
- Which capacity type to manage (read capacity or write capacity).
- The upper and lower boundaries for the provisioned throughput settings.
- The customer's target utilization.
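The four elements above map directly onto an Application Auto Scaling request pair for the ProductCatalog table. The boundaries and target utilization below are hypothetical, and the boto3 calls are shown commented out.

```python
# Sketch of DynamoDB auto scaling for the ProductCatalog table's read capacity.
# The capacity boundaries and target utilization are illustrative.

register = {
    "ServiceNamespace": "dynamodb",
    "ResourceId": "table/ProductCatalog",                  # the managed table
    "ScalableDimension": "dynamodb:table:ReadCapacityUnits",  # read capacity
    "MinCapacity": 5,      # lower boundary for provisioned throughput
    "MaxCapacity": 500,    # upper boundary
}

policy = {
    "PolicyName": "ProductCatalog-read-target",
    "ServiceNamespace": "dynamodb",
    "ResourceId": "table/ProductCatalog",
    "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
        },
        "TargetValue": 70.0,   # target utilization percentage
    },
}
# client = boto3.client("application-autoscaling")
# client.register_scalable_target(**register)
# client.put_scaling_policy(**policy)
```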
EC2 Auto Scaling
Amazon EC2 Auto Scaling groups enable customers to launch or terminate EC2 instances in an Auto Scaling group.
- Amazon EC2 Auto Scaling scales out the client's group (adds more instances) to deal with high demand at peak times, and scales in the group (runs fewer instances) to reduce costs during periods of low utilization.
- A scaling policy instructs Amazon EC2 Auto Scaling to track a specific CloudWatch metric, and it defines what action to take when the associated CloudWatch alarm is in ALARM.
- The metrics that are used to trigger an alarm are an aggregation of metrics coming from all of the instances in the Auto Scaling group.
Amazon EC2 Auto Scaling supports the following types of scaling policies:
- Target tracking scaling—Increase or decrease the current capacity of the group based on a target value for a specific metric.
- Step scaling—Increase or decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
- Simple scaling—Increase or decrease the current capacity of the group based on a single scaling adjustment.
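Of the three policy types, step scaling is worth a concrete sketch: each step adjustment covers a different range of alarm breach sizes. The group name and adjustment values below are hypothetical.

```python
# Sketch of a step scaling policy: the adjustment applied depends on how far
# the metric breaches the alarm threshold. Values are illustrative.

step_policy = {
    "AutoScalingGroupName": "web-asg",      # hypothetical group name
    "PolicyName": "web-asg-step-scale-out",
    "PolicyType": "StepScaling",
    "AdjustmentType": "ChangeInCapacity",
    "StepAdjustments": [
        # Breach between 0 and 10 above the threshold: add 1 instance.
        {"MetricIntervalLowerBound": 0.0,
         "MetricIntervalUpperBound": 10.0,
         "ScalingAdjustment": 1},
        # Breach 10 or more above the threshold: add 3 instances.
        {"MetricIntervalLowerBound": 10.0,
         "ScalingAdjustment": 3},
    ],
}
# boto3.client("autoscaling").put_scaling_policy(**step_policy)
```

The larger the breach, the larger the adjustment, which is the key difference from simple scaling's single fixed adjustment.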
ECS Auto Scaling
Automatic scaling is the ability to increase or decrease the desired count of tasks in a customer's Amazon ECS service automatically. Amazon ECS leverages the Application Auto Scaling service to provide this functionality. Amazon ECS publishes CloudWatch metrics with the service's average CPU and memory usage, so that customers can use these and other CloudWatch metrics to scale out the service to deal with high demand at peak times, and to scale in the service to reduce costs during periods of low utilization. Amazon ECS Service Auto Scaling supports:
- Target tracking scaling policies
- Step scaling policies
- Scheduled scaling
The Application Auto Scaling service needs permission to describe the Amazon ECS services and CloudWatch alarms, and to modify a service's desired count on the customer's behalf. Service Auto Scaling is a combination of the Amazon ECS, CloudWatch, and Application Auto Scaling APIs:
- Services are created and updated with Amazon ECS,
- Alarms are created with CloudWatch, and
- Scaling policies are created with Application Auto Scaling.
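Since Amazon ECS leverages Application Auto Scaling, the desired count of a service is scaled by registering it as a scalable target and attaching a policy. The cluster and service names below are placeholders, and the boto3 calls are shown commented out.

```python
# Sketch of Service Auto Scaling for an ECS service's desired task count.
# Cluster name, service name, and capacities are hypothetical.

register = {
    "ServiceNamespace": "ecs",
    "ResourceId": "service/demo-cluster/web-service",  # placeholder names
    "ScalableDimension": "ecs:service:DesiredCount",
    "MinCapacity": 1,
    "MaxCapacity": 10,
}

policy = {
    "PolicyName": "web-service-cpu-target",
    "ServiceNamespace": "ecs",
    "ResourceId": "service/demo-cluster/web-service",
    "ScalableDimension": "ecs:service:DesiredCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "PredefinedMetricSpecification": {
            # One of the CPU/memory metrics Amazon ECS publishes to CloudWatch.
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "TargetValue": 75.0,
    },
}
# client = boto3.client("application-autoscaling")
# client.register_scalable_target(**register)
# client.put_scaling_policy(**policy)
```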
Aurora Auto Scaling
Aurora Auto Scaling dynamically adjusts the number of Aurora Replicas provisioned for an Aurora DB cluster using single-master replication. Aurora Auto Scaling is available for both Aurora MySQL and Aurora PostgreSQL. Aurora Auto Scaling enables the customer Aurora DB cluster to handle sudden increases in connectivity or workload.
- When the connectivity or workload decreases, Aurora Auto Scaling removes unnecessary Aurora Replicas.
- The scaling policy defines the minimum and maximum number of Aurora Replicas that Aurora Auto Scaling can manage.
- Using this policy customers can define and apply a scaling policy to an Aurora DB cluster.
Aurora Auto Scaling uses a scaling policy to adjust the number of Aurora Replicas in an Aurora DB cluster. Aurora Auto Scaling has the following components:
- A service-linked role
- Target metric: a predefined or custom metric, and a target value for the metric, specified in a target-tracking scaling policy configuration.
- Minimum and maximum capacity: customers can specify the minimum and maximum number of Aurora Replicas (0–15) to be managed by Application Auto Scaling.
- Cooldown period: a cooldown period blocks subsequent scale-in or scale-out requests until the period expires, slowing the deletions of Aurora Replicas in the Aurora DB cluster.
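The components above (target metric, capacity limits, and cooldown periods) can be sketched as an Application Auto Scaling request pair for a cluster's replica count. The cluster name, target value, and cooldowns below are hypothetical, and the boto3 calls are shown commented out.

```python
# Sketch of Aurora Auto Scaling for an Aurora DB cluster's replica count.
# The cluster name, target value, and cooldown lengths are illustrative.

register = {
    "ServiceNamespace": "rds",
    "ResourceId": "cluster:demo-aurora-cluster",   # placeholder cluster name
    "ScalableDimension": "rds:cluster:ReadReplicaCount",
    "MinCapacity": 1,     # minimum number of Aurora Replicas
    "MaxCapacity": 15,    # maximum number of Aurora Replicas (up to 15)
}

policy = {
    "PolicyName": "aurora-cpu-target",
    "ServiceNamespace": "rds",
    "ResourceId": "cluster:demo-aurora-cluster",
    "ScalableDimension": "rds:cluster:ReadReplicaCount",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingScalingPolicyConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "RDSReaderAverageCPUUtilization"
        },
        "TargetValue": 60.0,       # target average reader CPU
        "ScaleInCooldown": 300,    # cooldown periods, in seconds
        "ScaleOutCooldown": 300,
    },
}
# client = boto3.client("application-autoscaling")
# client.register_scalable_target(**register)
# client.put_scaling_policy(**policy)
```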
AWS CloudFormation
AWS CloudFormation is an AWS service that gives developers and businesses an easy way to create a collection of related AWS and third-party resources and provision them in an orderly and predictable fashion. AWS CloudFormation enables customers to use programming languages or a simple text file to model and provision these resources in an automated and secure manner.
- Using AWS CloudFormation sample templates or by creating their own templates, AWS clients can describe the AWS resources, and any associated dependencies or runtime parameters, required to run any application.
- AWS CloudFormation automates and simplifies the task by creating groups of related resources, and interconnecting all these resources to power customers applications.
- AWS CloudFormation provisions customers application resources in a safe, repeatable manner, that enables them to build and rebuild their infrastructure and applications, without having to perform manual actions or write custom scripts.
- AWS CloudFormation allows customers to model their entire infrastructure and application resources with either a text file or programming languages. The AWS CloudFormation Registry and CLI enable customers to manage third-party resources with CloudFormation.
An AWS CloudFormation template describes all of a customer's resources and their properties. When a stack is created from a template, AWS CloudFormation provisions the described resources, such as an Auto Scaling group, load balancer, and database. Once the stack has been successfully created and the AWS resources are up and running, customers can delete the stack, which deletes all the resources in the stack.
- By using AWS CloudFormation, AWS customers can easily manage a collection of resources as a single unit.
- When provisioning the infrastructure in AWS CloudFormation, the AWS CloudFormation template describes exactly what resources are provisioned and their settings. Because these templates are text files, customers can simply track differences in their templates to track changes to their infrastructure, similar to the way developers control revisions to source code.
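Because templates are plain text, they can be generated programmatically and version-controlled like source code. As a sketch (the logical name and AMI ID are placeholders), a minimal template can be built as a Python dict and serialized to the JSON a stack would be created from:

```python
import json

# Sketch of a minimal CloudFormation template built in Python.
# The logical resource name and AMI ID are placeholder values.

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "A single EC2 instance (illustrative only)",
    "Resources": {
        "WebServer": {                          # logical name, not a physical ID
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": "ami-0123456789abcdef0",  # placeholder AMI
                "InstanceType": "t3.micro",
            },
        }
    },
}

# The serialized text is what gets tracked in version control and submitted
# to CloudFormation as the stack's template body.
body = json.dumps(template, indent=2)
```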
With the AWS Cloud Development Kit (AWS CDK), customers can define their application using TypeScript, Python, Java, and .NET. AWS CDK is an open source software development framework that helps customers model their cloud application resources using familiar programming languages, and then provision their infrastructure through AWS CloudFormation directly from their IDE.
- It provides high-level components that preconfigure cloud resources with proven defaults.
- AWS CDK provisions customers resources in a safe, repeatable manner through AWS CloudFormation.
- It enables customers to compose and share their own custom components that incorporate their organization’s requirements, helping them to start new projects faster.
AWS CloudFormation Designer (Designer) is a graphic tool for creating, viewing, and modifying AWS CloudFormation templates. With Designer, AWS customers can diagram their template resources using a drag-and-drop interface, and then edit their details using the integrated JSON and YAML editor.
- Whether they are new or experienced AWS CloudFormation users, AWS CloudFormation Designer can help customers quickly see the interrelationships between a template's resources and easily modify templates.
- Designer enables customers to see graphic representations of the resources in their template
- It simplifies template authoring, and template editing.
Continuous delivery is a release practice in which code changes are automatically built, tested, and prepared for release to production. With AWS CloudFormation and CodePipeline, customers can use continuous delivery to automatically build and test changes to their AWS CloudFormation templates before promoting them to production stacks.
- Continuous delivery lets developers automate testing beyond just unit tests so they can verify application updates across multiple dimensions before deploying to customers.
- These tests may include UI testing, load testing, integration testing, and API reliability testing, which enables developers to more thoroughly validate updates and pre-emptively discover issues.
- With more frequent and comprehensive testing, customers can discover and address bugs before they grow into larger problems.
When customers use AWS CloudFormation, they work with templates, which describe their AWS resources and their properties, and stacks, which provision the resources described in a template.
- The template is a JSON- or YAML-format text-based file that describes all the AWS resources customers need to deploy to run their application; the stack is the set of AWS resources that are created and managed as a single unit when AWS CloudFormation instantiates a template.
- Customers can use JSON or YAML to describe what AWS resources they want to create and configure. Using AWS CloudFormation Designer, customers can design visually and get started with AWS CloudFormation templates.
- An AWS CloudFormation template is a JSON- or YAML-formatted text file. Customers can save these files with any extension, such as .json, .yaml, .template, or .txt. AWS CloudFormation uses these templates as blueprints for building AWS resources.
A stack is a collection of related AWS resources that AWS CloudFormation manages as a single unit. Customers can create, update, and delete a collection of resources by creating, updating, and deleting stacks. All the resources in a stack are defined by the stack's AWS CloudFormation template.
- Once customers have created a template that describes resources such as an Auto Scaling group, an Elastic Load Balancing load balancer, and an Amazon Relational Database Service (Amazon RDS) database instance, they can create a stack by submitting that template, and AWS CloudFormation provisions all of those resources.
- AWS customers can work with stacks by using the AWS CloudFormation console, API, or AWS CLI.
- Using a change set (a summary of the customer's proposed changes), customers can preview how changes to a stack will affect running resources before implementing them.
- While creating an AWS CloudFormation stack from a template that declares parameters, the AWS Management Console automatically synthesizes and presents a form for AWS customers to enter parameter values.
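The stack lifecycle above can be sketched with boto3; the stack name, template body, and parameter are placeholders, and the actual calls are shown commented out.

```python
# Sketch of creating and deleting a stack as a single unit of resources.
# The stack name, template body, and parameter values are hypothetical.

create_request = {
    "StackName": "demo-stack",
    # Placeholder template body; a real template would declare resources
    # and the InstanceType parameter referenced below.
    "TemplateBody": '{"Resources": {"Topic": {"Type": "AWS::SNS::Topic"}}}',
}
# cfn = boto3.client("cloudformation")
# cfn.create_stack(**create_request)        # provisions every resource in order
# cfn.delete_stack(StackName="demo-stack")  # removes the whole unit at once
```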
Amazon ECS is integrated with AWS Cloud Map, which helps customers discover and connect their containerized services with each other. Cloud Map enables customers to define custom names for application resources, and it maintains the updated location of these dynamically changing resources.
- Service mesh makes it easy to build and run complex microservices applications by standardizing how every microservice in the application communicates.
- Amazon Elastic Container Service supports Docker networking and integrates with Amazon VPC to provide isolation for containers.
- Amazon ECS is integrated with Elastic Load Balancing, allowing customers to distribute traffic across their containers using Application Load Balancers or Network Load Balancers.
- Amazon ECS allows clients to specify an IAM role for each ECS task. This allows the Amazon ECS container instances to have a minimal role, following the principle of least privilege.
AWS CloudFormation templates provide several benefits:
Manage relationships: Templates concisely capture resource relationships, such as EC2 instances that must be associated with an Elastic Load Balancing load balancer, or the fact that an EBS volume must be in the same EC2 Availability Zone as the instance to which it is attached.
Use over and over: Using template parameters enable a single template to be used for many infrastructure deployments with different configuration values, such as how many instances to deploy for the application.
Get helpful feedback: Templates also provide output properties for communicating deployment results or configuration information back to the user. For example, when instantiated, a template may provide the URL of the Elastic Load Balancing endpoint the customer should use to connect to the newly instantiated application.
Avoid collisions: All AWS resources in a template are identified using logical names, allowing multiple stacks to be created from a template without fear of naming collisions between AWS resources.
Write and go: Use any method to launch a stack without having to register the template with AWS CloudFormation beforehand.
Visualize your stack: CloudFormation Designer allows customers to visualize their templates in a diagram. Customers can view the AWS resources and their relationships, and arrange the layout so that the diagram makes sense to them. They can edit the templates using the drag-and-drop interface and the integrated JSON editor.
Look up resources: AWS CloudFormation retains a copy of the stack template so you can use the AWS Management Console, the command line tools or the APIs to look up the precise resource configurations that were applied during stack creation.
Automate: You have the option to automate template generation using a programming language or a tool of your choice. You also have the option to automate stack creation from the templates using the CloudFormation API, AWS SDKs, or AWS CLI.
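The "use over and over" and "get helpful feedback" points can be illustrated with a template fragment that declares a parameter and an output. The names below are illustrative, and the Resources section is elided for brevity, so this is a fragment rather than a deployable template.

```python
# Sketch of a template fragment with a reusable parameter and an output.
# Parameter and output names are hypothetical; Resources are elided.

template = {
    "Parameters": {
        "InstanceCount": {          # a knob that varies per deployment
            "Type": "Number",
            "Default": 2,
        }
    },
    "Resources": {},                # elided; a real template declares
                                    # the LoadBalancer referenced below
    "Outputs": {
        "EndpointURL": {            # reported back after stack creation
            "Description": "DNS name of the load balancer endpoint",
            "Value": {"Fn::GetAtt": ["LoadBalancer", "DNSName"]},
        }
    },
}
```

The same template can then be launched many times with different `InstanceCount` values, and each stack reports its own endpoint back to the user.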
Using a stack set, AWS customers can create stacks in multiple AWS accounts across Regions by using a single AWS CloudFormation template. All the resources included in each stack are defined by the stack set's AWS CloudFormation template.
- Once the stack set is defined, customers can create, update, or delete stacks in the target accounts and Regions they specify.
- While creating, updating, or deleting stacks, customers may also specify operation preferences, such as the order of Regions, the failure tolerance, and the number of accounts in which operations are performed on stacks concurrently.
A stack instance is a reference to a stack in a target account within a Region. A stack instance can exist without a stack. These are the status codes for stack instances within stack sets:
- CURRENT: The stack is currently up to date with the stack set.
- OUTDATED: The stack is not currently up to date with the stack set, because a CreateStackSet or UpdateStackSet operation on the associated stack failed, or because the stack was part of a CreateStackSet or UpdateStackSet operation that failed or was stopped before the stack was created or updated.
- INOPERABLE: A DeleteStackInstances operation has failed and left the stack in an unstable state. Stacks in this state are excluded from further UpdateStackSet operations. You might need to perform a DeleteStackInstances operation, with RetainStacks set to true, to delete the stack instance, and then delete the stack manually.
Stack set operation options
Maximum concurrent accounts: this setting is available in create, update, and delete workflows, and allows customers to specify the maximum number or percentage of target accounts in which an operation is performed at one time.
- A lower number or percentage means that an operation is performed in fewer target accounts at one time.
- For large deployments, under certain circumstances the actual number of accounts acted upon concurrently may be lower due to service throttling.
Failure tolerance: this setting is available in create, update, and delete workflows, and enables customers to specify the maximum number or percentage of stack operation failures that can occur, per Region, beyond which AWS CloudFormation stops an operation automatically.
- A lower number or percentage means that the operation is performed on fewer stacks, but you are able to start troubleshooting failed operations faster.
Retain stacks: this setting, available in delete stack workflows, lets customers keep stacks and their resources running even after they have been removed from a stack set. When customers retain stacks, AWS CloudFormation leaves stacks in individual accounts and Regions intact. Stacks are disassociated from the stack set, but the stack and its resources are saved.
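The operation preferences above are passed alongside the target accounts and Regions; as a sketch (accounts, Regions, and limits are placeholders), a `create_stack_instances` request might look like this, with the boto3 call shown commented out.

```python
# Sketch of stack set operation preferences for deploying stack instances.
# Account IDs, Regions, and preference values are placeholders.

request = {
    "StackSetName": "demo-stack-set",
    "Accounts": ["111111111111", "222222222222"],  # target accounts
    "Regions": ["us-east-1", "eu-west-1"],
    "OperationPreferences": {
        "RegionOrder": ["us-east-1", "eu-west-1"],  # Regions in this order
        "FailureToleranceCount": 1,   # stop after more than 1 failure/Region
        "MaxConcurrentCount": 2,      # accounts operated on at one time
    },
}
# boto3.client("cloudformation").create_stack_instances(**request)
```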
Administrator and target accounts
An administrator account is the AWS account in which you create stack sets. A stack set is managed by signing in to the AWS administrator account in which it was created.
- A target account is the account into which you create, update, or delete one or more stacks in your stack set.
- In order to use a stack set to create stacks in a target account, customers need to set up a trust relationship between the administrator and target accounts.
Self-managed permissions require customers to create the IAM roles needed by StackSets to deploy across accounts and Regions.
- These roles are necessary to establish a trusted relationship between the account customers administer the stack set from and the accounts they deploy stack instances to.
- Using this permissions model, StackSets can deploy to any AWS account in which the customer has permissions to create an IAM role.
Service-managed permissions allow customers to deploy stack instances to accounts managed by AWS Organizations.
- With this permissions model, customers don't have to create IAM roles; StackSets creates the IAM roles on their behalf.
- This model also enables automatic deployments to accounts that are added to the organization in the future.
Stack set operations
Create stack set: creating a new stack set includes specifying the AWS CloudFormation template that customers want to use to create stacks, specifying the target accounts in which they want to create stacks, and identifying the AWS Regions in which they want to deploy stacks in their target accounts.
- A stack set ensures consistent deployment of the same stack resources, with the same settings, to all specified target accounts within the Regions customers choose.
Update stack set: AWS customers can update a stack set in any of the following ways. They can change existing settings in the template or add new resources, such as updating parameter settings for a specific service or adding new Amazon EC2 instances.
- Customers can replace the template with a different template.
- They can add stacks in existing or additional target accounts, across existing or additional Regions.
- Template updates always affect all stacks; customers can't selectively update the template for some stacks in the stack set but not others.
Delete stacks: deleting stacks means removing a stack and all its associated resources from the target accounts customers specify, within the Regions they select.
Delete stack set: AWS customers can delete their stack set only when there are no stack instances in it.
A stack is a collection of resources that results from instantiating a template, and it can be created by supplying a template and any required parameters to AWS CloudFormation. AWS CloudFormation determines what AWS resources need to be created and in what order, based on the template and the dependencies specified in it. To update a stack, customers provide a template with the desired configuration of all of the resources in their stack.
- They can modify properties of the existing resources in their stack to react to changes in the environment or new application requirements.
- The changes will be made without affecting customers’ running applications. However, if a change cannot be made dynamically (such as updating the AMI on an EC2 instance), AWS CloudFormation will create a new resource and rewire it into the stack, deleting the old resource once the service has determined that the full update will be successful.
AWS CloudFormation will create or update a stack in its entirety. If a stack cannot be created or updated in its entirety, AWS CloudFormation will roll it back. For debugging purposes, the rollback operation can be disabled and the stack create or update can be manually retried at a later time.
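As an illustration, a minimal template might declare one parameter, one resource, and one output; all names and values below are hypothetical:

```yaml
# Hypothetical minimal CloudFormation template.
AWSTemplateFormatVersion: '2010-09-09'
Description: Minimal illustrative stack
Parameters:
  EnvName:
    Type: String
    Default: dev
Resources:
  LogBucket:                       # logical ID used for dependencies and updates
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub 'example-app-logs-${EnvName}'
Outputs:
  BucketName:
    Value: !Ref LogBucket
```

Supplying this template (plus a value for EnvName, if the default is not wanted) to CloudFormation produces a stack containing the bucket; updating the template and re-submitting it updates the stack.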
- Using AWS CloudFormation Designer, customers can create or modify a stack’s template and then submit it to AWS CloudFormation to create or update the stack.
- AWS CloudFormation Designer is available within the AWS Management Console.
- AWS CloudFormation can be easily accessed through the AWS Management Console, which is a point-and-click, web-based interface to deploy and manage stacks. Customers are allowed to create, delete, and update an application from inside the AWS Management Console in a few simple steps.
Bootstrapping Applications and Handling Updates
AWS CloudFormation provides a number of helper scripts that can be deployed to EC2 instances. These scripts provide a simple way to read resource metadata from a customer’s stack and use it to configure their application, deploy the packages and files listed in the template to the instance, and react to stack updates such as changes to the configuration or updates to the application. Here are some of the scripts that are available:
- cfn-get-metadata: Retrieve metadata attached to your resources in the template.
- cfn-init: Download and install packages and files described in your template.
- cfn-signal: Signal to the stack creation workflow that your application is up and running and ready to take traffic.
- cfn-hup: A daemon to listen for stack updates that were initiated through the AWS console, command line tools or API directly and execute your application-specific hooks to react to those changes.
Customers can use CloudFormation scripts on their own or in conjunction with CloudInit, a feature available on the Amazon Linux AMI and some other Linux AMIs. For more details of bootstrapping applications and updating configuration, see the AWS CloudFormation developer resources.
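As a sketch, a template’s EC2 instance resource can wire cfn-init and cfn-signal into its user data; the resource name, AMI ID, and timeout below are illustrative:

```yaml
# Hypothetical instance resource showing cfn-init and cfn-signal in UserData.
WebServer:
  Type: AWS::EC2::Instance
  CreationPolicy:
    ResourceSignal:
      Timeout: PT5M                # roll back if no success signal within 5 minutes
  Properties:
    ImageId: ami-12345678          # placeholder AMI ID
    InstanceType: t3.micro
    UserData:
      Fn::Base64: !Sub |
        #!/bin/bash -xe
        # Install the packages and files described in this resource's metadata
        /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} \
            --resource WebServer --region ${AWS::Region}
        # Report success or failure of cfn-init to the stack creation workflow
        /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} \
            --resource WebServer --region ${AWS::Region}
```

The CreationPolicy makes stack creation wait for the cfn-signal call, so the stack only succeeds once the instance reports that it is configured.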
AWS CloudFormation API
AWS CloudFormation provides a simple set of APIs that are easy to use and highly flexible. Some of the most commonly used APIs and their functionality are listed below:
CreateStack: Starts the creation of a new stack. The input parameters to the call include the stack name and a file name (or Amazon S3 URL) for the source template.
ListStacks: Lists all stacks in a customer’s account. Customers can use ListStacks to view the set of stacks and their current status, such as whether a stack is being created or updated.
ListStackResources: Lists all the AWS resource names and identifiers that were created as part of creating a stack. In addition to providing this information to customers, this call can be used by an AWS CloudFormation-aware application to understand its environment.
DescribeStackEvents: Lists all AWS CloudFormation generated operations and events for a stack so that customers can see how creation or deletion is progressing.
UpdateStack: Starts the update process for an existing stack. The input parameters to the call include the stack name and a file name (or Amazon S3 URL) for the updated template.
AWS CloudFormation is integrated with the Amazon Simple Notification Service (Amazon SNS), which enables customers to receive notifications as the creation, update, and deletion of a stack progresses.
AWS CloudTrail is an AWS service that enables governance, compliance, and operational and risk auditing of customers’ AWS accounts. It also records activity made in a customer’s account and delivers log files to an Amazon S3 bucket they specify. Using CloudTrail, customers can log, continuously monitor, and retain account activity related to actions across their AWS infrastructure.
- CloudTrail provides event history of customers AWS account activity, including actions taken through the AWS Management Console, AWS SDKs, command line tools, and other AWS services.
- CloudTrail detects unusual activity in customers’ AWS accounts, which helps simplify operational analysis and troubleshooting.
- Actions taken by a user, role, or an AWS service are recorded as events in CloudTrail.
- CloudTrail can be integrated into applications using the API, and trail creation can be automated for a customer’s business or organization.
CloudTrail provides visibility into user activity by recording actions taken on customers account. CloudTrail records important information about each action, including who made the request, the services used, the actions performed, parameters for the actions, and the response elements returned by the AWS service.
- This information helps AWS customers to track changes made to their AWS resources and to troubleshoot operational issues. CloudTrail makes it easier to ensure compliance with internal policies and regulatory standards.
- AWS CloudTrail shows the results of the CloudTrail Event History for the current Region customers are viewing, for the last 90 days. These events are limited to create, modify, and delete API calls and account activity.
- For a complete record of account activity, including all management events, data events, and read-only activity, customers need to configure a CloudTrail trail.
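CloudTrail captures the caller, service, action, parameters, and time of each request in a JSON event record; a trimmed, illustrative example (all values invented):

```json
{
  "eventTime": "2020-01-01T12:00:00Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "StartInstances",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {
    "type": "IAMUser",
    "userName": "alice"
  },
  "requestParameters": {
    "instancesSet": {
      "items": [{"instanceId": "i-0abc1234def567890"}]
    }
  }
}
```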
Using CloudTrail log file integrity validation, AWS customers can determine whether a log file was modified, deleted, or unchanged after CloudTrail delivered it. This feature is built using industry-standard algorithms: SHA-256 for hashing and SHA-256 with RSA for digital signing. This makes it computationally infeasible to modify, delete, or forge CloudTrail log files without detection.
- Customers can use the AWS CLI to validate the files in the location where CloudTrail delivered them.
- A validated log file enables customers to assert positively that the log file itself has not changed, or that particular user credentials performed specific API activity.
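Conceptually, the integrity check reduces to recomputing a log file’s SHA-256 digest and comparing it with the digest recorded at delivery time. A minimal sketch in Python (the digest-file parsing and RSA signature verification that the real `aws cloudtrail validate-logs` command performs are omitted):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a log file's bytes, as hex."""
    return hashlib.sha256(data).hexdigest()

def is_unmodified(log_bytes: bytes, recorded_digest: str) -> bool:
    """Compare a freshly computed digest with the one recorded when
    CloudTrail delivered the file; any mismatch means tampering."""
    return sha256_hex(log_bytes) == recorded_digest

original = b'{"Records": []}'
digest_at_delivery = sha256_hex(original)
print(is_unmodified(original, digest_at_delivery))             # True
print(is_unmodified(b'{"Records": [1]}', digest_at_delivery))  # False
```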
Server-side encryption is the encryption of data at its destination by the application or service that receives it. AWS Key Management Service (AWS KMS) is a service that combines secure, highly available hardware and software to provide a key management system scaled for the cloud. Amazon S3 uses AWS KMS customer master keys (CMKs) to encrypt customers’ Amazon S3 objects. AWS KMS encrypts only the object data. Here is what customers can do with it:
- Create and manage the CMK encryption keys yourself.
- Use a single CMK to encrypt and decrypt log files for multiple accounts across all regions.
- Customers have control over who can use their key for encrypting and decrypting CloudTrail log files.
- Have enhanced security. With this feature, in order to read log files, the following permissions are required:
- A user must have S3 read permissions for the bucket that contains the log files.
- A user must also have a policy or role applied that allows decrypt permissions by the CMK policy.
- Because S3 automatically decrypts the log files for requests from users authorized to use the CMK, SSE-KMS encryption for CloudTrail log files is backward-compatible with applications that read CloudTrail log data.
A trail can be applied to all Regions or a single Region. As a best practice, create a trail that applies to all Regions in the AWS partition in which you are working. This is the default setting when you create a trail in the CloudTrail console. A trail that applies to all AWS Regions has the following advantages:
- The configuration settings for the trail apply consistently across all AWS Regions.
- Receive CloudTrail events from all AWS Regions in a single Amazon S3 bucket and, optionally, in a CloudWatch Logs log group.
- Manage trail configuration for all AWS Regions from one location.
- Receive events from a new AWS Region. When a new AWS Region is launched, CloudTrail automatically creates a trail in that Region with the same settings as the all-Region trail.
- Any activity in any AWS Region is logged in a trail that applies to all AWS Regions.
Data events provide insights into the resource (“data plane”) operations performed on or within the resource itself. Data events are often high volume activities and include operations such as Amazon S3 object level APIs and AWS Lambda function invoke APIs.
- By logging API actions on Amazon S3 objects, customers can receive detailed information such as the AWS account, IAM user or role, and IP address of the caller, the time of the API call, and other details.
- They can record activity of their Lambda functions, and receive details on Lambda function executions, such as the IAM user or service that made the Invoke API call, when the call was made, and which function was executed.
AWS Lambda:- Amazon S3 bucket notifications can publish object-created events to AWS Lambda. When CloudTrail writes logs to an S3 bucket, Amazon S3 can invoke a customer’s Lambda function to process the access records logged by CloudTrail.
Amazon CloudWatch Logs:- AWS CloudTrail integration with Amazon CloudWatch Logs enables customers to send management and data events recorded by CloudTrail to CloudWatch Logs.
- CloudWatch Logs allows customers to create metric filters to monitor events, search events, and stream events to other AWS services, such as AWS Lambda and Amazon Elasticsearch Service.
Amazon CloudWatch Events:- AWS CloudTrail integration with Amazon CloudWatch Events enables customers to automatically respond to changes to their AWS resources.
- With CloudWatch Events, customers are able to define actions to execute when specific events are logged by AWS CloudTrail.
- Customers can create a CloudWatch Events rule that sends this activity to an AWS Lambda function. Lambda can then execute a workflow to create a ticket in their IT Helpdesk system.
CloudTrail integration with CloudWatch Logs delivers management and data events captured by CloudTrail to a CloudWatch Logs log stream in the CloudWatch Logs log group you specify.
Using the Amazon Athena service, customers can achieve the following:
- Using Athena with CloudTrail logs is a powerful way to enhance analysis of AWS service activity. Customers can use queries to identify trends and further isolate activity by attribute, such as source IP address or user.
- Customers can automatically create tables for querying logs directly from the CloudTrail console, and use those tables to run queries in Athena.
Amazon CloudWatch Logs:- Customers can configure CloudTrail with CloudWatch Logs to monitor their trail logs and be notified when specific activity occurs.
- AWS clients can define CloudWatch Logs metric filters that trigger CloudWatch alarms and send notifications when those alarms fire.
Customers can create a trail in the master account for an organization that collects all event data for all AWS accounts in an organization in AWS Organizations, known as an organization trail.
- Creating an organization trail helps customers define a uniform event logging strategy for their organization.
- An organization trail is applied automatically to each AWS account in customers organization.
- Users in member accounts can see these trails but cannot modify them.
An event in CloudTrail is the record of an activity in an AWS account. This activity can be an action taken by a user, role, or service that is monitorable by CloudTrail. CloudTrail events provide a history of both API and non-API account activity made through the AWS Management Console, AWS SDKs, command line tools, and other AWS services. There are two types of events that can be logged in CloudTrail: management events and data events. By default, trails log management events, but not data events.
Management events provide information about management operations that are performed on resources in the customer AWS account. These are also known as control plane operations.
- Configuring security (for example, IAM AttachRolePolicy API operations).
- Registering devices (for example, Amazon EC2 CreateDefaultVpc API operations).
- Configuring rules for routing data (for example, Amazon EC2 CreateSubnet API operations).
- Setting up logging (for example, AWS CloudTrail CreateTrail API operations).
Management events can also include non-API events that occur in customers account.
An organization trail is a configuration that enables delivery of CloudTrail events in the master account and all member accounts in an organization to the same Amazon S3 bucket, CloudWatch Logs, and CloudWatch Events. Creating an organization trail helps customers define a uniform event logging strategy for their business or organization.
- Users with CloudTrail permissions in member accounts will be able to see this trail (including the trail ARN) when they log in to the AWS CloudTrail console from their AWS accounts, or when they run AWS CLI commands such as describe-trails (although member accounts must use the ARN for the organization trail, and not the name, when using the AWS CLI).
- When customers create an organization trail in the console, or enable CloudTrail as a trusted service in AWS Organizations, a service-linked role is created to perform logging tasks in their organization’s member accounts. This role is referred to as AWSServiceRoleForCloudTrail.
About Global Service Events
For most services, events are recorded in the region where the action occurred. For global services such as AWS Identity and Access Management (IAM), AWS STS, and Amazon CloudFront, events are delivered to any trail that includes global services, and are logged as occurring in US East (N. Virginia) Region. To avoid receiving duplicate global service events, remember the following:
- Global service events are delivered by default to trails that are created using the CloudTrail console. Events are delivered to the bucket for the trail.
- If AWS customers have multiple single region trails, consider configuring their trails so that global service events are delivered in only one of the trails.
- If customers change the configuration of a trail from logging all regions to logging a single region, global service event logging is turned off automatically for that trail.
- If customers change the configuration of a trail from logging a single region to logging all regions, global service event logging is turned on automatically for that trail.
CloudTrail Insights events capture unusual activity in customers’ AWS accounts. When enabled, and CloudTrail detects unusual activity, Insights events are logged to a different folder or prefix in the destination S3 bucket for the trail. Insights events are logged when CloudTrail detects unusual write management API activity in the account.
- AWS clients can see the type of insight and the incident time period when they view Insights events on the CloudTrail console.
- Insights events provide relevant information, such as the associated API, incident time, and statistics, that help understand and act on unusual activity.
- Insights events are logged only when CloudTrail detects changes in customers account’s API usage that differ significantly from the account’s typical usage patterns.
- CloudTrail Insights analyzes write management events that occur in a single Region, not globally. A CloudTrail Insights event is generated in the same Region as its supporting management events are generated.
Customers can use and manage the CloudTrail service with the AWS CloudTrail console. The console provides a user interface for performing many CloudTrail tasks such as:
- Viewing recent events and event history for your AWS account.
- Downloading a filtered or complete file of the last 90 days of events.
- Creating and editing CloudTrail trails.
- Configuring CloudTrail trails, including:
- Selecting an Amazon S3 bucket.
- Setting a prefix.
- Configuring delivery to CloudWatch Logs.
- Using AWS KMS keys for encryption.
- Enabling Amazon SNS notifications for log file delivery.
- Adding and managing tags for your trails.
Data events provide information about the resource operations performed on or in a resource. These are also known as data plane operations. Data events are often high-volume activities. Example data events include:
- Amazon S3 object-level API activity (for example, GetObject, DeleteObject, and PutObject API operations).
- AWS Lambda function execution activity (the Invoke API).
Data events are disabled by default when AWS customers create a trail. To record CloudTrail data events, customers need to explicitly add to a trail the supported resources or resource types for which they want to collect activity.
Amazon CloudWatch is a monitoring service for AWS cloud resources and applications customers run on AWS. It’s built for DevOps engineers, developers, site reliability engineers (SREs), and IT managers. CloudWatch collects monitoring and operational data in the form of logs, metrics, and events, providing customers with a unified view of AWS resources, applications, and services that run on AWS and on-premises servers. Using Amazon CloudWatch, customers can collect and track metrics, collect and monitor log files, and set alarms.
- Amazon CloudWatch can monitor AWS resources such as Amazon EC2 instances, Amazon DynamoDB tables, and Amazon RDS DB instances, as well as custom metrics generated by customers applications and services, and any log files their applications generate.
- With CloudWatch, AWS customers gain system-wide visibility into resource utilization, application performance, and operational health.
- Using the metrics, customers can calculate statistics and then present the data graphically in the CloudWatch console.
- To provide additional scalability and reliability, each data center facility is located in a specific geographical area (Region). Each Region is designed to be completely isolated from the other Regions, to achieve the greatest possible failure isolation and stability.
Amazon CloudWatch dashboards enable customers to create re-usable graphs and helps them visualize their cloud resources and applications in a unified view. Customers can correlate the log pattern of a specific metric and set alarms to be proactively alerted about performance and operational issues.
- This gives customers system-wide visibility into operational health and the ability to quickly troubleshoot issues, reducing Mean Time to Resolution (MTTR).
- Amazon CloudWatch alarms allow customers to set a threshold on metrics and trigger an action.
- Real-time alarm on metrics and events enables customers to minimize downtime and potential business impact.
Amazon CloudWatch correlates metrics and logs, helping customers to quickly go from diagnosing a problem to understanding its root cause. Amazon CloudWatch Application Insights for .NET and SQL Server enables customers to monitor .NET and SQL Server applications, so that they can get visibility into the health of such applications.
- It helps identify and set up key metrics and logs across customers application resources and technology stack.
Using Amazon CloudWatch ServiceLens, customers can visualize and analyze the health, performance, and availability of their applications in a single place. CloudWatch ServiceLens ties together CloudWatch metrics and logs as well as traces from AWS X-Ray to give customers a complete view of the applications and their dependencies.
- This enables customers to quickly pinpoint performance bottlenecks, isolate root causes of application issues, and determine users impacted.
- Customers can gain visibility into their applications through three different ways: Through Infrastructure monitoring, Transaction monitoring, and End user monitoring.
Amazon CloudWatch Synthetics allows AWS customers to monitor application endpoints more easily. It runs tests on the endpoints every minute, 24×7, and alerts them as soon as their application endpoints don’t behave as expected.
- These tests can be customized to check for availability, latency, transactions, broken or dead links, step by step task completions, page load errors, load latencies for UI assets, complex wizard flows, or checkout flows in your applications.
- It also can be used to isolate alarming application endpoints and map them back to underlying infrastructure issues to reduce mean time to resolution.
- CloudWatch Synthetics supports monitoring of customers REST APIs, URLs, and website content, checking for unauthorized changes from phishing, code injection and cross-site scripting.
Auto Scaling enables AWS customers to automate capacity and resource planning. They can set a threshold to alarm on a key metric and trigger an automated Auto Scaling action.
- CloudWatch Events provides a near real-time stream of system events that describe changes to customer AWS resources.
- It allows customers to respond quickly to operational changes and take corrective action.
The Amazon CloudWatch Logs service allows customers to collect and store logs from their resources, applications, and services in near real-time.
- There are three main categories of logs: vended logs, logs that are published by AWS services, and custom logs.
Amazon CloudWatch enables customers to collect default metrics from more than 70 AWS services, such as Amazon EC2, Amazon DynamoDB, Amazon S3, Amazon ECS, AWS Lambda, and Amazon API Gateway.
Using CloudWatch, customers can collect custom metrics from their own applications to monitor operational performance, troubleshoot issues, and spot trends. Container Insights simplifies the collection and aggregation of curated metrics and container ecosystem logs.
- It collects compute performance metrics such as CPU, memory, network, and disk information from each container as performance events.
AWS Batch allows customers to set up multiple queues with different priority levels. Batch jobs are stored in the queues until compute resources are available to execute the job. The AWS Batch scheduler evaluates when, where, and how to run jobs that have been submitted to a queue based on the resource requirements of each job.
- The scheduler evaluates the priority of each queue and runs jobs in priority order on optimal compute resources (such as memory-optimized vs CPU-optimized), as long as those jobs have no outstanding dependencies.
Container Insights provides automatic dashboards in the CloudWatch console. These dashboards summarize the compute performance, errors, and alarms by cluster, pod/task, and service.
- For Amazon EKS and k8s, dashboards are also available for nodes/EC2 instances and namespaces.
Amazon CloudWatch Anomaly Detection applies machine-learning algorithms to continuously analyze data of a metric and identify anomalous behavior. It enables customers to create alarms that auto-adjust thresholds based on natural metric patterns, such as time of day, day of week seasonality, or changing trends.
- AWS customers can visualize metrics with anomaly detection bands on dashboards, which enables them to monitor, isolate, and troubleshoot unexpected changes in their metrics.
Amazon CloudWatch allows its customers to monitor trends and seasonality with 15 months of metric data (storage and retention), which helps them perform historical analysis to fine-tune resource utilization.
- Amazon CloudWatch Metric Math enables customers to perform calculations across multiple metrics for real-time analysis so that they can derive insights from the existing CloudWatch metrics.
- Amazon CloudWatch Logs Insights enables customers to drive actionable intelligence from their logs to address operational issues.
Container Insights simplifies the analysis of observable data from metrics, logs, and traces by deep linking from automatic dashboards to granular performance events, application logs (stdout/stderr), custom logs, predefined Amazon EC2 instance logs, Amazon EKS/k8s data plane logs, and Amazon EKS control plane logs using CloudWatch Logs Insights’ advanced query language.
Amazon Simple Notification Service (Amazon SNS) coordinates and manages the delivery or sending of messages to subscribing endpoints or clients.
- Using Amazon SNS with CloudWatch customers can send messages when an alarm threshold has been reached.
Amazon EC2 Auto Scaling enables customers to automatically launch or terminate Amazon EC2 instances based on user-defined policies, health status checks, and schedules.
- AWS customers can use a CloudWatch alarm with Amazon EC2 Auto Scaling to scale their EC2 instances based on demand.
AWS CloudTrail enables customers to monitor the calls made to the Amazon CloudWatch API for their account, including calls made by the AWS Management Console, AWS CLI, and other services.
- When CloudTrail logging is turned on, CloudWatch writes log files to the Amazon S3 bucket that customers specified when they configured CloudTrail.
AWS Identity and Access Management (IAM) is a web service that helps AWS clients securely control access to AWS resources for their users.
- Using IAM, customers can control who can use their AWS resources (authentication) and what resources they can use and in which ways (authorization).
Amazon CloudWatch is basically a metrics repository. An AWS service—such as Amazon EC2—puts metrics into the repository, and AWS customers retrieve statistics based on those metrics.
Statistics are metric data aggregations over specified periods of time. CloudWatch provides statistics based on metric data points, whether published as clients’ custom data or provided to CloudWatch by other AWS services.
- Aggregations are made using the namespace, metric name, dimensions, and the data point unit of measure, within the time period customers specify.
Each statistic has a unit of measure, such as Bytes, Seconds, Count, and Percent. Customers can specify a unit when they create a custom metric. Units help provide conceptual meaning to the data. Though CloudWatch attaches no significance to a unit internally, other applications can derive semantic information based on the unit.
- Metric data points that specify a unit of measure are aggregated separately.
- If customers get statistics without specifying a unit, CloudWatch aggregates all data points of the same unit together.
- If there are two otherwise identical metrics with different units, two separate data streams are returned, one for each unit.
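The grouping rule above can be sketched in Python: data points aggregate together only when namespace, metric name, dimensions, and unit all match, so two otherwise identical metrics with different units form two groups. All names and values below are invented:

```python
from collections import defaultdict

def aggregate(points):
    """Group data points by (namespace, name, dimensions, unit) and
    compute CloudWatch-style statistics for each group."""
    groups = defaultdict(list)
    for p in points:
        key = (p["namespace"], p["name"],
               tuple(sorted(p["dimensions"].items())), p["unit"])
        groups[key].append(p["value"])
    return {
        key: {
            "SampleCount": len(vs),
            "Sum": sum(vs),
            "Minimum": min(vs),
            "Maximum": max(vs),
            "Average": sum(vs) / len(vs),
        }
        for key, vs in groups.items()
    }

points = [
    {"namespace": "MyApp", "name": "Latency", "dimensions": {"Host": "a"},
     "unit": "Milliseconds", "value": 10},
    {"namespace": "MyApp", "name": "Latency", "dimensions": {"Host": "a"},
     "unit": "Milliseconds", "value": 30},
    # Same metric name, different unit: aggregated separately.
    {"namespace": "MyApp", "name": "Latency", "dimensions": {"Host": "a"},
     "unit": "Seconds", "value": 1},
]
for key, s in aggregate(points).items():
    print(key[3], s["Average"])
```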
Metrics are the fundamental concept in CloudWatch. A metric represents a time-ordered set of data points that are published to CloudWatch. Think of a metric as a variable to monitor, and the data points as representing the values of that variable over time.
- When publishing metrics to CloudWatch, customers can add the data points in any order, and at any rate they choose.
- Metrics exist only in the Region in which they are created. Although metrics cannot be deleted, they automatically expire after 15 months if no new data is published to them.
- Data points older than 15 months expire on a rolling basis; as new data points come in, data older than 15 months is dropped.
- Metrics are uniquely defined by a name, a namespace, and zero or more dimensions. Each data point in a metric has a time stamp and a unit of measure.
CloudWatch treats each unique combination of dimensions as a separate metric, even if the metrics have the same metric name. Customers can only retrieve statistics using combinations of dimensions that they specifically published.
- When retrieving statistics, customers can specify the same values for the namespace, metric name, and dimension parameters that were used when the metrics were created.
- They can also specify the start and end times for CloudWatch to use for aggregation.
A period is the length of time associated with a specific Amazon CloudWatch statistic. Each statistic represents an aggregation of the metrics data collected for a specified period of time. Periods are defined in numbers of seconds, and valid values for period are 1, 5, 10, 30, or any multiple of 60.
- Only custom metrics that customers define with a storage resolution of 1 second support sub-minute periods.
- Even though the option to set a period below 60 seconds is always available in the console, customers should select a period that aligns with how the metric is stored.
- Periods are important for CloudWatch alarms. When creating an alarm to monitor a specific metric, customers are asking CloudWatch to compare that metric to the threshold value they specify.
- Customers can not only specify the period over which the comparison is made, but they can also specify how many evaluation periods are used to arrive at a conclusion.
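The valid-period rule can be expressed as a small predicate; this is a sketch, not the service’s actual validation code:

```python
def is_valid_period(seconds: int) -> bool:
    """A period is valid if it is 1, 5, 10, or 30 seconds,
    or any positive multiple of 60 seconds."""
    return seconds in (1, 5, 10, 30) or (seconds > 0 and seconds % 60 == 0)

print(is_valid_period(30))    # True
print(is_valid_period(300))   # True (5 minutes)
print(is_valid_period(45))    # False
```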
A namespace is a container for CloudWatch metrics. Metrics in different namespaces are isolated from each other, so that metrics from different applications are not mistakenly aggregated into the same statistics.
- There is no default namespace. Customers must specify a namespace for each data point they publish to CloudWatch.
- Customers can specify a namespace name during the creation of a metric. These names must contain valid XML characters, and be fewer than 256 characters in length.
Each metric data point must be associated with a time stamp. The time stamp can be up to two weeks in the past and up to two hours into the future. CloudWatch creates a time stamp for customers based on the time the data point was received if they didn’t provide one.
- Time stamps are dateTime objects, with the complete date plus hours, minutes, and seconds.
- CloudWatch alarms check metrics based on the current time in UTC (Universal Time). Custom metrics sent to CloudWatch with time stamps other than the current UTC time can cause alarms to display the Insufficient Data state or result in delayed alarms.
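The accepted timestamp window (up to two weeks in the past, up to two hours in the future) can be sketched as a simple check; illustrative only:

```python
from datetime import datetime, timedelta, timezone

def is_accepted_timestamp(ts: datetime, now: datetime) -> bool:
    """True when ts falls within the accepted window relative to now."""
    return now - timedelta(weeks=2) <= ts <= now + timedelta(hours=2)

now = datetime(2020, 1, 15, tzinfo=timezone.utc)
print(is_accepted_timestamp(now - timedelta(days=13), now))  # True
print(is_accepted_timestamp(now - timedelta(days=15), now))  # False (too old)
print(is_accepted_timestamp(now + timedelta(hours=3), now))  # False (too far ahead)
```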
A dimension is a name/value pair that is part of the identity of a metric, and AWS customers can assign up to 10 dimensions to a metric. Every metric has specific characteristics that describe it. Dimensions can be described as categories for those characteristics.
- Dimensions help customers design a structure for their statistics plan. Because dimensions are part of the unique identifier for a metric, whenever customers add a unique name/value pair to one of the metrics, by default they are creating a new variation of that metric.
- AWS services that send data to CloudWatch attach dimensions to each metric. Customers can use dimensions to filter the results that CloudWatch returns.
- For metrics produced by certain AWS services, such as Amazon EC2, CloudWatch can aggregate data across dimensions.
CloudWatch retains metric data as follows:
- Data points with a period of less than 60 seconds are available for 3 hours. These data points are high-resolution custom metrics.
- Data points with a period of 60 seconds (1 minute) are available for 15 days.
- Data points with a period of 300 seconds (5 minutes) are available for 63 days.
- Data points with a period of 3600 seconds (1 hour) are available for 455 days (15 months).
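The retention schedule above can be sketched as a lookup; this is a simplification that ignores the rolling aggregation of older data into coarser periods:

```python
def retention_days(period_seconds: int) -> float:
    """Return how many days a data point of the given period is retained."""
    if period_seconds < 60:
        return 3 / 24          # 3 hours, expressed in days
    if period_seconds < 300:
        return 15
    if period_seconds < 3600:
        return 63
    return 455                 # 15 months

print(retention_days(1))     # 0.125 (3 hours)
print(retention_days(60))    # 15
print(retention_days(300))   # 63
print(retention_days(3600))  # 455
```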
A percentile indicates the relative standing of a value in a dataset. Percentiles help customers to get a better understanding of the distribution of their metric data. Percentiles are often used to isolate anomalies.
- Using percentiles, customers can monitor the 95th percentile of CPU utilization to check for instances with an unusually heavy load.
- Some CloudWatch metrics support percentiles as a statistic. For these metrics, customers can monitor their systems and applications using percentiles as they would with the other CloudWatch statistics.
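To illustrate why percentiles isolate anomalies, here is one common linear-interpolation definition of a percentile (CloudWatch’s internal method may differ); the CPU values are invented:

```python
def percentile(values, p):
    """Return the p-th percentile using linear interpolation between ranks."""
    vs = sorted(values)
    rank = (p / 100) * (len(vs) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(vs) - 1)
    return vs[lo] + (vs[hi] - vs[lo]) * (rank - lo)

# Mostly idle instances plus two with unusually heavy load:
cpu = [12, 15, 14, 90, 13, 16, 11, 14, 15, 95, 13]
print(percentile(cpu, 50))  # the median stays low
print(percentile(cpu, 95))  # the 95th percentile exposes the heavy load
```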
AWS customers are able to use an alarm to automatically initiate actions on their behalf. An alarm watches a single metric over a specified time period and performs one or more specified actions, based on the value of the metric relative to a threshold over time.
- The action is a notification sent to an Amazon SNS topic or an Auto Scaling policy.
- Alarms invoke actions for sustained state changes only. CloudWatch alarms do not invoke actions simply because they are in a particular state; the state must have changed and been maintained for a specified number of periods.
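The sustained-state rule can be sketched as follows; this is a simplification of real alarm evaluation, which also handles missing data and more alarm states:

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Return ALARM only when the metric breaches the threshold for
    the last `evaluation_periods` consecutive data points."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    return "ALARM" if all(v > threshold for v in recent) else "OK"

print(alarm_state([50, 85, 70], threshold=80, evaluation_periods=2))      # OK: not sustained
print(alarm_state([50, 85, 90, 95], threshold=80, evaluation_periods=3))  # ALARM
```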