Amazon Glacier is a low-cost storage service designed for data that is infrequently accessed, used mainly for data archiving and long-term backup. Amazon Glacier retrieval jobs typically complete in 3 to 5 hours. Glacier provides the same security and durability as S3. Amazon Glacier enables customers to offload the administrative burdens of operating and scaling storage to AWS, so that they don’t have to worry about capacity planning, hardware provisioning, data replication, hardware failure detection and repair, or time-consuming hardware migrations.
- Amazon Glacier is designed for use with other Amazon web services. You can seamlessly move data between Amazon Glacier and Amazon S3 using S3 data life-cycle policies.
- Customers can use Amazon Glacier to archive offsite enterprise information, media assets, and research and scientific data, as well as for digital preservation and magnetic tape replacement.
Amazon Glacier is designed to provide average annual durability of 99.999999999 percent (11 nines) for an archive. The service redundantly stores data in multiple facilities and on multiple devices within each facility. To increase durability, Amazon Glacier synchronously stores customers’ data across multiple facilities before returning SUCCESS on uploading an archive.
Amazon Glacier scales to meet growing and often unpredictable storage requirements. A single archive is limited to 40 TB in size, but there is no limit to the total amount of data that customers can store in the service. Whether the customer wants to store petabytes or gigabytes, Amazon Glacier automatically and seamlessly scales their storage up or down as needed.
Amazon Glacier uses server-side encryption to encrypt all data at rest. Amazon Glacier handles key management and key protection for its clients by using one of the strongest block ciphers available, 256-bit Advanced Encryption Standard (AES256). Clients who want to manage their own keys can encrypt data prior to uploading it.
There are two ways that AWS clients can use Amazon Glacier:
- Amazon Glacier provides a native, standards-based REST web services interface. This interface can be accessed using the Java SDK or the .NET SDK. Customers can use the AWS Management Console or Amazon Glacier API actions to create vaults to organize the archives in Amazon Glacier.
- Amazon Glacier can be used as a storage class in Amazon S3 by using object lifecycle management that provides automatic, policy-driven archiving from Amazon S3 to Amazon Glacier. Clients can simply set one or more life-cycle rules for an Amazon S3 bucket, defining what objects should be transitioned to Amazon Glacier and when.
There are two paths for data into S3 Glacier: the S3 PUT API for direct uploads, and S3 Lifecycle management for automatic migration of objects.
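A life-cycle transition rule of the kind described above can be sketched with boto3; the bucket name and prefix below are hypothetical, and the actual call requires AWS credentials:

```python
def lifecycle_config(prefix: str, glacier_after_days: int) -> dict:
    """Build an S3 lifecycle configuration that transitions matching
    objects to the GLACIER storage class after the given number of days."""
    return {
        "Rules": [
            {
                "ID": "archive-to-glacier",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }

# Applying the rule (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-archive-bucket",  # hypothetical bucket name
#     LifecycleConfiguration=lifecycle_config("logs/", 30),
# )
```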
Data stored in the GLACIER storage class has a minimum storage duration period of 90 days. AWS clients can reliably store any amount of data at costs that are competitive with or cheaper than on-premises solutions. To keep costs low yet suitable for varying needs, Amazon Glacier provides three retrieval options that range from a few minutes to several hours.
- Expedited retrieval typically returns data in 1-5 minutes, and it is great for Active Archive use cases.
- The expedited retrieval cost is $0.03 per gigabyte.
- Standard retrieval typically completes in 3-5 hours, and works well for less time-sensitive needs like backup data, media editing, or long-term analytics.
- The standard retrieval cost is $0.01 per gigabyte.
- Bulk retrieval is the lowest-cost retrieval option, returning large amounts of data within 5-12 hours.
- The bulk retrieval cost is $0.0025 per gigabyte.
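The three tiers map directly onto the `RestoreRequest` payload of the S3 `restore_object` call, which makes a temporary copy of an archived object available. A sketch (bucket and key names below are hypothetical):

```python
# Approximate retrieval characteristics per tier, as listed above.
RETRIEVAL_TIERS = {
    "Expedited": {"typical_time": "1-5 minutes", "usd_per_gb": 0.03},
    "Standard":  {"typical_time": "3-5 hours",   "usd_per_gb": 0.01},
    "Bulk":      {"typical_time": "5-12 hours",  "usd_per_gb": 0.0025},
}

def restore_request(days: int, tier: str) -> dict:
    """Build the RestoreRequest payload for s3.restore_object; the restored
    copy stays available for `days` days."""
    if tier not in RETRIEVAL_TIERS:
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}

# Initiating a restore (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3").restore_object(
#     Bucket="my-archive-bucket",       # hypothetical bucket/key
#     Key="backups/2020-01.tar.gz",
#     RestoreRequest=restore_request(days=7, tier="Standard"),
# )
```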
Amazon Glacier has vaults, which are like a safe deposit box or locker. Customers can group multiple archives and put them in a vault; in other words, a vault is a container for archives. A vault gives customers the ability to organize their data residing in Amazon Glacier.
- Amazon Glacier Vault Lock allows customers to easily deploy and enforce compliance controls on individual Glacier vaults via a lockable policy.
- Customers can specify controls such as a Write Once Read Many (WORM) in a Vault Lock policy and lock the policy from future edits.
- Once locked, the policy becomes immutable, and Glacier will enforce the prescribed controls to help achieve customers’ compliance objectives.
- Glacier maintains a cold index of archives refreshed every 24 hours, which is known as an inventory or vault inventory.
- Whenever customers want to retrieve an archive and vault inventory, they need to submit a Glacier job, which is going to run behind the scenes to deliver them the files requested.
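Both the vault inventory and an individual archive are fetched by submitting such a Glacier job. A sketch of the two job-parameter shapes used by the native API (the vault name in the commented call is hypothetical):

```python
def inventory_job_params() -> dict:
    """Job parameters for retrieving a vault inventory."""
    return {"Type": "inventory-retrieval", "Format": "JSON"}

def archive_job_params(archive_id: str, tier: str = "Standard") -> dict:
    """Job parameters for retrieving a single archive."""
    return {"Type": "archive-retrieval", "ArchiveId": archive_id, "Tier": tier}

# Submitting a job (requires boto3 and AWS credentials); the job runs behind
# the scenes and its result is fetched later with get_job_output:
# import boto3
# glacier = boto3.client("glacier")
# job = glacier.initiate_job(
#     accountId="-",                # "-" means the caller's own account
#     vaultName="my-vault",         # hypothetical vault name
#     jobParameters=inventory_job_params(),
# )
```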
Deep-Archive is used for archiving data that rarely needs to be accessed. Data stored in the Deep-Archive storage class has a minimum storage duration period of 180 days and a default retrieval time of 12 hours.
- The Amazon S3 Glacier Deep-Archive storage class provides two retrieval options ranging from 12-48 hours.
- DEEP_ARCHIVE is the lowest cost storage option in AWS.
Both the Amazon S3 Glacier and S3 Glacier Deep Archive storage classes offer sophisticated integration with AWS CloudTrail to log, monitor, and retain storage API call activities for auditing, and support three different forms of encryption.
All of the storage classes except S3 One Zone-IA are designed to be resilient to the simultaneous complete loss of data in one Availability Zone and partial loss in another Availability Zone.
Amazon S3 versioning helps protect customers’ data against accidental or malicious deletion by keeping multiple versions of each object in the bucket, each identified by a unique version ID.
- Versioning allows customers to preserve, retrieve, and restore every version of every object stored in their Amazon S3 bucket.
- If a client makes an accidental change or even maliciously deletes an object in their S3 bucket, they can restore the object to its original state simply by referencing the version ID in addition to the bucket and object key.
- Versioning is turned on at the bucket level. Once enabled, versioning cannot be removed from a bucket; it can only be suspended.
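As a sketch, enabling versioning and restoring an old version (by copying it back over the current one) look like this with boto3; bucket, key, and version ID below are hypothetical:

```python
def versioning_config(enabled: bool) -> dict:
    """VersioningConfiguration payload; once enabled, versioning can only
    be suspended, never removed."""
    return {"Status": "Enabled" if enabled else "Suspended"}

def restore_copy_source(bucket: str, key: str, version_id: str) -> dict:
    """CopySource payload that restores an old version by copying it back
    over the current version of the same key."""
    return {"Bucket": bucket, "Key": key, "VersionId": version_id}

# Usage (requires boto3 and AWS credentials):
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_versioning(Bucket="my-bucket",
#                          VersioningConfiguration=versioning_config(True))
# s3.copy_object(Bucket="my-bucket", Key="report.csv",
#                CopySource=restore_copy_source("my-bucket", "report.csv",
#                                               "EXAMPLE-VERSION-ID"))
```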
Cross-origin resource sharing (CORS) defines a way for client web applications that are loaded in one domain to interact with resources in a different domain.
- Customers can build rich client-side web applications with Amazon S3 and selectively allow cross-origin access to their Amazon S3 resources.
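Cross-origin access is granted via a CORS configuration on the bucket. A minimal sketch (bucket name and origin are hypothetical):

```python
def cors_rule(origins: list, methods: list, max_age_seconds: int = 3000) -> dict:
    """A single CORS rule allowing the given origins and HTTP methods."""
    return {
        "AllowedOrigins": list(origins),
        "AllowedMethods": list(methods),
        "AllowedHeaders": ["*"],
        "MaxAgeSeconds": max_age_seconds,
    }

# Applying the configuration (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_cors(
#     Bucket="my-web-assets",  # hypothetical bucket name
#     CORSConfiguration={"CORSRules": [cors_rule(["https://example.com"],
#                                                ["GET", "HEAD"])]},
# )
```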
- Cross-region replication allows customers to asynchronously replicate all new objects in the source bucket in one AWS Region to a target bucket in another Region. Any metadata and ACLs associated with the object are part of the replication.
- Cross-region replication is commonly used to reduce the latency required to access objects in Amazon S3 by placing objects closer to a set of users or to meet requirements to store backup data at a certain distance from the original source data.
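A replication configuration of this kind can be sketched as follows; both buckets must have versioning enabled, and the role and bucket ARNs in the commented call are hypothetical:

```python
def replication_config(role_arn: str, destination_bucket_arn: str) -> dict:
    """ReplicationConfiguration that replicates all new objects to the
    destination bucket, using an IAM role that S3 can assume."""
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "replicate-all",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {"Prefix": ""},   # empty prefix = all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }

# Applying it (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_replication(
#     Bucket="source-bucket",
#     ReplicationConfiguration=replication_config(
#         "arn:aws:iam::123456789012:role/s3-replication-role",
#         "arn:aws:s3:::target-bucket"),
# )
```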
Amazon S3 enables customers to store, retrieve, and delete objects. They can retrieve an entire object or a portion of an object. If the customers have enabled versioning on their bucket, they can retrieve a specific version of the object.
- Uploading objects:– Clients can upload objects of up to 5 GB in size in a single operation.
- Copying objects:– The copy operation creates a copy of an object that is already stored in Amazon S3. Customers can create and copy an object up to 5 GB in size in a single atomic operation.
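Retrieving a portion of an object uses an HTTP Range header, and copying uses `copy_object`. A sketch (bucket and key names are hypothetical):

```python
def byte_range(first: int, last: int) -> str:
    """HTTP Range header value for fetching a portion of an object
    (both byte offsets are inclusive)."""
    return f"bytes={first}-{last}"

# Fetching only the first kilobyte of an object, and copying an object
# (requires boto3 and AWS credentials):
# import boto3
# s3 = boto3.client("s3")
# part = s3.get_object(Bucket="my-bucket", Key="big-file.bin",
#                      Range=byte_range(0, 1023))
# s3.copy_object(Bucket="my-bucket", Key="copy-of-file.bin",
#                CopySource={"Bucket": "my-bucket", "Key": "big-file.bin"})
```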
Deleting Objects from a Version-Enabled Bucket
Customers with version-enabled buckets can have multiple versions of the same object in the bucket. There are two options for deleting objects from version-enabled buckets:
- Specify a non-versioned delete request:– Customers can specify only the object’s key, and not the version ID. In this case, Amazon S3 creates a delete marker and returns its version ID in the response. This makes the object disappear from the bucket.
- Specify a versioned delete request:– Customers need to specify both the key and also a version ID. In this case the following two outcomes are possible:
- If the version ID maps to a specific object version, then Amazon S3 deletes the specific version of the object.
- If the version ID maps to the delete marker of that object, Amazon S3 deletes the delete marker. This makes the object reappear in the bucket.
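The two outcomes above can be illustrated with a small in-memory model of a version-enabled bucket; this is a sketch of the semantics only, not how S3 is implemented:

```python
import itertools

class VersionedBucket:
    """Toy model of S3 versioned-delete semantics."""
    _ids = itertools.count(1)

    def __init__(self):
        self.versions = {}  # key -> list of (version_id, payload or marker)

    def put(self, key, data):
        vid = f"v{next(self._ids)}"
        self.versions.setdefault(key, []).append((vid, data))
        return vid

    def delete(self, key, version_id=None):
        stack = self.versions[key]
        if version_id is None:
            # Non-versioned delete: add a delete marker, return its version ID.
            vid = f"v{next(self._ids)}"
            stack.append((vid, "DELETE_MARKER"))
            return vid
        # Versioned delete: remove that exact version (or delete marker).
        stack[:] = [(v, d) for v, d in stack if v != version_id]

    def get(self, key):
        """Return the current version's data, or None behind a delete marker."""
        vid, data = self.versions[key][-1]
        return None if data == "DELETE_MARKER" else data
```

Deleting without a version ID makes `get` return nothing (the object "disappears"); deleting the marker's version ID makes the object reappear.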
Deleting Objects from an MFA-Enabled Bucket
MFA Delete adds another layer of data protection on top of bucket versioning. MFA Delete requires additional authentication in order to permanently delete an object version or change the versioning state of a bucket.
- MFA Delete requires an authentication code (a temporary, one-time password) generated by a hardware or virtual Multi-Factor Authentication (MFA) device.
- MFA Delete can be enabled only by the root account.
S3 Batch Operations
S3 Batch Operations helps customers manage billions of objects stored in Amazon S3 with a single API request or a few clicks in the S3 Management Console.
- AWS customers can make changes to object properties and metadata, and perform other storage management tasks such as copying objects between buckets, replacing tag sets, modifying access controls, and restoring archived objects from Amazon S3 Glacier.
- S3 Batch Operations manages retries, tracks progress, sends notifications, generates completion reports, and delivers events to AWS CloudTrail for all changes made and tasks executed.
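A batch job is submitted through the S3 Control API with an operation, a manifest of target objects, and a completion report. A sketch of a tagging job (every ARN, account ID, and ETag below is hypothetical):

```python
def batch_tagging_job(account_id: str, role_arn: str,
                      manifest_arn: str, manifest_etag: str,
                      report_bucket_arn: str) -> dict:
    """Keyword arguments for s3control.create_job: apply one tag to every
    object listed in a CSV manifest."""
    return {
        "AccountId": account_id,
        "ConfirmationRequired": True,
        "Priority": 10,
        "RoleArn": role_arn,
        "Operation": {
            "S3PutObjectTagging": {
                "TagSet": [{"Key": "archived", "Value": "true"}],
            }
        },
        "Manifest": {
            "Spec": {
                "Format": "S3BatchOperations_CSV_20180820",
                "Fields": ["Bucket", "Key"],
            },
            "Location": {"ObjectArn": manifest_arn, "ETag": manifest_etag},
        },
        "Report": {
            "Bucket": report_bucket_arn,
            "Format": "Report_CSV_20180820",
            "Enabled": True,
            "ReportScope": "AllTasks",
        },
    }

# Submitting the job (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3control").create_job(**batch_tagging_job(
#     "123456789012",
#     "arn:aws:iam::123456789012:role/batch-ops-role",
#     "arn:aws:s3:::my-bucket/manifest.csv", "example-etag",
#     "arn:aws:s3:::my-bucket"))
```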
Data encryption can happen either on the client’s side (client-side encryption) or on the AWS side (server-side encryption, or SSE). When customers encrypt data on their side, the data transferred to S3 is already encrypted; S3 never sees the raw data. Server-side encryption is different: customers send the raw data to S3, where it is encrypted.
Clients can encrypt in flight and at rest. To encrypt their Amazon S3 data in transit, they can use the Amazon S3 Secure Sockets Layer (SSL) API endpoints. This ensures that all data sent to and from Amazon S3 is encrypted while in transit using the HTTPS protocol.
To encrypt their Amazon S3 data at rest, clients can use several variations of Server-Side Encryption (SSE). All SSE performed by Amazon S3 and AWS Key Management Service (AWS KMS) uses the 256-bit Advanced Encryption Standard (AES). Clients can also encrypt their Amazon S3 data at rest using client-side encryption, encrypting their data on the client before sending it to Amazon S3.
Server-side encryption:– In this case clients send unencrypted raw data to AWS, and the AWS infrastructure encrypts the raw data and then stores it on disk. When clients retrieve data, AWS reads the encrypted data from disk, decrypts it, and sends the raw data back to them. The encryption/decryption is transparent to the AWS user.
- SSE-AES (also known as SSE-S3):– AWS handles encryption and decryption for clients on the server side using the AES-256 algorithm. AWS also controls the secret key that is used for encryption/decryption.
- SSE-KMS (AWS managed CMK):— SSE-KMS is very similar to SSE-AES. The only difference is that the secret key (aka AWS managed Customer Master Key (CMK)) is provided by the KMS service and not by S3.
- SSE-KMS (customer managed CMK):— AWS clients can manage the secret key (aka Customer managed Customer Master Key) using the KMS service.
- Clients can create a Customer Master Key (CMK) and reference that key for encryption/decryption.
- At any time, they can delete the CMK to make all data useless.
- They have full control over the CMK by customizing the key policy.
- SSE-C:– With SSE-C, AWS clients are in charge of the secret key while AWS still takes care of the encryption/decryption. Every time clients call the S3 API, they also have to attach the secret key.
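The variants above differ only in the extra parameters passed with each request. A sketch of the `put_object` arguments each mode uses (bucket and key names in the commented call are hypothetical):

```python
def sse_args(mode: str, kms_key_id: str = None, customer_key: bytes = None) -> dict:
    """Extra keyword arguments for s3.put_object selecting an SSE variant."""
    if mode == "SSE-S3":                 # S3-managed keys, AES-256
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":                # KMS CMK (AWS managed or customer managed)
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:                   # omit to use the AWS managed CMK for S3
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":                  # client supplies the key on every call;
        return {                         # boto3 base64-encodes it and adds the
            "SSECustomerAlgorithm": "AES256",   # key MD5 digest automatically
            "SSECustomerKey": customer_key,
        }
    raise ValueError(f"unknown SSE mode: {mode}")

# Usage (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3").put_object(Bucket="my-bucket", Key="secret.txt",
#                               Body=b"...", **sse_args("SSE-S3"))
```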
Client-side encryption:– Client-side encryption means that AWS clients encrypt the data before they send it to AWS. It also means that they decrypt the data that they retrieve from AWS. Client-side encryption needs to be deeply embedded into their application. Clients have two options for using data encryption keys:
- Use an AWS KMS-managed customer master key.
- Use a client-side master key.
AWS SDK + KMS:- Clients can use the AWS SDK to upload/download files from S3. The KMS service can generate data keys that clients can use for encryption/decryption. The data key itself is encrypted using the KMS Customer Master Key.
- If the clients want to use the encrypted data key, they have to send the encrypted data to the KMS service and ask for decryption. The decrypted data key is only returned if the CMK is still available and the clients have permission to use it.
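This envelope pattern can be illustrated with a stdlib-only toy. The XOR keystream below stands in for a real cipher and the local "master key" stands in for KMS; real code would call `kms.generate_data_key` and `kms.decrypt` and use an authenticated cipher such as AES-GCM:

```python
import hashlib
import secrets

def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR against a SHA-256 counter keystream.
    A stand-in for AES-GCM, purely to show the envelope flow."""
    out = bytearray()
    for block in range(0, len(data), 32):
        ks = hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[block:block + 32], ks))
    return bytes(out)

MASTER_KEY = secrets.token_bytes(32)    # stands in for the KMS CMK

def encrypt_envelope(plaintext: bytes):
    data_key = secrets.token_bytes(32)                    # KMS: GenerateDataKey
    ciphertext = _keystream_xor(data_key, plaintext)
    encrypted_key = _keystream_xor(MASTER_KEY, data_key)  # CMK never leaves "KMS"
    # Only the *encrypted* data key is stored next to the ciphertext.
    return encrypted_key, ciphertext

def decrypt_envelope(encrypted_key: bytes, ciphertext: bytes) -> bytes:
    data_key = _keystream_xor(MASTER_KEY, encrypted_key)  # KMS: Decrypt
    return _keystream_xor(data_key, ciphertext)
```

If the master key (the CMK) becomes unavailable, the stored encrypted data key can no longer be decrypted, and the data is effectively useless.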
When using client-side encryption, clients retain end-to-end control of the encryption process, including management of the encryption keys. For maximum simplicity and ease of use, server-side encryption is the better choice.
Access Control Lists (ACLs):– Access control lists (ACLs) are one of the resource-based access policy options that customers can use to manage access to their buckets and objects. Customers can also use ACLs to grant basic read/write permissions to other AWS accounts. To give others controlled access, Amazon S3 provides:
- Coarse-grained access controls (Amazon S3 Access Control Lists [ACLs]):– Amazon S3 ACLs enable customers to grant certain coarse-grained permissions, such as READ, WRITE, or FULL_CONTROL, at the object or bucket level.
- Fine-grained access controls (Amazon S3 bucket policies, AWS Identity and Access Management [IAM] policies, and query-string authentication):–
- Fine-grained access control allows Amazon QuickSight account administrators to control authors’ default access to connected AWS resources. Fine-grained access control enables administrators to use IAM policies to scope down access permissions, limiting specific authors’ access to specific items within the AWS resources.
- Amazon QuickSight is a business analytics service, which customers can use to build visualizations, perform ad hoc analysis, and get business insights from their data.
- Amazon QuickSight can automatically discover AWS data sources and also works with customers data sources.
- Amazon QuickSight also enables organizations to scale to hundreds of thousands of users, and delivers responsive performance by using a robust in-memory engine called SPICE.
- Amazon S3 bucket policies are the recommended access control mechanism for Amazon S3 and provide much finer-grained control. Amazon S3 bucket policies are very similar to IAM policies, but they are attached to a bucket rather than to an IAM user, group, or role.
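A minimal bucket policy of this kind can be sketched as follows; the bucket name and principal ARN are hypothetical:

```python
import json

def read_only_policy(bucket: str, principal_arn: str) -> str:
    """A minimal bucket policy granting one principal read access to all
    objects in the bucket."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowRead",
            "Effect": "Allow",
            "Principal": {"AWS": principal_arn},
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }],
    })

# Attaching the policy (requires boto3 and AWS credentials):
# import boto3
# boto3.client("s3").put_bucket_policy(
#     Bucket="my-bucket",
#     Policy=read_only_policy("my-bucket",
#                             "arn:aws:iam::123456789012:user/analyst"))
```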