Tutorial / Cram Notes
Amazon S3 is one of the primary services where storage tiering comes into play. It offers a range of storage classes, each tailored for specific data access patterns and cost objectives.
S3 Standard
S3 Standard is the default storage class, designed for frequently accessed data. It provides high durability, availability, and performance with no retrieval costs.
S3 Intelligent-Tiering
Intelligent-Tiering automatically moves objects between a frequent access tier and a lower-cost infrequent access tier based on observed usage patterns, optimizing costs with no performance impact and no retrieval fees; optional archive tiers can also be enabled for rarely accessed objects.
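Beyond the automatic tiers, Intelligent-Tiering's optional archive tiers must be explicitly enabled per bucket. The sketch below builds such a configuration; the bucket name and configuration ID are illustrative placeholders, and the boto3 call is shown commented out since it requires AWS credentials.

```python
# Sketch: opting objects into Intelligent-Tiering's optional archive tiers.
# Objects not accessed for 90 days move to the Archive Access tier, and
# after 180 days to the Deep Archive Access tier.
config = {
    "Id": "archive-after-90-days",       # illustrative configuration ID
    "Status": "Enabled",
    "Tierings": [
        {"Days": 90, "AccessTier": "ARCHIVE_ACCESS"},
        {"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"},
    ],
}

# With boto3 installed and credentials configured, this would be applied as:
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_intelligent_tiering_configuration(
#     Bucket="example-bucket",          # placeholder bucket name
#     Id=config["Id"],
#     IntelligentTieringConfiguration=config,
# )
```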
S3 Standard-IA and S3 One Zone-IA
For data that is less frequently accessed but requires rapid access when needed, S3 Standard-IA is suitable. It has lower storage costs but charges for retrieval. S3 One Zone-IA stores data in a single Availability Zone and is ideal for non-critical, infrequently accessed data, offering even lower storage costs.
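The Standard vs. Standard-IA trade-off (lower storage price, added retrieval fee) can be checked with a quick break-even calculation. The prices below are illustrative assumptions only; always check the current S3 pricing page for your region.

```python
# Back-of-the-envelope monthly cost comparison between S3 Standard and
# S3 Standard-IA. All prices are illustrative placeholders (assumed).
STANDARD_PER_GB = 0.023      # storage, USD per GB-month (assumed)
IA_PER_GB = 0.0125           # storage, USD per GB-month (assumed)
IA_RETRIEVAL_PER_GB = 0.01   # retrieval fee, USD per GB (assumed)

def monthly_cost_standard(stored_gb: float) -> float:
    return stored_gb * STANDARD_PER_GB

def monthly_cost_ia(stored_gb: float, retrieved_gb: float) -> float:
    return stored_gb * IA_PER_GB + retrieved_gb * IA_RETRIEVAL_PER_GB

# 1 TB stored with 50 GB read back per month:
# Standard:    1000 * 0.023              -> 23.00
# Standard-IA: 1000 * 0.0125 + 50 * 0.01 -> 13.00
```

At these assumed rates, Standard-IA stays cheaper until roughly a full terabyte is retrieved each month, which is why access frequency is the deciding factor.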
S3 Glacier and S3 Glacier Deep Archive
These two classes are designed for archival purposes. S3 Glacier is for data that can tolerate retrieval times of several minutes to several hours, whereas S3 Glacier Deep Archive is the lowest-cost storage class suitable for long-term archiving with retrieval times of 12 hours or more.
Comparing S3 Storage Classes
Storage Class | Use Case | Availability Zones | Durability | Retrieval Time | Cost (Storage + Access) |
---|---|---|---|---|---|
S3 Standard | Frequently accessed data | ≥ 3 | 99.999999999% | Milliseconds | High |
S3 Intelligent-Tiering | Data with unknown access patterns | ≥ 3 | 99.999999999% | Milliseconds | Varies based on access |
S3 Standard-IA | Infrequently accessed data | ≥ 3 | 99.999999999% | Milliseconds | Lower + retrieval fee |
S3 One Zone-IA | Infrequently accessed, non-critical | 1 | 99.999999999% | Milliseconds | Lower + retrieval fee |
S3 Glacier | Archive accessible in minutes/hours | ≥ 3 | 99.999999999% | Minutes to hours | Lower |
S3 Glacier Deep Archive | Long-term archive | ≥ 3 | 99.999999999% | 12 hours | Lowest |
Amazon EBS Volume Types
Amazon Elastic Block Store (EBS) provides block-level storage volumes for EC2 instances with different volume types for various workloads.
EBS General Purpose (gp2 and gp3)
General Purpose volumes offer a balance of cost and performance for a wide array of workloads. The gp3 volumes provide the ability to scale IOPS (input/output operations per second) and throughput independently of storage capacity.
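The contrast between gp2 and gp3 is easiest to see in gp2's size-coupled baseline: 3 IOPS per GiB, with a floor of 100 IOPS and a ceiling of 16,000 IOPS. A minimal sketch of that formula:

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """gp2 baseline IOPS: 3 IOPS per GiB, floor of 100, ceiling of 16,000.
    (gp3, by contrast, starts at 3,000 IOPS regardless of volume size and
    lets you provision IOPS and throughput independently.)"""
    return min(max(3 * size_gib, 100), 16_000)

# gp2_baseline_iops(20)   -> 100    (floor applies up to 33 GiB)
# gp2_baseline_iops(1000) -> 3000
# gp2_baseline_iops(6000) -> 16000  (ceiling from ~5,334 GiB upward)
```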
Provisioned IOPS (io1 and io2)
Provisioned IOPS volumes are designed for I/O-intensive workloads like databases. They deliver high performance with consistent latency and are suitable for workloads that require more than 16,000 IOPS.
Throughput Optimized HDD (st1) and Cold HDD (sc1)
These HDD-based volume types are best for large, sequential workloads. While st1 targets frequently accessed throughput-intensive data, sc1 provides a lower-cost option for data accessed less often.
Comparing EBS Volume Types
Volume Type | Use Case | Baseline Performance | Maximum IOPS & Throughput | Durability | Cost |
---|---|---|---|---|---|
General Purpose | Balanced workloads | 3 IOPS per GiB (gp2); 3,000 IOPS (gp3) | Up to 16,000 IOPS | 99.8–99.9% | Moderate |
Provisioned IOPS | I/O-intensive workloads (databases) | Provisioned rate | Up to 64,000 IOPS | ≥99.9% | Higher |
Throughput Optimized HDD | Big data, data warehouses | 40 MB/s per TB | Up to 500 MB/s | ≥99.8% | Lower |
Cold HDD | Infrequently accessed workloads | 12 MB/s per TB | Up to 250 MB/s | ≥99.8% | Lowest among EBS |
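Both HDD types follow the same pattern: baseline throughput scales with volume size at a per-TB rate, capped at the type's maximum. A small sketch using the figures from the table above:

```python
def hdd_baseline_mbps(size_tb: float, per_tb: float, cap: float) -> float:
    """Baseline throughput for HDD-backed EBS volumes: a per-TB rate,
    capped at the volume type's maximum (figures from the table above:
    st1 = 40 MB/s per TB up to 500 MB/s; sc1 = 12 MB/s per TB up to
    250 MB/s)."""
    return min(size_tb * per_tb, cap)

# A 4 TB st1 volume:  hdd_baseline_mbps(4, 40, 500)  -> 160.0
# A 16 TB st1 volume: hdd_baseline_mbps(16, 40, 500) -> 500.0 (cap reached)
```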
Amazon EFS Storage Classes
Amazon Elastic File System (EFS) offers file storage with two storage classes: the Standard storage class and the Infrequent Access (IA) storage class. Lifecycle policies can move files that have not been accessed for a defined period from Standard to IA, reducing costs.
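An EFS lifecycle policy of this kind can be expressed as a short list of rules. The sketch below assumes a hypothetical file system ID; the boto3 call is commented out since it requires AWS credentials.

```python
# Sketch of an EFS lifecycle policy: files not accessed for 30 days
# move from the Standard storage class to Infrequent Access (IA).
lifecycle_policies = [
    {"TransitionToIA": "AFTER_30_DAYS"},
]

# With boto3 and credentials configured:
# import boto3
# efs = boto3.client("efs")
# efs.put_lifecycle_configuration(
#     FileSystemId="fs-0123456789abcdef0",   # placeholder file system ID
#     LifecyclePolicies=lifecycle_policies,
# )
```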
Implementing Storage Tiering
When designing solutions for various scenarios, an AWS Solutions Architect should consider automating storage tiering to optimize costs effectively. For example, setting up lifecycle policies in S3 can automatically transition objects to appropriate storage classes as their access patterns change.
{
"Rules": [
{
"ID": "MoveToIAAfter30Days",
"Filter": {
"Prefix": ""
},
"Status": "Enabled",
"Transitions": [
{
"Days": 30,
"StorageClass": "STANDARD_IA"
}
],
"NoncurrentVersionTransitions": [
{
"NoncurrentDays": 30,
"StorageClass": "STANDARD_IA"
}
],
"Expiration": {
"Days": 365,
"ExpiredObjectDeleteMarker": true
}
}
]
}
The above JSON is a lifecycle policy definition that transitions current and noncurrent object versions in an S3 bucket to Standard-IA after 30 days and expires current versions after 365 days. (Note that an Expiration action cannot combine Days with ExpiredObjectDeleteMarker in the same rule.)
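The same rule can be applied programmatically. The sketch below builds an equivalent rule as a Python dict; the bucket name is an illustrative placeholder, and the boto3 call is commented out since it requires AWS credentials.

```python
# Sketch: applying a lifecycle rule equivalent to the JSON above.
rule = {
    "ID": "MoveToIAAfter30Days",
    "Filter": {"Prefix": ""},            # empty prefix = whole bucket
    "Status": "Enabled",
    "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
    "NoncurrentVersionTransitions": [
        {"NoncurrentDays": 30, "StorageClass": "STANDARD_IA"}
    ],
    "Expiration": {"Days": 365},
}

# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="example-bucket",            # placeholder bucket name
#     LifecycleConfiguration={"Rules": [rule]},
# )
```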
In conclusion, storage tiering within AWS is an essential strategy for balancing cost, access, and performance requirements. AWS Certified Solutions Architect – Professional candidates should be proficient in understanding and leveraging these storage classes and tiering options to design efficient and optimized architectures.
Practice Test with Explanation
True or False: In AWS, storage tiering refers to the manual process of moving data between storage classes to optimize for cost.
- A) True
- B) False
Answer: B) False
Explanation: Storage tiering on AWS can be both manual and automated. Services like Amazon S3 Intelligent-Tiering automate the process of moving data between different storage classes to optimize for cost and performance.
Which AWS service supports automatic tiering?
- A) Amazon EBS
- B) Amazon S3 Intelligent-Tiering
- C) Amazon RDS
- D) Amazon EC2
Answer: B) Amazon S3 Intelligent-Tiering
Explanation: Amazon S3 Intelligent-Tiering is designed to automatically move data between access tiers when access patterns change.
True or False: Amazon S3 Glacier is suitable for frequently accessed data.
- A) True
- B) False
Answer: B) False
Explanation: Amazon S3 Glacier is designed for long-term data archiving where data is infrequently accessed. For frequently accessed data, other storage tiers like S3 Standard are more appropriate.
Which of the following is NOT a factor in determining the right storage tier for your data?
- A) Access frequency
- B) Retrieval time
- C) Compliance requirements
- D) The color of the AWS management console
Answer: D) The color of the AWS management console
Explanation: The color of the AWS management console is not a factor in determining the right storage tier. Cost, access frequency, retrieval time, and compliance requirements are the typical factors considered in tiering decisions.
True or False: AWS recommends using Amazon EFS for data that requires high throughput and low latency.
- A) True
- B) False
Answer: A) True
Explanation: Amazon EFS is designed to provide high throughput and low latency, making it suitable for workloads that require fast and easy access to data.
Which Amazon S3 storage class is primarily used for disaster recovery purposes?
- A) S3 Standard
- B) S3 Intelligent-Tiering
- C) S3 One Zone-IA
- D) S3 Glacier
Answer: C) S3 One Zone-IA
Explanation: S3 One Zone-IA (Infrequent Access) stores data in a single Availability Zone at a lower cost, which makes it a common choice for secondary backup copies of data that already exists durably elsewhere (for example, on-premises or in another region). For critical primary data, multi-AZ storage classes are preferred.
True or False: Amazon EBS provides the ability to automatically move volumes between different types (e.g., General Purpose SSD, Provisioned IOPS SSD, etc.) based on performance requirements.
- A) True
- B) False
Answer: B) False
Explanation: Amazon EBS does not automatically move volumes between different types but allows users to manually change the volume type to optimize performance and cost.
To implement storage tiering, what must you consider about your data?
- A) Size of each file
- B) Sensitivity of data
- C) Access patterns
- D) All of the above
Answer: D) All of the above
Explanation: When implementing storage tiering, all aspects such as the size of each file, sensitivity of the data, and access patterns should be considered to choose the appropriate storage tier.
What is the key benefit of using automated storage tiering in AWS?
- A) Improved data durability
- B) Enhanced security compliance
- C) Cost savings
- D) Static data placement
Answer: C) Cost savings
Explanation: Automated storage tiering, such as S3 Intelligent-Tiering, can lead to cost savings by automatically moving data to the most cost-effective access tier based on usage patterns.
True or False: Amazon S3 lifecycle policies can be used to automate the transition of objects between storage classes.
- A) True
- B) False
Answer: A) True
Explanation: Amazon S3 lifecycle policies can be used to create rules that automate the transition of objects between different S3 storage classes, such as from S3 Standard to S3 Glacier, to save costs.
Interview Questions
What is storage tiering and why is it important for cost optimization on AWS?
Storage tiering is the process of assigning different types of storage to data based on its usage patterns and accessibility requirements. It is important for cost optimization on AWS because it allows users to reduce storage costs by moving infrequently accessed data to cheaper storage classes, such as Amazon S3 Glacier or S3 Glacier Deep Archive, while keeping frequently accessed data on faster, more expensive storage like Amazon S3 Standard.
Can you describe the various storage tiers available in Amazon S3 and when you would use each?
Amazon S3 offers several storage classes: S3 Standard for frequently accessed data, S3 Intelligent-Tiering for unknown or changing access patterns, S3 Standard-IA (Infrequent Access) and S3 One Zone-IA for infrequently accessed data that still requires rapid access when needed, S3 Glacier for archival storage with retrieval times ranging from minutes to hours, and S3 Glacier Deep Archive for long-term archival at the lowest cost but with retrieval times measured in hours. The choice depends on the data's access patterns and cost considerations.
How does AWS S3 Intelligent-Tiering work, and what are the costs associated with it?
AWS S3 Intelligent-Tiering automatically moves data to the most cost-effective access tier based on changing access patterns without performance impact or operational overhead. There are two cost components: a small monthly monitoring and automation fee per object and the cost of storage within the frequent and infrequent access tiers. There are no retrieval fees when accessing data within this storage class.
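Those two cost components can be sketched in a rough monthly model. All prices below are illustrative assumptions, not current AWS rates; consult the S3 pricing page for your region.

```python
# Rough monthly cost model for S3 Intelligent-Tiering.
# All prices are illustrative assumptions.
MONITORING_PER_1000_OBJECTS = 0.0025  # USD per 1,000 objects/month (assumed)
FREQUENT_PER_GB = 0.023               # frequent access tier storage (assumed)
INFREQUENT_PER_GB = 0.0125            # infrequent access tier storage (assumed)

def intelligent_tiering_cost(objects: int, frequent_gb: float,
                             infrequent_gb: float) -> float:
    monitoring = (objects / 1000) * MONITORING_PER_1000_OBJECTS
    return (monitoring
            + frequent_gb * FREQUENT_PER_GB
            + infrequent_gb * INFREQUENT_PER_GB)

# One million small objects cost 2.50/month in monitoring alone at these
# assumed rates, which is why Intelligent-Tiering suits larger objects best.
```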
In which scenario would it be more cost-effective to use Amazon S3 Glacier instead of Amazon S3 Standard-IA?
Amazon S3 Glacier would be more cost-effective for data that is rarely accessed and intended for long-term archiving, such as compliance or regulatory data. It’s much cheaper in terms of storage costs than S3 Standard-IA but has higher retrieval times and fees, making it unsuitable for data that might need to be accessed quickly or frequently.
How does Amazon S3’s lifecycle management feature facilitate storage tiering?
Amazon S3’s lifecycle management allows users to automatically transition objects to different storage classes at defined periods of the object’s lifetime. This automation helps in implementing a cost-effective storage strategy by seamlessly tiering data without manual intervention based on the organization’s data usage policies.
What is the difference between Amazon S3 One Zone-IA and S3 Standard-IA?
Amazon S3 One Zone-IA stores data in a single Availability Zone and is suitable for non-critical or replaceable data at a lower cost, whereas S3 Standard-IA stores data redundantly across multiple, physically separated Availability Zones for better availability and resilience.
How do you monitor access patterns to implement a storage tiering strategy effectively?
Access patterns can be monitored using AWS tools such as Amazon S3 access logs, AWS CloudTrail, and Amazon CloudWatch metrics. These tools provide insights into how frequently data is accessed, which is essential for making informed decisions regarding the most appropriate storage tier for different datasets.
Can you explain the process of retrieving data from Amazon S3 Glacier and S3 Glacier Deep Archive?
Retrieving data from S3 Glacier or S3 Glacier Deep Archive involves initiating a restore request and choosing an expedited, standard, or bulk retrieval option, which dictates the retrieval time and cost (Deep Archive supports only standard and bulk). The restored data is made temporarily available in the S3 bucket within the time window for the chosen option. Expedited retrievals are the fastest but most expensive, while bulk retrievals are the slowest but most cost-effective.
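A restore request of this kind can be sketched as below. The bucket name and object key are hypothetical placeholders, and the boto3 call is commented out since it requires AWS credentials.

```python
# Sketch: initiating a restore of an archived object. The Tier may be
# "Expedited", "Standard", or "Bulk" (Deep Archive supports only
# "Standard" and "Bulk").
restore_request = {
    "Days": 7,  # keep the restored copy available for 7 days
    "GlacierJobParameters": {"Tier": "Bulk"},
}

# import boto3
# s3 = boto3.client("s3")
# s3.restore_object(
#     Bucket="example-archive-bucket",     # placeholder bucket name
#     Key="backups/2023/archive.tar.gz",   # placeholder object key
#     RestoreRequest=restore_request,
# )
```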
What role does Amazon EFS lifecycle management play in storage tiering for file systems?
Amazon EFS lifecycle management automatically manages files by moving them from the Standard storage class to the Infrequent Access (IA) storage class based on the age of the file and the lifecycle policy set by the user. This feature helps in optimizing costs for file storage by ensuring that less frequently accessed files incur lower storage costs.
Discuss how AWS Backup can integrate with storage tiering to optimize the overall cost of backups.
AWS Backup allows users to define backup lifecycle policies that can automatically transition backups to more cost-effective storage tiers like S3 Glacier or S3 Glacier Deep Archive as they age, thus optimizing the storage cost without compromising data availability for restoration if needed.
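Such a backup lifecycle is expressed as a small lifecycle section inside a backup plan rule. The sketch below shows only that fragment; note that AWS Backup requires recovery points to stay in cold storage for at least 90 days before deletion.

```python
# Sketch of the lifecycle section of an AWS Backup plan rule: recovery
# points move to cold storage after 30 days and are deleted after 365.
backup_lifecycle = {
    "MoveToColdStorageAfterDays": 30,
    "DeleteAfterDays": 365,
}

# AWS Backup requires at least 90 days in cold storage before deletion:
assert (backup_lifecycle["DeleteAfterDays"]
        >= backup_lifecycle["MoveToColdStorageAfterDays"] + 90)
```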
How does Amazon S3 storage tiering align with data compliance and regulatory requirements?
Amazon S3 storage tiering enables organizations to meet compliance and regulatory requirements by offering durable and secure storage options that keep data accessible based on policy requirements. Features like S3 Glacier Vault Lock help in enforcing compliance controls by preventing deletions and enforcing WORM (Write Once, Read Many) policies.
The concept of storage tiering on AWS definitely helped me crack the SAP-C02 exam!
I found the explanation about Intelligent-Tiering in S3 particularly useful. It’s a game-changer for cost optimization.
Can anyone explain how storage tiering applies to Elastic File System (EFS)?
I think more examples on real-world scenarios could have been helpful.
The storage tiering features in EFS and S3 were really well-covered in the blog. Helped me a lot!
Does storage tiering also apply to Amazon FSx for Lustre?
The blog post was quite informative. Thanks!
Is there any performance penalty when using S3 Intelligent-Tiering?