Concepts
Hybrid storage options provide the ability to integrate on-premises storage systems with cloud storage services, offering the flexibility, scalability, and economic benefits of the cloud, without sacrificing the locality and immediacy of local storage. When studying for the AWS Certified Solutions Architect – Associate exam, it is important to understand AWS’s offerings in this space, which include AWS DataSync, AWS Transfer Family, and AWS Storage Gateway.
AWS DataSync
AWS DataSync is a data transfer service that simplifies and automates moving data between on-premises storage systems and AWS storage services such as Amazon S3, Amazon EFS, and Amazon FSx for Windows File Server. DataSync can transfer data at speeds up to 10 times faster than open-source tools by using a purpose-built network protocol and parallel transfer optimization.
Example Use-Case:
A company wants to migrate a large on-premises file share to Amazon S3 to take advantage of cloud storage economics and durability. They use DataSync to automate the transfer securely and efficiently, minimizing downtime.
To set up DataSync:
- Deploy a DataSync agent in the on-premises environment so it can read from the source storage. Transfers between AWS storage services generally do not require an agent.
- Create a task in the AWS Management Console that specifies source and destination locations, including options like scheduling and bandwidth limits.
- Execute the task to begin the transfer.
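The steps above can also be scripted. The following is a minimal sketch, assuming the source and destination locations already exist and that `datasync_client` is something like `boto3.client("datasync")`. The helper names `build_task_request` and `create_and_start_task` are hypothetical and only illustrate how the task options (schedule, bandwidth limit) map onto the request.

```python
from typing import Any, Dict, Optional


def build_task_request(
    source_location_arn: str,
    destination_location_arn: str,
    name: str,
    bytes_per_second: Optional[int] = None,
    schedule_expression: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a DataSync CreateTask request.

    Helper name and defaults are illustrative assumptions, not part of
    the original article.
    """
    request: Dict[str, Any] = {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": name,
    }
    if bytes_per_second is not None:
        # Bandwidth limit for the transfer, in bytes per second.
        request["Options"] = {"BytesPerSecond": bytes_per_second}
    if schedule_expression is not None:
        # For example "rate(1 day)" to run the task once a day.
        request["Schedule"] = {"ScheduleExpression": schedule_expression}
    return request


def create_and_start_task(datasync_client: Any, request: Dict[str, Any]) -> str:
    """Create the task, start one execution, and return the execution ARN.

    `datasync_client` is assumed to behave like boto3.client("datasync").
    """
    task_arn = datasync_client.create_task(**request)["TaskArn"]
    execution = datasync_client.start_task_execution(TaskArn=task_arn)
    return execution["TaskExecutionArn"]
```

Passing the client in as an argument keeps the sketch free of credentials and makes it easy to try against a stub before pointing it at a real account.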
AWS Transfer Family
The AWS Transfer Family provides fully managed support for file transfers directly into and out of Amazon S3 or Amazon EFS using SFTP, FTPS, and FTP. By leveraging the Transfer Family, businesses can integrate cloud storage into workflows that depend on these protocols, all while maintaining compliance and security standards.
Example Use-Case:
Financial institutions often exchange files with their partners using SFTP for compliance reasons. Using AWS Transfer Family, they can set up an SFTP-enabled server in AWS, reducing management overhead and improving scalability.
To set up an AWS Transfer Family server:
- Initiate the setup of an SFTP server in the AWS Management Console.
- Configure user authentication using service-managed or custom identity providers.
- Point your SFTP client to the service endpoint to upload and download files.
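As a rough illustration of the same steps in code, the sketch below builds the request parameters for an SFTP server with service-managed users and derives the hostname an SFTP client would connect to. It assumes a public endpoint and an Amazon S3 backend. The helper names, the IAM role ARN, and the home directory are placeholders, and the resulting dictionaries are meant to be passed to a client such as `boto3.client("transfer")` via `create_server(**...)` and `create_user(**...)`.

```python
from typing import Any, Dict


def build_sftp_server_request() -> Dict[str, Any]:
    """Parameters for an SFTP-only server with service-managed users (assumed setup)."""
    return {
        "Protocols": ["SFTP"],
        "IdentityProviderType": "SERVICE_MANAGED",
        "Domain": "S3",            # store files in Amazon S3 rather than EFS
        "EndpointType": "PUBLIC",  # assumption: publicly reachable endpoint
    }


def build_user_request(
    server_id: str,
    user_name: str,
    role_arn: str,
    home_directory: str,
    ssh_public_key: str,
) -> Dict[str, Any]:
    """Parameters for a service-managed user who authenticates with an SSH key."""
    return {
        "ServerId": server_id,
        "UserName": user_name,
        "Role": role_arn,                 # IAM role granting access to the bucket
        "HomeDirectory": home_directory,  # for example "/my-bucket/inbox"
        "SshPublicKeyBody": ssh_public_key,
    }


def server_endpoint(server_id: str, region: str) -> str:
    """Hostname an SFTP client connects to for a public endpoint."""
    return f"{server_id}.server.transfer.{region}.amazonaws.com"
```

A partner would then connect with an ordinary SFTP client, for example `sftp partner@<endpoint>`, using the private key that matches the registered public key.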
AWS Storage Gateway
AWS Storage Gateway is a hybrid cloud storage service that gives on-premises applications access to virtually unlimited cloud storage. It supports various storage interfaces and types, including file gateways for file-based storage, volume gateways for block-based storage, and tape gateways for virtual tape infrastructure.
Example Use-Cases:
- File Gateway stores and retrieves files as objects in Amazon S3 through a traditional file protocol, such as NFS or SMB.
- Volume Gateway provides block storage to on-premises applications via iSCSI, backed by Amazon S3, with local caching for low-latency access.
- Tape Gateway offers a virtual tape infrastructure that enables companies to replace physical tape libraries with a cloud-based solution.
To implement a Storage Gateway:
- Choose the appropriate gateway type (file, volume, or tape) and deploy the Storage Gateway appliance on-premises.
- Connect the appliance to AWS, and configure it to interface with the desired AWS storage service.
- Access the storage gateway from on-premises applications as if it were a local storage resource.
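To make the idea of local caching in front of cloud storage more concrete, here is a toy model of a cached volume. It is a conceptual sketch only and not how Storage Gateway is implemented: the least-recently-used eviction policy, the write-through behavior, and the class name `CachedVolumeModel` are all assumptions chosen for illustration, and a plain dictionary stands in for Amazon S3.

```python
from collections import OrderedDict
from typing import Dict


class CachedVolumeModel:
    """Toy model: full data set in the 'cloud', hot blocks in a small local cache."""

    def __init__(self, cache_blocks: int) -> None:
        self.cache_blocks = cache_blocks
        self.cloud: Dict[int, bytes] = {}  # stands in for Amazon S3
        self.cache: "OrderedDict[int, bytes]" = OrderedDict()
        self.cloud_reads = 0  # counts reads that had to go to the cloud

    def _remember(self, block_id: int, data: bytes) -> None:
        # Keep the block locally and evict the least recently used ones.
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        while len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)

    def write(self, block_id: int, data: bytes) -> None:
        # Assumption: write-through, so the cloud copy is always current.
        self.cloud[block_id] = data
        self._remember(block_id, data)

    def read(self, block_id: int) -> bytes:
        if block_id in self.cache:
            # Cache hit: served locally, no cloud round trip.
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        # Cache miss: fetch from the cloud and keep a local copy.
        self.cloud_reads += 1
        data = self.cloud[block_id]
        self._remember(block_id, data)
        return data
```

Reading the same block twice only touches the cloud once, which is the low-latency benefit that local caching is meant to provide for frequently accessed data.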
Comparison
| Feature/Service | AWS DataSync | AWS Transfer Family | AWS Storage Gateway | 
|---|---|---|---|
| Transfer Protocols | Proprietary | SFTP, FTPS, FTP | iSCSI (Volume), NFS/SMB (File), VTL (Tape) | 
| Use-Cases | Data migration, large dataset synchronization | Secure file transfers using traditional file transfer protocols | On-premises access to cloud storage, low-latency data access | 
| Integration | Amazon S3, Amazon EFS, Amazon FSx | Amazon S3, Amazon EFS | Amazon S3, Amazon S3 Glacier, Amazon EBS, AWS Backup | 
| Pricing | Per gigabyte transferred | Per provisioned endpoint and per gigabyte transferred | Per gateway deployed, request pricing, data transfer fees | 
Choosing between these hybrid storage solutions depends on the specific use case. AWS DataSync is ideal for migrations and regular, large-scale data synchronization. AWS Transfer Family excels in scenarios where secure, managed file transfers over SFTP, FTPS, or FTP are required. AWS Storage Gateway is most suitable when on-premises applications need seamless integration with cloud storage for low-latency access or a replacement for physical backup systems.
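The selection guidance above can be condensed into a small lookup, which may be handy as a memory aid while studying. This is only a sketch of the rule of thumb described here. The `Need` categories and the `recommend` function are hypothetical names, and real workloads often combine more than one service.

```python
from enum import Enum


class Need(Enum):
    """Simplified requirement categories (illustrative, not an AWS taxonomy)."""

    MIGRATION_OR_SYNC = "migration or recurring large-scale sync"
    PROTOCOL_FILE_EXCHANGE = "file exchange over SFTP, FTPS, or FTP"
    ON_PREM_ACCESS = "on-premises access to cloud-backed storage"
    TAPE_REPLACEMENT = "replacement for physical tape backups"


_RECOMMENDATION = {
    Need.MIGRATION_OR_SYNC: "AWS DataSync",
    Need.PROTOCOL_FILE_EXCHANGE: "AWS Transfer Family",
    Need.ON_PREM_ACCESS: "AWS Storage Gateway",
    Need.TAPE_REPLACEMENT: "AWS Storage Gateway",
}


def recommend(need: Need) -> str:
    """Return the service this article suggests for the given need."""
    return _RECOMMENDATION[need]
```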
While preparing for the AWS Certified Solutions Architect – Associate exam, it is essential to understand the differences, typical applications, and benefits of each service, and you should be familiar with setting up and configuring each of these storage options.
Practice Questions
True/False: AWS DataSync can only transfer data between two AWS storage services such as Amazon S3 and EFS.
- Answer: False
Explanation: AWS DataSync can transfer data between on-premises storage and AWS services, as well as between different AWS storage services. It is not limited to AWS-to-AWS transfers.
True/False: AWS Storage Gateway can be used to connect an on-premises environment to cloud-based storage for backup purposes.
- Answer: True
Explanation: AWS Storage Gateway connects an on-premises environment to AWS cloud storage for backup and archiving, among other uses.
Which AWS service can be used to accelerate moving large amounts of data offline into and out of AWS using storage devices for transport?
- A) AWS Transfer Family
- B) AWS Snowball
- C) AWS Direct Connect
- D) AWS DataSync
- Answer: B) AWS Snowball
Explanation: AWS Snowball is part of the AWS Snow Family and is used to transport large amounts of data into and out of AWS on physical storage devices, thereby bypassing the internet.
Multiple Select: What are the components of the AWS Transfer Family?
- A) AWS Transfer for SFTP
- B) AWS Transfer for FTPS
- C) AWS Transfer for ASP
- D) AWS Transfer for FTP
- Answer: A) AWS Transfer for SFTP, B) AWS Transfer for FTPS, D) AWS Transfer for FTP
Explanation: The AWS Transfer Family supports SSH File Transfer Protocol (SFTP), File Transfer Protocol over SSL/TLS (FTPS), and File Transfer Protocol (FTP). There is no service called AWS Transfer for ASP.
True/False: AWS Storage Gateway supports a file gateway mode that allows seamless integration with Amazon S3.
- Answer: True
Explanation: AWS Storage Gateway's file gateway mode provides a seamless way to connect to the cloud in order to store application data files and backup images as durable objects in Amazon S3.
Multiple Select: AWS DataSync can be used to:
- A) Sync data at high speed over the internet
- B) Sync data over a private connection only
- C) Transfer data between AWS services only
- D) Automate and accelerate data transfer
- Answer: A) Sync data at high speed over the internet, D) Automate and accelerate data transfer
Explanation: AWS DataSync can be used to automate and accelerate data transfer over the internet or AWS Direct Connect links. It is not limited to AWS services only or private connections only.
True/False: The AWS Transfer Family does not support Active Directory for user authentication.
- Answer: False
Explanation: The AWS Transfer Family can integrate with existing authentication systems such as Active Directory to authenticate users.
Single Select: Which AWS Storage Gateway configuration stores primary data in Amazon S3 and retains frequently accessed data locally?
- A) Stored Volumes
- B) Cached Volumes
- C) Gateway-Virtual Tape Library (VTL)
- D) Tape Gateway
- Answer: B) Cached Volumes
Explanation: Cached Volumes retain frequently accessed data locally while storing the primary data in Amazon S3, offering low-latency access to frequently accessed data.
True/False: AWS Snowball is a suitable solution for continuous, online data transfer between on-premises storage and Amazon S3.
- Answer: False
Explanation: AWS Snowball is designed for occasional, large-scale data transfer. For continuous, online data transfer, AWS DataSync or AWS Storage Gateway are more suitable solutions.
Which of the following is a primary use case for AWS Storage Gateway’s File Gateway?
- A) Real-time data analytics
- B) Backup and recovery
- C) Virtual tape library
- D) High-performance computing
- Answer: B) Backup and recovery
Explanation: File Gateway is commonly used for backup and recovery purposes, allowing you to store and retrieve Amazon S3 objects through standard file storage protocols.
True/False: AWS Snow Family devices can run AWS Lambda functions locally for data processing before the data is shipped to AWS.
- Answer: True
Explanation: AWS Snow Family devices like Snowball Edge offer the capability to run AWS Lambda functions locally to process data on-premises before it is transferred to AWS.
Single Select: What is the purpose of AWS Transfer Family’s AWS Transfer for FTPS?
- A) To run Docker containers
- B) To accelerate content delivery using a worldwide network of edge locations
- C) To securely transfer files into and out of Amazon S3 using the FTPS protocol
- D) To connect data centers to the AWS Cloud with dedicated physical or virtual network connections
- Answer: C) To securely transfer files into and out of Amazon S3 using the FTPS protocol
Explanation: AWS Transfer for FTPS is part of the AWS Transfer Family, and it enables the transfer of files into and out of Amazon S3 over File Transfer Protocol over SSL/TLS (FTPS).