Throughput and latency characteristics for AWS services that ingest data
Data ingestion patterns (for example, frequency and data history)
Batch data ingestion (for example, scheduled ingestion, event-driven ingestion)
Replayability of data ingestion pipelines
Stateful and stateless data transactions
Creation of ETL pipelines based on business requirements
Volume, velocity, and variety of data (for example, structured data, unstructured data)
Cloud computing and distributed computing
How to use Apache Spark to process data (see the PySpark sketch after this list)
Intermediate data staging locations
How to integrate various AWS services to create ETL pipelines (see the boto3 sketch after this list)
Event-driven architecture (see the Lambda handler sketch after this list)
How to configure AWS services for data pipelines based on schedules or dependencies (see the EventBridge sketch after this list)
Continuous integration and continuous delivery (CI/CD) (implementation, testing, and deployment of data pipelines; see the unit-test sketch after this list)
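
For the Apache Spark item, a minimal PySpark sketch: read raw CSV from S3, apply a simple transformation, and write partitioned Parquet to an intermediate staging location. The bucket paths and column names are hypothetical placeholders, not values from any particular pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Read raw, structured input (schema inference for brevity; a real
# job would declare an explicit schema).
orders = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-raw-bucket/orders/")  # hypothetical path
)

# Clean and enrich: drop rows missing the key, derive a date partition column.
cleaned = (
    orders
    .dropna(subset=["order_id"])
    .withColumn("order_date", F.to_date("order_timestamp"))
)

# Write to an intermediate staging location as partitioned Parquet.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-staging-bucket/orders/")  # hypothetical path
)
```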
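For service integration, a hedged sketch of chaining AWS services with boto3: start an AWS Glue job run and poll until it reaches a terminal state. The job name is a hypothetical placeholder.

```python
import time

import boto3

glue = boto3.client("glue")

def run_glue_job(job_name: str, poll_seconds: int = 30) -> str:
    """Start a Glue job run and block until it reaches a terminal state."""
    run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
    while True:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state in ("SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"):
            return state
        time.sleep(poll_seconds)

if __name__ == "__main__":
    final_state = run_glue_job("example-orders-etl")  # hypothetical job name
    print(f"Glue job finished with state: {final_state}")
```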
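For event-driven architecture, a minimal sketch of event-driven ingestion: an AWS Lambda handler fired by an S3 ObjectCreated notification that starts a downstream Glue job. The event shape is the standard S3 notification format; the Glue job name is a hypothetical placeholder.

```python
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    # A single S3 notification can carry multiple records.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Pass the new object's location to the ETL job as a job argument.
        glue.start_job_run(
            JobName="example-orders-etl",  # hypothetical job name
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
    return {"status": "started"}
```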
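For schedule-based configuration, a hedged sketch using Amazon EventBridge: create a rule with a cron expression and point it at a Step Functions state machine that runs the pipeline. The rule name, state machine ARN, and IAM role ARN are hypothetical placeholders.

```python
import boto3

events = boto3.client("events")

# Run every day at 02:00 UTC.
events.put_rule(
    Name="nightly-etl-trigger",  # hypothetical rule name
    ScheduleExpression="cron(0 2 * * ? *)",
    State="ENABLED",
)

# Point the rule at the pipeline's Step Functions state machine.
events.put_targets(
    Rule="nightly-etl-trigger",
    Targets=[
        {
            "Id": "nightly-etl",
            # Hypothetical ARNs for illustration only.
            "Arn": "arn:aws:states:us-east-1:123456789012:stateMachine:example-etl",
            "RoleArn": "arn:aws:iam::123456789012:role/example-events-invoke-role",
        }
    ],
)
```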
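For the CI/CD item, a minimal sketch of the testing slice: a pure transformation function with a pytest-style unit test that a CI job (for example, AWS CodeBuild) could run on every commit before deploying the pipeline. The function and field names are hypothetical.

```python
def normalize_order(raw: dict) -> dict:
    """Hypothetical transformation under test: trim the ID, cast the amount."""
    return {
        "order_id": raw["order_id"].strip(),
        "amount": round(float(raw["amount"]), 2),
    }

def test_normalize_order_trims_and_casts():
    raw = {"order_id": " A-42 ", "amount": "19.999"}
    assert normalize_order(raw) == {"order_id": "A-42", "amount": 20.0}
```

Keeping transformations as pure functions like this makes them testable without provisioning any AWS resources, which is what lets the test stage of a pipeline run quickly on every commit.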