Avoid overfitting or underfitting. Detect and handle bias and variance.
Split data into training and validation sets (for example, cross-validation).
Interpret confusion matrices.
Monitor performance of the model.
Determine storage mediums (for example, databases, Amazon S3, Amazon Elastic File System [Amazon EFS], Amazon Elastic Block Store [Amazon EBS]).
Identify and extract features from datasets, including from data sources such as text, speech, images, and public datasets.
Select appropriate models (for example, XGBoost, logistic regression, k-means, linear regression, decision trees, random forests, RNN, CNN, ensemble, transfer learning).
Apply encryption and anonymization.
Deploy to multiple AWS Regions and multiple Availability Zones.
Perform offline and online model evaluation (A/B testing).
Deploy Auto Scaling groups.
Understand linear models (learning rate).
Identify and handle missing data, corrupt data, and stop words.
Perform cluster analysis (for example, hierarchical, diagnosis, elbow plot, cluster size).
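
As an illustration of the elbow plot mentioned in the cluster analysis item, the following is a minimal sketch in plain Python, not a reference implementation. It assumes a small hand-made 2-D dataset with three well-separated groups, a deterministic farthest-point initialization in place of random starts, and a simple "largest bend" heuristic for reading the elbow. The names `kmeans_inertia`, `elbow_k`, `POINTS`, and the other helpers are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def init_centroids(points: List[Point], k: int) -> List[Point]:
    """Deterministic farthest-point initialization (an assumption of this
    sketch; production k-means usually uses random or k-means++ starts)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans_inertia(points: List[Point], k: int, max_iter: int = 100) -> float:
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = init_centroids(points, k)
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                n = len(members)
                new_centroids.append(
                    (sum(p[0] for p in members) / n, sum(p[1] for p in members) / n)
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def elbow_k(inertias: Dict[int, float]) -> int:
    """Return the k with the largest bend: the biggest difference between the
    inertia drop going into k and the drop going out of k."""
    ks = sorted(inertias)
    best_k, best_bend = ks[0], float("-inf")
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        bend = (inertias[prev] - inertias[k]) - (inertias[k] - inertias[nxt])
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Toy data: three groups of five points around hand-picked centers.
CENTERS = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
OFFSETS = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0), (0.0, 0.0)]
POINTS = [(cx + dx, cy + dy) for cx, cy in CENTERS for dx, dy in OFFSETS]

# Inertia for each candidate cluster count; plotting these values against k
# gives the elbow plot.
INERTIAS = {k: kmeans_inertia(POINTS, k) for k in range(1, 6)}
BEST_K = elbow_k(INERTIAS)

if __name__ == "__main__":
    for k, inertia in INERTIAS.items():
        print(f"k={k}  inertia={inertia:.2f}")
    print("elbow at k =", BEST_K)
```

On this toy dataset the inertia falls steeply up to k = 3 and only slightly afterwards, so the heuristic selects three clusters. On real data the bend is often less sharp, and the elbow is usually read together with other diagnostics such as cluster sizes.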