An ML engineer is setting up a CI/CD pipeline for an ML workflow in Amazon SageMaker AI. The pipeline must automatically retrain, test, and deploy a model whenever new data is uploaded to an Amazon S3 bucket. New data files are approximately 10 GB in size. The ML engineer also needs to track model versions for auditing.
Which solution will meet these requirements?
AWS documentation identifies SageMaker Pipelines as the native CI/CD service for ML workflows. Pipelines allow engineers to define automated steps for data processing, training, evaluation, and deployment, making them ideal for retraining models when new data arrives in Amazon S3.
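As a concrete illustration of the trigger, the S3-to-retraining hookup is commonly wired through Amazon EventBridge: with EventBridge notifications enabled on the bucket, a rule with an event pattern like the following can match new object uploads. The bucket name here is a placeholder, not from the question.

```json
{
  "source": ["aws.s3"],
  "detail-type": ["Object Created"],
  "detail": {
    "bucket": {
      "name": ["my-training-data-bucket"]
    }
  }
}
```

The rule's target would then be the SageMaker pipeline itself, since EventBridge supports starting a SageMaker Model Building Pipeline execution directly as a rule target, so each 10 GB upload kicks off retraining without any intermediate compute.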
For version tracking and auditing, SageMaker Model Registry is explicitly designed to manage model versions, metadata, approval status, and deployment history. This satisfies regulatory and audit requirements without custom tooling.
AWS Lambda, with its 15-minute execution limit and constrained memory and storage, is not suited to processing 10 GB data files directly. AWS CodeBuild is a general-purpose build service with no ML-specific features and no built-in model governance. Manual notebook workflows provide neither the automation nor the repeatability that CI/CD requires.
AWS best practices strongly recommend SageMaker Pipelines combined with the Model Registry for scalable, auditable, and production-grade ML CI/CD pipelines.
Therefore, Option B — SageMaker Pipelines combined with the SageMaker Model Registry — is the correct solution.