Amazon MLS-C01 Exam - Topic 6 Question 57 Discussion

Actual exam question for Amazon's MLS-C01 exam
Question #: 57
Topic #: 6

A company is building a new version of a recommendation engine. Machine learning (ML) specialists need to keep adding new data from users to improve personalized recommendations. The ML specialists gather data from the users' interactions on the platform and from sources such as external websites and social media.

The pipeline cleans, transforms, enriches, and compresses terabytes of data daily, and this data is stored in Amazon S3. A set of Python scripts was coded to do the job and is stored in a large Amazon EC2 instance. The whole process takes more than 20 hours to finish, with each script taking at least an hour. The company wants to move the scripts out of Amazon EC2 into a more managed solution that will eliminate the need to maintain servers.

Which approach will address all of these requirements with the LEAST development effort?

Suggested Answer: B

Hoa
4 months ago
I agree, Glue with PySpark is probably the least effort solution.
upvoted 0 times
...
Ettie
4 months ago
Wait, are they really using EC2 for this? That seems outdated!
upvoted 0 times
...
Allene
4 months ago
Redshift seems like overkill for this task.
upvoted 0 times
...
Alba
4 months ago
I think Lambda functions could work too, but I'm not sure about the complexity.
upvoted 0 times
...
Alita
4 months ago
Sounds like AWS Glue is the way to go for this!
upvoted 0 times
...
Leota
5 months ago
I practiced a similar question where we used AWS Step Functions, but I’m not sure if breaking the scripts into individual Lambda functions is the right approach here.
upvoted 0 times
...
Nan
5 months ago
I think using AWS Glue might be the best option since it’s designed for ETL processes, but I’m a bit uncertain about converting the scripts to PySpark.
upvoted 0 times
...
Glendora
5 months ago
I remember we discussed how AWS Lambda can help with serverless architecture, but I'm not sure if it can handle the entire pipeline efficiently.
upvoted 0 times
...
Malissa
5 months ago
Loading data into Amazon Redshift sounds familiar, but I feel like it might require more effort to set up than the other options.
upvoted 0 times
...
Kristine
5 months ago
Okay, I think I've got a strategy here. I need to focus on the static and instance variables in the code and how they can be accessed.
upvoted 0 times
...
Domonique
5 months ago
This is a tricky one. I'll need to think carefully about the differences between privacy and compliance risks.
upvoted 0 times
...
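
Several comments above weigh Lambda against Glue and Step Functions. The rough sketch below checks one constraint that bears on that choice and shows one possible way to chain Glue jobs. It relies on two facts: AWS Lambda caps a single invocation at 900 seconds, and Step Functions offers a synchronous Glue integration (`arn:aws:states:::glue:startJobRun.sync`). Everything else is an assumption for illustration: the helper names `fits_in_lambda` and `build_glue_chain`, and the Glue job names in the usage note, are hypothetical and do not come from the question.

```python
import json
from typing import Dict, List

# AWS Lambda's maximum timeout for a single invocation is 900 seconds (15 minutes).
LAMBDA_MAX_TIMEOUT_SECONDS = 900


def fits_in_lambda(runtime_seconds: float) -> bool:
    """Return True if a script with this runtime could finish in one Lambda invocation."""
    return runtime_seconds <= LAMBDA_MAX_TIMEOUT_SECONDS


def build_glue_chain(job_names: List[str]) -> Dict:
    """Build an Amazon States Language definition that runs Glue jobs in sequence.

    Assumes job_names are unique, because each one doubles as a state name.
    """
    if not job_names:
        raise ValueError("at least one job name is required")
    states = {}
    for i, name in enumerate(job_names):
        state = {
            "Type": "Task",
            # Synchronous Glue integration: the state waits for the job run to finish.
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": name},
        }
        if i + 1 < len(job_names):
            state["Next"] = job_names[i + 1]  # hand off to the following job
        else:
            state["End"] = True  # last job ends the state machine
        states[name] = state
    return {"StartAt": job_names[0], "States": states}


if __name__ == "__main__":
    # The question says each script takes at least an hour (3600 seconds).
    print(fits_in_lambda(3600))
    # Hypothetical job names, one per existing script.
    print(json.dumps(build_glue_chain(["clean", "transform", "enrich"]), indent=2))
```

Because each script runs for at least an hour, none of them fits inside a single Lambda invocation without being split up, which is the extra development effort some commenters were unsure about. The chained definition is only one way to order long-running jobs; whether the scripts need rewriting depends on the Glue job type chosen.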
