Amazon MLS-C01 Exam - Topic 1 Question 129 Discussion

Actual exam question for Amazon's MLS-C01 exam
Question #: 129
Topic #: 1

[Data Engineering]

A data engineer needs to provide a team of data scientists with the appropriate dataset to run machine learning training jobs. The data will be stored in Amazon S3. The data engineer is obtaining the data from an Amazon Redshift database and is using join queries to extract a single tabular dataset. A portion of the schema is as follows:

TransactionTimestamp (Timestamp)

CardName (Varchar)

CardNo (Varchar)

The data engineer must provide the data so that any row with a CardNo value of NULL is removed. Also, the TransactionTimestamp column must be separated into a TransactionDate column and a TransactionTime column. Finally, the CardName column must be renamed to NameOnCard.
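To make the three required transformations concrete, here is a minimal plain-Python sketch of the row-level logic. It is only an illustration, not the AWS Glue job or any of the answer options: the function name `transform_rows` and the sample rows are hypothetical, and it assumes each row arrives as a dict with `TransactionTimestamp` already parsed into a `datetime`.

```python
from datetime import datetime


def transform_rows(rows):
    """Apply the three transformations described in the question.

    `rows` is assumed to be an iterable of dicts keyed by column name
    (a hypothetical in-memory stand-in for the extracted dataset).
    """
    cleaned = []
    for row in rows:
        # 1. Remove any row whose CardNo is NULL (None in Python).
        if row.get("CardNo") is None:
            continue

        ts = row["TransactionTimestamp"]

        # Copy every column except the two that are being replaced.
        out = {
            key: value
            for key, value in row.items()
            if key not in ("TransactionTimestamp", "CardName")
        }

        # 2. Split the timestamp into separate date and time columns.
        out["TransactionDate"] = ts.date().isoformat()
        out["TransactionTime"] = ts.time().isoformat()

        # 3. Rename CardName to NameOnCard.
        out["NameOnCard"] = row.get("CardName")

        cleaned.append(out)
    return cleaned


# Hypothetical sample data: the second row has a NULL CardNo and is dropped.
sample = [
    {
        "TransactionTimestamp": datetime(2023, 1, 15, 9, 30, 0),
        "CardName": "A. Example",
        "CardNo": "1234",
    },
    {
        "TransactionTimestamp": datetime(2023, 1, 16, 18, 5, 42),
        "CardName": "B. Example",
        "CardNo": None,
    },
]

result = transform_rows(sample)
```

In a managed ETL service the same three steps would be expressed with that service's own filter, derive-column, and rename operations; the sketch only shows what the output rows should look like.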

The data will be extracted on a monthly basis and will be loaded into an S3 bucket. The solution must minimize the effort that is needed to set up infrastructure for the ingestion and transformation. The solution must be automated and must minimize the load on the Amazon Redshift cluster.

Which solution meets these requirements?

Suggested Answer: C

Contribute your Thoughts:

Colton
13 days ago
I think D could work too, but it might be more complex.
upvoted 0 times
Tasia
18 days ago
Option C sounds like the best choice for automation!
upvoted 0 times
Pamela
1 month ago
I vaguely recall that using Lambda functions can be tricky with scheduling. I wonder if option D would really be the most efficient for this scenario.
upvoted 0 times
Robt
1 month ago
I practiced a similar question where we had to automate data extraction. I feel like AWS Glue is designed for these kinds of transformations, so it might be the right choice here.
upvoted 0 times
Brent
1 month ago
I'm not entirely sure, but I think using Amazon EMR could be overkill for this task. It seems like a lot of setup for just monthly data extraction.
upvoted 0 times
Telma
2 months ago
I remember we discussed the importance of minimizing load on the Redshift cluster. I think option C with AWS Glue might be the best fit for that.
upvoted 0 times