
Amazon Exam BDS-C00 Topic 1 Question 73 Discussion

Actual exam question for Amazon's BDS-C00 exam
Question #: 73
Topic #: 1

An organization uses Amazon Elastic MapReduce (EMR) to process a series of extract-transform-load (ETL) steps that run in sequence. The output of each step must be fully processed in subsequent steps but will not be retained.

Which of the following techniques will meet this requirement most efficiently?

A) Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).
B) Use the s3n URI to store the data to be processed as objects in Amazon S3.
C) Define the ETL steps as separate AWS Data Pipeline activities.
D) Load the data to be processed into HDFS and then write the final output to Amazon S3.

Suggested Answer: C
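To make the storage choices the commenters debate below concrete, here is a minimal sketch (boto3; the cluster ID, bucket, and script paths are placeholders, not taken from the exam) of submitting two sequential ETL steps to an existing EMR cluster, keeping the intermediate output in HDFS and writing only the final result to S3 through EMRFS:

```python
import boto3

emr = boto3.client("emr", region_name="us-east-1")

# Placeholder identifiers -- substitute your own cluster, bucket, and scripts.
CLUSTER_ID = "j-XXXXXXXXXXXXX"
INTERMEDIATE = "hdfs:///tmp/etl/step1-output/"      # HDFS: discarded with the cluster
FINAL_OUTPUT = "s3://example-bucket/etl/final/"     # EMRFS: durable object storage

steps = [
    {
        "Name": "extract-transform",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit", "s3://example-bucket/scripts/step1.py",
                "--output", INTERMEDIATE,
            ],
        },
    },
    {
        "Name": "load",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit", "s3://example-bucket/scripts/step2.py",
                "--input", INTERMEDIATE,
                "--output", FINAL_OUTPUT,
            ],
        },
    },
]

response = emr.add_job_flow_steps(JobFlowId=CLUSTER_ID, Steps=steps)
print(response["StepIds"])
```

Whether the intermediate path should live in HDFS or in S3 via EMRFS is exactly the trade-off the discussion below turns on; the sketch only makes the two locations explicit.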

Contribute your Thoughts:

Brynn
19 days ago
Using EMRFS to store the outputs in S3 is definitely the way to go. I mean, who wants to be the one to tell the boss they used the 's3n' URI? That's like pulling a Betamax in the age of Netflix.
upvoted 0 times
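For anyone unfamiliar with the URI distinction being joked about here: on EMR, s3:// paths go through EMRFS, while s3n:// is the older Hadoop-native S3 connector. A rough PySpark illustration (the bucket and paths are made up):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("uri-demo").getOrCreate()

# s3:// is the scheme EMR documents today and is backed by EMRFS.
df = spark.read.json("s3://example-bucket/raw/")

# The legacy form would have been "s3n://example-bucket/raw/";
# it still resolves on old Hadoop builds but is not recommended on EMR.

df.write.mode("overwrite").parquet("s3://example-bucket/curated/")
```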
Stephaine
25 days ago
Loading the data into HDFS and then writing the final output to S3 seems like overkill for this use case, where we don't need to retain the intermediate data.
upvoted 0 times
Mertie
2 days ago
B) Use the s3n URI to store the data to be processed as objects in Amazon S3.
upvoted 0 times
Bettye
8 days ago
A) Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).
upvoted 0 times
Kristeen
1 month ago
Defining the ETL steps as separate AWS Data Pipeline activities could work, but it might add unnecessary complexity compared to the EMRFS approach (a rough sketch of what that would involve follows this thread).
upvoted 0 times
Lauran
3 days ago
D) Load the data to be processed into HDFS and then write the final output to Amazon S3.
upvoted 0 times
Jeniffer
4 days ago
B) Use the s3n URI to store the data to be processed as objects in Amazon S3.
upvoted 0 times
Billye
9 days ago
A) Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).
upvoted 0 times
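Since the suggested answer points at AWS Data Pipeline, here is a rough sketch of what "separate activities" could mean in practice, defined through boto3. Every identifier, role name, release label, and step string below is a placeholder assumption, and details such as scheduling and failure handling are omitted:

```python
import boto3

dp = boto3.client("datapipeline", region_name="us-east-1")

pipeline = dp.create_pipeline(name="etl-sequence", uniqueId="etl-sequence-demo")
pipeline_id = pipeline["pipelineId"]

# Each ETL phase becomes its own EmrActivity; "dependsOn" chains them in order.
objects = [
    {"id": "Default", "name": "Default",
     "fields": [{"key": "scheduleType", "stringValue": "ondemand"},
                {"key": "role", "stringValue": "DataPipelineDefaultRole"},
                {"key": "resourceRole", "stringValue": "DataPipelineDefaultResourceRole"}]},
    {"id": "EtlCluster", "name": "EtlCluster",
     "fields": [{"key": "type", "stringValue": "EmrCluster"},
                {"key": "releaseLabel", "stringValue": "emr-5.36.0"}]},
    {"id": "ExtractTransform", "name": "ExtractTransform",
     "fields": [{"key": "type", "stringValue": "EmrActivity"},
                {"key": "runsOn", "refValue": "EtlCluster"},
                {"key": "step", "stringValue":
                 "command-runner.jar,spark-submit,s3://example-bucket/scripts/step1.py"}]},
    {"id": "Load", "name": "Load",
     "fields": [{"key": "type", "stringValue": "EmrActivity"},
                {"key": "runsOn", "refValue": "EtlCluster"},
                {"key": "dependsOn", "refValue": "ExtractTransform"},
                {"key": "step", "stringValue":
                 "command-runner.jar,spark-submit,s3://example-bucket/scripts/step2.py"}]},
]

dp.put_pipeline_definition(pipelineId=pipeline_id, pipelineObjects=objects)
dp.activate_pipeline(pipelineId=pipeline_id)
```

As Kristeen notes above, this buys explicit orchestration at the cost of an extra service to configure and maintain.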
Vi
2 months ago
That's a valid point, but I still think storing outputs in S3 using EMRFS is more efficient.
upvoted 0 times
Domingo
2 months ago
I'm not sure the s3n URI is the right choice here, as it doesn't seem to address the requirement of not retaining the data locally.
upvoted 0 times
Cyndy
24 days ago
C) Define the ETL steps as separate AWS Data Pipeline activities.
upvoted 0 times
Richelle
1 month ago
A) Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).
upvoted 0 times
Thomasena
2 months ago
I disagree; I believe option D is better because it involves loading the data into HDFS first.
upvoted 0 times
Rasheeda
2 months ago
Using EMRFS to store the output in S3 seems the most efficient option since it allows us to process the data without having to retain it locally.
upvoted 0 times
Vi
2 months ago
I think option A is the most efficient.
upvoted 0 times
