

Amazon Exam BDS-C00 Topic 1 Question 73 Discussion

Actual exam question for Amazon's BDS-C00 exam

Question #: 73
Topic #: 1


An organization uses Amazon Elastic MapReduce (EMR) to process a series of extract-transform-load (ETL) steps that run in sequence. The output of each step must be fully processed in subsequent steps but will not be retained.

Which of the following techniques will meet this requirement most efficiently?

A. Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).

B. Use the s3n URI to store the data to be processed as objects in Amazon S3.

C. Define the ETL steps as separate AWS Data Pipeline activities.

D. Load the data to be processed into HDFS and then write the final output to Amazon S3.
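For readers weighing the options, here is a minimal sketch of what the last one describes: intermediate outputs stay on the cluster's HDFS and only the final step writes to S3. It only builds step definitions in the dictionary shape that boto3's EMR `add_job_flow_steps` call accepts; it does not contact AWS. The helper name `build_etl_steps`, the bucket and script paths, the `hdfs:///etl-intermediate` prefix, and the `--input`/`--output` script arguments are all assumptions made for illustration, not part of the question.

```python
def build_etl_steps(scripts, final_s3_uri, hdfs_root="hdfs:///etl-intermediate"):
    """Build EMR step definitions that chain outputs through HDFS.

    Every step except the last writes to a per-step HDFS directory
    (hypothetical layout); the last step writes to final_s3_uri.
    """
    steps = []
    previous_output = None
    for index, script in enumerate(scripts, start=1):
        is_last = index == len(scripts)
        output = final_s3_uri if is_last else f"{hdfs_root}/step-{index}"

        # --input / --output are assumed arguments of the ETL scripts.
        args = ["spark-submit", script]
        if previous_output is not None:
            args += ["--input", previous_output]
        args += ["--output", output]

        steps.append({
            "Name": f"etl-step-{index}",
            "ActionOnFailure": "CANCEL_AND_WAIT",
            "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
        })
        previous_output = output
    return steps


if __name__ == "__main__":
    example = build_etl_steps(
        [
            "s3://example-bucket/scripts/extract.py",
            "s3://example-bucket/scripts/transform.py",
            "s3://example-bucket/scripts/load.py",
        ],
        "s3://example-bucket/final/",
    )
    for step in example:
        print(step["Name"], step["HadoopJarStep"]["Args"])
```

Under these assumptions, the list could be passed as the `Steps` argument of `add_job_flow_steps` on a running cluster. Because HDFS on EMR lives on the cluster's own nodes, the intermediate directories disappear when the cluster terminates, which matches the "will not be retained" wording in the question.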


Suggested Answer: C

by Jesusa at Apr 03, 2023, 01:45 AM



Brynn

1 month ago

Using EMRFS to store the outputs in S3 is definitely the way to go. I mean, who wants to be the one to tell the boss they used the 's3n' URI? That's like pulling a Betamax in the age of Netflix.

upvoted 0 times

...

Stephaine

1 month ago

Loading the data into HDFS and then writing the final output to S3 seems like overkill for this use case, where we don't need to retain the intermediate data.

upvoted 0 times

Iraida

10 days ago

Loading the data into HDFS and then writing the final output to S3 seems like overkill for this use case, where we don't need to retain the intermediate data.

upvoted 0 times

...

Maxima

10 days ago

A) Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).

upvoted 0 times

...

Mertie

17 days ago

B) Use the s3n URI to store the data to be processed as objects in Amazon S3.

upvoted 0 times

...

Bettye

23 days ago

A) Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).

upvoted 0 times

...

...

Kristeen

2 months ago

Defining the ETL steps as separate AWS Data Pipeline activities could work, but it might add unnecessary complexity compared to the EMRFS approach.

upvoted 0 times

Lauran

18 days ago

D) Load the data to be processed into HDFS and then write the final output to Amazon S3.

upvoted 0 times

...

Jeniffer

19 days ago

B) Use the s3n URI to store the data to be processed as objects in Amazon S3.

upvoted 0 times

...

Billye

24 days ago

A) Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).

upvoted 0 times

...

...

Vi

2 months ago

That's a valid point, but I still think storing outputs in S3 using EMRFS is more efficient.

upvoted 0 times

...

Domingo

2 months ago

I'm not sure the s3n URI is the right choice here, as it doesn't seem to address the requirement of not retaining the data locally.

upvoted 0 times

Cyndy

1 month ago

C) Define the ETL steps as separate AWS Data Pipeline activities.

upvoted 0 times

...

Richelle

1 month ago

A) Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).

upvoted 0 times

...

...

Thomasena

2 months ago

I disagree; I believe option D is better because it involves loading data into HDFS first.

upvoted 0 times

...

Rasheeda

2 months ago

Using EMRFS to store the output in S3 seems the most efficient option since it allows us to process the data without having to retain it locally.

upvoted 0 times

...

Vi

2 months ago

I think option A is the most efficient.

upvoted 0 times

...