Google Professional Data Engineer Exam - Topic 3 Question 83 Discussion

Actual exam question for Google's Professional Data Engineer exam
Question #: 83
Topic #: 3

You need to modernize your existing on-premises data strategy. Your organization currently uses:

* Apache Hadoop clusters for processing multiple large data sets, including on-premises Hadoop Distributed File System (HDFS) for data replication.

* Apache Airflow to orchestrate hundreds of ETL pipelines with thousands of job steps.

You need to set up a new architecture in Google Cloud that can handle your Hadoop workloads and requires minimal changes to your existing orchestration processes. What should you do?

Suggested Answer: C
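
Several commenters below describe the suggested answer as migrating the Hadoop clusters to Dataproc and replacing HDFS with Cloud Storage. As a minimal sketch of why the storage swap is low-friction: Dataproc clusters ship with the Cloud Storage connector preinstalled, so an existing Spark job usually only needs its URI scheme changed. The bucket, paths, and column name below are hypothetical.

    # Hedged PySpark sketch: same job logic, storage moved from HDFS to
    # Cloud Storage. All "example" names are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("hdfs-to-gcs-sketch").getOrCreate()

    # Before, on-premises HDFS:
    #   df = spark.read.parquet("hdfs://namenode:8020/data/events/")
    # After, on Dataproc (Cloud Storage connector is preinstalled):
    df = spark.read.parquet("gs://example-bucket/data/events/")

    df.groupBy("event_type").count() \
        .write.parquet("gs://example-bucket/output/event_counts/")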

Contribute your Thoughts:

Herschel
4 months ago
I didn't know Dataproc could handle all that!
upvoted 0 times
Jina
4 months ago
Wait, can we really just switch to Cloud Composer without issues?
upvoted 0 times
Marlon
4 months ago
D seems like the most straightforward option!
upvoted 0 times
Edda
4 months ago
I think B is better for large workloads, though.
upvoted 0 times
Amira
5 months ago
A sounds solid for migrating Hadoop to the cloud.
upvoted 0 times
Ona
5 months ago
I feel like option D is the best fit since it mentions both Dataproc and Cloud Composer, which we covered in a similar practice question.
upvoted 0 times
Lizette
5 months ago
I think using Cloud Composer for orchestration makes sense since it integrates well with other Google Cloud services. But I'm a bit confused about the differences between Composer and Airflow.
upvoted 0 times
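
On the Composer-versus-Airflow question above: Cloud Composer is a managed deployment of Apache Airflow, so the DAG files themselves are ordinary Airflow code; the differences are mostly operational (Google provisions and patches the scheduler, workers, and metadata database). A minimal sketch, assuming Airflow 2.x with the apache-airflow-providers-google package installed; every "example" name is a hypothetical placeholder. The same file runs on self-managed Airflow and, unchanged, on Composer, which is why this route keeps orchestration changes small.

    # Hedged sketch of an Airflow DAG that submits an existing Spark job
    # to a Dataproc cluster. Project, cluster, bucket, and class names
    # are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.dataproc import (
        DataprocSubmitJobOperator,
    )

    SPARK_JOB = {
        "reference": {"project_id": "example-project"},
        "placement": {"cluster_name": "example-cluster"},
        "spark_job": {
            "main_class": "com.example.EtlJob",
            "jar_file_uris": ["gs://example-bucket/jobs/etl.jar"],
        },
    }

    with DAG(
        dag_id="hadoop_etl_on_dataproc",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        DataprocSubmitJobOperator(
            task_id="run_etl",
            job=SPARK_JOB,
            region="us-central1",
            project_id="example-project",
        )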
Demetra
5 months ago
I'm not entirely sure about using Cloud Data Fusion for ETL. I thought we practiced more with Dataflow for those types of pipelines.
upvoted 0 times
Noemi
5 months ago
I remember we discussed Dataproc in class as a good option for migrating Hadoop clusters. It seems like a solid choice here.
upvoted 0 times
Makeda
5 months ago
Option C with Cloud Data Fusion looks promising. I like the idea of a visual pipeline designer to help modernize my ETL processes. That could save a lot of time and effort compared to manually converting everything.
upvoted 0 times
Adelina
5 months ago
I'm a bit unsure about this one. Option B with Bigtable and Cloud Composer sounds interesting, but I'm not sure how well it would integrate with my current Hadoop and Airflow setup. Might need to do some more research on that.
upvoted 0 times
Veronica
5 months ago
This seems like a straightforward migration to the cloud, so I'd go with option A. Dataproc and Dataflow should let me lift and shift my existing Hadoop and Airflow workloads with minimal changes.
upvoted 0 times
Clement
6 months ago
I'd probably go with option D. Dataproc and Cloud Composer seem like the safest bet to migrate my Hadoop and Airflow workloads to Google Cloud with minimal disruption. The integration should be pretty straightforward.
upvoted 0 times
Jerry
6 months ago
Okay, let me think this through. The question is asking about useful questions to ask during the design thinking process. I'll need to recall the main stages of that process and come up with appropriate questions for each.
upvoted 0 times
Lemuel
6 months ago
Okay, let's see. I think the key is to focus on the interfaces that the TAA needs to control and observe. Option C seems to capture that well.
upvoted 0 times
Geoffrey
6 months ago
Hmm, this looks like a tricky one. I'll need to think carefully about the different storage options that ACK supports.
upvoted 0 times
Lenora
10 months ago
I gotta say, these cloud services are really starting to sound like they were named by a team of engineers high on caffeine. 'Dataproc'? 'Cloud Composer'? I feel like I need a PhD in Google Cloud just to understand the question!
upvoted 0 times
Gerald
10 months ago
Wow, this is a tough one! I'm tempted to go with Option A just to keep things simple, but Option C seems like it offers the most comprehensive modernization. 'When in doubt, go with the most features!' - that's my motto!
upvoted 0 times
Chantell
10 months ago
Option B is an interesting approach, but I'm not sure Bigtable is the best fit for the large workloads mentioned in the question. I'd probably go with Option C for its more holistic solution.
upvoted 0 times
Cherelle
9 months ago
Yeah, using Dataproc to migrate Hadoop clusters and Cloud Data Fusion for ETL pipelines sounds like a solid plan.
upvoted 0 times
James
9 months ago
I agree, Option C seems to cover all the bases for modernizing the data strategy.
upvoted 0 times
Regenia
9 months ago
Option B is a good choice, but I think Option C offers a more comprehensive solution.
upvoted 0 times
Wilson
9 months ago
Yeah, Option C sounds like the best choice for migrating Hadoop clusters to Google Cloud and handling HDFS use cases.
upvoted 0 times
Tricia
9 months ago
I think Option C is the way to go as well, especially with the visual design and deployment of ETL pipelines using Cloud Data Fusion.
upvoted 0 times
Donte
9 months ago
I agree, Option C seems like a more comprehensive solution for handling the Hadoop workloads.
upvoted 0 times
Anisha
11 months ago
I'm leaning towards Option D. Using Dataproc and Cloud Composer seems like a straightforward approach that aligns well with the existing orchestration processes.
upvoted 0 times
Bambi
9 months ago
Yeah, it's important to minimize disruptions when updating your data strategy.
upvoted 0 times
Linn
10 months ago
Dataproc and Cloud Composer seem like a reliable combination for this migration.
upvoted 0 times
Evangelina
10 months ago
I agree, sticking with what you know can make the transition smoother.
upvoted 0 times
Amber
11 months ago
Option D sounds like a good choice. It keeps things simple and aligned with what you already have in place.
upvoted 0 times
Edward
11 months ago
Option C looks like the most comprehensive solution. Migrating Hadoop to Dataproc and using Cloud Storage for HDFS, while leveraging Cloud Data Fusion for visual ETL design, seems like a great way to modernize the architecture with minimal changes.
upvoted 0 times
Lenna
10 months ago
I agree, using Dataproc for migration, Cloud Storage for HDFS, and Cloud Data Fusion for ETL pipelines sounds like a solid plan.
upvoted 0 times
Cordelia
11 months ago
We should also consider converting our ETL pipelines to Dataflow for better efficiency.
upvoted 0 times
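
One caveat on the Dataflow suggestion above: unlike moving Hadoop jobs onto Dataproc, converting a pipeline to Dataflow means rewriting it against the Apache Beam SDK, so it trades "minimal changes" for a more managed runtime. A minimal Beam sketch of one converted ETL step follows (bucket and paths are hypothetical); it runs locally by default and on Dataflow when launched with --runner=DataflowRunner plus project, region, and staging options.

    # Hedged Apache Beam sketch of a single ETL step: count records per
    # type from CSV input. Bucket and paths are hypothetical.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def run():
        options = PipelineOptions()  # reads --runner/--project/--region from argv
        with beam.Pipeline(options=options) as p:
            (
                p
                | "Read" >> beam.io.ReadFromText("gs://example-bucket/input/*.csv")
                | "ExtractType" >> beam.Map(lambda line: line.split(",")[0])
                | "CountPerType" >> beam.combiners.Count.PerElement()
                | "Format" >> beam.MapTuple(lambda kind, n: f"{kind},{n}")
                | "Write" >> beam.io.WriteToText("gs://example-bucket/output/counts")
            )

    if __name__ == "__main__":
        run()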
Cory
11 months ago
I agree, and we can use Cloud Storage to handle any HDFS use cases.
upvoted 0 times
Cordelia
12 months ago
I think we should use Dataproc to migrate our Hadoop clusters to Google Cloud.
upvoted 0 times
