
Google Associate Data Practitioner Exam: Topic 3, Question 9 Discussion

Actual exam question for Google's Associate Data Practitioner exam
Question #: 9
Topic #: 3

Your organization has a petabyte of application logs stored as Parquet files in Cloud Storage. You need to quickly perform a one-time SQL-based analysis of the files and join them to data that already resides in BigQuery. What should you do?

Suggested Answer: C

Creating BigQuery external tables over the Parquet files in Cloud Storage lets you run standard SQL against the logs and join them to tables that already reside in BigQuery, without loading anything first. For a one-time analysis this is the efficient choice: it avoids the time and storage cost of ingesting a petabyte of data, since BigQuery reads the Parquet files in place at query time.
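As a concrete sketch of this approach in BigQuery SQL: the project, dataset, bucket path, and join column below (my_project, logs_ds, gs://my-bucket/app-logs/, user_id) are placeholder names, not details from the question.

-- Hypothetical names throughout; substitute your own project, dataset, and bucket.
-- Define an external table over the Parquet files. No data is copied;
-- BigQuery reads the files in Cloud Storage at query time.
CREATE EXTERNAL TABLE `my_project.logs_ds.app_logs_ext`
OPTIONS (
  format = 'PARQUET',
  uris = ['gs://my-bucket/app-logs/*.parquet']
);

-- Join the external logs to a table that already resides in BigQuery.
SELECT u.account_name, COUNT(*) AS error_count
FROM `my_project.logs_ds.app_logs_ext` AS l
JOIN `my_project.logs_ds.users` AS u
  ON l.user_id = u.user_id
WHERE l.severity = 'ERROR'
GROUP BY u.account_name;

Dropping the external table afterward removes only the table definition; the Parquet files in Cloud Storage are left untouched.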


Contribute your Thoughts:

PySpark is overkill for a one-time analysis. Option C looks like the most straightforward approach here.
upvoted 0 times
Ardella
2 days ago
I'm not a fan of external tables - they can be a bit of a pain to manage. I'd go with option D and just load the Parquet files directly into BigQuery.
upvoted 0 times
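For comparison, the option D approach Ardella mentions (loading the Parquet files into native BigQuery storage) can also be expressed in SQL, using the same placeholder names as the sketch above. For a petabyte of logs, though, the load itself takes time and adds BigQuery storage cost, which is why it is a weaker fit for a one-time analysis:

-- Hypothetical names; appends to the table, creating it if it does not exist.
LOAD DATA INTO `my_project.logs_ds.app_logs`
FROM FILES (
  format = 'PARQUET',
  uris = ['gs://my-bucket/app-logs/*.parquet']
);
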
Irma
8 days ago
I think option D could work too, loading the files into BigQuery.
upvoted 0 times
Filiberto
9 days ago
I prefer option C, creating external tables over the files in Cloud Storage.
upvoted 0 times
Fairy
13 days ago
I agree, using a Dataproc cluster with PySpark seems efficient.
upvoted 0 times
Daniel
17 days ago
Cloud Data Fusion seems like the easiest way to get this done. No need to write any code!
upvoted 0 times
Sommer
23 days ago
I think option A sounds like a good idea.
upvoted 0 times
