Welcome to Pass4Success


Google Professional Machine Learning Engineer Exam - Topic 8 Question 57 Discussion

Actual exam question for Google's Professional Machine Learning Engineer exam
Question #: 57
Topic #: 8

You work on a data science team at a bank and are creating an ML model to predict loan default risk. You have collected and cleaned hundreds of millions of records' worth of training data in a BigQuery table, and you now want to develop and compare multiple models on this data using TensorFlow and Vertex AI. You want to minimize any bottlenecks during the data ingestion stage while considering scalability. What should you do?

Suggested Answer: B
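The answer choices are not reproduced on this page, but the commenters below reference TensorFlow I/O's BigQuery reader, which reads rows from BigQuery directly into a tf.data pipeline over multiple parallel streams, avoiding an intermediate CSV or TFRecord export step. A minimal sketch of that approach, assuming the suggested answer refers to it; the project, dataset, and column names are placeholders, not part of the question:

```python
# Hypothetical sketch: stream a BigQuery table into tf.data with the
# TensorFlow I/O BigQuery reader. Requires the tensorflow-io package and
# GCP credentials; all identifiers below are illustrative placeholders.

def make_bigquery_dataset(project_id: str,
                          dataset_id: str,
                          table_id: str,
                          gcp_project: str,
                          num_streams: int = 4):
    """Return a tf.data.Dataset that reads rows in parallel from BigQuery."""
    import tensorflow as tf
    from tensorflow_io.bigquery import BigQueryClient

    client = BigQueryClient()
    session = client.read_session(
        parent=f"projects/{gcp_project}",
        project_id=project_id,
        dataset_id=dataset_id,
        table_id=table_id,
        selected_fields=["loan_amount", "credit_score", "defaulted"],  # placeholder columns
        output_types=[tf.float64, tf.int64, tf.int64],
        requested_streams=num_streams,
    )
    # parallel_read_rows interleaves the server-side read streams, which is
    # what keeps ingestion from becoming a single-reader bottleneck.
    return session.parallel_read_rows()
```

Because the reader pulls data over the BigQuery Storage API rather than from exported files, it scales with the table without the extra storage and latency that a CSV or TFRecord export introduces.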

Contribute your Thoughts:

Pansy
4 months ago
Not sure about that, exporting to CSV seems safer to me.
upvoted 0 times
Ronna
4 months ago
I think option D is the best choice for scalability.
upvoted 0 times
Martina
4 months ago
Wait, can TensorFlow I/O really handle that much data directly?
upvoted 0 times
Azalee
4 months ago
Definitely going with option C! Makes the most sense.
upvoted 0 times
Hermila
4 months ago
I heard using TFRecords is super efficient for large datasets.
upvoted 0 times
Val
5 months ago
Using TensorFlow I/O's BigQuery Reader sounds familiar, and I think it could help avoid bottlenecks, but I need to double-check its scalability.
upvoted 0 times
Shay
5 months ago
I feel like converting data to TFRecords could be beneficial for performance, but I can't recall the exact advantages over other methods.
upvoted 0 times
Reita
5 months ago
I think exporting to CSV files is a common approach, but it might introduce some latency during data loading.
upvoted 0 times
Frederica
5 months ago
I remember we discussed using the BigQuery client library, but I'm not sure if that's the most efficient way for large datasets.
upvoted 0 times
Margot
5 months ago
I'm feeling pretty confident about this one. The scenarios describe different ways the host can be configured to connect to the two arrays, so I just need to analyze each option carefully.
upvoted 0 times
Arminda
5 months ago
I vaguely remember something about digests being generated from the content. So, if the content is identical but has more whitespace, would that really affect the digest?
upvoted 0 times
