Your company is loading comma-separated values (CSV) files into Google BigQuery. The data imports fully and successfully; however, the imported data does not match the source file byte-for-byte. What is the most likely cause of this problem?
I'm going with option C as well. BigQuery is pretty picky about the encoding, and if it's not the default, you can end up with mismatched data. Gotta love those character encoding problems!
Ha! The question says the data is 'fully imported successfully', so option D about an ETL phase is clearly not the issue. These exam questions can be tricky sometimes.
Option B seems plausible too - the CSV data could have invalid rows that were skipped on import. That would also lead to the data not matching byte-for-byte. I'll keep that in mind.
I think the most likely cause is option C - the CSV data loaded in BigQuery is not using BigQuery's default encoding. I've seen this issue before when the source file uses a different encoding than what BigQuery expects.
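To see why a non-default encoding causes this, here's a minimal sketch in plain Python (not BigQuery itself): BigQuery assumes UTF-8 by default, so bytes written in another encoding such as ISO-8859-1 (Latin-1) get mangled on decode, and the stored data can no longer round-trip back to the original bytes.

```python
# A CSV row written in Latin-1 - the source file's raw bytes.
src = "café,München\n".encode("latin-1")

# A loader that assumes UTF-8 (BigQuery's default) hits invalid byte
# sequences and substitutes replacement characters.
decoded = src.decode("utf-8", errors="replace")

# Re-encoding what was stored no longer matches the source bytes.
reencoded = decoded.encode("utf-8")
print(src == reencoded)  # False: the accented characters were mangled
```

In practice the fix is to tell BigQuery the source encoding at load time; I believe the `bq load` CLI accepts an `--encoding=ISO-8859-1` flag for this, so the data is converted correctly instead of silently corrupted.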