Welcome to Pass4Success


Google Professional Data Engineer Exam - Topic 1 Question 67 Discussion

Actual exam question for Google's Professional Data Engineer exam
Question #: 67
Topic #: 1

You are preparing an organization-wide dataset. You need to preprocess customer data stored in a restricted bucket in Cloud Storage. The data will be used to create consumer analyses. You need to follow data privacy requirements, including protecting certain sensitive data elements, while also retaining all of the data for potential future use cases. What should you do?

Suggested Answer: C
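For context on why C can both protect sensitive elements and retain all data: the options pair Dataflow with either DLP masking or KMS-backed encryption, and one common way to get reversible field-level protection is Cloud DLP's deterministic encryption using a key wrapped by Cloud KMS. A minimal sketch of such a de-identification config follows; the project, key ring, key name, and wrapped-key string are all placeholders, and this is an illustration of the config shape rather than a complete pipeline:

```python
# Sketch: a Cloud DLP DeidentifyConfig (as a plain dict) that encrypts,
# rather than deletes, detected sensitive values using a data key wrapped
# by Cloud KMS. Because the transformation is deterministic encryption,
# the original values remain recoverable via a re-identify call with the
# same key. All resource names below are placeholders.

def build_deidentify_config(kms_key_name: str, wrapped_key_b64: str) -> dict:
    """Return a DLP de-identify config applying deterministic encryption
    to detected EMAIL_ADDRESS and US_SOCIAL_SECURITY_NUMBER values."""
    return {
        "info_type_transformations": {
            "transformations": [
                {
                    "info_types": [
                        {"name": "EMAIL_ADDRESS"},
                        {"name": "US_SOCIAL_SECURITY_NUMBER"},
                    ],
                    "primitive_transformation": {
                        "crypto_deterministic_config": {
                            "crypto_key": {
                                "kms_wrapped": {
                                    "wrapped_key": wrapped_key_b64,
                                    "crypto_key_name": kms_key_name,
                                }
                            },
                            # Detected values are replaced by surrogates
                            # tagged with this info type name.
                            "surrogate_info_type": {"name": "TOKEN"},
                        }
                    },
                }
            ]
        }
    }

# Placeholder KMS key path and wrapped data key (base64).
config = build_deidentify_config(
    "projects/my-project/locations/global/keyRings/dlp/cryptoKeys/deid",
    "WRAPPED_KEY_B64",
)
```

This contrasts with option B, which removes the sensitive fields outright and therefore cannot satisfy the "retain all of the data for future use cases" requirement.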

Contribute your Thoughts:

Cathrine
4 months ago
Wait, can we really use CMEK like that? Sounds sketchy!
Jesus
4 months ago
D sounds risky, sharing keys could lead to breaches.
Wendell
4 months ago
C is interesting, but encrypting might complicate future analyses.
Kina
4 months ago
I disagree, B is better since it removes sensitive fields entirely.
Isadora
4 months ago
A seems like the best option for masking sensitive data.
Hector
5 months ago
I recall that using customer-managed encryption keys is important for compliance, but I’m not clear on how that fits with the other options here.
Titus
5 months ago
I’m a bit confused about whether encrypting data with Cloud KMS is necessary if we’re just masking sensitive data.
Theodora
5 months ago
I think option A sounds familiar because it mentions using Dataflow and the DLP API together, which we practiced in a similar question.
Devorah
5 months ago
I remember studying about the Cloud Data Loss Prevention API, but I'm not sure if it should be used for masking or removing data.
Michel
5 months ago
I vaguely recall something about customer-managed encryption keys in our study materials. Option D seems to align with that, but I’m uncertain about using federated queries with it.
Stefan
5 months ago
I feel like I saw a similar question about encryption in our practice exams. Option C mentions using Cloud KMS, which sounds familiar, but I'm not sure if it's the best approach here.
Ceola
5 months ago
I'm not entirely sure, but I think option B could be a good fit too. It talks about detecting and removing sensitive fields, which seems relevant to data privacy.
Sharika
5 months ago
I remember we discussed the importance of data masking in our last class. I think option A might be the right choice since it mentions masking sensitive data.
Altha
5 months ago
Upgrading the system model seems like it would likely require a full system outage, so I'm going to rule that out. I'll focus on the first two options.
Audry
5 months ago
The key here is understanding how explicit settings and ACT settings are prioritized. I believe option C is the false statement.
Xochitl
5 months ago
I think Ethereum is a permissionless blockchain. We discussed that in class!
Avery
9 months ago
Honestly, I'm just hoping the correct answer isn't 'All of the above'. That would be a real head-scratcher!
Scarlet
8 months ago
C) Use Dataflow and Cloud KMS to encrypt sensitive fields and write the encrypted data in BigQuery. Share the encryption key by following the principle of least privilege.
Reta
8 months ago
B) Use the Cloud Data Loss Prevention API and Dataflow to detect and remove sensitive fields from the data in Cloud Storage. Write the filtered data in BigQuery.
Louvenia
9 months ago
A) Use Dataflow and the Cloud Data Loss Prevention API to mask sensitive data. Write the processed data in BigQuery.
Dannie
10 months ago
Gotta love these data privacy questions! Option A sounds like a nice balance between masking sensitive info and keeping the full dataset. Not bad, not bad at all.
Altha
8 months ago
Yeah, it's crucial to follow data privacy requirements while preprocessing customer data.
Agustin
8 months ago
Using Dataflow and the Cloud Data Loss Prevention API seems like a solid approach.
Ruthann
8 months ago
I agree, it's important to balance data privacy with retaining useful information.
Monroe
10 months ago
Hmm, Option B might be the easiest solution, but I'm worried about potentially losing valuable data in the process. Better to err on the side of caution.
Ettie
10 months ago
I agree, it's better to be cautious and not risk losing any valuable information.
Nichelle
10 months ago
I think Option B is the safest choice to protect sensitive data.
Diane
11 months ago
I'm leaning towards Option D. Handling the encryption at the storage level and using federated queries in BigQuery could be more efficient and secure.
Tawna
9 months ago
Definitely. This approach can help protect sensitive data while retaining it for future use cases.
Quiana
9 months ago
It's important to follow the principle of least privilege when sharing encryption keys.
Jina
9 months ago
I agree. Using federated queries in BigQuery can also help maintain data privacy.
Leila
10 months ago
Option D sounds like a good choice. Encrypting the data at the storage level is a secure approach.
Melvin
11 months ago
Option C seems like the way to go. Encrypting the sensitive data while retaining the full dataset for future use is a smart approach.
Lisha
11 months ago
That's a good point, but I still think option A is more secure in terms of protecting sensitive data elements.
Flo
11 months ago
I disagree, I believe option B is better as it uses the Cloud Data Loss Prevention API and Dataflow to detect and remove sensitive fields.
Lisha
11 months ago
I think option A is the best choice because it uses Dataflow and the Cloud Data Loss Prevention API to mask sensitive data.
