Google Professional Data Engineer Exam - Topic 3 Question 100 Discussion

Actual exam question for Google's Professional Data Engineer exam

Question #: 100
Topic #: 3

You are designing storage for two relational tables that are part of a 10-TB database on Google Cloud. You want to support transactions that scale horizontally. You also want to optimize data for range queries on nonkey columns. What should you do?

A. Use Cloud SQL for storage. Add secondary indexes to support query patterns.

B. Use Cloud SQL for storage. Use Cloud Dataflow to transform data to support query patterns.

C. Use Cloud Spanner for storage. Add secondary indexes to support query patterns.

D. Use Cloud Spanner for storage. Use Cloud Dataflow to transform data to support query patterns.

Suggested Answer: C
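
For the question as asked, option C pairs Cloud Spanner (horizontally scalable transactions) with a secondary index on the nonkey column used in range filters. The following is a minimal sketch of that idea, not a definitive setup: the instance and database names, the `Orders` table, and its `OrderId` and `Amount` columns are all hypothetical. The script prints the `gcloud` commands by default; set `DRY_RUN=0` to run them against a real instance.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names, used only for illustration.
INSTANCE="my-instance"
DATABASE="my-database"

# Dry run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "${DRY_RUN}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Secondary index on the nonkey column Amount of an assumed Orders table.
CREATE_INDEX_DDL='CREATE INDEX OrdersByAmount ON Orders(Amount)'

# Range query on the nonkey column, hinting the index explicitly.
RANGE_QUERY='SELECT OrderId, Amount FROM Orders@{FORCE_INDEX=OrdersByAmount} WHERE Amount BETWEEN 100 AND 500'

spanner_index_demo() {
  # Apply the DDL statement that creates the index.
  run gcloud spanner databases ddl update "${DATABASE}" \
    --instance="${INSTANCE}" --ddl="${CREATE_INDEX_DDL}"
  # Run the range query through the index.
  run gcloud spanner databases execute-sql "${DATABASE}" \
    --instance="${INSTANCE}" --sql="${RANGE_QUERY}"
}

spanner_index_demo
```

The `FORCE_INDEX` hint is optional; it is included here only to make the use of the index explicit in the query.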

To re-encrypt all of your CMEK-protected Cloud Storage data after a key has been exposed, and to ensure future writes are protected with a new key, creating a new Cloud KMS key and a new Cloud Storage bucket is the best approach. Here's why option C is the best choice:

Re-encryption of Data:

By creating a new Cloud Storage bucket and copying all objects from the old bucket to the new bucket while specifying the new Cloud KMS key, you ensure that all data is re-encrypted with the new key.

This process effectively re-encrypts the data, removing any dependency on the compromised key.

Ensuring CMEK Protection:

Creating a new bucket and setting the new CMEK as the default ensures that all future objects written to the bucket are automatically protected with the new key.

This reduces the risk of objects being written without CMEK protection.

Deletion of Compromised Key:

Once the data has been copied and re-encrypted, the compromised key's versions can be disabled and then destroyed in Cloud KMS, eliminating the risk associated with the compromised key.

Steps to Implement:

Create a New Cloud KMS Key:

Create a new encryption key in Cloud KMS to replace the compromised key.

Create a New Cloud Storage Bucket:

Create a new Cloud Storage bucket and set the default CMEK to the new key.

Copy and Re-encrypt Data:

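Use the gsutil tool to copy data from the old bucket to the new bucket while specifying the new key (the key resource name here uses placeholder values):

gsutil -o 'GSUtil:encryption_key=projects/PROJECT_ID/locations/LOCATION/keyRings/KEY_RING/cryptoKeys/NEW_KEY' cp -r gs://old-bucket/* gs://new-bucket/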

Destroy the Old Key:

After ensuring all data is copied and re-encrypted, disable and then destroy the compromised key's versions in Cloud KMS.
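
The steps above can be sketched as one script. This is an illustration under stated assumptions rather than a definitive procedure: the project, location, key ring, key, key version, and bucket names are all placeholders, and the script only prints the commands unless `DRY_RUN=0` is set. It assumes the key ring already exists and that the bucket and key share a location.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder values (assumptions); replace with real resource names.
PROJECT="my-project"
LOCATION="us-central1"
KEY_RING="my-key-ring"
NEW_KEY="new-key"
OLD_KEY="old-key"
OLD_KEY_VERSION="1"
OLD_BUCKET="gs://old-bucket"
NEW_BUCKET="gs://new-bucket"
NEW_KEY_NAME="projects/${PROJECT}/locations/${LOCATION}/keyRings/${KEY_RING}/cryptoKeys/${NEW_KEY}"

# Dry run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "${DRY_RUN}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

reencrypt_steps() {
  # 1. Create the replacement key in the existing key ring.
  run gcloud kms keys create "${NEW_KEY}" --keyring "${KEY_RING}" \
    --location "${LOCATION}" --purpose encryption --project "${PROJECT}"

  # 2. Create the new bucket, let the Cloud Storage service agent use
  #    the key, and set the key as the bucket default.
  run gsutil mb -p "${PROJECT}" -l "${LOCATION}" "${NEW_BUCKET}"
  run gsutil kms authorize -p "${PROJECT}" -k "${NEW_KEY_NAME}"
  run gsutil kms encryption -k "${NEW_KEY_NAME}" "${NEW_BUCKET}"

  # 3. Copy the objects; they are written under the new bucket's default key.
  #    The wildcard is quoted so gsutil, not the local shell, expands it.
  run gsutil -m cp -r "${OLD_BUCKET}/*" "${NEW_BUCKET}/"

  # 4. Only after verifying the copy, destroy the compromised key version.
  run gcloud kms keys versions destroy "${OLD_KEY_VERSION}" --key "${OLD_KEY}" \
    --keyring "${KEY_RING}" --location "${LOCATION}" --project "${PROJECT}"
}

reencrypt_steps
```

Verify the copy before step 4, for example by comparing object listings of the two buckets, because data still encrypted with a destroyed key version cannot be recovered.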

by Arleen at Dec 08, 2024, 07:57 PM

Contribute your Thoughts:

Sunshine

6 days ago

I think Cloud SQL can handle this too, though.

upvoted 0 times

...

Janet

12 days ago

Cloud Spanner is great for horizontal scaling!

upvoted 0 times

...

Danica

17 days ago

I lean towards Cloud Spanner for this scenario, especially since it supports transactions and has good indexing options.

upvoted 0 times

...

Laila

23 days ago

I feel like using Cloud Dataflow could be useful, but I can't recall if it directly helps with optimizing range queries.

upvoted 0 times

...

Fletcher

28 days ago

I think Cloud SQL might not handle the scale we need for a 10-TB database, but I practiced a question about using secondary indexes with it.

upvoted 0 times

...

Virgilio

1 month ago

I remember that Cloud Spanner is designed for horizontal scaling, but I'm not sure if secondary indexes are the best approach for range queries.

upvoted 0 times

...

Marylyn

1 month ago

I'm not sure about using Cloud Dataflow to transform the data. That seems like an extra step that might not be necessary. I'll need to weigh the pros and cons of the different approaches.

upvoted 0 times

...

Aimee

1 month ago

I'm pretty confident I can solve this. The key is to use a database service that supports both transactions and secondary indexes, which points me towards Cloud Spanner.

upvoted 0 times

...

Oneida

1 month ago

Okay, I think I've got a strategy. I'll focus on the need for horizontal scaling and optimizing for range queries. That should help me narrow down the options.

upvoted 0 times

...

Loreta

1 month ago

Hmm, I'm a bit confused about the differences between Cloud SQL and Cloud Spanner. I'll need to review the key features of each to determine the best fit.

upvoted 0 times

...

Maira

1 month ago

This looks like a tricky question. I'll need to carefully consider the requirements around scaling, transactions, and optimizing for range queries.

upvoted 0 times

...

Sage

6 months ago

Wait, we're supposed to optimize for range queries? I thought we were designing storage for a dance club database. Guess I better re-read the question.

upvoted 0 times

...

Nickolas

6 months ago

Yo, what's up with these options? It's like they're trying to trick us or something. I'm just going to go with the one that sounds the most 'cloud-y'.

upvoted 0 times

Raymon

5 months ago

True, we need to optimize data for those nonkey columns.

upvoted 0 times

...

Arthur

5 months ago

But adding secondary indexes is important for range queries.

upvoted 0 times

...

Lajuana

5 months ago

Yeah, Cloud Spanner seems like a good choice for horizontal scaling.

upvoted 0 times

...

Arminda

5 months ago

I think option C sounds the most 'cloud-y'.

upvoted 0 times

...

Daisy

6 months ago

Cloud Dataflow, huh? Sounds like a lot of extra work to me. I'd rather just let Cloud Spanner do its thing and keep things simple.

upvoted 0 times

Kimberely

5 months ago

True, but it might be worth the extra work for better performance.

upvoted 0 times

...

Magdalene

5 months ago

But Cloud Dataflow could help optimize data for range queries on nonkey columns.

upvoted 0 times

...

Shawnee

5 months ago

I agree, Cloud Spanner seems like the easier option.

upvoted 0 times

...

Devorah

6 months ago

I'm all about that Cloud SQL life. With some secondary indexes, it can totally handle those query patterns. Plus, it's a familiar relational database, so I'm comfortable with it.

upvoted 0 times

Billy

5 months ago

I agree, Cloud SQL with secondary indexes is a solid choice for optimizing data.

upvoted 0 times

...

Brock

5 months ago

Yeah, Cloud SQL is familiar and can handle it well.

upvoted 0 times

...

Goldie

6 months ago

Cloud SQL with secondary indexes is the way to go for those query patterns.

upvoted 0 times

...

Keshia

7 months ago

Cloud Spanner for the win! It's the perfect choice for a 10-TB database that needs horizontal scalability and support for range queries. Secondary indexes are the way to go.

upvoted 0 times