Snowflake DSA-C02 Exam - Topic 1 Question 4 Discussion

Actual exam question for Snowflake's DSA-C02 exam
Question #: 4
Topic #: 1

You previously trained a model using a training dataset. You want to detect any data drift in the new data collected since the model was trained.

What should you do?

Suggested Answer: A

To track changing data trends, create a data drift monitor that uses the training data as a baseline and the new data as a target.
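The baseline-versus-target comparison can be sketched in a few lines. This is only an illustrative sketch, not the drift monitor the answer refers to: it assumes a single numeric feature, uses the two-sample Kolmogorov-Smirnov statistic as the drift measure, and the names `ks_statistic` and `has_drifted` and the 0.2 threshold are hypothetical choices made for the example.

```python
from bisect import bisect_right


def ks_statistic(baseline, target):
    """Largest gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    base_sorted = sorted(baseline)
    targ_sorted = sorted(target)
    n, m = len(base_sorted), len(targ_sorted)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    max_gap = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in set(base_sorted) | set(targ_sorted):
        cdf_base = bisect_right(base_sorted, x) / n
        cdf_targ = bisect_right(targ_sorted, x) / m
        max_gap = max(max_gap, abs(cdf_base - cdf_targ))
    return max_gap


def has_drifted(baseline, target, threshold=0.2):
    """Flag drift when the statistic exceeds a threshold (0.2 is an assumed value)."""
    return ks_statistic(baseline, target) > threshold


# Training data is the baseline; data collected since training is the target.
training_feature = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
new_feature = [value + 5 for value in training_feature]
print(has_drifted(training_feature, new_feature))
```

In practice the threshold would be tuned per feature, and a real monitor would compare many features over time rather than one sample pair.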

Model drift (also called model decay) describes how the performance of a model deployed to production degrades on new, unseen data, or how the underlying assumptions about the data change over time.

Drift is important to track once models are deployed to production. Models must be retrained regularly on new data, which is referred to as refitting the model. Retraining can run on a periodic basis or, in an ideal scenario, be triggered when the performance of the model degrades below a predefined threshold.
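A threshold-based retraining trigger might look like the following. It is a minimal sketch under stated assumptions: the function name `should_retrain`, the use of a mean over recent evaluation scores, and the `min_samples` guard are all hypothetical, and the 0.85 threshold is an arbitrary example value.

```python
def should_retrain(recent_scores, threshold, min_samples=3):
    """Return True when the mean of recent scores falls below the threshold.

    recent_scores: model quality scores (for example accuracy) from recent batches.
    min_samples: assumed guard so a single noisy batch does not trigger retraining.
    """
    if len(recent_scores) < min_samples:
        return False
    return sum(recent_scores) / len(recent_scores) < threshold


# Scores above the threshold: keep the current model.
print(should_retrain([0.90, 0.91, 0.89], threshold=0.85))
# Scores below the threshold: retraining is triggered.
print(should_retrain([0.80, 0.82, 0.79], threshold=0.85))
```

A periodic schedule needs no such check, while the triggered approach depends on having labelled recent data to score the model against.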


Vilma
3 months ago
D seems off, how can you ignore new data completely?
upvoted 0 times
...
Jeff
3 months ago
A is definitely the way to go, baseline comparison is crucial.
upvoted 0 times
...
Juliann
4 months ago
Wait, can you really just add new data and expect it to work?
upvoted 0 times
...
Freeman
4 months ago
I think B is better, retraining is key!
upvoted 0 times
...
Theola
4 months ago
Option A sounds right for monitoring data drift.
upvoted 0 times
...
Dierdre
4 months ago
I vaguely remember that correcting outliers is important, but I don't think we should ignore the new data completely. Option D feels off to me.
upvoted 0 times
...
Janet
4 months ago
I feel like adding the new data to the existing dataset could lead to issues. Option C seems risky, but I can't recall the specifics.
upvoted 0 times
...
Paola
4 months ago
I'm not entirely sure, but I remember something about retraining the model. Would option B be the right choice for that?
upvoted 0 times
...
Ressie
5 months ago
I think we need to monitor for data drift, so maybe option A makes sense since it uses the new data as a target.
upvoted 0 times
...
Reiko
5 months ago
I'm a bit confused on the difference between options A and B. Wouldn't retraining the model with just the new data also help detect data drift? I'm not sure if creating a separate dataset is necessary.
upvoted 0 times
...
Benedict
5 months ago
Okay, I think I've got this. The key here is to detect any changes in the data distribution between the original training data and the new data collected. Option A seems like the most comprehensive approach to do that, with the timestamp column to track changes over time.
upvoted 0 times
...
Katina
5 months ago
Hmm, I'm a bit unsure about this one. I'm not sure if creating a new dataset is the best approach, or if we should just be adding the new data to the existing dataset. Maybe option C is the way to go?
upvoted 0 times
...
Vicky
5 months ago
This seems like a straightforward data drift detection question. I think option A is the way to go - creating a new dataset with the new data and a timestamp column, and then using that as the target to compare against the original training dataset.
upvoted 0 times
...
Bette
5 months ago
MD5 and MD4 are both hash functions, not symmetric key algorithms. I'm leaning towards 3DES as the best answer here.
upvoted 0 times
...
Audra
5 months ago
Hmm, the information about the bond 'Bond F' seems relevant, but I'm not sure how to connect that to the question about Revolution Ltd. I'll need to carefully read through the details again.
upvoted 0 times
...