Databricks Certified Professional Data Scientist Exam - Topic 4 Question 73 Discussion

Actual exam question from the Databricks Certified Professional Data Scientist exam

Question #: 73
Topic #: 4


What is the best way to evaluate the quality of the model found by an unsupervised algorithm like k-means clustering, given metrics for the cost of the clustering (how well it fits the data) and its stability (how similar the clusters are across multiple runs over the same data)?

A. The lowest cost clustering subject to a stability constraint

B. The lowest cost clustering

C. The most stable clustering subject to a minimal cost constraint

D. The most stable clustering

There is a tradeoff between cost and stability in unsupervised learning. The more tightly you fit the data, the less stable the model will be, and vice versa. The idea is to find a good balance, with more weight given to cost. A typical approach is to set a stability threshold and, among the models whose stability meets that threshold, select the one with the lowest cost.


Suggested Answer: A
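
The selection rule in option A (lowest cost, subject to a stability threshold) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` class, the `select_clustering` function, the threshold value, and all numbers are hypothetical and do not come from the exam material. In practice the cost might be the k-means within-cluster sum of squares and the stability might be an average agreement score across reruns, but any pair of metrics with "lower cost is better" and "higher stability is better" would work the same way.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One fitted clustering, summarized by the two metrics from the question."""
    name: str         # hypothetical label, e.g. "k=3"
    cost: float       # lower is better (e.g. within-cluster sum of squares)
    stability: float  # higher is better (e.g. agreement across reruns, 0..1)


def select_clustering(candidates: Iterable[Candidate],
                      min_stability: float) -> Optional[Candidate]:
    """Return the lowest-cost candidate whose stability meets the threshold.

    Returns None when no candidate is stable enough.
    """
    # Step 1: apply the stability constraint.
    stable = [c for c in candidates if c.stability >= min_stability]
    if not stable:
        return None
    # Step 2: among the stable candidates, minimize cost.
    return min(stable, key=lambda c: c.cost)


# Made-up numbers for illustration only.
candidates = [
    Candidate("k=2", cost=950.0, stability=0.97),
    Candidate("k=3", cost=610.0, stability=0.91),
    Candidate("k=5", cost=420.0, stability=0.84),
    Candidate("k=8", cost=300.0, stability=0.55),
]

best = select_clustering(candidates, min_stability=0.80)
print(best.name if best else "no candidate is stable enough")
```

With these made-up numbers, "k=8" has the lowest cost overall but fails the stability threshold, so the rule picks "k=5". Option B would have picked "k=8", and option D would have picked "k=2".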

by Tamera at Sep 17, 2024, 04:29 PM


Nobuko

4 months ago

D sounds right, but I wonder if it's too subjective.


Amie

5 months ago

Wait, is it really that simple?


Glenn

5 months ago

I think C makes more sense for real-world applications.


Tesha

5 months ago

Definitely A! Stability matters!


Carmela

5 months ago

A balance between cost and stability is key!


Wilda

6 months ago

I recall that tradeoff between cost and stability being important. Option D sounds familiar, but I’m not entirely sure if it’s the right answer.


Celestina

6 months ago

I feel like the most stable clustering might be the best choice, but I’m not confident if it should be subject to a cost constraint.


Terry

6 months ago

I think we practiced a similar question where we had to balance cost and stability. I lean towards option A, but I’m a bit uncertain about the stability constraint.


Leota

6 months ago

I remember discussing how stability is crucial in unsupervised learning, but I'm not sure if we should prioritize cost or stability in this case.


Brendan

6 months ago

Okay, I think I've got it. The key is to find the lowest cost clustering that still meets a minimum stability constraint. That seems like the best way to get a high-quality model.


Mitsue

6 months ago

Hmm, I'm a bit confused. I'm not sure how to balance those two metrics. Maybe I should review the lecture notes on this.


Felicidad

6 months ago

This is a tricky one. I'll need to think carefully about the tradeoff between cost and stability.


Olen

6 months ago

I'm feeling pretty confident about this one. The most stable clustering subject to a minimal cost constraint seems like the right approach to me.


Goldie

6 months ago

I'm pretty sure trend analysis is part of the Problem Management process, so I'll go with option B.


Stefanie

2 years ago

Option C all the way. I'd rather have a super stable model, even if it's not the absolute lowest cost. Stability is key in unsupervised learning!


Kris

2 years ago

I believe the optimal approach is to set a stability threshold and select the model that achieves the lowest cost above that threshold. This way we balance cost and stability effectively.


Sharen

2 years ago

Definitely option A. The stability of the clusters is just as important as the cost, so we need to consider both factors.


Eladia

1 year ago

Exactly, setting a stability threshold can help us prioritize the cost while ensuring the clusters are stable.


Jaleesa

1 year ago

I agree, it's important to find that balance. We don't want a model that fits the data perfectly but is not stable.


Evangelina

1 year ago

Option A is the best choice. We need to balance cost and stability in the clustering model.


Gearldine

2 years ago

I agree, but we also need to consider stability. Maybe the most stable clustering subject to a minimal cost constraint?


Rhea

2 years ago

I think the best way to evaluate the quality of the model is to find the lowest cost clustering subject to a stability constraint. That seems like the most balanced approach to me.


Leila

2 years ago

This is a tough one, but I'm going with the lowest cost clustering subject to a stability constraint. Ain't no point in having super stable clusters if they don't fit the data well, am I right?


Hana

2 years ago

I think the best way is to choose the lowest cost clustering.


Casie

2 years ago

Hold up, what about that tradeoff though? I reckon the best answer is the one that balances cost and stability, like the question says. Gotta find that sweet spot, you know?


Arletta

1 year ago

A) The lowest cost clustering subject to a stability constraint


Velda

1 year ago

Hold up, what about that tradeoff though? I reckon the best answer is the one that balances cost and stability, like the question says. Gotta find that sweet spot, you know?


Shawnta

1 year ago

There is a tradeoff between cost and stability in unsupervised learning. The more tightly you fit the data, the less stable the model will be, and vice versa. The idea is to find a good balance with more weight given to the cost. Typically a good approach is to set a stability threshold and select the model that achieves the lowest cost above the stability threshold.


Delbert

2 years ago

A) The lowest cost clustering subject to a stability constraint


Bettina

2 years ago

Nah man, you gotta consider stability too. The most stable clustering is the way to go, even if the cost is a bit higher. You don't want your clusters changing all the time, that's just confusing.


Sheldon

2 years ago

I think the best way is to go with the lowest cost clustering, because that's the whole point of k-means, right? Stability is overrated. Just give me the clusters that fit the data the best!


Adela

1 year ago

I think the best way is to go with the lowest cost clustering, because that's the whole point of k-means, right? Stability is overrated. Just give me the clusters that fit the data the best!



Deane

2 years ago

A) The lowest cost clustering
