Databricks Certified Professional Data Scientist Exam - Topic 4 Question 50 Discussion

Actual exam question for Databricks's Databricks Certified Professional Data Scientist exam
Question #: 50
Topic #: 4

You are asked to create a model to predict the total number of monthly subscribers for a specific magazine. You are provided with 1 year's worth of subscription and payment data, user demographic data, and 10 years' worth of content of the magazine (articles and pictures). Which algorithm is the most appropriate for building a predictive model for subscribers?

A) Linear regression
B) Logistic regression
C) Decision trees
D) TF-IDF

Suggested Answer: B

Judy
3 months ago
Logistic regression is for binary outcomes, not this!
upvoted 0 times
...
Julene
3 months ago
TF-IDF? That seems off for subscriber prediction.
upvoted 0 times
...
Chantay
3 months ago
Wait, why not decision trees? They handle complex data well.
upvoted 0 times
...
Louann
4 months ago
Totally agree, it’s straightforward for predicting numbers!
upvoted 0 times
...
Darci
4 months ago
I think linear regression makes the most sense here.
upvoted 0 times
...
Lashandra
4 months ago
TF-IDF sounds like it's more for text analysis, not really for predicting subscriber counts. I guess decision trees could work too, but I need to think more about it.
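To make the point about TF-IDF concrete, here is a minimal sketch in plain Python. The three toy "articles" and the `tf_idf` helper are made up for illustration and are not from the exam material. It shows that TF-IDF only assigns a weight to a term in a document; it produces features, not a subscriber forecast.

```python
import math

# Hypothetical magazine articles, already split into words (toy data).
docs = [
    "election coverage and politics".split(),
    "summer travel and recipes".split(),
    "politics and travel photos".split(),
]

def tf_idf(term, doc, corpus):
    """Weight of `term` in `doc`: term frequency times inverse document frequency."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)  # number of documents containing the term
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

common_weight = tf_idf("and", docs[0], docs)    # term appears in every document
rare_weight = tf_idf("recipes", docs[1], docs)  # term appears in one document
```

A word that appears in every article gets weight zero, while a rare word gets a positive weight. Those weights could feed a regression model as inputs, but on their own they do not predict a count.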
upvoted 0 times
...
Bernardo
4 months ago
Logistic regression seems off for this question since we're not dealing with a binary outcome. I feel like linear regression might fit better.
upvoted 0 times
...
Nina
4 months ago
I'm not so sure about that. I remember practicing with decision trees for similar questions, but I can't recall if they were better for this type of data.
upvoted 0 times
...
Carma
5 months ago
I think linear regression could be a good choice since we're predicting a continuous variable, right?
upvoted 0 times
...
Wade
5 months ago
This is a tricky one. I'd want to explore the data a bit more before deciding on the algorithm. Linear regression seems like a good starting point, but the textual data might require some feature engineering or a more complex model like a neural network. I'd need to do some research on the best approaches for this type of problem.
upvoted 0 times
...
Wilda
5 months ago
TF-IDF? That's for text analysis, not predicting subscriber numbers. I think I'd rule that one out. Logistic regression could be an option if we're trying to predict whether someone will subscribe or not, but the question is asking for the total number, so I'm leaning more towards linear regression.
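A rough sketch of why the output type matters, assuming made-up weights (`w`, `b`) rather than anything fitted to real data: a logistic model squashes its score into the range (0, 1), so it reads as a probability, while a linear model returns an unbounded number that can stand for a subscriber total.

```python
import math

def sigmoid(z):
    """Map any real score to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_predict(x, w, b):
    """Probability-style output, e.g. 'will this user subscribe?'."""
    return sigmoid(w * x + b)

def linear_predict(x, w, b):
    """Unbounded numeric output, e.g. 'how many subscribers next month?'."""
    return w * x + b

# Hypothetical coefficients chosen only for illustration.
p_subscribe = logistic_predict(13, w=0.5, b=-3.0)
total_subscribers = linear_predict(13, w=50.0, b=1000.0)
```

With these assumed numbers, `p_subscribe` stays below 1 no matter how large the input grows, while `total_subscribers` comes out as 1650.0.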
upvoted 0 times
...
Robt
5 months ago
I'm a bit unsure about this one. The data includes both numerical (subscription/payment) and unstructured (articles/pictures) features, so I'm not sure if linear regression is the best fit. Maybe a decision tree model could handle the mixed data types better.
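For reference, a decision tree can also output a number: a regression tree splits on a feature threshold and predicts the mean of each side. Below is a minimal one-split sketch (a "stump") on invented data; `fit_stump` and `predict_stump` are hypothetical helpers, not a library API.

```python
def fit_stump(xs, ys):
    """Find the threshold on x that minimises squared error of per-side means."""
    best = None
    for t in sorted(set(xs))[:-1]:  # skip the largest value so the right side is never empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]

def predict_stump(x, stump):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

# Toy data: subscriber totals jump after month 3 (made-up numbers).
months = [1, 2, 3, 4, 5, 6]
subscribers = [100, 100, 100, 200, 200, 200]
stump = fit_stump(months, subscribers)
```

A full tree repeats this split recursively on each side, which is how it can capture step changes that a straight line would miss.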
upvoted 0 times
...
Sheron
5 months ago
This seems like a classic regression problem, so I'd probably go with linear regression. The question mentions predicting a continuous variable (total number of subscribers), which is what linear regression is designed for.
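As a minimal sketch of that idea, assuming a single feature (the month index) and invented subscriber totals, ordinary least squares can be fitted in a few lines of plain Python. The `fit_line` helper is hypothetical; a real model would use more features from the subscription and demographic data.

```python
# Hypothetical monthly subscriber totals for one year (toy, perfectly linear data).
months = list(range(1, 13))
subscribers = [1000 + 50 * m for m in months]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(months, subscribers)
forecast_month_13 = intercept + slope * 13  # extrapolate one month ahead
```

On this toy series the fit recovers a slope of 50 and an intercept of 1000, so the month-13 forecast is 1650 subscribers.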
upvoted 0 times
...
Earleen
9 months ago
Logistic regression? More like illogical regression, am I right? *crickets* Tough crowd...
upvoted 0 times
Cory
8 months ago
C) Decision trees
upvoted 0 times
...
Laquita
9 months ago
B) Logistic regression
upvoted 0 times
...
Leandro
9 months ago
A) Linear regression
upvoted 0 times
...
...
Crissy
10 months ago
I'm just going to go with the most complicated-sounding option, Decision trees. That way, at least I'll sound smart, even if I'm totally wrong. *nervous laughter*
upvoted 0 times
...
Horace
10 months ago
Hold up, are we sure we can't use a combination of algorithms? Like a hybrid model or something? Seems like we have enough data to get creative here.
upvoted 0 times
Tegan
8 months ago
Definitely, combining different algorithms could help improve the accuracy of our predictive model.
upvoted 0 times
...
Malcolm
8 months ago
That's a good point, a hybrid model could be a great idea with the amount of data we have.
upvoted 0 times
...
Jeanice
8 months ago
We could potentially use a combination of algorithms to create a hybrid model.
upvoted 0 times
...
Annice
8 months ago
D) TF-IDF
upvoted 0 times
...
Enola
8 months ago
C) Decision trees
upvoted 0 times
...
Angelyn
9 months ago
B) Logistic regression
upvoted 0 times
...
Jimmie
9 months ago
A) Linear regression
upvoted 0 times
...
...
Lettie
10 months ago
TF-IDF? Really? That's for text analysis, not subscription prediction. I'm going to have to go with logistic regression for this one.
upvoted 0 times
Kati
9 months ago
Linear regression might not capture the complexity of subscriber prediction as well as logistic regression or decision trees.
upvoted 0 times
...
Amie
9 months ago
I think decision trees could also work well for this type of prediction.
upvoted 0 times
...
Dierdre
10 months ago
I agree, TF-IDF is not suitable for subscription prediction. Logistic regression seems like the best choice.
upvoted 0 times
...
...
Sommer
10 months ago
I would go with Logistic regression, as it can predict binary outcomes like whether a user will subscribe or not.
upvoted 0 times
...
Celeste
10 months ago
I agree with Angelyn: decision trees can handle both numerical and categorical data.
upvoted 0 times
...
Therese
10 months ago
Linear regression seems like the obvious choice to me. We're trying to predict a continuous variable, so a linear model should work well here.
upvoted 0 times
Demetra
9 months ago
Decision trees could also work well, especially with the demographic and content data we have.
upvoted 0 times
...
Albina
9 months ago
I agree, linear regression is a good choice for predicting continuous variables.
upvoted 0 times
...
Tula
9 months ago
Decision trees could also be a good option, especially with the amount of data we have available.
upvoted 0 times
...
Emelda
10 months ago
I agree, linear regression is a good choice for predicting the total number of monthly subscribers.
upvoted 0 times
...
...
Frank
11 months ago
I think decision trees would be the best choice here. With the diverse data we have, a decision tree model can capture the nuanced relationships between the variables and predict the subscriber count.
upvoted 0 times
...
Angelyn
11 months ago
I think Decision trees would be the most appropriate algorithm.
upvoted 0 times
...
