A data scientist is clustering a data set but does not want to specify the number of clusters present. Which of the following algorithms should the data scientist use?
DBSCAN discovers clusters based on density without requiring you to predefine the number of clusters, automatically finding arbitrarily shaped groups and identifying noise points.
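The density-based idea can be illustrated with a minimal pure-Python sketch. This is a simplified teaching version, not a reference implementation; the sample points and the `eps` and `min_pts` values are made up for illustration. In practice a library implementation such as scikit-learn's `DBSCAN` would normally be used.

```python
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)      # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later be relabeled as a border point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels

# Hypothetical data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Note that the number of clusters (two here) is never passed in; it falls out of the density parameters, and the isolated point is labeled as noise.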
A data scientist is developing a model to predict the outcome of a vote for a national mascot. The choice is between tigers and lions. The full data set represents feedback from individuals representing 17 professions and 12 different locations. The following rank aggregation represents 80% of the data set:
Which of the following is the most likely concern about the model's ability to predict the outcome of the vote?
The aggregated feedback covers only 80% of the data set, drawn mostly from a few professions and locations, so the model has not "seen" the remaining 20% or the underrepresented groups. Its performance on that unseen, out-of-sample data is therefore the primary concern for how well it will predict the actual vote.
Which of the following types of layers is used to downsample feature detection when using a convolutional neural network?
Pooling layers (such as max pooling or average pooling) reduce the spatial dimensions of the feature maps by summarizing local neighborhoods, effectively downsampling the detected features and controlling overfitting.
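As a rough illustration, 2x2 max pooling with stride 2 can be sketched in plain Python. The function name `max_pool_2x2` and the sample feature map are hypothetical; deep learning frameworks provide their own pooling layers.

```python
def max_pool_2x2(feature_map):
    """Downsample a 2D feature map by taking the max of each 2x2 block (stride 2).

    A trailing row or column is dropped if a dimension is odd.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]

# Hypothetical 4x4 feature map produced by a convolutional layer.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [7, 2, 9, 4],
    [1, 0, 3, 5],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[6, 2], [7, 9]]
```

The 4x4 input becomes a 2x2 output, so each spatial dimension is halved while the strongest activation in each neighborhood is kept.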
A data scientist is analyzing a data set with categorical features and would like to make those features more useful when building a model. Which of the following data transformation techniques should the data scientist use? (Choose two.)
One-hot encoding creates binary indicator columns for each category, allowing models to treat nominal categories without implying any order.
Label encoding maps each category to an integer label, which can be useful for tree-based models or when you need a single numeric column. Because the integers imply an order that may not exist in the data, make sure the chosen algorithm does not treat them as ordinal.
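Both transformations can be sketched with the standard library alone. The `colors` column is made-up example data, and the alphabetical category order is an arbitrary choice for this sketch; libraries such as pandas and scikit-learn offer equivalent encoders.

```python
# Hypothetical categorical feature.
colors = ["red", "green", "blue", "green"]

# Fix a category order (alphabetical here, chosen arbitrarily).
categories = sorted(set(colors))            # ['blue', 'green', 'red']

# Label encoding: one integer per category, in a single column.
label_map = {cat: i for i, cat in enumerate(categories)}
label_encoded = [label_map[c] for c in colors]

# One-hot encoding: one binary indicator column per category.
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

print(label_encoded)  # [2, 1, 0, 1]
print(one_hot)        # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

The label-encoded column suggests that blue < green < red, which is meaningless for colors; the one-hot version avoids that at the cost of extra columns.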
Given matrix
Which of the following is A^T?
A)
B)
C)
D)
Transposing swaps rows and columns, so the (i, j) entry becomes the (j, i) entry.
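The rule can be checked on a small example. The matrix `A` below is a hypothetical 2x3 matrix chosen for illustration, not the matrix from the question.

```python
def transpose(matrix):
    """Return the transpose: entry (i, j) of the input becomes entry (j, i)."""
    return [list(column) for column in zip(*matrix)]

# Hypothetical 2x3 matrix.
A = [
    [1, 2, 3],
    [4, 5, 6],
]
A_T = transpose(A)
print(A_T)  # [[1, 4], [2, 5], [3, 6]]
```

A 2x3 matrix becomes a 3x2 matrix: the first row of `A` is the first column of `A_T`.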