Deal of The Day! Hurry Up, Grab the Special Discount - Save 25% - Ends In 00:00:00 Coupon code: SAVE25
Welcome to Pass4Success

- Free Preparation Discussions

Databricks Exam Databricks-Machine-Learning-Associate Topic 3 Question 26 Discussion

Actual exam question for Databricks's Databricks-Machine-Learning-Associate exam
Question #: 26
Topic #: 3
[All Databricks-Machine-Learning-Associate Questions]

A data scientist wants to use Spark ML to one-hot encode the categorical features in their PySpark DataFrame features_df. A list of the names of the string columns is assigned to the input_columns variable.

They have developed this code block to accomplish this task:

The code block is returning an error.

Which of the following adjustments does the data scientist need to make to accomplish this task?

Show Suggested Answer Hide Answer
Suggested Answer: C

For large datasets, Spark ML uses iterative optimization methods to distribute the training of a linear regression model. Specifically, Spark MLlib employs techniques like Stochastic Gradient Descent (SGD) and Limited-memory Broyden--Fletcher--Goldfarb--Shanno (L-BFGS) optimization to iteratively update the model parameters. These methods are well-suited for distributed computing environments because they can handle large-scale data efficiently by processing mini-batches of data and updating the model incrementally.


Databricks documentation on linear regression: Linear Regression in Spark ML

Contribute your Thoughts:

Thurman
7 days ago
Maybe the error is because they forgot to add the 'sparkly' parameter to the OneHotEncoder. You know, to make it extra fabulous.
upvoted 0 times
...
Shannan
11 days ago
I heard the data scientist tried to one-hot encode their socks. Turns out they were just a bunch of ones and zeros!
upvoted 0 times
...
Alfred
21 days ago
VectorAssembler? Sounds like a superhero name. Maybe that's the solution, but I'm not sure.
upvoted 0 times
...
Rosendo
28 days ago
Ah, I see the issue. The method parameter is missing from the OneHotEncoder. We need to specify that.
upvoted 0 times
Joaquin
9 days ago
User2: Yes, that's correct. That should fix the error.
upvoted 0 times
...
Eden
17 days ago
User1: I think we need to specify the method parameter to the OneHotEncoder.
upvoted 0 times
...
...
Jenelle
1 months ago
Wait, I think we need to use StringIndexer first to convert the string columns to numerical values. Then we can use OneHotEncoder.
upvoted 0 times
Reita
17 days ago
A: I think you're right, we should use StringIndexer first.
upvoted 0 times
...
...
Ligia
1 months ago
I believe they should also use StringIndexer before one-hot encoding the features to properly encode the categorical values.
upvoted 0 times
...
Quentin
1 months ago
Hmm, the error is probably due to the fit operation. Let's try removing that line and see if it works.
upvoted 0 times
Lawrence
14 days ago
User2: Yeah, let's try that and see if it fixes the error.
upvoted 0 times
...
Adaline
23 days ago
User1: I think we should remove the line with the fit operation.
upvoted 0 times
...
...
Essie
1 months ago
I agree with Daisy. Without specifying the method parameter, the code won't work properly.
upvoted 0 times
...
Daisy
2 months ago
I think the data scientist needs to specify the method parameter to the OneHotEncoder.
upvoted 0 times
...

Save Cancel