Databricks Machine Learning Associate Exam - Topic 4 Question 38 Discussion

Actual exam question from the Databricks Machine Learning Associate exam

Question #: 38
Topic #: 4

A data scientist has defined a Pandas UDF function predict to parallelize the inference process for a single-node model:

They have written the following incomplete code block to use predict to score each record of Spark DataFrame spark_df:

Which of the following lines of code can be used to complete the code block to successfully complete the task?

A. predict(*spark_df.columns)

B. mapInPandas(predict)

C. predict(Iterator(spark_df))

D. mapInPandas(predict(spark_df.columns))

E. predict(spark_df.columns)

Suggested Answer: B

The mapInPandas method applies a function to a Spark DataFrame in batches. The function receives an iterator of pandas DataFrames, each holding a slice of one partition, and yields pandas DataFrames of results, so a single-node model can score every record in parallel across the cluster. Option B passes predict itself as the function to apply, whereas the other options call predict directly on column names or on the DataFrame. In PySpark, mapInPandas also takes a second argument, the schema of the returned rows, which the option text does not show. Reference:

PySpark DataFrame documentation (Using mapInPandas with UDFs).

by Trina at Jan 05, 2026, 05:16 AM
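The question's original code blocks are not reproduced on this page, so the following is only a minimal sketch of how a function in the shape mapInPandas expects might look. The `MeanModel` class, the `prediction` column name, and the `score` helper are hypothetical stand-ins, not the exam's code; a real `predict` would load a trained model instead.

```python
from typing import Iterator

import pandas as pd


class MeanModel:
    """Hypothetical stand-in for a trained single-node model."""

    def predict(self, features: pd.DataFrame):
        # Toy rule: the prediction is the row-wise mean of the features.
        return features.mean(axis=1).to_numpy()


def predict(batches: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
    # Build (or load) the model once per task rather than once per batch.
    model = MeanModel()
    for batch in batches:
        # Each batch is a pandas DataFrame holding a slice of one partition.
        yield pd.DataFrame({"prediction": model.predict(batch)})


def score(spark_df):
    # mapInPandas takes the function itself plus the schema of the rows
    # that the function yields (assumed here: one double column).
    return spark_df.mapInPandas(predict, schema="prediction double")
```

Because `predict` is an ordinary generator over pandas DataFrames, it can be checked locally without a cluster, for example with `list(predict(iter([some_pandas_df])))`, before handing it to `score` with a real Spark DataFrame.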

Johnetta

2 months ago

D looks tempting, but I think it complicates things. B is clearer.

upvoted 0 times

Nenita

2 months ago

I lean towards B as well. It seems to fit the context better.

upvoted 0 times

Luther

2 months ago

I feel like A could work too, but not sure.

upvoted 0 times

Harrison

3 months ago

This question is tricky! I think B is the best choice.

upvoted 0 times

Alease

3 months ago

I agree with B, it just makes more sense!

upvoted 0 times

Cordie

3 months ago

Definitely not D, that looks wrong.

upvoted 0 times

Jordan

3 months ago

Wait, can you really use predict like that? Sounds off.

upvoted 0 times

Kristeen

3 months ago

Nah, I’m leaning towards A.

upvoted 0 times

Carri

3 months ago

I think option B is the right choice!

upvoted 0 times

Audra

4 months ago

This is a classic data science problem. I bet the answer is B) mapInPandas(predict).

upvoted 0 times

Kaitlyn

4 months ago

I love how they're trying to trick us with these Pandas UDF questions. B) is definitely the way to go.

upvoted 0 times

Tamra

4 months ago

Haha, this question is a real brain-teaser! I'm going to go with B) just to be safe.

upvoted 0 times

Danica

5 months ago

Hmm, I'm not sure about this one. I'm leaning towards E) predict(spark_df.columns), but I could be wrong.

upvoted 0 times

Peter

5 months ago

The question is pretty straightforward. I think B is the way to go.

upvoted 0 times

Marisha

5 months ago

B) mapInPandas(predict) is the correct answer.

upvoted 0 times

Hildegarde

5 months ago

I vaguely recall something about using `Iterator` with UDFs, but I can't remember if `predict(Iterator(spark_df))` is the right syntax.

upvoted 0 times

Kara

5 months ago

I feel like `mapInPandas(predict)` is the most straightforward option, but I need to double-check if that's how we should apply the function.

upvoted 0 times

Iluminada

5 months ago

I think `predict(*spark_df.columns)` seems like it could work, but I’m not entirely confident about how the arguments are being passed.

upvoted 0 times

Gaston

6 months ago

I remember we practiced using `mapInPandas` with UDFs, but I'm not sure if it's the right choice here.

upvoted 0 times

Zena

6 months ago

This is a good test of my understanding of Pandas UDFs and Spark DataFrame operations. I'll need to think carefully about the syntax and how the predict function is expected to be used.

upvoted 0 times

Kyoko

6 months ago

I'm feeling pretty confident about this one. The question is asking us to complete the code block, so I'm guessing one of these options is the correct way to call the predict function.

upvoted 0 times

Carin

6 months ago

Okay, I think I've got a strategy. The key is to figure out how to properly pass the Spark DataFrame columns to the predict function. Let me try a few of these options and see which one works.

upvoted 0 times

Adelaide

6 months ago

Hmm, I'm a bit confused about the Pandas UDF function and how it's supposed to be used here. I'll need to review my notes on Spark DataFrame transformations.

upvoted 0 times
