
Databricks Certified Associate Developer for Apache Spark 3.5 Exam - Topic 7 Question 8 Discussion

Actual exam question for Databricks's Databricks Certified Associate Developer for Apache Spark 3.5 exam
Question #: 8
Topic #: 7
[All Databricks Certified Associate Developer for Apache Spark 3.5 Questions]

An MLOps engineer is building a Pandas UDF that applies a language model that translates English strings into Spanish. The initial code is loading the model on every call to the UDF, which is hurting the performance of the data pipeline.

The initial code is:

import pandas as pd
from pyspark.sql import functions as sf
from pyspark.sql.types import StringType

def in_spanish_inner(df: pd.Series) -> pd.Series:
    # The model is re-loaded on every call to the UDF, i.e. once per batch.
    model = get_translation_model(target_lang='es')
    return df.apply(model)

in_spanish = sf.pandas_udf(in_spanish_inner, StringType())

How can the MLOps engineer change this code to reduce how many times the language model is loaded?

Suggested Answer: D

The provided code defines a Series-to-Series Pandas UDF, so a new instance of the language model is created on every call to the UDF, i.e. once per batch of data. This is inefficient and incurs significant overhead from repeated model initialization.

To reduce the frequency of model loading, the engineer should convert the UDF to an iterator-based Pandas UDF (Iterator[pd.Series] -> Iterator[pd.Series]). This allows the model to be loaded once per task (i.e., once per partition being processed) and reused across all of that partition's batches, rather than once per batch.

From the official Databricks documentation:

"Iterator of Series to Iterator of Series UDFs are useful when the UDF initialization is expensive... For example, loading a ML model once per executor rather than once per row/batch."

--- Databricks Official Docs: Pandas UDFs

Correct implementation looks like:

from typing import Iterator

import pandas as pd
from pyspark.sql.functions import pandas_udf

@pandas_udf('string')
def translate_udf(batch_iter: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # Loaded once per task, then reused for every batch in the iterator.
    model = get_translation_model(target_lang='es')
    for batch in batch_iter:
        yield batch.apply(model)

This refactor ensures get_translation_model() is invoked once per task, not once per batch, significantly improving pipeline performance.
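To see concretely why the iterator form loads the model only once, here is a minimal, Spark-free sketch of the same pattern. The fake_translation_model function and the LOAD_COUNT counter are illustrative stand-ins (not part of the original pipeline) for the expensive get_translation_model call:

```python
from typing import Callable, Iterator

import pandas as pd

LOAD_COUNT = 0  # counts how many times the "model" is initialized

def fake_translation_model() -> Callable[[str], str]:
    """Stand-in for get_translation_model(target_lang='es'); pretend it is expensive."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return lambda s: "es:" + s

def translate_batches(batch_iter: Iterator[pd.Series]) -> Iterator[pd.Series]:
    # Same shape as the iterator-based Pandas UDF: load once, reuse per batch.
    model = fake_translation_model()
    for batch in batch_iter:
        yield batch.apply(model)

# Three batches flow through, but the model is loaded exactly once.
batches = [pd.Series(["hello", "world"]), pd.Series(["good"]), pd.Series(["morning"])]
results = list(translate_batches(iter(batches)))
```

With the naive Series-to-Series version, the load would have happened three times (once per batch). On a real cluster the same once-per-iterator behavior applies within each Spark task, and the UDF would be applied as usual, e.g. df.withColumn('spanish', translate_udf('english')).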


Contribute your Thoughts:

Henriette
6 days ago
Overall, I feel like reducing model loads is key. Any option that does that is worth considering!
upvoted 0 times
...
Francesco
12 days ago
D sounds interesting too. Iterators could help manage memory better.
upvoted 0 times
...
Davida
17 days ago
I’m leaning towards B. Scalar UDFs seem more efficient for single values.
upvoted 0 times
...
Frederica
22 days ago
Yeah, but option C could also work. It might optimize the pipeline better.
upvoted 0 times
...
Rosalyn
27 days ago
This question is tricky! I feel like it tests our understanding of UDFs.
upvoted 0 times
...
Nakita
2 months ago
Wait, why load the model every time? That’s wild!
upvoted 0 times
...
Annice
2 months ago
D) seems like overkill for this problem.
upvoted 0 times
...
Jillian
2 months ago
C) sounds interesting, but will it really help?
upvoted 0 times
...
Mariann
2 months ago
I think B) is the way to go!
upvoted 0 times
...
Bonita
2 months ago
A) Convert to a PySpark UDF for better performance.
upvoted 0 times
...
Dudley
2 months ago
Ah, the joys of optimizing data pipelines! I hope the MLOps engineer has a good sense of humor.
upvoted 0 times
...
Abraham
3 months ago
D) looks interesting, but I'm not sure if it's the most straightforward solution here.
upvoted 0 times
...
Tarra
3 months ago
Hmm, I wonder if the model can be cached somehow to improve performance even further.
upvoted 0 times
...
Rikki
3 months ago
I’m a bit confused about the difference between Series-Scalar and Iterator UDFs. I think one of them might help with the model loading issue, but I can't recall which.
upvoted 0 times
...
Alpha
3 months ago
I practiced a similar question where we had to optimize UDFs, and I feel like using mapInPandas might be the right approach here.
upvoted 0 times
...
Gilma
3 months ago
The key here is to find a way to load the model only once, rather than on every call. I think options B or D are the most promising approaches to achieve that.
upvoted 0 times
...
Cyndy
3 months ago
Option C, running the function in a mapInPandas() call, could also be a good solution. That way the model is only loaded once per partition, which could improve performance.
upvoted 0 times
...
Edward
4 months ago
B) seems like the best option to reduce the number of times the language model is loaded.
upvoted 0 times
...
Gertude
4 months ago
I think converting the Pandas UDF to a PySpark UDF could help, but I need to double-check if that actually reduces model loading times.
upvoted 0 times
...
Derick
4 months ago
I remember we discussed how loading models repeatedly can slow down performance, but I'm not sure which option would best address that.
upvoted 0 times
...
Rolf
4 months ago
I'm leaning towards option D - converting to an Iterator[Series]-Iterator[Series] UDF. That might allow for even more optimization by loading the model once per batch of data.
upvoted 0 times
...
Magda
4 months ago
I think option A is the best. PySpark UDFs load models once, right?
upvoted 0 times
...
Effie
5 months ago
I'm a bit confused by the different UDF types mentioned. I'll need to review the differences between Series-Series, Series-Scalar, and Iterator[Series]-Iterator[Series] UDFs to decide the best approach here.
upvoted 0 times
...
Maryln
5 months ago
This looks like a performance optimization question. I'd start by considering option B - converting the Pandas UDF to a Series-Scalar UDF. That way, the model can be loaded once and reused for each row.
upvoted 0 times
Catherin
1 day ago
I agree, option B seems like the best choice. Reusing the model will save time.
upvoted 0 times
...
...
