A data scientist wants to explore the Spark DataFrame spark_df. The exploration should include visual histograms showing the distribution of the numeric features.
Which of the following lines of code can the data scientist run to accomplish the task?
The Databricks utility function dbutils.data.summarize displays summary statistics for a Spark DataFrame, including visual histograms of the numeric features. It is available inside a Databricks notebook.
Correct code:
dbutils.data.summarize(spark_df)
Other options, such as spark_df.describe() and spark_df.summary(), return statistical summaries as text only and do not include visual histograms.
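The difference between the two approaches can be sketched as a small helper. This is only an illustration, not part of the original answer: the function name explore and its return strings are hypothetical, and it assumes dbutils is passed in explicitly so the same code can run outside Databricks, where dbutils does not exist.

```python
def explore(spark_df, dbutils=None):
    """Profile spark_df and return the name of the path that was used.

    `explore` is a hypothetical helper written for this example.
    """
    if dbutils is not None:
        # Inside a Databricks notebook: renders summary statistics
        # together with visual histograms.
        dbutils.data.summarize(spark_df)
        return "summarize"
    # Outside Databricks: text-only statistics
    # (count, mean, stddev, min, quartiles, max), no histograms.
    spark_df.summary().show()
    return "summary"
```

In a Databricks notebook this would be called as explore(spark_df, dbutils); anywhere else, explore(spark_df) falls back to the text-only summary.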
Databricks Utilities Documentation