Databricks Exam Databricks Certified Associate Developer for Apache Spark 3.0 Topic 1 Question 36 Discussion


Question #: 36
Topic #: 1


Which of the following DataFrame methods is classified as a transformation?

A. DataFrame.count()

B. DataFrame.show()

C. DataFrame.select()

D. DataFrame.foreach()

E. DataFrame.first()


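Suggested Answer: C

DataFrame.select() is a transformation: it is evaluated lazily and returns a new DataFrame. DataFrame.count(), DataFrame.show(), DataFrame.foreach(), and DataFrame.first() are actions that trigger a computation.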
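
As a rough illustration of how a transformation differs from an action, here is a minimal plain-Python sketch. LazyFrame and execution_log are hypothetical names invented for this example; this is a toy model, not Spark's API or implementation. It only mimics the idea that select() records a step and returns a new frame, while count() and first() actually run the recorded steps.

```python
execution_log = []  # one entry is appended every time a plan is actually run


class LazyFrame:
    """Toy stand-in for a DataFrame (hypothetical, not a Spark class)."""

    def __init__(self, rows, plan=()):
        self._rows = rows   # source data: a list of dicts
        self._plan = plan   # recorded steps that have not been executed yet

    def select(self, *cols):
        # Transformation: record a step and return a new LazyFrame.
        # No data is touched here.
        def step(rows):
            return [{c: r[c] for c in cols} for r in rows]
        return LazyFrame(self._rows, self._plan + (step,))

    def _run(self):
        # Execute every recorded step, in order.
        execution_log.append(len(self._plan))
        rows = self._rows
        for step in self._plan:
            rows = step(rows)
        return rows

    def count(self):
        # Action: runs the plan and returns a plain int, not a frame.
        return len(self._run())

    def first(self):
        # Action: runs the plan and returns the first row (or None).
        rows = self._run()
        return rows[0] if rows else None


rows = [{"storeId": 1, "value": 10}, {"storeId": 2, "value": 20}]
df = LazyFrame(rows)

selected = df.select("storeId")          # nothing has run yet
ran_after_select = len(execution_log)

n = selected.count()                     # the plan runs now
ran_after_count = len(execution_log)
```

In this sketch, select() hands back another LazyFrame and leaves execution_log empty, whereas count() returns a number and adds an entry to the log. That mirrors the usual rule of thumb: a method that returns a new DataFrame without computing anything is a transformation, and a method that returns a value or produces a side effect is an action.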

transactionsDf.select('storeId').dropDuplicates().count()

Correct! After dropping all duplicates from column storeId, the remaining rows get counted, representing the number of unique values in the column.

transactionsDf.select(count('storeId')).dropDuplicates()

No. transactionsDf.select(count('storeId')) just returns a single-row DataFrame showing the number of non-null rows. dropDuplicates() does not have any effect in this context.

transactionsDf.dropDuplicates().agg(count('storeId'))

Incorrect. While transactionsDf.dropDuplicates() removes duplicate rows from transactionsDf, it does not do so taking only column storeId into consideration, but eliminates full-row duplicates instead.

transactionsDf.distinct().select('storeId').count()

Wrong. transactionsDf.distinct() identifies unique rows across all columns, but not only unique rows with respect to column storeId. This may leave duplicate values in the column, making the count not represent the number of unique values in that column.

transactionsDf.select(distinct('storeId')).count()

False. There is no distinct method in pyspark.sql.functions.
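
The difference between deduplicating on one column and deduplicating on whole rows can be checked in plain Python, without Spark. This is only a sketch under assumptions: the rows are made-up (storeId, amount) tuples, and the set operations stand in for the DataFrame calls named in the comments rather than calling them.

```python
# Hypothetical sample data: each tuple is (storeId, amount).
rows = [(1, 10.0), (1, 20.0), (2, 10.0), (2, 10.0)]

# Analogue of select('storeId').dropDuplicates().count():
# keep only the column, drop duplicate values, then count.
store_ids = [store_id for store_id, _ in rows]
unique_store_ids = len(set(store_ids))

# Analogue of distinct().select('storeId').count():
# drop duplicate full rows first, then count what is left.
# (1, 10.0) and (1, 20.0) are different rows, so storeId 1 survives twice.
distinct_rows = set(rows)
count_after_row_distinct = len([store_id for store_id, _ in distinct_rows])

# Analogue of select(count('storeId')): a single number,
# the count of non-null storeId values, duplicates included.
non_null_count = len([s for s in store_ids if s is not None])
```

With this sample data only the first approach yields the number of unique store IDs; the full-row deduplication overcounts because rows that share a storeId can still differ in another column.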

by Zona at Apr 02, 2023, 04:39 PM



Rosalind

2 months ago

Wait, is this a trick question? What if the answer is actually B) DataFrame.show(), because showing the DataFrame is, like, the ultimate transformation, man?

upvoted 0 times

Josphine

20 days ago

E) DataFrame.first() is not a transformation method.

upvoted 0 times

...

Jerilyn

1 month ago

C) DataFrame.select() is also a transformation method.

upvoted 0 times

...

Jin

1 month ago

A) DataFrame.count() is a transformation method.

upvoted 0 times

...

Darell

2 months ago

I'm going to have to go with E) DataFrame.first(). It's a basic DataFrame operation, and who doesn't love getting the first row? That's a transformation, right?

upvoted 0 times

...

Herman

2 months ago

This is a tricky one! I'm going to go with C) DataFrame.select() as the transformation method. It just feels right, you know?

upvoted 0 times

Elvera

23 days ago

I agree with both of you, C) DataFrame.select() is definitely a transformation method.

upvoted 0 times

...

Felicitas

1 month ago

I'm not sure, but I think A) DataFrame.count() is a transformation method.

upvoted 0 times

...

Ivette

1 month ago

I think C) DataFrame.select() is the transformation method too.

upvoted 0 times

...

Adelina

2 months ago

I'd go with A) DataFrame.count(). It's a common operation to get the number of rows in a DataFrame, and that seems like a transformation to me.

upvoted 0 times

Reena

21 hours ago

No, I don't think so. Showing the DataFrame is more of an action than a transformation.

upvoted 0 times

...

Avery

5 days ago

What about B) DataFrame.show()? Is that a transformation?

upvoted 0 times

...

Augustine

7 days ago

I agree, it makes sense to count rows as a transformation.

upvoted 0 times

...

Lashawnda

14 days ago

I think A) DataFrame.count() is a transformation.

upvoted 0 times

...

Regenia

2 months ago

Hmm, I'm not sure about this one. Maybe D) DataFrame.foreach() is the transformation method, since it applies a function to each row of the DataFrame.

upvoted 0 times

Tamekia

2 months ago

No, I believe it's actually C) DataFrame.select().

upvoted 0 times

...

Robt

2 months ago

I think D) DataFrame.foreach() is the transformation method.

upvoted 0 times

...

Halina

2 months ago

I think C) DataFrame.select() is the correct transformation method. It allows you to select specific columns from a DataFrame, which is a common data manipulation task.

upvoted 0 times

Tom

28 days ago

I'm not sure about DataFrame.foreach(), but I know DataFrame.first() is an action method, not a transformation.

upvoted 0 times

...

Felice

29 days ago

I believe DataFrame.show() is not a transformation method, it is used to display the contents of the DataFrame.

upvoted 0 times

...

Talia

2 months ago

I think DataFrame.count() is also a transformation method, as it returns the number of rows in the DataFrame.

upvoted 0 times

...

Tiera

2 months ago

I agree, DataFrame.select() is definitely a transformation method.

upvoted 0 times

...

Jonelle

3 months ago

I'm not sure about the others, but DataFrame.foreach() is definitely not a transformation because it is an action that applies a function to each element.

upvoted 0 times

...

Davida

3 months ago

I agree with Vashti. DataFrame.select() transforms the DataFrame by selecting specific columns.

upvoted 0 times

...

Vashti

3 months ago

I think DataFrame.select() is a transformation because it selects specific columns.

upvoted 0 times

...