Welcome to Pass4Success

Databricks Exam Databricks Certified Associate Developer for Apache Spark 3.0 Topic 1 Question 36 Discussion

Actual exam question for Databricks's Databricks Certified Associate Developer for Apache Spark 3.0 exam
Question #: 36
Topic #: 1

Which of the following DataFrame methods is classified as a transformation?

Suggested Answer: C
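
Of the methods named in the options (count(), show(), select(), foreach(), first()), select() is the one Spark classifies as a transformation: it returns a new DataFrame and is evaluated lazily, while the other four are actions that trigger execution. The sketch below is a toy model in plain Python that mimics this split. It is not Spark's implementation, and the LazyFrame class, the EXECUTION_LOG list, and the sample rows are hypothetical names invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
EXECUTION_LOG: List[str] = []  # hypothetical: records each time work actually runs


class LazyFrame:
    """Toy stand-in for a DataFrame (assumption: NOT how Spark is implemented)."""

    def __init__(self, rows: List[Row], plan: Tuple[Callable[[Row], Row], ...] = ()):
        self._rows = list(rows)
        self._plan = tuple(plan)

    def select(self, *cols: str) -> "LazyFrame":
        # Transformation: only records a projection step and returns a new frame.
        def project(row: Row) -> Row:
            return {c: row[c] for c in cols}

        return LazyFrame(self._rows, self._plan + (project,))

    def _execute(self) -> List[Row]:
        # Runs every recorded step; only actions call this.
        EXECUTION_LOG.append("job run")
        rows = self._rows
        for step in self._plan:
            rows = [step(r) for r in rows]
        return rows

    def count(self) -> int:
        # Action: executes the plan and returns a plain value, not a frame.
        return len(self._execute())

    def first(self) -> Row:
        # Action: executes the plan and returns the first row.
        return self._execute()[0]


transactions = LazyFrame([
    {"storeId": 1, "value": 10},
    {"storeId": 2, "value": 20},
])

projected = transactions.select("storeId")   # transformation: nothing runs yet
jobs_after_select = len(EXECUTION_LOG)        # 0

n = projected.count()                         # action: triggers execution
jobs_after_count = len(EXECUTION_LOG)         # 1
```

In this model, as in Spark, the quick check is the return type: a transformation hands back another frame that can be chained further, while an action hands back a value (or nothing) and forces the work to happen.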

transactionsDf.select('storeId').dropDuplicates().count()

Correct! After dropping all duplicates from column storeId, the remaining rows get counted, representing the number of unique values in the column.

transactionsDf.select(count('storeId')).dropDuplicates()

No. transactionsDf.select(count('storeId')) returns a single-row DataFrame containing the number of non-null values in column storeId. dropDuplicates() has no effect on a single-row DataFrame.

transactionsDf.dropDuplicates().agg(count('storeId'))

Incorrect. While transactionsDf.dropDuplicates() removes duplicate rows from transactionsDf, it does not do so taking only column storeId into consideration, but eliminates full row duplicates instead.

transactionsDf.distinct().select('storeId').count()

Wrong. transactionsDf.distinct() identifies unique rows across all columns, but not only unique rows with respect to column storeId. This may leave duplicate values in the column, making the count not represent the number of unique values in that column.

transactionsDf.select(distinct('storeId')).count()

False. There is no distinct method in pyspark.sql.functions.
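
The difference between deduplicating one column and deduplicating whole rows, which the explanations above turn on, can be seen in a small plain-Python analogue. This is only a sketch: the rows are made-up (transactionId, storeId) tuples standing in for transactionsDf, the transactionId column is an assumption, and the set operations merely imitate what dropDuplicates() and distinct() do.

```python
# Hypothetical sample data: (transactionId, storeId) pairs.
rows = [
    (1, 25),
    (2, 25),  # same storeId as above, but a different row
    (3, 3),
    (3, 3),   # exact duplicate of the previous row
]

# Analogue of select('storeId').dropDuplicates().count():
# project to storeId first, then deduplicate, then count.
unique_store_ids = len({store_id for _, store_id in rows})

# Analogue of distinct().select('storeId').count():
# deduplicate full rows first, then project, then count.
after_row_dedup = len([store_id for _, store_id in set(rows)])

print(unique_store_ids)  # 2
print(after_row_dedup)   # 3
```

Deduplicating full rows only removes the exact duplicate (3, 3), so storeId 25 is still counted twice. Projecting to storeId before deduplicating gives the number of unique values.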


Contribute your Thoughts:

Rosalind
2 months ago
Wait, is this a trick question? What if the answer is actually B) DataFrame.show(), because showing the DataFrame is, like, the ultimate transformation, man?
upvoted 0 times
Josphine
20 days ago
E) DataFrame.first() is not a transformation method.
upvoted 0 times
...
Jerilyn
1 month ago
C) DataFrame.select() is also a transformation method.
upvoted 0 times
...
Jin
1 month ago
A) DataFrame.count() is a transformation method.
upvoted 0 times
...
...
Darell
2 months ago
I'm going to have to go with E) DataFrame.first(). It's a basic DataFrame operation, and who doesn't love getting the first row? That's a transformation, right?
upvoted 0 times
...
Herman
2 months ago
This is a tricky one! I'm going to go with C) DataFrame.select() as the transformation method. It just feels right, you know?
upvoted 0 times
Elvera
23 days ago
I agree with both of you, C) DataFrame.select() is definitely a transformation method.
upvoted 0 times
...
Felicitas
1 month ago
I'm not sure, but I think A) DataFrame.count() is a transformation method.
upvoted 0 times
...
Ivette
1 month ago
I think C) DataFrame.select() is the transformation method too.
upvoted 0 times
...
...
Adelina
2 months ago
I'd go with A) DataFrame.count(). It's a common operation to get the number of rows in a DataFrame, and that seems like a transformation to me.
upvoted 0 times
Reena
21 hours ago
No, I don't think so. Showing the DataFrame is more of an action than a transformation.
upvoted 0 times
...
Avery
5 days ago
What about B) DataFrame.show()? Is that a transformation?
upvoted 0 times
...
Augustine
7 days ago
I agree, it makes sense to count rows as a transformation.
upvoted 0 times
...
Lashawnda
14 days ago
I think A) DataFrame.count() is a transformation.
upvoted 0 times
...
...
Regenia
2 months ago
Hmm, I'm not sure about this one. Maybe D) DataFrame.foreach() is the transformation method, since it applies a function to each row of the DataFrame.
upvoted 0 times
Tamekia
2 months ago
No, I believe it's actually C) DataFrame.select().
upvoted 0 times
...
Robt
2 months ago
I think D) DataFrame.foreach() is the transformation method.
upvoted 0 times
...
...
Halina
2 months ago
I think C) DataFrame.select() is the correct transformation method. It allows you to select specific columns from a DataFrame, which is a common data manipulation task.
upvoted 0 times
Tom
28 days ago
I'm not sure about DataFrame.foreach(), but I know DataFrame.first() is an action method, not a transformation.
upvoted 0 times
...
Felice
29 days ago
I believe DataFrame.show() is not a transformation method, it is used to display the contents of the DataFrame.
upvoted 0 times
...
Talia
2 months ago
I think DataFrame.count() is also a transformation method, as it returns the number of rows in the DataFrame.
upvoted 0 times
...
Tiera
2 months ago
I agree, DataFrame.select() is definitely a transformation method.
upvoted 0 times
...
...
Jonelle
3 months ago
I'm not sure about the others, but DataFrame.foreach() is definitely not a transformation because it is an action that applies a function to each element.
upvoted 0 times
...
Davida
3 months ago
I agree with Vashti. DataFrame.select() transforms the DataFrame by selecting specific columns.
upvoted 0 times
...
Vashti
3 months ago
I think DataFrame.select() is a transformation because it selects specific columns.
upvoted 0 times
...
