Databricks Exam Databricks Certified Associate Developer for Apache Spark 3.0 Topic 1 Question 25 Discussion

Actual exam question for Databricks's Databricks Certified Associate Developer for Apache Spark 3.0 exam

Question #: 25
Topic #: 1


Which of the following code blocks creates a new 6-column DataFrame by appending the rows of the 6-column DataFrame yesterdayTransactionsDf to the rows of the 6-column DataFrame todayTransactionsDf, ignoring that both DataFrames have different column names?

A. union(todayTransactionsDf, yesterdayTransactionsDf)

B. todayTransactionsDf.unionByName(yesterdayTransactionsDf, allowMissingColumns=True)

C. todayTransactionsDf.unionByName(yesterdayTransactionsDf)

D. todayTransactionsDf.concat(yesterdayTransactionsDf)

E. todayTransactionsDf.union(yesterdayTransactionsDf)


Suggested Answer: E

todayTransactionsDf.union(yesterdayTransactionsDf)

Correct. The union method appends the rows of yesterdayTransactionsDf to the rows of todayTransactionsDf. It matches columns by position rather than by name, so it ignores that the two DataFrames have different column names. The resulting DataFrame will have the column names of todayTransactionsDf.
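The positional matching can be illustrated without a Spark session. The sketch below is a plain-Python model of what union does, not PySpark itself. The column names and row values are hypothetical, since the question does not list them, and a "frame" here is just a (column names, rows) pair.

```python
# Plain-Python model of DataFrame.union (illustrative only, not PySpark).
# A "frame" is a (column_names, rows) pair; all names and values are hypothetical.

def union_by_position(left, right):
    """Append right's rows to left's rows, matching columns by position."""
    left_cols, left_rows = left
    right_cols, right_rows = right
    if len(left_cols) != len(right_cols):
        # Spark also rejects a union of frames with different column counts.
        raise ValueError("union requires the same number of columns")
    # The right-hand column names are ignored; the left-hand names win.
    return (list(left_cols), list(left_rows) + list(right_rows))


today = (
    ["transactionId", "predError", "value", "storeId", "productId", "f"],
    [(1, 3, 4, 25, 1, None), (2, 6, 7, 2, 2, None)],
)
yesterday = (
    ["txId", "error", "amount", "store", "product", "flag"],
    [(3, 3, None, 25, 3, None)],
)

combined_cols, combined_rows = union_by_position(today, yesterday)
print(combined_cols)       # the six column names of `today`
print(len(combined_rows))  # 3
```

In PySpark the corresponding call is todayTransactionsDf.union(yesterdayTransactionsDf), which is answer E.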

todayTransactionsDf.unionByName(yesterdayTransactionsDf)

No. unionByName matches columns in the two DataFrames by name and only appends values in columns with identical names across the two DataFrames. In the form presented above, the command is a great fit for combining DataFrames that have exactly the same columns, but in a different order. In this case though, the command will fail because the two DataFrames have different columns.
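The difference between "same columns, different order" and "different columns" can be sketched as follows. This is again a plain-Python model rather than PySpark: the column names are hypothetical, only three columns are shown for brevity, and a ValueError stands in for the AnalysisException that Spark raises.

```python
# Plain-Python model of DataFrame.unionByName without allowMissingColumns
# (illustrative only, not PySpark). Column names and values are hypothetical.

def union_by_name(left, right):
    """Append right's rows to left's rows, matching columns by name."""
    left_cols, left_rows = left
    right_cols, right_rows = right
    if set(left_cols) != set(right_cols):
        # PySpark raises an AnalysisException here; ValueError stands in for it.
        raise ValueError("column names do not match")
    # Reorder each right-hand row into the left-hand column order.
    positions = [right_cols.index(name) for name in left_cols]
    reordered = [tuple(row[i] for i in positions) for row in right_rows]
    return (list(left_cols), list(left_rows) + reordered)


today = (["transactionId", "value", "storeId"], [(1, 4, 25)])
reordered_yesterday = (["storeId", "transactionId", "value"], [(3, 2, 7)])
renamed_yesterday = (["store", "txId", "amount"], [(3, 2, 7)])

# Same names in a different order: values land in the matching columns.
cols, rows = union_by_name(today, reordered_yesterday)
print(rows)  # [(1, 4, 25), (2, 7, 3)]

# Different names: the by-name match fails.
try:
    union_by_name(today, renamed_yesterday)
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```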

todayTransactionsDf.unionByName(yesterdayTransactionsDf, allowMissingColumns=True)

No. unionByName is described in the previous explanation. With the allowMissingColumns argument set to True, it is no longer a problem that the two DataFrames have different column names: any column without a match in the other DataFrame is filled with null for the rows that lack it. In this case, however, the resulting DataFrame would have 7 or more columns, so this command is not the right answer.
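The column count can be checked with a small plain-Python model (not PySpark). The column names are hypothetical, None stands in for null, and the output order (left-hand columns first, then columns found only on the right) is an assumption of the model. With one renamed column the result has 7 columns; with all six renamed it has 12.

```python
# Plain-Python model of unionByName(..., allowMissingColumns=True)
# (illustrative only, not PySpark). Column names and values are hypothetical.

def union_by_name_allow_missing(left, right):
    """Union by name, filling columns missing on either side with None."""
    left_cols, left_rows = left
    right_cols, right_rows = right
    # Model assumption: left-hand columns first, then right-only columns.
    out_cols = list(left_cols) + [c for c in right_cols if c not in left_cols]

    def widen(cols, row):
        values = dict(zip(cols, row))
        # dict.get returns None for a missing column, modelling null.
        return tuple(values.get(name) for name in out_cols)

    out_rows = [widen(left_cols, r) for r in left_rows]
    out_rows += [widen(right_cols, r) for r in right_rows]
    return (out_cols, out_rows)


today = (
    ["transactionId", "predError", "value", "storeId", "productId", "f"],
    [(1, 3, 4, 25, 1, None)],
)
one_renamed = (
    ["transactionId", "predError", "value", "storeId", "productId", "flag"],
    [(2, 6, 7, 2, 2, True)],
)
all_renamed = (
    ["txId", "error", "amount", "store", "product", "flag"],
    [(2, 6, 7, 2, 2, True)],
)

cols_one, rows_one = union_by_name_allow_missing(today, one_renamed)
cols_all, rows_all = union_by_name_allow_missing(today, all_renamed)
print(len(cols_one))  # 7
print(len(cols_all))  # 12
print(rows_one[1])    # (2, 6, 7, 2, 2, None, True)
```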

union(todayTransactionsDf, yesterdayTransactionsDf)

No. union is a method of the DataFrame class; there is no standalone union function in pyspark.sql.functions.

todayTransactionsDf.concat(yesterdayTransactionsDf)

Wrong, the DataFrame class does not have a concat method.

More info: pyspark.sql.DataFrame.union (PySpark 3.1.2 documentation), pyspark.sql.DataFrame.unionByName (PySpark 3.1.2 documentation)


by Salome at Aug 20, 2022, 04:28 AM
