24 of 55. Which code should be used to display the schema of the Parquet file stored in the location events.parquet?
A.
spark.sql("SELECT * FROM events.parquet").show()
B.
spark.read.format("parquet").load("events.parquet").show()
C.
spark.read.parquet("events.parquet").printSchema()
D.
spark.sql("SELECT schema FROM events.parquet").show()
To view the schema of a Parquet file, load it with the DataFrameReader and call .printSchema() on the resulting DataFrame.
Correct syntax (option C):
spark.read.parquet('events.parquet').printSchema()
This command reads the schema from the Parquet file metadata (the file footer) without scanning the data rows, then prints each column's name, data type, and nullability in a tree format.
Why the other options are incorrect:
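A: Spark SQL parses events.parquet as a table named parquet in a database named events, not as a file path; querying a file directly requires the form parquet.`events.parquet`. Even with a valid query, .show() displays data rows, not the schema.
D: schema is treated as a column name and events.parquet as a table name, so the query does not return the file's schema.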
B: .show() displays data rows, not schema.
PySpark DataFrameReader API: read.parquet() and DataFrame.printSchema().
Databricks Exam Guide (June 2025), section "Using Spark SQL": describes reading files and examining schemas in Spark SQL and DataFrame APIs.