A data analyst has created a Delta table sales that is used by the entire data analysis team. They want help from the data engineering team to implement a series of tests to ensure the data is clean. However, the data engineering team uses Python for its tests rather than SQL.
Which of the following commands could the data engineering team use to access sales in PySpark?
The data engineering team can use thespark.tablemethod to access the Delta tablesalesin PySpark. This method returns a DataFrame representation of the Delta table, which can be used for further processing or testing.Thespark.tablemethod works for any table that is registered in the Hive metastore or the Spark catalog, regardless of the file format1.Alternatively, the data engineering team can also use theDeltaTable.forPathmethod to load the Delta table from its path2.Reference:1:SparkSession | PySpark 3.2.0 documentation2:Welcome to Delta Lake's Python documentation page --- delta-spark 2.4.0 documentation
Currently there are no comments in this discussion, be the first to comment!