A DataFrame df has columns name, age, and salary. The developer needs to sort the DataFrame by age in ascending order and salary in descending order.
Which code snippet meets the requirement of the developer?
To sort a PySpark DataFrame by multiple columns with mixed sort directions, the correct usage is:
```python
df.orderBy('age', 'salary', ascending=[True, False])
```
With this call, rows are sorted by age in ascending order, and rows that share the same age are then sorted by salary in descending order.
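The orderBy() method and its alias sort() take an ascending keyword argument. It accepts either a single boolean, which applies to every sort column, or a list of booleans that gives the direction for each column in order. When a list is passed, its length must match the number of sort columns.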
Documentation Reference: PySpark API - DataFrame.orderBy