Which of the following describes the relationship between native Spark DataFrames and pandas API on Spark DataFrames?
To filter rows in a Spark DataFrame by a condition, use the filter method. Here the condition is that the value in the 'discount' column is less than or equal to 0. The correct syntax passes filter a column expression built with the col function from pyspark.sql.functions.
Correct code:
from pyspark.sql.functions import col
filtered_df = spark_df.filter(col('discount') <= 0)
Options A and D use pandas syntax, which does not apply to a native Spark DataFrame. Option B is closer but omits the col function.