Which of the following code snippets correctly filters rows in a Spark DataFrame where the 'discount' value is less than or equal to 0?
To filter rows in a Spark DataFrame based on a condition, use the filter method. Here the condition is that the value in the 'discount' column must be less than or equal to 0, and the correct syntax combines the filter method with the col function from pyspark.sql.functions.
Correct code:

from pyspark.sql.functions import col

filtered_df = spark_df.filter(col('discount') <= 0)
Options A and D use pandas syntax, which does not apply to native PySpark DataFrames. Option B is closer but omits the col function.