A data analyst runs the following command:
SELECT age, country
FROM my_table
WHERE age >= 75 AND country = 'canada';
Which of the following tables represents the output of the above command?
A)
B)
C)
D)
E)
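The behaviour of the query can be checked with a minimal sketch. The sample rows and the extra id column are hypothetical, because the original table contents are not shown, and SQLite (through Python's sqlite3 module) is used here only as a stand-in for the analyst's SQL engine. It illustrates that only the age and country columns are returned, that age 75 is included by the >= comparison, and that the string comparison is case-sensitive here, so 'Canada' does not match 'canada'.

```python
import sqlite3

# Hypothetical sample rows (id, age, country); not taken from the original question.
sample_rows = [
    (1, 80, "canada"),
    (2, 75, "canada"),   # boundary value: 75 satisfies age >= 75
    (3, 74, "canada"),   # filtered out: age below 75
    (4, 90, "usa"),      # filtered out: different country
    (5, 85, "Canada"),   # filtered out: capitalization differs
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, age INTEGER, country TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", sample_rows)

# The query from the question, unchanged.
query = """
SELECT age, country
FROM my_table
WHERE age >= 75 AND country = 'canada';
"""
result = sorted(conn.execute(query).fetchall())

for age, country in result:
    print(age, country)
```

With these assumed rows, only the two lowercase 'canada' rows aged 75 and 80 remain, each with exactly two columns.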
Option A uses the SELECT DISTINCT statement to remove duplicate rows from table_bronze and create a new table, table_silver, with the deduplicated data. This is the correct way to deduplicate data using Spark SQL [1][2]. Option B simply inserts all the rows from table_bronze into table_silver, without removing any duplicates. Option C is not a valid syntax for Spark SQL, as there is no MERGE DEDUPLICATE statement. Option D appends all the rows from table_bronze into table_silver, without removing any duplicates. Option E overwrites the existing data in table_silver with the data from table_bronze, without removing any duplicates.
Reference: [1] Delete Duplicate using SPARK SQL; [2] Spark SQL - How to Remove Duplicate Rows
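The SELECT DISTINCT approach described above can be sketched as follows. This is an illustration under stated assumptions, not the exact statement from Option A (which is not shown): the column names and rows of table_bronze are hypothetical, and SQLite via Python's sqlite3 module stands in for Spark SQL. The CREATE TABLE ... AS SELECT DISTINCT form used here is also accepted by Spark SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical bronze table containing exact duplicate rows.
conn.execute("CREATE TABLE table_bronze (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO table_bronze VALUES (?, ?)",
    [(1, "a"), (1, "a"), (2, "b"), (2, "b"), (3, "c")],
)

# Build the silver table from the distinct rows of the bronze table.
conn.execute("CREATE TABLE table_silver AS SELECT DISTINCT * FROM table_bronze")

bronze_count = conn.execute("SELECT COUNT(*) FROM table_bronze").fetchone()[0]
silver_rows = conn.execute("SELECT * FROM table_silver ORDER BY id").fetchall()

print("bronze rows:", bronze_count)
print("silver rows:", silver_rows)
```

A plain INSERT INTO table_silver SELECT * FROM table_bronze would instead copy all five assumed rows, duplicates included, which is why the other options do not deduplicate.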