A data analyst runs the following command:
SELECT age, country
FROM my_table
WHERE age >= 75 AND country = 'canada';
Which of the following tables represents the output of the above command?
A)
B)
C)
D)
E)
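To reason about what the query returns, it can help to run it on a small made-up table. The sketch below uses Python's built-in sqlite3 module as a stand-in for the analyst's SQL engine; the rows, the extra `name` column, and the `ORDER BY age` clause (added only to make the output order deterministic) are assumptions for illustration and are not part of the original question.

```python
import sqlite3

# In-memory database with a hypothetical my_table (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (name TEXT, age INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("a", 80, "canada"),  # kept: age >= 75 and country matches
        ("b", 75, "canada"),  # kept: the boundary value 75 satisfies >=
        ("c", 74, "canada"),  # dropped: age below 75
        ("d", 90, "usa"),     # dropped: country does not match
        ("e", 82, "Canada"),  # dropped here: SQLite's default comparison is case-sensitive
    ],
)

# Same SELECT and WHERE as in the question; ORDER BY added for stable output.
rows = conn.execute(
    "SELECT age, country FROM my_table "
    "WHERE age >= 75 AND country = 'canada' "
    "ORDER BY age"
).fetchall()

print(rows)  # [(75, 'canada'), (80, 'canada')]
```

The result has only the two selected columns, age and country, and only rows that satisfy both conditions, so the correct option should be the table with exactly that shape.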
Option A uses the SELECT DISTINCT statement to remove duplicate rows from table_bronze and create a new table, table_silver, with the deduplicated data. This is the correct way to deduplicate data using Spark SQL [1][2]. Option B simply inserts all the rows from table_bronze into table_silver, without removing any duplicates. Option C is not valid Spark SQL syntax, as there is no MERGE DEDUPLICATE statement. Option D appends all the rows from table_bronze to table_silver, without removing any duplicates. Option E overwrites the existing data in table_silver with the data from table_bronze, without removing any duplicates.
Reference: [1] Delete Duplicate using SPARK SQL; [2] Spark SQL - How to Remove Duplicate Rows
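The SELECT DISTINCT approach described for Option A can be sketched as follows. The exact statement in the option is not shown here, so the CREATE TABLE ... AS SELECT DISTINCT form, the column names `id` and `value`, and the sample rows are assumptions; the sketch runs on Python's built-in sqlite3 rather than Spark SQL, purely to show the deduplication effect.

```python
import sqlite3

# Hypothetical bronze table containing duplicate rows (invented sample data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_bronze (id INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO table_bronze VALUES (?, ?)",
    [(1, "x"), (1, "x"), (2, "y"), (2, "y"), (3, "z")],
)

# Create the silver table from the distinct rows of the bronze table.
conn.execute("CREATE TABLE table_silver AS SELECT DISTINCT * FROM table_bronze")

bronze_count = conn.execute("SELECT COUNT(*) FROM table_bronze").fetchone()[0]
silver = conn.execute("SELECT id, value FROM table_silver ORDER BY id").fetchall()

print(bronze_count)  # 5
print(silver)        # [(1, 'x'), (2, 'y'), (3, 'z')]
```

A plain insert, append, or overwrite from table_bronze would carry all five rows over, which is why the other options leave the duplicates in place.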