A data analyst runs the following command:
SELECT age, country
FROM my_table
WHERE age >= 75 AND country = 'canada';
Which of the following tables represents the output of the above command?
A)
B)
C)
D)
E)
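As a rough illustration of how the query in the question filters and projects rows, here is a minimal sketch. It runs the same statement against an in-memory SQLite database rather than Spark SQL. The sample rows, the extra `name` column, and the added `ORDER BY` (used only to make the output order predictable) are assumptions for illustration and do not come from the original table.

```python
import sqlite3

# Hypothetical sample rows (age, country, name); the real my_table is not shown.
rows = [
    (80, "canada", "Ann"),
    (75, "canada", "Bo"),
    (74, "canada", "Cy"),   # excluded: age is below 75
    (90, "usa", "Di"),      # excluded: country is not 'canada'
    (82, "Canada", "Ed"),   # excluded: comparison is case-sensitive
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (age INTEGER, country TEXT, name TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)

# Same SELECT and WHERE as in the question; ORDER BY is added for a stable order.
result = conn.execute(
    "SELECT age, country FROM my_table "
    "WHERE age >= 75 AND country = 'canada' "
    "ORDER BY age"
).fetchall()

print(result)  # [(75, 'canada'), (80, 'canada')]
```

Only the two selected columns appear in the output, and only rows that satisfy both conditions survive. The boundary value 75 is kept because the comparison is `>=`.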
Option A uses the SELECT DISTINCT statement to remove duplicate rows from table_bronze and create a new table, table_silver, with the deduplicated data. This is the correct way to deduplicate data using Spark SQL [1][2].
Option B simply inserts all the rows from table_bronze into table_silver, without removing any duplicates.
Option C is not valid Spark SQL syntax, as there is no MERGE DEDUPLICATE statement.
Option D appends all the rows from table_bronze to table_silver, without removing any duplicates.
Option E overwrites the existing data in table_silver with the data from table_bronze, without removing any duplicates.
Reference: [1] Delete Duplicate using SPARK SQL; [2] Spark SQL - How to Remove Duplicate Rows
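The deduplication approach described for Option A can be sketched as follows. The exact statement from the option is not shown here, so the `CREATE TABLE ... AS SELECT DISTINCT *` form below is an assumption, as are the column names and sample rows. The sketch uses an in-memory SQLite database in place of Spark SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical bronze table containing exact duplicate rows.
conn.execute("CREATE TABLE table_bronze (id INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO table_bronze VALUES (?, ?)",
    [(1, "canada"), (1, "canada"), (2, "usa"), (2, "usa"), (3, "canada")],
)

# Assumed form of the deduplicating statement: keep one copy of each distinct row.
conn.execute("CREATE TABLE table_silver AS SELECT DISTINCT * FROM table_bronze")

bronze_count = conn.execute("SELECT COUNT(*) FROM table_bronze").fetchone()[0]
silver_rows = conn.execute("SELECT * FROM table_silver ORDER BY id").fetchall()

print(bronze_count)  # 5
print(silver_rows)   # [(1, 'canada'), (2, 'usa'), (3, 'canada')]
```

A plain `INSERT INTO table_silver SELECT * FROM table_bronze` would instead copy all five rows, which is why the insert, append, and overwrite options leave the duplicates in place.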