A large grocery distributor receives daily depletion reports from the field in the form of gzip archives of CSV files uploaded to Amazon S3. The files range from 500 MB to 5 GB. These files are processed daily by an EMR job.
Recently it has been observed that the file sizes vary and the EMR jobs take too long. The distributor needs to tune and optimize the data processing workflow with this limited information to improve the performance of the EMR job.
Which recommendation should an administrator provide?