The data engineer is using Spark's MEMORY_ONLY storage level.
Which indicators should the data engineer look for in the Spark UI's Storage tab to signal that a cached table is not performing optimally?
In the Spark UI's Storage tab, one indicator that a cached table is not performing optimally is the presence of the _disk annotation in the RDD Block Name (equivalently, a nonzero Size on Disk). This annotation means some partitions of the cached data were written to disk because there was not enough memory to hold them, which is suboptimal since reading from disk is far slower than reading from memory. Note that strictly under MEMORY_ONLY, partitions that do not fit are dropped rather than spilled, so a second indicator to check is a Fraction Cached below 100%: dropped partitions must be recomputed from lineage on each access, which likewise defeats the purpose of caching. The goal of caching is to keep the data fully in memory for fast access, and either disk blocks or a partially cached table signals that this goal is not being met.