A Machine Learning Specialist observes several performance problems with the training portion of a machine learning solution on Amazon SageMaker. The solution uses a large training dataset (2 TB in size) and the SageMaker k-means algorithm. The observed issues include an unacceptably long wait before the training job launches and poor I/O throughput while training the model.
What should the Specialist do to address the performance issues with the current solution?
SageMaker Data Wrangler is a feature of SageMaker Studio that provides an end-to-end solution for importing, preparing, transforming, featurizing, and analyzing data. Data Wrangler includes built-in analyses that generate visualizations and data insights in a few clicks. One of these is the Quick Model visualization, which can be used to quickly evaluate the data and produce an importance score for each feature. A feature importance score indicates how useful a feature is at predicting a target label. The score lies in the range [0, 1], and a higher number indicates that the feature is more important to the whole dataset. The Quick Model visualization uses a random forest model and calculates each feature's importance with the Gini importance method. This method measures the total reduction in node impurity (a measure of how well a node separates the classes) that is attributed to splits on a particular feature. The ML developer can use the Quick Model visualization to obtain the importance score for each feature of the dataset and use those scores to guide feature engineering. This solution requires the least development effort compared to the other options.
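To make "reduction in node impurity" concrete, the sketch below computes the Gini impurity decrease for a single best split per feature and normalizes the results so they sum to 1. It is a simplified illustration, not Data Wrangler's implementation: a random forest accumulates these decreases over every split in every tree, and the toy data, feature names, and helper functions here are all hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum over classes of p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(values, labels, threshold):
    """Parent impurity minus the size-weighted impurity of the two children
    produced by the split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_children


def best_decrease(values, labels):
    """Largest impurity decrease over all candidate thresholds for one feature."""
    thresholds = sorted(set(values))[:-1]  # the largest value cannot split anything off
    return max(
        (impurity_decrease(values, labels, t) for t in thresholds),
        default=0.0,
    )


# Hypothetical toy dataset: one feature separates the classes, the other does not.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "informative": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
    "noisy": [5.0, 1.0, 9.0, 2.0, 8.0, 4.0],
}

raw = {name: best_decrease(vals, labels) for name, vals in features.items()}
total = sum(raw.values())
# Normalizing to sum to 1 is an assumption made here to keep scores in [0, 1].
importance = {name: (score / total if total else 0.0) for name, score in raw.items()}

for name, score in importance.items():
    print(f"{name}: {score:.3f}")
```

The feature whose split cleanly separates the two classes receives most of the importance, which is the behavior the Gini importance method is meant to capture.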
References:
* Analyze and Visualize
* Detect multicollinearity, target leakage, and feature correlation with Amazon SageMaker Data Wrangler