[Modeling]
A company sells thousands of products on a public website and wants to automatically identify products with potential durability problems. The company has 1,000 reviews with date, star rating, review text, review summary, and customer email fields, but many reviews are incomplete and have empty fields. Each review has already been labeled with the correct durability result.
A machine learning specialist must train a model to identify reviews expressing concerns over product durability. The first model needs to be trained and ready to review in 2 days.
What is the MOST direct approach to solve this problem within 2 days?
[Modeling]
A Data Scientist is building a model to predict customer churn using a dataset of 100 continuous numerical
features. The Marketing team has not provided any insight about which features are relevant for churn
prediction. The Marketing team wants to interpret the model and see the direct impact of relevant features on
the model outcome. While training a logistic regression model, the Data Scientist observes that there is a wide
gap between the training and validation set accuracy.
Which methods can the Data Scientist use to improve the model performance and satisfy the Marketing team's
needs? (Choose two.)
[Modeling]
A company wants to enhance audits for its machine learning (ML) systems. The auditing system must be able to perform metadata analysis on the features that the ML models use. The audit solution must generate a report that analyzes the metadata. The solution also must be able to set the data sensitivity and authorship of features.
Which solution will meet these requirements with the LEAST development effort?
Each morning, a data scientist at a rental car company creates insights about the previous day's rental car reservation demands. The company needs to automate this process by streaming the data to Amazon S3 in near real time. The solution must detect high-demand rental cars at each of the company's locations. The solution also must create a visualization dashboard that automatically refreshes with the most recent data.
Which solution will meet these requirements with the LEAST development time?
The solution that will meet the requirements with the least development time is to use Amazon Kinesis Data Firehose to stream the reservation data directly to Amazon S3, detect high-demand outliers by using Amazon QuickSight ML Insights, and visualize the data in QuickSight. This solution does not require any custom development or ML domain expertise, as it leverages the built-in features of QuickSight ML Insights to automatically run anomaly detection and generate insights on the streaming data. QuickSight ML Insights can also create a visualization dashboard that automatically refreshes with the most recent data, and allows the data scientist to explore the outliers and their key drivers.
References:
Detecting outliers with ML-powered anomaly detection - Amazon QuickSight
Real-time Outlier Detection Over Streaming Data - IEEE Xplore
Towards a deep learning-based outlier detection ... - Journal of Big Data
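The streaming leg of the answer above can be sketched with boto3. This is a minimal sketch, not the answer's required implementation: the delivery stream name and the event fields are illustrative assumptions, and a Firehose delivery stream pointing at the S3 bucket is assumed to already exist. Firehose does not add record delimiters itself, so the helper appends a newline to each JSON event to keep the S3 objects newline-delimited.

```python
import json


def to_firehose_record(event: dict) -> dict:
    """Serialize one reservation event as a newline-delimited JSON
    Firehose record, so events land in S3 one per line."""
    data = json.dumps(event, separators=(",", ":")) + "\n"
    return {"Data": data.encode("utf-8")}


def send_reservation(event: dict, stream: str = "rental-reservations") -> None:
    """Put one reservation event onto the delivery stream.

    The stream name is a hypothetical example; boto3 is imported here,
    not at module level, so the serializer above stays testable offline.
    """
    import boto3

    firehose = boto3.client("firehose")
    firehose.put_record(DeliveryStreamName=stream,
                        Record=to_firehose_record(event))
```

With the delivery stream buffering into S3, QuickSight ML Insights can then run its anomaly detection against the S3 dataset without any further custom code.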
A company wants to create a data repository in the AWS Cloud for machine learning (ML) projects. The company wants to use AWS to perform complete ML lifecycles and wants to use Amazon S3 for the data storage. All of the company's data currently resides on premises and is 40 in size.
The company wants a solution that can transfer and automatically update data between the on-premises object storage and Amazon S3. The solution must support encryption, scheduling, monitoring, and data integrity validation.
Which solution meets these requirements?
The best solution to meet the requirements is to use AWS DataSync to make an initial copy of the entire dataset, and to schedule subsequent incremental transfers of changing data until the final cutover from on premises to AWS.
By using AWS DataSync, the company can create a data repository in the AWS Cloud for machine learning projects with Amazon S3 as the data store, while meeting the requirements for encryption, scheduling, monitoring, and data integrity validation.
References:
Data Transfer Service - AWS DataSync
Syncing Data with AWS DataSync
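The scheduling and integrity-validation pieces of the DataSync answer can be sketched with boto3. This is a hedged sketch, assuming the source (on-premises agent) and destination (S3) locations have already been created; the location ARNs and the nightly cron window are illustrative, not taken from the question.

```python
def nightly_schedule(hour_utc: int) -> str:
    """Build the cron expression DataSync accepts for a nightly run."""
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be between 0 and 23")
    return f"cron(0 {hour_utc} * * ? *)"


def create_sync_task(source_arn: str, dest_arn: str) -> str:
    """Create a scheduled DataSync task with integrity verification.

    boto3 is imported here, not at module level, so the schedule
    helper above stays testable offline.
    """
    import boto3

    datasync = boto3.client("datasync")
    resp = datasync.create_task(
        SourceLocationArn=source_arn,
        DestinationLocationArn=dest_arn,
        # Verify the destination against the source after each transfer
        # (the data integrity validation requirement).
        Options={"VerifyMode": "POINT_IN_TIME_CONSISTENT"},
        # Incremental nightly sync until the final cutover (hour is an
        # assumed example).
        Schedule={"ScheduleExpression": nightly_schedule(2)},
    )
    return resp["TaskArn"]
```

DataSync only transfers changed files on each scheduled run, which is what makes the repeated incremental syncs before cutover cheap.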