Amazon MLS-C01 Exam - Topic 4 Question 80 Discussion

Actual exam question for Amazon's MLS-C01 exam
Question #: 80
Topic #: 4
[All MLS-C01 Questions]

A Data Scientist is working on an application that performs sentiment analysis. The validation accuracy is poor, and the Data Scientist thinks that the cause may be a rich vocabulary and a low average frequency of words in the dataset.

Which tool should be used to improve the validation accuracy?

A) Amazon Comprehend
B) Amazon SageMaker BlazingText
C) Natural Language Toolkit (NLTK) stemming and stop word removal
D) Scikit-learn term frequency-inverse document frequency (TF-IDF) vectorizers

Suggested Answer: A

Contribute your Thoughts:

Mireya
3 months ago
I’d go with Scikit-learn for sure, it’s reliable for text data!
upvoted 0 times
...
Malcolm
3 months ago
NLTK is great for preprocessing, but might not solve the accuracy problem.
upvoted 0 times
...
Colene
4 months ago
Wait, can TF-IDF really help with rich vocab issues?
upvoted 0 times
...
Thurman
4 months ago
Totally agree, BlazingText is powerful for text classification!
upvoted 0 times
...
Felicidad
4 months ago
I think option B is the best choice for improving accuracy.
upvoted 0 times
...
Arlette
4 months ago
I think Amazon Comprehend is more for entity recognition, not really for improving validation accuracy directly.
upvoted 0 times
...
Juan
4 months ago
I feel like we practiced a similar question where using BlazingText improved accuracy. Could that be the right choice here?
upvoted 0 times
...
Brianne
5 months ago
I'm not entirely sure, but I think stemming and stop word removal might help reduce the complexity of the dataset.
upvoted 0 times
...
Eun
5 months ago
I remember we discussed how rich vocabulary can lead to overfitting, so maybe using TF-IDF could help with that.
upvoted 0 times
...
Dahlia
5 months ago
I'm a bit confused on the details of the different options. Maybe I'll start by reviewing the key features of each tool and see which one seems most tailored to address the specific challenges mentioned in the question.
upvoted 0 times
...
Dudley
5 months ago
Okay, I've seen this type of problem before. I'm pretty confident the Amazon SageMaker BlazingText tool is designed to handle text data with a diverse vocabulary, so that seems like the best choice here.
upvoted 0 times
...
Peggie
5 months ago
I'm a bit unsure on this one. The rich vocabulary and low word frequency makes me think the TF-IDF vectorizers from Scikit-learn could be a useful tool to try. But I'll need to double-check the details on how that works.
upvoted 0 times
...
Susy
5 months ago
Hmm, this seems like a tricky one. I'm thinking the NLTK toolkit with stemming and stop word removal might be a good option to try and simplify the vocabulary and improve the validation accuracy.
upvoted 0 times
...
Nieves
5 months ago
This is a tricky one. I'm leaning towards C, but I'm not 100% sure. I'll make my best guess and move on to the next question.
upvoted 0 times
...
Chaya
10 months ago
Stemming and stop word removal? Sounds like my high school English teacher's dream tool. Maybe we can throw in some thesaurus action too, just for fun.
upvoted 0 times
Yvette
8 months ago
Good idea! We can combine both approaches to improve the validation accuracy.
upvoted 0 times
...
Ira
8 months ago
That could work, but I also think we should consider using Natural Language Toolkit (NLTK) stemming and stop word removal.
upvoted 0 times
...
Linn
8 months ago
I think we should try using Scikit-learn term frequency-inverse document frequency (TF-IDF) vectorizers.
upvoted 0 times
...
...
Vallie
10 months ago
B) Amazon SageMaker BlazingText? Sounds like a made-up answer. I'd stick to the more well-known NLP tools and techniques.
upvoted 0 times
Novella
9 months ago
Natural Language Toolkit (NLTK) stemming and stop word removal could also be a good option to try.
upvoted 0 times
...
Josphine
9 months ago
I think using Scikit-learn term frequency-inverse document frequency (TF-IDF) vectorizers could help improve the validation accuracy.
upvoted 0 times
...
...
Sharika
10 months ago
A) Amazon Comprehend is probably overkill for this task. It's more suited for enterprise-level NLP tasks, not a simple sentiment analysis problem.
upvoted 0 times
Nicolette
9 months ago
D) Scikit-learn term frequency-inverse document frequency (TF-IDF) vectorizers
upvoted 0 times
...
Tyra
9 months ago
C) Natural Language Toolkit (NLTK) stemming and stop word removal
upvoted 0 times
...
...
Domitila
10 months ago
D) Scikit-learn term frequency-inverse document frequency (TF-IDF) vectorizers could also be a good choice. TF-IDF can help identify the most important words in the dataset and reduce the impact of common words.
upvoted 0 times
Coral
10 months ago
I agree, removing stop words can help focus on the more important words in the dataset.
upvoted 0 times
...
Joanne
10 months ago
I think we should use C) Natural Language Toolkit (NLTK) stemming and stop word removal to improve the validation accuracy.
upvoted 0 times
...
...
Jamal
10 months ago
I prefer using NLTK for stemming and stop word removal to improve accuracy.
upvoted 0 times
...
Chauncey
10 months ago
C) Natural Language Toolkit (NLTK) stemming and stop word removal seems like the best option to handle the issue of rich vocabulary and low average word frequency. Removing common words and reducing words to their base form can help improve the model's performance.
upvoted 0 times
Gianna
9 months ago
I agree, it can definitely help in handling the rich vocabulary and low word frequency.
upvoted 0 times
...
Delisa
9 months ago
I think using NLTK stemming and stop word removal could really help improve the accuracy.
upvoted 0 times
...
...
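Option C, as described in the thread above, shrinks the vocabulary by dropping common words and collapsing inflected forms to a shared stem. A hedged sketch, assuming NLTK is installed: `PorterStemmer` needs no corpus download, and the stop-word set below is a small hand-rolled stand-in for `nltk.corpus.stopwords.words("english")` (which requires `nltk.download("stopwords")`):

```python
from nltk.stem import PorterStemmer

# Small illustrative stop-word list; in practice use
# nltk.corpus.stopwords.words("english") after nltk.download("stopwords")
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "were", "of", "in", "it"}

stemmer = PorterStemmer()

def normalize(text):
    """Lowercase, drop stop words, and stem the remaining tokens."""
    tokens = text.lower().split()
    return [stemmer.stem(t) for t in tokens if t not in STOP_WORDS]

print(normalize("The acting was entertaining and the plots were entertaining"))
```

With this normalization, "acting" becomes "act" and both "entertaining" occurrences map to the same stem "entertain", so distinct surface forms are merged and each remaining term's frequency rises.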
Kris
11 months ago
I agree with Florinda, TF-IDF can help with the low frequency of words issue.
upvoted 0 times
...
Florinda
11 months ago
I think we should use Scikit-learn TF-IDF vectorizers.
upvoted 0 times
...
