Amazon AIF-C01 Exam - Topic 4 Question 40 Discussion

Actual exam question for Amazon's AIF-C01 exam

Question #: 40
Topic #: 4

An education company wants to build a private tutor application. The application will give users the ability to enter text or provide a picture of a question. The application will respond with a written answer and an explanation of the written answer.

Which model type meets these requirements?

AComputer vision model

BMultimodal LLM

CDiffusion model

DText-to-speech model

Show Suggested Answer

Suggested Answer: B

Comprehensive and Detailed Explanation From Exact AWS AI documents:

A multimodal large language model (LLM) can:

Accept both text and image inputs

Understand visual and textual context

Generate coherent written explanations

AWS generative AI guidance positions multimodal LLMs as the best choice for applications requiring cross-modal understanding and text generation.

Why the other options are incorrect:

Computer vision (A) does not generate text explanations.

Diffusion models (C) generate images.

Text-to-speech (D) converts text to audio.

AWS AI document references:

Multimodal Foundation Models on AWS

Building AI Tutors with Generative Models

by Merrilee at Jun 30, 2026, 06:48 AM

Limited Time Offer

25%

Off

Get Premium AIF-C01 Questions as Interactive Web-Based Practice Test or PDF

Contribute your Thoughts:

Submit Cancel

Currently there are no comments in this discussion, be the first to comment!