Microsoft AI-103 Exam Questions

Exam Name: Microsoft Developing AI Apps and Agents on Azure Exam
Exam Code: AI-103
Related Certification(s): Microsoft Azure AI Apps and Agents Developer Associate Certification
Certification Provider: Microsoft
Actual Exam Duration: 120 Minutes
Number of AI-103 practice questions in our database: 67 (updated: Jun. 01, 2026)

Free Microsoft AI-103 Exam Actual Questions

Note: Premium questions for AI-103 were last updated on Jun. 01, 2026.

Question #1

You have a Microsoft Foundry project that contains a high-traffic agent.

After a recent update, operational costs increase significantly.

Monitoring confirms that the volume of user traffic to the agent remains unchanged.

You suspect that changes to the request or response characteristics are causing the increase.

You need to identify whether the additional costs are driven by the model input size, the model output size, or expanded tool usage.

Which observability capability should you use?

Correct Answer: B

The correct capability is token usage. In Microsoft Foundry observability, token consumption is the primary signal for diagnosing model-cost changes when request volume is unchanged. Token usage lets you distinguish whether costs increased because prompts became larger, retrieved or tool-provided context expanded, responses became longer, or agent execution added more model calls. Microsoft Foundry monitoring dashboards track operational metrics such as token consumption, latency, error rates, and quality scores, and the agent monitoring dashboard is specifically intended to help analyze token usage, latency, success rates, and evaluation outcomes for production traffic.

This directly matches the scenario because the issue is not more traffic, but changed request or response characteristics. Input tokens reveal whether the prompt, chat history, grounding data, or tool outputs being sent to the model increased. Output tokens reveal whether the model is generating longer completions. Expanded tool usage can also increase cost indirectly by adding more tool results, intermediate calls, and context into subsequent model requests; Foundry tracing and observability capture tool usage and token consumption for agent runs.

Evaluation metrics assess response quality and safety, not cost drivers. Latency identifies performance delays, and run success rate measures reliability. Reference topics: Microsoft Foundry observability, agent monitoring dashboard, token consumption, cost analysis, tool usage, and production monitoring.
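As a rough illustration of the reasoning above, the sketch below compares average input tokens, output tokens, and tool calls per run before and after an update. The `RunUsage` record shape and the sample numbers are hypothetical, not an export format from Microsoft Foundry; in practice the counts would come from whatever the monitoring dashboard or traces report.

```python
from dataclasses import dataclass


@dataclass
class RunUsage:
    """Token counts for one agent run (hypothetical record shape)."""
    input_tokens: int
    output_tokens: int
    tool_calls: int


def summarize(runs):
    """Return per-run averages for input tokens, output tokens, and tool calls."""
    n = len(runs)
    if n == 0:
        raise ValueError("need at least one run")
    return {
        "input_tokens": sum(r.input_tokens for r in runs) / n,
        "output_tokens": sum(r.output_tokens for r in runs) / n,
        "tool_calls": sum(r.tool_calls for r in runs) / n,
    }


def relative_change(before, after):
    """Relative change per metric between two summaries (0.5 means +50%)."""
    return {
        k: (after[k] - before[k]) / before[k] if before[k] else float("inf")
        for k in before
    }


# Made-up sample data: same traffic volume, different request characteristics.
before = [RunUsage(1000, 200, 1), RunUsage(1200, 220, 1)]
after = [RunUsage(2100, 210, 3), RunUsage(2300, 210, 3)]

change = relative_change(summarize(before), summarize(after))
largest = max(change, key=change.get)
print(change, largest)
```

In this made-up sample, output tokens are flat while tool calls and input tokens both grow, which is the pattern the explanation describes: more tool activity feeding more context into later model requests.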


Question #2

You are deploying a support agent that enables users to upload photos.

You need to automatically classify uploaded images for harmful content. The solution must block content based on severity levels.

What should you do?

Correct Answer: A

The correct answer is A. Implement image moderation. Azure AI Content Safety provides image analysis that classifies uploaded images for harmful content, including harm categories such as hate, sexual content, violence, and self-harm. Microsoft's Content Safety overview states that the Analyze Image API scans images for harmful content with multi-severity levels, which directly matches the requirement to automatically classify uploaded photos and block content based on configured severity thresholds.

Prompt Shields are intended to detect prompt injection and jailbreak-style attacks against generative models, not to classify image harm categories. Keyword scanning OCR output would only detect visible text extracted from the image and would miss visual harm in the image itself. Blocklists can help match known words or custom patterns, but they are not a complete image safety classifier and do not provide the built-in severity-based image harm classification required here. Image moderation is therefore the correct control for user-uploaded photos. Reference topics: Azure AI Content Safety, image moderation, harm categories, severity levels, Foundry guardrails, and responsible AI controls.
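The severity-based blocking step can be sketched as a small decision function. This is a minimal sketch that assumes the image analysis result has already been reduced to a list of category and severity pairs; the dictionary shape, the `should_block` helper, and the threshold values are illustrative assumptions, not the Content Safety SDK's own types or recommended settings.

```python
def should_block(categories_analysis, thresholds, default_threshold=4):
    """Return (blocked, reasons) for a list of {"category", "severity"} results.

    An image is blocked when any category's severity meets or exceeds
    the threshold configured for that category.
    """
    reasons = []
    for item in categories_analysis:
        limit = thresholds.get(item["category"], default_threshold)
        if item["severity"] >= limit:
            reasons.append(
                f'{item["category"]}: severity {item["severity"]} >= {limit}'
            )
    return bool(reasons), reasons


# Hypothetical per-category thresholds chosen for illustration only.
THRESHOLDS = {"Hate": 2, "SelfHarm": 2, "Sexual": 4, "Violence": 4}

# Hypothetical analysis result for one uploaded photo.
result = [
    {"category": "Hate", "severity": 0},
    {"category": "SelfHarm", "severity": 0},
    {"category": "Sexual", "severity": 0},
    {"category": "Violence", "severity": 4},
]

blocked, reasons = should_block(result, THRESHOLDS)
print(blocked, reasons)
```

Keeping thresholds per category lets a support scenario be stricter for some harm categories than others without changing the classification step itself.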


Question #3

You have a chat app in a Microsoft Foundry project and an Azure AI Search vectorized index.

You need to connect to the index to meet the following requirements:

* Complex questions must retrieve information from multiple chunks.

* Multi-turn conversations must influence retrieval planning.

* Retrievals must run in parallel to reduce latency.

Which retrieval approach should you use?

Correct Answer: C

The correct answer is agentic Retrieval Augmented Generation (RAG) because the requirements describe the agentic retrieval pipeline in Azure AI Search. Agentic retrieval is designed for chat and copilot scenarios where a user's request can be complex, conversational, and dependent on prior turns. Azure AI Search agentic retrieval uses an LLM-assisted planning stage to break a complex request into focused subqueries, allowing the system to retrieve grounding information from multiple chunks rather than relying on a single query path. Microsoft's Azure AI Search guidance describes agentic retrieval as a multi-query pipeline for complex questions in chat and agent workflows, with subqueries that can include chat history for additional context.

This also satisfies the latency requirement because agentic retrieval runs the generated subqueries in parallel and then merges and reranks the best results for use by the generative model. Classic RAG is simpler and typically sends a single query to search, making it less suitable for multi-hop or conversational retrieval planning. Chain of thought is a reasoning technique, not an Azure AI Search retrieval approach, and iterative retrieval does not specifically provide the built-in query planning, conversation-aware retrieval, and parallel execution described here. Reference topics: Azure AI Search agentic retrieval, RAG with Azure AI Search, knowledge bases, query planning, and generative AI grounding.
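The plan, fan out, and merge shape described above can be sketched in plain Python. Everything here is a stand-in: `plan_subqueries` replaces the LLM-assisted planner with a naive string split, `search` scores an in-memory corpus by word overlap instead of calling Azure AI Search, and the merge step is a simple best-score sort rather than semantic reranking. The sketch shows only the control flow, not the service's API.

```python
from concurrent.futures import ThreadPoolExecutor

# Tiny in-memory stand-in for an index of document chunks.
CORPUS = {
    "doc1": "vpn setup requires the corporate certificate",
    "doc2": "password reset is done through the self-service portal",
    "doc3": "vpn access needs manager approval",
}


def plan_subqueries(question, history):
    """Stand-in planner: split on ' and ' and prepend the latest chat turn."""
    parts = [p.strip() for p in question.lower().rstrip("?").split(" and ")]
    context = " ".join(history[-1:]).lower()
    return [f"{context} {p}".strip() for p in parts]


def search(subquery):
    """Stand-in search: score each chunk by the number of shared words."""
    terms = set(subquery.split())
    hits = []
    for doc_id, text in CORPUS.items():
        score = len(terms & set(text.split()))
        if score:
            hits.append((doc_id, score))
    return hits


def agentic_retrieve(question, history, top_k=2):
    """Plan subqueries, run them in parallel, then merge by best score."""
    subqueries = plan_subqueries(question, history)
    with ThreadPoolExecutor(max_workers=len(subqueries)) as pool:
        result_lists = list(pool.map(search, subqueries))
    best = {}
    for hits in result_lists:
        for doc_id, score in hits:
            best[doc_id] = max(score, best.get(doc_id, 0))
    # Highest score first; ties broken by document id for stable output.
    return sorted(best, key=lambda d: (-best[d], d))[:top_k]


print(agentic_retrieve("What does setup require and who gives approval?", ["vpn"]))
```

Neither subquery alone ranks both VPN chunks highly; the earlier chat turn ("vpn") steers both subqueries, and running them in parallel surfaces one chunk per sub-question.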


Question #4

You are creating an agent workflow in a Microsoft Foundry project to support natural voice interactions.

The agent must receive continuous audio input, convert the input into text for reasoning, and then return spoken responses to a user. The workflow must meet the following requirements:

* Support turn-taking dynamics, where the agent begins to generate the speech output before the user finishes speaking.

* Operate with low latency to maintain a conversational experience.

You need to enable both speech to text and text to speech in a real-time agent interaction.

What should you do?

Correct Answer: D

The correct answer is D. Use real-time speech to text for incoming audio and text to speech for agent responses. The workflow requires continuous audio input, low-latency transcription for reasoning, and spoken output back to the user. Real-time speech to text in Azure Speech in Foundry Tools is designed for immediate transcription from streaming audio, which satisfies the incoming-audio side of the interaction. Text to speech provides the outbound spoken response path after the agent generates its answer.

This pattern aligns with Microsoft's real-time voice-agent architecture. The Voice Live API overview explains that low-latency speech-to-speech systems integrate speech recognition, generative reasoning, and text-to-speech functionality to create natural voice experiences. It also identifies contact centers as a key scenario and highlights low perceived latency for end users. Embeddings do not decode audio into conversational speech. Batch transcription introduces file-oriented delay and is not suitable for turn-taking. Speech translation is only appropriate when translating between languages and does not provide the required reasoning-plus-spoken-response loop. Reference topics: Azure Speech in Foundry Tools, real-time speech to text, text to speech, voice agents, low-latency interaction, and conversational turn-taking.


Question #5

You have a Microsoft Foundry project that uses Azure AI Search to ground an agent in internal documentation.

After a recent content update, users report that the agent's answers have become less accurate.

You need to identify whether the retrieved content is negatively influencing the model's generated responses.

Which observability signal should you review?

Correct Answer: B

The correct observability signal is B. groundedness evaluation metrics. In a RAG solution, the key diagnostic question is whether the generated answer is supported by the retrieved context. Microsoft Foundry's built-in evaluator reference defines Groundedness as the metric that measures how grounded the response is in the retrieved context, with scoring that indicates whether the model's claims are supported by the provided source material.

This matches the issue after a content update. If retrieved chunks are stale, misleading, incomplete, or poorly aligned with the user query, groundedness results can show that generated responses are not reliably supported by the retrieved documentation. The RAG evaluator guidance explains that groundedness focuses on whether the response avoids content outside the grounding context, while other process metrics such as retrieval evaluate how relevant the retrieved chunks are. Latency traces are useful for performance troubleshooting, not response accuracy. Indexer status can reveal ingestion failures, but it does not show whether retrieved content is influencing generated answers negatively. Prediction drift is a model monitoring concept and is not the primary signal for RAG grounding quality. Reference topics: Microsoft Foundry observability, RAG evaluators, groundedness, retrieved context, and response quality evaluation.
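To make the idea of "supported by the retrieved context" concrete, the sketch below uses a crude lexical proxy: the fraction of response sentences whose content words mostly appear in the retrieved text. This is explicitly not the Foundry groundedness evaluator, which is not reproduced here; the `lexical_groundedness` function, the word-length filter, and the 0.6 overlap cutoff are all assumptions chosen for illustration.

```python
import re


def content_words(text):
    """Lowercased words longer than three characters, as a crude content filter."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def lexical_groundedness(response, context, min_overlap=0.6):
    """Fraction of response sentences whose content words mostly appear in context."""
    ctx = content_words(context)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]
    if not sentences:
        return 0.0
    supported = 0
    for sentence in sentences:
        words = content_words(sentence)
        if words and len(words & ctx) / len(words) >= min_overlap:
            supported += 1
    return supported / len(sentences)


# Hypothetical retrieved chunk and generated answer.
context = "Password resets are handled through the self-service portal."
answer = (
    "Password resets are handled through the portal. "
    "Resets require calling the helpdesk."
)

score = lexical_groundedness(answer, context)
print(score)
```

Here the first sentence is backed by the retrieved chunk and the second is not, so half of the answer counts as supported. A real groundedness evaluation judges meaning rather than word overlap, but the signal it reports has the same shape: how much of the response the retrieved context actually backs.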


