NVIDIA NCP-AAI Exam - Topic 4 Question 4 Discussion

Question

In your RAG deployment, you've identified a performance bottleneck in the retrieval phase, specifically the time it takes to access the vector database. Which of the following optimization strategies is most aligned with microservice best practices, considering your RAG architecture?

A) Implement a "cache-and-check" mechanism where the retrieval microservice immediately returns the first matching chunk, regardless of relevance.

B) Increase the size of the LLM itself, because it will automatically accelerate the overall response time.

D) Optimize the LLM prompt to be shorter and more concise, significantly reducing the computational load.

Accepted Answer

C) Introduce a dedicated service responsible solely for querying the vector database and returning relevant chunks.
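
To make option C concrete, here is a minimal sketch of what a dedicated retrieval service might look like. Everything in it is an assumption for illustration only: the `Chunk` and `RetrievalService` names, the in-memory list standing in for a real vector database, and the brute-force cosine ranking. A real deployment would expose this behind its own network endpoint and call an actual vector store, so that retrieval can be scaled, cached, and tuned independently of the LLM service.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    # Hypothetical record: a text chunk plus its precomputed embedding.
    text: str
    embedding: list


def cosine(a, b):
    # Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class RetrievalService:
    """Hypothetical service whose only responsibility is vector lookup."""

    def __init__(self, chunks):
        # An in-memory list stands in for the vector database (assumption).
        self._chunks = list(chunks)

    def retrieve(self, query_embedding, top_k=2):
        # Rank every chunk by similarity and return the top_k texts.
        ranked = sorted(
            self._chunks,
            key=lambda c: cosine(query_embedding, c.embedding),
            reverse=True,
        )
        return [c.text for c in ranked[:top_k]]


service = RetrievalService([
    Chunk("GPU memory tuning", [1.0, 0.0]),
    Chunk("Vector index sharding", [0.0, 1.0]),
    Chunk("Index replication", [0.1, 0.9]),
])
results = service.retrieve([0.0, 1.0], top_k=2)
print(results)
```

Because the rest of the pipeline depends only on `retrieve`, the storage backend behind it could be swapped or scaled without touching the generation side, which is the separation of concerns the accepted answer points to.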
