Welcome to Pass4Success


Databricks Certified Generative AI Engineer Associate Exam - Topic 5 Question 4 Discussion

Actual exam question from the Databricks Certified Generative AI Engineer Associate exam
Question #: 4
Topic #: 5
[All Databricks Certified Generative AI Engineer Associate Questions]

A Generative AI Engineer developed an LLM application using the provisioned throughput Foundation Model API. Now that the application is ready to be deployed, they realize their volume of requests is not high enough to justify a dedicated provisioned throughput endpoint. They want to choose the most cost-effective strategy for their application.

What strategy should the Generative AI Engineer use?

A. Switch to using External Models instead
B. Deploy the model using pay-per-token throughput
C. Change to a model with fewer parameters
D. Manually throttle the requests

Suggested Answer: B

Problem Context: The engineer needs a cost-effective deployment strategy for an LLM application with relatively low request volume.

Explanation of Options:

Option A: Switching to external models may not provide the control or integration that the application requires.

Option B: Using a pay-per-token model is cost-effective, especially for applications with variable or low request volumes, as it aligns costs directly with usage.

Option C: Changing to a model with fewer parameters could reduce costs, but might also impact the performance and capabilities of the application.

Option D: Manually throttling requests is a less efficient and potentially error-prone strategy for managing costs.

Option B is ideal, offering flexibility and cost control, aligning expenses directly with the application's usage patterns.
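The cost argument for Option B can be made concrete with a simple break-even calculation. The sketch below compares a usage-based (pay-per-token) bill against a fixed provisioned-capacity bill for one month; all rates and volumes are hypothetical placeholders, not actual Databricks pricing.

```python
# Hedged sketch: pay-per-token vs. provisioned throughput cost comparison.
# All rates and volumes below are hypothetical, for illustration only.

def pay_per_token_cost(tokens_per_month: float, rate_per_million: float) -> float:
    """Usage-based cost: scales linearly with token volume."""
    return tokens_per_month / 1_000_000 * rate_per_million

def provisioned_cost(hours_per_month: float, rate_per_hour: float) -> float:
    """Fixed cost: you pay for reserved capacity whether or not it is used."""
    return hours_per_month * rate_per_hour

TOKENS = 20_000_000   # 20M tokens/month -- a low-volume application (placeholder)
PPT_RATE = 2.00       # $ per 1M tokens (placeholder)
PROV_RATE = 10.00     # $ per hour of provisioned capacity (placeholder)
HOURS = 730           # endpoint kept up all month

ppt = pay_per_token_cost(TOKENS, PPT_RATE)    # 20 * 2.00  = $40.00
prov = provisioned_cost(HOURS, PROV_RATE)     # 730 * 10.00 = $7,300.00
print(f"pay-per-token: ${ppt:,.2f}  provisioned: ${prov:,.2f}")

# Break-even: the monthly token volume above which provisioned capacity
# becomes the cheaper choice under these placeholder rates.
break_even_tokens = prov / PPT_RATE * 1_000_000
print(f"break-even at ~{break_even_tokens / 1e6:,.0f}M tokens/month")
```

Under these placeholder rates, the low-volume workload sits far below the break-even point, which is exactly the situation described in the question: pay-per-token aligns spend with actual usage, while a provisioned endpoint charges for idle capacity.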


Contribute your Thoughts:

Karan
3 months ago
Wow, I didn't know throttling could be a strategy!
upvoted 0 times
...
Tonette
3 months ago
C could help with performance, but is it really necessary?
upvoted 0 times
...
Jacqueline
3 months ago
Not sure about B, what if the usage spikes later?
upvoted 0 times
...
Rory
4 months ago
I agree, pay-per-token is a smart move!
upvoted 0 times
...
Teri
4 months ago
B seems like the best option for cost-effectiveness.
upvoted 0 times
...
Nobuko
4 months ago
Changing to a model with fewer parameters sounds like it could help, but I’m not convinced it directly addresses the cost issue.
upvoted 0 times
...
Leana
4 months ago
I feel like we practiced a similar question where throttling requests was mentioned, but I don't think that's the best strategy for cost-effectiveness.
upvoted 0 times
...
Joseph
4 months ago
I'm not entirely sure, but switching to external models seems like it could save costs too. I need to think about the trade-offs.
upvoted 0 times
...
Hoa
5 months ago
I remember we discussed the cost implications of different throughput options in class. I think pay-per-token might be the best choice here.
upvoted 0 times
...
Nichelle
5 months ago
I'm not sure about Option A. Switching to external models might not be the best solution if the current model is working well.
upvoted 0 times
...
Colene
5 months ago
I'm leaning towards Option B. The pay-per-token throughput seems like it would provide the most cost guarantees, which is important for this application.
upvoted 0 times
...
Sina
5 months ago
I think the key here is to find a way to reduce the hardware constraints and still maintain the performance of the model. Option C seems like the best approach to me.
upvoted 0 times
...
Han
5 months ago
Hmm, I'm a bit confused by the question. I'm not sure which strategy would be the most cost-effective.
upvoted 0 times
...
Eric
5 months ago
This seems like a tricky question. I'll need to think through the different options carefully.
upvoted 0 times
...
Fernanda
1 year ago
Ah, the joys of scaling AI applications. I'd say option B is the way to go, but maybe they should also consider a backup plan just in case. You know, like a Plan B.
upvoted 0 times
...
Albina
1 year ago
I bet this Generative AI Engineer wishes they had a crystal ball to see the future. Oh well, I'd go with option B and cross my fingers.
upvoted 0 times
Berry
1 year ago
Let's hope it works out for the Generative AI Engineer.
upvoted 0 times
...
Rosendo
1 year ago
Agreed, it's better to have cost guarantees.
upvoted 0 times
...
Delbert
1 year ago
Yeah, pay-per-token throughput seems like a good choice.
upvoted 0 times
...
Rosita
1 year ago
I think option B sounds like a safe bet.
upvoted 0 times
...
...
Jolanda
1 year ago
Manually throttling the requests? That's just asking for trouble. Option D seems like a band-aid solution to me.
upvoted 0 times
Reed
1 year ago
Changing to a model with fewer parameters might help reduce hardware constraint issues as well.
upvoted 0 times
...
Maryann
1 year ago
Maybe switching to External Models would be a better long-term strategy.
upvoted 0 times
...
Madelyn
1 year ago
Deploying the model using pay-per-token throughput could also be a cost-effective option.
upvoted 0 times
...
Soledad
1 year ago
I agree, manually throttling requests is not a sustainable solution.
upvoted 0 times
...
...
Oliva
1 year ago
Hmm, I'm not so sure about that. Reducing the number of parameters might be a better idea to avoid hardware constraints. Option C looks promising.
upvoted 0 times
...
Rolande
1 year ago
I think option B is the way to go. Pay-per-token throughput sounds like a good cost-effective solution for this scenario.
upvoted 0 times
Edelmira
1 year ago
I think it depends on the specific needs of the application and the budget constraints.
upvoted 0 times
...
Freeman
1 year ago
But wouldn't switching to External Models be a better long-term solution?
upvoted 0 times
...
Fannie
1 year ago
I agree, option B seems like the most cost-effective choice.
upvoted 0 times
...
...
Tasia
1 year ago
I disagree, I believe deploying the model using pay-per-token throughput would be more cost-effective in the long run.
upvoted 0 times
...
Yolando
1 year ago
I think the best strategy would be to switch to using External Models instead.
upvoted 0 times
...
