In the context of evaluating a fine-tuned LLM for a text classification task, which experimental design technique ensures robust performance estimation when dealing with imbalanced datasets?
Stratified k-fold cross-validation is the experimental design technique that ensures robust performance estimation on imbalanced datasets. It partitions the dataset into k folds while preserving the class distribution in each fold, so the model is trained and evaluated on representative samples of every class. NVIDIA's NeMo documentation on model evaluation recommends stratified cross-validation for tasks like text classification to obtain reliable performance estimates, particularly when classes are unevenly distributed (e.g., sentiment analysis with few negative samples). Option A (single hold-out) is less robust, because a single split may not reflect the class imbalance. Option C (bootstrapping) introduces sampling variability and can under-represent minority classes. Option D (grid search) is a hyperparameter-tuning method, not a performance-estimation technique.
NVIDIA NeMo Documentation: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/model_finetuning.html
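As a rough illustration, the sketch below runs stratified k-fold evaluation with scikit-learn's StratifiedKFold on a synthetic imbalanced dataset. The LogisticRegression classifier and the generated data are placeholders, not part of the original question; in practice the model would be the fine-tuned LLM classifier. The pattern that carries over is the stratified splitting and the per-fold macro-F1 scoring, which weights the minority class equally with the majority class.

```python
# Minimal sketch: stratified k-fold evaluation on an imbalanced dataset.
# LogisticRegression stands in for the fine-tuned LLM classifier (assumption).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

# Synthetic imbalanced dataset: roughly 90% class 0, 10% class 1.
X, y = make_classification(
    n_samples=1000, n_classes=2, weights=[0.9, 0.1], random_state=0
)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in skf.split(X, y):
    # Each fold preserves the ~90/10 class ratio of the full dataset.
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    preds = model.predict(X[test_idx])
    # Macro F1 averages per-class F1, so the minority class counts equally.
    scores.append(f1_score(y[test_idx], preds, average="macro"))

print(f"Macro F1: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

Reporting the mean and standard deviation across folds, rather than a single hold-out score, is what makes the estimate robust: it shows how much performance varies with the particular split.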