In the context of evaluating a fine-tuned LLM for a text classification task, which experimental design technique ensures robust performance estimation when dealing with imbalanced datasets?
Stratified k-fold cross-validation is the appropriate experimental design technique for estimating model performance on imbalanced datasets. It divides the dataset into k folds while preserving the overall class distribution in each fold, so every fold used for evaluation contains a representative share of each class, including the minority classes. The model is trained and evaluated k times, each time holding out a different fold, and the k scores are averaged into a single estimate. NVIDIA's NeMo documentation on model evaluation recommends stratified cross-validation for tasks like text classification to obtain reliable performance estimates, particularly when classes are unevenly distributed (e.g., in sentiment analysis with few negative samples). Option A (single hold-out) is less robust: it yields one score from one split, and a random split can under-represent the minority class. Option C (bootstrapping) resamples with replacement, so unless the resampling is itself stratified, a given resample may contain few or no minority-class examples, which makes the estimate noisier on imbalanced data. Option D (grid search) is a hyperparameter tuning method, not a performance estimation technique.
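The fold construction described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function name `stratified_k_fold`, the round-robin assignment, and the 90/10 toy label list are all assumptions made for the example, not something taken from the NeMo documentation.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds preserve class ratios.

    Hypothetical helper written for illustration only.
    """
    rng = random.Random(seed)

    # Group example indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin across folds
    # so every fold receives roughly 1/k of each class.
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves as the test set once; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != i for j in fold
        )
        yield train_idx, test_idx


# Toy imbalanced sentiment labels (assumed data): 90 positive, 10 negative.
labels = ["pos"] * 90 + ["neg"] * 10
splits = list(stratified_k_fold(labels, k=5))

for fold_no, (train_idx, test_idx) in enumerate(splits):
    n_neg = sum(labels[i] == "neg" for i in test_idx)
    print(f"fold {fold_no}: test size={len(test_idx)}, negatives={n_neg}")
```

With these toy labels every test fold holds 20 examples, 2 of them negative, which matches the 10% minority rate of the full dataset. In practice one would more likely use scikit-learn's `StratifiedKFold` from `sklearn.model_selection`, fine-tune and score the model once per fold, and report the mean and spread of a metric suited to imbalance, such as macro-averaged F1.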
NVIDIA NeMo Documentation: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/model_finetuning.html