Active Learning with AutoNLP and Prodigy for NER and text classification
AI Impact Summary
Describes a repeatable active-learning workflow that pairs Prodigy-based labeling with AutoNLP training to improve NLP models iteratively, covering token classification (NER) and multi-class text classification. In the example, accuracy rises from ~86% with ~20 labeled samples to ~95.9% with ~250 samples, which shows the value of incremental labeling combined with automated hyperparameter tuning. The integration requires exporting Prodigy annotations to the AutoNLP data format and managing the data flow between labeling, preprocessing, and model retraining, with attention to cost and data privacy in production. The approach shortens time-to-value for production NLP workloads, but it adds pipeline complexity and incurs ongoing labeling costs.
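The export step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual conversion script: it assumes Prodigy NER annotations exported as JSONL with `tokens`, `spans` (with inclusive `token_start`/`token_end` indices), and `answer` fields, and it assumes the training side accepts one JSON object per line with parallel `tokens` and `tags` lists. The BIO tagging scheme and the helper names `prodigy_to_token_tags` and `convert_lines` are illustrative choices, not part of either tool.

```python
import json


def prodigy_to_token_tags(record):
    """Convert one Prodigy NER record into parallel token and tag lists.

    Assumes spans carry inclusive token_start/token_end indices.
    The BIO scheme used here is an assumption, not a documented requirement.
    """
    tokens = [tok["text"] for tok in record["tokens"]]
    tags = ["O"] * len(tokens)
    for span in record.get("spans", []):
        start, end = span["token_start"], span["token_end"]
        tags[start] = "B-" + span["label"]
        for i in range(start + 1, end + 1):
            tags[i] = "I-" + span["label"]
    return {"tokens": tokens, "tags": tags}


def convert_lines(lines):
    """Convert Prodigy JSONL lines, keeping only accepted annotations."""
    converted = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("answer") != "accept":
            continue  # skip rejected or ignored examples
        converted.append(json.dumps(prodigy_to_token_tags(record)))
    return converted


if __name__ == "__main__":
    # Hypothetical sample record, shaped like a Prodigy ner.manual export.
    sample = json.dumps({
        "text": "Apple hires Tim Cook",
        "tokens": [
            {"text": "Apple", "id": 0},
            {"text": "hires", "id": 1},
            {"text": "Tim", "id": 2},
            {"text": "Cook", "id": 3},
        ],
        "spans": [
            {"token_start": 0, "token_end": 0, "label": "ORG"},
            {"token_start": 2, "token_end": 3, "label": "PERSON"},
        ],
        "answer": "accept",
    })
    for row in convert_lines([sample]):
        print(row)
```

Filtering on `answer == "accept"` keeps rejected or skipped examples out of the training set, so each retraining round sees only annotations a reviewer confirmed.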
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info