CinePile 2.0 release uses adversarial refinement to strengthen long video QA data
AI Impact Summary
CinePile 2.0 introduces an adversarial refinement pipeline for long-video QA data. The pipeline iteratively rewrites questions and answer choices until a Deaf-Blind LLM can no longer answer them correctly, which targets degenerate items that are solvable without the video. LLaMA 3.1 70B serves as the Deaf-Blind LLM and GPT-4 performs the question modification, with templates and prompts drawn from Gemini 1.0 Pro, Gemini, GPT-3.5, and Phi-1.5. The reported success rate in removing degenerate items is 90.24%. This reduces the manual curation load and lets dataset improvement scale, which informs future dataset creation pipelines and benchmarking work done in collaboration with Hugging Face. The approach offers a practical path to higher-quality video QA benchmarks and, in turn, to more reliable evaluation of vision-language systems.
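The refinement loop described above can be sketched in a few lines. This is a minimal illustration, not the CinePile implementation: the `QAItem` type, the `is_degenerate` and `refine` helpers, the shuffled-trial check, and the round limit are all assumptions, and the toy model and rewriter stand in for the LLaMA 3.1 70B and GPT-4 calls, which are not shown.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class QAItem:
    question: str
    choices: List[str]
    answer: int  # index of the correct choice


# Hypothetical interfaces: a blind model sees only the question and the
# choices and returns the index it picks; a rewriter returns a modified item.
BlindModel = Callable[[str, List[str]], int]
Rewriter = Callable[[QAItem], QAItem]


def is_degenerate(item: QAItem, blind_model: BlindModel,
                  trials: int = 3, seed: int = 0) -> bool:
    """Return True if the blind model is correct in every shuffled trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        order = list(range(len(item.choices)))
        rng.shuffle(order)
        shuffled = [item.choices[i] for i in order]
        picked = blind_model(item.question, shuffled)
        if order[picked] != item.answer:  # map back to the original index
            return False
    return True


def refine(item: QAItem, blind_model: BlindModel, rewriter: Rewriter,
           max_rounds: int = 5) -> Tuple[QAItem, bool]:
    """Rewrite the item until the blind model fails or rounds run out."""
    for _ in range(max_rounds):
        if not is_degenerate(item, blind_model):
            return item, True
        item = rewriter(item)
    return item, not is_degenerate(item, blind_model)


# Toy stand-ins (assumptions for illustration only).
def toy_blind(question: str, choices: List[str]) -> int:
    # Shortcut heuristic: always pick the longest choice.
    return max(range(len(choices)), key=lambda i: len(choices[i]))


def toy_rewriter(item: QAItem) -> QAItem:
    # Shorten the correct choice so its length no longer gives it away.
    choices = list(item.choices)
    choices[item.answer] = "To hide"
    return QAItem(item.question, choices, item.answer)


item = QAItem(
    question="Why does the character leave the room?",
    choices=["To eat",
             "To hide from the guard who is searching the hallway",
             "To rest briefly",
             "To call"],
    answer=1,
)
refined, fixed = refine(item, toy_blind, toy_rewriter)
```

In this toy run the original item is degenerate because the correct choice is the longest one; a single rewrite removes that cue and the loop stops. A real pipeline would replace both callables with model calls and track the share of items for which `fixed` is true.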
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info