StarCoder2-Instruct: Self-aligned, fully transparent code-model training pipeline
AI Impact Summary
StarCoder2-Instruct showcases a fully transparent, self-aligned pipeline that fine-tunes StarCoder2-15B into StarCoder2-15B-Instruct using self-generated instruction-response data validated by sandboxed execution tests. The data pipeline draws seed Python functions from The Stack v1, generates roughly 238k instructions, and, after execution-based filtering, yields about 50k high-quality SFT pairs that score above 70 on HumanEval and outperform several permissively licensed baselines. This demonstrates that competitive instruction tuning for code generation is feasible without distilling from GPT-4 or other proprietary teachers, enabling license-friendly customization with auditable data provenance. Business implications include faster, compliance-ready deployment of code assistants and reduced vendor lock-in, while governance around data provenance and distribution effects remains important.
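The execution-based filtering step can be sketched as follows. This is a minimal illustration of the idea, not the project's actual harness: each candidate is assumed to be a self-contained script bundling a solution with its asserted test cases, and `passes_in_sandbox` is a hypothetical helper that keeps only candidates whose assertions pass in a separate interpreter process.

```python
import os
import subprocess
import sys
import tempfile

def passes_in_sandbox(candidate_code: str, timeout_s: float = 5.0) -> bool:
    """Run a candidate script in a child interpreter; keep it only if it
    exits cleanly (all assertions pass) within the time limit."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_code)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            timeout=timeout_s,
        )
        return result.returncode == 0
    except subprocess.TimeoutExpired:
        return False
    finally:
        os.remove(path)

# Hypothetical candidates: one correct solution, one buggy.
good = "def add(a, b):\n    return a + b\n\nassert add(2, 3) == 5\n"
bad = "def add(a, b):\n    return a - b\n\nassert add(2, 3) == 5\n"

kept = [c for c in (good, bad) if passes_in_sandbox(c)]
print(len(kept))  # the buggy candidate is filtered out
```

A production version would add stronger isolation (containers, resource limits, no network) since self-generated code is untrusted by construction.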
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info