StarCoder2 released: open-code LLM family (3B/7B/15B) trained on The Stack v2
AI Impact Summary
StarCoder2 introduces an open family of code LLMs (3B, 7B, and 15B parameters) trained on The Stack v2, featuring a 16K-token context window and a fill-in-the-middle (FIM) training objective. By releasing the models, datasets, and training code, BigCode enables teams to experiment with open code LLMs using NVIDIA NeMo on NVIDIA-accelerated infrastructure and to deploy via the Hugging Face Hub. The Stack v2's repository-scoped context and broad language coverage expand applicability to diverse coding tasks, potentially accelerating development cycles for code completion, synthesis, and tooling in enterprise environments.
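The fill-in-the-middle objective mentioned above lets the model complete a span of code given both the code before and after it. A minimal sketch of building such a prompt is shown below; the sentinel token names (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`) follow the convention used by earlier StarCoder-family tokenizers and are an assumption here — verify them against the released model's tokenizer special tokens before use.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Format a prefix-suffix-middle (PSM) fill-in-the-middle prompt.

    The model is expected to generate the missing middle segment
    after the <fim_middle> sentinel. Sentinel token names are assumed
    from the StarCoder convention, not confirmed for StarCoder2.
    """
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"


# Example: ask the model to fill in the body of a function.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
```

The resulting string would be passed to the tokenizer and model as an ordinary causal-LM prompt; generation stops when the model emits its end-of-text token.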
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info