Granite 4.1 LLMs: Multi-stage pre-training, 512K context, and SFT pipeline
AI Impact Summary
Granite 4.1 introduces decoder-only LLMs (3B, 8B, and 30B) trained on ~15T tokens with a five-stage pre-training pipeline and a long-context extension to 512K tokens. The release emphasizes data quality through an LLM-as-Judge SFT framework and a GRPO/DAPO-based RL loop that strengthen math, coding, and instruction following, aiming for better performance at smaller scales (notably the 8B model). The combination of long-context capability and rigorous data and training pipelines suggests meaningful gains for enterprise workloads that require extended interactions and reliable reasoning, while Apache 2.0 licensing eases integration into customer environments. Operators should plan to support long-context workloads and to integrate the LLM-as-Judge SFT data-quality pipeline into their deployment and retrieval stacks.
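The LLM-as-Judge SFT step mentioned above can be pictured as a filter: a judge model scores each candidate training sample and only samples above a quality threshold are kept. The sketch below is a minimal, hypothetical illustration of that pattern; the `judge_score` heuristic and the 7.0 threshold are stand-ins, not Granite's actual rubric (which would use an LLM grader).

```python
from dataclasses import dataclass

@dataclass
class SFTSample:
    prompt: str
    response: str

def judge_score(sample: SFTSample) -> float:
    """Hypothetical judge. In practice an LLM grades each sample
    (e.g. on a 0-10 scale) for correctness, helpfulness, and style;
    here a trivial length-based heuristic stands in for that call."""
    return min(10.0, len(sample.response.split()) / 5)

def filter_sft_data(samples: list[SFTSample], threshold: float = 7.0) -> list[SFTSample]:
    """Keep only samples the judge rates at or above the threshold."""
    return [s for s in samples if judge_score(s) >= threshold]

if __name__ == "__main__":
    samples = [
        SFTSample("Explain GRPO.", "GRPO is a reinforcement learning method " * 8),
        SFTSample("Hi", "Hello."),
    ]
    kept = filter_sft_data(samples)
    print(len(kept))  # the one-word reply falls below the threshold
```

A production version would batch judge calls, log scores for auditing, and feed the retained samples into the SFT stage; the filtering structure stays the same.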
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info