IBM Granite 4.0 Nano released: edge LLMs at ~1.5B and 350M with hybrid-SSM; supports vLLM, llama.cpp, MLX
AI Impact Summary
Granite 4.0 Nano delivers the smallest models in the Granite 4.0 family, optimized for edge and on-device use with a hybrid-SSM architecture. The release includes variants at roughly 1.5B and 350M parameters (H 1B at ~1.5B and H 350M, plus non-hybrid transformer versions), released under Apache 2.0 with native runtime support in vLLM, llama.cpp, and MLX. IBM's benchmarks claim leading performance among sub-2B models in knowledge, math, code, and safety, with IFEval and BFCLv3 showing gains over similarly sized peers. This positions IBM to address privacy-conscious, latency-sensitive workloads at the edge, though deployment will still require hardware capacity checks, compatibility with existing edge stacks, and governance considerations (ISO 42001).
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info