OpenAI releases GPT OSS open-source model family (120B/20B) with 4-bit MXFP4 MoE quantization
AI Impact Summary
OpenAI has released GPT OSS, a new open-source model family comprising gpt-oss-120b and gpt-oss-20b, both mixture-of-experts (MoE) models with 4-bit MXFP4 quantization designed for fast inference and low memory use. This enables private or on-device deployment through diverse tooling, including Hugging Face Inference Providers, the OpenAI-compatible Responses API, and local ecosystems (transformers, vLLM, llama.cpp, ollama). The Apache 2.0 license and minimal usage policy open up experimentation and redistribution but introduce compliance considerations for deployment and governance. Technical teams should plan for GPU provisioning (80 GB for the 120B model, 16 GB for the 20B model), integration with existing inference pipelines, and validation of 4-bit MXFP4 weight loading and the associated kernels across supported runtimes.
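As a rough sanity check on the stated GPU budgets, the weight footprint of a ~4-bit quantized model can be estimated from the parameter count. This sketch is not from the release notes: it assumes the MX-style layout of 4-bit elements with one shared 8-bit scale per 32-element block (about 4.25 effective bits per weight), and it treats all parameters as quantized, whereas in practice only some layers may be in MXFP4, so real footprints will differ.

```python
# Back-of-envelope estimate (illustrative assumption, not a spec value):
# MXFP4 stores 4-bit elements plus one 8-bit shared scale per 32-element
# block, giving roughly 4 + 8/32 = 4.25 effective bits per weight.

def mxfp4_weight_gib(n_params: int, bits_per_weight: float = 4.25) -> float:
    """Approximate weight memory in GiB for a ~4-bit quantized model."""
    return n_params * bits_per_weight / 8 / (1024 ** 3)

# ~60 GiB and ~10 GiB of weights, leaving headroom within the stated
# 80 GB / 16 GB provisioning for activations, KV cache, and runtime overhead.
print(f"120B: ~{mxfp4_weight_gib(120_000_000_000):.1f} GiB of weights")
print(f" 20B: ~{mxfp4_weight_gib(20_000_000_000):.1f} GiB of weights")
```

The gap between the weight estimate and the provisioning figures is expected: serving also needs memory for activations, the KV cache, and any non-quantized layers.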
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info