OpenAI unveils GPT OSS open-source model family (gpt-oss-120b/20b) with MXFP4 4-bit MoE under Apache 2.0
AI Impact Summary
OpenAI has released GPT OSS, a new open-source mixture-of-experts (MoE) model family comprising gpt-oss-120b and gpt-oss-20b, both using MXFP4 4-bit quantization. The 20B model runs on consumer GPUs with roughly 16 GB of memory, while the 120B variant requires roughly 80 GB, enabling on-device or private deployments via Hugging Face Inference Providers and the OpenAI-compatible Responses API. Realizing this in production will require a capable software stack (transformers v4.55+, vLLM, llama.cpp, ollama), optional acceleration kernels (Flash Attention 3, Triton, kernels-community), and careful consideration of licensing (Apache 2.0) and MoE routing performance.
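To see why MXFP4 makes these footprints plausible, here is a back-of-envelope sketch. It assumes MXFP4's microscaling layout of 4 bits per weight plus one shared 8-bit scale per block of 32 weights; the block size and the assumption that all parameters are quantized are simplifications for illustration (in practice some tensors, e.g. attention and embeddings, stay in higher precision, which is why the quoted footprints of ~16 GB and ~80 GB are somewhat larger).

```python
def mxfp4_weight_gb(n_params: float, block_size: int = 32) -> float:
    """Approximate weight storage in GB under an MXFP4-style scheme:
    4 bits per weight plus one 8-bit shared scale per block of weights.
    (Assumes every parameter is quantized -- a simplification.)"""
    bits = n_params * 4 + (n_params / block_size) * 8
    return bits / 8 / 1e9  # bits -> bytes -> GB

# Rough weight-only footprints for the two model sizes
print(round(mxfp4_weight_gb(20e9), 1))   # ~10.6 GB for gpt-oss-20b
print(round(mxfp4_weight_gb(120e9), 1))  # ~63.8 GB for gpt-oss-120b
```

The remaining headroom in the quoted figures goes to unquantized tensors, activations, and the KV cache at inference time.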
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info