Welcome to the gpt-oss series, OpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. The gpt-oss-20b model is a 21-billion-parameter variant with 3.6B active parameters, optimized for lower latency and for local or specialized deployments; thanks to MXFP4 quantization of the MoE weights, it can run within 16GB of memory. The model was trained on OpenAI's harmony response format and supports configurable reasoning effort levels (low, medium, high) to trade off speed against depth of analysis. Key capabilities include full chain-of-thought output for debugging and transparency, agentic features such as function calling and web browsing, and support for structured outputs. The model is released under the permissive Apache 2.0 license, is fully fine-tunable for specialized applications, and can be served through multiple inference frameworks, including Transformers, vLLM, and Ollama.
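As a minimal sketch of what a request to gpt-oss-20b might look like, the snippet below builds a chat message list with a reasoning-effort hint in the system message. It assumes the Hugging Face model id `openai/gpt-oss-20b` and the convention that reasoning effort is selected via a `Reasoning: low|medium|high` line in the system prompt; both are assumptions to verify against the official model card.

```python
# Sketch of a chat request for gpt-oss-20b (assumptions noted in the lead-in).
# Reasoning effort is assumed to be set via the system message, per the
# harmony format's "Reasoning: low|medium|high" convention.
messages = [
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "Summarize MXFP4 quantization in one sentence."},
]

# With Transformers installed, these messages could be passed to a
# text-generation pipeline. The call is commented out here because it
# downloads the full model weights (~16 GB is assumed sufficient memory):
#
# from transformers import pipeline
# generator = pipeline(
#     "text-generation",
#     model="openai/gpt-oss-20b",  # assumed Hugging Face model id
#     torch_dtype="auto",
#     device_map="auto",
# )
# result = generator(messages, max_new_tokens=256)
```

The same message list can be reused across backends (vLLM's OpenAI-compatible server, Ollama) since all of them accept role/content chat messages.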