Responses API & Built-in Tools
Responses API
GPT‑OSS is compatible with the OpenAI Responses API. Use `client.responses.create(...)` with `model`, `instructions`, `input`, and optional `tools` to enable built‑in tool use. Set the reasoning level via `instructions`, e.g., "Reasoning: high"; supported levels are low (fast), medium (balanced), and high (deep).
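A minimal sketch of this call using the OpenAI Python SDK, assuming a local SGLang server at `http://localhost:30000/v1` and the `openai/gpt-oss-120b` checkpoint (adjust both for your deployment):

```python
def build_request(question: str, effort: str = "high") -> dict:
    """Assemble Responses API arguments; the reasoning level rides in `instructions`."""
    return {
        "model": "openai/gpt-oss-120b",          # assumed model name
        "instructions": f"Reasoning: {effort}",  # low | medium | high
        "input": question,
    }

def ask(question: str) -> str:
    """Send the request to a local server; requires `pip install openai`."""
    from openai import OpenAI
    client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
    resp = client.responses.create(**build_request(question))
    return resp.output_text
```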
Built-in Tools
GPT‑OSS can call built‑in tools for web search and Python execution. You can use the demo tool server or connect to external MCP tool servers.
Python Tool
- Executes short Python snippets for calculations, parsing, and quick scripts.
- By default, runs in a Docker-based sandbox. To run on the host instead, set `PYTHON_EXECUTION_BACKEND=UV` (this executes model-generated code locally; use with care).
- Ensure Docker is available if you are not using the UV backend. It is recommended to run `docker pull python:3.11` in advance.
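A minimal sketch of the backend switch described above, assuming the demo reads `PYTHON_EXECUTION_BACKEND` from the environment (`"UV"` runs model-generated code directly on the host; anything else falls back to the Docker sandbox):

```python
import os

def python_backend() -> str:
    """Return the configured execution backend, defaulting to the Docker sandbox.
    Treat the "docker" default label as an assumption for illustration."""
    return "UV" if os.environ.get("PYTHON_EXECUTION_BACKEND") == "UV" else "docker"
```

Exporting `PYTHON_EXECUTION_BACKEND=UV` before launching the demo opts into host execution; leaving it unset keeps the sandbox.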
Web Search Tool
- Uses the Exa backend for web search.
- Requires an Exa API key; set `EXA_API_KEY` in your environment. Create a key at https://exa.ai.
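A small sketch of a fail-fast check for this key before launching the demo (the helper name is hypothetical; only the `EXA_API_KEY` variable comes from the docs above):

```python
import os

def require_exa_key() -> str:
    """Return EXA_API_KEY, or raise with a pointer to where to create one."""
    key = os.environ.get("EXA_API_KEY", "")
    if not key:
        raise RuntimeError("EXA_API_KEY is not set; create a key at https://exa.ai")
    return key
```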
Tool & Reasoning Parser
- We support the OpenAI reasoning and tool-call parsers, as well as the SGLang native API for tool calls and reasoning. Refer to the reasoning parser and tool call parser documentation for more details.
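As an illustration, the parsers are typically selected via launch flags; the parser names below are placeholders, so check your SGLang version's `--help` for the values that match GPT-OSS:

```python
# Hypothetical launch arguments enabling the reasoning and tool-call parsers.
# Both parser name values are assumptions, not confirmed by the docs above.
parser_flags = [
    "--reasoning-parser", "gpt-oss",   # assumed parser name
    "--tool-call-parser", "gpt-oss",   # assumed parser name
]
```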
Notes
- Use Python 3.12 for the demo tools, and install the required `gpt-oss` packages.
- The default demo integrates the web search tool (Exa backend) and a demo Python interpreter via Docker.
- For search, set `EXA_API_KEY`. For Python execution, either have Docker available or set `PYTHON_EXECUTION_BACKEND=UV`.
Command
Speculative Decoding
SGLang supports speculative decoding for GPT-OSS models using the EAGLE3 algorithm. This can significantly improve decoding speed, especially at small batch sizes. Usage: add `--speculative-algorithm EAGLE3` along with the draft model path.
Command
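The flag above can be sketched as a launch argument list; everything except `--speculative-algorithm EAGLE3` (the entry point, model path, and draft-model flag name) is an assumption to adapt for your setup:

```python
# Sketch of a server launch with EAGLE3 speculative decoding enabled.
cmd = [
    "python", "-m", "sglang.launch_server",                   # assumed entry point
    "--model-path", "openai/gpt-oss-120b",                    # assumed model path
    "--speculative-algorithm", "EAGLE3",                      # from the docs above
    "--speculative-draft-model-path", "path/to/draft-model",  # assumed flag name
]
print(" ".join(cmd))
```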
Quick Demo
Example
Output
