Responses API & Built-in Tools
Responses API
GPT‑OSS is compatible with the OpenAI Responses API. Use `client.responses.create(...)` with `model`, `instructions`, `input`, and optional `tools` to enable built‑in tool use. Set the reasoning level via `instructions`, e.g., "Reasoning: high"; supported levels are low (fast), medium (balanced), and high (deep).
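A minimal sketch of this call using the OpenAI Python SDK, assuming a local SGLang server at `http://localhost:30000/v1` and the `openai/gpt-oss-120b` checkpoint (adjust both for your deployment):

```python
def build_request(question: str, effort: str = "high") -> dict:
    """Assemble Responses API arguments; the reasoning level rides in `instructions`."""
    return {
        "model": "openai/gpt-oss-120b",          # assumed model name
        "instructions": f"Reasoning: {effort}",  # low | medium | high
        "input": question,
    }

def ask(question: str) -> str:
    """Send the request to a local server; requires `pip install openai`."""
    from openai import OpenAI
    client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
    resp = client.responses.create(**build_request(question))
    return resp.output_text
```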
Built-in Tools
GPT‑OSS can call built‑in tools for web search and Python execution. You can use the demo tool server or connect to external MCP tool servers.
Python Tool
- Executes short Python snippets for calculations, parsing, and quick scripts.
- By default, runs in a Docker-based sandbox. To run on the host instead, set `PYTHON_EXECUTION_BACKEND=UV` (this executes model-generated code locally; use with care).
- Ensure Docker is available if you are not using the UV backend. It is recommended to run `docker pull python:3.11` in advance.
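A minimal sketch of the backend switch described above, assuming the demo reads `PYTHON_EXECUTION_BACKEND` from the environment (`"UV"` runs model-generated code directly on the host; anything else falls back to the Docker sandbox):

```python
import os

def python_backend() -> str:
    """Return the configured execution backend, defaulting to the Docker sandbox.
    Treat the "docker" default label as an assumption for illustration."""
    return "UV" if os.environ.get("PYTHON_EXECUTION_BACKEND") == "UV" else "docker"
```

Exporting `PYTHON_EXECUTION_BACKEND=UV` before launching the demo opts into host execution; leaving it unset keeps the sandbox.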
Web Search Tool
- Uses the Exa backend for web search.
- Requires an Exa API key; set `EXA_API_KEY` in your environment. Create a key at https://exa.ai.
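A small sketch of a fail-fast check for this key before launching the demo (the helper name is hypothetical; only the `EXA_API_KEY` variable comes from the docs above):

```python
import os

def require_exa_key() -> str:
    """Return EXA_API_KEY, or raise with a pointer to where to create one."""
    key = os.environ.get("EXA_API_KEY", "")
    if not key:
        raise RuntimeError("EXA_API_KEY is not set; create a key at https://exa.ai")
    return key
```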
Tool & Reasoning Parser
- We support the OpenAI reasoning and tool-call parsers, as well as the SGLang native API for tool calls and reasoning. Refer to the reasoning parser and tool call parser documentation for more details.
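As an illustration, the parsers are typically selected via launch flags; the parser names below are placeholders, so check your SGLang version's `--help` for the values that match GPT-OSS:

```python
# Hypothetical launch arguments enabling the reasoning and tool-call parsers.
# Both parser name values are assumptions, not confirmed by the docs above.
parser_flags = [
    "--reasoning-parser", "gpt-oss",   # assumed parser name
    "--tool-call-parser", "gpt-oss",   # assumed parser name
]
```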
Notes
- Use Python 3.12 for the demo tools, and install the required `gpt-oss` packages.
- The default demo integrates the web search tool (Exa backend) and a demo Python interpreter via Docker.
- For search, set `EXA_API_KEY`. For Python execution, either have Docker available or set `PYTHON_EXECUTION_BACKEND=UV`.
Command
Speculative Decoding
SGLang supports speculative decoding for GPT-OSS models using the EAGLE3 algorithm. This can significantly improve decoding speed, especially at small batch sizes. Usage: add `--speculative-algorithm EAGLE3` along with the draft model path.
Command
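The flag above can be sketched as a launch argument list; everything except `--speculative-algorithm EAGLE3` (the entry point, model path, and draft-model flag name) is an assumption to adapt for your setup:

```python
# Sketch of a server launch with EAGLE3 speculative decoding enabled.
cmd = [
    "python", "-m", "sglang.launch_server",                   # assumed entry point
    "--model-path", "openai/gpt-oss-120b",                    # assumed model path
    "--speculative-algorithm", "EAGLE3",                      # from the docs above
    "--speculative-draft-model-path", "path/to/draft-model",  # assumed flag name
]
print(" ".join(cmd))
```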
Quick Demo
Example
Output
