Advanced Features
Advanced configuration, optimization, and deployment features for SGLang.
Server Arguments
Hyperparameter Tuning
Attention Backend
Speculative Decoding
Structured Outputs
Quantization
Expert Parallelism
LoRA
PD Disaggregation
Pipeline Parallelism
HiCache
Observability
And more…
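Most of the features listed above are enabled through flags on the standard server launch command. As a minimal sketch (the model path and flag values here are illustrative; the authoritative list of supported flags is in Server Arguments):

```shell
# Launch an SGLang server with tensor parallelism across 2 GPUs.
# Model path, port, and parallelism degree are example values.
python -m sglang.launch_server \
  --model-path meta-llama/Llama-3.1-8B-Instruct \
  --port 30000 \
  --tp 2
```

Once the server is up, it exposes OpenAI-compatible endpoints on the chosen port, so existing OpenAI client code can point at it by changing only the base URL.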