SGLang supports model families across text generation, retrieval, and reward workflows. Browse the sections below for the primary product paths and jump to the detail pages when you are ready to explore a specific class.

Text generation

Large language models
Production-tuned Llama and Qwen families validated for high-throughput
serving.

Vision language models
Vision-text hybrids that stay responsive on multi-GPU setups.

Diffusion language models
Score-based and diffusion backbones for structured text generation
workflows.

Retrieval and ranking

Embedding models
Dense and sparse embeddings optimized with FlashInfer kernels.

Rerank models
Low-latency rerankers for multi-stage retrieval pipelines.

Classification models
Lightweight classifiers covering safety, intent, and context filters.
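
The embedding and rerank entries above describe a common multi-stage pattern: an embedding model recalls a broad candidate set cheaply, and a reranker rescores only those candidates. A minimal sketch of that flow in plain Python follows. It is not SGLang's API: `embed`, `rerank_score`, `toy_embed`, and `toy_rerank_score` are hypothetical stand-ins for calls to a served embedding model and rerank model, and the vocabulary and scoring rule are invented for illustration.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_stage_search(
    query: str,
    docs: Sequence[str],
    embed: Callable[[str], Vector],             # hypothetical: wraps an embedding model
    rerank_score: Callable[[str, str], float],  # hypothetical: wraps a rerank model
    recall_k: int = 3,
    final_k: int = 2,
) -> List[str]:
    """Recall candidates by embedding similarity, then reorder them with a reranker."""
    q_vec = embed(query)
    # Stage 1: cheap recall over the whole corpus.
    candidates = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)[:recall_k]
    # Stage 2: the more expensive scorer only sees the recalled candidates.
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)[:final_k]


# Toy stand-ins (assumptions for illustration only, not real models).
VOCAB = ["gpu", "serving", "latency", "cooking", "pasta"]


def toy_embed(text: str) -> List[float]:
    """Count how often each vocabulary word appears in the text."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def toy_rerank_score(query: str, doc: str) -> float:
    """Fraction of the document's words that also appear in the query."""
    q = set(query.lower().split())
    d = doc.lower().split()
    return len(q & set(d)) / len(d) if d else 0.0
```

In a real deployment the two callables would wrap requests to the served models; only the recall-then-rerank orchestration would carry over from this sketch.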

Specialized models

Reward models
RLHF and reward scoring pipelines optimized for production latency.
