SGLang supports model families across text generation, retrieval and ranking, and reward workflows. The sections below group them by task; follow the detail pages when you are ready to explore a specific model class.

Text generation

Large language models

Production-tuned Llama and Qwen families validated for high-throughput serving.

Vision language models

Vision-text hybrids that stay responsive on multi-GPU setups.

Diffusion language models

Score-based and diffusion backbones for structured text generation workflows.

Retrieval and ranking

Embedding models

Dense and sparse embeddings optimized with FlashInfer kernels.

Rerank models

Low-latency rerankers for multi-stage retrieval pipelines.

Classification models

Lightweight classifiers covering safety, intent, and context filters.

Specialized models

Reward models

RLHF and reward scoring pipelines optimized for production latency.