Key features
- Broad model support: Wan series, FastWan series, Hunyuan, Qwen-Image, Qwen-Image-Edit, Flux, Z-Image, GLM-Image, and more
- Fast inference: optimized kernels, efficient scheduler loop, and Cache-DiT acceleration
- Ease of use: OpenAI-compatible API, CLI, and Python SDK
- Multi-platform: NVIDIA GPUs (H100, H200, A100, B200, RTX 4090), AMD GPUs (MI300X, MI325X), and Ascend NPUs (A2, A3)
Quick start
- Install SGLang Diffusion
- Run a one-off generation
- Serve with the OpenAI-compatible API
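
As a rough illustration of the OpenAI-compatible API item above, the sketch below builds an image-generation request with only the Python standard library. It assumes a server is already running, that it listens on `http://localhost:30000`, and that it exposes the OpenAI-style `/v1/images/generations` route. The base URL, the `your-model-id` placeholder, and the helper names `build_image_request` and `generate_image` are all hypothetical; check the serving docs for the actual values and supported fields.

```python
import json
import urllib.request


def build_image_request(base_url, model, prompt, size="1024x1024", n=1):
    """Build a POST request in the shape of the OpenAI Images API.

    Assumption: the server accepts the OpenAI-style fields used here
    (model, prompt, size, n). Supported fields may differ per model.
    """
    payload = {"model": model, "prompt": prompt, "size": size, "n": n}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_image(base_url, model, prompt, timeout=600):
    """Send the request and return the decoded JSON response."""
    req = build_image_request(base_url, model, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Example call (requires a running server; URL and model id are placeholders):
# result = generate_image("http://localhost:30000", "your-model-id",
#                         "A watercolor fox in a snowy forest")
```

Because the request follows the OpenAI wire format, an existing OpenAI client library pointed at the same base URL should work as well, under the same assumptions.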
