NOTE: There are two chat template systems in SGLang project. This document is about setting a custom chat template for the OpenAI-compatible API server (defined at conversation.py). It is NOT related to the chat template used in the SGLang language frontend (defined at chat_template.py). By default, the server uses the chat template specified in the model tokenizer from Hugging Face. It should just work for most official models such as Llama-2/Llama-3. If needed, you can also override the chat template when launching the server:Documentation Index
Fetch the complete documentation index at: https://docs.sglang.io/llms.txt
Use this file to discover all available pages before exploring further.
Command
JSON Format
You can load the JSON format, which is defined byconversation.py.
Config
Command
Jinja Format
You can also use the Jinja template format as defined by Hugging Face Transformers.Command
