Commit 5a9770a

Improve documentation for server chat formats (ggml-org#934)

1 parent b8f29f4 commit 5a9770a

File tree

1 file changed: +9 -0 lines changed

README.md

@@ -177,6 +177,15 @@ Navigate to [http://localhost:8000/docs](http://localhost:8000/docs) to see the
To bind to `0.0.0.0` to enable remote connections, use `python3 -m llama_cpp.server --host 0.0.0.0`.
Similarly, to change the port (default is 8000), use `--port`.

You probably also want to set the prompt format. For chatml, use:

```bash
python3 -m llama_cpp.server --model models/7B/llama-model.gguf --chat_format chatml
```

That will format the prompt according to how the model expects it. You can find the prompt format in the model card.
For possible options, see [llama_cpp/llama_chat_format.py](llama_cpp/llama_chat_format.py) and look for lines starting with "@register_chat_format".

## Docker image

A Docker image is available on [GHCR](https://ghcr.io/abetlen/llama-cpp-python). To run the server:
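To make the documented flag concrete: once the server is running with `--chat_format chatml`, you can list the other registered format names and smoke-test the endpoint. The commands below are an illustrative sketch, not part of this commit's diff; they assume a checkout of the repository and a server listening on the default `localhost:8000`.

```bash
# List the registered chat format names the README points to
# (run from a checkout of the llama-cpp-python repository):
grep -n "@register_chat_format" llama_cpp/llama_chat_format.py

# Smoke-test the OpenAI-compatible chat endpoint; assumes the server
# started with the command above is on the default host and port:
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Say hello."}
        ]
      }'
```

If the chat format matches the model card, the reply should come back as a normal assistant message rather than raw template tokens.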
