Commit a87e007

server : add min_p param (#3877)

* Update server.cpp with min_p after it was introduced in ggml-org/llama.cpp#3841
* Use spaces instead of tabs
* Update index.html.hpp after running deps.sh
* Fix test - fix line ending

1 parent a137c38

File tree: 4 files changed, +2191 −2171 lines

Diff for: examples/server/README.md

+2 lines

@@ -122,6 +122,8 @@ node index.js

    `top_p`: Limit the next token selection to a subset of tokens with a cumulative probability above a threshold P (default: 0.95).

+   `min_p`: The minimum probability for a token to be considered, relative to the probability of the most likely token (default: 0.05).

    `n_predict`: Set the maximum number of tokens to predict when generating text. **Note:** May exceed the set limit slightly if the last token is a partial multibyte character. When 0, no tokens will be generated but the prompt is evaluated into the cache. (default: -1, -1 = infinity).

    `n_keep`: Specify the number of tokens from the prompt to retain when the context size is exceeded and tokens need to be discarded.
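To illustrate what the new `min_p` parameter does, here is a minimal sketch of min-p filtering in Python. It keeps only the tokens whose probability is at least `min_p` times the probability of the most likely token. The function name and structure are illustrative only, not llama.cpp's internal API.

```python
import math

def min_p_filter(logits, min_p=0.05):
    """Sketch of min-p sampling: keep tokens whose probability is at
    least min_p times the probability of the most likely token.
    Hypothetical helper, not llama.cpp's actual implementation."""
    # Numerically stable softmax over the logits.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The cutoff is relative to the top token's probability.
    threshold = min_p * max(probs)
    # Return the indices of tokens that survive the filter.
    return [i for i, p in enumerate(probs) if p >= threshold]

# With the default min_p=0.05, the very unlikely third token is dropped.
print(min_p_filter([2.0, 1.0, -3.0]))
```

Unlike `top_p`, which trims the tail by cumulative probability mass, `min_p` scales the cutoff with the model's confidence: when the top token is very likely, more of the tail is discarded.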
