Name and Version

I'm using a llama-server executable that I compiled on 3/6/25 from the master branch, built with -DLLAMA_CUDA:

llama-cli.exe --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 4831 (5e43f104)
built with MSVC 19.39.33523.0 for x64
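
For reference, the build invocation was of roughly this shape (reconstructed, not the exact command; the flag spelling follows the report, and newer llama.cpp trees spell the CUDA flag GGML_CUDA instead):

```sh
# Reconstructed build commands (assumption, not the exact invocation).
# Note: recent llama.cpp trees use -DGGML_CUDA=ON in place of -DLLAMA_CUDA=ON.
cmake -B build -DLLAMA_CUDA=ON
cmake --build build --config Release
```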
Operating systems
Windows
GGML backends
CUDA
Hardware
RTX 3090
Models
Qwen2.5-7B-Instruct-1M-Q4_K_M.gguf
Problem description & steps to reproduce
Occasionally, I see a tool_call that comes back as response.choices[0].message.content rather than as response.choices[0].message.tool_calls. When this happens, I tried parsing the <tool_call> value out of the content, and found that the JSON inside the <tool_call> tags is invalid, failing with: "Expected ',' or '}' after property value in JSON at position"
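
The check I use to catch the bad case looks roughly like this (the endpoint, model name, and tool schema below are placeholders for illustration, not my exact setup):

```ts
// Detection sketch for the bug: a tool call that leaks into message.content
// as raw <tool_call> text instead of arriving in message.tool_calls.
// Endpoint, model name, and the get_weather tool are hypothetical.
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:8080/v1", // llama-server's OpenAI-compatible API
  apiKey: "sk-no-key",
});

async function main() {
  const response = await client.chat.completions.create({
    model: "Qwen2.5-7B-Instruct-1M", // served model; name is a placeholder
    messages: [{ role: "user", content: "What is the weather in Paris?" }],
    tools: [{
      type: "function",
      function: {
        name: "get_weather", // hypothetical tool
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    }],
  });

  const msg = response.choices[0].message;
  if (msg.tool_calls?.length) {
    // Good case: the server parsed the tool call for us.
    console.log("tool_calls:", msg.tool_calls);
  } else if (msg.content?.includes("<tool_call>")) {
    // Bad case: the tool call came back as raw text in content.
    const raw = msg.content.split("<tool_call>")[1].split("</tool_call>")[0];
    try {
      JSON.parse(raw);
    } catch (e) {
      // This is where the "Expected ',' or '}' after property value
      // in JSON at position ..." error quoted above shows up.
      console.error("invalid JSON inside <tool_call>:", e);
    }
  }
}

main();
```

In the good case msg.tool_calls is populated and the content branch is never hit.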
Example of the issue:
Example code with debugger and values:
Example of a good result, with the same parameters & prompt
First Bad Commit
No response
Relevant log output