
Eval bug: Llama server <tool_call> is occasionally not parsed as json, and is in content rather than tool_calls #12256

Closed
jasonmcaffee opened this issue Mar 7, 2025 · 2 comments · Fixed by #12291

Comments

@jasonmcaffee

Name and Version

llama-cli.exe --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:    no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 4831 (5e43f104)
built with MSVC 19.39.33523.0 for x64

I'm using a llama-server executable that I compiled on 3/6/25 from the master branch, with CUDA enabled:

cmake .. -DLLAMA_CUDA=ON

cmake --build . --config Release

Operating systems

Windows

GGML backends

CUDA

Hardware

RTX 3090

Models

Qwen2.5-7B-Instruct-1M-Q4_K_M.gguf

Problem description & steps to reproduce

Occasionally, I see a tool call that comes back as raw text in response.choices[0].message.content, rather than as a parsed entry in response.choices[0].message.tool_calls.

Example of the issue:

{
  "role": "assistant",
  "content": "<tool_call>\n{\"name\": \"aiCreatePlan\", \"arguments\": {...}}}\n</tool_call>"
}

Example code with debugger and values:

[Image]

Example of a good result, with the same parameters & prompt:

[Image]

First Bad Commit

No response

Relevant log output

const response = await openai.chat.completions.create({
  model: model.modelName,
  messages: openAiMessages,
  tools: aiFunctionContext.aiFunctionExecutor?.getToolsMetadata(),
  stream: false,
}, { signal });

const assistantMessage = response.choices[0].message;

// Add the assistant's message to our conversation
openAiMessages.push({
  role: 'assistant' as const,
  content: assistantMessage.content,
  tool_calls: assistantMessage.tool_calls,
});
const toolCallsFromOpenAi = assistantMessage.tool_calls;
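Until the server-side fix lands, one possible client-side mitigation is to detect a raw `<tool_call>` block in `message.content` and attempt to parse it. This is only a sketch of my own: `extractToolCallFromContent` is a hypothetical helper, not part of the OpenAI SDK, and it simply gives up (returns `null`) when the embedded JSON is malformed, as it sometimes is in this bug:

```typescript
// Hypothetical fallback (my own helper, not an SDK function): if the server
// returns the tool call as raw <tool_call>...</tool_call> text inside
// message.content, try to recover it. Returns the parsed tool call, or null
// when there is no <tool_call> block or the JSON inside it is invalid.
function extractToolCallFromContent(
  content: string | null,
): { name: string; arguments: unknown } | null {
  if (!content) return null;
  const match = content.match(/<tool_call>\s*([\s\S]*?)\s*<\/tool_call>/);
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    // Malformed JSON, e.g. "Expected ',' or '}' after property value in JSON"
    return null;
  }
}
```

A caller could check this helper whenever `assistantMessage.tool_calls` is empty, though it cannot recover tool calls whose JSON payload is itself broken.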
@jasonmcaffee
Copy link
Author

I tried parsing the <tool_call> value when this happens, and found that the JSON inside the <tool_call> is invalid, failing with: "Expected ',' or '}' after property value in JSON at position"
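For what it's worth, that error message matches what V8's `JSON.parse` raises when, for example, a comma is missing between object properties. A minimal reproduction (my own contrived input, not the actual model output):

```typescript
// Contrived example: a missing comma between JSON properties is one way to
// trigger a SyntaxError like "Expected ',' or '}' after property value".
const malformed = '{"name": "aiCreatePlan" "arguments": {}}';
let parseError: unknown = null;
try {
  JSON.parse(malformed);
} catch (e) {
  parseError = e; // SyntaxError
}

// The same payload with the comma restored parses fine:
const valid = JSON.parse('{"name": "aiCreatePlan", "arguments": {}}');
```

So the model appears to occasionally emit a tool-call payload that is not valid JSON, which would explain why the server leaves it in `content` instead of `tool_calls`.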

@ochafik
Collaborator

ochafik commented Mar 10, 2025

@jasonmcaffee thanks for reporting this!

I think this may be a regression from #12034; it might be fixed by #12291.
