
Issue with Merging LoRA in Qwen 2.5 (3B) GRPO #9

Open
raising-heart opened this issue Feb 15, 2025 · 5 comments

Comments

@raising-heart

Hi,

I tested Qwen 2.5 (3B) with GRPO on Kaggle, and after merging using 16-bit, it seems like the LoRA adaptations are not applied properly. The output lacks reasoning compared to the fine-tuned model before merging.

Is there anyone who can help identify what might be going wrong?

Thanks!

@Erland366
Collaborator

Hello, we will check again, but in the meantime, can you elaborate on how you did the merging, and whether you used the exact same input? Sometimes people forget to include the system prompt when running inference.

@raising-heart
Author

Thanks for checking.

For merging, I used:

model.save_pretrained_merged(new_model_local, tokenizer, save_method="merged_16bit")

After merging, I loaded the model and ran inference with the same input and system prompt as before merging. However, the reasoning part no longer appears in the output.
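For reference, a minimal sketch of that load-and-infer step (assuming the standard transformers API; the path, SYSTEM_PROMPT, and the question below are placeholders, not the exact values I used):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

new_model_local = "qwen2.5-3b-grpo-merged-16bit"  # hypothetical path to the merged folder
SYSTEM_PROMPT = (
    "Respond in the following format:\n"
    "<reasoning>\n...\n</reasoning>\n<answer>\n...\n</answer>"
)

# Load the merged 16-bit checkpoint produced by save_pretrained_merged
tokenizer = AutoTokenizer.from_pretrained(new_model_local)
model = AutoModelForCausalLM.from_pretrained(new_model_local, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},                  # same system prompt as during training
    {"role": "user", "content": "Which is bigger, 9.11 or 9.9?"},  # example question
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```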

Let me know if I should test anything specific.

@raising-heart
Author

Hey, it's working now. I figured it out.

@sarthak247

I was going through the same notebook yesterday. Did you figure out what was causing it, @raising-heart? I tried it before merging but didn't take a look at it after merging the model.

@raising-heart
Author

raising-heart commented Feb 25, 2025

> I was going through the same notebook yesterday. Did you figure out what was causing it, @raising-heart? I tried it before merging but didn't take a look at it after merging the model.

What issue?

During the conversion/quantization step, change the option you selected from "False" to "True".
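For reference, a minimal sketch of what that toggle looks like in the notebook's saving section (based on the usual layout of the Unsloth notebooks; the exact cell contents and paths may differ):

```python
# The saving/export cells ship guarded with "if False:".
# Flip the guard to "if True:" for the export you actually want, e.g. merged 16-bit:
if True:  # was: if False
    model.save_pretrained_merged("model", tokenizer, save_method="merged_16bit")

# The GGUF conversion/quantization cells are guarded the same way:
if False:
    model.save_pretrained_gguf("model", tokenizer, quantization_method="q4_k_m")
```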

Try setting the Prompt Format in LM Studio like below:

"""Respond in the following format:
<reasoning>
...
<
/reasoning>
<answer>
...
<
/answer>
"""
