
Issue with Merging LoRA in Qwen 2.5 (3B) GRPO #9

Open
raising-heart opened this issue Feb 15, 2025 · 5 comments

Comments

@raising-heart

Hi,

I tested Qwen 2.5 (3B) with GRPO on Kaggle, and after merging using 16-bit, it seems like the LoRA adaptations are not applied properly. The output lacks reasoning compared to the fine-tuned model before merging.

Is there anyone who can help identify what might be going wrong?

Thanks!

@Erland366
Collaborator

Hello, we will check again, but in the meantime, can you elaborate on how you did the merging, and whether you used the exact same input? Sometimes people forget to include the system prompt when running inference.

@raising-heart
Author

Thanks for checking.

For merging, I used:

model.save_pretrained_merged(new_model_local, tokenizer, save_method="merged_16bit")

After merging, I loaded the model and ran inference with the same input and system prompt as before merging. However, the reasoning part no longer appears in the output.
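For reference, a minimal sketch of that load-and-infer step (assuming the standard transformers API; the path, SYSTEM_PROMPT, and the question below are placeholders, not the exact values I used):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

new_model_local = "qwen2.5-3b-grpo-merged-16bit"  # hypothetical path to the merged folder
SYSTEM_PROMPT = (
    "Respond in the following format:\n"
    "<reasoning>\n...\n</reasoning>\n<answer>\n...\n</answer>"
)

# Load the merged 16-bit checkpoint produced by save_pretrained_merged
tokenizer = AutoTokenizer.from_pretrained(new_model_local)
model = AutoModelForCausalLM.from_pretrained(new_model_local, torch_dtype="auto", device_map="auto")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},                  # same system prompt as during training
    {"role": "user", "content": "Which is bigger, 9.11 or 9.9?"},  # example question
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```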

Let me know if I should test anything specific.

@raising-heart
Author

Hey, it's working now. I figured it out.

@sarthak247

I was going through the same notebook yesterday. Did you figure out what was causing it, @raising-heart? I tried it before merging but didn't take a look at it after merging the model.

@raising-heart
Author

raising-heart commented Feb 25, 2025

> I was going through the same notebook yesterday. Did you figure out what was causing it, @raising-heart? I tried it before merging but didn't take a look at it after merging the model.

What issue?

During the conversion/quantization step, change the option you selected from "False" to "True".
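For reference, a minimal sketch of what that toggle looks like in the notebook's saving section (based on the usual layout of the Unsloth notebooks; the exact cell contents and paths may differ):

```python
# The saving/export cells ship guarded with "if False:".
# Flip the guard to "if True:" for the export you actually want, e.g. merged 16-bit:
if True:  # was: if False
    model.save_pretrained_merged("model", tokenizer, save_method="merged_16bit")

# The GGUF conversion/quantization cells are guarded the same way:
if False:
    model.save_pretrained_gguf("model", tokenizer, quantization_method="q4_k_m")
```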

Try setting the Prompt Format in LM Studio like below:

"""Respond in the following format:
<reasoning>
...
<
/reasoning>
<answer>
...
<
/answer>
"""
