
After fine-tuning DeepSeek R1, how should I load the LoRA adapter for inference? #7185

Open
1 task done
joyyyhuang opened this issue Mar 6, 2025 · 3 comments
Labels
bug Something isn't working pending This problem is yet to be addressed

Comments

@joyyyhuang

Reminder

  • I have read the above rules and searched the existing issues.

System Info

  • llamafactory version: 0.9.2.dev0
  • Platform: Linux-5.4.241-1-tlinux4-0017.7-x86_64-with-glibc2.28
  • Python version: 3.12.9
  • PyTorch version: 2.5.1+cu124 (GPU)
  • Transformers version: 4.48.3
  • Datasets version: 3.2.0
  • Accelerate version: 1.2.1
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA H20
  • GPU number: 8
  • GPU memory: 95.00GB
  • DeepSpeed version: 0.16.4
  • vLLM version: 0.7.2

Reproduction

I tried the following:

  1. Loading the LoRA weights with vllm serve:
vllm serve /root/modelzoo/DeepSeek-R1-BF16 \
    --tensor-parallel-size 8 \
    --pipeline-parallel-size 2 \
    --trust-remote-code \
    --max-num-seqs 16 \
    --max-model-len 16384 \
    --enable-lora \
    --lora-modules lora1=LLaMA-Factory/saves/deepseek-r1

This resulted in an error.
(error screenshot omitted)

2. Using llamafactory export to merge the LoRA weights; this is the config file I used:

### Note: DO NOT use quantized model or quantization_bit when merging lora adapters

### model
model_name_or_path: /root/modelzoo/DeepSeek-R1-BF16
adapter_name_or_path: saves/deepseek-r1
template: deepseek3
trust_remote_code: true

### export
export_dir: output/DeepSeek-R1-SFT
export_size: 5
export_device: cpu
export_legacy_format: false 

However, it is quite slow. I would like to ask what export_size: 5 here actually means, and what the correct way is to deploy the fine-tuned R1 model.
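For context, the plan (once the merge finishes) would be to serve the merged checkpoint with something like the sketch below; the output path mirrors the export_dir above, and the parallelism flags are carried over from my earlier attempt, so treat them as assumptions:

```shell
# Sketch: serve the merged model directly. No --enable-lora or
# --lora-modules flags are needed, since the adapter weights are
# already baked into the merged checkpoint.
vllm serve output/DeepSeek-R1-SFT \
    --tensor-parallel-size 8 \
    --pipeline-parallel-size 2 \
    --trust-remote-code \
    --max-num-seqs 16 \
    --max-model-len 16384
```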

Others

No response

joyyyhuang added the bug (Something isn't working) and pending (This problem is yet to be addressed) labels Mar 6, 2025
@hiyouga (Owner) commented Mar 6, 2025

export_size is the maximum size (in GB) of each exported weight shard file.
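To illustrate, here is a hypothetical back-of-the-envelope sketch (not LLaMA-Factory code) of how export_size maps to the number of shard files; the 1342 GB figure is my own estimate (671B params × 2 bytes in BF16), not an official number:

```python
# Hypothetical helper illustrating how export_size relates to the
# number of shard files written during export.
import math

def num_shards(total_model_gb: float, export_size_gb: float) -> int:
    """Number of weight shard files when each shard holds at most export_size_gb."""
    return math.ceil(total_model_gb / export_size_gb)

# DeepSeek-R1 has ~671B parameters; in BF16 (2 bytes/param) that is
# roughly 1342 GB of weights, so export_size: 5 yields ~269 shards.
print(num_shards(1342, 5))  # -> 269
```

A larger export_size means fewer, bigger files; the merge itself runs on CPU here (export_device: cpu), which is one reason it is slow.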

@yoshi315 commented Mar 7, 2025

May I ask how many machines you used, and how many GPUs per machine? Thanks.

@joyyyhuang (Author) commented

7 machines with 8 H20 GPUs each.
