Issues: vllm-project/vllm

Pinned issues:

[Roadmap] vLLM Roadmap Q1 2025
#11862 opened Jan 8, 2025 by simon-mo (Open, 7 comments)

[V1] Feedback Thread
#12568 opened Jan 30, 2025 by simon-mo (Open, 60 comments)

Issues list

[V1] [Performance] Optimize Cascade Kernel  (label: feature request)
#14729 opened Mar 13, 2025 by LiuXiaoxuanPKU

[Doc]: Document reasoning outputs in structured outputs only works in v0  (label: documentation)
#14727 opened Mar 13, 2025 by gaocegege

[Usage]: how can i run deepseek-ai/deepseek-V3 with vllm?  (label: usage)
#14726 opened Mar 13, 2025 by lddlww

[Feature]: gemma3 raise error  (label: feature request)
#14723 opened Mar 13, 2025 by moseshu

[Feature]: Support openai responses API interface  (label: feature request)
#14721 opened Mar 13, 2025 by guunergooner

[RFC][V1][Spec Decode] V1 Spec Decode Eagle Support  (label: feature request)
#14719 opened Mar 13, 2025 by LiuXiaoxuanPKU

[Usage]: Can AsyncLLMEngine support batch infer?  (label: usage)
#14717 opened Mar 13, 2025 by UndefinedMan

[Bug]: Short prompts -> !!!!!!! output from Qwen2.5-32B-Instruct-GPTQ-Int4 w/ROCm  (label: bug)
#14715 opened Mar 13, 2025 by bjj

[Bug]: vLLM CPU inference does not use AVX512 on AVX512-capable CPU  (label: bug)
#14701 opened Mar 12, 2025 by Nero10578

[Performance]: [V1] duplicated prefill tokens for n>1  (label: performance)
#14686 opened Mar 12, 2025 by hewr2010

[Feature]: Data parallel inference in offline mode  (label: feature request)
#14683 opened Mar 12, 2025 by re-imagined

[Bug]: Phi-4-mini function calling support  (label: bug)
#14682 opened Mar 12, 2025 by kinfey

[Bug]: Unit test tests/models/embedding/vision_language/test_phi3v.py failing  (labels: bug, good first issue, help wanted)
#14677 opened Mar 12, 2025 by tjtanaa

[Bug]: "POST /v1/audio/transcriptions HTTP/1.1" 400 Bad Request  (label: bug)
#14676 opened Mar 12, 2025 by digicontacts

[Bug]: Vllm automatically restarts while using cortecs/phi-4-FP8-Dynamic  (label: bug)
#14675 opened Mar 12, 2025 by ubairnisar

[Usage]: Supports int2 quantized models  (label: usage)
#14674 opened Mar 12, 2025 by yuhuilcm

[Bug]: ROCm fail to build due to compilation error of moe_wna16.cu  (label: bug)
#14669 opened Mar 12, 2025 by tjtanaa