Issues: vllm-project/vllm
[V1] [Performance] Optimize Cascade Kernel
feature request
New feature or request
#14729
opened Mar 13, 2025 by
LiuXiaoxuanPKU
[Doc]: Document reasoning outputs in structured outputs only works in v0
documentation
Improvements or additions to documentation
#14727
opened Mar 13, 2025 by
gaocegege
[Usage]: How can I run deepseek-ai/deepseek-V3 with vLLM?
usage
How to use vllm
#14726
opened Mar 13, 2025 by
lddlww
[Feature]: Gemma3 raises an error
feature request
New feature or request
#14723
opened Mar 13, 2025 by
moseshu
[Bug]: With REINFORCE++ training under vLLM eager mode, greedy-decoding inference results differ significantly between eager mode and CUDA graph
bug
Something isn't working
#14722
opened Mar 13, 2025 by
yyzhangecnu
[Feature]: Support the OpenAI Responses API interface
feature request
New feature or request
#14721
opened Mar 13, 2025 by
guunergooner
[Bug]: NotImplementedError: Method 'get_kv_cache_spec' is not implemented.
bug
Something isn't working
#14720
opened Mar 13, 2025 by
lfoppiano
[RFC][V1][Spec Decode] V1 Spec Decode Eagle Support
feature request
New feature or request
#14719
opened Mar 13, 2025 by
LiuXiaoxuanPKU
[Usage]: Can AsyncLLMEngine support batch inference?
usage
How to use vllm
#14717
opened Mar 13, 2025 by
UndefinedMan
[Bug]: Short prompts -> !!!!!!! output from Qwen2.5-32B-Instruct-GPTQ-Int4 w/ROCm
bug
Something isn't working
#14715
opened Mar 13, 2025 by
bjj
[Bug]: ModuleNotFoundError: No module named 'vllm._C' at first start
bug
Something isn't working
#14714
opened Mar 13, 2025 by
laoLiDeHao
[WIP][RFC]: Use auto-functionalization V2 in PyTorch 2.7+
RFC
#14703
opened Mar 12, 2025 by
ProExpertProg
[Bug]: vLLM CPU inference does not use AVX512 on AVX512-capable CPU
bug
Something isn't working
#14701
opened Mar 12, 2025 by
Nero10578
[Bug]: When using vLLM to run the BGE-reranker-v2-m3 model for rerank requests, an error often occurs: "The model does not support the Rerank (Score) API"
bug
Something isn't working
#14693
opened Mar 12, 2025 by
super-noodle
[Performance]: [V1] duplicated prefill tokens for n>1
performance
Performance-related issues
#14686
opened Mar 12, 2025 by
hewr2010
[Feature]: Data parallel inference in offline mode
feature request
New feature or request
#14683
opened Mar 12, 2025 by
re-imagined
[Bug]: Phi-4-mini function calling support
bug
Something isn't working
#14682
opened Mar 12, 2025 by
kinfey
[Feature]: Memory interleaving: improve performance by increasing memory bandwidth between CPU and system memory
feature request
New feature or request
#14680
opened Mar 12, 2025 by
askervin
[Bug]: Unit test tests/models/embedding/vision_language/test_phi3v.py failing
bug
Something isn't working
good first issue
Good for newcomers
help wanted
Extra attention is needed
#14677
opened Mar 12, 2025 by
tjtanaa
[Bug]: "POST /v1/audio/transcriptions HTTP/1.1" 400 Bad Request
bug
Something isn't working
#14676
opened Mar 12, 2025 by
digicontacts
[Bug]: vLLM automatically restarts while using cortecs/phi-4-FP8-Dynamic
bug
Something isn't working
#14675
opened Mar 12, 2025 by
ubairnisar
[Usage]: Support for int2 quantized models
usage
How to use vllm
#14674
opened Mar 12, 2025 by
yuhuilcm
[Bug]: ROCm fails to build due to compilation error of moe_wna16.cu
bug
Something isn't working
#14669
opened Mar 12, 2025 by
tjtanaa
[Usage]: Qwen2.5-VL - BBOX Output Incorrect for Second Image when Request Contains 2 Images
usage
How to use vllm
#14662
opened Mar 12, 2025 by
MotorBottle