Issues: vllm-project/vllm

Pinned issues:

[Roadmap] vLLM Roadmap Q1 2025
#11862 opened Jan 8, 2025 by simon-mo (Open, 7 comments)

[V1] Feedback Thread
#12568 opened Jan 30, 2025 by simon-mo (Open, 60 comments)

Issues list

[V1] [Performance] Optimize Cascade Kernel  (label: feature request)
#14729 opened Mar 13, 2025 by LiuXiaoxuanPKU

[Doc]: Document reasoning outputs in structured outputs only works in v0  (label: documentation)
#14727 opened Mar 13, 2025 by gaocegege

[Usage]: how can i run deepseek-ai/deepseek-V3 with vllm?  (label: usage)
#14726 opened Mar 13, 2025 by lddlww

[Feature]: gemma3 raise error  (label: feature request)
#14723 opened Mar 13, 2025 by moseshu

[Feature]: Support openai responses API interface  (label: feature request)
#14721 opened Mar 13, 2025 by guunergooner

[RFC][V1][Spec Decode] V1 Spec Decode Eagle Support  (label: feature request)
#14719 opened Mar 13, 2025 by LiuXiaoxuanPKU

[Usage]: Can AsyncLLMEngine support batch infer?  (label: usage)
#14717 opened Mar 13, 2025 by UndefinedMan

[Bug]: Short prompts -> !!!!!!! output from Qwen2.5-32B-Instruct-GPTQ-Int4 w/ROCm  (label: bug)
#14715 opened Mar 13, 2025 by bjj

[Bug]: vLLM CPU inference does not use AVX512 on AVX512-capable CPU  (label: bug)
#14701 opened Mar 12, 2025 by Nero10578

[Performance]: [V1] duplicated prefill tokens for n>1  (label: performance)
#14686 opened Mar 12, 2025 by hewr2010

[Feature]: Data parallel inference in offline mode  (label: feature request)
#14683 opened Mar 12, 2025 by re-imagined

[Bug]: Phi-4-mini function calling support  (label: bug)
#14682 opened Mar 12, 2025 by kinfey

[Bug]: Unit test tests/models/embedding/vision_language/test_phi3v.py failing  (labels: bug, good first issue, help wanted)
#14677 opened Mar 12, 2025 by tjtanaa

[Bug]: "POST /v1/audio/transcriptions HTTP/1.1" 400 Bad Request  (label: bug)
#14676 opened Mar 12, 2025 by digicontacts

[Bug]: Vllm automatically restarts while using cortecs/phi-4-FP8-Dynamic  (label: bug)
#14675 opened Mar 12, 2025 by ubairnisar

[Usage]: Supports int2 quantized models  (label: usage)
#14674 opened Mar 12, 2025 by yuhuilcm

[Bug]: ROCm fail to build due to compilation error of moe_wna16.cu  (label: bug)
#14669 opened Mar 12, 2025 by tjtanaa