Issues: NVIDIA/TensorRT-LLM

[Issue Template] Short one-line summary of the issue
#783 opened Jan 1, 2024 by juney-nvidia

Qwen 32B becomes 8x slower when enabling LoRA
Labels: bug
#2883 opened Mar 13, 2025 by ShuaiShao93

Lookahead speculative decoding fails when applied to GPT-2 0.1B
#2879 opened Mar 12, 2025 by JeRainXiong

An illegal memory access was encountered with the Python benchmark
Labels: bug, Investigating
#2868 opened Mar 10, 2025 by yuqie

Could not run on a machine with dual RTX 5090s, using WSL2 and Docker
Labels: bug, Investigating, triaged
#2864 opened Mar 7, 2025 by shahizat

Could not run DeepSeek-R1 engine with NVIDIA Triton Server on 8x H200
Labels: triaged
#2863 opened Mar 7, 2025 by Bihan

Does TensorRT-LLM have a roadmap for techniques like DeepSeek's EPLB, DeepGEMM, FlashMLA, and DeepEP?
#2861 opened Mar 6, 2025 by dongs0104

TensorRT-LLM for Qwen2.5-VL
Labels: triaged
#2859 opened Mar 6, 2025 by ananthakrishnanpv01

Getting a runtime OOM error for generation logits that never recovers
Labels: Investigating, triaged, waiting for feedback
#2857 opened Mar 5, 2025 by tlyhenry

Cannot build engine with FP8 FMHA on Blackwell
Labels: Investigating, triaged
#2855 opened Mar 5, 2025 by indra83

Fails to convert InternVL2-1B model in container
Labels: bug, triaged
#2854 opened Mar 5, 2025 by tjliupeng

TensorRT-LLM 0.17.0.post1 fails to run Whisper on a 5080 GPU
Labels: bug, Investigating, triaged
#2847 opened Mar 3, 2025 by Banner-Wang

[Error] TypeError: LlmArgs.__init__() got an unexpected keyword argument 'enable_attention_dp'
Labels: Investigating, triaged
#2846 opened Mar 3, 2025 by tingjun-cs

[Error] ValueError: Unknown architecture for AutoModelForCausalLM: DeepseekV3ForCausalLM
Labels: triaged
#2845 opened Mar 3, 2025 by tingjun-cs

Inquiry about the 0.18 release plan and R1 TRT engine support in a different branch
Labels: triaged
#2844 opened Mar 3, 2025 by junliu-mde

tensorrt_llm_ucx_wrapper.dll and tensorrt_llm_ucx_wrapper.lib do not exist
Labels: triaged
#2834 opened Feb 28, 2025 by LRLVEC

RoBERTa model conversion does not pass the Hugging Face test
Labels: bug, Investigating, triaged
#2829 opened Feb 26, 2025 by arinaruck

Bottleneck in the _initialize_and_fill_output function in multimodal_runner_cpp.py
Labels: Investigating, triaged
#2827 opened Feb 26, 2025 by nicekevin

PyTorch backend run error with FP8 HF model
Labels: bug, triaged, waiting for feedback
#2825 opened Feb 26, 2025 by nickole2018

Baichuan2 model core dumps when run after quantization to FP8
Labels: bug, Investigating, triaged
#2824 opened Feb 26, 2025 by kanebay