Issues: sgl-project/sglang

[Bug] flashinfer separate installation: Probably needs either code or documentation fix?
#4361 · opened Mar 13, 2025 by sliedes · 2 of 5 tasks

CUDA 12.4 sglang error: Failed to initialize the TMA descriptor 999
#4358 · opened Mar 13, 2025 by JackMeiLong

[Bug] undefined symbol: cublasGemmGroupedBatchedEx after make build in sgl-kernel for CUDA 12.6
#4357 · opened Mar 12, 2025 by hebiao064 · 4 of 5 tasks

[Feature] add abstraction for different platforms
Labels: enhancement, high priority
#4353 · opened Mar 12, 2025 by zhyncs · 2 tasks

Speculative Decoding Fails with AWQ Quantized Model
Labels: quant, speculative-decoding
#4351 · opened Mar 12, 2025 by keskinberkem

[Bug] sglang-router load model failed randomly
#4339 · opened Mar 12, 2025 by fmantianxing · 3 of 5 tasks

[Bug] fix DeepSeek V2/V3 awq
Labels: bug, good first issue, help wanted, quant
#4338 · opened Mar 12, 2025 by zhyncs · 5 tasks

[Feature] New models Gemma 3
Labels: good first issue, help wanted
#4332 · opened Mar 12, 2025 by Swipe4057 · 1 of 2 tasks

[Bug] --speculative-token-map error when TP>1
Labels: speculative-decoding
#4328 · opened Mar 12, 2025 by Achazwl · 5 tasks done

[Bug] fix gemma-2-2b-it-FP8 accuracy
Labels: bug, good first issue, help wanted, high priority, quant
#4324 · opened Mar 12, 2025 by zhyncs · 5 tasks

[Feature] Make Offline Engine work in Uvicorn web apps
#4319 · opened Mar 11, 2025 by YavorGIvanov · 2 tasks done

[Bug] NVIDIA_H100_PCIe Quantization and MoE config file not found
#4316 · opened Mar 11, 2025 by OpenHuShen · 5 tasks done

[Bug] qwq-32b does not support concurrent requests.
#4305 · opened Mar 11, 2025 by tingjun-cs · 5 tasks done

[Bug] Inference gets an error when the batch size is large
#4303 · opened Mar 11, 2025 by thuliu-yt16 · 5 tasks done

[Feature] update sgl-kernel 3rdparty flashinfer to latest main
Labels: good first issue, help wanted, high priority
#4301 · opened Mar 11, 2025 by zhyncs · 2 tasks

[Bug] Failed to parse fc related info to json format!
#4300 · opened Mar 11, 2025 by tbwang-clound · 3 of 5 tasks

[Bug] R1 server stuck when starting with --enable-flashinfer-mla --disable-radix-cache
#4298 · opened Mar 11, 2025 by sunzx8 · 2 of 5 tasks

How to update weights with sglang_router? Or how to get worker_urls?
Labels: router
#4282 · opened Mar 11, 2025 by fmantianxing

[Bug] Flashinfer nextn doesn't work with dp-attention for Deepseek R1
Labels: deepseek
#4276 · opened Mar 10, 2025 by dsingal0 · 5 tasks done

[Bug] DeepSeek R1 Model weights not downloading
#4268 · opened Mar 10, 2025 by RonanKMcGovern · 5 tasks done