Pull requests: vllm-project/vllm
Pull requests list
fix: parse input_audio payloads with uuid
frontend
#43451
opened May 22, 2026 by
he-yufeng
Contributor
[Formatting] Collapse multi-line arg lists where possible
ci/build
cpu
Related to CPU backends
deepseek
Related to DeepSeek models
documentation
Improvements or additions to documentation
frontend
gpt-oss
Related to GPT-OSS models
intel-gpu
Related to Intel GPU
kv-connector
llama
Related to Llama models
mistral
Related to Mistral models
multi-modality
Related to multi-modality (#4194)
needs-rebase
new-model
Requests to new models
nvidia
performance
Performance-related issues
qwen
Related to Qwen models
rocm
Related to AMD ROCm
speculative-decoding
structured-output
tool-calling
tpu
Related to Google TPUs
v1
[Bugfix] Make openai_serving_render optional in OpenAIServingChat and OpenAIServingCompletion
bug
Something isn't working
frontend
#43448
opened May 22, 2026 by
iOptimizeThings
3 of 4 tasks
[Prefix Caching] DeepSeekv4 only retain sliding window cache at specified interval boundaries
deepseek
Related to DeepSeek models
v1
[Spec Decode] Allow causal DFlash
speculative-decoding
v1
#43445
opened May 22, 2026 by
benchislett
Member
[Bugfix][DeepseekV4] Harden compress_ratio fallback for transformers >=4.57
bug
Something isn't working
deepseek
Related to DeepSeek models
#43443
opened May 22, 2026 by
varjoranta
[Bugfix][DeepseekV4] Gate expert weight load on success
bug
Something isn't working
deepseek
Related to DeepSeek models
#43442
opened May 22, 2026 by
varjoranta
[Docs] Fix incorrect docstring for all_non_structural_tag_constraints_none
#43441
opened May 22, 2026 by
jrajath94
[Model] Refactor Gemma4 vision tower with vLLM-native modules
#43440
opened May 22, 2026 by
linitra24
Contributor
4 tasks
Enable FlashInfer checkpointing SSU STP no-fallback path
needs-rebase
v1
#43439
opened May 22, 2026 by
danielafrimi
Contributor
Draft
Update FastAPI dependency to exclude fastapi-cloud-cli
ci/build
#43438
opened May 22, 2026 by
ducviet00
Contributor
[Bugfix][Async][SpecDecode] Fix async MTP placeholder draft token handling
bug
Something isn't working
v1
#43434
opened May 22, 2026 by
SorenDreano
Contributor
3 of 4 tasks
Keep scheduler alive for delayed KV connector frees
v1
verified
Run pre-commit for new contributors without triggering other tests
#43433
opened May 22, 2026 by
lucifer1004
[TurboQuant] Use MSE quantization for values
v1
#43432
opened May 22, 2026 by
lesj0610
Contributor
3 of 4 tasks
[Model] Add hotwords support for model Qwen3-ASR
documentation
Improvements or additions to documentation
qwen
Related to Qwen models
#43430
opened May 22, 2026 by
homepy
4 tasks done
[rust] fix: aggregate is_sleeping and reset_prefix_cache across DP engines
needs-rebase
rust
#43429
opened May 22, 2026 by
willamhou
2 tasks done
[Bugfix] Fix hash topk dtype mismatch
bug
Something isn't working
#43425
opened May 22, 2026 by
wangyicong52
[V1][Bugfix] structured_output × spec-decode: pre-commit grammar filter for boundary-step bonus tokens
bug
Something isn't working
structured-output
v1
#43424
opened May 22, 2026 by
adammoisa
3 tasks
[Model] Add get_mm_max_tokens_per_item for GLM4_1V ProcessingInfo to avoid slow dummy inputs in launch model
#43422
opened May 22, 2026 by
labAxiaoming
Contributor
4 tasks
[XPU][Mamba] Triton-based selective scan forward op for XPU
intel-gpu
Related to Intel GPU
#43421
opened May 22, 2026 by
mfylcek
Contributor
fix(benchmark): correct SpecBench.sample method signature to suppor…
performance
Performance-related issues
#43418
opened May 22, 2026 by
JericRui