Pull requests: vllm-project/vllm

Pull requests list

fix: parse input_audio payloads with uuid
Labels: frontend
#43451 opened May 22, 2026 by he-yufeng (Contributor)
[Formatting] Collapse multi-line arg lists where possible
Labels: ci/build, cpu, deepseek, documentation, frontend, gpt-oss, intel-gpu, kv-connector, llama, mistral, multi-modality, needs-rebase, new-model, nvidia, performance, qwen, rocm, speculative-decoding, structured-output, tool-calling, tpu, v1
#43449 opened May 22, 2026 by njhill (Member), Draft
[Prefix Caching] DeepSeekv4 only retain sliding window cache at specified interval boundaries
Labels: deepseek, v1
#43447 opened May 22, 2026 by wzhao18 (Contributor), Draft
4 tasks
[Spec Decode] Allow causal DFlash
Labels: speculative-decoding, v1
#43445 opened May 22, 2026 by benchislett (Member)
[Bugfix][DeepseekV4] Harden compress_ratio fallback for transformers >=4.57
Labels: bug, deepseek
#43443 opened May 22, 2026 by varjoranta
[Bugfix][DeepseekV4] Gate expert weight load on success
Labels: bug, deepseek
#43442 opened May 22, 2026 by varjoranta
[Model] Refactor Gemma4 vision tower with vLLM-native modules
#43440 opened May 22, 2026 by linitra24 (Contributor)
4 tasks
Update FastAPI dependency to exclude fastapi-cloud-cli
Labels: ci/build
#43438 opened May 22, 2026 by ducviet00 (Contributor)
[Bugfix][Async][SpecDecode] Fix async MTP placeholder draft token handling
Labels: bug, v1
#43434 opened May 22, 2026 by SorenDreano (Contributor)
3 of 4 tasks
Keep scheduler alive for delayed KV connector frees
Labels: v1, verified
#43433 opened May 22, 2026 by lucifer1004
[TurboQuant] Use MSE quantization for values
Labels: v1
#43432 opened May 22, 2026 by lesj0610 (Contributor)
3 of 4 tasks
[Model] Add hotwords support for model Qwen3-ASR
Labels: documentation, qwen
#43430 opened May 22, 2026 by homepy
4 tasks done
[Bugfix] Fix hash topk dtype mismatch
Labels: bug
#43425 opened May 22, 2026 by wangyicong52
[XPU][Mamba] Triton-based selective scan forward op for XPU
Labels: intel-gpu
#43421 opened May 22, 2026 by mfylcek (Contributor)
fix(benchmark): correct SpecBench.sample method signature to suppor…
Labels: performance
#43418 opened May 22, 2026 by JericRui
[Rust Frontend] Improve startup failure reporting UX
Labels: frontend, ready, rust, v1
#43417 opened May 22, 2026 by BugenZhao (Member)
4 tasks
[Bugfix][Frontend] Fix input_audio parsing when uuid is present
Labels: bug, frontend, ready
#43414 opened May 22, 2026 by ffggs