8000 Update vllm requirement from <=0.6.3 to <=0.8.3 by dependabot[bot] · Pull Request #7 · CJReinforce/PURE · GitHub

Closed
wants to merge 1 commit

Conversation

@dependabot dependabot bot commented on behalf of github Apr 7, 2025

Updates the requirements on vllm to permit the latest version.
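For illustration, the effect of relaxing the pin from `<=0.6.3` to `<=0.8.3` can be checked with the `packaging` library (which ships alongside pip; this is a sketch of the constraint change, not code from this PR):

```python
from packaging.specifiers import SpecifierSet

# The old and new version constraints from this PR.
old_spec = SpecifierSet("<=0.6.3")
new_spec = SpecifierSet("<=0.8.3")

# v0.8.3 is rejected by the old pin but permitted by the new one.
print("0.8.3" in old_spec)  # False
print("0.8.3" in new_spec)  # True
```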

Release notes

Sourced from vllm's releases.

v0.8.3

Highlights

This release features 260 commits from 109 contributors, 38 of them new.

  • We are excited to announce Day 0 Support for Llama 4 Scout and Maverick (#16104). Please see our blog for a detailed user guide.
    • Please note that Llama 4 is currently supported only in the V1 engine.
  • V1 engine now supports native sliding window attention (#14097) with the hybrid memory allocator.

Cluster Scale Serving

  • Single node data parallel with API server support (#13923)
  • Multi-node offline DP+EP example (#15484)
  • Expert parallelism enhancements
    • CUTLASS grouped gemm fp8 MoE kernel (#13972)
    • Fused experts refactor (#15914)
    • Fp8 Channelwise Dynamic Per Token GroupedGEMM (#15587)
    • Adding support for fp8 gemm layer input in fp8 (#14578)
    • Add option to use DeepGemm contiguous grouped gemm kernel for fused MoE operations. (#13932)
  • Support XpYd disaggregated prefill with MooncakeStore (#12957)

Model Support

V1 Engine

  • Collective RPC (#15444)
  • Faster top-k only implementation (#15478)
  • BitsAndBytes support (#15611)
  • Speculative Decoding: metrics (#15151), Eagle Proposer (#15729), n-gram interface update (#15750), EAGLE Architecture with Proper RMS Norms (#14990)

Features

API

  • Support Enum for xgrammar based structured output in V1. (#15594, #15757)
  • A new tags parameter for wake_up (#15500)
  • V1 LoRA support CPU offload (#15843)
  • Prefix caching support: FIPS enabled machines with MD5 hashing (#15299), SHA256 as alternative hashing algorithm (#15297)
  • Addition of http service metrics (#15657)

Performance

  • LoRA Scheduler optimization bridging V1 and V0 performance (#15422).

Hardware

  • AMD:
    • Add custom allreduce support for ROCM (#14125)
    • Quark quantization documentation (#15861)
    • AITER integration: int8 scaled gemm kernel (#15433), fused moe (#14967)
    • Paged attention for V1 (#15720)
  • CPU:

... (truncated)

Commits
  • 296c657 Revert "[V1] DP scale-out (1/N): Use zmq ROUTER/DEALER sockets for input queu...
  • c575232 [Model] Support Llama4 in vLLM (#16104)
  • 63375f0 [V1][Spec Decode] Update N-gram Proposer Interface (#15750)
  • 70ad3f9 [Bugfix][TPU] Fix V1 TPU worker for sliding window (#16059)
  • d6fc629 [Kernel][Minor] Re-fuse triton moe weight application (#16071)
  • af51d80 Revert "[V1] Scatter and gather placeholders in the model runner" (#16075)
  • f5722a5 [V1] Scatter and gather placeholders in the model runner (#15712)
  • 651cf0f [V1] DP scale-out (1/N): Use zmq ROUTER/DEALER sockets for input queue (#15906)
  • 4dc52e1 [CI] Reorganize .buildkite directory (#16001)
  • 4708f13 [Bugfix] Fix default behavior/fallback for pp in v1 (#16057)
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
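The per-PR commands above can also be expressed declaratively in a repository's Dependabot configuration. A minimal sketch of a `.github/dependabot.yml` for a pip project follows; the `weekly` interval and the ignore entry are illustrative assumptions, not this repository's actual configuration:

```yaml
# .github/dependabot.yml -- illustrative sketch, not this repo's actual config
version: 2
updates:
  - package-ecosystem: "pip"
    directory: "/"            # where the requirements files live
    schedule:
      interval: "weekly"      # assumed cadence
    ignore:
      # Declarative equivalent of "@dependabot ignore this minor version"
      - dependency-name: "vllm"
        update-types: ["version-update:semver-minor"]
```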

Updates the requirements on [vllm](https://github.com/vllm-project/vllm) to permit the latest version.
- [Release notes](https://github.com/vllm-project/vllm/releases)
- [Changelog](https://github.com/vllm-project/vllm/blob/main/RELEASE.md)
- [Commits](vllm-project/vllm@v0.1.0...v0.8.3)

---
updated-dependencies:
- dependency-name: vllm
  dependency-version: 0.8.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update python code) labels Apr 7, 2025
dependabot bot commented on behalf of github Apr 21, 2025

Superseded by #9.

@dependabot dependabot bot closed this Apr 21, 2025
@dependabot dependabot bot deleted the dependabot/pip/vllm-lte-0.8.3 branch April 21, 2025 17:27