File Limit Request: vLLM - 800 MiB #6326
Comments
+1, it would be great to have this!
+1, many NVIDIA architectures to support, especially now that Blackwell is in! It's definitely a reasonable ask. :-)
+1
+1! I love vLLM!
@Thespi-Brain can you please help take a look? Thanks! 🙏
@Thespi-Brain kindly ping for any update, thanks!
+1 - This is currently a blocker for vLLM to support NVIDIA RTX 5000-series consumer GPUs
+1 this would help vLLM serve more users <3
+1 to include Blackwell support!
+1 - This would be critical for vLLM to ship support for more hardware architectures out of the box and make users' lives easier!
+1
+1 to this.
a gentle nudge @Thespi-Brain @cmaureir 🙏
Howdy folks, brigading issues like this is not helpful. I'm going to lock this issue. It will be unlocked when it reaches the top of the queue. |
Project URL
https://pypi.org/project/vllm/
Does this project already exist?
Yes
New Limit
800 MiB
Update issue title
Which indexes
PyPI
About the project
Running large language models (LLMs) is both resource-intensive and complex, especially as these models scale to hundreds of billions of parameters. That’s where vLLM comes in. Originally built around the innovative PagedAttention algorithm, vLLM has grown into a comprehensive, state-of-the-art inference engine. A thriving community is also continuously adding new features and optimizations to vLLM, including pipeline parallelism, chunked prefill, speculative decoding, and disaggregated serving.
Since its release, vLLM has garnered significant attention, with over 46,500 GitHub stars and more than 1,000 contributors, a testament to its popularity and active community.
Reasons for the request
Last year, we requested an increase of the size limit to 400 MiB; see #3792 for details. Since then, vLLM has kept growing, both in popularity and in the models and algorithms it supports, and it is now approaching the limit again: the release https://pypi.org/project/vllm/0.8.5.post1/#files is already 326 MiB.
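For reference, per-file sizes of any published release can be confirmed with PyPI's JSON API. The snippet below is only an illustrative sketch; the version string is an example.

```python
# Illustrative sketch: list the published file sizes of a vLLM release
# via PyPI's JSON API (the version used here is just an example).
import json
import urllib.request

VERSION = "0.8.5.post1"
url = f"https://pypi.org/pypi/vllm/{VERSION}/json"

with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

# Each entry in "urls" describes one release file, including its size in bytes.
for f in data["urls"]:
    print(f"{f['filename']}: {f['size'] / 2**20:.1f} MiB")
```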
Recently, when we added support for NVIDIA's Blackwell architecture, the wheel size grew to 450 MiB, which blocks our release process. We tried dropping support for older GPUs, but that does not help much: the binaries for the new GPUs dominate the size.
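One way to see which binaries dominate is to list the largest members of a downloaded wheel. This is a rough sketch, assuming a locally downloaded vLLM wheel; the filename below is hypothetical.

```python
# Rough sketch: rank the largest files inside a locally downloaded wheel
# (a wheel is a zip archive). The filename here is hypothetical.
import zipfile

WHEEL = "vllm-0.8.5.post1-cp38-abi3-manylinux1_x86_64.whl"  # example filename

with zipfile.ZipFile(WHEEL) as wf:
    members = sorted(wf.infolist(), key=lambda m: m.file_size, reverse=True)
    for m in members[:10]:
        print(f"{m.file_size / 2**20:8.1f} MiB  {m.filename}")
```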
In addition, Blackwell GPUs introduce new data types such as FP4 and FP6, which means we need more variants of our GPU kernels to support them. We therefore foresee that the wheel size will continue to grow in the near future.
We kindly request that the size limit be raised to 800 MiB, so that vLLM can continue to serve the community.
Code of Conduct
I agree to follow the PSF Code of Conduct.