Pull requests: vllm-project/vllm
Pull requests list
[Model] Add support for embedding model JambaClassfication
#10860
opened Dec 3, 2024 by
yecohn
[Frontend] correctly record prefill and decode time metrics
#10853
opened Dec 3, 2024 by
tomeras91
[Doc] add KubeAI to serving integrations
documentation
Improvements or additions to documentation
#10837
opened Dec 2, 2024 by
samos123
[Feature][Hardware][AMD] Enable level 3 compilation on rocm
#10836
opened Dec 2, 2024 by
charlifu
[WIP] Xgrammar init in engine
ci/build
documentation
needs-rebase
[Doc] Create a new "Usage" section
documentation
ready
ONLY add when PR is ready to merge/full CI is needed
#10827
opened Dec 2, 2024 by
DarkLight1337
[Misc] Split up pooling tasks
documentation
frontend
#10820
opened Dec 2, 2024 by
DarkLight1337
Draft
[Model] Add support for embedding model GritLM
documentation
#10816
opened Dec 2, 2024 by
pooyadavoodi
[Core]: Support destroying all KV cache during runtime
#10810
opened Dec 1, 2024 by
HollowMan6
[Core] add xgrammar as guided generation provider
ci/build
needs-rebase
#10803
opened Dec 1, 2024 by
joennlae
[Bugfix] fix race condition that leads to wrong order of token returned
#10802
opened Dec 1, 2024 by
joennlae
[Bugfix] Multiple fixes to tool streaming when using auto tool choice.
frontend
#10782
opened Nov 29, 2024 by
cedonley
support download Lora Model from ModelScope and download private mode…
#10762
opened Nov 29, 2024 by
AlphaINF
Configuration of the model parallelism does not make sense
#10749
opened Nov 28, 2024 by
fajavadi
[Model] Add elementwise_affine to RMSNorm and re-enable weights loading tracker for Mamba
#10739
opened Nov 28, 2024 by
Isotr0py
[Core] Refactoring disaggregated prefilling/decoding using Mooncake Transfer Engine
ci/build
documentation
frontend
needs-rebase
#10728
opened Nov 28, 2024 by
alogfans