b6553be
This release features 274 commits from 123 contributors (27 new contributors!)
scaled_fp8_quant
LLM
model
inputs
Qwen2EmbeddingModel
get_dummy_text
get_dummy_mm_data
vllm bench serve
async_timeout
None
base
packed_modules_mapping
WeightsMapper
_Backend
Optional
Annotated
compressed-tensors
generate()
max_model_len
kv_sharing_target_layer_name
use_irope
Full Changelog: v0.9.0...v0.9.1