# web-rwkv-py

Python binding for `web-rwkv`.
- Basic V5 inference support
- Support V4, V5 and V6
- Batched inference
- Install Python and Rust.
- Install maturin:

  ```bash
  $ pip install maturin
  ```

- Build and install:

  ```bash
  $ maturin develop
  ```
- Try using `web-rwkv` in Python:

  ```python
  import web_rwkv_py as wrp

  model = wrp.v5.Model(
      "/path/to/model.st",   # model path
      quant=0,               # int8 quantization layers
      quant_nf4=0,           # nf4 quantization layers
      turbo=True,            # faster when reading long prompts
      token_chunk_size=256,  # maximum tokens in an inference chunk (can be 32, 64, 256, 1024, etc.)
  )
  logits, state = wrp.v5.run_one(model, [114, 514], state=None)
  ```
- Move state to host memory:

  ```python
  # the returned state lives in VRAM
  logits, state = wrp.v5.run_one(model, [114, 514], state=None)
  state_cpu = state.back()
  ```
- Load state from host memory:

  ```python
  state = wrp.v5.ModelState(model, 1)
  state.load(state_cpu)
  # pass the device state we just loaded, not the host copy
  logits, state = wrp.v5.run_one(model, [114, 514], state=state)
  ```
- Return predictions for all tokens (not only the last one's):

  ```python
  logits, state = wrp.v5.run_one_full(model, [114, 514], state=None)
  len(logits)  # 2, one logits vector per input token
  ```
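Since `run_one` returns the logits for the last token together with the updated state, the calls above can be chained into a simple greedy generation loop. A minimal sketch; `stub_run_one` is a hypothetical stand-in with the same call shape as `wrp.v5.run_one`, used here only because the real call needs a model file:

```python
def greedy_generate(run_one, model, prompt, n_tokens):
    # Feed the whole prompt once, then decode token by token,
    # threading the returned state through each call.
    logits, state = run_one(model, prompt, state=None)
    out = []
    for _ in range(n_tokens):
        token = max(range(len(logits)), key=logits.__getitem__)  # argmax
        out.append(token)
        logits, state = run_one(model, [token], state=state)
    return out

# Hypothetical stub mimicking wrp.v5.run_one's call shape,
# so the loop can be exercised without a model; token 3 always wins.
def stub_run_one(model, tokens, state=None):
    return [0.0, 0.1, 0.2, 0.9], state

print(greedy_generate(stub_run_one, None, [114, 514], 4))  # [3, 3, 3, 3]
```

With the real binding you would pass `model` and `wrp.v5.run_one` instead of the stub, and sample from `logits` rather than taking the argmax if you want non-deterministic output.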