This statement is the "Can it run Doom" equivalent for AI.
This seems like an exercise in futility, but it nonetheless benchmarks the progress of quantization techniques, as well as the portability of AI models to edge devices.
- Update the package lists and install git:
sudo apt update && sudo apt install git
- Clone the llama.cpp repository:
git clone https://github.com/ggerganov/llama.cpp
- Install the required Python packages:
python3 -m pip install torch numpy sentencepiece
- If the package installation doesn't go through (an "externally managed environment" error), remove the marker file first:
rm /usr/lib/python3.11/EXTERNALLY-MANAGED
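If you'd rather not delete that marker (it exists to keep pip from touching the system Python), a virtual environment gets you the same result; the environment path below is just an example:
python3 -m venv ~/llama-venv && source ~/llama-venv/bin/activate
python3 -m pip install torch numpy sentencepiece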
- Install essential build tools:
sudo apt install g++ build-essential
- Navigate to the cloned directory and build:
cd llama.cpp && make
Note: This only works on Linux, not macOS. Edit the MODEL_SIZE line to choose the model size you want (e.g., 7B, 13B); otherwise, it will download all of them.
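For reference, that edit is a one-line change in the weight-download script; the filename and exact layout below are assumptions, so adjust them to whatever your copy of the script looks like:
# in the download script (e.g. download.sh; filename assumed), before running it:
MODEL_SIZE="7B"   # fetch only the 7B weights instead of every size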
- Convert the 7B model to GGML FP16 format:
python3 convert-pth-to-ggml.py models/7B/ 1
Note: Ensure the tokenizer.model and consolidated.00.pth file(s) are present.
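About the trailing 1 in the convert command above: as far as I recall, it selects the output float type, with 0 meaning float32 and 1 meaning float16, so check the script's usage line if your copy differs:
# assumed usage: convert-pth-to-ggml.py <model dir> <ftype>, where 0 = float32 and 1 = float16
python3 convert-pth-to-ggml.py models/7B/ 1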
- Quantize the model to 4 bits using the q4_0 method. You can also try using the q3_0 method for better performance on the Raspberry Pi.
./quantize ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin q4_0
Read more about quantization here.
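To get a rough sense of what 4-bit quantization buys you, compare the two files afterwards; the sizes in the comments are approximate figures for the 7B model, not exact values:
ls -lh ./models/7B/ggml-model-f16.bin ./models/7B/ggml-model-q4_0.bin
# ggml-model-f16.bin   roughly 13 GB  (16-bit weights)
# ggml-model-q4_0.bin  roughly 4 GB   (4-bit weights plus per-block scales)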
- Run the inference in interactive mode, or converse with Bob using the bundled examples:
./main -m ./models/7B/ggml-model-q4_0.bin -n 128
Note: Bob is cool. Ask him nicely, and he'll tell you whether you're using the full 7B model or the quantized one.
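If you want to drive the Bob persona directly from ./main rather than through the script, the llama.cpp README documents an interactive invocation along these lines (flags occasionally change between versions, so check ./main --help):
./main -m ./models/7B/ggml-model-q4_0.bin -n 256 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt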
- Test the chat interface:
./examples/chat.sh
- Transfer the generated files in the /7B folder (excluding the pre-transformed files) to the Raspberry Pi. Place these files in the same directory structure, llama.cpp/models/7B/, and then simply run:
./examples/chat.sh
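If you're copying over the network, plain scp does the job; the username, hostname, and destination path below are placeholders, so substitute your own:
scp ./models/7B/ggml-model-q4_0.bin pi@raspberrypi.local:~/llama.cpp/models/7B/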
Happy tinkering!
Final Verdict - It's not very good, but this is nonetheless the right type of benchmark to run. Next steps: I'll be buying 128 GB+ of RAM to play with bigger models for the time being.