The fastest cross-platform framework for deploying AI locally on phones.
- Available in Flutter and React-Native for cross-platform developers.
- Supports any GGUF model you can find on Hugging Face: Qwen, Gemma, Llama, DeepSeek, and more.
- Run LLMs, VLMs, Embedding Models, TTS models and more.
- Supports models from FP32 down to 2-bit quantization, for efficiency and lower device strain.
- MCP tool calls that make the AI genuinely helpful (set reminders, search the gallery, reply to messages, and more).
- iOS xcframework and JNILibs for native setups.
- Neat and tiny C++ build for custom hardware.
- Chat templates with Jinja2 support and token streaming.
- Install: execute the following command in your project terminal: `flutter pub add cactus`
- Flutter Text Completion
```dart
import 'package:cactus/cactus.dart';

// Initialize
final lm = await CactusLM.init(
  modelUrl: 'huggingface/gguf/link',
  nCtx: 2048,
);

// Completion
final messages = [CactusMessage(role: CactusMessageRole.user, content: 'Hello!')];
final params = CactusCompletionParams(nPredict: 100, temperature: 0.7);
final response = await lm.completion(messages, params);

// Embedding
final text = 'Your text to embed';
final embedParams = CactusEmbeddingParams(normalize: true);
final result = await lm.embedding(text, embedParams);
```
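The embedding call above returns a vector representation of the text. As a small illustration of how such vectors are typically used (for example, ranking notes or gallery captions against a query), here is a plain Dart cosine-similarity helper; it assumes you can read the embedding result as a `List<double>`, which is not shown in the snippet above:

```dart
import 'dart:math';

// Cosine similarity between two embedding vectors.
// With normalize: true (as in the example above) the vectors already have
// unit length, so the dot product alone equals the cosine; the full formula
// is kept so the helper also works for unnormalized vectors.
double cosineSimilarity(List<double> a, List<double> b) {
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (sqrt(normA) * sqrt(normB));
}

void main() {
  // Toy vectors standing in for two embedding results.
  final query = [0.1, 0.7, 0.2];
  final doc = [0.2, 0.6, 0.1];
  print(cosineSimilarity(query, doc)); // closer to 1.0 = more similar
}
```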
- Flutter VLM Completion
```dart
import 'package:cactus/cactus.dart';

// Initialize (Flutter handles downloads automatically)
final vlm = await CactusVLM.init(
  modelUrl: 'huggingface/gguf/link',
  mmprojUrl: 'huggingface/gguf/mmproj/link',
);

// Multimodal completion (can add multiple images or none)
final messages = [CactusMessage(role: CactusMessageRole.user, content: 'Describe this image')];
final params = CactusVLMParams(
  images: ['/absolute/path/to/image.jpg'],
  nPredict: 200,
  temperature: 0.3,
);
final response = await vlm.completion(messages, params);
```
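Since `images` takes a list, the same call can reference more than one image. Below is a minimal sketch of the multi-image case, reusing only the `CactusVLM` calls shown above; the model links and image paths are placeholders, and how well multiple images are handled depends on the underlying model:

```dart
import 'package:cactus/cactus.dart';

Future<void> main() async {
  final vlm = await CactusVLM.init(
    modelUrl: 'huggingface/gguf/link',         // placeholder, as above
    mmprojUrl: 'huggingface/gguf/mmproj/link', // placeholder, as above
  );

  // One user message referring to several images; both paths are placeholders.
  final messages = [
    CactusMessage(
      role: CactusMessageRole.user,
      content: 'What differs between these two photos?',
    ),
  ];
  final params = CactusVLMParams(
    images: [
      '/absolute/path/to/first.jpg',
      '/absolute/path/to/second.jpg',
    ],
    nPredict: 200,
    temperature: 0.3,
  );

  final response = await vlm.completion(messages, params);
  print(response); // inspect the returned completion object
}
```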
N.B.: See the Flutter Docs; they cover chat design, embeddings, multimodal models, text-to-speech, and more.
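Among the topics the Flutter Docs cover is chat design. As a rough sketch of a multi-turn conversation built on the `CactusLM` calls above, the full message history is passed on each call and the library's Jinja2 chat template formats it into a prompt. Note that the `system` and `assistant` roles and the shape of the returned response are assumptions here, since only the `user` role appears in the snippets above:

```dart
import 'package:cactus/cactus.dart';

Future<void> main() async {
  final lm = await CactusLM.init(
    modelUrl: 'huggingface/gguf/link', // placeholder, as above
    nCtx: 2048,
  );

  // Pass the full history each turn; the Jinja2 chat template is applied
  // by the library when it builds the prompt.
  // NOTE: CactusMessageRole.system / .assistant are assumed to exist;
  // only .user is shown in the examples above.
  final messages = [
    CactusMessage(role: CactusMessageRole.system, content: 'You are a concise assistant.'),
    CactusMessage(role: CactusMessageRole.user, content: 'What is on-device inference?'),
    CactusMessage(role: CactusMessageRole.assistant, content: 'Running the model locally on the phone, with no server round-trip.'),
    CactusMessage(role: CactusMessageRole.user, content: 'Why is that useful?'),
  ];

  final params = CactusCompletionParams(nPredict: 120, temperature: 0.7);
  final response = await lm.completion(messages, params);
  print(response); // inspect the returned completion object
}
```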
- Install the `cactus-react-native` package: `npm install cactus-react-native && npx pod-install`
- React-Native Text Completion
```typescript
// Initialize
const lm = await CactusLM.init({
  model: '/path/to/model.gguf',
  n_ctx: 2048,
});

// Completion
const messages = [{ role: 'user', content: 'Hello!' }];
const params = { n_predict: 100, temperature: 0.7 };
const response = await lm.completion(messages, params);

// Embedding
const text = 'Your text to embed';
const embedParams = { normalize: true };
const result = await lm.embedding(text, embedParams);
```
- React-Native VLM
```typescript
// Initialize
const vlm = await CactusVLM.init({
  model: '/path/to/vision-model.gguf',
  mmproj: '/path/to/mmproj.gguf',
});

// Multimodal completion (can add multiple images or none)
const messages = [{ role: 'user', content: 'Describe this image' }];
const params = {
  images: ['/absolute/path/to/image.jpg'],
  n_predict: 200,
  temperature: 0.3,
};
const response = await vlm.completion(messages, params);
```
N.B.: See the React Docs; they cover chat design, embeddings, multimodal models, text-to-speech, and various options.
The Cactus backend is written in C/C++ and runs directly on ARM/x86 hardware such as phones, smart TVs, watches, speakers, cameras, laptops, and Raspberry Pi boards. See the C++ Docs; they cover chat design, embeddings, multimodal models, text-to-speech, and more.
First, clone the repo with `git clone https://github.com/cactus-compute/cactus.git`, cd into it, and make all scripts executable with `chmod +x scripts/*.sh`.
- Flutter
  - Build the Android JNILibs with `scripts/build-flutter-android.sh`.
  - Build the Flutter plugin with `scripts/build-flutter-android.sh`.
  - Navigate to the example app with `cd flutter/example`.
  - Open your simulator via Xcode or Android Studio (see the walkthrough if you have not done this before).
  - Always start the app with `flutter clean && flutter pub get && flutter run`.
  - Play with the app, and make changes to the example app or plugin as desired.
- React Native
  - Build the Android JNILibs with `scripts/build-react-android.sh`.
  - Build the React Native package with `scripts/build-react-android.sh`.
  - Navigate to the example app with `cd react/example`.
  - Set up your simulator via Xcode or Android Studio (see the walkthrough if you have not done this before).
  - Always start the app with `yarn && yarn ios` or `yarn && yarn android`.
  - Play with the app, and make changes to the example app or package as desired.
  - For now, if you make changes in the package, manually copy the changed files/folders into `examples/react/node_modules/cactus-react-native`.
- C/C++
  - Navigate to the example app with `cd cactus/example`.
  - There are multiple main files: `main_vlm`, `main_llm`, `main_embed`, `main_tts`.
  - Build both the libraries and the executables with `build.sh`.
  - Run one of the executables: `./cactus_vlm`, `./cactus_llm`, `./cactus_embed`, `./cactus_tts`.
  - Try different models and make changes as desired.
- Contributing
  - To contribute a bug fix, create a branch after making your changes with `git checkout -b <branch-name>` and submit a PR.
  - To contribute a feature, please raise an issue first so it can be discussed and to avoid duplicating someone else's work.
  - Join our Discord.
Device | Gemma3 1B Q4 (toks/sec) | Qwen3 4B Q4 (toks/sec) |
---|---|---|
iPhone 16 Pro Max | 54 | 18 |
iPhone 16 Pro | 54 | 18 |
iPhone 16 | 49 | 16 |
iPhone 15 Pro Max | 45 | 15 |
iPhone 15 Pro | 45 | 15 |
iPhone 14 Pro Max | 44 | 14 |
OnePlus 13 5G | 43 | 14 |
Samsung Galaxy S24 Ultra | 42 | 14 |
iPhone 15 | 42 | 14 |
OnePlus Open | 38 | 13 |
Samsung Galaxy S23 5G | 37 | 12 |
Samsung Galaxy S24 | 36 | 12 |
iPhone 13 Pro | 35 | 11 |
OnePlus 12 | 35 | 11 |
Galaxy S25 Ultra | 29 | 9 |
OnePlus 11 | 26 | 8 |
iPhone 13 mini | 25 | 8 |
Redmi K70 Ultra | 24 | 8 |
Xiaomi 13 | 24 | 8 |
Samsung Galaxy S24+ | 22 | 7 |
Samsung Galaxy Z Fold 4 | 22 | 7 |
Xiaomi Poco F6 5G | 22 | 6 |
We created a demo chat app that we use for benchmarking.
We provide a collection of recommended models on our HuggingFace page.