- Project Overview
- Key Features
- Why Z-Ant?
- Project Status & Achievements
- Roadmap to Best-in-Class TinyML Engine
- Getting Started for Contributors
- Development Workflow
- Using Z-Ant
- Build System
- Contributing
- License
Z-Ant (Zig-Ant) is a comprehensive, open-source neural network framework specifically designed for deploying optimized AI models on microcontrollers and edge devices. Built with Zig, Z-Ant provides end-to-end tools for model optimization, code generation, and real-time inference on resource-constrained hardware.
- ONNX Model Support: Full compatibility with ONNX format models
- Cross-platform Compilation: ARM Cortex-M, RISC-V, x86, and more
- Static Library Generation: Generate optimized static libraries for any target architecture
- Real-time Inference: Microsecond-level prediction times on microcontrollers
- Quantization: Automatic model quantization with dynamic and static options
- Pruning: Neural network pruning for reduced model size
- Buffer Optimization: Memory-efficient tensor operations
- Flash vs RAM Execution: Configurable execution strategies
Z-Ant includes an experimental cross-platform GUI built with SDL for basic model selection and code generation. Note that the GUI is currently unstable and under active development; we recommend the command-line interface for production workflows.
- JPEG Decoding: Complete JPEG image processing pipeline
- Multiple Color Spaces: RGB, YUV, Grayscale support
- Hardware Optimization: SIMD and platform-specific optimizations
- Preprocessing Pipeline: Normalization, resizing, and format conversion
- 30+ Operators: Comprehensive coverage of neural network operations
- Multiple Data Types: Float32, Int64, Bool, and more
- Dynamic Shapes: Support for variable input dimensions
- Custom Operators: Extensible operator framework
- 🚫 Lack of DL Support: Devices like the TI Sitara family, Raspberry Pi Pico, or ARM Cortex-M boards lack comprehensive deep-learning libraries
- 🌍 Open-source: Complete end-to-end NN deployment and optimization solution
- 🎓 Research-Inspired: Implements cutting-edge optimization techniques inspired by MIT's Han Lab research
- 🏛 Academic Collaboration: Developed in collaboration with institutions like Politecnico di Milano
- ⚡ Performance First: Designed for real-time inference with minimal resource usage
- 🔧 Developer Friendly: Clear APIs, extensive documentation, and practical examples
- 🏭 Edge AI: Real-time anomaly detection, predictive maintenance
- 🤖 IoT & Autonomous Systems: Lightweight AI models for drones, robots, vehicles, IoT devices
- 📱 Mobile Applications: On-device inference for privacy-preserving AI
- 🏥 Medical Devices: Real-time health monitoring and diagnostics
- 🎮 Gaming: AI-powered gameplay enhancement on embedded systems
- 📷 im2tensor: Complete JPEG image processing pipeline with multiple color space support
- 🚀 Enhanced Code Generation: Advanced code generation with flash vs RAM execution strategies
- 🔧 Expanded ONNX Compatibility: 30+ operators with comprehensive neural network coverage
- 📊 Shape Tracker: Dynamic tensor shape management and optimization
- 🧪 Comprehensive Testing Suite: Automated testing for all major components
- 📚 Static Library Generation: Cross-platform compilation for ARM Cortex-M, RISC-V, x86
- 🔬 Advanced Pruning & Quantization: Research-grade optimization techniques
- 📱 Expanded Microcontroller Support: Additional hardware platforms
- ⚡ Real-time Benchmarking Tools: Performance analysis and profiling suite
- 🔄 Model Execution Optimization: Further inference speed improvements
- Q3 2025: MNIST inference on Raspberry Pi Pico 2 (Target: July 2025)
- Q4 2025: Efficient YOLO deployment on edge devices
To establish Z-Ant as the premier tinyML inference engine, we are pursuing several key improvements:
- Custom Memory Allocators: Zero-allocation inference with pre-allocated memory pools (see the sketch after this list)
- In-Place Operations: Minimize memory copies through tensor operation fusion
- SIMD Vectorization: ARM NEON, RISC-V Vector extensions, and x86 AVX optimizations
- Assembly Kernels: Hand-optimized assembly for critical operations (matrix multiplication, convolution)
- Cache-Aware Algorithms: Memory access patterns optimized for L1/L2 cache efficiency
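As a concrete illustration of the pre-allocated pool idea referenced above, all scratch buffers for an inference pass can be carved out of one statically reserved region. This is a minimal sketch of the general technique in C, not Z-Ant's actual allocator; pool_alloc and POOL_SIZE are hypothetical names:

/* Bump allocator over a statically reserved pool: no malloc at runtime. */
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE (32 * 1024)      /* sized for the target's SRAM budget */

static uint8_t pool[POOL_SIZE];
static size_t  pool_used = 0;

/* Hand out `size` bytes from the pool, 8-byte aligned; NULL when exhausted.
 * Individual buffers are never freed; the whole pool is reset per inference. */
static void *pool_alloc(size_t size) {
    size_t aligned = (pool_used + 7u) & ~(size_t)7u;
    if (aligned + size > POOL_SIZE) return NULL;
    pool_used = aligned + size;
    return &pool[aligned];
}

/* Reset before the next inference so peak RAM stays fixed and known at link time. */
static void pool_reset(void) { pool_used = 0; }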
- Dynamic Quantization: Runtime precision adjustment based on input characteristics
- Structured Pruning: Channel and block-level pruning for hardware-friendly sparsity
- Knowledge Distillation: Automatic teacher-student model compression pipeline
- Neural Architecture Search (NAS): Hardware-aware model architecture optimization
- Binary/Ternary Networks: Extreme quantization for ultra-low power inference
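To make the quantization items above concrete, the sketch below shows plain symmetric int8 quantization, q = clamp(round(x / scale), -127, 127), with the scale taken from the largest magnitude in the buffer. It illustrates the general arithmetic only, not Z-Ant's quantizer; quantize_int8 is a hypothetical helper:

/* Symmetric int8 quantization of a float buffer. Dequantize with x ≈ q * scale. */
#include <math.h>
#include <stddef.h>
#include <stdint.h>

static void quantize_int8(const float *x, int8_t *q, size_t n, float *out_scale) {
    /* Choose the scale so the largest magnitude maps to 127. */
    float max_abs = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        float a = fabsf(x[i]);
        if (a > max_abs) max_abs = a;
    }
    float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
    for (size_t i = 0; i < n; ++i) {
        long v = lroundf(x[i] / scale);
        if (v > 127) v = 127;
        if (v < -127) v = -127;
        q[i] = (int8_t)v;
    }
    *out_scale = scale;
}

Dynamic quantization in this sense recomputes the scale at runtime from the actual input tensor rather than from offline calibration data.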
- DSP Instruction Utilization: Leverage ARM Cortex-M DSP instructions and RISC-V packed SIMD
- DMA-Accelerated Operations: Offload data movement to DMA controllers
- Flash Execution Strategies: XIP (Execute-in-Place) optimization for flash-resident models
- Low-Power Modes: Dynamic frequency scaling and sleep mode integration
- Hardware Security Modules: Secure model storage and execution
- NPU Integration: Support for dedicated neural processing units (e.g., Arm Ethos, Intel Movidius)
- FPGA Acceleration: Custom hardware generation for ultra-performance inference
- GPU Compute: OpenCL/CUDA kernels for edge GPU acceleration
- Neuromorphic Computing: Spike-based neural network execution
- Lottery Ticket Hypothesis: Sparse subnetwork discovery and training
- Progressive Quantization: Gradual precision reduction during training/deployment
- Magnitude-Based Pruning: Automatic weight importance analysis
- Channel Shuffling: Network reorganization for efficient inference
- Tensor Decomposition: Low-rank approximation for parameter reduction
- Early Exit Networks: Conditional computation based on input complexity
- Dynamic Model Selection: Runtime model switching based on resource availability
- Cascaded Inference: Multi-stage models with progressive complexity
- Attention Mechanism Optimization: Efficient transformer and attention implementations
- Hardware Performance Counters: Cycle-accurate performance measurement
- Energy Profiling: Power consumption analysis per operation
- Memory Footprint Analysis: Detailed RAM/Flash usage breakdown
- Thermal Analysis: Temperature impact on inference performance
- Real-Time Visualization: Live performance monitoring dashboards
- AutoML Integration: Automated hyperparameter tuning for target hardware
- Benchmark-Driven Optimization: Continuous performance regression testing
- Hardware-in-the-Loop Testing: Automated testing on real hardware platforms
- Model Validation: Accuracy preservation verification throughout optimization
- Deploy-to-Production Pipeline: One-click deployment to embedded systems
- TensorFlow Lite Compatibility: Seamless migration from TFLite models
- PyTorch Mobile Integration: Direct PyTorch model deployment pipeline
- ONNX Runtime Parity: Feature-complete ONNX runtime alternative
- MLflow Integration: Model versioning and experiment tracking
- Edge Impulse Compatibility: Integration with popular edge ML platforms
- OTA Model Updates: Over-the-air model deployment and versioning
- A/B Testing Framework: Safe model rollout with performance comparison
- Federated Learning Support: Distributed training on edge devices
- Model Encryption: Secure model storage and execution
- Compliance Tools: GDPR, HIPAA, and safety-critical certifications
- MLPerf Tiny: Competitive performance on standard benchmarks
- EEMBC MLMark: Energy efficiency measurements
- Custom TinyML Benchmarks: Domain-specific performance evaluation
- Real-World Workload Testing: Production-representative model validation
- Cross-Platform Consistency: Identical results across all supported hardware
- Fuzzing Infrastructure: Automated testing with random inputs
- Formal Verification: Mathematical proof of correctness for critical operations
- Hardware Stress Testing: Extended operation under extreme conditions
- Regression Test Suite: Comprehensive backward compatibility testing
- Performance Monitoring: Continuous integration with performance tracking
- Zig Compiler: Install the latest Zig compiler
- Git: For version control and collaboration
- Basic Zig Knowledge: Improve Zig proficiency via Ziglings
- Clone the repository:
  git clone https://github.com/ZIGTinyBook/Z-Ant.git
  cd Z-Ant
- Run tests to verify setup:
  zig build test --summary all
- Generate code for a model:
  zig build codegen -Dmodel=mnist-1
Start here if you're new to Z-Ant:
- Run existing tests: Use zig build test --summary all to understand the codebase
- Try code generation: Use zig build codegen -Dmodel=mnist-1 to see the workflow
- Read the documentation: Check the /docs/ folder for detailed guides
Z-Ant/
├── src/ # Core source code
│ ├── Core/ # Neural network core functionality
│ ├── CodeGen/ # Code generation engine
│ ├── ImageToTensor/ # Image preprocessing pipeline
│ ├── onnx/ # ONNX model parsing
│ └── Utils/ # Utilities and helpers
├── tests/ # Comprehensive test suite
├── datasets/ # Sample models and test data
├── generated/ # Generated code output
├── examples/ # Arduino and microcontroller examples
└── docs/ # Documentation and guides
# Run comprehensive tests
zig build test --summary all
# Generate code for a specific model
zig build codegen -Dmodel=mnist-1
# Test generated code
zig build test-codegen -Dmodel=mnist-1
# Compile static library for deployment
zig build lib -Dmodel=mnist-1 -Dtarget=thumb-freestanding -Dcpu=cortex_m33
We follow a structured branching strategy to ensure code quality and smooth collaboration:
- main: Stable, production-ready code for releases
- feature/<feature-name>: New features under development
- fix/<issue-description>: Bug fixes and patches
- docs/<documentation-topic>: Documentation improvements
- test/<test-improvements>: Test suite enhancements
- Test Before Committing: Run zig build test --summary all before every commit
- Document Your Code: Follow Zig's doc-comments standard
- Small, Focused PRs: Keep pull requests small and focused on a single feature/fix
- Use Conventional Commits: Follow commit message conventions (feat:, fix:, docs:, etc.)
- Install the latest Zig compiler
- Improve Zig proficiency via Ziglings
Add tests to build.zig/test_list.
- Regular tests: zig build test --summary all
- Heavy computational tests: zig build test -Dheavy --summary all
zig build codegen -Dmodel=model_name [-Dlog -Duser_tests=user_tests.json]
Generated code will be placed in:
generated/model_name/
├── lib_{model_name}.zig
├── test_{model_name}.zig
└── user_tests.json
zig build test-codegen -Dmodel=model_name
Build the static library:
zig build lib -Dmodel=model_name -Dtarget={arch} -Dcpu={cpu}
Linking with CMake:
target_link_libraries(your_project PUBLIC path/to/libzant.a)
To set a custom log function from your C code:
extern void setLogFunction(void (*log_function)(uint8_t *string));
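For example, a minimal C host that registers a logging callback might look like the sketch below. It assumes the generated library is linked as libzant.a and exposes setLogFunction exactly as declared above; the my_log callback and the printf sink are illustrative, not part of Z-Ant (on a microcontroller the callback would typically write to a UART instead):

#include <stdint.h>
#include <stdio.h>

/* Declaration provided by the generated Z-Ant library. */
extern void setLogFunction(void (*log_function)(uint8_t *string));

/* Forward each message from the library to stdout. */
static void my_log(uint8_t *string) {
    printf("[zant] %s\n", (const char *)string);
}

int main(void) {
    setLogFunction(my_log);
    /* ... run inference through the generated library ... */
    return 0;
}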
- Standard build:
  zig build # Build all targets
- Run unit tests:
  zig build test --summary all # Run all unit tests
- Code generation:
  zig build codegen -Dmodel=model_name # Generate code for specified model
- Static library compilation:
  zig build lib -Dmodel=model_name # Compile static library for deployment
- Test generated library:
  zig build test-generated-lib -Dmodel=model_name # Test specific generated model library
- OneOp model testing:
  zig build test-codegen-gen # Generate oneOperation test models
  zig build test-codegen # Test all generated oneOperation models
- ONNX parser testing:
  zig build onnx-parser # Test ONNX parser functionality
- Build main executable for profiling:
  zig build build-main -Dmodel=model_name # Build profiling target executable
- -Dtarget=<arch>: Target architecture (e.g., thumb-freestanding, native)
- -Dcpu=<cpu>: CPU model (e.g., cortex_m33, cortex_m4)
- -Dmodel=<name>: Model name (default: mnist-8)
- -Dmodel_path=<path>: Custom ONNX model path
- -Dgenerated_path=<path>: Output directory for generated code
- -Doutput_path=<path>: Output directory for compiled library
- -Dlog=true|false: Enable detailed logging during code generation
- -Duser_tests=<path>: Specify custom user tests JSON file
- -Dshape=<shape>: Input tensor shape
- -Dtype=<type>: Input data type (default: f32)
- -Dcomm=true|false: Generate code with comments
- -Ddynamic=true|false: Enable dynamic memory allocation
- -Dheavy=true|false: Run heavy computational tests
- -Dtest_name=<name>: Run specific test by name
- -Dtrace_allocator=true|false: Use tracing allocator for debugging (default: true)
- -Dallocator=<type>: Allocator type to use (default: raw_c_allocator)
# Generate code for MNIST model with logging
zig build codegen -Dmodel=mnist-1 -Dlog=true
# Build static library for ARM Cortex-M33
zig build lib -Dmodel=mnist-1 -Dtarget=thumb-freestanding -Dcpu=cortex_m33
# Test with heavy computational tests enabled
zig build test -Dheavy=true --summary all
# Generate code with custom paths and comments
zig build codegen -Dmodel=custom_model -Dmodel_path=my_models/custom.onnx -Dgenerated_path=output/ -Dcomm=true
# Build library with custom output location
zig build lib -Dmodel=mnist-1 -Doutput_path=/path/to/deployment/
# Run specific test
zig build test -Dtest_name=tensor_math_test
# Build profiling executable for performance analysis
zig build build-main -Dmodel=mnist-1 -Dtarget=native
We welcome contributions from developers of all skill levels! Here's how to get involved:
- Fork the repository on GitHub
- Clone your fork locally
- Create a feature branch for your work
- Make your changes following our coding standards
- Run tests to ensure everything works
- Submit a pull request for review
- 🐛 Bug Reports: Found an issue? Let us know!
- ✨ Feature Requests: Have an idea? Share it with us!
- 💻 Code Contributions: Improve the codebase or add new features
- 📚 Documentation: Help make the project easier to understand
- 🧪 Testing: Write tests or improve test coverage
- Follow our Code of Conduct
- Check out the Contributing Guide for detailed guidelines
- Join discussions on GitHub Issues and Discussions
All contributors are recognized in our Contributors list. Thank you for helping shape the future of tinyML!
This project is licensed under the terms of the LICENSE file included in the repository.
Join us in revolutionizing AI on edge devices! 🚀
GitHub • Documentation • Examples • Community