- Xidian University
- Xi’an
- https://jackzhu.top/
Stars
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Image editing is worth a single LoRA! 0.1% training data and 1% training parameters for fantastic image editing! Surpasses GPT-4o in ID persistence! Official ComfyUI workflow release! Only 4GB VRAM…
The simplest, fastest repository for training/finetuning small-sized VLMs.
✨✨VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model
ACE-Step: A Step Towards Music Generation Foundation Model
⚡ Automatically decrypt encryptions without knowing the key or cipher, decode encodings, and crack hashes ⚡
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
Have a natural, spoken conversation with AI!
Lightweight Knowledge Base and Feed Reader.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
MPC-BE: a universal audio and video file player for the Windows operating system.
Mini website for testing general CS knowledge and enforcing coding practice and memorization of common algorithms and data structures.
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
AI-Powered Photos App for the Decentralized Web 🌈💎✨
MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining
ACI.dev is the open source platform that connects your AI agents to 600+ tool integrations with multi-tenant auth, granular permissions, and access through direct function calling or a unified MCP …
100% open source dev kit for EOS S3 MCU+eFPGA SoC supported by fully open source SDK and FPGA Toolchain
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
A feature-rich command-line audio/video downloader
Zero-shot voice conversion and singing voice conversion, with real-time support