Starred repositories
[CVPR 2025 Highlight] Official code release for Volumetrically Consistent 3D Gaussian Rasterization
ultralytics / mobileclip
Forked from apple/ml-mobileclip
Ultralytics implementation of the research paper, "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" CVPR 2024
Share a single keyboard and mouse between multiple computers.
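If you want to try the Xiaozhi project, or you are developing and testing the server side, you can use this web demo. Voice and text are both working, with combined voice and text output. It will be refined over further iterations. PRs welcome.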
A PyTorch implementation of the paper "EDGS: Eliminating Densification for Efficient Convergence of 3DGS"
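An Android and iOS voice conversation app based on Xiaozhi and xiaozhi-server, supporting real-time voice interaction and text chat. It is now built with Flutter and covers both iOS and Android. Please give it a star to show your support.
A Python version of Xiaozhi AI, mainly for people who want to try Xiaozhi's features but have no hardware. Please give it a star if you can!
Backend service for xiaozhi-esp32 that helps you quickly build an ESP32 device control server.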
The Python library for real-time communication
[CVPR 2025 Oral] VGGT: Visual Geometry Grounded Transformer
Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and opensource models.
Ultimate camera streaming application with support for RTSP, RTMP, HTTP-FLV, WebRTC, MSE, HLS, MP4, MJPEG, HomeKit, FFmpeg, etc.
The official release of the paper "SeeLe: A Unified Acceleration Framework for Real-Time Gaussian Splatting"
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Agent Framework / shim to use Pydantic with LLMs
An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
A generative world for general-purpose robotics & embodied AI learning.
Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
🤗 smolagents: a barebones library for agents that think in python code.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
This repository collects papers on VLLM applications. New papers are added from time to time.
SplatAD: Real-Time Lidar and Camera Rendering with 3D Gaussian Splatting for Autonomous Driving