Stars
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
Tile primitives for speedy kernels
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
manmay-nakhashi / TTSizer
Forked from taresh18/TTSizer
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Tutel MoE: Optimized Mixture-of-Experts Library, Support DeepSeek FP8/FP4
Framework for testing vulnerabilities of large language models (LLMs).
Kamailio - The Open Source SIP Server for large VoIP and real-time communication platforms
dangvansam / pyannote-onnx
Forked from pyannote/pyannote-audio
PyAnnote Voice Activity Detection (ONNX version)
FULL v0, Cursor, Manus, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent, VSCode Agent, Dia Browser & Trae AI (And other Open Sourced) System Prompts, Tools & AI Models.
anan235 / dia-multilingual
Forked from nari-labs/dia
A TTS model capable of generating ultra-realistic dialogue in one pass.
ONNX Inference of Pyannote Segmentation
A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
Distillation of Self-Supervised Representation-Based Speech Quality Assessment
llm-d is a Kubernetes-native high-performance distributed LLM inference framework
VLMHyperBench: an open-source framework for evaluating how well vision language models (VLMs) recognize Russian-language documents, in order to assess their potential for automating document workflows.
TEN VAD: low-latency high-performance Voice Activity Detector
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.