- Tokyo, Japan
- 22:16 (UTC +09:00)
- @1MoNo2Prod
Stars
A curated list of awesome .cursorrules files
FULL v0, Cursor, Manus, Same.dev, Lovable, Devin, Replit Agent, Windsurf Agent & VSCode Agent (And other Open Sourced) System Prompts, Tools & AI Models.
Real-time Speech-Text Foundation Model Toolkit (wip)
Run Orpheus 3B Locally With LM Studio
Autonomous software development agent apps using Amazon Bedrock, capable of customize to create/edit files, execute commands, search the web, use knowledge base, use multi-agents, generative images…
A Conversational Speech Generation Model
Enterprise SaaS Starter Kit - Kickstart your enterprise app development with the Next.js SaaS boilerplate
This project contains all the necessary boilerplate to setup a multi-tenant SaaS with Next.js including authentication and RBAC authorization.
SpeechGateway - A reverse proxy server that enhances speech synthesis with essential, extensible features.
litagin02 / Style-Bert-VITS2
Forked from fishaudio/Bert-VITS2
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles.
Build context-aware reasoning applications
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Azure OpenAI code resources for using gpt-4o-realtime capabilities.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
Create a TypeScript Action with tests, linting, workflow, publishing, and versioning
Headless TypeScript ORM with a head. Runs on Node, Bun and Deno. Lives on the Edge and yes, it's a JavaScript ORM too
A Python package to build AI-powered real-time audio applications
A chatbot app that converses using speech recognition, text generation, and speech synthesis
Real-time transcription using faster-whisper
Foundational Models for State-of-the-Art Speech and Text Translation
Deezer source separation library including pretrained models.