Stars
Hacks for training RL systems from John Schulman's lecture at Deep RL Bootcamp (Aug 2017)
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
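AFAC2024 Financial Intelligence Innovation Competition
A chatbot built on large language models that can be connected to WeChat Official Accounts, WeCom (Enterprise WeChat) apps, Feishu, DingTalk, and more. It can use ChatGPT/Claude/DeepSeek/ERNIE Bot/iFlytek Spark/Tongyi Qianwen/Gemini/GLM-4/Kimi/LinkAI, handles text, voice, and images, can access the operating system and the internet, and supports building a customized enterprise customer-service assistant on your own knowledge base.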
Bring projects, wikis, and teams together with AI. AppFlowy is the AI collaborative workspace where you achieve more without losing control of your data. The leading open source Notion alternative.
Fully Local Manus AI. No APIs, No $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. 🔔 Official updates only via twitter @Martin9…
🌐 WebAgent for Information Seeking built by Tongyi Lab: WebWalker & WebDancer & WebSailor https://arxiv.org/pdf/2507.02592
Unleashing the Power of Reinforcement Learning for Math and Code Reasoners
DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.
This is the code for SpeechTokenizer, presented in the paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
Official PyTorch implementation of BigVGAN (ICLR 2023)
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation me…
The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
The python library for real-time communication
A real-time voice conversation system based on Tongyi Qianwen Qwen2.5-Omni, using the online API service and supporting real-time voice interaction, dynamic voice activity detection, and streaming audio processing.
Have a natural, spoken conversation with AI!
Open Source framework for voice and multimodal conversational AI
ZeroSearch: Incentivize the Search Capability of LLMs without Searching
🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability
[ACM MM 2025] FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis
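An open-source Windows GUI tool that recognizes speech in videos and automatically generates SRT subtitle files.
妙幕 is a cross-platform client tool that batch-generates subtitle files for video or audio and supports subtitle translation through multiple providers, including Baidu, Volcengine, openai, ollama, and deepseek.
🎬 卡卡字幕助手 | VideoCaptioner - An LLM-based intelligent subtitle assistant covering the full workflow: video subtitle generation, sentence segmentation, correction, and subtitle translation. An LLM-powered tool for easy and efficient video subtitling.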
[ACL 2024] Progressive LLaMA with Block Expansion.
OpenAI Assistants API quickstart with Next.js.
A video translation and dubbing tool powered by LLMs, offering professional-grade translations and one-click full-process deployment. It can generate content optimized for platforms like YouTube,T…
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.