-
Tsinghua University
- China
- https://wanng-ide.github.io/
- https://orcid.org/0000-0001-9869-7085
Stars
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
Xiaomi Home Integration for Home Assistant
A Survey on the Honesty of Large Language Models
[EMNLP 2024] A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models.
[ICLR 2025] ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation
A collection of various Markdown files for testing purposes.
Open-Sora: Democratizing Efficient Video Production for All
MNBVC(Massive Never-ending BT Vast Chinese corpus)超大规模中文语料集。对标chatGPT训练的40T数据。MNBVC数据集不但包括主流文化,也包括各个小众文化甚至火星文的数据。MNBVC数据集包括新闻、作文、小说、书籍、杂志、论文、台词、帖子、wiki、古诗、歌词、商品介绍、笑话、糗事、聊天记录等一切形式的纯文本中文数据。
The most accurate natural language detection library for Python, suitable for short text and mixed-language text
An Open-Source Python3 tool with SMALL models for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowe…
[ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models (LLMs + MCTS + Self-Improvement)
🔥[IJCAI 2022, Official Code] for paper "Rethinking Image Aesthetics Assessment: Models, Datasets and Benchmarks". Official Weights and Demos provided. 首个面向多主题场景的美学评估数据集、算法和benchmark.
Stable Diffusion web UI
Cross-platform internet upload/download manager for HTTP(S), FTP(S), SSH, magnet-link, BitTorrent, m3u8, ed2k, and online videos. WebDAV client, FTP client, SSH client.
Multitask Optimization Platform (MToP): A MATLAB Optimization Platform for Evolutionary Multitasking
⭐⭐⭐FightingCV Paper Reading, which helps you understand the most advanced research work in an easier way 🍀 🍀 🍀
📄 Awesome OCR multiple programing languages toolkits based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.
🎦 Extract video hard subtitles and automatically generate corresponding srt files.
视频硬字幕提取,生成srt文件。无需申请第三方API,本地实现文本识别。基于深度学习的视频字幕提取框架,包含字幕区域检测、字幕内容提取。A GUI tool for extracting hard-coded subtitle (hardsub) from videos and generating srt files.
METER: A Multimodal End-to-end TransformER Framework