Stars
A powerful tool for creating fine-tuning datasets for LLMs
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
A course on aligning smol models.
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
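MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) internet slang. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.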
This repository contains demos I made with the Transformers library by HuggingFace.
GLM-4 series: Open Multilingual Multimodal Chat LMs
Chinese LLaMA & Alpaca LLM project, phase 3 (Chinese Llama-3 LLMs), developed from Meta Llama 3
This repo is the home base of a community-driven course on Computer Vision with Neural Networks. Feel free to join us on the Hugging Face Discord: hf.co/join/discord
Deep learning for image processing, including classification, object detection, etc.
A configurable, tunable, and reproducible library for CTR prediction https://fuxictr.github.io
An open-source tool-augmented conversational language model from Fudan University
State-of-the-Art Text Embeddings
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷
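ChatGLM3 series: Open Bilingual Chat LLMs
Chinese LLaMA-2 & Alpaca-2 LLM project, phase 2, plus 64K long-context models (Chinese LLaMA-2 & Alpaca-2 LLMs with 64K long context models)
A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and trained at low cost, covering base models, domain-specific fine-tuned models and applications, datasets, and tutorials.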
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Chat with your documents on your local device using GPT models. No data leaves your device and 100% private.
DIAMBRA Arena: a New Reinforcement Learning Platform for Research and Experimentation
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
A toolkit for developing and comparing reinforcement learning algorithms.