Li Yudong (李煜东) ydli-ai

🎯

Focusing

PostDoc, Tsinghua University

154 followers · 1 following

Beijing
UTC +08:00
https://scholar.google.com/citations?user=j4EmuqkAAAAJ&hl=zh-CN

Stars

kenjihiranabe / The-Art-of-Linear-Algebra

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

PostScript 19,339 2,351 Updated Nov 13, 2024

joonspk-research / generative_agents

Generative Agents: Interactive Simulacra of Human Behavior

18,984 2,533 Updated Aug 5, 2024

jeinlee1991 / chinese-llm-benchmark

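Currently covers 213 large models, including commercial models such as chatgpt, gpt-4o, o3-mini, Google gemini, Claude3.5, Zhipu GLM-Zero, ERNIE Bot (文心一言), qwen-max, Baichuan, iFlytek Spark, SenseTime senseChat, and minimax, as well as DeepSeek-R1, qwq-32b, deepseek-v3, qwen2.5, llama3.3, phi-4, glm4, gemma3, mistral, 书生in…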

4,189 175 Updated May 6, 2025

twang2218 / vocab-coverage

An analysis of the Chinese-language cognition capabilities of language models

Python 237 25 Updated Sep 9, 2023

Lightning-AI / lit-llama

Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

Python 6,053 522 Updated Sep 6, 2024

pengxiao-song / LaWGPT

🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese legal knowledge. A large language model based on Chinese legal knowledge.

Python 5,975 552 Updated Jun 11, 2024

esbatmop / MNBVC

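MNBVC (Massive Never-ending BT Vast Chinese corpus): an ultra-large-scale Chinese corpus, benchmarked against the 40T of data used to train chatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures and even "Martian script" (火星文) data. It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.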

3,846 272 Updated Apr 13, 2025

togethercomputer / RedPajama-Data

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,716 353 Updated Dec 7, 2024

OpenLMLab / GAOKAO-Bench

GAOKAO-Bench is an evaluation framework that utilizes GAOKAO questions as a dataset to evaluate large language models.

Python 643 44 Updated Jan 7, 2025

LianjiaTech / BELLE

BELLE: Be Everyone's Large Language model Engine (an open-source Chinese conversational large language model)

HTML 8,146 768 Updated Oct 16, 2024

PhoebusSi / Alpaca-CoT

We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., lora, p-tuning) together for easy use. We welcome open-source enthusiasts…

Jupyter Notebook 2,741 253 Updated Dec 12, 2023

ProjectD-AI / llama_inference

llama inference for tencentpretrain

Python 98 11 Updated Jun 8, 2023

LAION-AI / Open-Instruction-Generalist

Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks

Python 208 19 Updated Jan 13, 2024

ggml-org / llama.cpp

LLM inference in C/C++

C++ 79,444 11,656 Updated May 9, 2025

qwopqwop200 / GPTQ-for-LLaMa

4 bits quantization of LLaMA using GPTQ

Python 3,050 460 Updated Jul 13, 2024

XueFuzhao / InstructionWild

459 41 Updated Jun 9, 2024

CLUEbenchmark / pCLUE

pCLUE: 1,000,000+ multi-task prompt learning dataset

Jupyter Notebook 492 59 Updated Oct 4, 2022

nebuly-ai / optimate

A collection of libraries to optimise AI model performances

Python 8,372 636 Updated Jul 22, 2024

hukkelas / DSFD-Pytorch-Inference

A High-Performance Pytorch Implementation of face detection models, including RetinaFace and DSFD

Python 227 60 Updated Apr 5, 2024

zhanlaoban / EDA_NLP_for_Chinese

An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese text. NLP data augmentation. Paper reading notes.

Python 1,372 240 Updated May 31, 2022

FacePerceiver / FaRL

FaRL for Facial Representation Learning [Official, CVPR 2022]

Python 412 23 Updated Jun 9, 2023

SelfishGene / SFHQ-dataset

Synthetic Faces High Quality (SFHQ) Dataset. 425,000 curated 1024x1024 synthetic face images

Python 231 6 Updated Oct 14, 2024

UKPLab / sentence-transformers

State-of-the-Art Text Embeddings

Python 16,631 2,586 Updated May 8, 2025

dbiir / UER-py

Open Source Pre-training Model Framework in PyTorch & Pre-trained Model Zoo

Python 3,063 523 Updated May 9, 2024

rom1504 / img2dataset

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 4,021 355 Updated Aug 7, 2024

lucidrains / DALLE-pytorch

Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch

Python 5,612 643 Updated Feb 17, 2024

google-research-datasets / wit

WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

1,051 42 Updated Sep 27, 2024

IIGROUP / MM-CelebA-HQ-Dataset

[CVPR 2021] Multi-Modal-CelebA-HQ: A Large-Scale Text-Driven Face Generation and Understanding Dataset

Python 240 20 Updated Jun 1, 2024

skywind3000 / ECDICT

Free English to Chinese Dictionary Database

Python 6,541 1,135 Updated Mar 28, 2025

liuhuanyong / ChineseSemanticKB

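ChineseSemanticKB, a Chinese semantic knowledge base: 12 categories of million-scale common semantic dictionaries for Chinese language processing, including a 340,000-entry abstract semantic lexicon, a 340,000-entry antonym lexicon, and a 430,000-entry synonym lexicon. It supports applications such as sentence expansion, rewriting, and event abstraction and generalization.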

Python 755 161 Updated Mar 17, 2023
