Human Language Processing Laboratory (HLP Lab)
Wuhan, China
Stars
Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning, training, and prompt engineering examples. A bonus section with ChatGPT, GPT-3.5-turbo, GPT-4, and DALL…
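As a quick illustration of the kind of prompt-engineering calls that bonus section deals with, here is a minimal sketch against the OpenAI chat completions API; the model name, system prompt, and example request are placeholders, not taken from the book.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One-shot prompt: ask the model to act as a translator (illustrative example).
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder; any chat-capable model works
    messages=[
        {"role": "system", "content": "You are a concise technical translator."},
        {"role": "user", "content": "Translate to French: 'Attention is all you need.'"},
    ],
)
print(response.choices[0].message.content)
```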
Jupyter notebooks for the Natural Language Processing with Transformers book
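For the Hugging Face side, a minimal sketch of the `pipeline` API those notebooks build on; the example sentence is illustrative.

```python
from transformers import pipeline

# Text-classification pipeline; downloads a default checkpoint on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers make NLP experiments remarkably easy."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```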
Inspired by Google's C4, a series of colossal clean data-cleaning scripts focused on CommonCrawl processing, including the Chinese data processing and cleaning methods from MassiveText.
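A minimal sketch of the line-level heuristics the C4 paper describes (keep lines ending in terminal punctuation with enough words, drop boilerplate-looking pages); the thresholds and function name are illustrative, not taken from this repository.

```python
def c4_style_filter(page_text: str, min_words: int = 5, min_lines: int = 3) -> str | None:
    """Keep sentence-like lines; drop the page if too little survives."""
    lowered = page_text.lower()
    if "lorem ipsum" in lowered or "{" in page_text:  # likely boilerplate or code
        return None
    kept = []
    for line in page_text.splitlines():
        line = line.strip()
        if line.endswith((".", "!", "?", '"')) and len(line.split()) >= min_words:
            kept.append(line)
    if len(kept) < min_lines:
        return None
    return "\n".join(kept)
```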
PDF scientific paper translation with preserved formats: AI-based full-text bilingual translation of PDF documents that fully preserves the original layout; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero interfaces.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
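A minimal sketch of the causal self-attention block such a from-scratch GPT is built around; the class name and dimensions are illustrative, not the book's code.

```python
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    """Single-head causal self-attention: each token attends only to earlier tokens."""
    def __init__(self, d_model: int, max_len: int = 256):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
        self.out = nn.Linear(d_model, d_model)
        # Upper-triangular mask hides future positions.
        mask = torch.triu(torch.ones(max_len, max_len, dtype=torch.bool), diagonal=1)
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / d ** 0.5
        scores = scores.masked_fill(self.mask[:t, :t], float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        return self.out(weights @ v)

x = torch.randn(2, 16, 64)            # (batch, tokens, d_model)
attn = CausalSelfAttention(d_model=64)
print(attn(x).shape)                  # torch.Size([2, 16, 64])
```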
BLEURT is a metric for Natural Language Generation based on transfer learning.
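A minimal scoring sketch following the API shown in the BLEURT README; the checkpoint path is a placeholder for a checkpoint you download separately.

```python
from bleurt import score

checkpoint = "BLEURT-20"  # placeholder: path to a downloaded BLEURT checkpoint
references = ["The cat sat on the mat."]
candidates = ["A cat was sitting on the mat."]

scorer = score.BleurtScorer(checkpoint)
scores = scorer.score(references=references, candidates=candidates)
print(scores)  # one learned quality score per candidate
```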
🚀🚀 [LLM] Train a small 26M-parameter GPT completely from scratch in just 2 hours! 🌏
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
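A minimal sketch of the byte-level BPE training loop that algorithm boils down to (repeatedly merge the most frequent adjacent pair into a new token id); function names are illustrative, not taken from the repository.

```python
from collections import Counter

def most_frequent_pair(ids: list[int]) -> tuple[int, int]:
    """Return the most common adjacent pair of ids."""
    return Counter(zip(ids, ids[1:])).most_common(1)[0][0]

def merge(ids: list[int], pair: tuple[int, int], new_id: int) -> list[int]:
    """Replace every occurrence of `pair` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

# Train a tiny vocabulary on raw UTF-8 bytes.
ids = list("aaabdaaabac".encode("utf-8"))
for new_id in range(256, 259):          # three merges
    pair = most_frequent_pair(ids)
    ids = merge(ids, pair, new_id)
    print(f"merged {pair} -> {new_id}: {ids}")
```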
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Python GUI Programming Cookbook, Third Edition, Published by Packt
A tool that locates, downloads, and extracts machine translation corpora
NTREX -- News Test References for MT Evaluation
Facebook Low Resource (FLoRes) MT Benchmark
VITS Japanese with Whisper as the data processor (you can train your own VITS even if you only have audio files).
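A minimal sketch of using Whisper to turn raw audio into transcripts for TTS training data; the model size and audio path are placeholders.

```python
import whisper

# Load a pretrained Whisper model and transcribe one audio file.
model = whisper.load_model("base")
result = model.transcribe("speech_sample.wav")  # placeholder path
print(result["text"])              # full transcript
for seg in result["segments"]:     # per-segment timestamps, useful for alignment
    print(seg["start"], seg["end"], seg["text"])
```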
CjangCjengh / vits
Forked from jaywalnut310/vits. VITS implementation for Japanese, Chinese, Korean, Sanskrit and Thai.
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
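A minimal text-extraction sketch using PyMuPDF's documented open/get_text calls; the file path is a placeholder.

```python
import fitz  # PyMuPDF is imported under the name "fitz"

doc = fitz.open("example.pdf")   # placeholder path
for page in doc:
    print(page.get_text())       # plain text of each page
doc.close()
```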