- San Jose
- www.sanghyunyi.com
- @Yi_SangHyun
Stars
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
A curated list of recent and past chart understanding work based on our IEEE TKDE survey paper: From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Mod…
Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…
Awesome multilingual OCR and document parsing toolkits based on PaddlePaddle (a practical, ultra-lightweight OCR system that supports recognition of 80+ languages and provides data annotation and synthesis tools,…
End-to-end neural table-text understanding models.
This is an official implementation for the WTW Dataset in "Parsing Table Structures in the Wild" on table detection and table structure recognition.
[ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding
Official Repository of ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Implementation of Nougat: Neural Optical Understanding for Academic Documents
Get your documents ready for gen AI
Repository for collecting and categorizing papers outlined in our survey paper: "Large Language Models on Tabular Data -- A Survey".
An extremely fast Python package and project manager, written in Rust.
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
The official GitHub repository for the paper "R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation" (EMNLP 2024 Findings).
A one-stop repository for generative AI research updates, interview resources, notebooks, and much more!
A comprehensive collection of KAN (Kolmogorov-Arnold Network)-related resources, including libraries, projects, tutorials, papers, and more, for researchers and developers in the Kolmogorov-Arnold N…
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Retrieval and Retrieval-augmented LLMs
A tool for extracting plain text from Wikipedia dumps
Train transformer language models with reinforcement learning.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.