dkawahara (Daisuke Kawahara) / Starred · GitHub
  • Waseda University
  • Tokyo, Japan

[ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning

Python 356 29 Updated Sep 6, 2024

Fully open reproduction of DeepSeek-R1

Python 24,735 2,289 Updated Jun 2, 2025

List of papers on Self-Correction of LLMs.

73 2 Updated Dec 28, 2024

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 2,598 186 Updated Jun 6, 2025

The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems

Python 2,095 131 Updated Jun 9, 2025

Yomitoku is an AI-powered document image analysis package designed specifically for the Japanese language.

Python 811 26 Updated Jun 6, 2025

S2ORC: The Semantic Scholar Open Research Corpus: https://www.aclweb.org/anthology/2020.acl-main.447/

Python 940 71 Updated Apr 26, 2024

The evaluation scripts of JMTEB (Japanese Massive Text Embedding Benchmark)

Python 61 14 Updated Apr 23, 2025

Benchmarking LLMs with Challenging Tasks from Real Users

Python 223 42 Updated Nov 3, 2024

Official repository for KoMT-Bench built by LG AI Research

Python 63 1 Updated Aug 8, 2024

Python 518 45 Updated Nov 20, 2024

A series of large language models trained from scratch by developers @01-ai

Jupyter Notebook 7,830 492 Updated Nov 27, 2024

Arena-Hard-Auto: An automatic LLM benchmark.

Python 844 102 Updated May 1, 2025

Scalable toolkit for efficient model alignment

Python 808 100 Updated May 31, 2025

OCR, layout analysis, reading order, table recognition in 90+ languages

Python 17,596 1,170 Updated Jun 6, 2025

MLX: An array framework for Apple silicon

C++ 20,881 1,226 Updated Jun 9, 2025

LLM training in simple, raw C/CUDA

Cuda 26,824 3,083 Updated May 10, 2025

Python 3,663 362 Updated May 13, 2025

A terminal application to view, tail, merge, and search log files (plus JSONL).

Python 3,474 72 Updated Aug 5, 2024

JMultiWOZ: A Large-Scale Japanese Multi-Domain Task-Oriented Dialogue Dataset, LREC-COLING 2024

Python 24 Updated Mar 27, 2024

[Nature Reviews Bioengineering🔥] Application of Large Language Models in Medicine. A curated list of practical guide resources of Medical LLMs (Medical LLMs Tree, Tables, and Papers)

1,587 139 Updated Jun 9, 2025

RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors' Own Personalities

59 Updated Mar 13, 2024

Grok open release

Python 50,301 8,354 Updated Aug 30, 2024

📃Language Model based sentences scoring library

Python 308 36 Updated Feb 9, 2022

Data and tools for generating and inspecting OLMo pre-training data.

Python 1,235 143 Updated Jun 5, 2025

Orion-14B is a family of models that includes a 14B foundation LLM and a series of derived models: a chat model, a long context model, a quantized model, a RAG fine-tuned model, and an Agent fine-tuned model. …

Python 790 56 Updated Jun 3, 2024

A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171

Python 963 98 Updated May 28, 2024

HTML 20 5 Updated Jan 27, 2025

[ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct

Python 2,018 164 Updated Nov 1, 2024