duchao0726 (Chao Du) / Starred · GitHub

Reinforcing General Reasoning without Verifiers

Python 53 4 Updated May 30, 2025

Understanding R1-Zero-Like Training: A Critical Perspective

Python 966 45 Updated May 24, 2025

V1: Toward Multimodal Reasoning by Designing Auxiliary Task

Python 34 1 Updated Apr 14, 2025

A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.

Python 239 10 Updated Apr 15, 2025

AnchorAttention: Improved attention for LLMs long-context training

Python 208 6 Updated Jan 15, 2025

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python 372 27 Updated Jun 3, 2025

Official PyTorch implementation for ICLR 2025 paper "Scaling up Masked Diffusion Models on Text"

Python 202 12 Updated Dec 22, 2024

[ICLR 2025] A Closer Look at Machine Unlearning for Large Language Models

Python 29 5 Updated Dec 4, 2024

[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)

Python 85 3 Updated Oct 17, 2024

The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.

Python 46 Updated Oct 18, 2024

[ArXiv 2024] Denial-of-Service Poisoning Attacks on Large Language Models

Python 18 3 Updated Oct 22, 2024
Python 27 1 Updated Apr 22, 2025

[ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral)

Jupyter Notebook 78 Updated Oct 23, 2024

[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.

Python 123 7 Updated Mar 21, 2025

Official implementation of Bootstrapping Language Models via DPO Implicit Rewards

Python 44 3 Updated Apr 15, 2025

Improved techniques for optimization-based jailbreaking on large language models (ICLR 2025)

Python 102 7 Updated Apr 7, 2025

Improved Few-Shot Jailbreaking Can Circumvent Aligned Language Models and Their Defenses (NeurIPS 2024)

Jupyter Notebook 61 9 Updated Jan 11, 2025

[ICML 2024] Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast

Python 102 13 Updated Mar 26, 2024

Graph Diffusion Policy Optimization

Python 36 4 Updated Mar 17, 2024
Python 27 Updated Jan 23, 2024

Code of the paper: Finetuning Text-to-Image Diffusion Models for Fairness

Python 43 3 Updated Apr 26, 2024

Intriguing Properties of Data Attribution on Diffusion Models (ICLR 2024)

Jupyter Notebook 30 3 Updated Jan 23, 2024
Python 21 1 Updated May 7, 2024

[TMLR 2025] On Memorization in Diffusion Models

Python 26 1 Updated Oct 5, 2023

[COLM 2024] LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

Python 637 38 Updated Jul 22, 2024

Official code for "On Calibrating Diffusion Probabilistic Models"

Python 29 1 Updated Feb 22, 2023

Official code for "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps" (NeurIPS 2022 Oral)

Python 1,707 129 Updated Feb 6, 2024

Flax is a neural network library for JAX that is designed for flexibility.

Jupyter Notebook 6,596 706 Updated Jun 5, 2025

JAX-based neural network library

Python 3,042 246 Updated May 29, 2025