aster2024 (Jizhou Guo) / Starred · GitHub

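EasyOffer (a collection of LLM interview experiences) is a summer-internship offer guide tailored for LLM newcomers. It mainly records material for LLM summer-internship and fall-recruitment preparation: common hand-written coding questions from big tech companies, big-tech interview experiences, and common open-ended questions. I'm a beginner and still learning; corrections are welcome at any time, and I hope everyone lands the offer they want!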

Jupyter Notebook 217 18 Updated Mar 25, 2025

Offline markdown to pdf, choose -> edit -> transform 🥂

JavaScript 1,590 154 Updated Apr 11, 2025
Python 7 Updated May 16, 2025

A state-of-the-art open visual language model | multimodal pretrained model

Python 6,557 428 Updated May 29, 2024

Tips for releasing research code in Machine Learning (with official NeurIPS 2020 recommendations)

2,793 734 Updated May 19, 2023

FlashMLA: Efficient MLA decoding kernels

Cuda 11,571 836 Updated Apr 29, 2025
Python 45 Updated Oct 28, 2024
C 10 Updated Aug 14, 2024

Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"

Python 118 5 Updated May 6, 2025

A recipe for online RLHF and online iterative DPO.

Python 514 48 Updated Dec 28, 2024

[ICML 2025] Teaching Language Models to Critique via Reinforcement Learning

Python 97 6 Updated May 6, 2025

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024

Python 1,467 151 Updated May 24, 2025

APPS: Automated Programming Progress Standard (NeurIPS 2021)

Python 462 61 Updated Jun 19, 2024

Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task.

Python 147 15 Updated Oct 17, 2024

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,287 304 Updated May 13, 2025

My collection of machine learning papers

283 23 Updated Aug 10, 2023

Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them

492 31 Updated Jun 25, 2024

Dream 7B, a large diffusion language model

Python 659 28 Updated May 3, 2025

The first differentially-private diffusion model for tabular data

Python 25 7 Updated Jun 5, 2024

Implementation of the paper: "FedTabDiff: Federated Learning of Diffusion Models for Synthetic Mixed-Type Tabular Data Generation"

Python 20 4 Updated Nov 10, 2024

Differentially Private Diffusion Models

Python 99 15 Updated Dec 26, 2023
Python 315 14 Updated Sep 18, 2024

Official implementation of the paper "Process Reward Model with Q-value Rankings"

Python 58 6 Updated Feb 5, 2025

A simple unified framework for evaluating LLMs

HTML 212 23 Updated Apr 14, 2025

Implementation of self-certainty as an extension of the ZeroEval project

Python 5 Updated Mar 27, 2025

The MATH Dataset (NeurIPS 2021)

Python 1,121 98 Updated Aug 5, 2024

Recipes to train reward model for RLHF.

Python 1,345 96 Updated Apr 24, 2025

No fortress, purely open ground. OpenManus is Coming.

Python 46,040 8,033 Updated May 20, 2025

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 9,293 925 Updated May 16, 2025