sophiapeng90 / Starred · GitHub

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 1,821 129 Updated May 16, 2025

A live stream development of RL tuning for LLM agents

Python 2,749 379 Updated May 13, 2025

SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning

Python 280 21 Updated May 15, 2025
Python 34 2 Updated May 15, 2025
Python 170 7 Updated Apr 2, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 8,081 947 Updated May 17, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,241 302 Updated May 13, 2025

TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios

Python 199 20 Updated Sep 24, 2024

Scaling Deep Research via Reinforcement Learning in Real-world Environments.

Python 364 24 Updated Apr 13, 2025

FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.

Jupyter Notebook 16,180 2,277 Updated Dec 26, 2024

Qlib is an AI-oriented quantitative investment platform that aims to realize the potential, empower research, and create value using AI technologies in quantitative investment, from exploring ideas…

Python 19,207 3,150 Updated May 13, 2025
Python 209 43 Updated Aug 26, 2024
Python 33 10 Updated Jul 20, 2022

A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

Python 2,558 183 Updated Jan 30, 2025

Code for the paper "FinRL-DeepSeek: LLM-Infused Risk-Sensitive Reinforcement Learning for Trading Agents" arXiv:2502.07393

Jupyter Notebook 204 63 Updated Apr 8, 2025

Data and code for EMNLP 2021 paper "FinQA: A Dataset of Numerical Reasoning over Financial Data"

Python 286 44 Updated Jun 6, 2022

Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"

Python 1,660 77 Updated Apr 18, 2025

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 2,264 158 Updated May 16, 2025

Minimal reproduction of DeepSeek R1-Zero

Python 11,764 1,485 Updated Apr 24, 2025

Reproduce R1 Zero on Logic Puzzle

Python 2,337 155 Updated Mar 20, 2025

[EMNLP 2024 Findings] OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs.

Python 147 14 Updated Nov 13, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 14,405 1,769 Updated May 17, 2025

A guidance language for controlling large language models.

Jupyter Notebook 20,195 1,104 Updated May 16, 2025