D1 student. Interested in Offline RL, Game AI, and JAX-based RL.
- The University of Tokyo
- Tokyo, Japan
- https://nissymori.github.io/
- @nissymori1
JAX-CORL Public
Clean single-file implementation of offline RL algorithms in JAX
direct-preference-optimization Public
Forked from eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
Python Apache License 2.0 Updated Aug 11, 2024
SRPO Public
Forked from AIDefender/SRPO
[NeurIPS 2023] The official code for paper "State Regularized Policy Optimization on Data with Dynamics Shift"
Python GNU General Public License v3.0 Updated Nov 23, 2023
td-gammon Public
Forked from dellalibera/td-gammon
TD-Gammon implementation
Python MIT License Updated Sep 25, 2023
D4RL Public
Forked from Farama-Foundation/D4RL
A collection of reference environments for offline reinforcement learning
Python Apache License 2.0 Updated Aug 29, 2023
a2c-minatar Public
Forked from sotetsuk/a2c-minatar
Python GNU General Public License v3.0 Updated May 17, 2023
reinforce Public
Forked from sotetsuk/reinforce
A simple REINFORCE algorithm implementation in PyTorch
Python MIT License Updated Nov 10, 2022
CDA Public archive
Forked from XuhuiZhou/CDA
code for our EMNLP2020 paper: Multilevel Text Alignment with Cross-Document Attention by Xuhui Zhou, Nikolaos Pappas, and Noah A. Smith
Python Updated Sep 14, 2022