ummagumm-a (Sinii Viacheslav) / Starred · GitHub

Sinii Viacheslav ummagumm-a

16 followers · 10 following

Lists (1)

✨ Inspiration

Stars

corl-team / lime

Official implementation of the paper "You Do Not Fully Utilize Transformer's Representation Capacity"

Python 28 1 Updated May 28, 2025

yangky11 / miniF2F-lean4

Lean 57 20 Updated May 18, 2025

guidance-ai / guidance

A guidance language for controlling large language models.

Jupyter Notebook 20,248 1,107 Updated May 30, 2025

moaradwan / deep-learning-contextual-bandits

Deep learning models for contextual multi-armed bandit setting

Python 13 1 Updated May 16, 2021

huggingface / trl

Train transformer language models with reinforcement learning.

Python 13,970 1,922 Updated May 30, 2025

OpenLMLab / LOMO

LOMO: LOw-Memory Optimization

Python 984 70 Updated Jul 2, 2024

ayhem18 / Towards_Data_Science

This is a repository where I track and share the knowledge I acquire on my journey to reach my dream data position

Jupyter Notebook 4 Updated May 23, 2025

pytorch / rl

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Python 2,795 372 Updated May 29, 2025

NousResearch / Hermes-Function-Calling

Jupyter Notebook 895 114 Updated Sep 13, 2024

lucidrains / vector-quantize-pytorch

Vector (and Scalar) Quantization, in Pytorch

Python 3,269 264 Updated May 3, 2025

dodgejesse / show_your_work

Python 11 1 Updated Jan 21, 2020

showyourwork / showyourwork

A workflow for reproducible and open scientific articles

TeX 603 49 Updated May 27, 2025

allenai / allentune

Hyperparameter Search for AllenNLP

Python 139 12 Updated Mar 6, 2025

nisheeth-golakiya / hybrid-sac

Single-file pytorch implementation of hybrid-SAC

Python 58 12 Updated Jun 25, 2021

clvrai / create

CREATE Environment for long-horizon physics-puzzle tasks with diverse tools

Python 18 Updated Nov 22, 2022

corl-team / toy-meta-gym

Toy meta-RL environments for testing algorithms implementations

Python 7 Updated Feb 21, 2024

facebookresearch / online-dt

Online Decision Transformer

Python 259 39 Updated Jan 22, 2024

clvrai / new-actions-rl

Jupyter Notebook 24 8 Updated Aug 9, 2024

corl-team / ad-eps

Official Implementation for "In-Context Reinforcement Learning from Noise Distillation"

Python 31 1 Updated Sep 18, 2024

jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,510 536 Updated May 3, 2024

suessmann / Agentic-Transformer-Pytorch

Python 6 Updated Jun 11, 2024

google / brax

Massively parallel rigidbody physics simulation on accelerator hardware.

Jupyter Notebook 2,696 291 Updated May 27, 2025

stas00 / the-art-of-debugging

The Art of Debugging

C 885 39 Updated Aug 3, 2024

google-research / tuning_playbook

A playbook for systematically maximizing the performance of deep learning models.

28,751 2,366 Updated Jun 18, 2024

replicate / hype

A feed of trending repos/models from GitHub, Replicate, HuggingFace, and Reddit.

JavaScript 132 15 Updated Sep 12, 2024

dunnolab / xland-minigrid

JAX-accelerated Meta-Reinforcement Learning Environments Inspired by XLand and MiniGrid 🏎️

Python 289 20 Updated May 26, 2025

google-research / rliable

[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.

Jupyter Notebook 832 49 Updated Aug 12, 2024

facebookresearch / nevergrad

A Python toolbox for performing gradient-free optimization

Python 4,073 366 Updated May 27, 2025

corl-team / katakomba

Forked from tinkoff-ai/katakomba

Data-Driven NetHack Tools: Datasets (30+) and recurrent-baselines (AWAC, BC, CQL, IQL, REM)

Python 40 1 Updated Aug 22, 2023

corl-team / CORL

Forked from tinkoff-ai/CORL

High-quality single-file implementations of SOTA Offline and Offline-to-Online RL algorithms: AWAC, BC, CQL, DT, EDAC, IQL, SAC-N, TD3+BC, LB-SAC, SPOT, Cal-QL, ReBRAC

Python 556 31 Updated Feb 10, 2024
