Tags: lyu-xg/tianshou
bump to v0.4.3 (thu-ml#432)
* add makefile
* bump version
* add isort and yapf
* update contributing.md
* update PR template
* spelling check
Fix SAC loss explode (thu-ml#333)
* change SAC action_bound_method to "clip" (tanh is hardcoded in forward)
* docstring update
* modelbase -> modelbased
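One way to read the "clip" change above: if the forward pass already squashes the action with tanh, a second tanh in the bounding step would shrink the reachable range to roughly (-0.762, 0.762), whereas clipping leaves an already-bounded action unchanged. The sketch below only illustrates that arithmetic. `forward_action`, `bound_tanh`, and `bound_clip` are hypothetical stand-ins, not Tianshou functions, and the sketch makes no claim about the exact cause of the loss explosion.

```python
import math


def forward_action(raw):
    # Hypothetical stand-in for a forward pass that already applies tanh.
    return math.tanh(raw)


def bound_tanh(action):
    # Hypothetical "tanh" bounding step: squashes the action a second time.
    return math.tanh(action)


def bound_clip(action, low=-1.0, high=1.0):
    # Hypothetical "clip" bounding step: a no-op for actions already in range.
    return max(low, min(high, action))


raw = 5.0
squashed = forward_action(raw)      # close to 1.0
double_squashed = bound_tanh(squashed)  # cannot exceed tanh(1)
clipped = bound_clip(squashed)      # identical to squashed

print(squashed, double_squashed, clipped)
```

Under these assumptions, clipping preserves the action the forward pass produced, while a second tanh would silently change it.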
v0.3.2 (thu-ml#292)
Throw a warning in ListReplayBuffer.
This version update is needed because of thu-ml#289: the previous release, v0.3.1, does not work properly under torch<=1.6.0 in a CUDA environment.
Add offline trainer and discrete BCQ algorithm (thu-ml#263)
The results need to be tuned after the `done` issue is fixed.
Co-authored-by: n+e <trinkle23897@gmail.com>
specify the meaning of logits in documentation (thu-ml#238)
change API of train_fn and test_fn (thu-ml#229)
train_fn(epoch) -> train_fn(epoch, num_env_step)
test_fn(epoch) -> test_fn(epoch, num_env_step)
add PSRL policy (thu-ml#202)
Add PSRL policy in tianshou/policy/modelbase/psrl.py.
Co-authored-by: n+e <trinkle23897@cmu.edu>