Tags: ashok-arora/tianshou
Fix info not being passed in PGPolicy (thu-ml#787); closes thu-ml#775
Add vecenv wrappers for obs_norm to support running mujoco experiment with envpool (thu-ml#628)
- add VectorEnvWrapper and VectorEnvNormObs
- store obs_rms in policy save/load
- align mujoco scripts with atari: obs_norm, envpool, wandb and README
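The observation normalization mentioned above depends on running statistics (the `obs_rms` that gets saved with the policy). The sketch below shows one common way to maintain such statistics, using a batched mean/variance merge. The class name `RunningMeanStd`, its methods, and the `eps` default are assumptions made for illustration; this is not the code from thu-ml#628.

```python
import math


class RunningMeanStd:
    """Running per-dimension mean and variance (hypothetical helper)."""

    def __init__(self, dim, eps=1e-8):
        self.mean = [0.0] * dim
        self.var = [1.0] * dim
        self.count = 0
        self.eps = eps  # avoids division by zero when normalizing

    def update(self, batch):
        """Merge a batch of observations (list of equal-length rows)."""
        n = len(batch)
        if n == 0:
            return
        total = self.count + n
        for d in range(len(self.mean)):
            col = [row[d] for row in batch]
            b_mean = sum(col) / n
            b_var = sum((x - b_mean) ** 2 for x in col) / n
            delta = b_mean - self.mean[d]
            # Combine the two sums of squared deviations.
            m2 = (self.var[d] * self.count + b_var * n
                  + delta ** 2 * self.count * n / total)
            self.mean[d] += delta * n / total
            self.var[d] = m2 / total
        self.count = total

    def normalize(self, obs):
        """Return obs shifted and scaled by the running statistics."""
        return [(x - m) / math.sqrt(v + self.eps)
                for x, m, v in zip(obs, self.mean, self.var)]
```

Because the mean, variance, and count are state, a policy trained on normalized observations only behaves the same after reloading if these values are restored too, which is presumably why the commit stores `obs_rms` alongside the policy.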
rename save_fn to save_best_fn to avoid ambiguity (thu-ml#575) This PR also introduces `tianshou.utils.deprecation` for a unified deprecation wrapper.
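A rename like this is usually paired with a shim that still accepts the old keyword and warns about it. The sketch below illustrates that pattern with the standard `warnings` module. The helper names `warn_deprecated` and `resolve_save_fn` are hypothetical and are not taken from `tianshou.utils.deprecation`.

```python
import warnings


def warn_deprecated(msg):
    """Emit a DeprecationWarning (hypothetical unified wrapper)."""
    # stacklevel=3 attributes the warning to the caller of the
    # function that invoked this helper.
    warnings.warn(msg, category=DeprecationWarning, stacklevel=3)


def resolve_save_fn(save_best_fn=None, save_fn=None):
    """Accept the old `save_fn` keyword while preferring `save_best_fn`."""
    if save_fn is not None:
        warn_deprecated("save_fn is deprecated; use save_best_fn instead.")
        if save_best_fn is None:
            save_best_fn = save_fn
    return save_best_fn
```

Routing every deprecation through a single helper keeps the warning category and stack level consistent across the code base.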
Add VizDoom PPO example and results (thu-ml#533)
* update vizdoom ppo example
* update README with results
Fix conda support and keep API compatibility (thu-ml#536)
* loosen constraints
* fix nni issue (thu-ml#478)
* fix coverage
Fix critic network for Discrete CRR (thu-ml#485)
- Fixes an inconsistency in the implementation of Discrete CRR. It now uses the `Critic` class for its critic, following the convention of other actor-critic policies;
- Updates several offline policies to use the `ActorCritic` class for their optimizers, eliminating randomness caused by parameter sharing between actor and critic;
- Adds `writer.flush()` in TensorboardLogger so results are written in real time;
- Allows `test_collector=None` in 3 trainers to turn off testing during training;
- Updates the Atari offline results in README.md;
- Moves Atari offline RL examples to `examples/offline` and tests to `test/offline`, per review comments.