Tags · YunqiuXu/tianshou · GitHub

Tags: YunqiuXu/tianshou

v0.4.5

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Fix critic network for Discrete CRR (thu-ml#485)

- Fixes an inconsistency in the implementation of Discrete CRR: it now uses the `Critic` class for its critic, following the convention in other actor-critic policies;
- Updates several offline policies to use the `ActorCritic` class for their optimizer, eliminating randomness caused by parameter sharing between actor and critic;
- Adds `writer.flush()` in TensorboardLogger to ensure results appear in real time;
- Enables `test_collector=None` in 3 trainers to turn off testing during training;
- Updates the Atari offline results in README.md;
- Moves Atari offline RL examples to `examples/offline` and tests to `test/offline`, per review comments.
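
The optimizer item above can be illustrated with a small sketch. It assumes, without confirmation from the release note, that the randomness came from gathering parameters of an actor and a critic that share a backbone in a way that does not preserve order or uniqueness. `Param` and `unique_params` are hypothetical stand-ins written for this example and are not Tianshou or PyTorch APIs.

```python
from typing import Iterable, List


class Param:
    """Hypothetical stand-in for a trainable tensor."""

    def __init__(self, name: str) -> None:
        self.name = name


def unique_params(*groups: Iterable[Param]) -> List[Param]:
    """Merge parameter groups, dropping duplicates by identity.

    Insertion order is preserved, so the result is the same on every run.
    """
    seen = set()
    ordered = []
    for group in groups:
        for p in group:
            if id(p) not in seen:
                seen.add(id(p))
                ordered.append(p)
    return ordered


# Actor and critic share the backbone parameters (same objects).
shared = [Param("backbone.weight"), Param("backbone.bias")]
actor_params = shared + [Param("actor.head")]
critic_params = shared + [Param("critic.head")]

params = unique_params(actor_params, critic_params)
print([p.name for p in params])
```

Each shared parameter appears once and the order is stable, which is the property an optimizer needs to behave reproducibly across runs.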

v0.4.4

bump to 0.4.4

v0.4.3

bump to v0.4.3 (thu-ml#432)

* add makefile
* bump version
* add isort and yapf
* update contributing.md
* update PR template
* spelling check

v0.4.2

add vizdoom example, bump version to 0.4.2 (thu-ml#384)

v0.4.1

Fix SAC loss explode (thu-ml#333)

* change SAC action_bound_method to "clip" (tanh is hardcoded in forward)
* docstring update
* modelbase -> modelbased
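
A minimal sketch of why "clip" is the natural bound method when the forward pass already applies tanh. The functions below are hypothetical and written only for illustration; they are not Tianshou's implementation, and the release note does not say that a second tanh was the cause of the exploding loss.

```python
import math


def actor_forward(raw: float) -> float:
    """Hypothetical actor output: tanh is applied here, so the result is in (-1, 1)."""
    return math.tanh(raw)


def bound_clip(action: float, low: float = -1.0, high: float = 1.0) -> float:
    """Clip to [low, high]; leaves an already-bounded action unchanged."""
    return max(low, min(high, action))


def bound_tanh(action: float) -> float:
    """Squash with tanh; changes an action that was already squashed."""
    return math.tanh(action)


raw = 3.0
action = actor_forward(raw)    # already inside (-1, 1)
clipped = bound_clip(action)   # identical to action
squashed = bound_tanh(action)  # squashed a second time, magnitude below tanh(1)

print(action, clipped, squashed)
```

With clipping, the action that reaches the environment is the one the forward pass produced; with a second tanh it would not be.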

v0.4.0

Merge pull request thu-ml#302 from thu-ml/dev

v0.4.0

v0.3.2

v0.3.2 (thu-ml#292)

Throw a warning in ListReplayBuffer.

This version update is needed because of thu-ml#289: the previous v0.3.1 does not work well under torch<=1.6.0 in a CUDA environment.

v0.3.1

Add offline trainer and discrete BCQ algorithm (thu-ml#263)

The results need to be tuned after the `done` issue is fixed.

Co-authored-by: n+e <trinkle23897@gmail.com>

v0.3.0.post1

specify the meaning of logits in documentation (thu-ml#238)

v0.3.0

change API of train_fn and test_fn (thu-ml#229)

train_fn(epoch) -> train_fn(epoch, num_env_step)
test_fn(epoch) -> test_fn(epoch, num_env_step)
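
As an illustration of what the extra argument allows, here is a hedged sketch of a callback with the new two-argument signature that anneals an exploration rate by environment steps instead of by epoch. `make_train_fn`, `set_eps`, and the schedule constants are hypothetical names chosen for this example; only the `(epoch, num_env_step)` signature comes from the note above.

```python
from typing import Callable, List


def make_train_fn(
    set_eps: Callable[[float], None],
    eps_start: float = 1.0,
    eps_end: float = 0.05,
    decay_steps: int = 10_000,
) -> Callable[[int, int], None]:
    """Build a train_fn(epoch, num_env_step) that linearly decays epsilon."""

    def train_fn(epoch: int, num_env_step: int) -> None:
        # Fraction of the decay completed, capped at 1.0.
        frac = min(num_env_step / decay_steps, 1.0)
        set_eps(eps_start + frac * (eps_end - eps_start))

    return train_fn


# Record the values instead of touching a real policy.
history: List[float] = []
train_fn = make_train_fn(history.append)
train_fn(1, 0)       # start of training
train_fn(1, 5_000)   # halfway through the decay
train_fn(2, 20_000)  # past the end of the decay
print(history)
```

A schedule keyed on the step count stays the same if the number of steps per epoch changes, which a purely epoch-based callback cannot guarantee.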