Releases · evanatyourservice/kron_torch · GitHub

Releases: evanatyourservice/kron_torch

kron-torch 0.3.2

24 Mar 18:10

What's Changed

  • Transformers trainer compatibility by @mkurman in #7
  • Small changes and improvements

New Contributors

Full Changelog: v0.3.1...v0.3.2

kron-torch 0.3.1

20 Feb 22:15

What's Changed

  • Improve the merge dims option and make it default to True. Merging dims now finds the most square matrix that each grad tensor can be reshaped into.
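
As a rough illustration of what "most square" can mean here, the sketch below tries every contiguous split of a gradient's shape and keeps the one whose two sides are closest in size. The function name `most_square_split` is hypothetical, and restricting the search to contiguous splits without permuting dimensions is an assumption; the library's actual merging logic may differ.

```python
import math


def most_square_split(shape):
    """Return (rows, cols) for the contiguous split of `shape` closest to square.

    Hypothetical helper for illustration only, not the kron_torch implementation.
    """
    if len(shape) < 2:
        raise ValueError("need at least two dimensions to form a matrix")
    best = None
    for k in range(1, len(shape)):
        rows = math.prod(shape[:k])  # merge the leading k dims
        cols = math.prod(shape[k:])  # merge the remaining dims
        gap = abs(rows - cols)       # smaller gap means closer to square
        if best is None or gap < best[0]:
            best = (gap, rows, cols)
    return best[1], best[2]


# A conv kernel of shape (64, 3, 3, 3) would be viewed as a 64 x 27 matrix.
print(most_square_split((64, 3, 3, 3)))
```

Because every split has the same total number of elements, minimizing the difference between the two sides is equivalent to minimizing their ratio.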

kron-torch 0.3.0

12 Feb 17:52

What's Changed

  • Add distributed versions of PSGD Kron and PSGD One-Sided Kron that use simple pipeline sharding, distributing params across GPUs layer-wise
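
The release note does not say how layers are assigned to GPUs. The sketch below shows one plausible layer-wise scheme: whole layers are handed out greedily, largest first, to whichever rank currently holds the fewest parameters. The function `assign_layers`, the greedy rule, and the example layer names are all assumptions made for illustration, not the library's method.

```python
def assign_layers(layer_sizes, world_size):
    """Map each layer name to a rank, keeping whole layers together.

    Hypothetical greedy balancing: largest layers first, each placed on
    the rank with the smallest running parameter count.
    """
    loads = [0] * world_size
    owner = {}
    # sorted() is stable, so equally sized layers keep their input order.
    for name, size in sorted(layer_sizes.items(), key=lambda kv: -kv[1]):
        rank = min(range(world_size), key=loads.__getitem__)
        owner[name] = rank
        loads[rank] += size
    return owner


sizes = {"embed": 100, "attn": 50, "mlp": 50, "norm": 10}
print(assign_layers(sizes, world_size=2))
```

Under a scheme like this, each rank would hold optimizer state only for the layers it owns.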

kron-torch 0.2.9

02 Jan 18:51

What's Changed

  • Merge memory improvement PR from Lucas Nestler (@ClashLuke)

kron-torch 0.2.6

02 Dec 16:31

What's Changed

  • Remove the trust region
  • Add an argument to normalize grads layer-wise
  • Update preconditioners deterministically for stability
  • TODO: update using Lucas Nestler's optimizations
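
To make the layer-wise normalization item above concrete, here is a minimal sketch that rescales each layer's gradient to unit L2 norm independently of the other layers. Treating "normalize" as unit L2 norm is an assumption, and `normalize_layerwise` and its `eps` parameter are hypothetical names; the notes above do not give the argument's name or exact behavior.

```python
import math


def normalize_layerwise(grads, eps=1e-12):
    """Scale each layer's gradient (a flat list of floats) to unit L2 norm.

    Hypothetical helper: every layer is normalized on its own, so one
    layer's gradient scale does not affect another's.
    """
    out = {}
    for name, g in grads.items():
        norm = math.sqrt(sum(x * x for x in g))
        out[name] = [x / (norm + eps) for x in g]  # eps guards against zero grads
    return out


grads = {"layer1": [3.0, 4.0], "layer2": [0.0, 10.0]}
print(normalize_layerwise(grads))
```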

kron-torch 0.2.5

10 Nov 01:53

What's Changed

  • Small improvements

kron-torch 0.2.4

07 Nov 03:12

What's Changed

  • Efficiency improvements from ClashLuke
  • New trust region clipping that needs less (maybe no) tuning

kron-torch 0.2.3

29 Oct 21:29

What's Changed

  • triton install, 3.0.0

kron-torch 0.2.2

29 Oct 21:09

What's Changed

  • Trust region clipping improved
  • Remove max skew triangular and replace it with memory_save_mode, which can be None to use the default triangular preconditioners, 'one_diag' to use one diagonal per layer, or 'all_diag' to use all diagonal preconditioners (fastest and lowest memory, but slower learning)
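
The three memory_save_mode values can be read as a rule for which dimensions of a layer get a diagonal preconditioner instead of a triangular one. The sketch below encodes that reading. The helper `diagonal_dims` is hypothetical, and picking the largest dimension for 'one_diag' is an assumption; the note only says "one diagonal per layer".

```python
def diagonal_dims(shape, memory_save_mode=None):
    """Return one flag per dim: True for diagonal, False for triangular.

    Hypothetical illustration of the three modes, not the library's code.
    """
    if memory_save_mode is None:
        return [False] * len(shape)  # all triangular (default)
    if memory_save_mode == "one_diag":
        flags = [False] * len(shape)
        if shape:
            # Assumption: the largest dim is the one made diagonal.
            largest = max(range(len(shape)), key=lambda i: shape[i])
            flags[largest] = True
        return flags
    if memory_save_mode == "all_diag":
        return [True] * len(shape)  # all diagonal
    raise ValueError(f"unknown memory_save_mode: {memory_save_mode!r}")


for mode in (None, "one_diag", "all_diag"):
    print(mode, diagonal_dims((256, 1024), mode))
```

A triangular preconditioner for a dimension of size n stores on the order of n² values while a diagonal one stores n, which is why the diagonal modes save memory.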

kron-torch 0.2.1

07 Oct 17:16

What's Changed
