fixed model output function when computing gradients in float16 by AlaaKhaddaj · Pull Request #36 · MadryLab/trak · GitHub

fixed model output function when computing gradients in float16 #36


Merged — 3 commits merged into 0.2.0 on May 31, 2023

Conversation

AlaaKhaddaj

When computing the margins for the image_classification task, the default dtype of ch.tensor(-ch.inf) is float32. This leads to a dtype mismatch when the model gradients and outputs are computed in float16.
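A minimal sketch of the mismatch and the fix (the variable names here are illustrative, not TRAK's actual code; TRAK imports torch as ch). Assigning a float32 `-inf` tensor into float16 logits fails, while creating the value with the logits' own dtype and device does not:

```python
import torch as ch  # TRAK aliases torch as ch

# float16 model outputs, as produced when gradients are computed in half precision
logits = ch.randn(4, 10, dtype=ch.float16)
labels = ch.randint(0, 10, (4,))

# ch.tensor(-ch.inf) defaults to float32; masking the correct-class logit
# with it would mix dtypes. Creating -inf with the logits' dtype (and
# device) keeps everything in float16:
neg_inf = ch.tensor(-ch.inf, dtype=logits.dtype, device=logits.device)

masked = logits.clone()
masked[ch.arange(logits.shape[0]), labels] = neg_inf

assert masked.dtype == ch.float16
assert ch.isinf(masked[ch.arange(logits.shape[0]), labels]).all()
```

The same pattern generalizes: any constant tensor that interacts with model outputs should inherit their dtype and device rather than rely on torch defaults.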

@kristian-georgiev
Member

Great catch, thanks!

@kristian-georgiev kristian-georgiev merged commit e18838a into 0.2.0 May 31, 2023
@kristian-georgiev kristian-georgiev deleted the 0.2.0_float16 branch May 31, 2023 17:12
kristian-georgiev added a commit that referenced this pull request Jun 1, 2023
* clean up old nb

* trak scores quickstart fig

* clean up quickstart

* minor docs updates

* no-op projector

* bump version

* test for scoring in shards

* test for featurizing in shards

* tie experiment name to scoring targets; simplify saver; add logging

* support dataset sharding during featurizing and scoring

* save scores as mmap

* migrate to torch.func

* bump torch dep requirement to 2.0.0 bc of torch.func

* project and store in float16 by default

* test autocast vs .half() on the model with functional_call

* test_install function

* minor edits in tests and install docs

* pass in an instance of a class for tasks, rather than init inside of gradientcomputer

* bug fix

* normalization factor for numerical stability

* fixed model output function when computing gradients in float16 (#36)

* fixed model output function when computing gradients in float16


* also fix for text clsf MOF

* instantiate on device directly

---------

Co-authored-by: alaakh <alaakh@mit.edu>
Co-authored-by: Kristian Georgiev <krisgrg@mit.edu>

* _is_featurized array

* handle pre-emption for featurizing

* vectorize without stacking to save memory

* add assertion to load ckpt

* python >=3.8 for pytorch 2.0

* make it easy to use GPU with smaller cuda mem

* pytest cuda markers

* fix CLIP modelout function

* bring back iter gradient computer

---------

Co-authored-by: alaakh <alaakh@mit.edu>