Update TH, THC, THNN, THCUNN by soumith · Pull Request #1097 · pytorch/pytorch · GitHub


Merged

soumith merged 9 commits into master from THSUpdate on Mar 24, 2017

Conversation

soumith
Member
@soumith soumith commented Mar 24, 2017

contbuilds

ngimel and others added 9 commits March 23, 2017 17:22
Currently, in-place and out-of-place updateGradOutput produce different results when input=max_val or input=min_val: the in-place version does not backprop the gradient at those boundary values, while the out-of-place version does.
in-place and out-of-place updateGradOutput results differ where input=min_val or input=max_val
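The discrepancy described above can be sketched in NumPy (a hedged illustration assuming a HardTanh-style clamp, with the min_val/max_val names taken from the commit message; the real kernels are THNN C code, not this):

```python
import numpy as np

min_val, max_val = -1.0, 1.0
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
grad_out = np.ones_like(x)

# Out-of-place backward keeps the original input around, so the gradient
# passes through at the boundaries input == min_val and input == max_val:
grad_oop = grad_out * ((x >= min_val) & (x <= max_val))

# In-place forward overwrites the input with clamp(input), so the backward
# pass only sees values strictly inside (min_val, max_val) and zeroes the
# gradient at the boundaries:
y = np.clip(x, min_val, max_val)
grad_inplace = grad_out * ((y > min_val) & (y < max_val))

print(grad_oop.tolist())      # boundary entries keep their gradient
print(grad_inplace.tolist())  # boundary entries are zeroed
```

At x = ±1.0 the two rules disagree, which is exactly the inconsistency this commit addresses.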
@soumith soumith merged commit cce0307 into master Mar 24, 2017
@soumith soumith deleted the THSUpdate branch March 24, 2017 23:49
jjsjann123 pushed a commit to jjsjann123/pytorch that referenced this pull request Nov 5, 2021
rraminen pushed a commit to rraminen/pytorch that referenced this pull request Sep 29, 2022
hubertlu-tw pushed a commit to hubertlu-tw/pytorch that referenced this pull request Nov 1, 2022
Take-over of pytorch#1097

* Add fast CUDA focal loss implementation.

* Enable fast math for CUDA focal loss.

* Correct typo.

* replace deprecated macros

* TORCH_CUDA_CHECK -> AT_CUDA_CHECK

The former is defined only in torch/csrc/profiler/cuda.cpp, so it is usually not available.
The latter, however, is defined in ATen/cuda/Exceptions.h as an alias of C10_CUDA_CHECK.

* add test

* clean up

* guard for torchvision

Co-authored-by: Wil Kong <alpha0422@gmail.com>
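For context on the focal loss the commits above refer to, a reference (unfused) formulation is sketched below in NumPy. This is the standard binary focal loss formula, not the fused CUDA kernel this PR adds; the function name and defaults are illustrative assumptions:

```python
import numpy as np

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Reference binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = 1.0 / (1.0 + np.exp(-logits))             # sigmoid probability
    p_t = np.where(targets == 1, p, 1.0 - p)      # probability of the true class
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)
```

With gamma=0 and alpha=0.5 this reduces to half the binary cross-entropy; the CUDA implementation fuses these elementwise steps into one kernel for speed.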
akashveramd pushed a commit to akashveramd/pytorch that referenced this pull request Apr 9, 2025
* Squashed 'src/composable_kernel/' content from commit f6edda6

git-subtree-dir: src/composable_kernel
git-subtree-split: f6edda6

* add solver ConvIgemmFwdV6r1DlopsNchwKcyxNkhw; rename static ck source files

* Squashed 'src/composable_kernel/' changes from f6edda6..5781adf

5781adf Update develop (pytorch#5) (pytorch#6)
97e6d51 Merge pull request pytorch#4 from ROCmSoftwarePlatform/separate_online_compile
7b1ec41 refactor
49c33aa refactor
54b3e73 rename

git-subtree-dir: src/composable_kernel
git-subtree-split: 5781adf

* fix

* refactor

* remove online compilation from CK

* refactor

* fix

* add ctest

* add c-style pointer cast

* vector/scalar pointer cast use c-style pointer cast instead of reinterpret_cast

* fix clang warning suppression

* tidy

* suppress cppcheck

* fix enum issue

* revert changes to hip build

* fix kernel filename

* update CK build script

* rename

* rename

* make inner product compatible on gfx900

* Update src/include/miopen/solver/ck_utility_common.hpp

Co-authored-by: JD <Jehandad.Khan@amd.com>

* compiler parameter use stream

* use int instead of index_t in kernel wrapper

* DynamicBuffer, StaticBuffer, amd_buffer_load support customized value for invalid element

* refactor

* refactor

* change cmakelist

* change ck common utility

* fix

Co-authored-by: JD <Jehandad.Khan@amd.com>