Tags: gilfree/pytorch
Update on "Add tests for SyncBatchNorm with CUDA graph capturing" [ghstack-poisoned]
Merge pull request pytorch#19 from kulinseth/denis/cpu_to_mps_view_copy Fix view copies for cpu->mps case; add testcase
Remove mentions of deleted TH and friends (pytorch#78683) Summary: Pull Request resolved: pytorch#78683 Test Plan: No-op, rely on CI. Reviewed By: dagitses Differential Revision: D36829491 fbshipit-source-id: a57c07ad239f55e0096424e1517a8b5b38798fd5
Update on "Avoid CPU Sync in SyncBatchNorm When Capturing CUDA Graphs" We recently updated `SyncBatchNorm` to support empty input batches. The new code removes stats from ranks with empty inputs. However, this change breaks CUDA graph capture as it forces CPU sync. This commit uses `is_current_stream_capturing()` to guard the new code path, and only runs the new code when not capturing CUDA graphs. To support empty inputs with CUDA graph capturing, we might need to update CUDA kernels for `batch_norm_backward_elemt` and `batch_norm_gather_stats_with_counts`. See pytorch#78656. Fixes pytorch#78549 Differential Revision: [D36826558](https://our.internmc.facebook.com/intern/diff/D36826558) [ghstack-poisoned]
[1] move simple pytorch buck targets to the shared build file (part 1) (pytorch#78345) Summary: Pull Request resolved: pytorch#78345 This diff moved a few BUCK targets (mostly headers) to the shared build file, which will be used for both internal and OSS BUCK builds. Test Plan: sandcastle, OSS BUCK CI Differential Revision: D36694963 fbshipit-source-id: 99d53dfbfea76d6f301964aa2cc2913a625a6536
Merge remote-tracking branch 'upstream/master' into allow-cuda-extension-rdc
Add onlyNativeDeviceTypes to roll's OpInfo because XLA is failing
Fuse matmul in row-wise sharded linear to have a single matmul. Performing a single large matmul is more efficient than having to perform multiple matmuls in a loop. Similar improvement to pytorch#78449 Differential Revision: [D36828505](https://our.internmc.facebook.com/intern/diff/D36828505/) [ghstack-poisoned]