Pulse · pytorch/pytorch · GitHub

June 14, 2025 – June 21, 2025

Overview

205 Active pull requests

236 Active issues

1 Pull request merged by 1 person

Bump requests from 2.32.2 to 2.32.4 in /.github
#155491 merged Jun 16, 2025

204 Pull requests opened by 110 people

feat(cmake): add NCCL version selection based on CUDA version
#156014 opened Jun 15, 2025
[profiler] add more CUDA API for kernel launcher
#156016 opened Jun 15, 2025
[BE] add a minimal linter to check `pyproject.toml` consistency
#156017 opened Jun 15, 2025
[build] modernize build-frontend: `python setup.py develop/install` -> `[uv ]pip install[ -e] .`
#156027 opened Jun 15, 2025
bmm, topk, cholesky, linalg.norm, max with out variants set causing r…
#156030 opened Jun 15, 2025
[BE][Easy] set end-of-line for `.bat` file to CRLF in `.editorconfig`
#156032 opened Jun 15, 2025
[dynamo] Weblink generation when unimplemented_v2() is called
#156033 opened Jun 15, 2025
Fix atleast_{1,2,3}d() with no arguments description
#156042 opened Jun 16, 2025
[BE][Easy][setup] wrap over long error messages and redirect them to `stderr` in `setup.py`
#156043 opened Jun 16, 2025
[BE][Easy][setup] use `super().method(...)` in command subclasses in `setup.py`
#156044 opened Jun 16, 2025
[build] remove upper version pin for `setuptools<80.0`
#156049 opened Jun 16, 2025
Update URL for RPATH documentation
#156060 opened Jun 16, 2025
Support transpose and pack for bit8
#156065 opened Jun 16, 2025
Register hpu device to fake backend
#156076 opened Jun 16, 2025
revamp dtype documentation for 2025
#156087 opened Jun 16, 2025
Convert to markdown: jit.rst
#156094 opened Jun 16, 2025
Add error to intercept crash in issue #154882 on maxpool2d with indices
#156101 opened Jun 16, 2025
[opinfo] Exclude aten_name if its not actually a name
#156104 opened Jun 16, 2025
[opinfo] add overloads to opinfo
#156109 opened Jun 16, 2025
local load/save
#156110 opened Jun 16, 2025
Add debug messages for deps issues during fx splits
#156111 opened Jun 16, 2025
Bump transfomers version
#156118 opened Jun 16, 2025
[ROCm][Inductor][CK] update API for gemm-multiD change
#156122 opened Jun 16, 2025
Display a warning when overwriting `CMAKE_CUDA_ARCHITECTURES`
#156123 opened Jun 16, 2025
[test] re-run CI with complex + Python dispatch key changes
#156131 opened Jun 16, 2025
NOT-FOR-LAND: enable autochunker by default
#156132 opened Jun 16, 2025
[WIP][ci][cutlass backend] Add ci for cutlass backend tests
#156136 opened Jun 16, 2025
Templatize model_container
#156137 opened Jun 16, 2025
Improve documentation for torch.lobpcg
#156139 opened Jun 16, 2025
[cuDNN][64-bit indexing] update conv depthwise 64bit indexing dispatch condition to match native kernel
#156140 opened Jun 17, 2025
[executorch hash update] update the pinned executorch hash
#156141 opened Jun 17, 2025
[dcp_poc] Introduce a new simple rank local checkpointer
#156142 opened Jun 17, 2025
[Docs] Fix indentations in cond.md
#156147 opened Jun 17, 2025
[list] Raise exception in invalid list method call
#156148 opened Jun 17, 2025
Convert bottleneck.rst to markdown
#156149 opened Jun 17, 2025
Optimize dim description in torch.max
#156153 opened Jun 17, 2025
Attempt to microoptimize TorchDynamoContext enter/exit
#156155 opened Jun 17, 2025
Bump protobuf from 5.29.4 to 5.29.5 in /.ci/docker
#156157 opened Jun 17, 2025
Optimize scatter/gather kernel for ARM.
#156161 opened Jun 17, 2025
Deprecate CUDAAllocatorConfig, use AllocatorConfig instead
#156165 opened Jun 17, 2025
draft: [cp] context_parallel + flex_attention using monkey_patch
#156170 opened Jun 17, 2025
Implementation of a ScannedModule
#156172 opened Jun 17, 2025
[WIP] Add a new API of allocator setting for accelerator
#156175 opened Jun 17, 2025
[NVIDIA] Refactor Family Blackwell Support codegen
#156176 opened Jun 17, 2025
[WIP] Remove legacy aarch64_linux builder in favor of Manylinux
#156178 opened Jun 17, 2025
[TEST] Add Windows cuda 12.9.1 build
#156179 opened Jun 17, 2025
[Native][CPU][TopK] Improve perf by reducing swap operations
#156183 opened Jun 17, 2025
[Inductor] Subgraph as a choice symbolic expression as input
#156185 opened Jun 17, 2025
[TEST] Triton 3.4.0 pin update
#156186 opened Jun 17, 2025
[inductor] Quiesce Triton compile worker pool after each dynamo compile
#156187 opened Jun 17, 2025
Engine reuse calling thread when only single device detected
#156188 opened Jun 17, 2025
Add auto support
#156189 opened Jun 17, 2025
[ROCm] [CK] Composable Kernel integration for ROCm
#156192 opened Jun 17, 2025
[dynamo] allow symints in list.__setitem__
#156197 opened Jun 17, 2025
[Codemod][Folly target clean up] 57
#156198 opened Jun 17, 2025
Fix torch.clamp CPU overflow with float16 tensors
#156199 opened Jun 17, 2025
[CUTLASS] [CUDA] SM100 GroupMM
#156203 opened Jun 17, 2025
[DCP] OSS Zero Overhead Checkpointing Implementation
#156207 opened Jun 17, 2025
[CI] Add prebuild command option, set prebuild command option for CI to build flash attention
#156236 opened Jun 17, 2025
[dynamo] updated version of detecting any differences between PRs unimplemented_v2() callsites and graph_break_registry json file
#156237 opened Jun 17, 2025
Fix constant folding pass for mutable buffer
#156239 opened Jun 17, 2025
Fix `aten::index_put` args Dtensor type mismatch
#156240 opened Jun 17, 2025
[list] Implement `list.remove`
#156242 opened Jun 17, 2025
Extract CPU log_softmax kernels to header
#156243 opened Jun 17, 2025
[EZ/Profiler] Change 'b' to 'B' in FunctionEvent Frontend
#156250 opened Jun 17, 2025
[dynamo] fix some cross-graph-break refleaks in eval_frame
#156252 opened Jun 17, 2025
[Test] Kineto Submodule Update
#156253 opened Jun 17, 2025
Add size_hints_or_throw
#156255 opened Jun 17, 2025
Consolidate stack trace in Tracer
#156257 opened Jun 18, 2025
[invoke_subgraph] make same subgraph share get_attr target
#156260 opened Jun 18, 2025
Convert quantization.rst to markdown
#156266 opened Jun 18, 2025
Add fallback-aware device checking for MPS operations
#156267 opened Jun 18, 2025
[Inductor][CPP backend] Optimize parallel depth algorithm [Don't merge]
#156268 opened Jun 18, 2025
Implement list.__add__ and list.__iadd__
#156270 opened Jun 18, 2025
[list] Add list.__mul__ and list.__imul__
#156271 opened Jun 18, 2025
[Intel GPU] Enable training for SDPA XPU [WIP]
#156272 opened Jun 18, 2025
[inductor] split out triton templates
#156276 opened Jun 18, 2025
[inductor][tma template] subclass workspace arg for choice
#156277 opened Jun 18, 2025
[inductor] add KernelTemplateParams
#156278 opened Jun 18, 2025
[inductor] introduce kernel_inputs
#156279 opened Jun 18, 2025
[inductor][1/2] break out TritonTemplate, TritonTemplateKernel, TritonTemplateCaller out of select_algorithm.py
#156280 opened Jun 18, 2025
[inductor][2/2] break out TritonTemplate, TritonTemplateKernel, TritonTemplateCaller out of select_algorithm.py
#156281 opened Jun 18, 2025
[inductor] heuristics based on kernel templates
#156282 opened Jun 18, 2025
Introduce sync_cross_rank_decision
#156287 opened Jun 18, 2025
[inductor] KernelTemplates report their own KernelParams
#156292 opened Jun 18, 2025
Add cascade sum support for Inductor CPP backend
#156296 opened Jun 18, 2025
FlexAttn config refactor + ROCm optimisations
#156307 opened Jun 18, 2025
[BE][1/16] fix typos in torch/
#156311 opened Jun 18, 2025
[BE][2/16] fix typos in torch/ (torch/_*/)
#156312 opened Jun 18, 2025
[BE][3/16] fix typos in torch/ (torch/_inductor/)
#156313 opened Jun 18, 2025
[BE][4/16] fix typos in torch/ (torch/_dynamo/)
#156314 opened Jun 18, 2025
[BE][5/16] fix typos in torch/ (torch/distributed/)
#156315 opened Jun 18, 2025
[BE][6/16] fix typos in torch/
#156316 opened Jun 18, 2025
[BE][7/16] fix typos in torch/ (torch/csrc/)
#156317 opened Jun 18, 2025
[BE][8/16] fix typos in torch/ (torch/csrc/jit/)
#156318 opened Jun 18, 2025
[BE][9/16] fix typos in torch/ (torch/csrc/)
#156319 opened Jun 18, 2025
[BE][10/16] fix typos in torch/ (torch/csrc/jit/)
#156320 opened Jun 18, 2025
[BE][11/16] fix typos in torch/ (torch/csrc/distributed/)
#156321 opened Jun 18, 2025
Address richard's comments on libtorch_stable_abi note
#156324 opened Jun 18, 2025
[TEST] DO not commit
#156326 opened Jun 18, 2025
Migrate c10/macros/cmake_macros.h.in
#156329 opened Jun 18, 2025
Fix native static dispatch kernels
#156331 opened Jun 18, 2025
Validate custom op support for compile_kernel
#156332 opened Jun 18, 2025
[testing] test/run_test.py: Only shutdown pool if it was created
#156333 opened Jun 18, 2025
Storage: add_delete_hook for deregistration
#156338 opened Jun 18, 2025
[list] Add list.__delitem__
#156339 opened Jun 18, 2025
Add private API to modify the tags for a custom operator
#156343 opened Jun 18, 2025
[inductor] set config.min_num_split by default
#156345 opened Jun 18, 2025
[invoke_subgraph] make collect_meta_analysis fake prop cachable
#156347 opened Jun 18, 2025
Add User defined subclass handling to funcitonalize impl
#156349 opened Jun 18, 2025
Build FBGEMM GenAI as part of PyTorch
#156355 opened Jun 18, 2025
[wip]
#156356 opened Jun 18, 2025
[BE] comments + try to get rid of secondary `make_autotune_fn`
#156358 opened Jun 18, 2025
[Codemod][Folly target clean up] 28
#156365 opened Jun 18, 2025
[Codemod][Folly target clean up] 22
#156366 opened Jun 18, 2025
[iter] Update some of the tests to not call pickle
#156369 opened Jun 18, 2025
[iter] exhaust `ListIterator` when `unpack_var_sequence` is called
#156370 opened Jun 18, 2025
[iter] Add support for sequence protocol in `iter(..)`
#156371 opened Jun 18, 2025
Add macos26 beta test runner
#156372 opened Jun 18, 2025
[TSAN][live speech translation] Fix A data race in caffe2
#156378 opened Jun 18, 2025
cub and compile_kernel composition
#156380 opened Jun 19, 2025
Prevent cudaStreamSync when indexing GPU tensors with boolean CPU mask
#156384 opened Jun 19, 2025
[InductorBench] Fix accuracy validation logic for MPS
#156385 opened Jun 19, 2025
[Inductor][CPP] Enable WOQ int4 concat linear
#156387 opened Jun 19, 2025
Improve All to All Perf for inter-node use-case (#156376)
#156389 opened Jun 19, 2025
Bump urllib3 from 2.2.2 to 2.5.0 in /tools/build/bazel
#156390 opened Jun 19, 2025
Use CUDA::cusolver targets
#156391 opened Jun 19, 2025
Use CMake wholearchive group
#156393 opened Jun 19, 2025
[Inductor][CPP] Enable a config to use a small dequant buffer for woq int4
#156395 opened Jun 19, 2025
Use CUDA::cupti target
#156396 opened Jun 19, 2025
Added index 0 for ROCR_VISIBLE_DEVICES
#156398 opened Jun 19, 2025
[OpenReg][1/N] Migrate cpp_extensions_open_device_registration to OpenReg
#156400 opened Jun 19, 2025
[OpenReg][2/N] Migrate cpp_extensions_open_device_registration to OpenReg
#156401 opened Jun 19, 2025
[ez] fix typo in comment
#156402 opened Jun 19, 2025
[Codemod][Folly target clean up] 28 [A]
#156403 opened Jun 19, 2025
[DONT MERGE][TESTING][2/2] test new xpu runner
#156410 opened Jun 19, 2025
Fix storage_offset preservation in clone_preserve_strides
#156415 opened Jun 19, 2025
[iter] support `iter(callable, sentinel)`
#156416 opened Jun 19, 2025
Change t.is_cuda to t.device.type == 'cuda' in torch/utils/viz
#156418 opened Jun 19, 2025
[cc][multi-kernel] attempt 1
#156421 opened Jun 19, 2025
[dm][multi-kernel] attempt 1
#156422 opened Jun 19, 2025
[dm][mk] attempt 2
#156423 opened Jun 19, 2025
[cc][multi-kernel] attempt 2
#156427 opened Jun 19, 2025
[br][mk] attempt 1
#156428 opened Jun 19, 2025
[precompile] Detect source code changes for save/load.
#156432 opened Jun 19, 2025
[dynamo] show frame information when recompilation is triggered on fail_on_recompile
#156433 opened Jun 19, 2025
use cmake target torch instead of ${TORCH_LIBRARIES} in cpp installation docs
#156435 opened Jun 19, 2025
[cc][multi-kernel] attempt 3
#156439 opened Jun 19, 2025
[invoke_subgraph] Add config flag to control support of input mutation
#156450 opened Jun 19, 2025
[cc][multi-kernel] attempt 4
#156452 opened Jun 19, 2025
[WIP]Fallback to CPU for XPU FP64
#156456 opened Jun 19, 2025
Fixes issue #156414: Fixes bug in implementation of _combine_histograms.
#156457 opened Jun 20, 2025
wip Updates to scaled_mm code
#156458 opened Jun 20, 2025
[iter] Wrap iter(..) call in a ObjectIteratorVariable
#156460 opened Jun 20, 2025
[dynamo] fixes to lru_cache message and adding user stack trace in debug mode
#156463 opened Jun 20, 2025
[inductor] select_algorithm: add preprocessing fns
#156464 opened Jun 20, 2025
[torchbench] update environment setup script
#156465 opened Jun 20, 2025
WIP: Add `max_pool3d` for MPS
#156467 opened Jun 20, 2025
Debug PR, no need to review
#156468 opened Jun 20, 2025
Docs/update contributing rebase tip
#156469 opened Jun 20, 2025
kernel arg munging attempt
#156470 opened Jun 20, 2025
[wip][inductor] add kernel choice
#156477 opened Jun 20, 2025
[Codemod][Folly target clean up] 22 [B]
#156478 opened Jun 20, 2025
[ROCm][Windows] Fixing undefined symbol linker error after exposing MIOpen symbols
#156479 opened Jun 20, 2025
[WIP] Add device_id to XPU device properties
#156481 opened Jun 20, 2025
Fix torch.onnx.export parameter for onnx_shape_inference (#156480)
#156483 opened Jun 20, 2025
[Profiler] Fix profile_all_threads in debug build
#156484 opened Jun 20, 2025
Add regression test for UnicodeDecodeError in torch.compile with extreme values
#156485 opened Jun 20, 2025
[ROCm][Windows] Skip using rocm-core on Windows case
#156486 opened Jun 20, 2025
[DO NOT MERGE] Update trunk.yml to change the runner that the job runs-on
#156491 opened Jun 20, 2025
[INIT DRAFT] setting up the build for torch/standalone
#156492 opened Jun 20, 2025
Fix type annotations for dim parameter in torch.amin and torch.amax
#156493 opened Jun 20, 2025
[MPS] Optimize cumsum/cumprod metal kernels
#156494 opened Jun 20, 2025
cublaslt/hipblaslt persistent workspace
#156495 opened Jun 20, 2025
add test_batchnorn_2D and 3D tests
#156498 opened Jun 20, 2025
[ROCm] Bump AOTriton to 0.10b
#156499 opened Jun 20, 2025
[Inductor] Fix epilogue fusion decision with 1 Triton caller as choice
#156500 opened Jun 20, 2025
[MTIA Aten Backend] Migrate maximum.out / minimum.out / cos.out / erf.out / exp.out
#156502 opened Jun 20, 2025
Organize BUCK for torch/standalone
#156503 opened Jun 20, 2025
added stubs for jit tree views
#156504 opened Jun 20, 2025
Fix dynamo benchmarks no dtype.__name__
#156505 opened Jun 20, 2025
[nativert] Move PrimKernelRegistry to PyTorch core
#156506 opened Jun 20, 2025
[nativert] Move HigherOrderKernel
#156507 opened Jun 20, 2025
[nativert] move layout planner algorithms to libtorch
#156508 opened Jun 20, 2025
[docs][typing] Document and type support for dim=None in torch.amin and torch.amax
#156510 opened Jun 20, 2025
python definitely_contiguous-> is_contiguous_or_false
#156515 opened Jun 20, 2025
Unify dynamic shapes APIs naming 2 (expect_true and check) attempt2
#156518 opened Jun 20, 2025
[aoti] Check longlong upperbound for codegening input size check
#156522 opened Jun 20, 2025
remove gso from set_storage_meta__symint
#156525 opened Jun 20, 2025
[Inductor][CPP] Fix perf regression of functorch_maml_omniglot
#156526 opened Jun 21, 2025
[dynamo] fix segfault due to dangling CacheEntry backend pointer
#156527 opened Jun 21, 2025
[dynamo] Guard eagerly on list objects to avoid guard on getitem index
#156531 opened Jun 21, 2025
Add RoPE (Rotary Positional Embedding) to PyTorch core
#156532 opened Jun 21, 2025
[inductor] Quiesce Triton compile worker pool by default in OSS
#156534 opened Jun 21, 2025
remove allow-untyped-defs from c10d_rendezvous_backend.py
#156536 opened Jun 21, 2025
remove allow-untyped-defs from torch/ao/nn/sparse/quantized/linear.py
#156537 opened Jun 21, 2025
remove allow-untyped-defs from torch/fx/passes/utils/fuser_utils.py
#156538 opened Jun 21, 2025
[MTIA Aten Backend] Migrate _log_softmax.out / _log_softmax_backward_data.out
#156539 opened Jun 21, 2025
avoid to declare an unknown bound array without any element
#156543 opened Jun 21, 2025
Enable target-determination (TD) for ROCm CI
#156545 opened Jun 21, 2025
[ddp] improve c++ reducer bucketing readability
#156550 opened Jun 21, 2025
[CUDAGraph] add config `cudagraph_capture_sizes`
#156551 opened Jun 21, 2025
Add fx_graph_runnable tests boilerplate
#156552 opened Jun 21, 2025
[MTIA Aten Backend] Migrate isnan
#156554 opened Jun 22, 2025

145 Issues closed by 41 people

Is it possible to serialize a torch.cuda.CUDAGraph into disk or CPU memory
#125820 closed Jun 22, 2025
`scaled_dot_product_attention` backwards: illegal memory access with large inputs
#150054 closed Jun 21, 2025
Using `opset_version = 22` in `torch.onnx.export` with `dynamo=True` includes dropout nodes in the model
#156542 closed Jun 21, 2025
`torch.distributed.pipelining.pipeline` error when initializing on meta device
#156541 closed Jun 21, 2025
Add runtime profiler info for AOTDispatcher prologue
#155721 closed Jun 21, 2025
UNSTABLE inductor-rocm-mi300 / rocm-py3.10-inductor-mi300 / test (inductor)
#154884 closed Jun 21, 2025
UNSTABLE rocm-mi300 / linux-jammy-rocm-py3.10-mi300 / test (default)
#156360 closed Jun 21, 2025
`torch.compile(fullgraph=True, options=...)` fails with `NoValidChoicesError` on simple `Conv2d` model, but gives no actionable trace
#156304 closed Jun 21, 2025
Support input mutations + aliasing with scan during training
#156337 closed Jun 20, 2025
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#151228 closed Jun 20, 2025
Add @markDynamoStrictTest to all TestCase
#115671 closed Jun 20, 2025
Loss with LBFGS not going down
#156501 closed Jun 20, 2025
Can we have Dim.AUTO/Dim.DYNAMIC with an optional min & max?
#147483 closed Jun 20, 2025
DTensor does not compose with Parameters Groups
#156453 closed Jun 20, 2025
[ONNX] Support for grouped query attention
#151762 closed Jun 20, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int16 (__main__.TestForeachCUDA)
#149627 closed Jun 20, 2025
[XPU] Support toggling profiler on/off for XPU.
#154898 closed Jun 20, 2025
Sourceforge outage causing multiple CI failures
#108773 closed Jun 20, 2025
pytorchbot erroneously thinks PR has already been merged as a different commit
#154427 closed Jun 20, 2025
[ONNX] Inputs generated by onnx.export() with dynamo=False are not consistent with dynamo=True
#136179 closed Jun 20, 2025
[Torch TO ONNX BUG] The right shift operation in torch is mapped as a division operation when converted to ONNX.
#139455 closed Jun 20, 2025
[ONNX] 2.0 regression: dynamic shapes lost for an operator
#139463 closed Jun 20, 2025
[ONNX] Document the registration API
#139499 closed Jun 20, 2025
[ONNX] Run report_exportability when report=True
#139904 closed Jun 20, 2025
Replace reduce(operator.mul) with math.prod for computing product of dimensions
#140888 closed Jun 20, 2025
Exporting the operator 'aten::_transformer_encoder_layer_fwd' to ONNX opset version 17 is not supported
#144242 closed Jun 20, 2025
Custom symbolic functions for ONNX export with None args causes SEGFAULT
#145261 closed Jun 20, 2025
ONNX export failing when using `symbolic` functions and scripting
#146035 closed Jun 20, 2025
onnx.export: When a quantized model is exported using onnx.export, the convolution result has discrepency with the original quantized model.
#146541 closed Jun 20, 2025
Export HuggingFace mamba to ONNX
#146835 closed Jun 20, 2025
UnsupportedOperatorError: Exporting the operator 'aten::_make_per_tensor_quantized_tensor ' to ONNX opset version 11
#147602 closed Jun 20, 2025
[ONNX] BitwiseOr was generated for bool inputs (invalid)
#147854 closed Jun 20, 2025
[ONNX] dynamic dims are not exported with the specified names
#148629 closed Jun 20, 2025
[ONNX] How to export Llama4
#150891 closed Jun 20, 2025
Exporting the operator 'aten::lift_fresh' to ONNX - not supported
#151932 closed Jun 20, 2025
Exporting the operator 'aten::fft_fft2' to ONNX opset version 19 is not supported.
#153823 closed Jun 20, 2025
[ONNX] Verify the translation of SDPA to Attention-23
#156105 closed Jun 20, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_float64 (__main__.TestForeachCUDA)
#151214 closed Jun 20, 2025
`torch.compile(..., mode="max-autotune", dynamic=True)` causes small but nonzero output mismatch with `nn.Conv2d` compared to eager output
#156301 closed Jun 20, 2025
DISABLED test_triton_template_generated_code_caching (__main__.TestMaxAutotune)
#154108 closed Jun 20, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_float32 (__main__.TestForeachCUDA)
#151136 closed Jun 20, 2025
Inductor cpp_wrapper has performance regressions
#156037 closed Jun 20, 2025
DISABLED test_export_opnames_interface (__main__.TestMisc)
#154986 closed Jun 20, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_float16 (__main__.TestForeachCUDA)
#151114 closed Jun 19, 2025
_flash_attention_forward accuracy drop from CUDA to ROCM implementation.
#154582 closed Jun 19, 2025
xpu: AOT compilation does not happen with sycl extension (JIT fallback happens)
#156249 closed Jun 19, 2025
Cannot install pytorch through official pip guidance
#156413 closed Jun 19, 2025
Tensors with no explicit references are possible not freed timely with torch.compile
#155778 closed Jun 19, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_float32 (__main__.TestForeachCUDA)
#149409 closed Jun 19, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_float16 (__main__.TestForeachCUDA)
#149522 closed Jun 19, 2025
Support C shim for customized OP
#150988 closed Jun 19, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_complex64 (__main__.TestForeachCUDA)
#151099 closed Jun 19, 2025
DISABLED test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_True_is_dynamic_True (__main__.TestPatternMatcher)
#154565 closed Jun 19, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_complex128 (__main__.TestForeachCUDA)
#151093 closed Jun 19, 2025
FSDP + save optimizer dtype AssertionError
#156166 closed Jun 19, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#151054 closed Jun 19, 2025
`max_entries` parameter of `torch.cuda.memory._record_memory_history()`
#129674 closed Jun 19, 2025
Indexing beyond end of array on ROCm build
#155045 closed Jun 18, 2025
[ued][kokoro] torch.compile fails in kokoro (both fullgraph=True and False)
#149570 closed Jun 18, 2025
Actual torch `ExportGraphSignature` does not match the example in the docs
#156184 closed Jun 18, 2025
`torch.export` fails with `KeyError` when `BatchNorm.running_mean` is read and modified, even when shape/value is unchanged
#156167 closed Jun 18, 2025
Certain operations cause implicity sync-points
#12461 closed Jun 18, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_complex64 (__main__.TestForeachCUDA)
#149199 closed Jun 18, 2025
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_float64 (__main__.TestForeachCUDA)
#151019 closed Jun 18, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_float64 (__main__.TestForeachCUDA)
#149523 closed Jun 18, 2025
DISABLED test_comprehensive_pca_lowrank_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#139828 closed Jun 18, 2025
DISABLED test_roi_align_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#103156 closed Jun 18, 2025
NCCL init hits CUDA failure 'invalid argument' on 12.2 driver
#150852 closed Jun 18, 2025
Schema version check fails in `torch.export.load`
#156354 closed Jun 18, 2025
Windows Runners are not available on PyTorch CI/CD
#156352 closed Jun 18, 2025
format_flamegraph failed to setup the script
#156309 closed Jun 18, 2025
Cannot install >=2.7.0 on ubuntu 18.04, conflict with prerequisite
#156215 closed Jun 18, 2025
[tracker] DTensor Operator Coverage
#156204 closed Jun 18, 2025
Flip is much slower than advanced indexing
#16424 closed Jun 18, 2025
Please implement the batching rule for torch.matrix_exp.
#115992 closed Jun 18, 2025
Function 'MmBackward0' returned nan values in its 0th output.
#156015 closed Jun 18, 2025
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_float32 (__main__.TestForeachCUDA)
#151003 closed Jun 18, 2025
Status of support for ROCm 6.4.1
#155292 closed Jun 18, 2025
Loading sparse tensors in a DataLoader raises CUDA initialization error since 2.5.0 if you have already initialized CUDA
#153143 closed Jun 18, 2025
DISABLED test_matmul_layer_norm_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#151835 closed Jun 18, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_float32 (__main__.TestForeachCUDA)
#150530 closed Jun 18, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_float16 (__main__.TestForeachCUDA)
#150510 closed Jun 18, 2025
[FR] Expose CUDAGraph handle to allow customized modification on the graph
#155106 closed Jun 18, 2025
Have compiled autograd config API support nested compilation
#152219 closed Jun 18, 2025
Convert to markdown: rpc.rst, signal.rst, size.rst, sparse.rst, special.rst
#155033 closed Jun 18, 2025
DISABLED test_serialize_by_key (__main__.PrecompileContextTests)
#156146 closed Jun 17, 2025
DISABLED test_randint_distribution_dynamic_shapes_xpu (__main__.DynamicShapesGPUTests)
#155692 closed Jun 17, 2025
DISABLED test_randint_distribution_dynamic_shapes_xpu (__main__.DynamicShapesCodegenGPUTests)
#155689 closed Jun 17, 2025
DISABLED test_basic (__main__.PrecompileContextTests)
#156063 closed Jun 17, 2025
The documentation lacks an explanation of the constraints between larger padding and padding mode in convolutional layers
#134840 closed Jun 17, 2025
`torch.ops.aten.index_put` returns different results on CUDA and CPU
#156173 closed Jun 17, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass0_use_new_runtime_True (__main__.ScheduleTest)
#154373 closed Jun 17, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass1_use_new_runtime_False (__main__.ScheduleTest)
#154391 closed Jun 17, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass1_use_new_runtime_True (__main__.ScheduleTest)
#154408 closed Jun 17, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass2_use_new_runtime_False (__main__.ScheduleTest)
#154443 closed Jun 17, 2025
torch.where() can produce nan values for unselected branch during backward
#156212 closed Jun 17, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_bool (__main__.TestForeachCUDA)
#150468 closed Jun 17, 2025
Convert to markdown: quantization-accuracy-debugging.rst, quantization-backend-configuration.rst, quantization-support.rst, quantization.rst, random.rst
#155032 closed Jun 17, 2025
Failure of iOS Build Test: Build (default, 1, 1, macos-14-xlarge, SIMULATOR, arm64)
#136284 closed Jun 17, 2025
[Testing] multigpu tests are still running against CUDA-11
#154119 closed Jun 17, 2025
ONNX Dynamo Export - Unsupported FX nodes: {'call_function': ['aten._upsample_bilinear2d_aa.default']}.
#128818 closed Jun 17, 2025
torch.compile fails to trace methods decorated with @lru_cache
#155841 closed Jun 17, 2025
[FDSP2] express zero-1 with fully_shard
#155952 closed Jun 17, 2025
MPS cumsum failure for 5D tensor or above
#154881 closed Jun 17, 2025
get different result between conv1x1 and linear
#156154 closed Jun 17, 2025
BatchNorm1d fails with batch size 1 if track_running_stats=False, claims it's not it eval mode even if .eval() is called
#156051 closed Jun 17, 2025
[dynamo] Add support for torch.cuda.FloatTensor()
#130722 closed Jun 17, 2025
Convert to markdown: linalg.rst, logging.rst, masked.rst, meta.rst, miscellaneous_environment_variables.rst
#155025 closed Jun 17, 2025
A mistake in PyTorch Docs for nn.RNN
#129446 closed Jun 17, 2025
`ELU()`'s `alpha` argument with `int`, `complex` or `bool` and `inplace` argument with `int`, `complex` and `float` work against the doc
#133563 closed Jun 17, 2025
When calling torch.histc the CPU and CUDA implementations produce different outputs.
#156019 closed Jun 17, 2025
When calling torch.cumprod on a float16 tensor, the CPU and CUDA implementations produce different outputs.
#156018 closed Jun 17, 2025
Extra onnx::Neg_2 input after torch.onnx.export
#148655 closed Jun 17, 2025
RuntimeError: CUDA driver error: operation not supported with test_stream_write_value32 and cuStreamWriteValue32
#154073 closed Jun 17, 2025
DISABLED test_reentrant_parent_error_on_cpu_cuda (__main__.TestAutogradDeviceTypeCUDA)
#86735 closed Jun 17, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int8 (__main__.TestForeachCUDA)
#150407 closed Jun 17, 2025
[XPU] Upgrade the XPU support packages version to 2025.1 in CI/CD
#151097 closed Jun 17, 2025
libtorch doesn't work with cuda 12.6 and 12.4
#132575 closed Jun 17, 2025
DISABLED test_weight_norm_bwd_dynamic_shapes_cpu (__main__.DynamicShapesCodegenCpuTests)
#153803 closed Jun 17, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int64 (__main__.TestForeachCUDA)
#150392 closed Jun 17, 2025
DISABLED test_pattern_matcher_multi_user_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#134433 closed Jun 17, 2025
Update ONNX Opset Version to Support Attention Operator
#153611 closed Jun 17, 2025
DISABLED test_weight_norm_bwd_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#141484 closed Jun 17, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_bool (__main__.TestForeachCUDA)
#150120 closed Jun 17, 2025
[ONNX] Implement scan
#151327 closed Jun 17, 2025
`TestCppExtensionOpenRgistration.test_base_device_registration` hangs during shutdown on MacOS
#155759 closed Jun 16, 2025
ROCm: no HIP device available if device is already initialized
#152941 closed Jun 16, 2025
Inductor CI failure due to Huggingface outage
#156113 closed Jun 16, 2025
[Compiled_autograd] running nn.LayerNorm failed for torch.compile with compiled_autograd when deepspeed Zero3
#140091 closed Jun 16, 2025
Convert to markdown: distributed.checkpoint.rst, distributed.elastic.rst, distributed.fsdp.fully_shard.rst, distributed.optim.rst, distributed.pipelining.rst
#155018 closed Jun 16, 2025
add x/0 gradient behaviour to documentation
#128796 closed Jun 16, 2025
Stop special-casing einops in Dynamo
#142486 closed Jun 16, 2025
None deterministic output of linear projection based on batch size and projection dimensions
#156084 closed Jun 16, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_complex128 (__main__.TestForeachCUDA)
#149323 closed Jun 16, 2025
DISABLED test_tmp_not_defined_issue2_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#135219 closed Jun 16, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass2_use_new_runtime_True (__main__.ScheduleTest)
#154481 closed Jun 16, 2025
DISABLED test_on_device_tma_store_old_api (__main__.MutationTests)
#155691 closed Jun 16, 2025
torch.cuda.set_device(0) behaves differently from torch.cuda.set_device(1) in terms of cuda context
#155668 closed Jun 16, 2025
DISABLED test_cache_hot_load_device_cuda_bfloat16_dynamic_False (__main__.AOTAutogradCacheTests)
#145334 closed Jun 16, 2025
IInconsistent Error Handling in `torch.fused_moving_avg_obs_fake_quant` Between CPU and GPU Implementations
#153310 closed Jun 16, 2025
DISABLED test_fake_registration (__main__.TestOpProfiles)
#151301 closed Jun 16, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int32 (__main__.TestForeachCUDA)
#150350 closed Jun 16, 2025
Label the Python version for older PyTorch releases
#156038 closed Jun 16, 2025
In the docs for torch.amax/amin the note about min/max gradient behavior is outdated
#155048 closed Jun 15, 2025
[feature request]: Update max onnx opset to 21 for compatability
#127167 closed Jun 15, 2025

91 Issues opened by 59 people

ConvertTritonGPUToLLVM pass fails on fused GroupNorm backward (SM 89) under torch.compile(…, backend='inductor')
#156549 opened Jun 21, 2025
Inconsistent Model Results and Failures on Windows with CUDA vs. CPU PyTorch Builds
#156547 opened Jun 21, 2025
`TorchScript` does not allow accessing methods of nested tensors
#156544 opened Jun 21, 2025
Issue when Using SparseMPS backend (especially the operation aten::_sparse_coo_tensor_with_dims_and_tensors) for Whisper model
#156540 opened Jun 21, 2025
FSDP2 - Tensor incompatibility
#156535 opened Jun 21, 2025
`<<` and `>>` operators seem silently broken for DTensor operand 1 and scalar operand 2
#156533 opened Jun 21, 2025
[ONNX] Update tests for attention
#156524 opened Jun 20, 2025
[Dtensor] handle dtensor ops that only need to operate on certain shard without all_gather first.
#156523 opened Jun 20, 2025
UNSTABLE inductor / linux-jammy-cpu-py3.9-gcc11-inductor / test (inductor_torchbench_cpu_smoketest_perf)
#156521 opened Jun 20, 2025
[user empathy] compile for `transformers` model
#156520 opened Jun 20, 2025
Convenient way to create device with torch.accelerator and a specific device index
#156519 opened Jun 20, 2025
DISABLED test_comprehensive_nn_functional_linear_cuda_float16 (__main__.TestInductorOpInfoCUDA)
#156514 opened Jun 20, 2025
Native BFloat16 Mixed BatchNorm Train gives incorrect gradients
#156513 opened Jun 20, 2025
functorch_maml_omniglot is a bad CPU performance smoketest model
#156511 opened Jun 20, 2025
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int32 (__main__.TestForeachCUDA)
#156497 opened Jun 20, 2025
mypy.ini deprecation: numpy.typing.mypy_plugin
#156489 opened Jun 20, 2025
Add stub for mypy-torch._C._jit_tree_views
#156488 opened Jun 20, 2025
SDPA FLASH_ATTENTION backend gets NaN values for IPEX on Intel CPU
#156487 opened Jun 20, 2025
Dynamo benchmark test got failed torch.dtype object has no attribute '__name__'
#156482 opened Jun 20, 2025
Parameter "onnx_shape_inference" can't be successfully passed to the "_export" interface in "torch/onnx/utils.py" via the "torch.onnx.export" interface
#156480 opened Jun 20, 2025
`torch.compile` fails with `UnicodeDecodeError` when model contains extreme value injection
#156451 opened Jun 19, 2025
Tensor.is_pinned() raises error after renaming privateuseone backend.
#156444 opened Jun 19, 2025
DISABLED test_inlined_optimized_graph (__main__.TestTEFuserDynamic)
#156438 opened Jun 19, 2025
Accuracy minifier fails to minify anything
#156437 opened Jun 19, 2025
DISABLED test_skip_grad_in_check (__main__.TestTEFuserDynamic)
#156436 opened Jun 19, 2025
Suggest to use the torch cmake target instead of ${TORCH_LIBRARIES} in the c++ docs
#156434 opened Jun 19, 2025
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int16 (__main__.TestForeachCUDA)
#156430 opened Jun 19, 2025
[MPSInductor] Silently incorrect result with varmean+epilogue
#156426 opened Jun 19, 2025
[CI] "Update viable/strict" job occasionally hangs for days
#156425 opened Jun 19, 2025
research adding cuda-bindings to core
#156424 opened Jun 19, 2025
DISABLED test_fake_crossref_backward_no_amp_cholesky_solve_cuda_float32 (__main__.TestFakeTensorCUDA)
#156419 opened Jun 19, 2025
ShardedTensor breaks cycle detection
#156417 opened Jun 19, 2025
A possible bug in HistogramObserver._combine_histograms()
#156414 opened Jun 19, 2025
bmm max-autotune segfaults on x86 cpu
#156412 opened Jun 19, 2025
[compile] torch._dynamo.exc.TorchRuntimeError: Failed running call_function aten.lift_fresh_copy.default
#156411 opened Jun 19, 2025
DataParallel gather NaN across multiple gpus
#156392 opened Jun 19, 2025
[compile][transformers] Recompilation with mark_static_address with cudagraphs
#156377 opened Jun 18, 2025
DISABLED test_inplace_on_view_undefined_grad_output_cpu (__main__.TestAutogradDeviceTypeCPU)
#156363 opened Jun 18, 2025
DISABLED test_schedule_with_native_zero_bubble_ScheduleClass1 (__main__.ScheduleTest)
#156328 opened Jun 18, 2025
UNSTABLE periodic / linux-jammy-rocm-py3.10 / test (distributed)
#156327 opened Jun 18, 2025
[inductor] tune_scaled_grouped_mm fails with memory layout assertion, despite memory layout assertions prior to op call passing
#156325 opened Jun 18, 2025
Provide a way to allow dynamo to trace into an operator defined with `torch.library.custom_op`
#156322 opened Jun 18, 2025
Composition of nested `torch.compile` calls is not well defined
#156308 opened Jun 18, 2025
DISABLED test_inplace_on_view_then_no_grad_cpu (__main__.TestAutogradDeviceTypeCPU)
#156306 opened Jun 18, 2025
Inductor error with Torch XPU optimizations to StableDiffusion3 Pipeline
#156303 opened Jun 18, 2025
DISABLED test_inplace_on_view_of_view_cpu (__main__.TestAutogradDeviceTypeCPU)
#156289 opened Jun 18, 2025
DISABLED test_inplace_on_view_non_contig_cpu (__main__.TestAutogradDeviceTypeCPU)
#156265 opened Jun 18, 2025
DISABLED test_shape_env (__main__.TestGuardSerialization)
#156264 opened Jun 18, 2025
torch._foreach_copy_ causing CUDA illegal memory access.
#156261 opened Jun 18, 2025
[ONNX] Create a tutorial for exporting hf transformers model
#156258 opened Jun 18, 2025
[dynamic shapes] translation validation failure under `fake_tensor_propagate_real_tensors`
#156251 opened Jun 17, 2025
DISABLED test_name_match (__main__.TestGuardSerialization)
#156246 opened Jun 17, 2025
Upgrade torch._scaled_grouped_mm to SM100+
#156238 opened Jun 17, 2025
[Tracker] AutoParallel's feature request to DTensor
#156217 opened Jun 17, 2025
DISABLED test_inplace_on_view_makes_base_require_grad_cpu (__main__.TestAutogradDeviceTypeCPU)
#156209 opened Jun 17, 2025
When compiling submodules, AOTInductor is significantly slower with torch.export
#156206 opened Jun 17, 2025
Upgrade torch._grouped_mm to SM100+
#156202 opened Jun 17, 2025
Dynamo does not know how to trace method `__len__` of class `<unknown type>` with torch.logging calls
#156191 opened Jun 17, 2025
[CD] Windows Wheel builds CUDA 12.9.1 Stack Overflow during build
#156181 opened Jun 17, 2025
DISABLED test_inplace_on_view_backprop_view_of_view_cpu (__main__.TestAutogradDeviceTypeCPU)
#156180 opened Jun 17, 2025
nn.Module._load_from_state_dict is always called with strict=True
#156177 opened Jun 17, 2025
[Feature Request]: Native C++ API for ONNX Export in LibTorch
#156168 opened Jun 17, 2025
DISABLED test_inplace_on_view_backprop_view_cpu (__main__.TestAutogradDeviceTypeCPU)
#156163 opened Jun 17, 2025
torch2.7.1 issue for torch.compile numpy
#156162 opened Jun 17, 2025
[SDPA] RTX5080 is different from CPU calculation result in backward with long seq
#156160 opened Jun 17, 2025
Error shm.dll
#156159 opened Jun 17, 2025
Inconsistent `torch.rsqrt` results on complex128 between CPU and CUDA
#156152 opened Jun 17, 2025
DISABLED test_inplace_on_view_backprop_base_cpu (__main__.TestAutogradDeviceTypeCPU)
#156143 opened Jun 17, 2025
[dynamo, dynamic shapes] .item() on Tensor created in the compiled region fails
#156135 opened Jun 16, 2025
[DDP][FSDP2] add unit test to showcase DDP mixed precision with FSDP2 mixed precision
#156130 opened Jun 16, 2025
[dynamo] Show carets in graph break stack traces
#156127 opened Jun 16, 2025
[compile][torchtune] Full model compiled Qwen3 is 4x slower than eager
#156103 opened Jun 16, 2025
UNSTABLE rocm / linux-jammy-rocm-py3.10 / test (default)
#156098 opened Jun 16, 2025
DISABLED test_quantize (__main__.TestOpenReg)
#156089 opened Jun 16, 2025
DISABLED test_schedule_with_native_zero_bubble_ScheduleClass0 (__main__.ScheduleTest)
#156088 opened Jun 16, 2025
Idea: Add SBOM Generation (and optional vuln scan) for better supply chain insight
#156085 opened Jun 16, 2025
`torch.logsumexp`: support `dim=None`
#156075 opened Jun 16, 2025
[codespell] fix typos in the codebase
#156073 opened Jun 16, 2025
[typing][docs] `torch.amin` and `torch.amax` do not document `dim=None`
#156072 opened Jun 16, 2025
Docs incorrectly claim `torch.max` and `torch.logsumexp` accept `dim=None`
#156071 opened Jun 16, 2025
torch.nn.functional.conv_transpose3d return inconsistent results when weight containing inf between CPU and GPU
#156062 opened Jun 16, 2025
Dynamo trace an incorrect result on torch._C._storage_Use_Count
#156059 opened Jun 16, 2025
torch.equal causes fallback to eager mode in torch.compile
#156057 opened Jun 16, 2025
`torch.distributed.tensor.parallel.style.ColwiseParallel` introduce huge guard eval latency
#156054 opened Jun 16, 2025
Ability to set device guard in Python
#156052 opened Jun 16, 2025
[RFC] Migrate to modern Python build system and replace `setup.py` commands with their modern alternatives
#156029 opened Jun 15, 2025
[Upstream Triton] persistent mm + tma accuracy failures
#156028 opened Jun 15, 2025
Error installing torch for CUDA 12.1 on a GTX 1660 Ti
#156024 opened Jun 15, 2025
[Segfault Bug] Out-of-bounds write of at::native::cpubla::gemm (bfloat16) in at::native::cpu_flash_attention_backward
#156022 opened Jun 15, 2025
torch.fft.ifft for complex64 produces inconsistent results between CPU and CUDA
#156020 opened Jun 15, 2025
[ROCm] BF16 Context Parallelism MI300X Not Numerically Accurate
#156012 opened Jun 15, 2025

443 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

[build] modernize build-backend: `setuptools.build_meta:__legacy__` -> `setuptools.build_meta`
#155998 commented on Jun 21, 2025 • 19 new comments
[ATen][CPU][Sparse] Use Third-Party Eigen for sparse add and addmm
#155357 commented on Jun 19, 2025 • 19 new comments
[DLPack] Add support for missing keyword-arguments.
#150218 commented on Jun 21, 2025 • 10 new comments
[list] Implement list.count
#153969 commented on Jun 21, 2025 • 8 new comments
[BE][Ez]: Use ruff type inference to autotype parts of dynamo
#156001 commented on Jun 17, 2025 • 7 new comments
[hop] support torch.func.functional_call in hop subgraph
#155886 commented on Jun 20, 2025 • 7 new comments
Add DeviceAllocator as the base device allocator
#138222 commented on Jun 20, 2025 • 6 new comments
[cpp wrapper] add AOTI shim for collective ops
#154492 commented on Jun 20, 2025 • 6 new comments
[doc] Updates to distributed.md for XCCL backend
#155834 commented on Jun 20, 2025 • 6 new comments
fix: 155029 convert rst to md
#155554 commented on Jun 19, 2025 • 6 new comments
Fix issue with set_reduce_scatter_divide_factor errors and MixedPrecisionPolicy
#155964 commented on Jun 20, 2025 • 6 new comments
[PowerPC] Fixed build issue for vsx vec256 complexfloat and scaled_mm_out_cpu
#155255 commented on Jun 21, 2025 • 5 new comments
Fix clang-tidy bugprone* warnings
#148529 commented on Jun 21, 2025 • 5 new comments
unify dynamic shapes API namings 3 (guard_int, guard_int_seq)
#155973 commented on Jun 19, 2025 • 5 new comments
Optionally avoid `record_streams` in autograd with `TORCH_AUTOGRAD_AVOID_RECORD_STREAMS=1`
#155857 commented on Jun 16, 2025 • 5 new comments
Fused RMSNorm implementation
#153666 commented on Jun 21, 2025 • 4 new comments
Enable Leak Sanitizer
#154584 commented on Jun 22, 2025 • 4 new comments
Support --inplace flag for tools/nightly.py
#155419 commented on Jun 20, 2025 • 4 new comments
New Sampler: DistributedWeightedRandomSampler
#150182 commented on Jun 21, 2025 • 4 new comments
Fix cudagraph record_stream memory leak
#155658 commented on Jun 17, 2025 • 4 new comments
Convert to markdown: jit_python_reference.rst, jit_unsupported.rst, jit_utils.rst, library.rst
#155404 commented on Jun 19, 2025 • 3 new comments
Fix: fallback in deserialize_torch_artifact for ScriptObject using weights_only=False
#154333 commented on Jun 21, 2025 • 3 new comments
DOC: update CrossEntropyLoss with note and example of incorrect target specification
#155649 commented on Jun 18, 2025 • 3 new comments
NUMA Binding Integration with torchrun
#149334 commented on Jun 16, 2025 • 3 new comments
Compute contiguity symbolically to avoid dde, and introduce c++ is_contiguous_or_false.
#155590 commented on Jun 21, 2025 • 3 new comments
distributions/constraints type annotations + public classes + some refactoring
#154827 commented on Jun 16, 2025 • 3 new comments
Convert sparse rst to md
#155438 commented on Jun 18, 2025 • 3 new comments
Add unified memory APIs for torch.accelerator
#152932 commented on Jun 20, 2025 • 3 new comments
[Inductor] Fix output discrepancy between Inductor and eager of mean with input of a large size tensor
#155428 commented on Jun 17, 2025 • 3 new comments
docs: fix dead link in torch.compile docs
#152734 commented on Jun 17, 2025 • 3 new comments
[WIP] Port dynamo test cases for xpu backend.
#155524 commented on Jun 20, 2025 • 3 new comments
[aotd] Support mutations of the same input in fw and bw
#155354 commented on Jun 20, 2025 • 2 new comments
add sfdp pattern
#155792 commented on Jun 17, 2025 • 2 new comments
Add UT for torch.accelerator memory-related API
#155200 commented on Jun 20, 2025 • 2 new comments
[Reland] [Intel GPU] Make SDPA output has the same stride as Query.
#154340 commented on Jun 20, 2025 • 2 new comments
[2/n] rewrite load balancing and sharding in context parallel
#155442 commented on Jun 18, 2025 • 2 new comments
[ONNX] Don't link to third-party protobuf
#153920 commented on Jun 20, 2025 • 2 new comments
Default USE_PRIORITIZED_TEXT_FOR_LD=1 on Linux aarch64 via setup.py
#155901 commented on Jun 20, 2025 • 2 new comments
[Set] Support sets in VariableBuilder
#153150 commented on Jun 21, 2025 • 2 new comments
[export] add serialized_artifact test
#152739 commented on Jun 16, 2025 • 2 new comments
Build libgomp (gcc-11) from src on AArch64
#152361 commented on Jun 20, 2025 • 2 new comments
Deprecated pkg_resources and use distributions instead
#151915 commented on Jun 20, 2025 • 2 new comments
Refactor cpp codegen to support overridable class attributes.
#155553 commented on Jun 19, 2025 • 2 new comments
Fix clang-tidy warnings of performance from uncovered files
#144542 commented on Jun 19, 2025 • 2 new comments
Upgrade to DLPack 1.0.
#145000 commented on Jun 20, 2025 • 2 new comments
Update OpenBLAS commit
#151547 commented on Jun 20, 2025 • 2 new comments
XCCL changes for DDP
#155497 commented on Jun 21, 2025 • 1 new comment
[DRAFT] Evaluate feasability of using FunctionalTensor for Example Value
#155606 commented on Jun 19, 2025 • 1 new comment
Add serialized_type_name to torch.return_types.* so we can dump them
#155245 commented on Jun 19, 2025 • 1 new comment
[docs] Decorator to create a deprecation warning
#155127 commented on Jun 17, 2025 • 1 new comment
[ca] cpp tensor pre hooks
#155082 commented on Jun 19, 2025 • 1 new comment
Support deterministic upsample trilinear backward
#154239 commented on Jun 20, 2025 • 1 new comment
[CUDA] Allow cuDNN or flash attn in `test_activation_checkpointing` pattern match check
#153272 commented on Jun 16, 2025 • 1 new comment
Enable the AMP precision with freezing for CPU nightly test
#152298 commented on Jun 20, 2025 • 1 new comment
[reland][ROCm] remove caffe2 from hipify
#151845 commented on Jun 19, 2025 • 1 new comment
Deprecate DataLoader pin_memory_device param
#146821 commented on Jun 20, 2025 • 1 new comment
ROCm OCP Micro-scaling Format (mx-fp8/mx-fp4) Support
#151360 commented on Jun 20, 2025 • 1 new comment
Add CPython string tests
#150793 commented on Jun 17, 2025 • 1 new comment
Refactor CUDAAllocatorConfig to reuse AllocatorConfig
#150312 commented on Jun 18, 2025 • 1 new comment
[nn.utils] scale_grad_ with for_each
#150033 commented on Jun 21, 2025 • 1 new comment
[dynamo] Support builtin bool on non-constant VTs
#155863 commented on Jun 19, 2025 • 1 new comment
Overload `mul_overflows` for `size_t`
#155736 commented on Jun 22, 2025 • 1 new comment
[CI] Remove conda from from windows
#155731 commented on Jun 17, 2025 • 1 new comment
Add Windows CUDA 12.9.1 build
#155748 commented on Jun 20, 2025 • 1 new comment
Fix argument validation for torch.nn.attention.sdpa_kernel
#155922 commented on Jun 17, 2025 • 1 new comment
Allow as_tensor to retain grad info
#156006 commented on Jun 17, 2025 • 1 new comment
Fixed NLLLoss 1D input crash with torch.compile
#155672 commented on Jun 18, 2025 • 1 new comment
Add BufferDict works like ParameterDict
#151870 commented on Jun 21, 2025 • 0 new comments
distributed: add distributed P2P TensorQueue and TensorStore
#151631 commented on Jun 17, 2025 • 0 new comments
[draft export] normalize sympy expressions for data-dependent counting
#151856 commented on Jun 21, 2025 • 0 new comments
Add assert_on_assumption on to guard_or_true, and guard_or_false
#151854 commented on Jun 21, 2025 • 0 new comments
[demo] Verify test runner integration
#151645 commented on Jun 21, 2025 • 0 new comments
Fix stride comparison max(512 - s, 1) vs. (512 - s)
#155938 commented on Jun 21, 2025 • 0 new comments
Deduplicate library deletion
#151795 commented on Jun 20, 2025 • 0 new comments
[ca] mark some sparse tests fixed by AccumulateGrad functionalization
#155948 commented on Jun 17, 2025 • 0 new comments
enable windows inductor UT in CI
#151777 commented on Jun 21, 2025 • 0 new comments
[MPS] Implement upsample_nearest3d_vec operator
#151760 commented on Jun 22, 2025 • 0 new comments
don't do a full deserialize on every file
#155942 commented on Jun 16, 2025 • 0 new comments
torch.testing._internal.optests - MPS Support
#151758 commented on Jun 21, 2025 • 0 new comments
Refactor duplicate code into a utility function in pytorch/torch/nn/functional.py
#151752 commented on Jun 20, 2025 • 0 new comments
Update __init__.py
#151751 commented on Jun 22, 2025 • 0 new comments
[WIP][draft_export] suppress pending unbacked for divisibility symbol
#151718 commented on Jun 19, 2025 • 0 new comments
Remove unnecessary recompile
#151711 commented on Jun 18, 2025 • 0 new comments
Update link to NVIDIA cuDNN Support Matrix
#151647 commented on Jun 19, 2025 • 0 new comments
[aot] Set config partitioner recompute_views True by default
#151676 commented on Jun 17, 2025 • 0 new comments
test without rocblas conv when using cudagraphs
#155902 commented on Jun 18, 2025 • 0 new comments
Mitigate upcoming removal of direct invocation of setup.py support
#155910 commented on Jun 20, 2025 • 0 new comments
[FrozenSet] Fixes for FrozenSet
#152991 commented on Jun 21, 2025 • 0 new comments
[ROCm] Ck gemm architecture guard
#152951 commented on Jun 17, 2025 • 0 new comments
[WIP] Automatic load/save
#155913 commented on Jun 18, 2025 • 0 new comments
[dynamo] Add `-> bool` to functions named `is_*` or `_is_*`
#155923 commented on Jun 20, 2025 • 0 new comments
[ROCm] Initial AITER Integration for mha_bwd asm kernels
#152630 commented on Jun 17, 2025 • 0 new comments
[pytree] make `tree_*` functions accept both Python and C++ `PyTreeSpec`
#152624 commented on Jun 18, 2025 • 0 new comments
Implemented `Size.__radd__`
#152554 commented on Jun 16, 2025 • 0 new comments
complex.pow(2) on GPU by replacing with complex * complex to avoid numerical instability
#152373 commented on Jun 21, 2025 • 0 new comments
[inductor] Add `-> bool` to functions named `is_*` or `_is_*`
#155928 commented on Jun 17, 2025 • 0 new comments
updated matplotlib version in docs requirements
#155931 commented on Jun 20, 2025 • 0 new comments
Switch to standard pep517 sdist generation
#152098 commented on Jun 20, 2025 • 0 new comments
Test
#152055 commented on Jun 19, 2025 • 0 new comments
[inductor][profiler] lazily import things in standalone_compile
#151956 commented on Jun 22, 2025 • 0 new comments
add tlpare logs
#151948 commented on Jun 22, 2025 • 0 new comments
[profiler] use inspect.getattr_static to avoid importing inductor
#151946 commented on Jun 22, 2025 • 0 new comments
[WIP][dynamic shapes] whitelist at dim-level
#151941 commented on Jun 22, 2025 • 0 new comments
[Observability][Optimus] Fix the tlparse name
#151935 commented on Jun 22, 2025 • 0 new comments
[export] add _union_dataclass to support comparing dataclasses that inherits from union.
#155932 commented on Jun 19, 2025 • 0 new comments
Add additional MacOS test runners for MPS
#150964 commented on Jun 15, 2025 • 0 new comments
Add complex logaddexp
#150946 commented on Jun 15, 2025 • 0 new comments
all_reduce autograd
#150942 commented on Jun 22, 2025 • 0 new comments
Pin all root requirements to major versions
#150833 commented on Jun 18, 2025 • 0 new comments
[CI][cpp_wrapper] Fix selection of CPU OpInfo tests
#155967 commented on Jun 19, 2025 • 0 new comments
draft: [cp] context_parallel + flex_attention_backward using torch_function and autograd function
#155970 commented on Jun 16, 2025 • 0 new comments
[Inductor] Set the default value of min_chunk_size to 512
#150762 commented on Jun 19, 2025 • 0 new comments
cd: Introduce new binary build workflows (cpu)
#150713 commented on Jun 16, 2025 • 0 new comments
Raise `BufferError` for DLPack buffer-related errors.
#150691 commented on Jun 20, 2025 • 0 new comments
fix dynamic shapes for kwargs
#150583 commented on Jun 19, 2025 • 0 new comments
Initial Implementation of Padded Tensor
#150567 commented on Jun 18, 2025 • 0 new comments
Fixes for CPython int/float tests
#155978 commented on Jun 17, 2025 • 0 new comments
Add `mse_loss_backward_out` type promotion
#150384 commented on Jun 16, 2025 • 0 new comments
Copy native runtime code to OSS.
#150338 commented on Jun 15, 2025 • 0 new comments
[FlexAttention] Don't load invalid values from mask mod
#150331 commented on Jun 15, 2025 • 0 new comments
Handling overflow for long int overflow for the product of kernel_hei…
#155989 commented on Jun 15, 2025 • 0 new comments
[Inductor] Synchronize type annotations between torch and triton
#150311 commented on Jun 21, 2025 • 0 new comments
[BE][Ez]: Fix untyped decorator in dcp utils
#156003 commented on Jun 19, 2025 • 0 new comments
add enum for core Backend class
#156004 commented on Jun 19, 2025 • 0 new comments
Fix `L1Loss`, `MSELoss`, `HuberLoss` missing `weight` param
#150097 commented on Jun 17, 2025 • 0 new comments
added six and pyyaml to requirements.txt to fix missing module error …
#151605 commented on Jun 17, 2025 • 0 new comments
[DRAFT] fix issues related to deferred assertion on unabcked floats
#151604 commented on Jun 17, 2025 • 0 new comments
[bazel] Fix aten generator directory path
#151580 commented on Jun 19, 2025 • 0 new comments
[autodeps2] Replace third-party/pyqt5 with third-party/pypi/pyqt5
#151557 commented on Jun 16, 2025 • 0 new comments
TopK workaround when tensor rank - sort axis > 4
#155950 commented on Jun 17, 2025 • 0 new comments
Fix normalize mypy warning with tuple dim
#151553 commented on Jun 18, 2025 • 0 new comments
[DRAFT][cuDNN][SDPA] Introduce `TORCH_CUDNN_SDPA_AVOID_RECOMPILE=1`
#155958 commented on Jun 18, 2025 • 0 new comments
inductor.config.descriptive_names = False is not actually supported (#145523) (#146051)
#151481 commented on Jun 18, 2025 • 0 new comments
[ROCm] Initial plumbing for CK Gemm Perf Improvement
#151465 commented on Jun 19, 2025 • 0 new comments
Add default value for `serialization_format` in `_write_item` function for better compatibility
#151452 commented on Jun 16, 2025 • 0 new comments
[ca] default on in CI also for PYTORCH_TEST_WITH_INDUCTOR
#155960 commented on Jun 18, 2025 • 0 new comments
update fx.Interpreter error logging to check if submodules are GraphModules
#151451 commented on Jun 17, 2025 • 0 new comments
Implement fast exp for AVX2 and AVX512 for the flash attention
#151441 commented on Jun 16, 2025 • 0 new comments
Update docker image names for s390x release
#151429 commented on Jun 16, 2025 • 0 new comments
[CI] Remove redundant accuracy benchmarks for cpp_wrapper
#155966 commented on Jun 19, 2025 • 0 new comments
Add inductor backend to device interface; make minifier_tests more device agnostic
#151314 commented on Jun 20, 2025 • 0 new comments
[WIP] Generalize device caching allocator
#151298 commented on Jun 17, 2025 • 0 new comments
Remove outdated Android workarounds of nearbyintf
#151292 commented on Jun 22, 2025 • 0 new comments
[WIP][dynamic shapes] lru cache bound_sympy
#151271 commented on Jun 16, 2025 • 0 new comments
[AMD][FA] Block mem efficient attention if backward head_dim > 128 in CK backend
#151258 commented on Jun 16, 2025 • 0 new comments
[aot] bw_module for ca: do not clone real buffers/params
#155370 commented on Jun 19, 2025 • 0 new comments
[AOTInductor] Inherit extern kernels for runtime constant folding
#155361 commented on Jun 16, 2025 • 0 new comments
Fix serialization of nans in torch.export
#155359 commented on Jun 18, 2025 • 0 new comments
[MPS] Activation kernels: do compute at float precision
#155735 commented on Jun 17, 2025 • 0 new comments
[#155034] Converted RST files to Markdown
#155287 commented on Jun 15, 2025 • 0 new comments
[inductor] support linear & layer_norm unbacked
#155267 commented on Jun 17, 2025 • 0 new comments
[Torch Package] Make get names of OrderedImporters support fallback to importers
#155743 commented on Jun 17, 2025 • 0 new comments
updated adafactor doc #154862
#155248 commented on Jun 20, 2025 • 0 new comments
Add is_hidden_event method to KinetoEvent Python interface
#155214 commented on Jun 20, 2025 • 0 new comments
Add AMD AWS runners to inductor performance tests
#155206 commented on Jun 20, 2025 • 0 new comments
add `__annotations__` attribute to `OpOverload`
#155784 commented on Jun 16, 2025 • 0 new comments
[MPS] Add device guard for MPS dispatch key
#155165 commented on Jun 20, 2025 • 0 new comments
[DONT MERGE][TESTING][1/2] xpu test runner
#155793 commented on Jun 19, 2025 • 0 new comments
Fix conversion of values in libtorch agnostic tests
#155115 commented on Jun 18, 2025 • 0 new comments
Issue warning with reference to user code rather than torch
#155112 commented on Jun 16, 2025 • 0 new comments
[Quant][CPU] fix fake_quantize_per_tensor_affine of inf values
#155109 commented on Jun 17, 2025 • 0 new comments
Adapting pipeline parallelism test cases to be device agnostic
#155108 commented on Jun 20, 2025 • 0 new comments
Deprecate c10::string
#155084 commented on Jun 22, 2025 • 0 new comments
Refactor DynamoStore into disk and in memory implementations
#155818 commented on Jun 18, 2025 • 0 new comments
update the baseline for nightly max_autotune tests
#154973 commented on Jun 18, 2025 • 0 new comments
[cond] auto_functionalize cond
#155645 commented on Jun 17, 2025 • 0 new comments
[Misc] handle sys exit caused by skip_if_lt_x_gpu in test_composabili…
#155665 commented on Jun 18, 2025 • 0 new comments
[ROCm][SymmetricMemory] Avoid bf16 to float conversion during reduce
#155587 commented on Jun 20, 2025 • 0 new comments
[Misc] skip the case test_foreach_add_different_mesh if world size is…
#155563 commented on Jun 18, 2025 • 0 new comments
Implement guard collectives
#155558 commented on Jun 20, 2025 • 0 new comments
Document `Flop Counter Mode` in torch.utils
#155673 commented on Jun 16, 2025 • 0 new comments
Update MAIAHooksInterface to pin host memory in MAIA device
#155541 commented on Jun 20, 2025 • 0 new comments
[Quant][CPU] Enable fp8 qlinear
#155678 commented on Jun 16, 2025 • 0 new comments
Making implicit packages explicit (torch)
#155505 commented on Jun 20, 2025 • 0 new comments
[fsdp] fix: fix optim_state_dict with FSDP model not on global rank 0
#155685 commented on Jun 18, 2025 • 0 new comments
[ROCm] skip convolution tests on Navi, enable batch_norm_with_update
#155454 commented on Jun 16, 2025 • 0 new comments
Update slow tests
#155448 commented on Jun 16, 2025 • 0 new comments
[Profiler] Fix lost C call events problem in Python 3.12.0-3.12.4
#155446 commented on Jun 21, 2025 • 0 new comments
Clean up HF components
#155707 commented on Jun 17, 2025 • 0 new comments
docs: clean up docstring for clarity and correctness
#155712 commented on Jun 15, 2025 • 0 new comments
[2/2] proxy_tensor do not clobber for mutating ops
#155716 commented on Jun 16, 2025 • 0 new comments
[FP8] Fix Benchmarking for certain Priors
#155722 commented on Jun 17, 2025 • 0 new comments
Remove remaining CUDA 12.4 CI code
#155412 commented on Jun 22, 2025 • 0 new comments
Convert onnx torchscript rst to md
#155390 commented on Jun 21, 2025 • 0 new comments
[Precompile] Hook up backend="inductor"
#155387 commented on Jun 22, 2025 • 0 new comments
[cuBLAS][cuBLASLt] Reduce scale of inputs for reduced precision reduction matmul test
#154293 commented on Jun 17, 2025 • 0 new comments
[Dynamo] [FrozensetSubclass] Add support for user defined frozensets
#154263 commented on Jun 21, 2025 • 0 new comments
[NOT FOR MERGE] Exploratory work on AOTInductor training
#155877 commented on Jun 19, 2025 • 0 new comments
implement MKLGenerator
#154199 commented on Jun 18, 2025 • 0 new comments
[cond] support gen_schema for cond
#154193 commented on Jun 17, 2025 • 0 new comments
[cuBLASLt][cuBLAS] Support 2D bias and `beta != 1.0` in cuBLASLt
#154170 commented on Jun 17, 2025 • 0 new comments
[BE]: Update pybind11 submodule to 3.0.0rc
#154115 commented on Jun 19, 2025 • 0 new comments
[Dynamo] [Set] Add comparison for set subclass
#154066 commented on Jun 21, 2025 • 0 new comments
[Dynamo] [Set] Raise TypeError in set.union(...) and "__or__"
#154065 commented on Jun 21, 2025 • 0 new comments
[Dynamo] [Set] Raise TypeError if object is unhashable
#154064 commented on Jun 21, 2025 • 0 new comments
[Dynamo] [Set] Implement some binop operators for dict/set/frozenset/dict_keys
#154063 commented on Jun 21, 2025 • 0 new comments
[draft][do not review] H-FSDP prototype
#154000 commented on Jun 18, 2025 • 0 new comments
Docs: Fix sphinx heading markup in `nn.rst`
#155883 commented on Jun 17, 2025 • 0 new comments
[WIP][user triton] AOT inductor support for device-side TMA
#155896 commented on Jun 17, 2025 • 0 new comments
Ignore url lint in install_xpu.sh
#153796 commented on Jun 16, 2025 • 0 new comments
[BE] Use latest mkl-include and mkl-devel on Windows CI
#153684 commented on Jun 15, 2025 • 0 new comments
[Cutlass] Fix buffer missing issues
#155897 commented on Jun 17, 2025 • 0 new comments
[Dynamo] [SetSubclass] Add support for user defined sets
#153553 commented on Jun 21, 2025 • 0 new comments
[ROCm] update state check for test_trace_while_active*
#153545 commented on Jun 19, 2025 • 0 new comments
CMake: update FindCUDAToolkit.cmake, use torch::nvtx3 if present, mod…
#153339 commented on Jun 15, 2025 • 0 new comments
[CI] Removing --user flag from all pip install commands
#154900 commented on Jun 16, 2025 • 0 new comments
[ROCm] SDPA fix mem fault when dropout is enabled
#154864 commented on Jun 21, 2025 • 0 new comments
Skip FSDP tests if device count is less then requested world_size value
#155836 commented on Jun 20, 2025 • 0 new comments
[BE]: Try to enable LTO
#154819 commented on Jun 18, 2025 • 0 new comments
[Wheel Variant] Experimental Support
#154733 commented on Jun 21, 2025 • 0 new comments
[vision hash update] update the pinned vision hash
#154694 commented on Jun 22, 2025 • 0 new comments
Use official CUDAToolkit module in CMake
#154595 commented on Jun 22, 2025 • 0 new comments
Fix MKL error: Inconsistent configuration parameters
#154585 commented on Jun 17, 2025 • 0 new comments
[einops] Ensure Dynamo can trace through explicit set dunder method call
#155842 commented on Jun 20, 2025 • 0 new comments
[dynamo] raise hard error if error is encountered while tracing resume function prologue
#154564 commented on Jun 21, 2025 • 0 new comments
Fixes Issue #154491
#154561 commented on Jun 15, 2025 • 0 new comments
Improve torch.ops typing
#154555 commented on Jun 21, 2025 • 0 new comments
[cpp_wrapper] Build main and kernel code in separate threads
#154551 commented on Jun 19, 2025 • 0 new comments
[Generator] Implement generator.__contains__
#154539 commented on Jun 21, 2025 • 0 new comments
Fix Float16 CooperativeReduction Test Failure
#154516 commented on Jun 18, 2025 • 0 new comments
[NCCL][P2P] Optionally avoid `recordStream`in P2P comms
#155854 commented on Jun 17, 2025 • 0 new comments
[easy] better copy_misaligned_inputs assertion failure message
#154472 commented on Jun 21, 2025 • 0 new comments
Fix: Ensure writeback handles NO_SHARD correctly by flattening tensors before copying
#154369 commented on Jun 18, 2025 • 0 new comments
Ensure Dynamo can trace through explicit dunder method call
#154366 commented on Jun 20, 2025 • 0 new comments
[DONT MERGE] Diffusion models benchmarking for compile time
#155866 commented on Jun 21, 2025 • 0 new comments
DISABLED test_wait_tensor (__main__.CompileTest)
#148014 commented on Jun 19, 2025 • 0 new comments
Torchrun should handle SIGUSR1 and SIGUSR2
#154849 commented on Jun 18, 2025 • 0 new comments
Cudnn attention is very slow when sequence length changed in every step
#154602 commented on Jun 18, 2025 • 0 new comments
DISABLED test_inductor_all_gather_into_tensor_single (__main__.CompileTest)
#147707 commented on Jun 18, 2025 • 0 new comments
DISABLED test_per_sample_api_compute_batch_size_not_pytreeable_cpu (__main__.TestExpandedWeightModuleCPU)
#146972 commented on Jun 18, 2025 • 0 new comments
xpu: implement aten::_linalg_eigvals for XPU backend (affecting HF Transformers v4.46.0-v4.48.0)
#140965 commented on Jun 18, 2025 • 0 new comments
Export Huggingface models with StaticCache
#155862 commented on Jun 18, 2025 • 0 new comments
DTensor RNG state for non CUDA backends
#138329 commented on Jun 18, 2025 • 0 new comments
UR Error when calling grid_sample
#153996 commented on Jun 18, 2025 • 0 new comments
Reproducibility of results without AVX512 by setting ATEN_CPU_CAPABILITY=avx2
#155552 commented on Jun 18, 2025 • 0 new comments
Enable CUDA 12.9 binaries
#155196 commented on Jun 17, 2025 • 0 new comments
Device check missing in torch.linalg.solve_triangular leading to hard crash
#142048 commented on Jun 17, 2025 • 0 new comments
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "/opt/pytorch/pytorch/c10/cuda/CUDACachingAllocator.cpp":830, please report a bug to PyTorch.
#123834 commented on Jun 17, 2025 • 0 new comments
support for cuDNN 9.8+
#155203 commented on Jun 17, 2025 • 0 new comments
Excessively restrictive dependencies
#155325 commented on Jun 17, 2025 • 0 new comments
CUDA 12.6->12.8 slow and periodic failures
#155607 commented on Jun 17, 2025 • 0 new comments
Functional all_gather_into_tensor does not support stacking, fails when compiled
#155632 commented on Jun 17, 2025 • 0 new comments
[RFC] Experimental Wheel Variant Support
#155141 commented on Jun 17, 2025 • 0 new comments
torch.compile produces incorrect output
#155690 commented on Jun 17, 2025 • 0 new comments
Question in aot_autograd trace in torch.distributed case
#155599 commented on Jun 17, 2025 • 0 new comments
Graph break when modifying a list that contains symints.
#155174 commented on Jun 17, 2025 • 0 new comments
[Misc] test_foreach_add_different_mesh cannot work on machines with less than 4 GPUs
#155562 commented on Jun 17, 2025 • 0 new comments
[NJT] can only chunk if the 2nd dimension is ragged
#153238 commented on Jun 17, 2025 • 0 new comments
Vulkan interoperability
#155986 commented on Jun 17, 2025 • 0 new comments
The difference between input grad computed by channels last backward and the input grad computed by channels first backward of Hardswish on MPS is too large
#107214 commented on Jun 17, 2025 • 0 new comments
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float32 (__main__.TestForeachCUDA)
#153470 commented on Jun 17, 2025 • 0 new comments
DCP save only saves one shard of tensor parallel model when using DP + TP
#156002 commented on Jun 17, 2025 • 0 new comments
`torch.prod` or `torch.special.entr` triggers `CUDA driver error: invalid argument` on GPU unless kernel cache is cleared
#156010 commented on Jun 17, 2025 • 0 new comments
Torch compile CUDA graphs leads to a large number of CUDA streams
#155679 commented on Jun 19, 2025 • 0 new comments
ROCm: torch.cholesky_inverse raises Memory access fault for large tensor shapes
#155046 commented on Jun 19, 2025 • 0 new comments
Dynamo export: Fake tensor broadcast error
#129534 commented on Jun 19, 2025 • 0 new comments
Sparse tensor indexing not implemented, but partially supported by using index_select
#150277 commented on Jun 19, 2025 • 0 new comments
NotImplementedError: Could not run 'aten::index.Tensor' with arguments from the 'SparseCUDA' backend.
#152226 commented on Jun 19, 2025 • 0 new comments
`torch.sparse.log_softmax` output mismatch between CPU and CUDA
#152293 commented on Jun 19, 2025 • 0 new comments
Segmentation fault when converting sparse COO tensor with complex values to dense
#153329 commented on Jun 19, 2025 • 0 new comments
cpp wrapper calls back to python for custom op even when a C++ registration is made
#153478 commented on Jun 19, 2025 • 0 new comments
NotImplementedError: Could not run 'aten::log' with arguments from the 'SparseCUDA' backend.
#153497 commented on Jun 19, 2025 • 0 new comments
Clarify default value of eps in RMSNorm documentation
#155527 commented on Jun 19, 2025 • 0 new comments
Triton pin update for PyTorch 2.8 / Triton 3.4
#154206 commented on Jun 19, 2025 • 0 new comments
[torch.export] Cannot export TorchVision raft_small, raft_large
#155550 commented on Jun 19, 2025 • 0 new comments
DISABLED test_mempool_ctx_multithread (__main__.TestMemPool)
#153460 commented on Jun 19, 2025 • 0 new comments
Windows inductor genarated zero size array code, and is not supported by MSVC(C2466).
#153180 commented on Jun 19, 2025 • 0 new comments
Dynamo handling for all methods of torch.Generator
#88576 commented on Jun 19, 2025 • 0 new comments
DISABLED test_memory_snapshot (__main__.TestCudaMallocAsync)
#126953 commented on Jun 19, 2025 • 0 new comments
RFC: The State of Custom CUDA extensions in PyTorch
#152032 commented on Jun 19, 2025 • 0 new comments
`torch._dynamo.exc.Unsupported: Attempted to call function marked as skipped`. Explanation: Dynamo developers have intentionally marked that the function `_immutable_list_unflatten`
#155426 commented on Jun 18, 2025 • 0 new comments
[dtensor] ops coverage tracker
#119930 commented on Jun 18, 2025 • 0 new comments
Context Parallel -- unsharded output doesn't match output without CP.
#152261 commented on Jun 18, 2025 • 0 new comments
"RuntimeError: makeDeviceForHostname(): unsupported gloo device" with nightly torch 2.8
#150381 commented on Jun 18, 2025 • 0 new comments
Escape hatch: way to dynamically add or remove tags from custom operators
#150972 commented on Jun 18, 2025 • 0 new comments
Compile produces different result than eager for mutable custom op use case
#153389 commented on Jun 18, 2025 • 0 new comments
[RFC][API-Unstable] Support 3rd party SYCL kernels with CPP Extension API
#153265 commented on Jun 18, 2025 • 0 new comments
Timer benchmark stores only one time value, and therefore has broken mean/median/etc metrics
#106801 commented on Jun 18, 2025 • 0 new comments
[feature request] Native checkpointing to/from `s3://`
#155992 commented on Jun 18, 2025 • 0 new comments
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float64 (__main__.TestForeachCUDA)
#153544 commented on Jun 18, 2025 • 0 new comments
torch.distributions.kl_divergence Fails with MultivariateNormal in Dynamo Due to _infer_size Type Error
#155800 commented on Jun 16, 2025 • 0 new comments
Activation Checkpointing breaks "torch.distributed.checkpoint.state_dict._get_fqns"
#155924 commented on Jun 16, 2025 • 0 new comments
Can't call torch.compile inside of a custom op
#151328 commented on Jun 16, 2025 • 0 new comments
pipeline() fails when a sub-module uses "no_grad()"; impacts RoPE implementation on HF models
#155589 commented on Jun 16, 2025 • 0 new comments
Cannot compile with latest LLVM-19
#139065 commented on Jun 16, 2025 • 0 new comments
torch.clamp throws overflow error on CPU but not on CUDA
#155671 commented on Jun 16, 2025 • 0 new comments
crash in torch.histc
#155393 commented on Jun 16, 2025 • 0 new comments
nn.init.trunc_normal_ Creates Massive Outliers with Small std Due to erfinv Instability
#155588 commented on Jun 16, 2025 • 0 new comments
FSDP learning hangs when the program tries to save the model
#143536 commented on Jun 16, 2025 • 0 new comments
Most requested ops for the MPS backend
#154052 commented on Jun 16, 2025 • 0 new comments
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float16 (__main__.TestForeachCUDA)
#153379 commented on Jun 16, 2025 • 0 new comments
[Tracker] Nested tensor op coverage requests
#118107 commented on Jun 16, 2025 • 0 new comments
[RFC][API-Unstable] Intel GPU distributed Backend integration in `torch-xpu-ops`and registeration in PyTorch
#141741 commented on Jun 16, 2025 • 0 new comments
TypeError when using torch.cuda.list_gpu_processes() on Windows with the WDDM driver
#64491 commented on Jun 16, 2025 • 0 new comments
DISABLED test_jacobian_vectorize_raises_no_warnings_logging_tensor (__main__.TestAutogradFunctional)
#153707 commented on Jun 16, 2025 • 0 new comments
DTensor + torch.compile on CPU: compiled matmul fails with multiple shape inputs
#154111 commented on Jun 16, 2025 • 0 new comments
FSDP2's `set_reduce_scatter_divide_factor` is inconsistent wrt reduce dtype
#155904 commented on Jun 16, 2025 • 0 new comments
Including XPU and CUDA in ProfilerActivity causes XPU profiling to be ignored
#155957 commented on Jun 16, 2025 • 0 new comments
Some Doc Issue about `torch.lobpcg()`
#152107 commented on Jun 16, 2025 • 0 new comments
DISABLED test_remove_noop_view_dtype_cuda (__main__.GPUTests)
#151541 commented on Jun 16, 2025 • 0 new comments
Continuous calls to nn.Linear in fp32 on the 5090D cause severe performance degradation
#150725 commented on Jun 15, 2025 • 0 new comments
Enable TorchInductor to Generate Matmuls Natively via `tl.dot`
#151705 commented on Jun 15, 2025 • 0 new comments
bmm, topk, cholesky, linalg.norm, max with out variants set causing recompilations in torch.compile
#135859 commented on Jun 15, 2025 • 0 new comments
Enable `torch.topk` to support `stable` flag
#88227 commented on Jun 15, 2025 • 0 new comments
ONNX export via Dynamo sets `dft_length = 1` in `DFT`, breaking shape-inference for `torch.fft.rfft`
#155997 commented on Jun 15, 2025 • 0 new comments
Incompatible Torch and Torchvision while building from source for 2.6.0 and CUDA 12.6, RuntimeError: operator torchvision::nms does not exist
#146221 commented on Jun 15, 2025 • 0 new comments
[MPS] Performance regression and visual bug with ComfyUI Flux dev since nightly 20250510
#155797 commented on Jun 15, 2025 • 0 new comments
Keep gettting AssertionError: found no DeviceMesh from dtensor args for c10d.broadcast_.default!
#155993 commented on Jun 17, 2025 • 0 new comments
[ONNX] dynamic_axes does not rename dynamic dimension in torch.onnx.export
#150544 commented on Jun 17, 2025 • 0 new comments
Inductor Perf MX to_blocked
#153194 commented on Jun 17, 2025 • 0 new comments
[Async TP] Fuse all-gather-matmuls for float8 rowwise training
#149990 commented on Jun 17, 2025 • 0 new comments
torch.compile on MPS progress tracker
#150121 commented on Jun 17, 2025 • 0 new comments
canUse32BitIndexMath set to False with efficient net
#155225 commented on Jun 17, 2025 • 0 new comments
TorchInductor CPU Performance Dashboard
#93531 commented on Jun 17, 2025 • 0 new comments
Pypi Support for Windows arm64
#154260 commented on Jun 17, 2025 • 0 new comments
[feature request] Rank-Revealing QR - Adding dgeqp3 support to torch.qr
#10454 commented on Jun 17, 2025 • 0 new comments
Feature Request: Add a rounding mode to round
#55289 commented on Jun 17, 2025 • 0 new comments
DISABLED test_run_decompositions_map_handle_to_new_nodes (__main__.TestNumericDebugger)
#144933 commented on Jun 17, 2025 • 0 new comments
foreach CUDA tests flaky on CUDA 12.6+ due to flaky profiler results
#148681 commented on Jun 17, 2025 • 0 new comments
AssertionError: found no DeviceMesh from dtensor args for c10d.broadcast_.default!
#155463 commented on Jun 17, 2025 • 0 new comments
[feature request] Exact euclidean distance transform
#61509 commented on Jun 17, 2025 • 0 new comments
[DCP] Allow for rank-specific tensors with duplicate keys
#146566 commented on Jun 17, 2025 • 0 new comments
DISABLED test_re_export_preserve_handle (__main__.TestNumericDebugger)
#144898 commented on Jun 17, 2025 • 0 new comments
DISABLED test_lowering_to_x86 (__main__.TestQuantizePT2EX86Inductor)
#153140 commented on Jun 17, 2025 • 0 new comments
Process never ends when sending tensors through multiprocessing queues in Python 3.12+ on macOS
#153050 commented on Jun 16, 2025 • 0 new comments
DISABLED test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn)
#75648 commented on Jun 16, 2025 • 0 new comments
Running dispatch modes on compile-disabled regions of a compiled model
#155825 commented on Jun 16, 2025 • 0 new comments
[MPS] Migrate torch.sort to Metal shader
#155560 commented on Jun 16, 2025 • 0 new comments
[DTensor] DTensor is not well supported on older versions of GPUs, such as A10
#155657 commented on Jun 16, 2025 • 0 new comments
Add support for MaxPool3D on the MPS backend
#100674 commented on Jun 16, 2025 • 0 new comments
Graph Partition Issue Tracker
#151832 commented on Jun 16, 2025 • 0 new comments
Segmentation error for torch==2.2.1 on MacOs
#121101 commented on Jun 16, 2025 • 0 new comments
Suggestion: integration of einops test suite
#146782 commented on Jun 16, 2025 • 0 new comments
test vLLM with PyTorch 2.8rc before releasing PyTorch 2.8
#155933 commented on Jun 16, 2025 • 0 new comments
[inductor] Add typing to _inductor/ir.py
#149958 commented on Jun 19, 2025 • 0 new comments
gloo: fix building system gloo with CUDA/HIP
#146637 commented on Jun 15, 2025 • 0 new comments
Add MPS OpInfo db, rework test_mps to use OpInfo
#145955 commented on Jun 16, 2025 • 0 new comments
[WIP] Allow generation of inductor backend specific tests using instantiate_device_type_tests
#145873 commented on Jun 21, 2025 • 0 new comments
NJT support for cat() on the ragged dim
#145778 commented on Jun 18, 2025 • 0 new comments
Fix full_like decomposition to preserve strides
#144765 commented on Jun 20, 2025 • 0 new comments
[Reopen] [Intel GPU] Set higher tolerance for some models only on XPU Device
#144756 commented on Jun 16, 2025 • 0 new comments
[BE][PYFMT] remove `black`: finish `black -> ruff format` migration
#144557 commented on Jun 16, 2025 • 0 new comments
[BE][PYFMT] migrate PYFMT for `test/[i-z]*/` to `ruff format`
#144556 commented on Jun 16, 2025 • 0 new comments
[BE][PYFMT] migrate PYFMT for `test/[a-h]*/` to `ruff format`
#144555 commented on Jun 16, 2025 • 0 new comments
[BE][PYFMT] migrate PYFMT for `torch/[p-z]*/` to `ruff format`
#144552 commented on Jun 16, 2025 • 0 new comments
Support Swiglu for Module and functional
#144465 commented on Jun 21, 2025 • 0 new comments
[ci] Add riscv opt-int build
#143979 commented on Jun 17, 2025 • 0 new comments
Defaults to C++20 in CMake torch targets
#143959 commented on Jun 15, 2025 • 0 new comments
[Don't Review] Test CI
#139971 commented on Jun 19, 2025 • 0 new comments
`has_triton`: Use the device interface for detecting Triton availability
#139171 commented on Jun 16, 2025 • 0 new comments
Fix `USE_STATIC_MKL` lost functionality
#138996 commented on Jun 19, 2025 • 0 new comments
[Docker] Create an independent dependecies layer
#138612 commented on Jun 18, 2025 • 0 new comments
[pytree] add `treespec_{leaf,tuple,dict}` functions for args_spec modification
#138214 commented on Jun 18, 2025 • 0 new comments
Add TORCH_CHECK_INDEX in convert_indices_from_coo_to_csr_cpu
#138068 commented on Jun 18, 2025 • 0 new comments
[pytree] Add public pytree module `torch.utils.pytree`
#137400 commented on Jun 18, 2025 • 0 new comments
Avoid sqrt calculations with values less than zero
#136824 commented on Jun 21, 2025 • 0 new comments
[Inductor] auto-chunker
#136702 commented on Jun 17, 2025 • 0 new comments
Remove deprecated jit code
#131296 commented on Jun 20, 2025 • 0 new comments
[DTensor] decomposed sharding propagation
#130887 commented on Jun 19, 2025 • 0 new comments
[inductor] enable bf32 for mkldnn linear pointwise/binary in inductor
#127294 commented on Jun 22, 2025 • 0 new comments
[inductor] enable bf32 test for mkldnn conv
#127293 commented on Jun 22, 2025 • 0 new comments
[AOTAutograd] tweak min-cut partitioner to avoid saving softmax output
#126348 commented on Jun 18, 2025 • 0 new comments
[draft] Add support in Flex for non-contiguous NJT
#149892 commented on Jun 18, 2025 • 0 new comments
cd: Add script for generating binary build matrix
#149830 commented on Jun 16, 2025 • 0 new comments
[Inductor] Restrict block analysis to only match integer dims and strides
#149615 commented on Jun 20, 2025 • 0 new comments
Generalize AllocatorConfig to be device-agnostic
#149601 commented on Jun 18, 2025 • 0 new comments
[ROCm] support experimental CU carveout
#149466 commented on Jun 21, 2025 • 0 new comments
Use mypy 1.15
#149426 commented on Jun 20, 2025 • 0 new comments
Fix B018 Useless Expressions in Multiple Files (#106571)
#149408 commented on Jun 17, 2025 • 0 new comments
[cuDNN][SDPA] cuDNN SDPA refactor/cleanup, nested tensor backward, test priority bump for `sm90`, `sm100`
#149282 commented on Jun 17, 2025 • 0 new comments
PaddedTensor Init
#149140 commented on Jun 18, 2025 • 0 new comments
[Intel GPU] Allow XPU backend in Depthwise_conv2d&3d operators
#149114 commented on Jun 21, 2025 • 0 new comments
remove guard_size_oblivious from unbind.
#148815 commented on Jun 21, 2025 • 0 new comments
Trunk workflow for Windows Arm64
#148753 commented on Jun 17, 2025 • 0 new comments
[BE][pytree] cleanup parameterized pytree tests
#148569 commented on Jun 18, 2025 • 0 new comments
[triton hash update] update the pinned triton hash
#148492 commented on Jun 22, 2025 • 0 new comments
[BE][pytree] rename argument name in register function to match the type annotations: `*_fn -> *_func`
#148484 commented on Jun 18, 2025 • 0 new comments
[BE][pytree] rename `NodeDef` member to match the type annotations: `*_fn -> *_func`
#148474 commented on Jun 18, 2025 • 0 new comments
[pytree] simplify public API exposition with `__module__`
#148328 commented on Jun 18, 2025 • 0 new comments
[BE][PYFMT] migrate PYFMT for `test/inductor/` to `ruff format`
#148186 commented on Jun 16, 2025 • 0 new comments
[pytree] add another simplified pytree module `torch.pytree`
#148180 commented on Jun 18, 2025 • 0 new comments
[test] compile cmd
#147470 commented on Jun 18, 2025 • 0 new comments
Small scheduler refactor
#147410 commented on Jun 21, 2025 • 0 new comments
[MPS] Fix incorrect size for uint3 arg
#147325 commented on Jun 16, 2025 • 0 new comments
[MPS] Fix metallib embedding in static builds
#147324 commented on Jun 16, 2025 • 0 new comments
Add ppc64le wheel build support
#147194 commented on Jun 20, 2025 • 0 new comments
Fix the Problems About Defining Static Variable in Inline Function
#147095 commented on Jun 20, 2025 • 0 new comments
Porting Pytorch to AIX Operating System.
#146983 commented on Jun 19, 2025 • 0 new comments
Optimize isclose() for CPU and GPU by adding specific implementations
#146656 commented on Jun 17, 2025 • 0 new comments
DISABLED test_slice_scatter_reinplace_cuda (__main__.GPUTests)
#145189 commented on Jun 20, 2025 • 0 new comments
[AC] torch.utils.checkpoint.CheckpointError from HF qwen2
#155171 commented on Jun 20, 2025 • 0 new comments
DISABLED test_non_contiguous_input_mm_plus_mm (__main__.TestMaxAutotune)
#126867 commented on Jun 20, 2025 • 0 new comments
Migrating existing backend-MAIA integration toward PrivateUse1 / openReg
#155864 commented on Jun 20, 2025 • 0 new comments
Multi-dimensional tensors in datasets might get incorrectly flattened when fetching data from dataloader which is specified 'batch_sampler' when created
#154810 commented on Jun 20, 2025 • 0 new comments
[torch.export] Cannot export TorchVision fasterrcnn_mobilenet_v3_large_fpn
#146152 commented on Jun 20, 2025 • 0 new comments
CUDA 12.6 Inductor accuracy test failures
#148699 commented on Jun 20, 2025 • 0 new comments
`setup.py develop` command is disappearing soon from `setuptools`
#152276 commented on Jun 20, 2025 • 0 new comments
[release] Make pytorch source distribution package respect pep-0517
#150461 commented on Jun 20, 2025 • 0 new comments
[CI] [anaconda] Docker files have conda environment installed
#148335 commented on Jun 20, 2025 • 0 new comments
[CI] [anaconda] CI Build and Test scripts MacOS
#148340 commented on Jun 20, 2025 • 0 new comments
[Docs] [anaconda] Review and update
#148339 commented on Jun 20, 2025 • 0 new comments
[CI] [anaconda] CI Build and Test scripts Windows
#148338 commented on Jun 20, 2025 • 0 new comments
[CI] [anaconda] CI Build and Test scripts Linux
#148336 commented on Jun 20, 2025 • 0 new comments
Deprecation notice of `torch.norm` and `Tensor.norm` across the documentation
#156005 commented on Jun 20, 2025 • 0 new comments
[ONNX] broadcast_in_dim: model (ReDimNet)
#138313 commented on Jun 20, 2025 • 0 new comments
`torch.onnx.export` (dynamo=False) fails with uninformative error when exporting `apply_rotary_pos_emb`/`repeat_interleave`
#145100 commented on Jun 20, 2025 • 0 new comments
[ONNX Convert] Error when input to nn.AdaptiveAvgPool2d size is variable
#147720 commented on Jun 20, 2025 • 0 new comments
[export] Decomp failure when running `aten.item.default`
#150823 commented on Jun 20, 2025 • 0 new comments
[ONNX] Use dlpack to transfer tensors when onnxruntime implements proper support
#151064 commented on Jun 20, 2025 • 0 new comments
[ONNX] Simple torch.nn.Identity onnx export with dynamo=True does not load
#151017 commented on Jun 20, 2025 • 0 new comments
DISABLED test_sdpa_mask_fp16_L6_S17_NH23_HS121 (__main__.TestSDPA)
#138905 commented on Jun 20, 2025 • 0 new comments
The docstring linter should not force overridden methods to be documented
#151692 commented on Jun 20, 2025 • 0 new comments
Remove redundant type aliases of _device for torch.Device
#152952 commented on Jun 20, 2025 • 0 new comments
[torch.compile][Megatron] Error with Megatron with Pytorch v2.5.0 using `AOTAutograd` and `torch.compile`
#141783 commented on Jun 20, 2025 • 0 new comments
[ONNX] ONNX export of simple quantized model fails
#113817 commented on Jun 20, 2025 • 0 new comments
Make streams used for NCCL operations configurable
#67158 commented on Jun 19, 2025 • 0 new comments
allow to use bf16 as fp32 internal precision for mkldnn conv backward
#126054 commented on Jun 22, 2025 • 0 new comments
allow to use bf16 as fp32 internal precision for mkldnn conv
#126050 commented on Jun 22, 2025 • 0 new comments
refine fp32 precision api
#125888 commented on Jun 22, 2025 • 0 new comments
Automated submodule update: FBGEMM
#115316 commented on Jun 21, 2025 • 0 new comments
[pytree] support PyStructSequence types for Python pytree
#113258 commented on Jun 19, 2025 • 0 new comments
Automated submodule update: kineto
#106149 commented on Jun 19, 2025 • 0 new comments
Support building pytorch using MKL ILP64 model.
#102613 commented on Jun 19, 2025 • 0 new comments
Support sparse COO/CSR/CSC/BSR/BSC return values in gradcheck input function
#97825 commented on Jun 19, 2025 • 0 new comments
Online softmax is disabled on the fly
#153241 commented on Jun 21, 2025 • 0 new comments
[Tracker] Support flash attention fa3 ABI stable w/ libtorch
#154908 commented on Jun 21, 2025 • 0 new comments
[ONNX] exported nodes of Multi-head attention can be simplified
#151209 commented on Jun 21, 2025 • 0 new comments
General MPS op coverage tracking issue
#77764 commented on Jun 21, 2025 • 0 new comments
CompiledFxGraph.current_callable is not thread-safe
#138961 commented on Jun 21, 2025 • 0 new comments
MPS operator coverage tracking issue (2.6+ version)
#141287 commented on Jun 21, 2025 • 0 new comments
[RFC] Use CUDA graphs by default on torch.compile
#121968 commented on Jun 21, 2025 • 0 new comments
Segmentation fault when calling `torch.choose_qparams_optimized()` with empty tensors and extreme num_bins value
#153326 commented on Jun 21, 2025 • 0 new comments
Tensor.lerp inconsistent when using -Infinity between MPS and CPU
#111374 commented on Jun 21, 2025 • 0 new comments
Looking for valid compiling option for extension based on torch-2.1.0+cpu.cxx11.abi
#143780 commented on Jun 21, 2025 • 0 new comments
Division by zero in ONNX export with `dynamo=True` leading to NaN outputs
#150623 commented on Jun 21, 2025 • 0 new comments
High-performance LLM quantization on X86 CPU with native PyTorch
#155435 commented on Jun 21, 2025 • 0 new comments
`torch.jit.script` models with `Dict[str, Tensor]` return cannot be exported via `torch.onnx.export` without `dynamo=True`, and error message is unclear
#155091 commented on Jun 21, 2025 • 0 new comments
as_tensor of list of tensors should keep grad history
#155983 commented on Jun 21, 2025 • 0 new comments
Segfault after clearing Dynamo Cache
#155057 commented on Jun 21, 2025 • 0 new comments
MSE documentation is weak
#88327 commented on Jun 20, 2025 • 0 new comments
torch.export does not support torchaudio.transforms.Spectrogram
#112844 commented on Jun 20, 2025 • 0 new comments
UNSTABLE pull / cuda12.8-py3.10-gcc9-sm75 / test (pr_time_benchmarks)
#153987 commented on Jun 20, 2025 • 0 new comments
DISABLED test_inductor_reduce_scatter_tensor_coalesced (__main__.CompileTest)
#147887 commented on Jun 20, 2025 • 0 new comments