Pulse · pytorch/pytorch · GitHub

May 23, 2025 – June 23, 2025

Overview

539 Active pull requests

1,077 Active issues

1 Release published by 1 person

v2.7.1 PyTorch 2.7.1 Release, bug fix release
published Jun 4, 2025

10 Pull requests merged by 4 people

Bump requests from 2.32.2 to 2.32.4 in /.github
#155491 merged Jun 16, 2025
Bump pillow from 10.0.1 to 10.3.0 in /.github/requirements
#154416 merged Jun 4, 2025
Revert "Temporarily disable sparse tensor validation when loading from external storage."
#154755 merged May 30, 2025
Revert "Add optional check_pinning argument to _validate_sparse_compressed_tensor/coo_args"
#154751 merged May 30, 2025
Add optional check_pinning argument to _validate_sparse_compressed_tensor/coo_args
#154617 merged May 30, 2025
Temporarily disable sparse tensor validation when loading from external storage.
#154600 merged May 29, 2025
set thread_work_size to 4 for unrolled kernel
#154541 merged May 29, 2025
[c10d] Fix extra CUDA context created by barrier
#152834 merged May 27, 2025
[c10d] Add more tests to prevent extra context
#154179 merged May 27, 2025
[CI] Remove the xpu env source for linux binary validate
#154409 merged May 27, 2025

529 Pull requests opened by 236 people

Revert D74898941 (#154188)
#154203 opened May 23, 2025
[c10d] Add more tests to prevent extra context
#154204 opened May 23, 2025
Fix for ISSUE #153069
#154211 opened May 23, 2025
Support deterministic upsample trilinear backward
#154239 opened May 23, 2025
[Dynamo] [FrozensetSubclass] Add support for user defined frozensets
#154263 opened May 23, 2025
[MTIA Aten Backend][3/n] Migrate mm.out from out-of-tree to in-tree
#154277 opened May 23, 2025
Add Support for transposed convolution with Padding Mode 'same' in C++
#154279 opened May 23, 2025
[dynamo] control one_graph behavior additionally through config
#154283 opened May 23, 2025
[dynamo] add set_fullgraph decorator/context manager
#154289 opened May 23, 2025
[cuBLAS][cuBLASLt] Reduce scale of inputs for reduced precision reduction matmul test
#154293 opened May 24, 2025
Aten vector default constructors set to 0, add fnmadd and fnmsub
#154298 opened May 24, 2025
[MTIA Aten Backend][1.2/n] Migrate remaining view ops, which all need explicit register in `native_functions.yaml`
#154308 opened May 24, 2025
[DO NOT MERGE] Oboarding lab pt 3 4
#154315 opened May 25, 2025
feat: add torch.export .save/.load to support `safetensors` and/or weights_only=True
#154330 opened May 25, 2025
Fix: fallback in deserialize_torch_artifact for ScriptObject using weights_only=False
#154333 opened May 26, 2025
[MTIA Aten Backend][1.2/n] Migrate as_strided to in-tree, and add unit tests
#154334 opened May 26, 2025
[MTIA Aten Backend][1.3/n] Migrate remaining view ops, which all need explicit register in `native_functions.yaml`
#154335 opened May 26, 2025
Add dispatch log for torch benchmark
#154338 opened May 26, 2025
Enhance testing infrastructure to add half-precision support for `histc` on XPU
#154339 opened May 26, 2025
[Reland] [Intel GPU] Make SDPA output has the same stride as Query.
#154340 opened May 26, 2025
Allow decomposeK to fuse
#154349 opened May 26, 2025
Draft subgraph fusion
#154350 opened May 26, 2025
Add `torch.segment_reduce` docs
#154352 opened May 26, 2025
Ensure Dynamo can trace through explicit dunder method call
#154366 opened May 26, 2025
Fix: Ensure writeback handles NO_SHARD correctly by flattening tensors before copying
#154369 opened May 26, 2025
Draft
#154388 opened May 26, 2025
[WIP] Do not unroll reduction when reduced new size is empty.
#154389 opened May 27, 2025
Don't need to handle PyTrace_EXCEPTION in pyProfileFn
#154392 opened May 27, 2025
Updated padding validation in max_pool functions to account for dilation
#154395 opened May 27, 2025
Convert torch.rst to .md
#154438 opened May 27, 2025
[Not for review] Fix rebuild buckets when find_unused_parameters=True for use cases like GAN based models
#154447 opened May 27, 2025
[Inductor] Fix remove_noop_ops pass where the types for the same_meta would differ
#154460 opened May 27, 2025
[WIP] oblivious where
#154468 opened May 27, 2025
[CI] [CUDA] Add CUDA 12.8 eager CI tests
#154469 opened May 28, 2025
[easy] better copy_misaligned_inputs assertion failure message
#154472 opened May 28, 2025
[dynamic shapes] guard_or_false should_swap
#154475 opened May 28, 2025
Replace deprecated `is_compiling` method
#154476 opened May 28, 2025
[cpp wrapper] add AOTI shim for collective ops
#154492 opened May 28, 2025
Fix CI failures from inductor periodic
#154497 opened May 28, 2025
Reland "Collect packages with importlib in collect_env #144616"
#154505 opened May 28, 2025
[precompile] Package dynamo artifacts. [jjwu test copy]
#154510 opened May 28, 2025
hack
#154511 opened May 28, 2025
Fix Float16 CooperativeReduction Test Failure
#154516 opened May 28, 2025
simplify modularindexing
#154523 opened May 28, 2025
antoher test
#154535 opened May 28, 2025
[Generator] Implement generator.__contains__
#154539 opened May 28, 2025
add envvar to bisect number of graphs compiled
#154543 opened May 28, 2025
[cpp_wrapper] Build main and kernel code in separate threads
#154551 opened May 28, 2025
Fixes Issue #154491
#154561 opened May 28, 2025
[dynamo] raise hard error if error is encountered while tracing resume function prologue
#154564 opened May 28, 2025
Always set CPU affinity for benchmark jobs
#154569 opened May 28, 2025
[WIP][export][cond] support exporting cond with unbacked symint shaped tensor
#154570 opened May 28, 2025
Enable Leak Sanitizer
#154584 opened May 29, 2025
Fix MKL error: Inconsistent configuration parameters
#154585 opened May 29, 2025
Bump mimalloc to v3.0.3
#154594 opened May 29, 2025
Add `scale` complex type check in `quantize_per_tensor`
#154601 opened May 29, 2025
deprecate MTIA_WORKLOADD from pytorch
#154609 opened May 29, 2025
[et/triton kernel] Assign correct stream_id from cudaStream
#154640 opened May 29, 2025
[1/n]adding torch.distributed.run option to provide destination for event logging
#154644 opened May 29, 2025
Add support for tracing vmap in pre-dispatch export
#154650 opened May 29, 2025
Add pg transport and tests
#154653 opened May 29, 2025
Extract DeviceType to a standalone header file
#154654 opened May 29, 2025
[forward fix] add support for MemoryFormat after type tightening
#154658 opened May 29, 2025
pin torchao in torchbench jobs
#154665 opened May 29, 2025
[Graph Partition] Turn-on in OSS by default
#154667 opened May 29, 2025
Experimental: torch.vulkan.compile_shader
#154676 opened May 29, 2025
NOCOMMIT: hack to allow allocating SSBO-backed Vulkan Tensors from Python
#154677 opened May 29, 2025
Experimental: also support SSBO calling convention in torch.vulkan.compile_shader
#154678 opened May 29, 2025
[cp] test: add memory snapshot for flex_attention tests
#154691 opened May 29, 2025
[vision hash update] update the pinned vision hash
#154694 opened May 30, 2025
[WIP][export] assume standard slice for non-strict
#154699 opened May 30, 2025
[Dynamo] Support torch dynamo for PrivateUse1 backend
#154702 opened May 30, 2025
Improve `lr_scheduler` epoch type and description
#154706 opened May 30, 2025
Type hints for distributions/constraints
#154711 opened May 30, 2025
[DO NOT merge] windows ami test
#154729 opened May 30, 2025
[Wheel Variant] Experimental Support
#154733 opened May 30, 2025
Script for consolidation of sharded safetensor files
#154743 opened May 30, 2025
Add NestedTensorHPU to to_padded_tensor and _nested_tensor_storage_offsets in native_functions.yaml
#154744 opened May 30, 2025
[dynamo] fix debugging code_parts for relational guards
#154753 opened May 30, 2025
Fix in pytorch do_bench_using_profiling
#154766 opened May 30, 2025
[dynamo] fix selecting shape guards
#154772 opened May 30, 2025
ci: Refactor reuse_old_whl to use python stdlib
#154777 opened May 31, 2025
[dynamo] fix set_fullgraph for nested calls
#154782 opened May 31, 2025
Fix setuptools
#154788 opened May 31, 2025
Add note about using latest clangd for IDE support
#154790 opened May 31, 2025
[dict] Add dict.popitem
#154793 opened May 31, 2025
[dict] Allow Dynamo to trace through explicit dict dunder method call
#154794 opened May 31, 2025
[BE][Ez]: Fully type nn.utils.clip_grad
#154801 opened May 31, 2025
add fixes for missing headers, setup.py for local build
#154812 opened Jun 1, 2025
[BE]: Try to enable LTO
#154819 opened Jun 1, 2025
distributions/constraints type annotations + public classes + some refactoring
#154827 opened Jun 1, 2025
Fix circular imports
#154832 opened Jun 2, 2025
[dynamo] Refactor TorchCtxManagerClassVariable to use dispatch map fo…
#154833 opened Jun 2, 2025
Add serialization support for register_constant
#154834 opened Jun 2, 2025
[Inductor] Fix CUDAGraphTree input/output aliasing bug in torch.compile with inductor backend
#154839 opened Jun 2, 2025
Fix DataLoader to Pass List to getitems When Using BatchSampler. Fixes Issue_#154810
#154844 opened Jun 2, 2025
[TorchGen] Add explicit op level CPU fallback feature via env variables
#154854 opened Jun 2, 2025
[ROCm] SDPA fix mem fault when dropout is enabled
#154864 opened Jun 2, 2025
[CI] Removing --user flag from all pip install commands
#154900 opened Jun 2, 2025
[dict] Implement dict.__eq__ and dict.__ne__
#154903 opened Jun 2, 2025
[DO NOT MERGE] Test mi300 tests on IBM cluster
#154907 opened Jun 2, 2025
[WIP] cast to bf16 before mul op in flex bwd
#154922 opened Jun 2, 2025
[Inductor] Re-run torchgen/fuse/gen_patterns for DistilBert
#154923 opened Jun 2, 2025
[inductor] Add CSE.get_prefix to allow customized behavior
#154924 opened Jun 2, 2025
inductor: add wrap_expr_for_assignment_to_var
#154925 opened Jun 2, 2025
RFC: Prototype Vulkan inductor backend
#154926 opened Jun 2, 2025
Remove CUDA 11.8 CI code
#154937 opened Jun 3, 2025
[dict] Implement dict.__eq__ and dict.__ne__
#154942 opened Jun 3, 2025
[OrderedDict] Implement explicit OrderedDict dunder method call
#154943 opened Jun 3, 2025
[DO NOT LAND] all_gather_copy_in for cpu offload
#154959 opened Jun 3, 2025
Remove unsafe PyTorchError constructor
#154961 opened Jun 3, 2025
update the baseline for nightly max_autotune tests
#154973 opened Jun 3, 2025
[Intel GPU] Refactor Matmul integration: Modularize bias handling and memory creation
#154977 opened Jun 3, 2025
Avoid differing results in `linalg.(tensor_)solve`
#154983 opened Jun 3, 2025
change cpu_blas and cmakefile to support openblas bgemm
#155000 opened Jun 3, 2025
[c10] buffered atomic stack
#155012 opened Jun 3, 2025
Add functionalize util
#155042 opened Jun 3, 2025
[cp] test: understand why flex_attention doesn't get dispatched in the assumed way
#155059 opened Jun 3, 2025
Add type validation for alpha and inplace in nn.CELU
#155061 opened Jun 3, 2025
[test] lintrunner thing
#155062 opened Jun 3, 2025
Add a crash handler to async compile subprocesses
#155068 opened Jun 3, 2025
[DO NOT MERGE] test mi300 workflows on vultr cluster
#155069 opened Jun 3, 2025
Supporting compilation of distributed_c10d.send and distributed_c10d.recv
#155070 opened Jun 3, 2025
[dict] Implement dict.__ior__ and fix return type in dict.__or__
#155072 opened Jun 3, 2025
[WIP][dynamic shapes] guard_or_false for are_strides_like_channel_last
#155076 opened Jun 3, 2025
[ca] cpp tensor pre hooks
#155082 opened Jun 3, 2025
Deprecate c10::string
#155084 opened Jun 4, 2025
Add device check in `mse_loss`
#155089 opened Jun 4, 2025
docs: link to Nvidia Container Toolkit in README
#155102 opened Jun 4, 2025
Adapting pipeline parallelism test cases to be device agnostic
#155108 opened Jun 4, 2025
[Quant][CPU] fix fake_quantize_per_tensor_affine of inf values
#155109 opened Jun 4, 2025
Convert to markdown: cpp_extension.rst, cpp_index.rst, cpu.rst, cuda_environment_variables.rst, cuda._sanitizer.rst
#155110 opened Jun 4, 2025
Fixes #154982: add missing to_result_dtype in vector_norm
#155111 opened Jun 4, 2025
Issue warning with reference to user code rather than torch
#155112 opened Jun 4, 2025
Fix conversion of values in libtorch agnostic tests
#155115 opened Jun 4, 2025
[TESTING] [DO NOT MERGE] Updated triton commit pin
#155117 opened Jun 4, 2025
[docs] Decorator to create a deprecation warning
#155127 opened Jun 4, 2025
[MTIA Aten Backend] Migrate set_.source_Storage and set_.source_Tensor
#155144 opened Jun 4, 2025
[OrderedDict] Implement `OrderedDict.move_to_end(key, last=False)`
#155152 opened Jun 4, 2025
[OrderedDict] Implement `OrderedDict.popitem(last=...)`
#155153 opened Jun 4, 2025
[dict] Implement `__eq__` for dict_items
#155154 opened Jun 4, 2025
[MPS] Add device guard for MPS dispatch key
#155165 opened Jun 4, 2025
[dynamo] handle fullgraph toggle using nested torch.compile
#155166 opened Jun 4, 2025
[WIP][fake tensor] avoid nonzero memo
#155176 opened Jun 5, 2025
adding arg values and arg types to Strobelight USDT
#155185 opened Jun 5, 2025
[WIP][fake tensor] invalidate memos for PropagateUnbackedSymInts
#155187 opened Jun 5, 2025
[PT] support custom all_gather and reduce_scatter comms
#155189 opened Jun 5, 2025
Add UT for torch.accelerator memory-related API
#155200 opened Jun 5, 2025
Add AMD AWS runners to inductor performance tests
#155206 opened Jun 5, 2025
Add is_hidden_event method to KinetoEvent Python interface
#155214 opened Jun 5, 2025
[PrivateUse1] Optimize 3rd party backend experiences
#155215 opened Jun 5, 2025
Add more logging
#155219 opened Jun 5, 2025
Add serialized_type_name to torch.return_types.* so we can dump them
#155245 opened Jun 5, 2025
updated adafactor doc #154862
#155248 opened Jun 5, 2025
Add STD_TORCH_CHECK to torch::standalone
#155253 opened Jun 5, 2025
[PowerPC] Fixed build issue for vsx vec256 complexfloat and scaled_mm_out_cpu
#155255 opened Jun 5, 2025
Add STD_TORCH_CHECK to torch::standalone [imported]
#155258 opened Jun 5, 2025
[Dynamo] Add CPython default dict tests
#155263 opened Jun 5, 2025
higher_order_ops.py unimplemented_v2 migration, part1
#155264 opened Jun 5, 2025
[inductor] support linear & layer_norm unbacked
#155267 opened Jun 5, 2025
[WIP][dynamic shapes] if-then-else for meta_select storage offset
#155269 opened Jun 5, 2025
turn off reorder_for_peak_memory in case of collectives
#155271 opened Jun 5, 2025
[#155034] Converted RST files to Markdown
#155287 opened Jun 5, 2025
[BE] Deprecate `search_autotune_cache`
#155302 opened Jun 6, 2025
Use expecttest in test_compiled_optimizers.py
#155308 opened Jun 6, 2025
[einops] Ensure Dynamo can trace through einops
#155310 opened Jun 6, 2025
Try adding bfloat16 to test_nn_lstm
#155338 opened Jun 6, 2025
[oss] Add version to metadata
#155343 opened Jun 6, 2025
[aotd] Support mutations of the same input in fw and bw
#155354 opened Jun 6, 2025
[ATen][CPU][Sparse] Use Third-Party Eigen for sparse add and addmm
#155357 opened Jun 6, 2025
Fix serialization of nans in torch.export
#155359 opened Jun 6, 2025
[AOTInductor] Inherit extern kernels for runtime constant folding
#155361 opened Jun 6, 2025
Try running test_foreach sequentially
#155366 opened Jun 6, 2025
[aot] bw_module for ca: do not clone real buffers/params
#155370 opened Jun 6, 2025
[export] inline into torch.jit.traced nn module
#155381 opened Jun 6, 2025
[Precompile] Integrate PrecompileContext with CompilePackage
#155384 opened Jun 7, 2025
Convert onnx torchscript rst to md
#155390 opened Jun 7, 2025
Convert to markdown: jit_python_reference.rst, jit_unsupported.rst, jit_utils.rst, library.rst
#155404 opened Jun 7, 2025
Support --inplace flag for tools/nightly.py
#155419 opened Jun 8, 2025
[DTensor] Fix aten.all strategy with min instead of sum as the reduce_op
#155420 opened Jun 8, 2025
[scan] Fix issues with scan on CPU and for autograd when implementing an RNN with multiple layers
#155422 opened Jun 8, 2025
[PT2][partitioners] Add aten.split to view_ops list
#155424 opened Jun 8, 2025
[inductor] Improve GEMM loggings
#155427 opened Jun 8, 2025
[Inductor] Fix output discrepancy between Inductor and eager of mean with input of a large size tensor
#155428 opened Jun 8, 2025
Clean up memory management in impl_func_norm
#155432 opened Jun 9, 2025
Convert sparse rst to md
#155438 opened Jun 9, 2025
Use unpack instructions for vec256 (de)interleave2
#155440 opened Jun 9, 2025
[1/n] refactor the ring attention implementation
#155441 opened Jun 9, 2025
[2/n] rewrite load balancing and sharding in context parallel
#155442 opened Jun 9, 2025
[test][do not merge] test on MI300 for #125888
#155445 opened Jun 9, 2025
[Profiler] Fix lost C call events problem in Python 3.12.0-3.12.4
#155446 opened Jun 9, 2025
Update slow tests
#155448 opened Jun 9, 2025
[test][do not merge] test on main on MI300, compared with for #125888
#155449 opened Jun 9, 2025
Quiet Inductor #135521
#155450 opened Jun 9, 2025
[Dynamo] Enable torch function dispatch on HOPs
#155452 opened Jun 9, 2025
[ROCm] skip convolution tests on Navi, enable batch_norm_with_update
#155454 opened Jun 9, 2025
[proxy_tensor] Do not clobber tensor proxies for inplace ops
#155456 opened Jun 9, 2025
[PyTorch][NCCLX] expose nccl_nonblocking_timeout for NCCLX PG building to reuse NCCLUtils macro
#155487 opened Jun 9, 2025
Add torch._C._log_api_usage_once to datapipes (mapper)
#155489 opened Jun 9, 2025
XCCL changes for DDP
#155497 opened Jun 9, 2025
[OrderedDict] Implement `hasattr(..., IteratorVariable)`
#155501 opened Jun 9, 2025
[OrderedDict] Set the correct dict class in UserDefinedDictVariable
#155502 opened Jun 9, 2025
[OrderedDict] Add `bool(OrderedDict)`
#155503 opened Jun 9, 2025
[inductor] Fix propagating torch.utils._sympy.functions.Identity in IndexPropagation
#155504 opened Jun 9, 2025
Making implicit packages explicit (torch)
#155505 opened Jun 10, 2025
[WIP] Port dynamo test cases for xpu backend.
#155524 opened Jun 10, 2025
Update MAIAHooksInterface to pin host memory in MAIA device
#155541 opened Jun 10, 2025
[Inductor] Add decomposition for aten.mul
#155542 opened Jun 10, 2025
[Misc] fix distributed/_tools/test_sac_ilp.py::TestSACILP::test_sac_i…
#155548 opened Jun 10, 2025
FractionalMaxPool3d add kernel_size check
#155549 opened Jun 10, 2025
fix: 155029 convert rst to md
#155554 opened Jun 10, 2025
WIP add support for dynamic shapes
#155557 opened Jun 10, 2025
Implement guard collectives
#155558 opened Jun 10, 2025
[Misc] skip the case test_foreach_add_different_mesh if world size is…
#155563 opened Jun 10, 2025
[ROCm][SymmetricMemory] Avoid bf16 to float conversion during reduce
#155587 opened Jun 10, 2025
Compute contiguity symbolically to avoid dde, and introduce c++ is_contiguous_or_false.
#155590 opened Jun 10, 2025
[do NOT land] torch_function_mode + flex_attention dispatch test
#155594 opened Jun 10, 2025
[aoti][mps] Enable test_aot_inductor.py tests
#155598 opened Jun 10, 2025
[do NOT land] DTensor + torch_function_mode + flex_attention dispatch test
#155600 opened Jun 10, 2025
Remove unnecessary MPSStream initialization
#155602 opened Jun 10, 2025
[DRAFT] Evaluate feasability of using FunctionalTensor for Example Value
#155606 opened Jun 10, 2025
[dict] Implement dict subclass `fromkeys` classmethod
#155608 opened Jun 10, 2025
[dynamo] added github_cli to detect unimplemented_v2 calls
#155610 opened Jun 10, 2025
[Scripts] Add refresh script to clean, pull and build repo
#155639 opened Jun 10, 2025
[WIP][PGO] exclude optimizer state from PGO whitelist
#155643 opened Jun 10, 2025
[cond] preserve merged phs meta for subgraph
#155644 opened Jun 10, 2025
[cond] auto_functionalize cond
#155645 opened Jun 10, 2025
DOC: update CrossEntropyLoss with note and example of incorrect target specification
#155649 opened Jun 11, 2025
Fix UB in BFloat16 round_to_nearest_even
#155650 opened Jun 11, 2025
Make upsample accept list scale_factor
#155654 opened Jun 11, 2025
Fix cudagraph record_stream memory leak
#155658 opened Jun 11, 2025
[Misc] handle sys exit caused by skip_if_lt_x_gpu in test_composabili…
#155665 opened Jun 11, 2025
[Optimus] add einsum_to_pointwise_pass pattern
#155666 opened Jun 11, 2025
[inductor] Make size_hint fallback parameter required and add size_hi…
#155669 opened Jun 11, 2025
Fixed NLLLoss 1D input crash with torch.compile
#155672 opened Jun 11, 2025
Document `Flop Counter Mode` in torch.utils
#155673 opened Jun 11, 2025
[Quant][CPU] Enable fp8 qlinear
#155678 opened Jun 11, 2025
[fsdp] fix: fix optim_state_dict with FSDP model not on global rank 0
#155685 opened Jun 11, 2025
Remove unused nonlocal declarations from checkpoint and library helper functions
#155686 opened Jun 11, 2025
[CUDA][MAGMA][Linalg][WIP] Remove MAGMA
#155694 opened Jun 11, 2025
[CI] Use `setup-python` from for Mac tests
#155698 opened Jun 11, 2025
Clean up HF components
#155707 opened Jun 11, 2025
docs: clean up docstring for clarity and correctness
#155712 opened Jun 11, 2025
Remove actionable label from docathon label sync script
#155713 opened Jun 11, 2025
[2/2] proxy_tensor do not clobber for mutating ops
#155716 opened Jun 11, 2025
[FP8] Fix Benchmarking for certain Priors
#155722 opened Jun 11, 2025
HOP py_impl register to tensor subclass cannot dispatch
#155726 opened Jun 11, 2025
Adds number of channels check in PixelShuffle
#155728 opened Jun 11, 2025
[MPS] Add regression test for memory leak in nn.MaxPool2d
#155730 opened Jun 11, 2025
[CI] Remove conda from from windows
#155731 opened Jun 11, 2025
Fix torch.export.export() GPU failure with RNN modules.
#155734 opened Jun 11, 2025
[MPS] Activation kernels: do compute at float precision
#155735 opened Jun 11, 2025
Overload `mul_overflows` for `size_t`
#155736 opened Jun 11, 2025
torch.distributed TCP bind address
#155741 opened Jun 11, 2025
[Torch Package] Make get names of OrderedImporters support fallback to importers
#155743 opened Jun 11, 2025
Add Windows CUDA 12.9.1 build
#155748 opened Jun 11, 2025
[refactor] simplify the implementation of check_input_alias_and_mutation_return_outputs
#155749 opened Jun 11, 2025
[dynamo][guards] Skip dispatch key guards for requires_grad=False
#155756 opened Jun 11, 2025
efficient zero_mask implementation for vec128_*_neon
#155766 opened Jun 12, 2025
add `__annotations__` attribute to `OpOverload`
#155784 opened Jun 12, 2025
add sfdp pattern
#155792 opened Jun 12, 2025
[DONT MERGE][TESTING][1/2] xpu test runner
#155793 opened Jun 12, 2025
[Do not merge] DCP ZOC Test Changes
#155802 opened Jun 12, 2025
[CD] Move build_magma.bat to build_magma.py
#155804 opened Jun 12, 2025
[VibeCoding] Replace clone.bat with clone.ps1
#155805 opened Jun 12, 2025
[VibeCoding] Convert architecture specific batch scripts to PowerShell
#155807 opened Jun 12, 2025
wip
#155810 opened Jun 12, 2025
Refactor DynamoStore into disk and in memory implementations
#155818 opened Jun 12, 2025
[doc] Updates to distributed.md for XCCL backend
#155834 opened Jun 12, 2025
Skip FSDP tests if device count is less then requested world_size value
#155836 opened Jun 12, 2025
[einops] Ensure Dynamo can trace through explicit set dunder method call
#155842 opened Jun 12, 2025
[VibeCoding] build_pytorch.bat to build_pytorch.ps1
#155843 opened Jun 12, 2025
[ATen MTIA backend] Use aten native CPU fallback function on MTIA
#155845 opened Jun 12, 2025
assert statement to check if output_size is not None
#155846 opened Jun 12, 2025
patch PR #151719
#155851 opened Jun 12, 2025
[test][inductor] fix test_conv_cat failure
#155852 opened Jun 12, 2025
[NCCL][P2P] Optionally avoid `recordStream`in P2P comms
#155854 opened Jun 12, 2025
[BE][c10d/Store]add check in pyi
#155855 opened Jun 12, 2025
Optionally avoid `record_streams` in autograd with `TORCH_AUTOGRAD_AVOID_RECORD_STREAMS=1`
#155857 opened Jun 12, 2025
[dynamo] Support builtin bool on non-constant VTs
#155863 opened Jun 12, 2025
[DONT MERGE] Diffusion models benchmarking for compile time
#155866 opened Jun 13, 2025
[cutlass backend] compile and link for .so files
#155876 opened Jun 13, 2025
[NOT FOR MERGE] Exploratory work on AOTInductor training
#155877 opened Jun 13, 2025
Docs: Fix sphinx heading markup in `nn.rst`
#155883 opened Jun 13, 2025
[hop] support torch.func.functional_call in hop subgraph
#155886 opened Jun 13, 2025
[WIP][user triton] AOT inductor support for device-side TMA
#155896 opened Jun 13, 2025
Default USE_PRIORITIZED_TEXT_FOR_LD=1 on Linux aarch64 via setup.py
#155901 opened Jun 13, 2025
test without rocblas conv when using cudagraphs
#155902 opened Jun 13, 2025
Mitigate upcoming removal of direct invocation of setup.py support
#155910 opened Jun 13, 2025
[WIP] Automatic load/save
#155913 opened Jun 13, 2025
Fix argument validation for torch.nn.attention.sdpa_kernel
#155922 opened Jun 13, 2025
[dynamo] Add `-> bool` to functions named `is_*` or `_is_*`
#155923 opened Jun 13, 2025
[inductor] Add `-> bool` to functions named `is_*` or `_is_*`
#155928 opened Jun 13, 2025
updated matplotlib version in docs requirements
#155931 opened Jun 13, 2025
[export] add _union_dataclass to support comparing dataclasses that inherits from union.
#155932 opened Jun 13, 2025
Fix stride comparison max(512 - s, 1) vs. (512 - s)
#155938 opened Jun 13, 2025
consolidate in finish step
#155940 opened Jun 13, 2025
don't do a full deserialize on every file
#155942 opened Jun 13, 2025
[ca] mark some sparse tests fixed by AccumulateGrad functionalization
#155948 opened Jun 13, 2025
TopK workaround when tensor rank - sort axis > 4
#155950 opened Jun 13, 2025
[cuDNN] Adding SDPA tests for cuDNN backend
#155951 opened Jun 13, 2025
NotForLand: LOAF on by default
#155956 opened Jun 13, 2025
[DRAFT][cuDNN][SDPA] Introduce `TORCH_CUDNN_SDPA_AVOID_RECOMPILE=1`
#155958 opened Jun 13, 2025
[ca] default on in CI also for PYTORCH_TEST_WITH_INDUCTOR
#155960 opened Jun 13, 2025
draft: [cp] context_parallel + flex_attention using torch_function and autograd function
#155962 opened Jun 13, 2025
Fix issue with set_reduce_scatter_divide_factor errors and MixedPrecisionPolicy
#155964 opened Jun 13, 2025
[CI] Remove redundant accuracy benchmarks for cpp_wrapper
#155966 opened Jun 13, 2025
[CI][cpp_wrapper] Fix selection of CPU OpInfo tests
#155967 opened Jun 13, 2025
draft: [cp] context_parallel + flex_attention_backward using torch_function and autograd function
#155970 opened Jun 13, 2025
unify dynamic shapes API namings 3 (guard_int, guard_int_seq)
#155973 opened Jun 14, 2025
Fixes for CPython int/float tests
#155978 opened Jun 14, 2025
Handling overflow for long int overflow for the product of kernel_hei…
#155989 opened Jun 14, 2025
[build] modernize build-backend: `setuptools.build_meta:__legacy__` -> `setuptools.build_meta`
#155998 opened Jun 14, 2025
[BE][Ez]: Use ruff type inference to autotype parts of dynamo
#156001 opened Jun 14, 2025
[BE][Ez]: Fix untyped decorator in dcp utils
#156003 opened Jun 14, 2025
add enum for core Backend class
#156004 opened Jun 14, 2025
Allow as_tensor to retain grad info
#156006 opened Jun 14, 2025
feat(cmake): add NCCL version selection based on CUDA version
#156014 opened Jun 15, 2025
[profiler] add more CUDA API for kernel launcher
#156016 opened Jun 15, 2025
[BE] add a minimal linter to check `pyproject.toml` consistency
#156017 opened Jun 15, 2025
[build] modernize build-frontend: `python setup.py develop/install` -> `[uv ]pip install[ -e] .`
#156027 opened Jun 15, 2025
bmm, topk, cholesky, linalg.norm, max with out variants set causing r…
#156030 opened Jun 15, 2025
[BE][Easy] set end-of-line for `.bat` file to CRLF in `.editorconfig`
#156032 opened Jun 15, 2025
Fix atleast_{1,2,3}d() with no arguments description
#156042 opened Jun 16, 2025
[BE][Easy][setup] wrap over long error messages and redirect them to `stderr` in `setup.py`
#156043 opened Jun 16, 2025
[BE][Easy][setup] use `super().method(...)` in command subclasses in `setup.py`
#156044 opened Jun 16, 2025
[build] remove upper version pin for `setuptools<80.0`
#156049 opened Jun 16, 2025
Update URL for RPATH documentation
#156060 opened Jun 16, 2025
Support transpose and pack for bit8
#156065 opened Jun 16, 2025
Register hpu device to fake backend
#156076 opened Jun 16, 2025
revamp dtype documentation for 2025
#156087 opened Jun 16, 2025
Convert to markdown: jit.rst
#156094 opened Jun 16, 2025
Add error to intercept crash in issue #154882 on maxpool2d with indices
#156101 opened Jun 16, 2025
[opinfo] Exclude aten_name if its not actually a name
#156104 opened Jun 16, 2025
[opinfo] add overloads to opinfo
#156109 opened Jun 16, 2025
local load/save
#156110 opened Jun 16, 2025
Add debug messages for deps issues during fx splits
#156111 opened Jun 16, 2025
Bump transfomers version
#156118 opened Jun 16, 2025
[ROCm][Inductor][CK] update API for gemm-multiD change
#156122 opened Jun 16, 2025
Display a warning when overwriting `CMAKE_CUDA_ARCHITECTURES`
#156123 opened Jun 16, 2025
[test] re-run CI with complex + Python dispatch key changes
#156131 opened Jun 16, 2025
NOT-FOR-LAND: enable autochunker by default
#156132 opened Jun 16, 2025
[WIP][ci][cutlass backend] Add ci for cutlass backend tests
#156136 opened Jun 16, 2025
Templatize model_container
#156137 opened Jun 16, 2025
Improve documentation for torch.lobpcg
#156139 opened Jun 16, 2025
[cuDNN][64-bit indexing] update conv depthwise 64bit indexing dispatch condition to match native kernel
#156140 opened Jun 17, 2025
[executorch hash update] update the pinned executorch hash
#156141 opened Jun 17, 2025
[dcp_poc] Introduce a new simple rank local checkpointer
#156142 opened Jun 17, 2025
[Docs] Fix indentations in cond.md
#156147 opened Jun 17, 2025
[list] Raise exception in invalid list method call
#156148 opened Jun 17, 2025
Convert bottleneck.rst to markdown
#156149 opened Jun 17, 2025
Optimize dim description in torch.max
#156153 opened Jun 17, 2025
Bump protobuf from 5.29.4 to 5.29.5 in /.ci/docker
#156157 opened Jun 17, 2025
Optimize scatter/gather kernel for ARM.
#156161 opened Jun 17, 2025
Deprecate CUDAAllocatorConfig, use AllocatorConfig instead
#156165 opened Jun 17, 2025
draft: [cp] context_parallel + flex_attention using monkey_patch
#156170 opened Jun 17, 2025
Implementation of a ScannedModule
#156172 opened Jun 17, 2025
[WIP] Add a new API of allocator setting for accelerator
#156175 opened Jun 17, 2025
[NVIDIA] Refactor Family Blackwell Support codegen
#156176 opened Jun 17, 2025
[WIP] Remove legacy aarch64_linux builder in favor of Manylinux
#156178 opened Jun 17, 2025
[TEST] Add Windows cuda 12.9.1 build
#156179 opened Jun 17, 2025
[Native][CPU][TopK] Improve perf by reducing swap operations
#156183 opened Jun 17, 2025
[Inductor] Subgraph as a choice symbolic expression as input
#156185 opened Jun 17, 2025
[TEST] Triton 3.4.0 pin update
#156186 opened Jun 17, 2025
[inductor] Quiesce Triton compile worker pool after each dynamo compile
#156187 opened Jun 17, 2025
Engine reuse calling thread when only single device detected
#156188 opened Jun 17, 2025
Add auto support
#156189 opened Jun 17, 2025
[ROCm] [CK] Composable Kernel integration for ROCm
#156192 opened Jun 17, 2025
[dynamo] allow symints in list.__setitem__
#156197 opened Jun 17, 2025
[Codemod][Folly target clean up] 57
#156198 opened Jun 17, 2025
Fix torch.clamp CPU overflow with float16 tensors
#156199 opened Jun 17, 2025
[CUTLASS] [CUDA] SM100 GroupMM
#156203 opened Jun 17, 2025
[DCP] OSS Zero Overhead Checkpointing Implementation
#156207 opened Jun 17, 2025
[CI] Add prebuild command option, set prebuild command option for CI to build flash attention
#156236 opened Jun 17, 2025
[dynamo] updated version of detecting any differences between PRs unimplemented_v2() callsites and graph_break_registry json file
#156237 opened Jun 17, 2025
Fix constant folding pass for mutable buffer
#156239 opened Jun 17, 2025
Fix `aten::index_put` args Dtensor type mismatch
#156240 opened Jun 17, 2025
[list] Implement `list.remove`
#156242 opened Jun 17, 2025
Extract CPU log_softmax kernels to header
#156243 opened Jun 17, 2025
[EZ/Profiler] Change 'b' to 'B' in FunctionEvent Frontend
#156250 opened Jun 17, 2025
[dynamo] fix some cross-graph-break refleaks in eval_frame
#156252 opened Jun 17, 2025
[Test] Kineto Submodule Update
#156253 opened Jun 17, 2025
Add size_hints_or_throw
#156255 opened Jun 17, 2025
Consolidate stack trace in Tracer
#156257 opened Jun 18, 2025
[invoke_subgraph] make same subgraph share get_attr target
#156260 opened Jun 18, 2025
Convert quantization.rst to markdown
#156266 opened Jun 18, 2025
Add fallback-aware device checking for MPS operations
#156267 opened Jun 18, 2025
[Inductor][CPP backend] Optimize parallel depth algorithm [Don't merge]
#156268 opened Jun 18, 2025
Implement list.__add__ and list.__iadd__
#156270 opened Jun 18, 2025
[list] Add list.__mul__ and list.__imul__
#156271 opened Jun 18, 2025
[Intel GPU] Enable training for SDPA XPU [WIP]
#156272 opened Jun 18, 2025
[inductor] split out triton templates
#156276 opened Jun 18, 2025
[inductor][tma template] subclass workspace arg for choice
#156277 opened Jun 18, 2025
[inductor] add KernelTemplateParams
#156278 opened Jun 18, 2025
[inductor] introduce kernel_inputs
#156279 opened Jun 18, 2025
[inductor][1/2] break out TritonTemplate, TritonTemplateKernel, TritonTemplateCaller out of select_algorithm.py
#156280 opened Jun 18, 2025
[inductor][2/2] break out TritonTemplate, TritonTemplateKernel, TritonTemplateCaller out of select_algorithm.py
#156281 opened Jun 18, 2025
[inductor] heuristics based on kernel templates
#156282 opened Jun 18, 2025
Introduce sync_cross_rank_decision
#156287 opened Jun 18, 2025
[inductor] KernelTemplates report their own KernelParams
#156292 opened Jun 18, 2025
Add cascade sum support for Inductor CPP backend
#156296 opened Jun 18, 2025
[BE][1/16] fix typos in torch/
#156311 opened Jun 18, 2025
[BE][2/16] fix typos in torch/ (torch/_*/)
#156312 opened Jun 18, 2025
[BE][8/16] fix typos in torch/ (torch/csrc/jit/)
#156318 opened Jun 18, 2025
[BE][10/16] fix typos in torch/ (torch/csrc/jit/)
#156320 opened Jun 18, 2025
Address richard's comments on libtorch_stable_abi note
#156324 opened Jun 18, 2025
[TEST] DO not commit
#156326 opened Jun 18, 2025
Migrate c10/macros/cmake_macros.h.in
#156329 opened Jun 18, 2025
Fix native static dispatch kernels
#156331 opened Jun 18, 2025
Validate custom op support for compile_kernel
#156332 opened Jun 18, 2025
[testing] test/run_test.py: Only shutdown pool if it was created
#156333 opened Jun 18, 2025
Storage: add_delete_hook for deregistration
#156338 opened Jun 18, 2025
[list] Add list.__delitem__
#156339 opened Jun 18, 2025
Add private API to modify the tags for a custom operator
#156343 opened Jun 18, 2025
[inductor] set config.min_num_split by default
#156345 opened Jun 18, 2025
[invoke_subgraph] make collect_meta_analysis fake prop cachable
#156347 opened Jun 18, 2025
Add User defined subclass handling to funcitonalize impl
#156349 opened Jun 18, 2025
Build FBGEMM GenAI as part of PyTorch
#156355 opened Jun 18, 2025
[wip]
#156356 opened Jun 18, 2025
[BE] comments + try to get rid of secondary `make_autotune_fn`
#156358 opened Jun 18, 2025
[Codemod][Folly target clean up] 28
#156365 opened Jun 18, 2025
[Codemod][Folly target clean up] 22
#156366 opened Jun 18, 2025
[iter] Update some of the tests to not call pickle
#156369 opened Jun 18, 2025
[iter] exhaust `ListIterator` when `unpack_var_sequence` is called
#156370 opened Jun 18, 2025
[iter] Add support for sequence protocol in `iter(..)`
#156371 opened Jun 18, 2025
Add macos26 beta test runner
#156372 opened Jun 18, 2025
[TSAN][live speech translation] Fix A data race in caffe2
#156378 opened Jun 18, 2025
cub and compile_kernel composition
#156380 opened Jun 19, 2025
Prevent cudaStreamSync when indexing GPU tensors with boolean CPU mask
#156384 opened Jun 19, 2025
[InductorBench] Fix accuracy validation logic for MPS
#156385 opened Jun 19, 2025
Bump urllib3 from 2.2.2 to 2.5.0 in /tools/build/bazel
#156390 opened Jun 19, 2025
Use CMake wholearchive group
#156393 opened Jun 19, 2025
Use CUDA::cupti target
#156396 opened Jun 19, 2025
Added index 0 for ROCR_VISIBLE_DEVICES
#156398 opened Jun 19, 2025
[ez] fix typo in comment
#156402 opened Jun 19, 2025
[Codemod][Folly target clean up] 28 [A]
#156403 opened Jun 19, 2025
[DONT MERGE][TESTING][2/2] test new xpu runner
#156410 opened Jun 19, 2025
Fix storage_offset preservation in clone_preserve_strides
#156415 opened Jun 19, 2025
[iter] support `iter(callable, sentinel)`
#156416 opened Jun 19, 2025
Change t.is_cuda to t.device.type == 'cuda' in torch/utils/viz
#156418 opened Jun 19, 2025
[cc][multi-kernel] attempt 1
#156421 opened Jun 19, 2025
[dm][multi-kernel] attempt 1
#156422 opened Jun 19, 2025
[dm][mk] attempt 2
#156423 opened Jun 19, 2025
[cc][multi-kernel] attempt 2
#156427 opened Jun 19, 2025
[br][mk] attempt 1
#156428 opened Jun 19, 2025
[precompile] Detect source code changes for save/load.
#156432 opened Jun 19, 2025
[dynamo] show frame information when recompilation is triggered on fail_on_recompile
#156433 opened Jun 19, 2025
use cmake target torch instead of ${TORCH_LIBRARIES} in cpp installation docs
#156435 opened Jun 19, 2025
[cc][multi-kernel] attempt 3
#156439 opened Jun 19, 2025
[invoke_subgraph] Add config flag to control support of input mutation
#156450 opened Jun 19, 2025
[cc][multi-kernel] attempt 4
#156452 opened Jun 19, 2025
[WIP]Fallback to CPU for XPU FP64
#156456 opened Jun 19, 2025
Fixes issue #156414: Fixes bug in implementation of _combine_histograms.
#156457 opened Jun 20, 2025
wip Updates to scaled_mm code
#156458 opened Jun 20, 2025
[iter] Wrap iter(..) call in a ObjectIteratorVariable
#156460 opened Jun 20, 2025
[inductor] select_algorithm: add preprocessing fns
#156464 opened Jun 20, 2025
[torchbench] update environment setup script
#156465 opened Jun 20, 2025
WIP: Add `max_pool3d` for MPS
#156467 opened Jun 20, 2025
Debug PR, no need to review
#156468 opened Jun 20, 2025
Docs/update contributing rebase tip
#156469 opened Jun 20, 2025
kernel arg munging attempt
#156470 opened Jun 20, 2025
[wip][inductor] add kernel choice
#156477 opened Jun 20, 2025
[Codemod][Folly target clean up] 22 [B]
#156478 opened Jun 20, 2025
[ROCm][Windows] Fixing undefined symbol linker error after exposing MIOpen symbols
#156479 opened Jun 20, 2025
[WIP] Add device_id to XPU device properties
#156481 opened Jun 20, 2025
Fix torch.onnx.export parameter for onnx_shape_inference (#156480)
#156483 opened Jun 20, 2025
[Profiler] Fix profile_all_threads in debug build
#156484 opened Jun 20, 2025
Add regression test for UnicodeDecodeError in torch.compile with extreme values
#156485 opened Jun 20, 2025
[ROCm][Windows] Skip using rocm-core on Windows case
#156486 opened Jun 20, 2025
[DO NOT MERGE] Update trunk.yml to change the runner that the job runs-on
#156491 opened Jun 20, 2025
[INIT DRAFT] setting up the build for torch/standalone
#156492 opened Jun 20, 2025
Fix type annotations for dim parameter in torch.amin and torch.amax
#156493 opened Jun 20, 2025
[MPS] Optimize cumsum/cumprod metal kernels
#156494 opened Jun 20, 2025
cublaslt/hipblaslt persistent workspace
#156495 opened Jun 20, 2025
add test_batchnorn_2D and 3D tests
#156498 opened Jun 20, 2025
[ROCm] Bump AOTriton to 0.10b
#156499 opened Jun 20, 2025
[Inductor] Fix epilogue fusion decision with 1 Triton caller as choice
#156500 opened Jun 20, 2025
[MTIA Aten Backend] Migrate maximum.out / minimum.out / cos.out / erf.out / exp.out
#156502 opened Jun 20, 2025
Organize BUCK for torch/standalone
#156503 opened Jun 20, 2025
added stubs for jit tree views
#156504 opened Jun 20, 2025
[nativert] Move PrimKernelRegistry to PyTorch core
#156506 opened Jun 20, 2025
[nativert] Move HigherOrderKernel
#156507 opened Jun 20, 2025
[nativert] move layout planner algorithms to libtorch
#156508 opened Jun 20, 2025
[docs][typing] Document and type support for dim=None in torch.amin and torch.amax
#156510 opened Jun 20, 2025
python definitely_contiguous-> is_contiguous_or_false
#156515 opened Jun 20, 2025
Unify dynamic shapes APIs naming 2 (expect_true and check) attempt2
#156518 opened Jun 20, 2025
[aoti] Check longlong upperbound for codegening input size check
#156522 opened Jun 20, 2025
remove gso from set_storage_meta__symint
#156525 opened Jun 20, 2025
[Inductor][CPP] Fix perf regression of functorch_maml_omniglot
#156526 opened Jun 21, 2025
[dynamo] fix segfault due to dangling CacheEntry backend pointer
#156527 opened Jun 21, 2025
[dynamo] Guard eagerly on list objects to avoid guard on getitem index
#156531 opened Jun 21, 2025
Add RoPE (Rotary Positional Embedding) to PyTorch core
#156532 opened Jun 21, 2025
[inductor] Quiesce Triton compile worker pool by default in OSS
#156534 opened Jun 21, 2025
remove allow-untyped-defs from c10d_rendezvous_backend.py
#156536 opened Jun 21, 2025
remove allow-untyped-defs from torch/ao/nn/sparse/quantized/linear.py
#156537 opened Jun 21, 2025
[MTIA Aten Backend] Migrate _log_softmax.out / _log_softmax_backward_data.out
#156539 opened Jun 21, 2025
avoid to declare an unknown bound array without any element
#156543 opened Jun 21, 2025
Enable target-determination (TD) for ROCm CI
#156545 opened Jun 21, 2025
[ddp] improve c++ reducer bucketing readability
#156550 opened Jun 21, 2025
[CUDAGraph] add config `cudagraph_capture_sizes`
#156551 opened Jun 21, 2025
Add fx_graph_runnable tests boilerplate
#156552 opened Jun 21, 2025
[MTIA Aten Backend] Migrate isnan
#156554 opened Jun 22, 2025
Clarify online softmax split reduction limitation and invite contributions (refs #153241)
#156556 opened Jun 22, 2025
Don't use deprecated CUDA.cmake module
#156559 opened Jun 22, 2025
typo
#156560 opened Jun 22, 2025
Implement guard collectives (optimized version)
#156562 opened Jun 22, 2025
[nativert] reland D76832891 remove designated initializer cpp20
#156565 opened Jun 22, 2025
[MPSInductor][BE] Fix multistage reduction check
#156567 opened Jun 22, 2025
[MTIA Aten Backend] Migrate max.dim_max / min.dim_min
#156568 opened Jun 23, 2025
[nativert] Move call_torchbind_kernel
#156571 opened Jun 23, 2025
[WIP][AOTI][Intel GPU] Add XPU quantization ops to AOT Inductor.
#156572 opened Jun 23, 2025
[MTIA Aten Backend] Migrate ge.Tensor_out / ge.Scalar_out
#156573 opened Jun 23, 2025
add torch.concat to normalization pass
#156574 opened Jun 23, 2025
[WIP] Port three dynamo test to Intel GPU
#156575 opened Jun 23, 2025
Fix UT failure on non-cuda backend
#156577 opened Jun 23, 2025
Added philox based RNG context for HPU device in Dtensor scenarios
#156581 opened Jun 23, 2025
[SymmMem] Rename all_to_all_vdev ops
#156582 opened Jun 23, 2025
Update github first merge rule
#156583 opened Jun 23, 2025
[xla hash update] update the pinned xla hash
#156584 opened Jun 23, 2025
[CPU] Fix memory access for sbgemm bf16
#156585 opened Jun 23, 2025
[Profiler] the doc of _ExperimentalConfig is incorrectly truncated by commas
#156586 opened Jun 23, 2025
[OpenReg][1/N] Migrate cpp_extensions_open_device_registration to OpenReg
#156588 opened Jun 23, 2025
[OpenReg][2/N] Migrate cpp_extensions_open_device_registration to OpenReg
#156589 opened Jun 23, 2025
[Doc] remove WSL2 in support matrix for Intel GPU
#156590 opened Jun 23, 2025
[ROCm][Windows] Fix rocsolver undefined symbol error
#156591 opened Jun 23, 2025
[Inductor Dashboard] Enable deterministic algorithms for some models
#156592 opened Jun 23, 2025
[Break XPU] Fix UT failures introduced by community.
#156594 opened Jun 23, 2025
docstring_linter: Fix #151692 and other issues
#156596 opened Jun 23, 2025
[ZENDNN] Integrate ZenDNN library, implement Linear op, add unit-tests
#156599 opened Jun 23, 2025

704 Issues closed by 116 people

`set_reduce_scatter_divide_factor` is inconsistent between FSDP and HSDP
#155903 closed Jun 23, 2025
FSDP2's `set_reduce_scatter_divide_factor` is inconsistent wrt reduce dtype
#155904 closed Jun 23, 2025
DISABLED test_ind_worker_queue (__main__.TestIndividualWorkerQueue)
#68643 closed Jun 23, 2025
DISABLED test_module_and_optimizer_ids (__main__.TestTorchTidyProfiler)
#87581 closed Jun 23, 2025
Provide a way to allow dynamo to trace into an operator defined with `torch.library.custom_op`
#156322 closed Jun 23, 2025
Dynamo benchmark test got failed torch.dtype object has no attribute '__name__'
#156482 closed Jun 23, 2025
[RFC] Migrate to modern Python build system and replace `setup.py` commands with their modern alternatives
#156029 closed Jun 23, 2025
[MPSInductor] Silently incorrect result with varmean+epilogue
#156426 closed Jun 23, 2025
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_complex128 (__main__.TestForeachCUDA)
#151300 closed Jun 23, 2025
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_bool (__main__.TestForeachCUDA)
#151268 closed Jun 23, 2025
DISABLED test_graph_partition (__main__.TritonCodeGenTests)
#148957 closed Jun 23, 2025
DISABLED test_mm_plus_mm (__main__.TestPatternMatcher)
#145335 closed Jun 23, 2025
Segmentation fault (core dumped) in `torch.profiler.profile`
#156564 closed Jun 22, 2025
UNSTABLE inductor / linux-jammy-cpu-py3.9-gcc11-inductor / test (inductor_torchbench_cpu_smoketest_perf)
#156521 closed Jun 22, 2025
Is it possible to serialize a torch.cuda.CUDAGraph into disk or CPU memory
#125820 closed Jun 22, 2025
`scaled_dot_product_attention` backwards: illegal memory access with large inputs
#150054 closed Jun 21, 2025
Using `opset_version = 22` in `torch.onnx.export` with `dynamo=True` includes dropout nodes in the model
#156542 closed Jun 21, 2025
`torch.distributed.pipelining.pipeline` error when initializing on meta device
#156541 closed Jun 21, 2025
Add runtime profiler info for AOTDispatcher prologue
#155721 closed Jun 21, 2025
UNSTABLE inductor-rocm-mi300 / rocm-py3.10-inductor-mi300 / test (inductor)
#154884 closed Jun 21, 2025
UNSTABLE rocm-mi300 / linux-jammy-rocm-py3.10-mi300 / test (default)
#156360 closed Jun 21, 2025
`torch.compile(fullgraph=True, options=...)` fails with `NoValidChoicesError` on simple `Conv2d` model, but gives no actionable trace
#156304 closed Jun 21, 2025
Support input mutations + aliasing with scan during training
#156337 closed Jun 20, 2025
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#151228 closed Jun 20, 2025
Add @markDynamoStrictTest to all TestCase
#115671 closed Jun 20, 2025
Loss with LBFGS not going down
#156501 closed Jun 20, 2025
Can we have Dim.AUTO/Dim.DYNAMIC with an optional min & max?
#147483 closed Jun 20, 2025
DTensor does not compose with Parameters Groups
#156453 closed Jun 20, 2025
[ONNX] Support for grouped query attention
#151762 closed Jun 20, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int16 (__main__.TestForeachCUDA)
#149627 closed Jun 20, 2025
[XPU] Support toggling profiler on/off for XPU.
#154898 closed Jun 20, 2025
Sourceforge outage causing multiple CI failures
#108773 closed Jun 20, 2025
pytorchbot erroneously thinks PR has already been merged as a different commit
#154427 closed Jun 20, 2025
[ONNX] Inputs generated by onnx.export() with dynamo=False are not consistent with dynamo=True
#136179 closed Jun 20, 2025
[Torch TO ONNX BUG] The right shift operation in torch is mapped as a division operation when converted to ONNX.
#139455 closed Jun 20, 2025
[ONNX] 2.0 regression: dynamic shapes lost for an operator
#139463 closed Jun 20, 2025
[ONNX] Document the registration API
#139499 closed Jun 20, 2025
[ONNX] Run report_exportability when report=True
#139904 closed Jun 20, 2025
Replace reduce(operator.mul) with math.prod for computing product of dimensions
#140888 closed Jun 20, 2025
Exporting the operator 'aten::_transformer_encoder_layer_fwd' to ONNX opset version 17 is not supported
#144242 closed Jun 20, 2025
Custom symbolic functions for ONNX export with None args causes SEGFAULT
#145261 closed Jun 20, 2025
ONNX export failing when using `symbolic` functions and scripting
#146035 closed Jun 20, 2025
onnx.export: When a quantized model is exported using onnx.export, the convolution result has discrepency with the original quantized model.
#146541 closed Jun 20, 2025
Export HuggingFace mamba to ONNX
#146835 closed Jun 20, 2025
UnsupportedOperatorError: Exporting the operator 'aten::_make_per_tensor_quantized_tensor ' to ONNX opset version 11
#147602 closed Jun 20, 2025
[ONNX] BitwiseOr was generated for bool inputs (invalid)
#147854 closed Jun 20, 2025
[ONNX] dynamic dims are not exported with the specified names
#148629 closed Jun 20, 2025
[ONNX] How to export Llama4
#150891 closed Jun 20, 2025
Exporting the operator 'aten::lift_fresh' to ONNX - not supported
#151932 closed Jun 20, 2025
Exporting the operator 'aten::fft_fft2' to ONNX opset version 19 is not supported.
#153823 closed Jun 20, 2025
[ONNX] Verify the translation of SDPA to Attention-23
#156105 closed Jun 20, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_float64 (__main__.TestForeachCUDA)
#151214 closed Jun 20, 2025
`torch.compile(..., mode="max-autotune", dynamic=True)` causes small but nonzero output mismatch with `nn.Conv2d` compared to eager output
#156301 closed Jun 20, 2025
DISABLED test_triton_template_generated_code_caching (__main__.TestMaxAutotune)
#154108 closed Jun 20, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_float32 (__main__.TestForeachCUDA)
#151136 closed Jun 20, 2025
Inductor cpp_wrapper has performance regressions
#156037 closed Jun 20, 2025
DISABLED test_export_opnames_interface (__main__.TestMisc)
#154986 closed Jun 20, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_float16 (__main__.TestForeachCUDA)
#151114 closed Jun 19, 2025
_flash_attention_forward accuracy drop from CUDA to ROCM implementation.
#154582 closed Jun 19, 2025
xpu: AOT compilation does not happen with sycl extension (JIT fallback happens)
#156249 closed Jun 19, 2025
Cannot install pytorch through official pip guidance
#156413 closed Jun 19, 2025
Tensors with no explicit references are possible not freed timely with torch.compile
#155778 closed Jun 19, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_float32 (__main__.TestForeachCUDA)
#149409 closed Jun 19, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_float16 (__main__.TestForeachCUDA)
#149522 closed Jun 19, 2025
Support C shim for customized OP
#150988 closed Jun 19, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_complex64 (__main__.TestForeachCUDA)
#151099 closed Jun 19, 2025
DISABLED test_qlinear_add_int8_mixed_bf16_use_relu_True_is_qat_True_is_dynamic_True (__main__.TestPatternMatcher)
#154565 closed Jun 19, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_complex128 (__main__.TestForeachCUDA)
#151093 closed Jun 19, 2025
FSDP + save optimizer dtype AssertionError
#156166 closed Jun 19, 2025
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#151054 closed Jun 19, 2025
`max_entries` parameter of `torch.cuda.memory._record_memory_history()`
#129674 closed Jun 19, 2025
Indexing beyond end of array on ROCm build
#155045 closed Jun 18, 2025
[ued][kokoro] torch.compile fails in kokoro (both fullgraph=True and False)
#149570 closed Jun 18, 2025
Actual torch `ExportGraphSignature` does not match the example in the docs
#156184 closed Jun 18, 2025
`torch.export` fails with `KeyError` when `BatchNorm.running_mean` is read and modified, even when shape/value is unchanged
#156167 closed Jun 18, 2025
Certain operations cause implicity sync-points
#12461 closed Jun 18, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_complex64 (__main__.TestForeachCUDA)
#149199 closed Jun 18, 2025
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_float64 (__main__.TestForeachCUDA)
#151019 closed Jun 18, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_float64 (__main__.TestForeachCUDA)
#149523 closed Jun 18, 2025
DISABLED test_comprehensive_pca_lowrank_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#139828 closed Jun 18, 2025
DISABLED test_roi_align_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#103156 closed Jun 18, 2025
NCCL init hits CUDA failure 'invalid argument' on 12.2 driver
#150852 closed Jun 18, 2025
Schema version check fails in `torch.export.load`
#156354 closed Jun 18, 2025
Windows Runners are not available on PyTorch CI/CD
#156352 closed Jun 18, 2025
format_flamegraph failed to setup the script
#156309 closed Jun 18, 2025
Cannot install >=2.7.0 on ubuntu 18.04, conflict with prerequisite
#156215 closed Jun 18, 2025
[tracker] DTensor Operator Coverage
#156204 closed Jun 18, 2025
Flip is much slower than advanced indexing
#16424 closed Jun 18, 2025
Please implement the batching rule for torch.matrix_exp.
#115992 closed Jun 18, 2025
Function 'MmBackward0' returned nan values in its 0th output.
#156015 closed Jun 18, 2025
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_float32 (__main__.TestForeachCUDA)
#151003 closed Jun 18, 2025
Status of support for ROCm 6.4.1
#155292 closed Jun 18, 2025
Loading sparse tensors in a DataLoader raises CUDA initialization error since 2.5.0 if you have already initialized CUDA
#153143 closed Jun 18, 2025
DISABLED test_matmul_layer_norm_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#151835 closed Jun 18, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_float32 (__main__.TestForeachCUDA)
#150530 closed Jun 18, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_float16 (__main__.TestForeachCUDA)
#150510 closed Jun 18, 2025
[FR] Expose CUDAGraph handle to allow customized modification on the graph
#155106 closed Jun 18, 2025
Have compiled autograd config API support nested compilation
#152219 closed Jun 18, 2025
Convert to markdown: rpc.rst, signal.rst, size.rst, sparse.rst, special.rst
#155033 closed Jun 18, 2025
DISABLED test_randint_distribution_dynamic_shapes_xpu (__main__.DynamicShapesCodegenGPUTests)
#155689 closed Jun 17, 2025
DISABLED test_randint_distribution_dynamic_shapes_xpu (__main__.DynamicShapesGPUTests)
#155692 closed Jun 17, 2025
DISABLED test_serialize_by_key (__main__.PrecompileContextTests)
#156146 closed Jun 17, 2025
DISABLED test_basic (__main__.PrecompileContextTests)
#156063 closed Jun 17, 2025
The documentation lacks an explanation of the constraints between larger padding and padding mode in convolutional layers
#134840 closed Jun 17, 2025
`torch.ops.aten.index_put` returns different results on CUDA and CPU
#156173 closed Jun 17, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass0_use_new_runtime_True (__main__.ScheduleTest)
#154373 closed Jun 17, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass1_use_new_runtime_False (__main__.ScheduleTest)
#154391 closed Jun 17, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass1_use_new_runtime_True (__main__.ScheduleTest)
#154408 closed Jun 17, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass2_use_new_runtime_False (__main__.ScheduleTest)
#154443 closed Jun 17, 2025
torch.where() can produce nan values for unselected branch during backward
#156212 closed Jun 17, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_bool (__main__.TestForeachCUDA)
#150468 closed Jun 17, 2025
Convert to markdown: quantization-accuracy-debugging.rst, quantization-backend-configuration.rst, quantization-support.rst, quantization.rst, random.rst
#155032 closed Jun 17, 2025
Failure of iOS Build Test: Build (default, 1, 1, macos-14-xlarge, SIMULATOR, arm64)
#136284 closed Jun 17, 2025
[Testing] multigpu tests are still running against CUDA-11
#154119 closed Jun 17, 2025
ONNX Dynamo Export - Unsupported FX nodes: {'call_function': ['aten._upsample_bilinear2d_aa.default']}.
#128818 closed Jun 17, 2025
torch.compile fails to trace methods decorated with @lru_cache
#155841 closed Jun 17, 2025
[FDSP2] express zero-1 with fully_shard
#155952 closed Jun 17, 2025
MPS cumsum failure for 5D tensor or above
#154881 closed Jun 17, 2025
get different result between conv1x1 and linear
#156154 closed Jun 17, 2025
BatchNorm1d fails with batch size 1 if track_running_stats=False, claims it's not it eval mode even if .eval() is called
#156051 closed Jun 17, 2025
[dynamo] Add support for torch.cuda.FloatTensor()
#130722 closed Jun 17, 2025
Convert to markdown: linalg.rst, logging.rst, masked.rst, meta.rst, miscellaneous_environment_variables.rst
#155025 closed Jun 17, 2025
A mistake in PyTorch Docs for nn.RNN
#129446 closed Jun 17, 2025
`ELU()`'s `alpha` argument with `int`, `complex` or `bool` and `inplace` argument with `int`, `complex` and `float` work against the doc
#133563 closed Jun 17, 2025
When calling torch.histc the CPU and CUDA implementations produce different outputs.
#156019 closed Jun 17, 2025
When calling torch.cumprod on a float16 tensor, the CPU and CUDA implementations produce different outputs.
#156018 closed Jun 17, 2025
Extra onnx::Neg_2 input after torch.onnx.export
#148655 closed Jun 17, 2025
RuntimeError: CUDA driver error: operation not supported with test_stream_write_value32 and cuStreamWriteValue32
#154073 closed Jun 17, 2025
DISABLED test_reentrant_parent_error_on_cpu_cuda (__main__.TestAutogradDeviceTypeCUDA)
#86735 closed Jun 17, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int8 (__main__.TestForeachCUDA)
#150407 closed Jun 17, 2025
[XPU] Upgrade the XPU support packages version to 2025.1 in CI/CD
#151097 closed Jun 17, 2025
libtorch doesn't work with cuda 12.6 and 12.4
#132575 closed Jun 17, 2025
DISABLED test_weight_norm_bwd_dynamic_shapes_cpu (__main__.DynamicShapesCodegenCpuTests)
#153803 closed Jun 17, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int64 (__main__.TestForeachCUDA)
#150392 closed Jun 17, 2025
DISABLED test_pattern_matcher_multi_user_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#134433 closed Jun 17, 2025
Update ONNX Opset Version to Support Attention Operator
#153611 closed Jun 17, 2025
DISABLED test_weight_norm_bwd_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#141484 closed Jun 17, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_bool (__main__.TestForeachCUDA)
#150120 closed Jun 17, 2025
[ONNX] Implement scan
#151327 closed Jun 17, 2025
`TestCppExtensionOpenRgistration.test_base_device_registration` hangs during shutdown on MacOS
#155759 closed Jun 16, 2025
ROCm: no HIP device available if device is already initialized
#152941 closed Jun 16, 2025
Inductor CI failure due to Huggingface outage
#156113 closed Jun 16, 2025
[Compiled_autograd] running nn.LayerNorm failed for torch.compile with compiled_autograd when deepspeed Zero3
#140091 closed Jun 16, 2025
Convert to markdown: distributed.checkpoint.rst, distributed.elastic.rst, distributed.fsdp.fully_shard.rst, distributed.optim.rst, distributed.pipelining.rst
#155018 closed Jun 16, 2025
add x/0 gradient behaviour to documentation
#128796 closed Jun 16, 2025
Stop special-casing einops in Dynamo
#142486 closed Jun 16, 2025
None deterministic output of linear projection based on batch size and projection dimensions
#156084 closed Jun 16, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_complex128 (__main__.TestForeachCUDA)
#149323 closed Jun 16, 2025
DISABLED test_tmp_not_defined_issue2_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#135219 closed Jun 16, 2025
DISABLED test_grad_with_manual_interleaved_ScheduleClass2_use_new_runtime_True (__main__.ScheduleTest)
#154481 closed Jun 16, 2025
DISABLED test_on_device_tma_store_old_api (__main__.MutationTests)
#155691 closed Jun 16, 2025
torch.cuda.set_device(0) behaves differently from torch.cuda.set_device(1) in terms of cuda context
#155668 closed Jun 16, 2025
DISABLED test_cache_hot_load_device_cuda_bfloat16_dynamic_False (__main__.AOTAutogradCacheTests)
#145334 closed Jun 16, 2025
IInconsistent Error Handling in `torch.fused_moving_avg_obs_fake_quant` Between CPU and GPU Implementations
#153310 closed Jun 16, 2025
DISABLED test_fake_registration (__main__.TestOpProfiles)
#151301 closed Jun 16, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int32 (__main__.TestForeachCUDA)
#150350 closed Jun 16, 2025
Label the Python version for older PyTorch releases
#156038 closed Jun 16, 2025
In the docs for torch.amax/amin the note about min/max gradient behavior is outdated
#155048 closed Jun 15, 2025
[feature request]: Update max onnx opset to 21 for compatability
#127167 closed Jun 15, 2025
`torch.compile(..., mode="max-autotune")` fails with `TypeError: Expected a number but got Identity` under `torch.no_grad()`
#155688 closed Jun 15, 2025
Update nccl 2.27.3 in pytorch nightly
#155052 closed Jun 14, 2025
torch.fake_quantize_per_tensor_affine handles +inf inconsistently between CPU and CUDA (CPU maps to quant_min, GPU maps to quant_max)
#154328 closed Jun 14, 2025
[DCP] failure case of save method
#152310 closed Jun 14, 2025
The PyTorch version is too low and does not support 50 series GPUs
#155985 closed Jun 14, 2025
[cutlass backend] Add cutlass 3x support ops config to control which ops to do cutlass lowerings on
#155718 closed Jun 14, 2025
Convert to markdown: named_tensor.rst, nested.rst, nn.attention.bias.rst, nn.attention.experimental.rst, nn.attention.flex_attention.rst
#155028 closed Jun 14, 2025
`lintrunner init` fails
#152999 closed Jun 14, 2025
[inductor] [fake tensor] `torch.conj` crashes when `add` original complex tensor
#148950 closed Jun 14, 2025
DistributedDataParallel with compile(..., mode="max-autotune") hangs in 2.5+
#140395 closed Jun 14, 2025
[CI][CUDA][Distributed] test_assert_nan_float16 unit test hangs with certain Host OS + CUDA KMD 570.133.07
#153479 closed Jun 13, 2025
Hide getitems in Dynamo bytecode profiling
#153372 closed Jun 13, 2025
Convert to markdown: torch.ao.ns._numeric_suite_fx.rst, torch.ao.ns._numeric_suite.rst, torch.compiler_aot_inductor_minifier.rst, torch.compiler_aot_inductor.rst, torch.compiler_api.rst
#155036 closed Jun 13, 2025
DISABLED test_sdpa_rewriter_12_cuda (__main__.SDPAPatternRewriterCudaTests)
#145187 closed Jun 13, 2025
DISABLED test_sdpa_rewriter_12_cuda (__main__.SDPAPatternRewriterCudaDynamicTests)
#145188 closed Jun 13, 2025
DISABLED test_sdpa_rewriter_11_cuda (__main__.SDPAPatternRewriterCudaTests)
#148525 closed Jun 13, 2025
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_float16 (__main__.TestForeachCUDA)
#150985 closed Jun 13, 2025
Convert to markdown: cuda.rst, cuda.tunable.rst, cudnn_persistent_rnn.rst, cudnn_rnn_determinism.rst, data.rst
#155016 closed Jun 13, 2025
Tensor.backward type hints clarification
#81963 closed Jun 13, 2025
Convert to markdown: torch.compiler_dynamo_overview.rst, torch.compiler_fake_tensor.rst, torch.compiler_faq.rst, torch.compiler_fine_grain_apis.rst, torch.compiler_get_started.rst
#155038 closed Jun 13, 2025
DISABLED test_parity__foreach_ceil_fastpath_inplace_cuda_complex128 (__main__.TestForeachCUDA)
#155887 closed Jun 13, 2025
DISABLED test_parity__foreach_ceil_fastpath_inplace_cuda_complex64 (__main__.TestForeachCUDA)
#155908 closed Jun 13, 2025
`make_fx` error in nightly but not PyTorch 2.7.1
#155605 closed Jun 13, 2025
Cannot build docs via `make html`
#155092 closed Jun 13, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_float64 (__main__.TestForeachCUDA)
#150298 closed Jun 13, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_bfloat16 (__main__.TestForeachCUDA)
#148965 closed Jun 13, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_float32 (__main__.TestForeachCUDA)
#150208 closed Jun 13, 2025
[Upstream Triton] [ROCm] Accuracy issues in ```inductor/test_torchinductor_opinfo```
#155803 closed Jun 13, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#148966 closed Jun 13, 2025
INTERNAL ASSERT FAILED in `torch.cuda.current_stream/default_stream/ExternalStream/set_per_process_memory_fraction`
#136849 closed Jun 13, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_float16 (__main__.TestForeachCUDA)
#150173 closed Jun 13, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_complex64 (__main__.TestForeachCUDA)
#150161 closed Jun 13, 2025
Typing: Incorrect overload for boolean operators
#155701 closed Jun 12, 2025
`python torchgen/gen.py --update-aoti-c-shim` should update all the c_shim_*_files
#155349 closed Jun 12, 2025
module.cuda() doesn't work under FakeTensorMode
#148977 closed Jun 12, 2025
Expand Examples for torch.autograd.functional.jacobian
#132140 closed Jun 12, 2025
Github outage resulting in jobs failing at checkout
#155829 closed Jun 12, 2025
MPS: Conv1d fails with NotImplementedError for output_channels > 65536
#152278 closed Jun 12, 2025
Convert to markdown: distributed.rst, distributed.tensor.parallel.rst, distributed.tensor.rst, distributions.rst, dlpack.rst
#155019 closed Jun 12, 2025
[Upstream Triton] [ROCm] AssertionError: Tensor-likes are not close! ```test_upsample_nearest1d_dynamic_shapes_cuda```
#154215 closed Jun 12, 2025
Convert to markdown: fx.rst, hub.rst, jit_builtin_functions.rst, jit_language_reference_v2.rst, jit_language_reference.rst
#155023 closed Jun 12, 2025
NO support for torch 2.7.1 + cuda 12.4?
#155790 closed Jun 12, 2025
GH200/GB200 NCCL Build Pytorch
#152182 closed Jun 12, 2025
[Multiprocesing] missing `_release_ipc_counter` in rebuilding cuda ipc tensor with UntypedStorage
#155311 closed Jun 12, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_complex128 (__main__.TestForeachCUDA)
#150141 closed Jun 12, 2025
DISABLED test_variant_consistency_jit_linalg_lu_factor_cuda_float32 (__main__.TestJitCUDA)
#86839 closed Jun 12, 2025
DISABLED test_comprehensive_nn_functional_nll_loss_cuda_float64 (__main__.TestDecompCUDA)
#118355 closed Jun 12, 2025
DISABLED test_vmapjvpvjp_linalg_lu_cuda_float32 (__main__.TestOperatorsCUDA)
#86733 closed Jun 12, 2025
DISABLED test_vmapvjpvjp_linalg_lu_factor_cuda_float32 (__main__.TestOperatorsCUDA)
#113850 closed Jun 12, 2025
DISABLED test_vmapvjp_linalg_lu_cuda_float32 (__main__.TestOperatorsCUDA)
#86893 closed Jun 12, 2025
DISABLED test_vmapvjpvjp_linalg_lu_cuda_float32 (__main__.TestOperatorsCUDA)
#86929 closed Jun 12, 2025
DISABLED test_vmapjvpall_linalg_lu_cuda_float32 (__main__.TestOperatorsCUDA)
#86770 closed Jun 12, 2025
DISABLED test_variant_consistency_jit_linalg_lu_cuda_complex64 (__main__.TestJitCUDA)
#87070 closed Jun 12, 2025
DISABLED test_variant_consistency_jit_linalg_lu_factor_ex_cuda_complex64 (__main__.TestJitCUDA)
#86887 closed Jun 12, 2025
TRACK: integral + floating inputs to an op with floating requiring grad result in INTERNAL_ASSERT
#78332 closed Jun 12, 2025
`matmul, mm` triggers INTERNAL ASSERT FAILED when input requires grad
#78141 closed Jun 12, 2025
`index_fill` will trigger INTERNAL ASSERT when float tensor requiring grad + int tensor
#78443 closed Jun 12, 2025
`layer_norm` triggers INTERNAL ASSERT with input requiring grad + zero-size int tensor
#78444 closed Jun 12, 2025
`addmv, mv` will trigger INTERNAL ASSERT FAILED when input requiring grad
#77814 closed Jun 12, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_uint8 (__main__.TestForeachCUDA)
#150417 closed Jun 12, 2025
DISABLED test_dict_contains (__main__.TestGuardSerialization)
#153530 closed Jun 12, 2025
DISABLED test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublaslt_cuda_bfloat16 (__main__.TestMatmulCudaCUDA)
#154588 closed Jun 12, 2025
Convert to markdown: accelerator.rst, amp.rst, autograd.rst, backends.rst, benchmark_utils.rst
#155013 closed Jun 12, 2025
DISABLED test_ddp_apply_optim_in_backward_ignored_params (__main__.TestDistBackendWithSpawn)
#106361 closed Jun 12, 2025
Convert to markdown: func.ux_limitations.rst, func.whirlwind_tour.rst, future_mod.rst, futures.rst, fx.experimental.rst
#155022 closed Jun 12, 2025
ERROR: Unknown target name: "deepspeed"
#155158 closed Jun 11, 2025
Convert to markdown: fsdp.rst, func.api.rst, func.batch_norm.rst, func.migrating.rst, func.rst
#155021 closed Jun 11, 2025
Convert to markdown: mobile_optimizer.rst, model_zoo.rst, module_tracker.rst, monitor.rst, mps_environment_variables.rst
#155026 closed Jun 11, 2025
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_bfloat16 (__main__.TestForeachCUDA)
#150119 closed Jun 11, 2025
Convert to markdown: draft_export.rst, export.ir_spec.rst, export.programming_model.rst, export.rst, fft.rst
#155020 closed Jun 11, 2025
Convert to markdown: torch.compiler_transformations.rst, torch.compiler_troubleshooting_old.rst, torch.compiler_troubleshooting.rst, torch.compiler.config.rst, torch.compiler.rst
#155040 closed Jun 11, 2025
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_complex64 (__main__.TestForeachCUDA)
#150960 closed Jun 11, 2025
DISABLED test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublas_cuda_float16 (__main__.TestMatmulCudaCUDA)
#154546 closed Jun 11, 2025
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_complex128 (__main__.TestForeachCUDA)
#150933 closed Jun 11, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_uint8 (__main__.TestForeachCUDA)
#149858 closed Jun 11, 2025
UNSTABLE pull / linux-jammy-py3-clang12-executorch / build
#150261 closed Jun 11, 2025
CUDA Graph capture through torch.cuda.CUDAGraph API fails when using dynamic indexing, but succeeds with dynamic slicing
#155682 closed Jun 11, 2025
[inductor] Improve GEMM loggings for torch.bmm
#155307 closed Jun 11, 2025
Inductor's require_stride_order and require_exact_strides should pull from the graph sent to inductor
#137979 closed Jun 11, 2025
get custom operators to use exact strides
#146210 closed Jun 11, 2025
torch.library.custom_op string support
#152685 closed Jun 11, 2025
Autograd doc does not mention torch.autograd.set_grad_enabled
#86718 closed Jun 11, 2025
DISABLED test_multi_output_unbacked_custom_op_cuda (__main__.TestInductorDynamicCUDA)
#135755 closed Jun 11, 2025
Enhanced Feedback for `load_state_dict` with `strict=False`
#141256 closed Jun 11, 2025
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#150902 closed Jun 11, 2025
DISABLED test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublaslt_cuda_float16 (__main__.TestMatmulCudaCUDA)
#154499 closed Jun 11, 2025
DISABLED test_cublas_addmm_size_10000_backend_cublaslt_cuda_float32 (__main__.TestMatmulCudaCUDA)
#154498 closed Jun 11, 2025
DISABLED test_cublas_addmm_reduced_precision_fp16_accumulate_size_10000_backend_cublas_cuda_bfloat16 (__main__.TestMatmulCudaCUDA)
#154500 closed Jun 11, 2025
DISABLED test_vdd_clamp_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#134445 closed Jun 11, 2025
Partitioner loses Inplace ops where source is constant
#155242 closed Jun 11, 2025
[XPU] Kineto profiler fails on XPU with `PTI_ERROR_NOT_IMPLEMENTED`
#153632 closed Jun 11, 2025
DISABLED test_reorder_peak_memory_dfs (__main__.TestOperatorReorderForPeakMemory)
#145183 closed Jun 11, 2025
DISABLED test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest)
#146400 closed Jun 11, 2025
DISABLED test_sparse_gradients (__main__.DistributedDataParallelTest)
#153142 closed Jun 11, 2025
DISABLED test_reorder_peak_memory_lpmf (__main__.TestOperatorReorderForPeakMemory)
#145210 closed Jun 11, 2025
DISABLED test_reorder_peak_memory_bfs (__main__.TestOperatorReorderForPeakMemory)
#147949 closed Jun 11, 2025
DISABLED test_variant_consistency_jit_linalg_lu_cuda_float32 (__main__.TestJitCUDA)
#86711 closed Jun 11, 2025
DISABLED test_variant_consistency_jit_linalg_lu_factor_cuda_complex64 (__main__.TestJitCUDA)
#86732 closed Jun 11, 2025
DISABLED test_mm_concat_cuda (__main__.FreezingGpuTests)
#145186 closed Jun 11, 2025
[XPU User Empathy Day] `torch.linalg.solve` support on XPU
#154182 closed Jun 11, 2025
DISABLED test_smoke (__main__.TestCollectEnv)
#77345 closed Jun 11, 2025
xpu: can't install latest nighly torch along with latest torchvision (torchvision depends on N-1 torch)
#154687 closed Jun 11, 2025
DISABLED test_duplicate_registration_impl (__main__.TestOpProfiles)
#151281 closed Jun 11, 2025
FP8 quantization causes bad precision with torch.compile
#154020 closed Jun 11, 2025
Inconsistent result of torch.eye() in CPU vs GPU
#155661 closed Jun 11, 2025
[doc] Long Function Name (C++) overlapping on the side menu
#66937 closed Jun 11, 2025
deepcopy of Lazy modules returns an exception
#65051 closed Jun 11, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_uint8 (__main__.TestForeachCUDA)
#150878 closed Jun 11, 2025
isDifferentiableType(variable.scalar_type()) INTERNAL ASSERT FAILED at "/pytorch/torch/csrc/autograd/functions/utils.h
#154357 closed Jun 11, 2025
torch.library.custom_op doesn't handle 1-element tuples returns
#150472 closed Jun 11, 2025
Core dumpect when inspecting instantiation of a torch.autograd.Function (Future deprecation)
#154981 closed Jun 11, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_int8 (__main__.TestForeachCUDA)
#150837 closed Jun 10, 2025
Convert to markdown: onnx_verification.rst, onnx.rst, optim.rst, package.rst, profiler.rst
#155031 closed Jun 10, 2025
Convert to markdown: onnx_dynamo_onnxruntime_backend.rst, onnx_dynamo.rst, onnx_ops.rst, onnx_torchscript_supported_aten_ops.rst, onnx_torchscript.rst
#155030 closed Jun 10, 2025
Convert to markdown: torch.compiler_best_practices_for_backends.rst, torch.compiler_cudagraph_trees.rst, torch.compiler_custom_backends.rst, torch.compiler_dynamic_shapes.rst, torch.compiler_dynamo_deepdive.rst
#155037 closed Jun 10, 2025
Convert to markdown: testing.rst, threading_environment_variables.rst, torch_cuda_memory.rst, torch_environment_variables.rst, torch_nccl_environment_variables.rst
#155035 closed Jun 10, 2025
Jacobian mismatch for `nn.functional.ctc_loss`
#67462 closed Jun 10, 2025
DISABLED test_dim_dynamic_inline_and_install_strict (__main__.InlineAndInstallStrictExportTestExport)
#154951 closed Jun 10, 2025
DISABLED test_dim_dynamic (__main__.TestExport)
#154950 closed Jun 10, 2025
DISABLED test_dim_dynamic_retraceability_strict (__main__.RetraceExportTestExport)
#154940 closed Jun 10, 2025
DISABLED test_dim_dynamic_retraceability_nonstrict (__main__.RetraceExportNonStrictTestExport)
#154939 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape_training_ir_to_decomp_strict (__main__.TrainingIRToRunDecompExportTestExport)
#154917 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization_training_ir_to_decomp_nonstrict (__main__.TrainingIRToRunDecompExportNonStrictTestExport)
#154918 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape_cpp_serdes (__main__.CppSerdesTestExport)
#154919 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape_training_ir_to_decomp_nonstrict (__main__.TrainingIRToRunDecompExportNonStrictTestExport)
#154916 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape_retraceability_strict (__main__.RetraceExportTestExport)
#154915 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape_retraceability_nonstrict (__main__.RetraceExportNonStrictTestExport)
#154914 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape_strict (__main__.StrictExportTestExport)
#154913 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape_serdes_strict (__main__.SerDesExportTestExport)
#154912 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154911 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape_inline_and_install_strict (__main__.InlineAndInstallStrictExportTestExport)
#154910 closed Jun 10, 2025
DISABLED test_reshape_view_helper_inline_and_install_strict (__main__.InlineAndInstallStrictExportTestExport)
#154877 closed Jun 10, 2025
DISABLED test_reshape_view_helper_strict (__main__.StrictExportTestExport)
#154878 closed Jun 10, 2025
DISABLED test_cond_contains_unbacked_no_escape (__main__.TestExport)
#154909 closed Jun 10, 2025
DISABLED test_reshape_view_helper (__main__.TestExport)
#154876 closed Jun 10, 2025
DISABLED test_reshape_view_helper_retraceability_strict (__main__.RetraceExportTestExport)
#154875 closed Jun 10, 2025
DISABLED test_reshape_view_helper_retraceability_nonstrict (__main__.RetraceExportNonStrictTestExport)
#154874 closed Jun 10, 2025
DISABLED test_reshape_view_helper_serdes_strict (__main__.SerDesExportTestExport)
#154873 closed Jun 10, 2025
DISABLED test_reshape_view_helper_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154872 closed Jun 10, 2025
DISABLED test_reshape_view_helper_cpp_serdes (__main__.CppSerdesTestExport)
#154871 closed Jun 10, 2025
DISABLED test_reshape_view_helper_training_ir_to_decomp_strict (__main__.TrainingIRToRunDecompExportTestExport)
#154870 closed Jun 10, 2025
DISABLED test_reshape_view_helper_training_ir_to_decomp_nonstrict (__main__.TrainingIRToRunDecompExportNonStrictTestExport)
#154869 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations_inline_and_install_strict (__main__.InlineAndInstallStrictExportTestExport)
#155010 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations_serdes_strict (__main__.SerDesExportTestExport)
#154993 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations_strict (__main__.StrictExportTestExport)
#154994 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations_training_ir_to_decomp_strict (__main__.TrainingIRToRunDecompExportTestExport)
#154995 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154992 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations_retraceability_strict (__main__.RetraceExportTestExport)
#154991 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations_retraceability_nonstrict (__main__.RetraceExportNonStrictTestExport)
#154990 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations_cpp_serdes (__main__.CppSerdesTestExport)
#154989 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations (__main__.TestExport)
#154988 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization_inline_and_install_strict (__main__.InlineAndInstallStrictExportTestExport)
#154987 closed Jun 10, 2025
DISABLED test_dim_dynamic_strict (__main__.StrictExportTestExport)
#154972 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization_retraceability_nonstrict (__main__.RetraceExportNonStrictTestExport)
#154970 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization_retraceability_strict (__main__.RetraceExportTestExport)
#154971 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization_cpp_serdes (__main__.CppSerdesTestExport)
#154969 closed Jun 10, 2025
DISABLED test_dim_hint_range_violations_training_ir_to_decomp_nonstrict (__main__.TrainingIRToRunDecompExportNonStrictTestExport)
#154968 closed Jun 10, 2025
DISABLED test_dim_dynamic_training_ir_to_decomp_strict (__main__.TrainingIRToRunDecompExportTestExport)
#154967 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization (__main__.TestExport)
#154966 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization_serdes_strict (__main__.SerDesExportTestExport)
#154965 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154964 closed Jun 10, 2025
DISABLED test_dim_dynamic_cpp_serdes (__main__.CppSerdesTestExport)
#154957 closed Jun 10, 2025
DISABLED test_dim_dynamic_training_ir_to_decomp_nonstrict (__main__.TrainingIRToRunDecompExportNonStrictTestExport)
#154956 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization_training_ir_to_decomp_strict (__main__.TrainingIRToRunDecompExportTestExport)
#154955 closed Jun 10, 2025
DISABLED test_dim_dynamic_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154952 closed Jun 10, 2025
DISABLED test_dim_dynamic_serdes_strict (__main__.SerDesExportTestExport)
#154953 closed Jun 10, 2025
DISABLED test_dim_dynamic_specialization_strict (__main__.StrictExportTestExport)
#154954 closed Jun 10, 2025
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int8 (__main__.TestForeachCUDA)
#149859 closed Jun 10, 2025
Integrate with ONNX 1.18.0 release branch
#151681 closed Jun 10, 2025
[Profile] unexpected ops have been profiled from torch.profiler
#155188 closed Jun 10, 2025
Inconsistent behavior in `torch.export.export()` for empty forward functions with `*args` / `**kwargs`: one fails with `list index out of range`, the other succeeds
#155190 closed Jun 10, 2025
Tensor.long() produces inconsistent results for torch.inf between CPU and GPU
#154724 closed Jun 10, 2025
Support Delay Loading of c10.dll in when using libtorch as a thirdparty library.
#105058 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_float32 (__main__.TestForeachCUDA)
#150747 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_uint8 (__main__.TestForeachCUDA)
#150662 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_int8 (__main__.TestForeachCUDA)
#150630 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_int64 (__main__.TestForeachCUDA)
#150617 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#150668 closed Jun 10, 2025
DISABLED test_remote_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False (__main__.TestFxGraphCache)
#151074 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_float64 (__main__.TestForeachCUDA)
#150752 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_bool (__main__.TestForeachCUDA)
#150680 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_int32 (__main__.TestForeachCUDA)
#150602 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_int64 (__main__.TestForeachCUDA)
#150822 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_int32 (__main__.TestForeachCUDA)
#150800 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_int16 (__main__.TestForeachCUDA)
#150590 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_int16 (__main__.TestForeachCUDA)
#150772 closed Jun 10, 2025
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_float16 (__main__.TestForeachCUDA)
#150712 closed Jun 10, 2025
[compile time][inductor] Quadratic compile time observed in Inductor fusion
#154652 closed Jun 10, 2025
torch._dynamo.exc.Unsupported: Failed to trace builtin operator
#155436 closed Jun 10, 2025
Convert to markdown: torch.compiler_inductor_profiling.rst, torch.compiler_ir.rst, torch.compiler_nn_module.rst, torch.compiler_performance_dashboard.rst, torch.compiler_profiling_torch_compile.rst
#155039 closed Jun 10, 2025
Potential Bug with HYBRID_SHARD and (n, 1) Device Mesh Falling Back to NO_SHARD
#154888 closed Jun 9, 2025
Broken link in doc.
#132178 closed Jun 9, 2025
Rprop does not work in MPS for noncontiguous tensors
#118117 closed Jun 9, 2025
[cuBLAS] relax the restrictions on the use of cublasLt
#153590 closed Jun 9, 2025
[Upstream Triton] experimental_tensormap_fenceproxy_acquire
#154692 closed Jun 9, 2025
The compilation result is incorrect when torch.compile compiles code containing torch.nonzero.
#155324 closed Jun 9, 2025
`weight` argument of `nn.CrossEntropyLoss()` works with `int`, `complex` and `bool` type
#134896 closed Jun 9, 2025
[Inductor] Need a hash key for the underlying kernel for ChoiceCaller that doesn't depend on runtime params
#154467 closed Jun 9, 2025
request for faster inductor kernels for blockwise reduction across dim1 -> write
#149982 closed Jun 9, 2025
Sliced float16/bfloat16 tensors can have numerical errors > 1e-3 after passing through nn.Linear.
#154997 closed Jun 9, 2025
Internal Assertion Failure in torch.tensor() in Restricted eval() Environment
#155224 closed Jun 9, 2025
internal assert failed while trying to run dorado (ONT basecaller)
#155326 closed Jun 9, 2025
DISABLED test_pending_fusions_multiple (__main__.TestPrologueFusion)
#152221 closed Jun 9, 2025
[bug] the LTS torch==1.8.2 pip package is incomplete
#69689 closed Jun 9, 2025
Adam is 30% slower than SGD on Apple Metal.
#78063 closed Jun 9, 2025
Need support and testing for Adam optimizer for MPS
#105382 closed Jun 9, 2025
DISABLED test_sparse_add_cuda_float64 (__main__.TestSparseCSRCUDA)
#145019 closed Jun 9, 2025
Document garbage_collection_threshold default
#150917 closed Jun 8, 2025
Convert to markdown: mps.rst, mtia.memory.rst, mtia.rst, multiprocessing.rst, name_inference.rst
#155027 closed Jun 8, 2025
Segmentation fault (core dumped) in torch.concat
#155306 closed Jun 8, 2025
I get different results on simple network operation on a computer with and without AVX512
#155423 closed Jun 8, 2025
Export _in_spec different before and after load
#154674 closed Jun 8, 2025
There is a performance drop because we have not yet implemented the batching rule for aten::matrix_exp.
#155400 closed Jun 8, 2025
profile for torch.add(x, x) where x is a zero-sized tensor looks bogus
#151829 closed Jun 7, 2025
Potential indexing issues in compile for large tensors
#154168 closed Jun 7, 2025
Pytorch with Rocm nightly is missing a dependency
#155207 closed Jun 7, 2025
DISABLED test_penalized_small_dim (__main__.TestTiling)
#155186 closed Jun 7, 2025
torch.device context manager change doesn't show in torch.get_default_device
#131328 closed Jun 7, 2025
[ONNX] Use onnx Attention operator for scaled_dot_product_attention
#149662 closed Jun 7, 2025
[ONNX] rfftn/irfftn produces incorrect shapes
#125903 closed Jun 7, 2025
Inductor doesn't move 0-D tensors to enable cudagraphs
#119241 closed Jun 7, 2025
how to save the fx graph with output tensor shapes ?
#155391 closed Jun 7, 2025
MPS missing erfc
#155337 closed Jun 7, 2025
[TensorDict - compile] cuda.is_initialized compatibility
#129659 closed Jun 7, 2025
torch.from_numpy with string dtype leads to Floating point exception (core dumped) instead of runtime error when operating
#155328 closed Jun 7, 2025
Convert to markdown: ddp_comm_hooks.rst, debugging_environment_variables.rst, deploy.rst, deterministic.rst, distributed.algorithms.join.rst
#155017 closed Jun 6, 2025
DISABLED test_stft_xpu (__main__.AOTInductorTestABICompatibleGpu)
#154701 closed Jun 6, 2025
[ROCm][TunableOp] Contents of untuned csv files are ignored during offline tuning
#153462 closed Jun 6, 2025
Weight Shape Corruption in Sequential Modules During Stateful Execution
#155246 closed Jun 6, 2025
[Upstream Triton] 'tt.expand_dims' op failed to infer returned types (`test_comprehensive_nanquantile_cuda_float64`)
#154933 closed Jun 6, 2025
Convert to markdown: torch.overrides.rst, type_info.rst, utils.rst, xpu.rst
#155041 closed Jun 6, 2025
Cpp-wrapper mode issue tracker
#117363 closed Jun 6, 2025
torch.compiled flex_attention + NJT raises `RuntimeError: Attempting to use FunctionalTensor on its own.`
#154556 closed Jun 6, 2025
Release 2.7.1 validations checklist and cherry-picks
#154512 closed Jun 6, 2025
Cannot Export Dynamic Shape Decoder to Onnx
#153955 closed Jun 6, 2025
Better mergebot messages when reverting a PR
#139680 closed Jun 6, 2025
"Automatically add `__all__`" tool and linter
#146242 closed Jun 6, 2025
ImportError: cannot import name 'make_fx' from 'torch.fx
#155323 closed Jun 6, 2025
Long CI Queue Times: VolumeLimitExceeded
#155265 closed Jun 6, 2025
Unexpected behaviour when resuming from checkpoint using CosineAnnealingLR
#65342 closed Jun 6, 2025
[feature request] Global GPU Flag
#7535 closed Jun 6, 2025
TorchScript based ONNX exporter: Shape Inference failure
#153214 closed Jun 6, 2025
Jobs failing with: "The job was not acquired by Runner of type", Internal server error
#155250 closed Jun 5, 2025
torch._higher_order_ops.scan graph breaks and clamp error with compile/autograd
#153437 closed Jun 5, 2025
autograd.grad in compiled function can't run with code cache hit
#154536 closed Jun 5, 2025
[MPS] "Can't be indexed using 32-bit iterator" error as of 20250430 nightly cpu MacOS build
#154828 closed Jun 5, 2025
Move all CI/CD workflows from focal to jammy
#154157 closed Jun 5, 2025
Inconsistent behavior between CPU and GPU implementations of `torch.arange`
#153133 closed Jun 5, 2025
torch.UntypedStorage.from_file(filename, shared=False, size=0) documentation error: 'nbytes' instead of 'size'
#130629 closed Jun 5, 2025
AI On-device (Feature)
#155161 closed Jun 5, 2025
[Upstream Triton] RuntimeError: Expected to find "uint32_t grid_0 = 1023L;" but did not find it ```test_triton_autotuning_cuda```
#154252 closed Jun 5, 2025
[Upstream Triton] AttributeError: args ```test_deep_reentrant```
#154249 closed Jun 5, 2025
[Upstream Triton] torch._dynamo.exc.BackendCompilerFailed ```test_conv_weight_layout_convert_cuda```
#154231 closed Jun 5, 2025
[Upstream Triton] RuntimeError: Expected to not find ".run(" but found it ```test_low_precision```
#154228 closed Jun 5, 2025
`torch.cross`'s behavior is different on cpu and gpu on torch 2.5.0.dev20240708+cu121
#132031 closed Jun 5, 2025
DISABLED test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest)
#140368 closed Jun 5, 2025
[Upstream Triton] nvcc / ptx mismatch w/ AOTI
#154938 closed Jun 5, 2025
[Windows UT] CpuTests.test_fractional_max_pool2d2_cpu failed with PyTorch 2025-05-25 nightly wheel
#154697 closed Jun 5, 2025
Unexpected behavior when using dist.all_reduce(x, op=dist.ReduceOp.SUM)
#152300 closed Jun 5, 2025
DISABLED test_deterministic_algorithms (__main__.TestGuardSerialization)
#154090 closed Jun 5, 2025
hugging face transformer regression on `facebook/opt-125m`
#155168 closed Jun 5, 2025
Non-negligible overhead of `OpOverloadPacket` dispatch w/o overload
#153626 closed Jun 5, 2025
AI On-Device
#155159 closed Jun 4, 2025
`2` and `-2` for `ord` argument of `linalg.norm()` should be explained more clearly
#136453 closed Jun 4, 2025
[v2.7.1] Release Tracker
#152627 closed Jun 4, 2025
quantize_fx.prepare_qat_fx, `get_default_qat_qconfig_mapping` is unused in code.
#144522 closed Jun 4, 2025
DISABLED TCPStoreTest.testMultiTenantStoresUV (__main__.TCPStoreTest)
#139150 closed Jun 4, 2025
DISABLED TCPStoreTest.testMultiTenantStores (__main__.TCPStoreTest)
#142030 closed Jun 4, 2025
DISABLED PyTorchStreamWriterAndReader.LoadWithMultiThreads (build.win_tmp.build.torch.test.inline_container_test.exe)
#154011 closed Jun 4, 2025
DISABLED AotInductorTest.FreeInactiveConstantBufferRuntimeConstantFoldingCuda (build.bin.test_aoti_inference)
#150299 closed Jun 4, 2025
DISABLED AotInductorTest.FreeInactiveConstantBufferCuda (build.bin.test_aoti_inference)
#149495 closed Jun 4, 2025
`lintrunner-noclang` job name is confusing
#126324 closed Jun 4, 2025
Enable opting out of CI experiments
#139334 closed Jun 4, 2025
torch.compile failed to handle a custom __delattr__ method correctly
#150765 closed Jun 4, 2025
[Upstream Triton] AssertionError: Scalars are not equal! ```inductor.test_mkldnn_pattern_matcher```
#154225 closed Jun 4, 2025
[Upstream Triton] AssertionError: Incorrect result from choice TritonTemplateCaller ```test_mm_dropout```
#154222 closed Jun 4, 2025
[Upstream Triton] RuntimeError: Expected to find "convolution(" but did not find it ```test_conv_inference_heuristics_cuda```
#154220 closed Jun 4, 2025
[CPU][CUDA] `cpu` and `cuda` implementations of `log_softmax` backward seem to disagree on `dim > 2`
#155043 closed Jun 4, 2025
[Upstream Triton] AssertionError: Scalars are not equal!\n\nExpected 0 but got 2 ```inductor.test_cudagraph_trees```
#154243 closed Jun 4, 2025
[Upstream Triton] [HIP] torch._inductor.exc.InductorError ```inductor.test_custom_lowering```
#154242 closed Jun 4, 2025
[Upstream Triton] [HIP] RuntimeError: built without cuda ```inductor.test_distributed_patterns```
#154233 closed Jun 4, 2025
[Upstream Triton] [HIP] RuntimeError: bgemm input type at::Half and output type float is not supported for ROCm ```test_subgraph_decompose_k```
#154221 closed Jun 4, 2025
[Upstream Triton] [HIP] AssertionError: 'device-side assert' not found ```test_after_aot_gpu_runtime_error```
#154227 closed Jun 4, 2025
[Upstream Triton] [HIP] RuntimeError: _cudnn_rnn: ATen not compiled with cuDNN support ```test_cudnn_rnn_dynamic_shapes_cuda```
#154213 closed Jun 4, 2025
[Upstream Triton] AssertionError: Tensor-likes are not close! ```inductor.test_torchinductor_opinfo```
#154212 closed Jun 4, 2025
Experiencing "429: Too Many Requests" on downloading actions
#155075 closed Jun 4, 2025
[ROCm] AssertionError: Scalars are not equal! ```test_linear_with_input_of_flexible_layout```
#154246 closed Jun 4, 2025
loading state dict with mismatching shapes error
#154597 closed Jun 4, 2025
[Upstream Triton] RuntimeError: Expected to find "triton_poi_fused_cat_2.run" but did not find it ```test_conv_cat```
#154229 closed Jun 4, 2025
DISABLED test_ddp_apply_optim_in_backward (__main__.TestDistBackendWithSpawn)
#153266 closed Jun 4, 2025
[Upstream Triton] AssertionError: False is not true ```test_cpu_repro``` and ```test_cuda_repro```
#154245 closed Jun 4, 2025
[Upstream Triton] AssertionError: False is not true : num_same_stride is 0 ```test_redundant_clone_for_layout_convert_cuda```
#154230 closed Jun 4, 2025
[Upstream Triton] KeyError: Unknown key: ```'cubin'```
#154207 closed Jun 4, 2025
[Dynamo] recompiles with empty (no reason) guard
#154707 closed Jun 4, 2025
tensor_to_numpy symbol not exported in libtorch_python.so in pytorch 2.7
#154105 closed Jun 4, 2025
`randint(max)` causes a graph break, but not `rand().mul(max).floor().to(torch.long)` (on CPU)
#135664 closed Jun 4, 2025
MPS does not support sigmoid op with int64 input
#154895 closed Jun 4, 2025
Some PyTorch tensor functions silently change the default locale encoding
#151442 closed Jun 4, 2025
Mismatch of mixed precision `cast_fn` in FSDP and FSDP2
#153077 closed Jun 4, 2025
`CELU()`'s `alpha` argument with `int`, `complex` or `bool` and `inplace` argument with `int`, `complex` and `float` work against the doc
#133573 closed Jun 4, 2025
DISABLED test_linear (__main__.TestAOTInductorPackageCpp_xpu)
#154689 closed Jun 3, 2025
DISABLED test_multiple_methods (__main__.TestAOTInductorPackageCpp_xpu)
#154690 closed Jun 3, 2025
DISABLED test_metadata (__main__.TestAOTInductorPackageCpp_xpu)
#154685 closed Jun 3, 2025
DISABLED test_bool_input (__main__.TestAOTInductorPackageCpp_xpu)
#154683 closed Jun 3, 2025
DISABLED test_specified_output_dir (__main__.TestAOTInductorPackageCpp_xpu)
#154681 closed Jun 3, 2025
DISABLED test_add (__main__.TestAOTInductorPackageCpp_xpu)
#154682 closed Jun 3, 2025
Improve sharding algorithm for ASAN (any maybe other jobs as well)
#74620 closed Jun 3, 2025
on-pr docker build stuck with `user is not authorized to BatchGetImage`
#148771 closed Jun 3, 2025
CI/CD: Figure out what to do with split build
#138750 closed Jun 3, 2025
CUDA not found in NVIDIA runners
#153760 closed Jun 3, 2025
[Mergebot] Adding ciflow/pull in PR without pull and lint workflows
#152718 closed Jun 3, 2025
CI workflows being skipped on PR
#152697 closed Jun 3, 2025
Wrong formula for CosineAnnealingLR
#152081 closed Jun 3, 2025
[Upstream Triton] AssertionError: Scalars are not equal! ```inductor/test_codecache.py```
#154250 closed Jun 3, 2025
[Upstream Triton] AssertionError: False is not true ```test_pad_3d_tensor```
#154224 closed Jun 3, 2025
`_amp_foreach_non_finite_check_and_unscale_` can be torch.compiled inside torch.amp, but not in identical code outside it
#138412 closed Jun 3, 2025
[Testing][Inductor][CUDA12.6] inductor/test_max_autotune.py test_max_autotune_decompose_k_dynamic_input unit test failure
#154254 closed Jun 3, 2025
inductor codegen for masks is non-deterministic
#154741 closed Jun 3, 2025
DISABLED test_sort_transpose_mps (__main__.GPUTests)
#153939 closed Jun 3, 2025
DISABLED test_slice_mutation1_mps (__main__.GPUTests)
#153913 closed Jun 3, 2025
Some `Improve Error Message` Bugs
#149625 closed Jun 3, 2025
torch.compile regression: it cause recompile when int value changed
#154490 closed Jun 3, 2025
torch.compile supported with GIL disabled
#147946 closed Jun 3, 2025
[FSDP2] reshard_after_forward=True for root models
#154655 closed Jun 3, 2025
[Upstream Triton] RuntimeError: Tried to register an operator ```test_item_to_inputs_kernel_nobreak_cpu```
#154216 closed Jun 3, 2025
[Upstream Triton] AssertionError: False is not true ```test_inductor_profiling_triton_hooks```
#154223 closed Jun 3, 2025
pytorchbot gets confused when ghstacked commit order are interactively rebased
#154461 closed Jun 2, 2025
MPS does not support log1p op with int64 input
#154883 closed Jun 2, 2025
[ONNX] Exporter improvement tasks
#129274 closed Jun 2, 2025
[ONNX][low pri] Move old (non-public) implementation into legacy/ and schedule for deprecation
#129308 closed Jun 2, 2025
ShardedGradScaler is not documented on the website.
#141543 closed Jun 2, 2025
[FSDP2] mixed precision: auto turn off `cast_forward_inputs`
#146130 closed Jun 2, 2025
Parameter not updating when FSDP2 model is used before optimizer creation
#149205 closed Jun 2, 2025
torch.distributed.checkpoint CUDA OOM with broadcast_from_rank0
#149640 closed Jun 2, 2025
[FSDP] Moving module's view tensor to device
#147321 closed Jun 2, 2025
FSDP with AveragedModel
#149138 closed Jun 2, 2025
[RFC] Deprecate silent fallback to aten logic in Inductor
#147479 closed Jun 2, 2025
DISABLED test_dtypeview_int16_bfloat16_mps (__main__.GPUTests)
#153864 closed Jun 2, 2025
DISABLED test_upsample_nearest1d_mps (__main__.GPUTests)
#153866 closed Jun 2, 2025
[Upstream Triton] AssertionError: expected to fail, but actually passed ```test_unbacked_reduction_cpu```
#154217 closed Jun 2, 2025
DISABLED AotInductorTest.BasicTestCuda (build.bin.test_aoti_inference)
#152888 closed Jun 2, 2025
DISABLED AotInductorTest.BasicTestCpu (build.bin.test_aoti_inference)
#152889 closed Jun 2, 2025
[dynamo] Improve final traceback frame format
#152867 closed Jun 2, 2025
Slow-autograd tests regression
#154459 closed Jun 2, 2025
Small numerical discrepancy in sum/mean after torch.block_diag
#154616 closed Jun 2, 2025
Dynamo converts `Size + tuple` to `tuple` instead of `Size`
#154432 closed Jun 2, 2025
`Inconsistent Results` from `torch.svd_lowrank` on CPU and CUDA
#154479 closed Jun 2, 2025
Add MPS support for ConvTranspose3d
#154615 closed Jun 2, 2025
Inconsistent result of torch.amin() in CPU vs GPU
#154792 closed Jun 2, 2025
Update `pocketfft` in third party
#154843 closed Jun 2, 2025
DISABLED test_resize_mps (__main__.GPUTests)
#153811 closed Jun 2, 2025
DISABLED test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_True (__main__.TestFxGraphCache)
#150868 closed Jun 2, 2025
DISABLED test_cat_upcasting_mps (__main__.GPUTests)
#153814 closed Jun 2, 2025
DISABLED test_scalar_output_mps (__main__.GPUTests)
#153810 closed Jun 2, 2025
DISABLED test_invalid_operand_issue1_mps (__main__.GPUTests)
#153813 closed Jun 2, 2025
DISABLED test_to_dtype_mps (__main__.GPUTests)
#153812 closed Jun 2, 2025
[Upstream Triton] AssertionError: Tensor-likes are not close! ```test_conv_with_as_strided_cpu```
#154232 closed Jun 2, 2025
DISABLED test_split_cumsum_mps (__main__.GPUTests)
#153804 closed Jun 2, 2025
DISABLED test_profiler_mark_wrapper_call_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#135294 closed Jun 2, 2025
[Inductor] opacus_cifar10 test returns eager_fail_to_run after opacus 1.5.4 update
#154446 closed Jun 2, 2025
Inductor codegen fails due to specializations resulting in input removed from the graph but used in the runtime assertion.
#153650 closed Jun 2, 2025
Runtime assertions ignored in many cases
#153756 closed Jun 2, 2025
[FSDP] Cannot writeback when the parameter shape changes
#151223 closed Jun 1, 2025
Support for CUDA 12.8 (e.g., RTX 5090) in Previous PyTorch Versions (e.g., 2.3)
#154813 closed Jun 1, 2025
there is a little color problem of the example code in sparse docs.
#154779 closed Jun 1, 2025
Failing to free a tensor allocated while a torch.cuda.Mempool is active results in that tensor being freed with cudaFree() rather than the custom free function.
#146431 closed Jun 1, 2025
empty_cache does not work for CUDAPluggableAllocator + MemPool
#145168 closed May 31, 2025
torch.nextafter(0, 1) returns 0 on MPS device
#150027 closed May 31, 2025
torch.compile on MPS: error running compiled RMSNorm
#150454 closed May 31, 2025
aot_compile default configuration doesn't seem to appropriately setup support for unbacked SymInts
#118304 closed May 31, 2025
Multiple torch.fft APIs produce inconsistent results for infinite inputs between CPU and GPU (ihfftn, ihfft, ihfft2, rfft)
#154736 closed May 31, 2025
[Indexing] Incoherent Tensor indexing for nested lists
#100080 closed May 31, 2025
Indexing a tensor with a NumPy array sometimes works and sometimes doesn't
#65218 closed May 31, 2025
Inconsistent list indexing behavior
#68595 closed May 31, 2025
Cannot index into a tensor using indices from another device - regression from 1.12
#85450 closed May 30, 2025
Mixed logical indexing / numerical indexing fails.
#60261 closed May 30, 2025
Advanced indexing: allow combining Boolean & integer index
#46468 closed May 30, 2025
Cannot use mask and slice assignment together
#140802 closed May 30, 2025
Tensor slice copy across multiple devices fails silently
#84573 closed May 30, 2025
DISABLED test_neg_index_cpu (__main__.CpuTests)
#154760 closed May 30, 2025
Advanced indexing with uint8 tensor versus int64 tensor is inconsistent
#20149 closed May 30, 2025
Named Tensors: Slicing based on name
#29023 closed May 30, 2025
Advanced indexing slower than numpy
#14687 closed May 30, 2025
DISABLED test_AllenaiLongformerBase_repro_cpu (__main__.CpuTests)
#102865 closed May 30, 2025
Accessing elements of tensor with multi-dimensional index results `IndexError`
#43128 closed May 30, 2025
Indexing assignment can have no effect on CUDA with deterministic algorithms
#76176 closed May 30, 2025
Inconsistency between index_select and __get_item__
#83702 closed May 30, 2025
Advanced Indexing does not trace correctly for tensor shape that has leading 1s
#49852 closed May 30, 2025
Unexpected behaviour of 1.13.0
#90194 closed May 30, 2025
CUDA error: device-side assert triggered
#120901 closed May 30, 2025
Torch doesn't copy the assigned self-referential memory on `cpu` (inconsistent with `numpy` and `cuda`)
#126097 closed May 30, 2025
index_put : INTERNAL ASSERT FAILED
#72053 closed May 30, 2025
`torch.index_put` raise error when `accumulate=True`
#144539 closed May 30, 2025
Add int32 support to torch.gather
#148119 closed May 30, 2025
DISABLED test_var_mean_tile_reduction_False_mps (__main__.GPUTests)
#153762 closed May 30, 2025
DISABLED test_tensor_index_put_slice_mps (__main__.GPUTests)
#153761 closed May 30, 2025
[inductor] `proxy_tensor.py` throws `SyntaxError` when using `.random_`
#151432 closed May 30, 2025
DISABLED test_mean_mps (__main__.GPUTests)
#153747 closed May 30, 2025
Tensor.type_as() produces inconsistent results for torch.inf between CPU and GPU
#154730 closed May 30, 2025
Tensor.char() produces inconsistent results for torch.inf between CPU and GPU
#154727 closed May 30, 2025
Tensor.int() produces inconsistent results for torch.inf between CPU and GPU
#154726 closed May 30, 2025
Doc for `assign` parameter of `load_state_dict` is not rendered correctly
#141364 closed May 30, 2025
[export] torch.tensor constructor specializes on float value
#153411 closed May 30, 2025
Cannot override __add__ in NamedTuple with __new__ + torch.compile
#133762 closed May 30, 2025
DISABLED test_lerp_mps (__main__.GPUTests)
#153713 closed May 30, 2025
DISABLED test_view_on_aliased_mps (__main__.GPUTests)
#153717 closed May 30, 2025
DISABLED test_shape_padding_mps (__main__.GPUTests)
#153719 closed May 30, 2025
DISABLED test_topk_mps (__main__.GPUTests)
#153718 closed May 30, 2025
DISABLED test_resize_as_mps (__main__.GPUTests)
#153716 closed May 30, 2025
DISABLED test_xblock_divides_xnumel_mps (__main__.GPUTests)
#153715 closed May 30, 2025
DISABLED test_max_pool2d_with_indices_backward5_mps (__main__.GPUTests)
#153721 closed May 30, 2025
DISABLED test_dtypeview_bfloat16_bfloat16_mps (__main__.GPUTests)
#153710 closed May 30, 2025
DISABLED test_searchsorted_mps (__main__.GPUTests)
#153711 closed May 30, 2025
DISABLED test_view_uint8_through_differing_bitwidths_mps (__main__.GPUTests)
#153720 closed May 30, 2025
DISABLED test_where_with_logical_op_mps (__main__.GPUTests)
#153712 closed May 30, 2025
DISABLED test_inductor_multiple_specializations_dynamic_shapes_cuda (__main__.DynamicShapesCodegenGPUTests)
#154717 closed May 30, 2025
DISABLED test_zero_element_mutation_mps (__main__.GPUTests)
#153708 closed May 30, 2025
DISABLED test_full_dtype (__main__.TestFull)
#138574 closed May 30, 2025
DISABLED test_tensor1_mps (__main__.GPUTests)
#153709 closed May 30, 2025
DISABLED test_cache_load_function_device_cuda_float32_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_True (__main__.TestFxGraphCache)
#150991 closed May 30, 2025
inductor: inductor conv2d get a different size and stride with eager mod when input channel is zero
#101356 closed May 30, 2025
`torch.compile` fails with sparse embedding (`F.embedding(sparse=True)`)
#150656 closed May 30, 2025
jit compilation returns an int rather than a bool when using math.isnan()
#107166 closed May 30, 2025
llama model failed for dynamic shape path
#106110 closed May 30, 2025
[FSDP2] relax uniform dtype assertion for requires_grad=False
#154082 closed May 30, 2025
AttributeError: type object 'torch._C._distributed_c10d.BackendType' has no attribute 'XCCL'.
#147059 closed May 30, 2025
DISABLED test_div_zero_dim_mps (__main__.GPUTests)
#153694 closed May 30, 2025
DISABLED test_tmp_not_defined_issue2_mps (__main__.GPUTests)
#153693 closed May 30, 2025
DISABLED test_dropout_trivial_1_mps (__main__.GPUTests)
#153695 closed May 30, 2025
DISABLED test_views1_mps (__main__.GPUTests)
#153692 closed May 30, 2025
DISABLED test_var_correction_mps (__main__.GPUTests)
#153691 closed May 30, 2025
DISABLED test_linear (__main__.TestAOTInductorPackageCpp_xpu)
#154684 closed May 30, 2025
speculate_subgraph cannot detect input mutation for a function wrapped with torch.func.functionalize
#154669 closed May 29, 2025
Selective activation checkpointing causes memory leakage
#154642 closed May 29, 2025
Cleanup autotune_fallback_to_aten post-deprecation
#153298 closed May 29, 2025
flex_attention + NJT output inconsistent with non-NJT results
#154554 closed May 29, 2025
DISABLED test_dict_keys_match (__main__.TestGuardSerialization)
#153617 closed May 29, 2025
DISABLED test_seqential_batch_workers (__main__.TestDataLoader)
#81891 closed May 29, 2025
AOTI packaged model can't be run on newly created tensor of same shape as tensor created from slice
#153992 closed May 29, 2025
Silent incorrectness between static torch.compile vs eager
#152425 closed May 29, 2025
CUDA 12.9, Compile pytorch's static library error in windows 11 .
#154604 closed May 29, 2025
[Infra] Jobs got frequently cancelled, sometimes mid-checkout
#151669 closed May 29, 2025
DISABLED test_torch_manual_seed_seeds_cuda_devices (__main__.TestCuda)
#135218 closed May 28, 2025
DISABLED test_fused_sdp_choice_privateuseone (__main__.TestSDPAPrivateUse1Only)
#134600 closed May 28, 2025
DISABLED test_serialization_array_with_storage (__main__.TestCuda)
#134991 closed May 28, 2025
DISABLED test_allocator_fuzz (__main__.TestCudaMallocAsync)
#135249 closed May 28, 2025
DISABLED test_streaming_backwards_multiple_streams (__main__.TestCuda)
#135065 closed May 28, 2025
DISABLED test_max_autotune_remote_caching_dynamic_False (__main__.TestMaxAutotuneRemoteCache)
#145361 closed May 28, 2025
DISABLED test_record_stream (__main__.TestCuda)
#134746 closed May 28, 2025
DISABLED test_set_per_process_memory_fraction (__main__.TestCuda)
#135115 closed May 28, 2025
DISABLED test_random_no_reused_random_states_float32 (__main__.TestCuda)
#134944 closed May 28, 2025
DISABLED test_streaming_backwards_sync (__main__.TestCuda)
#103494 closed May 28, 2025
DISABLED test_garbage_collect_expandable (__main__.TestCudaMallocAsync)
#134811 closed May 28, 2025
DISABLED test_sum_fp16 (__main__.TestCuda)
#99953 closed May 28, 2025
DISABLED test_inplace_gradgrad_remainder_cuda_float64 (__main__.TestBwdGradientsCUDA)
#100675 closed May 28, 2025
DISABLED test_max_split_expandable (__main__.TestCudaMallocAsync)
#134837 closed May 28, 2025
DISABLED test_memory_plots_free_segment_stack (__main__.TestCudaMallocAsync)
#137223 closed May 28, 2025
DISABLED test_random_no_reused_random_states_float64 (__main__.TestCuda)
#134958 closed May 28, 2025
DISABLED test_threading (__main__.TestWithNCCL)
#141637 closed May 28, 2025
DISABLED test_randint_randomness_for_large_range (__main__.TestCuda)
#134932 closed May 28, 2025
DISABLED test_serialization_array_with_empty (__main__.TestCuda)
#134966 closed May 28, 2025
DISABLED test_prod_large (__main__.TestCuda)
#134723 closed May 28, 2025
DISABLED test_streaming_backwards_callback (__main__.TestCuda)
#135024 closed May 28, 2025
DISABLED test_streams (__main__.TestCuda)
#135114 closed May 28, 2025
torch.fft.hfft2 produces inconsistent results for infinite values between CPU and GPU
#154520 closed May 28, 2025
torch.fft.ifft2 produces inconsistent imaginary components for infinite inputs between CPU and GPU
#154521 closed May 28, 2025
DISABLED test_promotes_int_to_float_ldexp_cuda_int16 (__main__.TestCommonCUDA)
#154550 closed May 28, 2025
modded-nanogpt flaky NCCL hang starting 3/30 nightly
#152623 closed May 28, 2025
`SDPA`: `EFFICIENT_ATTENTION / FLASH_ATTENTION` backend, batch dim limited to 2**16-1 (CUDA error: invalid configuration argument)
#146704 closed May 28, 2025
Passing `device_id` to `torch.distributed.init_process_group()` results in NCCL randomly hanging during communications
#153960 closed May 28, 2025
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_int32 (__main__.TestForeachCUDA)
#153464 closed May 28, 2025
Set `size` when `is_coalesced` is set in `torch.sparse_coo_tensor()`
#145371 closed May 28, 2025
DISABLED test_parity__foreach_add_fastpath_outplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#153537 closed May 28, 2025
DISABLED test_extra_cuda_context (__main__.ProcessGroupNCCLGroupTest)
#139011 closed May 28, 2025
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_uint8 (__main__.TestForeachCUDA)
#153525 closed May 28, 2025
Torch compile cache
#144859 closed May 28, 2025
Issue while Building the Documentation
#151901 closed May 28, 2025
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_int8 (__main__.TestForeachCUDA)
#153512 closed May 28, 2025
Test Release Highlight Feature
#154462 closed May 27, 2025
Torch compile issue, AttributeError: 'NoneType' object has no attribute 'store_cubin'
#150980 closed May 27, 2025
Use of @property on in-graph constructed NJT fails Dynamo tracing
#146932 closed May 27, 2025
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_int64 (__main__.TestForeachCUDA)
#153482 closed May 27, 2025
Core dump on `import torch` when built with `USE_TENSORPIPE=0 USE_GLOO=0` (cannot initialize type "RpcBackendOptions")
#154300 closed May 27, 2025
[Dynamo] Exception raised inside torch.autocast causes crash AttributeError: 'NoneType' object has no attribute 'is_python_constant
#152012 closed May 27, 2025
Issue with different output with different torch versions.
#154411 closed May 27, 2025
[MPS] Memory leak in `nn.Linear`
#132332 closed May 27, 2025
Model export to onnx, but got "RuntimeError: Expected all tensors to be on the same device"
#154093 closed May 27, 2025
TorchVision upgrade blocked by inductor regressions
#153985 closed May 27, 2025
[Upstream Triton] AttributeError: module 'triton.language.extra.cuda' has no attribute ```experimental_descriptor_load```
#154210 closed May 27, 2025
[Upstream Triton] AttributeError: module 'triton.language.extra.cuda' has no attribute ```experimental_tensormap_fenceproxy_acquire```
#154209 closed May 27, 2025
running my facebook/bart-base for summarization task : MPS does not support cumsum op with int64 input
#141786 closed May 27, 2025
`Tensor._make_wrapper_subclass` is not listed in `torch/_C/__init__.pyi`
#153790 closed May 27, 2025
Torch not found
#133076 closed May 27, 2025
DISABLED test_sdpa_rewriter_11_cuda (__main__.SDPAPatternRewriterCudaDynamicTests)
#148631 closed May 27, 2025
[dynamo] torch._dynamo crashes on `self.value.__module__` inside SkipFunctionVariable.call_function() (PyTorch 2.7, works 2.6)
#152316 closed May 27, 2025
Multiheadattention module doesn't implement the function about kdim and vdim
#95712 closed May 27, 2025
torch.distributed error with nccl backend
#154342 closed May 27, 2025
AOTInductor: Artifact compiled on A10 (SM_86) fails on H20 (SM_90) despite torch._inductor.config.cuda.arch="90"
#153697 closed May 27, 2025
UNSTABLE pull / linux-jammy-py3-clang12-executorch / test (executorch)
#144480 closed May 27, 2025
Whether the transposed tensor is contiguous affects the results of the subsequent Linear layer.
#148939 closed May 27, 2025
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_int16 (__main__.TestForeachCUDA)
#153440 closed May 27, 2025
MaxUnpool2d raise RuntimeError: view size is not compatible with input tensor's size and stride
#154341 closed May 27, 2025
module 'torch' has no attribute 'get_default_device'
#154362 closed May 27, 2025
[distributions] Creating a second instance of `Wishart` modifies the constraints on the first instance.
#154355 closed May 26, 2025
[Upstream Triton] AssertionError: Scalars are not equal!\n\nExpected 12 but got 9 ```inductor.test_cpu_cpp_wrapper```
#154247 closed May 26, 2025
Refactor MegaCache to make it generic
#152976 closed May 26, 2025
DISABLED test_item_to_inputs_kernel_nobreak_cuda (__main__.TestInductorDynamicCUDA)
#119538 closed May 26, 2025
Inconsistent behavior and misleading error message for `torch.nanmean()` with complex dtypes
#153132 closed May 26, 2025
DISABLED test_dtensor_seq_par_shard_dim_1 (__main__.MicroPipelineTPTest)
#153223 closed May 26, 2025
DISABLED test_remote_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_False_use_static_cuda_launcher_False (__main__.TestFxGraphCache)
#152222 closed May 26, 2025
DISABLED test_cache_load_function_device_cuda_bfloat16_dynamic_False_bundle_triton_True_use_static_cuda_launcher_False_grad_False (__main__.TestFxGraphCache)
#151137 closed May 26, 2025
DISABLED test_sdpa_rewriter_14_cuda (__main__.SDPAPatternRewriterCudaDynamicTests)
#147600 closed May 26, 2025
Deterministic `index_put` on CUDA fails when broadcasting is required
#79987 closed May 25, 2025
Does Pytorch have the function that can obtain sub-matrix according to index?
#49278 closed May 25, 2025
index_put_ take min when there are repeated indices
#19197 closed May 25, 2025
[bug] inconsistent behavior of indexing
#14227 closed May 25, 2025
pinned_use_background_threads will cause a coredump
#152008 closed May 25, 2025
Index out of bound when running torch.gather
#107540 closed May 25, 2025
Tensor __getitem__ not documented, sparse grad?
#101068 closed May 25, 2025
scaled_dot_product_attention(): argument 'is_causal' must be bool, not SymBool
#154038 closed May 24, 2025
DISABLED test_mutation_rename (__main__.TestMaxAutotune)
#154218 closed May 24, 2025
How can I use inductor aot_compile to support a MoE network?
#148747 closed May 24, 2025
Query Regarding Memory Release API in AOTInductor for PyTorch
#153363 closed May 24, 2025
Wrong result for modulo if the dividend and divisor are int when using mps
#154171 closed May 24, 2025
Add the capability to export GradMultiply to ONNX
#73354 closed May 24, 2025
Torchdynamo with onnxrt backend generating fake tensor errors
#93502 closed May 24, 2025
autoformat failures blocking merge
#154084 closed May 23, 2025
UT failure in test_decompose_mem_bound_mm.py for Inductor
#153585 closed May 23, 2025
DISABLED test_find_or_create_pg (__main__.TestPgTag)
#107278 closed May 23, 2025
Sparse CSR layout GPU backend tracking issue
#60854 closed May 23, 2025
Problem installing from source on CentOS 6.5
#28444 closed May 23, 2025
Problem when installing pytorch 1.4 from source on Centos 6.3
#28497 closed May 23, 2025
torch.where behaves differently from in place replacement
#96110 closed May 23, 2025
test
#154278 closed May 23, 2025
Mysterious Tensor Indexing Problem
#22013 closed May 23, 2025
CUDA failure with deterministic fancy indexed assignment with broadcasting
#131933 closed May 23, 2025
Gather backward is faster than integer indexing on GPU
#15245 closed May 23, 2025
xpu: torch.nn.functional.scaled_dot_product_attention produces NaN on XPU
#154051 closed May 23, 2025
`RNNBase` modules break parameter sharing due to `flatten_parameters()`
#154238 closed May 23, 2025
Documentation issue about torch.finfo(x.dtype).eps
#154184 closed May 23, 2025
Bugs encountered when installing the torch corresponding to CUDA12.8
#154200 closed May 23, 2025
[Upstream Triton] AttributeError: module 'triton.language.extra.cuda' has no attribute ```experimental_tensormap_fenceproxy_acquire```
#154208 closed May 23, 2025
build pytorch2.3.0 cpu with mkldnn_acl 24.08 failed on aarch64
#148841 closed May 23, 2025
Assertion Failure: TestBinaryUfuncsCPU.test_lerp_cpu_complex64 on Graviton 3
#146155 closed May 23, 2025
Assertion Failure: TestMkldnnCPU.test_matmul_lower_precision_cpu_float16 on Graviton 2 & 3
#146484 closed May 23, 2025

373 Issues opened by 222 people

DISABLED test_get_parameter_dtype (__main__.ReproTests)
#156598 opened Jun 23, 2025
DISABLED test_add_sub_alpha_out (__main__.ReproTests)
#156597 opened Jun 23, 2025
Python 3.14 and dynamo fails to build
#156595 opened Jun 23, 2025
DeviceMesh's `_set_mesh_dim_group_options` ineffective for 1-dim meshes
#156593 opened Jun 23, 2025
Set dependencies lower bound
#156587 opened Jun 23, 2025
DISABLED test_dont_dce_rand (__main__.ReproTests)
#156580 opened Jun 23, 2025
DISABLED test_add_complex_conj (__main__.ReproTests)
#156579 opened Jun 23, 2025
Tracing with aot_eager/torch.compile produces wrong strides on HPU Meta dispatch
#156578 opened Jun 23, 2025
DISABLED test_basic_fn_backend_eager_device_cuda (__main__.TestPackage)
#156576 opened Jun 23, 2025
DISABLED test_dont_aggressively_write_assert (__main__.ReproTests)
#156570 opened Jun 23, 2025
DISABLED test_nccl_symmem_alloc (__main__.NCCLSymmetricMemoryTest)
#156569 opened Jun 23, 2025
Segmentation fault (core dumped) in `torch.profiler.profile`
#156563 opened Jun 22, 2025
Using collections.namedtuple in the forward method of the model resulted in compilation failure
#156558 opened Jun 22, 2025
[JIT] TorchScript fails to compile model using random.choice() with confusing error message
#156557 opened Jun 22, 2025
When using Apple MPS, the bias of BatchNorm1d becomes extremely large.
#156555 opened Jun 22, 2025
ConvertTritonGPUToLLVM pass fails on fused GroupNorm backward (SM 89) under torch.compile(…, backend='inductor')
#156549 opened Jun 21, 2025
Inconsistent Model Results and Failures on Windows with CUDA vs. CPU PyTorch Builds
#156547 opened Jun 21, 2025
`TorchScript` does not allow accessing methods of nested tensors
#156544 opened Jun 21, 2025
Issue when Using SparseMPS backend (especially the operation aten::_sparse_coo_tensor_with_dims_and_tensors) for Whisper model
#156540 opened Jun 21, 2025
FSDP2 - Tensor incompatibility
#156535 opened Jun 21, 2025
`<<` and `>>` operators seem silently broken for DTensor operand 1 and scalar operand 2
#156533 opened Jun 21, 2025
[ONNX] Update tests for attention
#156524 opened Jun 20, 2025
[Dtensor] handle dtensor ops that only need to operate on certain shard without all_gather first.
#156523 opened Jun 20, 2025
[user empathy] compile for `transformers` model
#156520 opened Jun 20, 2025
Convenient way to create device with torch.accelerator and a specific device index
#156519 opened Jun 20, 2025
DISABLED test_comprehensive_nn_functional_linear_cuda_float16 (__main__.TestInductorOpInfoCUDA)
#156514 opened Jun 20, 2025
Native BFloat16 Mixed BatchNorm Train gives incorrect gradients
#156513 opened Jun 20, 2025
functorch_maml_omniglot is a bad CPU performance smoketest model
#156511 opened Jun 20, 2025
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int32 (__main__.TestForeachCUDA)
#156497 opened Jun 20, 2025
mypy.ini deprecation: numpy.typing.mypy_plugin
#156489 opened Jun 20, 2025
Add stub for mypy-torch._C._jit_tree_views
#156488 opened Jun 20, 2025
SDPA FLASH_ATTENTION backend gets NaN values for IPEX on Intel CPU
#156487 opened Jun 20, 2025
Parameter "onnx_shape_inference" can't be successfully passed to the "_export" interface in "torch/onnx/utils.py" via the "torch.onnx.export" interface
#156480 opened Jun 20, 2025
`torch.compile` fails with `UnicodeDecodeError` when model contains extreme value injection
#156451 opened Jun 19, 2025
Tensor.is_pinned() raises error after renaming privateuseone backend.
#156444 opened Jun 19, 2025
DISABLED test_inlined_optimized_graph (__main__.TestTEFuserDynamic)
#156438 opened Jun 19, 2025
Accuracy minifier fails to minify anything
#156437 opened Jun 19, 2025
DISABLED test_skip_grad_in_check (__main__.TestTEFuserDynamic)
#156436 opened Jun 19, 2025
Suggest to use the torch cmake target instead of ${TORCH_LIBRARIES} in the c++ docs
#156434 opened Jun 19, 2025
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int16 (__main__.TestForeachCUDA)
#156430 opened Jun 19, 2025
[CI] "Update viable/strict" job occasionally hangs for days
#156425 opened Jun 19, 2025
research adding cuda-bindings to core
#156424 opened Jun 19, 2025
DISABLED test_fake_crossref_backward_no_amp_cholesky_solve_cuda_float32 (__main__.TestFakeTensorCUDA)
#156419 opened Jun 19, 2025
ShardedTensor breaks cycle detection
#156417 opened Jun 19, 2025
A possible bug in HistogramObserver._combine_histograms()
#156414 opened Jun 19, 2025
bmm max-autotune segfaults on x86 cpu
#156412 opened Jun 19, 2025
[compile] torch._dynamo.exc.TorchRuntimeError: Failed running call_function aten.lift_fresh_copy.default
#156411 opened Jun 19, 2025
DataParallel gather NaN across multiple gpus
#156392 opened Jun 19, 2025
[compile][transformers] Recompilation with mark_static_address with cudagraphs
#156377 opened Jun 18, 2025
DISABLED test_inplace_on_view_undefined_grad_output_cpu (__main__.TestAutogradDeviceTypeCPU)
#156363 opened Jun 18, 2025
DISABLED test_schedule_with_native_zero_bubble_ScheduleClass1 (__main__.ScheduleTest)
#156328 opened Jun 18, 2025
UNSTABLE periodic / linux-jammy-rocm-py3.10 / test (distributed)
#156327 opened Jun 18, 2025
[inductor] tune_scaled_grouped_mm fails with memory layout assertion, despite memory layout assertions prior to op call passing
#156325 opened Jun 18, 2025
Composition of nested `torch.compile` calls is not well defined
#156308 opened Jun 18, 2025
DISABLED test_inplace_on_view_then_no_grad_cpu (__main__.TestAutogradDeviceTypeCPU)
#156306 opened Jun 18, 2025
Inductor error with Torch XPU optimizations to StableDiffusion3 Pipeline
#156303 opened Jun 18, 2025
DISABLED test_inplace_on_view_of_view_cpu (__main__.TestAutogradDeviceTypeCPU)
#156289 opened Jun 18, 2025
DISABLED test_inplace_on_view_non_contig_cpu (__main__.TestAutogradDeviceTypeCPU)
#156265 opened Jun 18, 2025
DISABLED test_shape_env (__main__.TestGuardSerialization)
#156264 opened Jun 18, 2025
torch._foreach_copy_ causing CUDA illegal memory access.
#156261 opened Jun 18, 2025
[ONNX] Create a tutorial for exporting hf transformers model
#156258 opened Jun 18, 2025
[dynamic shapes] translation validation failure under `fake_tensor_propagate_real_tensors`
#156251 opened Jun 17, 2025
DISABLED test_name_match (__main__.TestGuardSerialization)
#156246 opened Jun 17, 2025
Upgrade torch._scaled_grouped_mm to SM100+
#156238 opened Jun 17, 2025
[Tracker] AutoParallel's feature request to DTensor
#156217 opened Jun 17, 2025
DISABLED test_inplace_on_view_makes_base_require_grad_cpu (__main__.TestAutogradDeviceTypeCPU)
#156209 opened Jun 17, 2025
When compiling submodules, AOTInductor is significantly slower with torch.export
#156206 opened Jun 17, 2025
Upgrade torch._grouped_mm to SM100+
#156202 opened Jun 17, 2025
Dynamo does not know how to trace method `__len__` of class `<unknown type>` with torch.logging calls
#156191 opened Jun 17, 2025
[CD] Windows Wheel builds CUDA 12.9.1 Stack Overflow during build
#156181 opened Jun 17, 2025
DISABLED test_inplace_on_view_backprop_view_of_view_cpu (__main__.TestAutogradDeviceTypeCPU)
#156180 opened Jun 17, 2025
nn.Module._load_from_state_dict is always called with strict=True
#156177 opened Jun 17, 2025
[Feature Request]: Native C++ API for ONNX Export in LibTorch
#156168 opened Jun 17, 2025
DISABLED test_inplace_on_view_backprop_view_cpu (__main__.TestAutogradDeviceTypeCPU)
#156163 opened Jun 17, 2025
torch2.7.1 issue for torch.compile numpy
#156162 opened Jun 17, 2025
[SDPA] RTX5080 is different from CPU calculation result in backward with long seq
#156160 opened Jun 17, 2025
Error shm.dll
#156159 opened Jun 17, 2025
Inconsistent `torch.rsqrt` results on complex128 between CPU and CUDA
#156152 opened Jun 17, 2025
DISABLED test_inplace_on_view_backprop_base_cpu (__main__.TestAutogradDeviceTypeCPU)
#156143 opened Jun 17, 2025
[dynamo, dynamic shapes] .item() on Tensor created in the compiled region fails
#156135 opened Jun 16, 2025
[DDP][FSDP2] add unit test to showcase DDP mixed precision with FSDP2 mixed precision
#156130 opened Jun 16, 2025
[dynamo] Show carets in graph break stack traces
#156127 opened Jun 16, 2025
[compile][torchtune] Full model compiled Qwen3 is 4x slower than eager
#156103 opened Jun 16, 2025
UNSTABLE rocm / linux-jammy-rocm-py3.10 / test (default)
#156098 opened Jun 16, 2025
DISABLED test_quantize (__main__.TestOpenReg)
#156089 opened Jun 16, 2025
DISABLED test_schedule_with_native_zero_bubble_ScheduleClass0 (__main__.ScheduleTest)
#156088 opened Jun 16, 2025
Idea: Add SBOM Generation (and optional vuln scan) for better supply chain insight
#156085 opened Jun 16, 2025
`torch.logsumexp`: support `dim=None`
#156075 opened Jun 16, 2025
[codespell] fix typos in the codebase
#156073 opened Jun 16, 2025
[typing][docs] `torch.amin` and `torch.amax` do not document `dim=None`
#156072 opened Jun 16, 2025
Docs incorrectly claim `torch.max` and `torch.logsumexp` accept `dim=None`
#156071 opened Jun 16, 2025
torch.nn.functional.conv_transpose3d return inconsistent results when weight containing inf between CPU and GPU
#156062 opened Jun 16, 2025
Dynamo trace an incorrect result on torch._C._storage_Use_Count
#156059 opened Jun 16, 2025
torch.equal causes fallback to eager mode in torch.compile
#156057 opened Jun 16, 2025
`torch.distributed.tensor.parallel.style.ColwiseParallel` introduce huge guard eval latency
#156054 opened Jun 16, 2025
Ability to set device guard in Python
#156052 opened Jun 16, 2025
[Upstream Triton] persistent mm + tma accuracy failures
#156028 opened Jun 15, 2025
Error installing torch for CUDA 12.1 on GTX 1660 Ti
#156024 opened Jun 15, 2025
[Segfault Bug] Out-of-bounds write of at::native::cpubla::gemm (bfloat16) in at::native::cpu_flash_attention_backward
#156022 opened Jun 15, 2025
torch.fft.ifft for complex64 produces inconsistent results between CPU and CUDA
#156020 opened Jun 15, 2025
[ROCm] BF16 Context Parallelism MI300X Not Numerically Accurate
#156012 opened Jun 15, 2025
`torch.prod` or `torch.special.entr` triggers `CUDA driver error: invalid argument` on GPU unless kernel cache is cleared
#156010 opened Jun 15, 2025
Deprecation notice of `torch.norm` and `Tensor.norm` across the documentation
#156005 opened Jun 14, 2025
DCP save only saves one shard of tensor parallel model when using DP + TP
#156002 opened Jun 14, 2025
ONNX export via Dynamo sets `dft_length = 1` in `DFT`, breaking shape-inference for `torch.fft.rfft`
#155997 opened Jun 14, 2025
Keep getting AssertionError: found no DeviceMesh from dtensor args for c10d.broadcast_.default!
#155993 opened Jun 14, 2025
[feature request] Native checkpointing to/from `s3://`
#155992 opened Jun 14, 2025
Vulkan interoperability
#155986 opened Jun 14, 2025
as_tensor of list of tensors should keep grad history
#155983 opened Jun 14, 2025
Floating point exception (core dumped) in torch.nn.functional.fold
#155981 opened Jun 14, 2025
Including XPU and CUDA in ProfilerActivity causes XPU profiling to be ignored
#155957 opened Jun 13, 2025
DISABLED test_call_count_tunableop_cuda_float32 (__main__.TestLinalgCUDA)
#155953 opened Jun 13, 2025
test vLLM with PyTorch 2.8rc before releasing PyTorch 2.8
#155933 opened Jun 13, 2025
Activation Checkpointing breaks "torch.distributed.checkpoint.state_dict._get_fqns"
#155924 opened Jun 13, 2025
UNSTABLE inductor-rocm / rocm-py3.10-inductor / test (inductor)
#155917 opened Jun 13, 2025
[`torch.distributed`] Watchdog monitor error handling should match watchdog error handling
#155916 opened Jun 13, 2025
DISABLED test_non_pow_2_headdim_head_dim_24_float16_cuda_float16 (__main__.TestFlexDecodingCUDA)
#155905 opened Jun 13, 2025
`torch.compile` fails when using `sdpa_kernel` with kwarg `set_priority` due to overly strict `TorchDynamo` assertion
#155898 opened Jun 13, 2025
test_flex_attention unit test failures
#155894 opened Jun 13, 2025
DISABLED test_non_pow_2_headdim_head_dim_17_float16_cuda_float16 (__main__.TestFlexDecodingCUDA)
#155893 opened Jun 13, 2025
[RFC] `DeviceGroup`: a mixin of `ProcessGroup` and `Backend`
#155892 opened Jun 13, 2025
[CUDA][CUTLASS] test_cutlass_backend.py unit test failures on SM90+
#155888 opened Jun 13, 2025
RFC: Add Adaptive Entropy-Gated Contrastive Fusion (AECF) for Robust Multimodal Attention Pooling
#155878 opened Jun 13, 2025
Migrating existing backend-MAIA integration toward PrivateUse1 / openReg
#155864 opened Jun 12, 2025
Export Huggingface models with StaticCache
#155862 opened Jun 12, 2025
[user triton][dynamo Fix TMA Descriptor reconstruction
#155856 opened Jun 12, 2025
intmm is being compiled inconsistently with errors.
#155838 opened Jun 12, 2025
Running dispatch modes on compile-disabled regions of a compiled model
#155825 opened Jun 12, 2025
DISABLED test_non_pow_2_headdim_head_dim_121_float16_cuda_float16 (__main__.TestFlexDecodingCUDA)
#155824 opened Jun 12, 2025
DISABLED test_compare_cpu_nn_functional_conv1d_cuda_float32 (__main__.TestCommonCUDA)
#155822 opened Jun 12, 2025
DISABLED test_non_pow_2_headdim_head_dim_94_float16_cuda_float16 (__main__.TestFlexDecodingCUDA)
#155808 opened Jun 12, 2025
torch.distributions.kl_divergence Fails with MultivariateNormal in Dynamo Due to _infer_size Type Error
#155800 opened Jun 12, 2025
`nn.RNN(...).to('cuda')` fails with `cuDNN error: CUDNN_STATUS_BAD_PARAM` on GPU, but works on CPU
#155798 opened Jun 12, 2025
[MPS] Performance regression and visual bug with ComfyUI Flux dev since nightly 20250510
#155797 opened Jun 12, 2025
Poor scaling on AArch64 at high thread counts
#155795 opened Jun 12, 2025
Compilation issues with ROCm 6.4.1 on Debian 12
#155794 opened Jun 12, 2025
DISABLED test_ddp_apply_optim_in_backward_ignored_params (__main__.TestDistBackendWithSpawn)
#155751 opened Jun 11, 2025
DISABLED test_ddp_apply_optim_in_backward_grad_as_bucket_view_false (__main__.TestDistBackendWithSpawn)
#155750 opened Jun 11, 2025
Tensor functionalization loses .grad information
#155725 opened Jun 11, 2025
[rocm] HIP Graph capture raises segmentation fault on AMD GPU but CUDA Graph capture succeeds on Nvidia GPU
#155720 opened Jun 11, 2025
DISABLED test_ddp_apply_optim_in_backward (__main__.TestDistBackendWithSpawn)
#155714 opened Jun 11, 2025
DISABLED test_allgather_stress_cuda (__main__.ProcessGroupGlooFRTest)
#155711 opened Jun 11, 2025
[pt2][precompile] Support ID_MATCH guards and torch.nn.Modules
#155705 opened Jun 11, 2025
[pt2] [precompile] Support cudagraphs in BundledAOTAutogradCacheArtifact
#155703 opened Jun 11, 2025
torch typing: register_buffer -> device
#155693 opened Jun 11, 2025
torch.compile produces incorrect output
#155690 opened Jun 11, 2025
[rocm] HIP Graph (on AMD GPU) capture does not raise `operation not permitted` for illegal operation whereas CUDA Graph (Nvidia GPU) does
#155684 opened Jun 11, 2025
[FSDP] optimizer_state_dict fails if FSDP model is not on global rank 0
#155680 opened Jun 11, 2025
Torch compile CUDA graphs leads to a large number of CUDA streams
#155679 opened Jun 11, 2025
torch.clamp throws overflow error on CPU but not on CUDA
#155671 opened Jun 11, 2025
[libtorch] Debug and Release should be shipped together on Windows
#155667 opened Jun 11, 2025
[Misc] skip_if decorators aborts the test process in distributed/test_composability.py
#155664 opened Jun 11, 2025
[DTensor] DTensor is not well supported on older versions of GPUs, such as A10
#155657 opened Jun 11, 2025
Hangs on torch.tril on Intel GPU
#155651 opened Jun 11, 2025
Functional all_gather_into_tensor does not support stacking, fails when compiled
#155632 opened Jun 10, 2025
DISABLED test_trust_repo_check_yes (__main__.TestHub)
#155617 opened Jun 10, 2025
Additional Documentation About Size Checking with Custom Operators
#155616 opened Jun 10, 2025
CUDA 12.6->12.8 slow and periodic failures
#155607 opened Jun 10, 2025
Question in aot_autograd trace in torch.distributed case
#155599 opened Jun 10, 2025
[libtorch] Crash on creating torch::optim::Adam optimizer for Windows
#155597 opened Jun 10, 2025
pipeline() fails when a sub-module uses "no_grad()"; impacts RoPE implementation on HF models
#155589 opened Jun 10, 2025
nn.init.trunc_normal_ Creates Massive Outliers with Small std Due to erfinv Instability
#155588 opened Jun 10, 2025
[Upstream Triton] Handle user-specified triton.set_allocator function
#155584 opened Jun 10, 2025
[Upstream Triton] Support new host-side TMA API in user-defined triton kernels
#155574 opened Jun 10, 2025
[Misc] test_foreach_add_different_mesh cannot work on machines with less than 4 GPUs
#155562 opened Jun 10, 2025
[MPS] Migrate torch.sort to Metal shader
#155560 opened Jun 10, 2025
Reproducibility of results without AVX512 by setting ATEN_CPU_CAPABILITY=avx2
#155552 opened Jun 10, 2025
get_ema_multi_avg_fn() equation is a little confused
#155551 opened Jun 10, 2025
[torch.export] Cannot export TorchVision raft_small, raft_large
#155550 opened Jun 10, 2025
[Misc] The keys of the sac_milp return dict were modified, but the test case was not updated
#155547 opened Jun 10, 2025
Clarify default value of eps in RMSNorm documentation
#155527 opened Jun 10, 2025
[Tracking] Triton 3.2 deprecation
#155519 opened Jun 10, 2025
DISABLED test_weight_norm_bwd_dynamic_shapes_cuda (__main__.DynamicShapesGPUTests)
#155517 opened Jun 10, 2025
DISABLED test_weight_norm_bwd_dynamic_shapes_cuda (__main__.DynamicShapesCodegenGPUTests)
#155516 opened Jun 10, 2025
Unable to find caffe2 when building libtorch from the source and then trying to use it in a cmake project
#155512 opened Jun 10, 2025
[cutlass backend] Arch in manifest is not handled correctly if we want to GenerateSM90 on Blackwell
#155511 opened Jun 10, 2025
PP activation offloading
#155490 opened Jun 9, 2025
sizevars.size_hint is an unbacked shapes footgun
#155484 opened Jun 9, 2025
[RFC][c10d] Make it easier to act on group
#155472 opened Jun 9, 2025
in partitioner avoid doing replacements on inputs when determining which symnodes to pass to backward.
#155468 opened Jun 9, 2025
torch.distributed TCP init method binds socket to all interfaces in all cases
#155467 opened Jun 9, 2025
AssertionError: found no DeviceMesh from dtensor args for c10d.broadcast_.default!
#155463 opened Jun 9, 2025
High-performance LLM quantization on X86 CPU with native PyTorch
#155435 opened Jun 9, 2025
Add SM100/B200 support for torch._grouped_mm
#155434 opened Jun 9, 2025
[pt2] [Precompile] Store parameters in BundledAOTAutogradCacheEntry
#155433 opened Jun 9, 2025
`torch._dynamo.exc.Unsupported: Attempted to call function marked as skipped`. Explanation: Dynamo developers have intentionally marked that the function `_immutable_list_unflatten`
#155426 opened Jun 8, 2025
Backward fails with compiled attention on nested tensors
#155421 opened Jun 8, 2025
record_stream + cudagraph + multiple streams leaks memory.
#155398 opened Jun 7, 2025
crash in torch.histc
#155393 opened Jun 7, 2025
OpOverloads should have annotations
#155386 opened Jun 7, 2025
[custom_op] Custom ops created by @custom_op should get type hints propagated to their OpOverload
#155385 opened Jun 7, 2025
RuntimeError: Could not find libnvrtc.so. Please make sure CUDA is installed.
#155378 opened Jun 6, 2025
Change error to warning when tensor size mismatch while loading params
#155368 opened Jun 6, 2025
[Graph Partition] use pinned memory and foreach when moving cpu scalar tensor to gpu
#155360 opened Jun 6, 2025
[dynamo] Fixes to lru_cache Dynamo warning
#155352 opened Jun 6, 2025
CUDA_HOME doesn't seem to work with setup script
#155350 opened Jun 6, 2025
PyTorch CPP Extensions fail when same kernel is compiled more than once on ROCm servers
#155344 opened Jun 6, 2025
[A100][cusparselt] Non-deterministic correctness issue with some algorithms
#155333 opened Jun 6, 2025
compiler cache not work
#155332 opened Jun 6, 2025
Can't use torch.compile inside of a torch_dispatch mode
#155331 opened Jun 6, 2025
IntraKernel Dispatcher
#155330 opened Jun 6, 2025
Excessively restrictive dependencies
#155325 opened Jun 6, 2025
Segmentation fault after manual tensor assignment with autograd enabled on CUDA
#155322 opened Jun 6, 2025
mishandling torch.package.PackageExporter raises Aborted when given a tensor instead of an importer
#155321 opened Jun 6, 2025
`torch.export.export()` fails on GPU with LSTM model: "Cannot access data pointer of Tensor"
#155309 opened Jun 6, 2025
[MPS] 5D 4Mln rng attempts fail with internal error
#155293 opened Jun 6, 2025
torch.compile with InvokeAI: 'UserDefinedObjectVariable' object has no attribute 'proxy' / compute_ancestors KeyError op4
#155266 opened Jun 5, 2025
nn.NLLLoss Fails with 1D Inputs in Compiled Mode
#155247 opened Jun 5, 2025
Dynamo Fails on torch_scatter.scatter_max with Fake Tensor Allocation Error During Graph Tracing
#155240 opened Jun 5, 2025
Dynamo Fails on pad_packed_sequence Due to Fake Tensor Allocation Issue During FX Graph Tracing
#155238 opened Jun 5, 2025
DISABLED test_Transformer_multilayer_coder_cuda_tf32 (__main__.TestNN)
#155235 opened Jun 5, 2025
[rocm] `torch.fake_quantize_per_channel_affine` raises `CUDA error: operation not permitted when stream is capturing` on NVIDIA devices during cuda graph capture, but does not on AMD devices which may or may not result in blunt segmentation fault
#155231 opened Jun 5, 2025
[scan] scan is broken in nightly
#155230 opened Jun 5, 2025
canUse32BitIndexMath set to False with efficient net
#155225 opened Jun 5, 2025
[FSDP2] set_reduce_scatter_divide_factor errors with non-trivial MixedPrecisionPolicy
#155223 opened Jun 5, 2025
KeyError when using fx.split_module
#155220 opened Jun 5, 2025
3D Conv Slow with Large Tensors (> 2**31) Elements
#155218 opened Jun 5, 2025
DISABLED test_TransformerEncoderLayer_gelu_activation_cuda_tf32 (__main__.TestNN)
#155217 opened Jun 5, 2025
DISABLED test_Linear_cuda_tf32 (__main__.TestNN)
#155216 opened Jun 5, 2025
Profiler: Add hide metadata flag to skip events in key_averages() table
#155213 opened Jun 5, 2025
torch.compile bug when using resize
#155209 opened Jun 5, 2025
torch.compile failure in `all_to_all_single_grad` with dynamic splits
#155205 opened Jun 5, 2025
support for cuDNN 9.8+
#155203 opened Jun 5, 2025
Importing xgboost before torch + openmp causes seg fault
#155201 opened Jun 5, 2025
make html failing while building docs
#155199 opened Jun 5, 2025
Enable CUDA 12.9 binaries
#155196 opened Jun 5, 2025
Should be ReLU6(Module)
#155193 opened Jun 5, 2025
Graph break when modifying a list that contains symints.
#155174 opened Jun 5, 2025
[AC] torch.utils.checkpoint.CheckpointError from HF qwen2
#155171 opened Jun 4, 2025
[dynamic shapes] data-dependent error with conv1d
#155162 opened Jun 4, 2025
Multi-worker dataloader for `IterableDataset` holds open process on macOS Python 3.12
#155157 opened Jun 4, 2025
torch.profiler raises Aborted (core dumped) failure related to GIL (gilstate_tss_set)
#155147 opened Jun 4, 2025
Segmentation fault when using torch.profiler
#155146 opened Jun 4, 2025
[RFC] Experimental Wheel Variant Support
#155141 opened Jun 4, 2025
some_tensor.to("cpu", non_blocking=True) becomes sync under PT2 while async in eager mode
#155121 opened Jun 4, 2025
After `torch.export.export`, my model inference results in FakeTensor.
#155114 opened Jun 4, 2025
[dynamo && compiled autograd] compiled autograd recompiles when using custom autograd functions and compiling the forward with backend "eager"
#155105 opened Jun 4, 2025
`torch.jit.script` models with `Dict[str, Tensor]` return cannot be exported via `torch.onnx.export` without `dynamo=True`, and error message is unclear
#155091 opened Jun 4, 2025
Flex Attention and Nested Tensor: very high VRAM usage
#155065 opened Jun 3, 2025
MPS Memory Leak
#155060 opened Jun 3, 2025
Segfault after clearing Dynamo Cache
#155057 opened Jun 3, 2025
[Upstream Triton] AOTI support w/ new TMA API
#155047 opened Jun 3, 2025
ROCm: torch.cholesky_inverse raises Memory access fault for large tensor shapes
#155046 opened Jun 3, 2025
Execution of an `ExportProgram` of a model with `torch.autograd.grad(torch.sqrt(x), x, torch.ones_like(torch.sqrt(x)))` returns a `FakeTensor` instead of a real tensor
#155044 opened Jun 3, 2025
Convert to markdown: storage.rst, tensor_attributes.rst, tensor_view.rst, tensorboard.rst, tensors.rst
#155034 opened Jun 3, 2025
Convert to markdown: nn.attention.rst, nn.functional.rst, nn.init.rst, nn.rst, onnx_dynamo_memory_usage.rst
#155029 opened Jun 3, 2025
Convert to markdown: jit_python_reference.rst, jit_unsupported.rst, jit_utils.rst, jit.rst, library.rst
#155024 opened Jun 3, 2025
Convert to markdown: cpp_extension.rst, cpp_index.rst, cpu.rst, cuda_environment_variables.rst, cuda._sanitizer.rst
#155015 opened Jun 3, 2025
Convert to markdown: bottleneck.rst, checkpoint.rst, complex_numbers.rst, cond.rst, config_mod.rst
#155014 opened Jun 3, 2025
Please support libtorch for XPU
#155011 opened Jun 3, 2025
Internal Assertion Failure with Invalid Arguments in max_unpool1d under TorchScript
#155009 opened Jun 3, 2025
Internal Assertion Failure with Invalid Arguments in max_pool1d under TorchScript
#155007 opened Jun 3, 2025
torch.compile triton kernel errors when there are """ docblocks
#155006 opened Jun 3, 2025
Internal Assertion Failure with Invalid Arguments in max_pool2d under TorchScript
#155004 opened Jun 3, 2025
Internal Assertion Failure with Invalid Arguments in max_unpool2d under TorchScript
#155003 opened Jun 3, 2025
The profiler does not seem to be able to record cuda runtime nodes
#155001 opened Jun 3, 2025
[Inductor] Float division inside tl.load in codegen results in TypeError('unexpected type fp32')
#154996 opened Jun 3, 2025
[FSDP2] Slower Convergence with fully_shard() Compared to DDP during Qwen2-VL Fine-Tuning
#154984 opened Jun 3, 2025
torch.linalg.vector_norm fails during torch.compile due to mismatch expected out dtype tensor
#154982 opened Jun 3, 2025
INTERNAL ASSERT FAILED in mse_loss when mixing CPU and CUDA tensors
#154978 opened Jun 3, 2025
[FSDP2] all_gather_copy_in for cpu offload
#154960 opened Jun 3, 2025
Invalid stride of output when use torch.cond
#154949 opened Jun 3, 2025
[Inductor] why `TORCHINDUCTOR_MAX_AUTOTUNE_GEMM_BACKENDS=TRITON` does NOT work?
#154947 opened Jun 3, 2025
[Tracker] Support flash attention fa3 ABI stable w/ libtorch
#154908 opened Jun 2, 2025
load_state_dict with strict should be able to remove_duplicate?
#154906 opened Jun 2, 2025
MPS does not support addmm for non-float input
#154901 opened Jun 2, 2025
MPS topk failure for 5D tensor or above
#154890 opened Jun 2, 2025
MPS batch_norm mixed dtype failure
#154887 opened Jun 2, 2025
MPS max_pool2d_with_indices failure: destination values and indices length mismatch along axis
#154882 opened Jun 2, 2025
NVLS algorithms are not disabled in PGNCCL in pytorch deterministic mode
#154880 opened Jun 2, 2025
documentation of Adafactor does not match the implementation
#154862 opened Jun 2, 2025
test_tensorboard.py::TestTensorBoardSummary::test_histogram_auto Fails with s390x-periodic / linux-manylinux-2_28-py3-cpu-s390x / test (default, 7, 10, linux.s390x)
#154855 opened Jun 2, 2025
Add node.meta["stack_trace"] to make_fx
#154853 opened Jun 2, 2025
Improve warning when specializations happen due to operator
#154851 opened Jun 2, 2025
Torchrun should handle SIGUSR1 and SIGUSR2
#154849 opened Jun 2, 2025
don't require recompiles when switching between torch.Tensor vs AsyncCollectiveTensor graph inputs
#154847 opened Jun 2, 2025
`torch.linalg.solve` does not raise an error for singular matrix on CPU in Linux environment, but works correctly on Windows
#154842 opened Jun 2, 2025
[FSDP2] fix unit test test_all_gather_extension_outer_size_stride
#154836 opened Jun 2, 2025
torch inductor cudagraph tree could incorrectly release some input nodes during replay
#154824 opened Jun 1, 2025
In-place operations are reordered across the forward-backward in autograd function
#154820 opened Jun 1, 2025
Multi-dimensional tensors in datasets might get incorrectly flattened when fetching data from dataloader which is specified 'batch_sampler' when created
#154810 opened Jun 1, 2025
FP8 scaled mm lowering ignores scale_result argument
#154807 opened May 31, 2025
[discussion] Specialized frontend method for computing Gram / covariance matrix
#154791 opened May 31, 2025
Redistribute DTensor across different DeviceMeshes
#154787 opened May 31, 2025
Turn gradient_accumulate into a separate aten op
#154767 opened May 30, 2025
Inductor codegens invalid fp8 elementwise mul op
#154750 opened May 30, 2025
torch.export seems to emit invalid code for Tensor.split when used with meta device
#154721 opened May 30, 2025
DISABLED test_inductor_multiple_specializations_dynamic_shapes_cuda (__main__.DynamicShapesCodegenGPUTests)
#154718 opened May 30, 2025
ONNX Exporter Support for aten::cartesian_prod
#154714 opened May 30, 2025
DISABLED test_inductor_multiple_specializations_dynamic_shapes_cuda (__main__.DynamicShapesGPUTests)
#154710 opened May 30, 2025
torch.quantized_batch_norm API return different output when input is nan/inf or undefined mathematical calculation method
#154708 opened May 30, 2025
DISABLED test_inductor_multiple_specializations_cuda (__main__.GPUTests)
#154705 opened May 30, 2025
[Inductor] Output discrepancy between Inductor and eager of mean with input of a large size tensor
#154703 opened May 30, 2025
[FSDP2] offer public API to share communication context across fsdp roots
#154657 opened May 29, 2025
Recreation of unbacked symint leads to "possible memo disaster" when running decompositions
#154647 opened May 29, 2025
functional all_gather does not work with scalar tensor and fail with torch.compile
#154621 opened May 29, 2025
NotImplementedError: argument of type: <class 'torch._C._TensorMeta'>
#154614 opened May 29, 2025
`torch.jit.script` doesn't accept `axis` as alias for `dim`
#154613 opened May 29, 2025
Internal Assertion Failure with torch.cuda.Stream in @script_method on CPU-only Systems
#154607 opened May 29, 2025
`torch.cumprod` for complex128 produces inconsistent results between CPU and CUDA
#154606 opened May 29, 2025
RuntimeError: devptr INTERNAL ASSERT FAILED at "/build/source/c10/cuda/CUDACachingAllocator.cpp":3696, please report a bug to PyTorch. entry in cache has missing shared_ptr
#154603 opened May 29, 2025
Cudnn attention is very slow when sequence length changed in every step
#154602 opened May 29, 2025
[inductor][cpu]functorch_dp_cifar10 and opacus_cifar10 performance regression in 2025-05-24 nightly release
#154598 opened May 29, 2025
inductor benchmark_compiled_module's random initialization will cause crash for data dependent indexing
#154592 opened May 29, 2025
DISABLED test_profiler_remote_cuda (__main__.TensorPipeCudaRpcTest)
#154587 opened May 29, 2025
DISABLED test_zero_bubble_with_model_kwargs_ScheduleClass1 (__main__.ScheduleTest)
#154579 opened May 29, 2025
Buggy test in test/export/test_export.py
#154574 opened May 28, 2025
DISABLED test_mempool_with_allocator (__main__.TestMemPool)
#154566 opened May 28, 2025
Distributed Breakpoint doesn't exit safely
#154563 opened May 28, 2025
Enable GB200 for dynamo/torchbench tests
#154560 opened May 28, 2025
[cond] support for unbacked symbols in compiled region
#154559 opened May 28, 2025
grouped_mm optional zero initialization of the output
#154557 opened May 28, 2025
DISABLED test_zero_bubble_with_model_kwargs_ScheduleClass0 (__main__.ScheduleTest)
#154547 opened May 28, 2025
[AOTI] view isn't guarded for output of a custom op
#154537 opened May 28, 2025
`typing.get_type_hints` fails on TorchScript model after loading with `torch.jit.load`
#154502 opened May 28, 2025
`torch.add` and `torch.sub` return incorrect results when computing complex numbers containing `inf` on both CPU and GPU
#154501 opened May 28, 2025
torch.fft.irfft(n) doesn’t handle non-Hermitian inputs in a consistent way
#154496 opened May 28, 2025
INTERNAL ASSERTION ERROR using device='mkldnn' despite deprecation
#154491 opened May 28, 2025
[wishlist item] Assume data-dependent info based on tensor construction
#154489 opened May 28, 2025
`__jit_ignored_attributes__` is not respected in `torch.jit.trace_module`
#154478 opened May 28, 2025
FFT regression caused by MKL upgrading: MKL FFT error: Intel oneMKL DFTI ERROR: Inconsistent configuration parameters
#154477 opened May 28, 2025
Multiple torch.fft APIs produce inconsistent results for infinite inputs between CPU and GPU
#154474 opened May 28, 2025
Mega Cache/ Torchcompile cache Not working
#154463 opened May 27, 2025
torch.compile tracing doesn't seem to be using the cache
#154456 opened May 27, 2025
[Dynamo] Confusing re-raise exception handling graph break message
#154454 opened May 27, 2025
16KB pagination support for PyTorch Torch Vision and PyTorch Lite
#154449 opened May 27, 2025
[RFC] Removing ideep git submodule dependency from PyTorch & add oneDNN git submodule
#154444 opened May 27, 2025
Inconsistent sin(x) output between CPU and CUDA for very large arguments
#154428 opened May 27, 2025
`Floating point exception` in `torch.onnx.export` with PixelShuffle
#154425 opened May 27, 2025
`Segmentation fault` in `torch.matmul` and `torch.sparse.addmm`
#154424 opened May 27, 2025
`Segmentation fault` in `torch.jit.ignore`
#154423 opened May 27, 2025
`Segmentation fault` in `torch.fx.experimental.partitioner_utils.map_arg`
#154422 opened May 27, 2025
`torch.fmod` and `torch.remainder` crash in `Inductor`
#154420 opened May 27, 2025
Crash in `torch.sparse.softmax`
#154419 opened May 27, 2025
[JIT] torch.jit.script raises an exception with view(dtype)
#154407 opened May 27, 2025
[C10D] Blackwell B200: NotImplementedError: Could not run '_c10d_functional_autograd::all_to_all_single' with arguments from the 'CUDA' backend.
#154370 opened May 26, 2025
Dynamo cannot trace into wrap_triton
#154365 opened May 26, 2025
`scaled_dot_product_attention` broadcasting (GQA) is a memory footgun
#154363 opened May 26, 2025
Q.size(-1) == m INTERNAL ASSERT FAILED at "/pytorch/aten/src/ATen/native/BatchLinearAlgebra.cpp
#154356 opened May 26, 2025
integer convolution
#154354 opened May 26, 2025
bfloat16 Conv2d slower than float16 on 4090
#154351 opened May 26, 2025
[RFC] : Remove Explicit Backend References from `torch.distributed` (`c10d`)
#154345 opened May 26, 2025
Support jvp for flex attention
#154332 opened May 25, 2025
MPS Memory Leak
#154329 opened May 25, 2025
`torch.distributed.checkpoint.state_dict.get_model_state_dict` does not update the state_dict._metadata key
#154327 opened May 25, 2025
torch.jit.script gives false results for autograd if complex data types are involved
#154324 opened May 25, 2025
when pass a scale generated from a registered_buff of a nn.Module to fake_quantize_per_channel_affine, onnx runtime will raise Exception: "For per axis quantization, scale must be 1D tensor with"
#154323 opened May 25, 2025
float16 → float32 conversion yields unexpected zero matrix for matrices > 43000 × 43000 in MPS
#154322 opened May 25, 2025
Replacement for `export_modules_as_functions`
#154319 opened May 25, 2025
[BUG] DataLoader low GPU utilization and extremely slow compared to manual batching
#154318 opened May 25, 2025
torch.logdet produces incorrect results for singular matrices on CUDA vs CPU
#154312 opened May 24, 2025
torch.inf.to(torch.int32) produces different values on CPU vs CUDA
#154311 opened May 24, 2025
graph recording observed an input tensor deallocate during graph recording that did not occur during replay
#154306 opened May 24, 2025
Inconsistent output of `torch.func.jvp` calculation
#154302 opened May 24, 2025
'max-autotune' much slower than 'default' mode (run fused add_mul_activation kernel)
#154301 opened May 24, 2025
ImportError: libcudnn.so.9: cannot open shared object file: No such file or directory
#154299 opened May 24, 2025
Hangs and timeouts on dist.reduce_scatter on B200 GPU
#154297 opened May 24, 2025
Using `compile` on `hessian(hessian)`
#154284 opened May 23, 2025
LazyGraphModule causes graph breaks
#154282 opened May 23, 2025
[FSDP2] for mixed precision, input casting can get blocked when cuda streams are full
#154272 opened May 23, 2025
torch dynamo fails on GH200 with world size 5
#154266 opened May 23, 2025
max_pool2d padding assert incorrect with dilation
#154262 opened May 23, 2025
Pypi Support for Windows arm64
#154260 opened May 23, 2025
torch.compile regression blocks torchvision/torchbench pin upgrade
#154259 opened May 23, 2025
[RFC] Cuda support matrix for Release 2.8
#154257 opened May 23, 2025
[Testing][CUDA] test_cublas_addmm_reduced_precision_size_10000_backend_cublaslt_cuda_float16 failure with linux-focal-cuda12.6-py3-gcc11-slow-gradcheck
#154248 opened May 23, 2025
`RNNBase` modules break parameter sharing due to `flatten_parameters()`
#154241 opened May 23, 2025
torch._check on 3 unbacked symints aren't resolving ddes
#154240 opened May 23, 2025
MPS backend fails to detect nn.Embedding out-of-range error
#154235 opened May 23, 2025
NotImplementedError when computing JVP of Attention
#154226 opened May 23, 2025
AssertionError: AssertionError not raised ```test_mutable_custom_op_fixed_layout2```
#154219 opened May 23, 2025
AssertionError: Scalars are not equal! ```test_require_stride_expanded_dynamic_shapes_cuda```
#154214 opened May 23, 2025
Triton pin update for PyTorch 2.8 / Triton 3.4
#154206 opened May 23, 2025

864 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Full static typing for `torch.distributions`
#144219 commented on Jun 12, 2025 • 71 new comments
[Set] Support sets in VariableBuilder
#153150 commented on Jun 21, 2025 • 46 new comments
Fused RMSNorm implementation
#153666 commented on Jun 21, 2025 • 29 new comments
[DLPack] Add support for missing keyword-arguments.
#150218 commented on Jun 21, 2025 • 18 new comments
Inductor logging + analysis of torch.profile
#149697 commented on Jun 10, 2025 • 17 new comments
[cp] dispatch flex_attention to CP impl in TorchDispatchMode
#151497 commented on Jun 5, 2025 • 12 new comments
Upgrade to DLPack 1.0.
#145000 commented on Jun 23, 2025 • 10 new comments
[list] Implement list.count
#153969 commented on Jun 21, 2025 • 9 new comments
Adding view and reduction tags
#153342 commented on Jun 4, 2025 • 9 new comments
[Inductor] auto-chunker
#136702 commented on Jun 17, 2025 • 9 new comments
[export] add serialized_artifact test
#152739 commented on Jun 16, 2025 • 8 new comments
autograd: Add VJP and JVP rules for aten::aminmax
#151186 commented on Jun 22, 2025 • 8 new comments
Fix full_like decomposition to preserve strides
#144765 commented on Jun 20, 2025 • 8 new comments
Enable the AMP precision with freezing for CPU nightly test
#152298 commented on Jun 20, 2025 • 8 new comments
[cp] dispatch flex_attention_backward to CP impl in TorchDispatchMode
#152311 commented on Jun 3, 2025 • 7 new comments
Build libgomp (gcc-11) from src on AArch64
#152361 commented on Jun 22, 2025 • 7 new comments
add fp8 scaled_mm for XPU
#140972 commented on Jun 5, 2025 • 6 new comments
Add unified memory APIs for torch.accelerator
#152932 commented on Jun 23, 2025 • 6 new comments
Fix clang-tidy bugprone* warnings
#148529 commented on Jun 23, 2025 • 6 new comments
[BE]: Update cudnn to 9.9 for cu128
#152782 commented on Jun 4, 2025 • 6 new comments
Upgrade MKL in CI
#154198 commented on Jun 13, 2025 • 6 new comments
[FrozenSet] Fixes for FrozenSet
#152991 commented on Jun 21, 2025 • 5 new comments
[dynamic shapes] unbacked safe conv1d
#154089 commented on May 29, 2025 • 4 new comments
docs: fix dead link in torch.compile docs
#152734 commented on Jun 17, 2025 • 4 new comments
Update OpenBLAS commit
#151547 commented on Jun 20, 2025 • 4 new comments
New Sampler: DistributedWeightedRandomSampler
#150182 commented on Jun 21, 2025 • 4 new comments
Deprecated pkg_resources and use distributions instead
#151915 commented on Jun 22, 2025 • 4 new comments
Add CPython string tests
#150793 commented on Jun 17, 2025 • 4 new comments
[HOP, map] Rework of map autograd to the new interface
#153343 commented on Jun 11, 2025 • 4 new comments
Fix DLPack stream logic.
#150217 commented on May 30, 2025 • 3 new comments
ci: Add sccache to manylinux images
#148419 commented on May 31, 2025 • 3 new comments
[MPS] Implement max_pool3d_with_indices
#154145 commented on May 25, 2025 • 3 new comments
[TESTING] Triton pin (Jun 13) 5389ed797016010543ef1c7b88efc50f7521cb4e
#153117 commented on Jun 14, 2025 • 3 new comments
NUMA Binding Integration with torchrun
#149334 commented on Jun 16, 2025 • 3 new comments
removed zero dim cpu logic from fake_tensor.py
#147501 commented on Jun 5, 2025 • 2 new comments
xpu: support custom ops with torch.library on xpu backend
#152879 commented on Jun 10, 2025 • 2 new comments
[inductor] Add typing to _inductor/ir.py
#149958 commented on Jun 19, 2025 • 2 new comments
Fix clang-tidy warnings of performance from uncovered files
#144542 commented on Jun 19, 2025 • 2 new comments
[ONNX] Don't link to third-party protobuf
#153920 commented on Jun 20, 2025 • 2 new comments
Make open device registration tests standalone
#153855 commented on Jun 10, 2025 • 2 new comments
Deprecate DataLoader pin_memory_device param
#146821 commented on Jun 23, 2025 • 2 new comments
[3/n][Optimus][Auto-AC] Support float8_e4m3fn quantization type and set scaling as the default
#153802 commented on Jun 4, 2025 • 2 new comments
Adjust CMake code for Eigen
#148628 commented on Jun 23, 2025 • 2 new comments
[PG/nccl] improvements to eager init
#154132 commented on Jun 22, 2025 • 1 new comment
[cuDNN][SDPA] cuDNN SDPA refactor/cleanup, nested tensor backward, test priority bump for `sm90`, `sm100`
#149282 commented on Jun 17, 2025 • 1 new comment
[jit] DeadCodeEliminator Mark(block) improvement
#152348 commented on Jun 11, 2025 • 1 new comment
Raise `BufferError` for DLPack buffer-related errors.
#150691 commented on Jun 20, 2025 • 1 new comment
[c10d] Use non-poisoning `current_accelerator()`
#154159 commented on May 29, 2025 • 1 new comment
[pytorch_146643] fixed max triton generation
#154056 commented on May 28, 2025 • 1 new comment
[CUDA] Allow cuDNN or flash attn in `test_activation_checkpointing` pattern match check
#153272 commented on Jun 16, 2025 • 1 new comment
[WIP][XPU] Update Triton commit
#153096 commented on Jun 7, 2025 • 1 new comment
[invoke_subgraph] Force the output stride to be same as eager
#152806 commented on Jun 6, 2025 • 1 new comment
Refactor CUDAAllocatorConfig to reuse AllocatorConfig
#150312 commented on Jun 18, 2025 • 1 new comment
[dict] Raise TypeError in dict methods
#154003 commented on Jun 12, 2025 • 1 new comment
Make `Adam`, `AdamW` work with nonzero-dim Tensor betas
#149939 commented on Jun 2, 2025 • 1 new comment
[reland][ROCm] remove caffe2 from hipify
#151845 commented on Jun 19, 2025 • 1 new comment
[nn.utils] scale_grad_ with for_each
#150033 commented on Jun 21, 2025 • 1 new comment
Add inductor backend to device interface; make minifier_tests more device agnostic
#151314 commented on Jun 20, 2025 • 1 new comment
[BE] Replace `std::runtime_error` with `TORCH_CHECK` [2/N]
#152080 commented on Jun 10, 2025 • 1 new comment
[CUTLASS][WIP] Gate rowwise matmul CUTLASS kernels by compute capability
#152642 commented on Jun 10, 2025 • 1 new comment
Random Batch Sampler Speedup
#147706 commented on May 30, 2025 • 1 new comment
[pytorch][triton] Enabling TMA for flex-attention for supported device types
#153662 commented on Jun 12, 2025 • 1 new comment
Updates contextlib with ParamSpec
#153623 commented on Jun 7, 2025 • 1 new comment
ROCm OCP Micro-scaling Format (mx-fp8/mx-fp4) Support
#151360 commented on Jun 23, 2025 • 1 new comment
Enable lazy cloning in `Tensor.to` between CPU and MPS
#150569 commented on Jun 4, 2025 • 1 new comment
[ROCm] update state check for test_trace_while_active*
#153545 commented on Jun 19, 2025 • 1 new comment
Add is_pinned to host allocator
#151439 commented on Jun 11, 2025 • 1 new comment
Work around MPSGraph issue in backward pass of nn.ReplicationPad1d/2d
#152094 commented on Jun 12, 2025 • 1 new comment
[DO NOT MERGE] Enable TMA persistent GEMM Template by default
#149427 commented on May 30, 2025 • 0 new comments
Add x86-simd-sort accelerated sorting
#149362 commented on May 29, 2025 • 0 new comments
Use mypy 1.15
#149426 commented on Jun 20, 2025 • 0 new comments
fix differentiable collectives under inference mode
#149411 commented on Jun 8, 2025 • 0 new comments
Fix B018 Useless Expressions in Multiple Files (#106571)
#149408 commented on Jun 17, 2025 • 0 new comments
Fix `SequentialLR` deprecate warning about invoke `step(epoch)`
#149392 commented on Jun 3, 2025 • 0 new comments
Batch Sampler Speedup
#149441 commented on Jun 12, 2025 • 0 new comments
[ROCm] Enable max_autotune run on inductor perf dashboard
#148672 commented on May 24, 2025 • 0 new comments
Remove shebang line from easy_install generated python scripts on Windows only
#148673 commented on May 27, 2025 • 0 new comments
Set /NODEFAULTLIB:vcomp for MSVC when linking caffe2::mkl with libiomp5md.lib
#148674 commented on Jun 1, 2025 • 0 new comments
Trunk workflow for Windows Arm64
#148753 commented on Jun 17, 2025 • 0 new comments
Re-introduce -Wmaybe-uninitialized
#148760 commented on Jun 10, 2025 • 0 new comments
remove guard_size_oblivious from unbind.
#148815 commented on Jun 21, 2025 • 0 new comments
Implement derivatives for nextafter operation
#148820 commented on May 24, 2025 • 0 new comments
Make `torch._check` support bool tensor as `cond` param
#148871 commented on May 31, 2025 • 0 new comments
[DRAFT] make reshape work for reshaping 1dim unbacked non-contig to anything
#148899 commented on Jun 12, 2025 • 0 new comments
[testing only] Update torch.utils.checkpoint to stash and restore TLS state
#148919 commented on Jun 8, 2025 • 0 new comments
Support int step for nonfused optimizer
#148956 commented on Jun 9, 2025 • 0 new comments
[test] test for keep going
#149003 commented on Jun 13, 2025 • 0 new comments
Fix issue #149006: Added docstring for backward()
#149011 commented on May 31, 2025 • 0 new comments
[FlexAttention] Allow caching of backwards func
#149069 commented on Jun 5, 2025 • 0 new comments
Support return values in generators
#149108 commented on Jun 2, 2025 • 0 new comments
[Intel GPU] Allow XPU backend in Depthwise_conv2d&3d operators
#149114 commented on Jun 21, 2025 • 0 new comments
Update the heuristic for AArch64 bmm/baddbmm
#149122 commented on Jun 12, 2025 • 0 new comments
add keepdim to cosine similarity
#149134 commented on Jun 11, 2025 • 0 new comments
PaddedTensor Init
#149140 commented on Jun 18, 2025 • 0 new comments
[prototype] in memory checkpoint example
#149202 commented on Jun 13, 2025 • 0 new comments
Fix mps scaled dot attention
#149268 commented on Jun 6, 2025 • 0 new comments
Fix unexpected keyword argument 'mode' when calling `CompileCounterWithBackend`
#149271 commented on May 28, 2025 • 0 new comments
Refactoring Distributed test cases to be device agnostic [2/n]
#149317 commented on Jun 6, 2025 • 0 new comments
[WIP] rewrite pad_nd with guard_or_false
#149998 commented on May 25, 2025 • 0 new comments
Add min/max support in export
#150003 commented on Jun 11, 2025 • 0 new comments
[DRAFT] PR to regenerate docker images for nccl release
#150029 commented on May 25, 2025 • 0 new comments
[draft][FSDP2] Reorder FSDP2 pre_forward
#150044 commented on May 27, 2025 • 0 new comments
rework test_mem_get_info for single gpu case
#150065 commented on May 30, 2025 • 0 new comments
[export][schema_upgrader][refactor] create a folder that holds different major version schemas
#150069 commented on May 30, 2025 • 0 new comments
[StaticRuntime] Fuse SigridHash
#150072 commented on May 26, 2025 • 0 new comments
Add `_foreach_fill_` ops
#150092 commented on May 26, 2025 • 0 new comments
Fix `L1Loss`, `MSELoss`, `HuberLoss` missing `weight` param
#150097 commented on Jun 17, 2025 • 0 new comments
[cherry-pick] [Submodule] [cpuinfo] cpuinfo update (#149305)
#150105 commented on May 26, 2025 • 0 new comments
S390x: update more tests
#150116 commented on Jun 5, 2025 • 0 new comments
[issue][do not land] Add unit test to show composability issue between mp_policy and checkpoint()
#150139 commented on May 27, 2025 • 0 new comments
[fix] pin cmake to 3.31.6 in build requirements
#150201 commented on May 30, 2025 • 0 new comments
[DLPack] add NumPy exchange tests.
#150216 commented on May 30, 2025 • 0 new comments
[state dict] add strict check when there are more keys in global than local state
#150239 commented on Jun 14, 2025 • 0 new comments
Fix NVTX functions compatibility with torch.compile(fullgraph=True)
#150240 commented on Jun 1, 2025 • 0 new comments
[decomps] Add decomposition for linalg_vector_norm
#150241 commented on Jun 2, 2025 • 0 new comments
Add a test for checking that the CUDA stubs directory is not in libcaffe2_nvrts.so's RPATH or RUNPATH
#150268 commented on Jun 2, 2025 • 0 new comments
[clang-tidy] Get rid of dangerous clang-tidy option
#150292 commented on May 30, 2025 • 0 new comments
allow collectives to be DCEd during collective optimizations, fix bad partitioner save decision
#150302 commented on Jun 9, 2025 • 0 new comments
test dynamo
#150307 commented on May 30, 2025 • 0 new comments
[FlexAttention] Don't load invalid values from mask mod
#150331 commented on Jun 15, 2025 • 0 new comments
[WIP][draft_export] suppress pending unbacked for divisibility symbol
#151718 commented on Jun 19, 2025 • 0 new comments
[ROCm] support experimental CU carveout
#149466 commented on Jun 21, 2025 • 0 new comments
Refactor `test/test_torch.py` by moving testcase to `test_indexing.py`
#149469 commented on Jun 13, 2025 • 0 new comments
Update gen_data.py
#149573 commented on May 25, 2025 • 0 new comments
Generalize AllocatorConfig to be device-agnostic
#149601 commented on Jun 18, 2025 • 0 new comments
[Inductor] Restrict block analysis to only match integer dims and strides
#149615 commented on Jun 23, 2025 • 0 new comments
DO NOT MERGE: Testing sequential builds for cuda + cpu
#149675 commented on May 25, 2025 • 0 new comments
[TP] add support for fused QKV Sharding
#149701 commented on Jun 3, 2025 • 0 new comments
[fbcode]Removing `@NoIntBaseDeprecated` annotation in `caffe2.thrift` file (#149742)
#149744 commented on May 27, 2025 • 0 new comments
[Profiler] Give non-zero default values to start events
#149757 commented on May 28, 2025 • 0 new comments
[scan] Support None return in combine_fn
#149763 commented on Jun 8, 2025 • 0 new comments
fix inductor logging for torch._scaled_mm
#149769 commented on Jun 14, 2025 • 0 new comments
Fix `torch.cuda.MemPool()` internal assertion failure when changing devices
#149818 commented on May 30, 2025 • 0 new comments
cd: Add script for generating binary build matrix
#149830 commented on Jun 16, 2025 • 0 new comments
[draft] Add support in Flex for non-contiguous NJT
#149892 commented on Jun 18, 2025 • 0 new comments
add a util function _make_all_gather_out_tensor to reduce code duplication
#149912 commented on May 29, 2025 • 0 new comments
support scalar tensor for functional all_gather
#149913 commented on May 29, 2025 • 0 new comments
Replace c10::guts::is_fundamental with std::is_fundamental
#149925 commented on May 30, 2025 • 0 new comments
add weight 2D tensor for xpu
#149927 commented on May 31, 2025 • 0 new comments
[Minimizer] Better debugging message
#149931 commented on May 25, 2025 • 0 new comments
Test whether origin/main CI is broken
#149948 commented on May 25, 2025 • 0 new comments
[TESTING] triton WS version ee6a03d19db0de2148c2604994e0256eeaefc5bc
#149949 commented on May 25, 2025 • 0 new comments
AOTI freezing: fix test issues and enable by default
#149961 commented on Jun 13, 2025 • 0 new comments
[inductor] Fix mm logging for `torch._scaled_mm`
#149967 commented on May 25, 2025 • 0 new comments
[BE][PYFMT] migrate PYFMT for `torch/[p-z]*/` to `ruff format`
#144552 commented on Jun 23, 2025 • 0 new comments
[BE][PYFMT] migrate PYFMT for `test/[a-h]*/` to `ruff format`
#144555 commented on Jun 23, 2025 • 0 new comments
[BE][PYFMT] migrate PYFMT for `test/[i-z]*/` to `ruff format`
#144556 commented on Jun 23, 2025 • 0 new comments
[BE][PYFMT] remove `black`: finish `black -> ruff format` migration
#144557 commented on Jun 23, 2025 • 0 new comments
[Reopen] [Intel GPU] Set higher tolerance for some models only on XPU Device
#144756 commented on Jun 16, 2025 • 0 new comments
Update fbgemm_gpu pin
#144905 commented on May 28, 2025 • 0 new comments
Replacing explicit backend search with api call
#144944 commented on Jun 10, 2025 • 0 new comments
Enable fp16 linear layers in PyTorch via ACL
#144992 commented on Jun 6, 2025 • 0 new comments
[inductor] [bug fix] Fix `conv` on processing uint
#145136 commented on May 30, 2025 • 0 new comments
[dtensor][cp] experiment: call flex_attention on DTensor
#145353 commented on Jun 2, 2025 • 0 new comments
removed check for ConvTranspose3D on MPS
#145366 commented on Jun 5, 2025 • 0 new comments
General Changes for multi accelerators
#145521 commented on Jun 13, 2025 • 0 new comments
NJT support for cat() on the ragged dim
#145778 commented on Jun 18, 2025 • 0 new comments
[AsyncMM] re-enable and adapt to cutlass 3.6.0 (#144011)
#145811 commented on Jun 13, 2025 • 0 new comments
Guard the CPU cpp wrapper tests on having a cpp wrapper
#145847 commented on Jun 2, 2025 • 0 new comments
[WIP] Allow generation of inductor backend specific tests using instantiate_device_type_tests
#145873 commented on Jun 21, 2025 • 0 new comments
Add future lazy clone setting and deprecate `torch.reshape` view
#145911 commented on Jun 6, 2025 • 0 new comments
Add MPS OpInfo db, rework test_mps to use OpInfo
#145955 commented on Jun 16, 2025 • 0 new comments
fix indirect broadcast
#145992 commented on Jun 15, 2025 • 0 new comments
[Win][CD] Install cmake and setuptools from PyPI
#146055 commented on May 27, 2025 • 0 new comments
Use device agnostic APIs for device_count and backend in common_fsdp
#146289 commented on May 27, 2025 • 0 new comments
[Testing] Reduce `test_exp` flakiness
#146436 commented on May 31, 2025 • 0 new comments
[Profiler] Enable CUPTI teardown to reduce profiler overhead
#146604 commented on Jun 6, 2025 • 0 new comments
Fix inductor non-stable argsort/sort test
#146622 commented on Jun 7, 2025 • 0 new comments
inference_mode Tensors do not always need to be guarded on
#148983 commented on Jun 3, 2025 • 0 new comments
[Docker] Create an independent dependencies layer
#138612 commented on Jun 18, 2025 • 0 new comments
Fix `USE_STATIC_MKL` lost functionality
#138996 commented on Jun 19, 2025 • 0 new comments
`has_triton`: Use the device interface for detecting Triton availability
#139171 commented on Jun 16, 2025 • 0 new comments
[Don't Review] Test CI
#139971 commented on Jun 19, 2025 • 0 new comments
Add torch._scaled_mm for CPU
#139975 commented on Jun 9, 2025 • 0 new comments
[Intel GPU] Enable mkldnn::_convolution.pointwise at XPU backend
#140372 commented on May 24, 2025 • 0 new comments
Enable C++ dynamic shape guards by default
#140756 commented on Jun 13, 2025 • 0 new comments
Implement cuda graphs implementation of torch.cond and torch.while_loop
#140979 commented on Jun 6, 2025 • 0 new comments
[ROCm][layer_norm] Use __builtin_amdgcn_rcpf(x) instead of 1.f/x
#141309 commented on May 26, 2025 • 0 new comments
Add AOT inductor support for _scaled_mm for CPU
#141961 commented on Jun 13, 2025 • 0 new comments
Fix undefined behavior
#142121 commented on May 28, 2025 • 0 new comments
[Testing only] Add python cycle detection
#143204 commented on Jun 8, 2025 • 0 new comments
[Draft][WIP] Enable XPU path for FlexAttention
#143553 commented on Jun 10, 2025 • 0 new comments
ci: Add scaffolding for building wheels sequentially
#143672 commented on May 25, 2025 • 0 new comments
Modify the tolerance level in TIMM benchmark for XPU PreCI
#143739 commented on Jun 4, 2025 • 0 new comments
Check F2C BLAS for OpenBLAS and other vendors
#143846 commented on May 24, 2025 • 0 new comments
Using acc_t for log_softmax
#143896 commented on Jun 6, 2025 • 0 new comments
Defaults to C++20 in CMake torch targets
#143959 commented on Jun 15, 2025 • 0 new comments
[ci] Add riscv opt-int build
#143979 commented on Jun 17, 2025 • 0 new comments
Enable several readability checks
#143987 commented on May 29, 2025 • 0 new comments
Fix dangling autogenerated sphinx source code links
#144052 commented on Jun 6, 2025 • 0 new comments
Support Swiglu for Module and functional
#144465 commented on Jun 21, 2025 • 0 new comments
[dynamo, nested graph breaks] add nested graph break tests
#144516 commented on May 29, 2025 • 0 new comments
[DTensor] add aten.as_strided.default op
#147514 commented on May 27, 2025 • 0 new comments
[dtensor][cp] experiment: register flex_attention to a custom fn on DTensor
#147515 commented on May 27, 2025 • 0 new comments
[dtensor][cp] experiment: register flex_attention to a custom fn within a custom dispatch mode
#147516 commented on May 27, 2025 • 0 new comments
[dtensor][cp] experiment: register flex_attention to a custom fn on DTensor within a custom dispatch mode
#147517 commented on May 27, 2025 • 0 new comments
Fix the shape check inside gnll loss
#147522 commented on Jun 2, 2025 • 0 new comments
[dtensor][cp] experiment: try e2e cp flex_attention
#147603 commented on May 27, 2025 • 0 new comments
torch.sort: Optimize memory usage with (dtype_indices: ScalarType, dynamic_indices_dtype: bool) options
#147629 commented on Jun 9, 2025 • 0 new comments
Update triton_heuristics.py
#147690 commented on May 30, 2025 • 0 new comments
[fix]: Offload OpenBLAS gemv calls to dedicated OpenBLAS kernel
#147858 commented on May 31, 2025 • 0 new comments
Increase reference count of state tensor in `THPGenerator_reduce` to avoid premature garbage collection in `multiprocessing` start method `"forkserver"` and `"spawn"`
#147907 commented on Jun 7, 2025 • 0 new comments
Custom ops support arbitrary input types by migrating to python dispatcher
#147927 commented on May 30, 2025 • 0 new comments
[ROCm] Skip gfx12 Row-Wise F8 Tests
#148037 commented on May 28, 2025 • 0 new comments
[pytree] add another simplified pytree module `torch.pytree`
#148180 commented on Jun 18, 2025 • 0 new comments
[BE][PYFMT] migrate PYFMT for `test/inductor/` to `ruff format`
#148186 commented on Jun 23, 2025 • 0 new comments
Move estimate runtime and pick loop order heuristics into choices.py
#148202 commented on May 26, 2025 • 0 new comments
[pytree] simplify public API exposition with `__module__`
#148328 commented on Jun 18, 2025 • 0 new comments
Enable `_lazy_clone` between CPU and MPS
#148408 commented on May 31, 2025 • 0 new comments
Disable flake8 advice C416
#148412 commented on Jun 8, 2025 • 0 new comments
Optimize `torch.distributions` Score function
#148429 commented on May 28, 2025 • 0 new comments
[BE][pytree] rename `NodeDef` member to match the type annotations: `*_fn -> *_func`
#148474 commented on Jun 18, 2025 • 0 new comments
[BE][pytree] rename argument name in register function to match the type annotations: `*_fn -> *_func`
#148484 commented on Jun 18, 2025 • 0 new comments
[triton hash update] update the pinned triton hash
#148492 commented on Jun 23, 2025 • 0 new comments
[BE][pytree] cleanup parameterized pytree tests
#148569 commented on Jun 18, 2025 • 0 new comments
[cuda] Add new faster gammabeta backward kernel
#148605 commented on Jun 2, 2025 • 0 new comments
gloo: fix building system gloo with CUDA/HIP
#146637 commented on Jun 15, 2025 • 0 new comments
Optimize isclose() for CPU and GPU by adding specific implementations
#146656 commented on Jun 17, 2025 • 0 new comments
Optimize LRScheduler docs
#146684 commented on May 27, 2025 • 0 new comments
Enable explicitly vectorized `_weight_int8pack_mm` op for FP16 dtype on x86_64 CPU
#146777 commented on Jun 6, 2025 • 0 new comments
implement Size.__radd__
#146834 commented on Jun 8, 2025 • 0 new comments
[Optimus][Inductor] Add full cat aten pattern
#146874 commented on May 30, 2025 • 0 new comments
Port distributed backend tests to Pytest
#146961 commented on Jun 4, 2025 • 0 new comments
Porting Pytorch to AIX Operating System.
#146983 commented on Jun 19, 2025 • 0 new comments
Use 2022 as default VC_YEAR for windows builds
#147053 commented on Jun 1, 2025 • 0 new comments
OpenReg: Fix releasing tensor issue when using pin_memory
#147066 commented on Jun 13, 2025 • 0 new comments
Fix the Problems About Defining Static Variable in Inline Function
#147095 commented on Jun 23, 2025 • 0 new comments
Add ppc64le wheel build support
#147194 commented on Jun 20, 2025 • 0 new comments
Periodic Activations Module
#147218 commented on May 30, 2025 • 0 new comments
fixed optimizer load_state_dict
#147289 commented on Jun 10, 2025 • 0 new comments
[MPS] Fix metallib embedding in static builds
#147324 commented on Jun 16, 2025 • 0 new comments
[MPS] Fix incorrect size for uint3 arg
#147325 commented on Jun 16, 2025 • 0 new comments
[DO NOT MERGE] Update submodule ideep for ideep matmul changes
#147359 commented on Jun 6, 2025 • 0 new comments
Replace `fw_metadata` info with trace log hint in hint message
#147365 commented on May 26, 2025 • 0 new comments
Add overflow check for large storage_offsets
#147398 commented on May 28, 2025 • 0 new comments
Small scheduler refactor
#147410 commented on Jun 21, 2025 • 0 new comments
To enable NCCL communication to support uint64 tensors
#147424 commented on Jun 10, 2025 • 0 new comments
[ONNX] Migrate onnx ops decomp functions
#147469 commented on Jun 7, 2025 • 0 new comments
[test] compile cmd
#147470 commented on Jun 18, 2025 • 0 new comments
Optimize `dynamo` typing
#147499 commented on May 28, 2025 • 0 new comments
Expand cache logging
#152026 commented on Jun 11, 2025 • 0 new comments
Test
#152055 commented on Jun 19, 2025 • 0 new comments
Cause `ceil_div` to accept values of differing types and upcast to the larger type
#152074 commented on Jun 23, 2025 • 0 new comments
Switch to standard pep517 sdist generation
#152098 commented on Jun 20, 2025 • 0 new comments
[inductor] propagate shapes in CSEVariable
#152198 commented on Jun 14, 2025 • 0 new comments
Improve error handling in CachingAutotuner for argument mismatches
#152215 commented on Jun 4, 2025 • 0 new comments
Add `padding="same"` for transposed convolution
#152228 commented on May 23, 2025 • 0 new comments
Updates to build on Noble (Ubuntu24.04) and py3.12
#152240 commented on Jun 5, 2025 • 0 new comments
complex.pow(2) on GPU by replacing with complex * complex to avoid numerical instability
#152373 commented on Jun 22, 2025 • 0 new comments
Relax tolerance for test_quick_baddbmm_cpu_complex64
#152424 commented on Jun 3, 2025 • 0 new comments
[2/N] Deprecate c10::string_view and c10::string
#152509 commented on Jun 4, 2025 • 0 new comments
ci: Switch benchmark dependency to use pip
#152545 commented on May 31, 2025 • 0 new comments
Implemented `Size.__radd__`
#152554 commented on Jun 23, 2025 • 0 new comments
[pytree] make `tree_*` functions accept both Python and C++ `PyTreeSpec`
#152624 commented on Jun 18, 2025 • 0 new comments
[ROCm] Initial AITER Integration for mha_bwd asm kernels
#152630 commented on Jun 17, 2025 • 0 new comments
[do-not-land][ca] default on for CI
#152646 commented on Jun 6, 2025 • 0 new comments
[Inductor] Pattern matcher support for mutable ops with non-view inputs
#152775 commented on May 30, 2025 • 0 new comments
[Don't merge] Debug
#152940 commented on May 30, 2025 • 0 new comments
[dtensor] add privateuse1 SDPA op support to DTensor
#152949 commented on Jun 11, 2025 • 0 new comments
[ROCm] Ck gemm architecture guard
#152951 commented on Jun 17, 2025 • 0 new comments
🌠 Add Muon optimizer
#153048 commented on Jun 2, 2025 • 0 new comments
Bump triton pin and update setup.py path
#153165 commented on Jun 4, 2025 • 0 new comments
Don't print hinted expression if statically known.
#153173 commented on Jun 14, 2025 • 0 new comments
Tensor .cuda() very slow with specific array sizes
#153176 commented on May 28, 2025 • 0 new comments
Update __init__.py
#151751 commented on Jun 22, 2025 • 0 new comments
Refactor duplicate code into a utility function in pytorch/torch/nn/functional.py
#151752 commented on Jun 20, 2025 • 0 new comments
torch.testing._internal.optests - MPS Support
#151758 commented on Jun 21, 2025 • 0 new comments
[MPS] Implement upsample_nearest3d_vec operator
#151760 commented on Jun 22, 2025 • 0 new comments
enable windows inductor UT in CI
#151777 commented on Jun 21, 2025 • 0 new comments
Normalize dynamic size symbols in template codegen cache key.
#151778 commented on Jun 4, 2025 • 0 new comments
Horizontal
#151780 commented on Jun 13, 2025 • 0 new comments
Deduplicate library deletion
#151795 commented on Jun 20, 2025 • 0 new comments
Add assert_on_assumption on to guard_or_true, and guard_or_false
#151854 commented on Jun 21, 2025 • 0 new comments
[draft export] normalize sympy expressions for data-dependent counting
#151856 commented on Jun 21, 2025 • 0 new comments
Add BufferDict works like ParameterDict
#151870 commented on Jun 21, 2025 • 0 new comments
Add `LinearLR` compute lr formula in doc
#151894 commented on May 26, 2025 • 0 new comments
[cp] dispatch flex_attention on DTensor to cp implementation
#151900 commented on Jun 23, 2025 • 0 new comments
[WIP] Deprecate AcceleratorHooksInterface isPinnedPtr, use at::getHostAllocator()->is_pinned instead
#151916 commented on Jun 11, 2025 • 0 new comments
[WIP]: track remaining runtime time asserts for backward codegen instead of trying to regenerate all
#151919 commented on Jun 4, 2025 • 0 new comments
[Observability][Optimus] Fix the tlparse name
#151935 commented on Jun 22, 2025 • 0 new comments
[WIP][dynamic shapes] whitelist at dim-level
#151941 commented on Jun 22, 2025 • 0 new comments
[profiler] use inspect.getattr_static to avoid importing inductor
#151946 commented on Jun 22, 2025 • 0 new comments
add tlparse logs
#151948 commented on Jun 22, 2025 • 0 new comments
[inductor][profiler] lazily import things in standalone_compile
#151956 commented on Jun 22, 2025 • 0 new comments
Make `aten.embedding` do not wrap negative index
#151967 commented on May 28, 2025 • 0 new comments
[WIP][recompiles] verbose logging for tensor guard checks
#151971 commented on Jun 23, 2025 • 0 new comments
[Don't merge] Upgrade oneDNN to v3.8-rc for XPU build
#152001 commented on Jun 23, 2025 • 0 new comments
[WIP] fix reinplacing bug
#152011 commented on Jun 22, 2025 • 0 new comments
[RPC] fix deserialize doesn't respect user pickler
#153821 commented on Jun 5, 2025 • 0 new comments
[c10d] make register_backend device handling more robust
#153824 commented on Jun 4, 2025 • 0 new comments
[PP] Fix double backward error in stage_backward
#153893 commented on May 28, 2025 • 0 new comments
[CPU Generator] Remove the unused CPUGeneratorImplStateLegacy in set_state
#153934 commented on Jun 12, 2025 • 0 new comments
Fix #153942
#153943 commented on May 28, 2025 • 0 new comments
[Dynamo] Fixes for exceptions
#153966 commented on May 28, 2025 • 0 new comments
[draft][do not review] H-FSDP prototype
#154000 commented on Jun 18, 2025 • 0 new comments
Add MPS implementation of CTC Loss based on CUDA version
#154044 commented on May 27, 2025 • 0 new comments
Add pyrefly lint adaptor
#154059 commented on May 26, 2025 • 0 new comments
[Dynamo] [Set] Implement some binop operators for dict/set/frozenset/dict_keys
#154063 commented on Jun 21, 2025 • 0 new comments
[Dynamo] [Set] Raise TypeError if object is unhashable
#154064 commented on Jun 21, 2025 • 0 new comments
[Dynamo] [Set] Raise TypeError in set.union(...) and "__or__"
#154065 commented on Jun 21, 2025 • 0 new comments
[Dynamo] [Set] Add comparison for set subclass
#154066 commented on Jun 21, 2025 • 0 new comments
force computation in opmath_t for CUDA fused optimizers
#154069 commented on Jun 2, 2025 • 0 new comments
[dynamo] remove recursive cell/freevar in instruction tx
#154078 commented on May 29, 2025 • 0 new comments
dynamo_time: Always log to pt2_compile_events
#154081 commented on May 24, 2025 • 0 new comments
Update the UT of test_decompose_mm_cpu
#154100 commented on May 28, 2025 • 0 new comments
[BE]: Update pybind11 submodule to 3.0.0rc
#154115 commented on Jun 19, 2025 • 0 new comments
[do not merge] Test out long filename for libtorch_agnostic c++ extension test in Windows
#154139 commented on May 28, 2025 • 0 new comments
Add basic xor_sum op
#154149 commented on May 28, 2025 • 0 new comments
[cuBLASLt][cuBLAS] Support 2D bias and `beta != 1.0` in cuBLASLt
#154170 commented on Jun 17, 2025 • 0 new comments
[PG/nccl] Simplify uniqueHash management
#154185 commented on May 26, 2025 • 0 new comments
[cond] support gen_schema for cond
#154193 commented on Jun 17, 2025 • 0 new comments
implement MKLGenerator
#154199 commented on Jun 18, 2025 • 0 new comments
Adding XPU support to DTensor examples
#153213 commented on May 25, 2025 • 0 new comments
Fix integer overflow bug in triu/tril for large diagonal values
#153240 commented on May 26, 2025 • 0 new comments
fix dtensor and tensor inconsistent compute mesh
#153268 commented on Jun 2, 2025 • 0 new comments
Fix lcm_ crash with int16 scalar and large int32 tensor
#153314 commented on May 23, 2025 • 0 new comments
CMake: update FindCUDAToolkit.cmake, use torch::nvtx3 if present, mod…
#153339 commented on Jun 15, 2025 • 0 new comments
[ATen][CUDA][CUB] Implement changes to CCCL (CUB/Thrust/LibCUDACXX) usage in ATen
#153373 commented on Jun 9, 2025 • 0 new comments
[AOTI Debugging] Add Environment Variable to control output path
#153391 commented on Jun 11, 2025 • 0 new comments
[BE]: Enable RUFF TRY400 rule - log.exception
#153473 commented on Jun 2, 2025 • 0 new comments
Clean PR: Replace _device_t with torch.types.Device and fix lint issues (#152952)
#153493 commented on Jun 4, 2025 • 0 new comments
[AUTOCAST] FEAT: Allow passing a `torch.device` object to autocast
#153539 commented on Jun 10, 2025 • 0 new comments
[Ez][BE]: Remove accidental classvar
#153540 commented on May 28, 2025 • 0 new comments
[BE]: Update CUTLASS submodule to 4.0.0rc
#153541 commented on Jun 13, 2025 • 0 new comments
[Dynamo] [SetSubclass] Add support for user defined sets
#153553 commented on Jun 21, 2025 • 0 new comments
[PP] wip, allow grad to be None
#153557 commented on May 28, 2025 • 0 new comments
[caffe2] Allow the elimination of implicit calls to strlen when using the RECORD_FUNCTION macros
#153567 commented on Jun 10, 2025 • 0 new comments
support scaled mm on inductor
#153602 commented on May 26, 2025 • 0 new comments
[BE] Use latest mkl-include and mkl-devel on Windows CI
#153684 commented on Jun 15, 2025 • 0 new comments
Use magma 2.9.0
#153703 commented on Jun 2, 2025 • 0 new comments
[partitioner] Fix _broadcast_on_rank0 to use deterministic hash function
#153734 commented on Jun 5, 2025 • 0 new comments
[not for land] small compile-on-one-rank example
#153743 commented on May 24, 2025 • 0 new comments
Add TORCH_CHECK for group < channels for native_channel_shuffle
#153781 commented on Jun 6, 2025 • 0 new comments
Fix `LLONG_MIN` errors in `torch.jit.script`
#153793 commented on Jun 6, 2025 • 0 new comments
Ignore url lint in install_xpu.sh
#153796 commented on Jun 16, 2025 • 0 new comments
Build clang20 image for ASAN tests
#153806 commented on Jun 22, 2025 • 0 new comments
Copy native runtime code to OSS.
#150338 commented on Jun 15, 2025 • 0 new comments
[torchrun] Fix: Use Correctly Reachable Host Address in c10d Rendezvous
#150533 commented on Jun 6, 2025 • 0 new comments
[export] Refactor strict to pass fake tensors to dynamo
#150546 commented on Jun 1, 2025 • 0 new comments
Fix link formatting in cpp_extension.py
#150552 commented on Jun 14, 2025 • 0 new comments
Initial Implementation of Padded Tensor
#150567 commented on Jun 18, 2025 • 0 new comments
[cuda] Added CUDA kernels for RMSNorm
#150576 commented on Jun 10, 2025 • 0 new comments
[test] DTensor moe compile fixes for dynamic shapes
#150582 commented on Jun 6, 2025 • 0 new comments
fix dynamic shapes for kwargs
#150583 commented on Jun 19, 2025 • 0 new comments
[WIP] try always splitting in reshape view
#150584 commented on Jun 2, 2025 • 0 new comments
[CUDA] include nvtx3 header in wheel so downstream torch extension can find it
#150591 commented on Jun 4, 2025 • 0 new comments
suppress neon missing message on armv8 build
#150595 commented on Jun 4, 2025 • 0 new comments
Make `nn.MultiLabelMarginLoss` error message user friendly
#150606 commented on Jun 3, 2025 • 0 new comments
Make error message descriptive
#150627 commented on Jun 2, 2025 • 0 new comments
tutorial example for cp
#150641 commented on Jun 3, 2025 • 0 new comments
Split up cub-RadixSortPairs-scalars.cu to parallelize compilation
#150678 commented on Jun 3, 2025 • 0 new comments
Revert "[ATen][CUDA] Implement 128 bit vectorization v2 (#145746)"
#150679 commented on Jun 3, 2025 • 0 new comments
cd: Introduce new binary build workflows (cpu)
#150713 commented on Jun 16, 2025 • 0 new comments
[wip] support tracing async collectives
#150720 commented on Jun 6, 2025 • 0 new comments
Avoid overwriting COW data in MPS code
#150721 commented on Jun 4, 2025 • 0 new comments
[Inductor] Set the default value of min_chunk_size to 512
#150762 commented on Jun 19, 2025 • 0 new comments
Add CPython exception tests
#150789 commented on Jun 3, 2025 • 0 new comments
Add CPython generator/contextlib tests
#150796 commented on Jun 5, 2025 • 0 new comments
Pin all root requirements to major versions
#150833 commented on Jun 18, 2025 • 0 new comments
Fix the Problems About Defining Static Variable in Inline Function
#150841 commented on Jun 7, 2025 • 0 new comments
not-for-landing add logs for debugging chunk metadata
#150886 commented on Jun 8, 2025 • 0 new comments
[DO NOT REVIEW] Update _fsdp_param_group.py
#150349 commented on May 31, 2025 • 0 new comments
test enummeta
#150351 commented on May 31, 2025 • 0 new comments
Memory leak base tests for compile
#150353 commented on May 31, 2025 • 0 new comments
support nested compile when inner compile is inside of __torch_dispatch__
#150355 commented on Jun 6, 2025 • 0 new comments
Build MacOS CI with MKLDNN
#150365 commented on May 31, 2025 • 0 new comments
bound sympy accuracy
#150383 commented on Jun 3, 2025 • 0 new comments
Add `mse_loss_backward_out` type promotion
#150384 commented on Jun 16, 2025 • 0 new comments
Test layout_opt_default set to 0
#150411 commented on Jun 3, 2025 • 0 new comments
[Inductor] Fix scaled_mm template migration missing endif block
#150415 commented on May 31, 2025 • 0 new comments
Test self hosted GPU runner
#150422 commented on Jun 3, 2025 • 0 new comments
[dynamo] Lazily import fsdp-related modules
#150429 commented on Jun 3, 2025 • 0 new comments
[WIP][dynamic shapes] guard_or_false rewrite for fake_impls.py:infer_size, compute_contiguous
#150431 commented on Jun 1, 2025 • 0 new comments
Faster way to test self hosted GPU runner
#150434 commented on Jun 4, 2025 • 0 new comments
caffe2: Fix lint errors in native/CPUFallback.cpp
#150443 commented on Jun 1, 2025 • 0 new comments
caffe2: Fix lint errors in FlashAttentionKernel
#150445 commented on Jun 2, 2025 • 0 new comments
[dynamic shapes] oblivious rewrite for meta_select
#150455 commented on Jun 2, 2025 • 0 new comments
[dynamic shapes] guard_or_false rewrite for scatter, gather, index metas
#150481 commented on Jun 1, 2025 • 0 new comments
caffe2: Fix lint errors in runtime/register_prim_ops.cpp
#150501 commented on Jun 1, 2025 • 0 new comments
caffe2: Fix lint errors in native/int4mm_kernel
#150503 commented on Jun 1, 2025 • 0 new comments
caffe2: Fix lint errors in native/quantized/TensorAdvancedIndexing
#150504 commented on Jun 1, 2025 • 0 new comments
caffe2: Fix lint errors in native/RNN.cpp
#150505 commented on Jun 1, 2025 • 0 new comments
caffe2: Fix lint errors in native/TensorAdvancedIndexing.cpp
#150506 commented on Jun 1, 2025 • 0 new comments
caffe2: Fix lint errors in native/TensorShape.cpp
#150507 commented on Jun 1, 2025 • 0 new comments
Fix CPU bitwise shifts for out-of-limit values in VSX-vec
#150524 commented on Jun 2, 2025 • 0 new comments
[WIP][dynamic shapes] lru cache bound_sympy
#151271 commented on Jun 16, 2025 • 0 new comments
[WIP] Generalize device caching allocator
#151298 commented on Jun 23, 2025 • 0 new comments
Update docker image names for s390x release
#151429 commented on Jun 16, 2025 • 0 new comments
Implement fast exp for AVX2 and AVX512 for the flash attention
#151441 commented on Jun 16, 2025 • 0 new comments
Allow to byteswap data when reading saved torch jit data
#151447 commented on Jun 3, 2025 • 0 new comments
update fx.Interpreter error logging to check if submodules are GraphModules
#151451 commented on Jun 17, 2025 • 0 new comments
Add default value for `serialization_format` in `_write_item` function for better compatibility
#151452 commented on Jun 16, 2025 • 0 new comments
[ROCm] Initial plumbing for CK Gemm Perf Improvement
#151465 commented on Jun 19, 2025 • 0 new comments
inductor.config.descriptive_names = False is not actually supported (#145523) (#146051)
#151481 commented on Jun 18, 2025 • 0 new comments
[dtensor][view_op] add as_strided op support to DTensor in FakeTensorMode
#151495 commented on Jun 23, 2025 • 0 new comments
Use device agnostic APIs and variable names for dtensor
#151527 commented on Jun 22, 2025 • 0 new comments
[WIP] Deprecate getPinnedMemoryAllocator use getHostAllocator instead
#151531 commented on Jun 11, 2025 • 0 new comments
Fix normalize mypy warning with tuple dim
#151553 commented on Jun 18, 2025 • 0 new comments
[autodeps2] Replace third-party/pyqt5 with third-party/pypi/pyqt5
#151557 commented on Jun 16, 2025 • 0 new comments
[bazel] Fix aten generator directory path
#151580 commented on Jun 19, 2025 • 0 new comments
[DRAFT] fix issues related to deferred assertion on unabcked floats
#151604 commented on Jun 17, 2025 • 0 new comments
added six and pyyaml to requirements.txt to fix missing module error …
#151605 commented on Jun 17, 2025 • 0 new comments
distributed: add distributed P2P TensorQueue and TensorStore
#151631 commented on Jun 17, 2025 • 0 new comments
[demo] Verify test runner integration
#151645 commented on Jun 21, 2025 • 0 new comments
Update link to NVIDIA cuDNN Support Matrix
#151647 commented on Jun 19, 2025 • 0 new comments
Add a custom profiler configuration option
#151656 commented on Jun 23, 2025 • 0 new comments
[aot] Set config partitioner recompute_views True by default
#151676 commented on Jun 17, 2025 • 0 new comments
[test] log
#151700 commented on Jun 6, 2025 • 0 new comments
Remove unnecessary recompile
#151711 commented on Jun 18, 2025 • 0 new comments
Fix StrictMinMaxConstraint issue
#150924 commented on Jun 9, 2025 • 0 new comments
Introduce test skip markers for Sandcastle
#150934 commented on Jun 2, 2025 • 0 new comments
update benchamark result due to <1% regression
#150937 commented on Jun 10, 2025 • 0 new comments
Turn optree warning into error
#150938 commented on Jun 8, 2025 • 0 new comments
all_reduce autograd
#150942 commented on Jun 22, 2025 • 0 new comments
Add complex logaddexp
#150946 commented on Jun 15, 2025 • 0 new comments
Add complex logaddexp2
#150947 commented on Jun 14, 2025 • 0 new comments
[ONNX] Migrate DORT to use the new exporter
#150950 commented on Jun 8, 2025 • 0 new comments
[dynamo][fsdp] Do not consider fsdp modules as specialized
#150954 commented on Jun 9, 2025 • 0 new comments
Add additional MacOS test runners for MPS
#150964 commented on Jun 15, 2025 • 0 new comments
move set_rotate_method to public namespace
#150968 commented on Jun 13, 2025 • 0 new comments
Add `pad_to_multiple_of` to `pad_sequence`
#150990 commented on Jun 10, 2025 • 0 new comments
Fix index broadcast
#151009 commented on Jun 9, 2025 • 0 new comments
Add `pad_to_multiple_of` to `pad_sequence` (C++ only)
#151021 commented on Jun 11, 2025 • 0 new comments
Tune linalg_eigh_cusolver: better heuristic for syevj_batched selection on cuda
#151118 commented on Jun 11, 2025 • 0 new comments
Reland prologue transposed changes
#151120 commented on Jun 11, 2025 • 0 new comments
[hop] Make base_hop share utils with control flow ops in backward
#151146 commented on Jun 6, 2025 • 0 new comments
Fix TypeIndex.h signature extraction
#151150 commented on Jun 12, 2025 • 0 new comments
Fix DWConv in QNNPACK for aarch32
#151191 commented on Jun 3, 2025 • 0 new comments
[ZCH vNext] Bucket offsets and sizes in torchrec shard metadata for bucket wise sharding
#151192 commented on Jun 14, 2025 • 0 new comments
Fix `MaskedTensor` to device ignored mask
#151205 commented on Jun 3, 2025 • 0 new comments
Implement MKLGenerator
#151218 commented on Jun 13, 2025 • 0 new comments
update visualizer with compare two schedules method
#151249 commented on Jun 13, 2025 • 0 new comments
[AMD][FA] Block mem efficient attention if backward head_dim > 128 in CK backend
#151258 commented on Jun 16, 2025 • 0 new comments
Per channel weight observer for ConvTranspose
#54816 commented on Jun 3, 2025 • 0 new comments
[Inductor][Schedule][Fusion] Ops are not fused due to reduction unroll
#153346 commented on Jun 3, 2025 • 0 new comments
test_gradient_all Device Type test regression with Numpy >= 2.0.0
#132450 commented on Jun 4, 2025 • 0 new comments
Representation string of a meta tensor is not a valid `tensor` call
#147643 commented on Jun 4, 2025 • 0 new comments
AttributeError: Can't pickle local object 'make_opaque_bitwise_fn.<locals>.BitwiseFn'
#147841 commented on Jun 4, 2025 • 0 new comments
Return type annotation of `Tensor.long()` etc is not narrowed down to dtype-specific names `LongTensor` etc
#148552 commented on Jun 4, 2025 • 0 new comments
torch.compile of simple loop takes 34 seconds
#111441 commented on Jun 4, 2025 • 0 new comments
torch.jit.script persistently changes default from utf-8 to ascii
#111480 commented on Jun 4, 2025 • 0 new comments
Triangular matrix storage + matmul
#122454 commented on Jun 4, 2025 • 0 new comments
Major perf regression with `BatchNorm2d` + `torch.compile` with `reduce-overhead` + DDP
#139207 commented on Jun 4, 2025 • 0 new comments
DISABLED test_byte_tensor_assignment (__main__.TestAdvancedIndexing)
#137028 commented on Jun 4, 2025 • 0 new comments
DISABLED test_reduce_stress_cuda (__main__.ProcessGroupGlooTest)
#152367 commented on Jun 4, 2025 • 0 new comments
DISABLED test_reduce_stress_cuda (__main__.ProcessGroupGlooLazyInitTest)
#152201 commented on Jun 4, 2025 • 0 new comments
Inconsistent behavior when indexing a Tensor with a list of lists
#119548 commented on Jun 4, 2025 • 0 new comments
non-strict export should detect fake tensor leakage
#153062 commented on Jun 4, 2025 • 0 new comments
Support loading and executing a ExportedProgram from torch.export in C++ environment
#144663 commented on Jun 4, 2025 • 0 new comments
Torch profiler corrupted names with Python 3.11
#121219 commented on Jun 5, 2025 • 0 new comments
Support for `uint16`, `uint32`, and `uint64`
#58734 commented on Jun 5, 2025 • 0 new comments
`torch==2.6` broke `nn.Module.dtype` typing
#152292 commented on Jun 5, 2025 • 0 new comments
Stack trace from pytest is very far away and far too find on some tests
#141204 commented on Jun 5, 2025 • 0 new comments
Obscure error: Expected a value of type 'List[int]' for argument 'sizes' but instead found type 'immutable_list'
#122129 commented on Jun 5, 2025 • 0 new comments
[Feature request] Exclusive prefix sum, `torch.cumsum(input, dim=0, exclusive=True)`
#76191 commented on Jun 5, 2025 • 0 new comments
Fx Graph cache hit generates guards that does not exists in the original cached program causing recompilations only at cache hit.
#152435 commented on Jun 5, 2025 • 0 new comments
Numerical inaccuracies in "ddp_apply_optim_in_backward" unit tests for gloo backend
#111834 commented on Jun 5, 2025 • 0 new comments
redundant recompilation caused by duplicated Sym()
#144068 commented on Jun 5, 2025 • 0 new comments
RuntimeError: Expression of type - cannot be used in a type expression: __torch__.transformers_modules.code-5p-110m-embedding.modeling_codet5p_embedding.___torch_mangle_1368.CodeT5pEmbeddingModel ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
#137252 commented on Jun 5, 2025 • 0 new comments
MPS backend appears to be limited to 32 bits
#84520 commented on Jun 2, 2025 • 0 new comments
Floating point exception when autocast is enabled
#154014 commented on Jun 2, 2025 • 0 new comments
[RFC] [Feature] Intra-Device Heterogeneous Memory Allocation Support
#153745 commented on Jun 2, 2025 • 0 new comments
[inline_inbuilt_nn_modules] Move export to inline_inbuilt_nn_modules
#147030 commented on Jun 2, 2025 • 0 new comments
torch.cuda.memory_reserved always returns 0 bytes
#103243 commented on Jun 2, 2025 • 0 new comments
[RFC] dropping CUDA 11.8 support in CI/CD
#147383 commented on Jun 2, 2025 • 0 new comments
[dynamo] Graph breaks from copy.deepcopy
#115122 commented on Jun 2, 2025 • 0 new comments
`copy_()` fails with HSDP in FSDP2
#147568 commented on Jun 2, 2025 • 0 new comments
[ONNX] Create unit tests for the new export path by adapting all existing tests
#129279 commented on Jun 2, 2025 • 0 new comments
SequentialLR does not work correctly with multiple ConstantLR
#82684 commented on Jun 3, 2025 • 0 new comments
Inconsistent export behavior for nonzero+grid_sample between CUDA and CPU/MPS backends
#152791 commented on Jun 3, 2025 • 0 new comments
pin_memory crashes for big tensors and leaks page locked memory
#152335 commented on Jun 3, 2025 • 0 new comments
Triton Compilation Error in Generated Code due to possible float division in index
#153375 commented on Jun 3, 2025 • 0 new comments
[feature request] Discover actually loaded shared libraries at runtime
#82098 commented on Jun 3, 2025 • 0 new comments
[RFC] Supporting Eager Mode via torch.compile
#115545 commented on Jun 3, 2025 • 0 new comments
Implement einsum backprop rather than decomposing
#149133 commented on Jun 3, 2025 • 0 new comments
NCCL out of memory error after updating to PyTorch 2.7
#152302 commented on Jun 3, 2025 • 0 new comments
matmul uses excessive memory in batch cases with more than 3 dimensions
#154128 commented on Jun 3, 2025 • 0 new comments
add `FlopCounterMode` documentation
#123800 commented on Jun 3, 2025 • 0 new comments
```grad_mode.py``` torch imported as NoneType, cannot import ```torch._jit_internal```
#154114 commented on Jun 3, 2025 • 0 new comments
[Feature Request] Memory optimization for backward propagation in GPU
#150698 commented on Jun 3, 2025 • 0 new comments
[triton pin update] Run Inductor CI on pin updates for Triton and the PyTorch nightly branch
#152608 commented on Jun 3, 2025 • 0 new comments
Add extensions `flash_attention` and `vllm` as test of new PyTorch releases for known issues of compat of their binaries and of possibility of compiling these from source
#155066 commented on Jun 3, 2025 • 0 new comments
Auto format lint not making suggestions
#153273 commented on Jun 3, 2025 • 0 new comments
UNSTABLE Build manywheel docker images for s390x / build-docker-cpu-s390x
#154074 commented on Jun 3, 2025 • 0 new comments
Unable to build with ATEN_THREADING=TBB option
#144767 commented on Jun 3, 2025 • 0 new comments
LibTorch build error on Windows for CUDA version (debug/release)
#139108 commented on Jun 3, 2025 • 0 new comments
[ONNX] Support while HOP
#146674 commented on Jun 7, 2025 • 0 new comments
[ONNX] Implement aten.stft
#147052 commented on Jun 7, 2025 • 0 new comments
[Dynamo][Custombackend]: rms_norm find inplace op when using aot_export_joint_simple
#154195 commented on Jun 7, 2025 • 0 new comments
Intel MKL DFTI ERROR
#120986 commented on Jun 8, 2025 • 0 new comments
MPS incompatibility: Calls into the C++ engine to run the backward pass
#143123 commented on Jun 8, 2025 • 0 new comments
torch.compile does not work with Flash attention 3
#144540 commented on Jun 8, 2025 • 0 new comments
Please consider adding MIG (MI-rror with G-radient modification) to torch.nn
#122680 commented on Jun 8, 2025 • 0 new comments
[feature request] `torch.scan` (also port `lax.fori_loop` / `lax.while_loop` / `lax.associative_scan` and hopefully parallelized associative scans)
#50688 commented on Jun 8, 2025 • 0 new comments
torch._higher_order_ops.scan incorrect/mismatched gradients for non-trailing layers with torch.compile
#153679 commented on Jun 8, 2025 • 0 new comments
aten::nonzero calls taking a huge amount of time when using MPS backend vs CPU
#124850 commented on Jun 8, 2025 • 0 new comments
profiler.export_stacks doesn't return stack trace unless experimental_config is provided
#100253 commented on Jun 8, 2025 • 0 new comments
Python 3.12 "from functorch.einops import rearrange" fails with "RuntimeError: First class dim doesn't work with python 3.12"
#142032 commented on Jun 9, 2025 • 0 new comments
Inductor aten.clone lowering ignores Conjugate and Negative dispatch keys
#145093 commented on Jun 9, 2025 • 0 new comments
`RuntimeError: UR error` with XPU
#149953 commented on Jun 9, 2025 • 0 new comments
`torch.set_default_tensor_type() is deprecated as of PyTorch 2.1` appearing in logs even when not using this function
#120584 commented on Jun 9, 2025 • 0 new comments
Mixed precision causes NaN loss
#40497 commented on Jun 9, 2025 • 0 new comments
DDP doesn't work with retain_graph = True
#47260 commented on Jun 9, 2025 • 0 new comments
[DDP] doesn't support multiple backwards when static_graph=True
#80832 commented on Jun 9, 2025 • 0 new comments
[ONNX] pad_sequence() is not exportable, with neither legacy onnx.export nor with dynamo_export
#127153 commented on Jun 9, 2025 • 0 new comments
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int64 (__main__.TestForeachCUDA)
#149735 commented on Jun 9, 2025 • 0 new comments
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int32 (__main__.TestForeachCUDA)
#149628 commented on Jun 9, 2025 • 0 new comments
Docs Update `wrap_triton`
#152870 commented on Jun 9, 2025 • 0 new comments
Make implicit packages (PEP420) explicit PyTorch
#153546 commented on Jun 9, 2025 • 0 new comments
Torchrun does not handle worker failure gracefully
#146371 commented on Jun 9, 2025 • 0 new comments
Triton has removed the experimental descriptor API
#154162 commented on Jun 9, 2025 • 0 new comments
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_float16 (__main__.TestForeachCUDA)
#153250 commented on Jun 10, 2025 • 0 new comments
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_complex64 (__main__.TestForeachCUDA)
#151313 commented on Jun 10, 2025 • 0 new comments
Error : torch/utils/_sympy/interp.py:176] [0/2] failed while executing pow_by_natural([VR1, int_oo], VR[-1, -1]])
#148003 commented on Jun 5, 2025 • 0 new comments
GroupNorm compilation errors on UNet-based architecture on torch >= 2.6.0
#152185 commented on Jun 5, 2025 • 0 new comments
fbgemm packages are compiled in torchinductor torchbench tests
#152024 commented on Jun 5, 2025 • 0 new comments
"RuntimeError: CUDA error: operation not supported" fixed by downgrading toolkit version
#135126 commented on Jun 5, 2025 • 0 new comments
Unexpected float32 overflow for amp training with torch.compile
#153044 commented on Jun 5, 2025 • 0 new comments
Dump bytecode of resumption frames in tlparse
#136038 commented on Jun 5, 2025 • 0 new comments
PyTorch Docathon H1 2025
#153952 commented on Jun 5, 2025 • 0 new comments
torch.compile() within TorchDispatchMode always causes an unknown guard failure.
#144787 commented on Jun 5, 2025 • 0 new comments
Enable 12.8.1
#152922 commented on Jun 6, 2025 • 0 new comments
unbacked inputs not being preserved in backwards graph
#153778 commented on Jun 6, 2025 • 0 new comments
Compile + torch.autograd.grad returns no gradients
#132929 commented on Jun 6, 2025 • 0 new comments
PGO errors out in dynamo main path for sparse tensors
#154161 commented on Jun 6, 2025 • 0 new comments
Allow creation of pseudo devices for testing purposes
#61654 commented on Jun 6, 2025 • 0 new comments
RuntimeError: Shared memory manager connection has timed out
#129656 commented on Jun 6, 2025 • 0 new comments
'torch.sparse.to_sparse_semi_structured' significantly worsens performance on H100 GPUs
#153825 commented on Jun 6, 2025 • 0 new comments
DISABLED test_nn_module (__main__.TestGuardSerialization)
#153120 commented on Jun 6, 2025 • 0 new comments
FP8 Support for FlexAttention
#151695 commented on Jun 6, 2025 • 0 new comments
Can pytorch add sparse linear solvers like scipy.sparse.linalg.gmres, scipy.sparse.linalg.bicg etc.
#133676 commented on Jun 6, 2025 • 0 new comments
☂️ MPS support for large tensors
#149325 commented on Jun 6, 2025 • 0 new comments
`bytes(...)` support of torch tensor does not match numpy + it would be nice to support tensor.tobytes() as alias
#108565 commented on Jun 6, 2025 • 0 new comments
compilation fails `error: invalid argument '-std=c++17' not allowed with 'C'`
#103222 commented on Jun 6, 2025 • 0 new comments
Divergence of handling python del in dynamo vs eager
#153701 commented on Jun 6, 2025 • 0 new comments
FSDP2 "got mixed torch.Tensor and DTensor"
#153354 commented on Jun 6, 2025 • 0 new comments
[export] fail to export joint graph of a model with tied weights using experimental `_export_forward_backward` API
#147380 commented on Jun 6, 2025 • 0 new comments
LowRankMultivariateNormal doesn't work with 0 diagonal
#75173 commented on Jun 7, 2025 • 0 new comments
Cannot Convert Pytorch model with fft_rfftn layers to ONNX using latest torch.onnx.dynamo_export
#133785 commented on Jun 7, 2025 • 0 new comments
[ONNX] Migrate torchlib from onnxscript
#139301 commented on Jun 7, 2025 • 0 new comments
☂️ Update submodule dependencies to supported version of Cmake
#150328 commented on May 26, 2025 • 0 new comments
FPE in `torch.remainder`
#153919 commented on May 27, 2025 • 0 new comments
Inductor inappropriately tries to fuse scalar views of a CPU tensor into GPU kernels.
#140457 commented on May 27, 2025 • 0 new comments
Error after successful build: No module named 'torch._C._distributed_c10d'
#152285 commented on May 27, 2025 • 0 new comments
DISABLED test_tensor_subclasses (__main__.TestScript)
#119949 commented on May 27, 2025 • 0 new comments
associative scan is incorrect for certain shapes/kwargs
#137943 commented on May 27, 2025 • 0 new comments
2.2.0+ regresses SDPA performance on Windows
#125070 commented on May 27, 2025 • 0 new comments
Slow performance when running torch.jit traced model with Flash Attention using libtorch on Windows
#109770 commented on May 27, 2025 • 0 new comments
Deprecate and remove usage of from __future__ import annotations in codebase
#117449 commented on May 27, 2025 • 0 new comments
`torch.compile` and complex numbers
#125718 commented on May 27, 2025 • 0 new comments
Building extensions with CMake
#115937 commented on May 27, 2025 • 0 new comments
AssertionError: Guard check failed: 0/1: x.size()[0] == y.size()[0] # (unknown source x.size()[0], please file a bug)
#153923 commented on May 27, 2025 • 0 new comments
Using Inductor always throws a warning
#154160 commented on May 27, 2025 • 0 new comments
The "eager" and "aot_eager" backends have different behavior for the expected gradient tensor of the torch.expend_as operator
#151884 commented on May 27, 2025 • 0 new comments
Poor-quality random numbers generated by torch.poisson on gpus
#136750 commented on May 27, 2025 • 0 new comments
Add deterministic support for upsample_trilinear3d_backward_out_cuda
#154183 commented on May 27, 2025 • 0 new comments
torch.distributed.nn.all_reduce incorrectly scales the gradient
#58005 commented on May 27, 2025 • 0 new comments
[Dynamo][Inductor] `detectron2_fcos_r_50_fpn` in export config failure on dashboard
#154137 commented on May 27, 2025 • 0 new comments
[Release improvements] Have cherry-pick bot always add the current release to the PR
#152212 commented on May 27, 2025 • 0 new comments
[MPS] Possible persistent infinite loop in `nn.ReplicationPad1d`
#135442 commented on May 27, 2025 • 0 new comments
Flex Attention doesn't scale with custom bias
#152593 commented on May 28, 2025 • 0 new comments
Undefined Symobl: pybind11::detail::type_caster<at::Tensor, void>::load(pybind11::handle, bool)
#108041 commented on May 28, 2025 • 0 new comments
`vmap` not working on `torch.arange`, `torch.scalar_tensor`, and `torch.ones`
#152295 commented on May 28, 2025 • 0 new comments
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1695392020201/work/c10/cuda/CUDACachingAllocator.cpp":1154, please report a bug to PyTorch.
#112377 commented on May 28, 2025 • 0 new comments
[XPU User Empathy Day] [Windows] First 'import torch' takes long time on Arc
#154180 commented on May 28, 2025 • 0 new comments
[Intel GPU][XPU] Slow DDP training using oneCCL backend
#153438 commented on May 28, 2025 • 0 new comments
SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats - PyTorch compile fails with Python 3.12
#153737 commented on May 28, 2025 • 0 new comments
RuntimeError prompting a bug report.
#154197 commented on May 23, 2025 • 0 new comments
Add clamped FP8 (E4M3) cast for overflow-safe inference
#154028 commented on May 23, 2025 • 0 new comments
[CUDA] test_c10d_nccl test_extra_cuda_context failure due to _helper_test_extra_cuda_context_by_memory
#153122 commented on May 23, 2025 • 0 new comments
Trying to use forward AD with _scaled_dot_product_flash_attention that does not support it because it has not been implemented yet.
#128971 commented on May 23, 2025 • 0 new comments
[inductor][triton] Block ptrs are being removed from Triton
#154025 commented on May 23, 2025 • 0 new comments
Dynamic compilation fails with torch 2.7
#153937 commented on May 23, 2025 • 0 new comments
Sum difference for equal channels of tensor
#153564 commented on May 23, 2025 • 0 new comments
[Feature request] `torch.export` .save/.load could support `safetensors` and/or `weights_only=True`
#153410 commented on May 23, 2025 • 0 new comments
`detectron2_maskrcnn` OOMs on eager with A100 40G.
#120115 commented on May 23, 2025 • 0 new comments
[feature request] "Batched" index_select (i.e. simplified torch.gather with not specifying full index)
#64208 commented on May 23, 2025 • 0 new comments
Error on padding 0-sized tensors
#152750 commented on May 23, 2025 • 0 new comments
[feature request] `torch.to(obj, device, dtype)` supporting recursive lists/dicts/tuples of tensors probably by uplifting/promoting `torch.distributed.utils._recursive_to`
#69431 commented on May 23, 2025 • 0 new comments
[feature request] [discussion] Baseline ONNX interpreter / executor in python / PyTorch
#130114 commented on May 24, 2025 • 0 new comments
pybind11 loading for c10::Scalar NYI
#154187 commented on May 24, 2025 • 0 new comments
`torch.ldexp` goes out of range when `2**other` is out of range
#153069 commented on May 24, 2025 • 0 new comments
torch.compile raise JSONDecodeError("Extra data", s, end) while using Ray with Ulysses + 4 GPUs
#153791 commented on May 24, 2025 • 0 new comments
[ued][gemma3] HF + torch.compile - torch.compile on Gemma3
#149574 commented on May 24, 2025 • 0 new comments
Inductor C++ Compile Error
#154127 commented on May 24, 2025 • 0 new comments
NotImplementedError: Output channels > 65536 not supported at the MPS device.
#144445 commented on May 24, 2025 • 0 new comments
Immutable (read-only) tensors
#44027 commented on May 24, 2025 • 0 new comments
A bunch of fft ops fails the size/strides assert
#145977 commented on May 25, 2025 • 0 new comments
Improve typing of args and kwargs with ParamSpec
#142306 commented on May 25, 2025 • 0 new comments
Segfault, possibly due to recursion limit
#127622 commented on May 25, 2025 • 0 new comments
Add option for custom ops to automatically get a FakeTensor kernel (during static shapes)
#127337 commented on May 25, 2025 • 0 new comments
`collect_env.py` fails with `'NoneType' object has no attribute 'splitlines'` if pytorch is installed without pip
#144615 commented on May 25, 2025 • 0 new comments
inductor unbacked codegen results in undefined inputs
#154146 commented on May 25, 2025 • 0 new comments
Invalid handling of nans in compiled torch.quantile / torch.nanquantile on cuda
#152423 commented on May 26, 2025 • 0 new comments
[BUG] `functionalize` silent data corruption with pre-strided tensors
#153861 commented on May 30, 2025 • 0 new comments
Conda Pytorch set processor affinity to the first physical core after fork
#99625 commented on May 30, 2025 • 0 new comments
importing torch._dynamo under meta device fails
#153330 commented on May 30, 2025 • 0 new comments
torch.compile on torch.vmap function gives different shape to torch.vmap alone when using jacrev
#154036 commented on May 30, 2025 • 0 new comments
[FSDP+TP] RuntimeError: 'weight' must be 2-D
#124019 commented on May 30, 2025 • 0 new comments
Unexpected incorrect size error in GaussianNLLLoss
#147521 commented on May 30, 2025 • 0 new comments
AOTAutograd export path does not support training graphs with parameters that do not receive gradients.
#101192 commented on May 30, 2025 • 0 new comments
`__setitem__` with bool mask and dtype mismatch fails
#150017 commented on May 30, 2025 • 0 new comments
`index_copy` has different index behavior with `index_fill`
#73501 commented on May 30, 2025 • 0 new comments
Bracket indexing not working
#145143 commented on May 30, 2025 • 0 new comments
torch.compile joint trace materializes constant tensors
#154083 commented on May 30, 2025 • 0 new comments
[ONNX] export() with dynamic shapes fails where dynamo_export(dynamic_shapes=True) succeeds
#126607 commented on May 31, 2025 • 0 new comments
Stable C bindings for libtorch
#145656 commented on May 31, 2025 • 0 new comments
Non-blocking GPU to CPU copy of complex numbers with the different conj status produces wrong results
#146286 commented on May 31, 2025 • 0 new comments
torch.nn.functional.scaled_dot_product_attention is_causal fails for kv-cache case (sequential and further parallel attention)
#144858 commented on May 31, 2025 • 0 new comments
【Pytorch mobile android】torch jit forward fail,Calling torch.geqrf on a CPU tensor requires compiling PyTorch with LAPACK. Please use PyTorch built with LAPACK support.
#130309 commented on May 31, 2025 • 0 new comments
Torch 2.1 compile + FSDP (mixed precision) + LlamaForCausalLM: `RuntimeError: attempting to assign a gradient with dtype 'c10::BFloat16' to a tensor with dtype 'float'.`
#111317 commented on Jun 1, 2025 • 0 new comments
dist.init_process_group not work
#154102 commented on Jun 1, 2025 • 0 new comments
torch.compile doesnot support index with tensor
#151997 commented on Jun 1, 2025 • 0 new comments
Async NCCL communication blocks CUDA kernel in the first run
#136248 commented on Jun 2, 2025 • 0 new comments
MPS Backend Error: ComplexDouble (complex128) Conversion Fails When Diffusers Transformer Creates 64‐bit Complex Tensors
#148670 commented on Jun 2, 2025 • 0 new comments
RAM leak during data loading with multiprocessing and Conv3d on CPU in Dataset __getitem__
#150612 commented on Jun 2, 2025 • 0 new comments
Add aten::empty.memory_format for SparseMPS
#87886 commented on Jun 2, 2025 • 0 new comments
ByteTensor fails under FakeTensorMode()
#146635 commented on Jun 2, 2025 • 0 new comments
addition of muon optimizer to torch.optim
#148819 commented on Jun 2, 2025 • 0 new comments
Crash when testing Libtorch example
#129819 commented on Jun 2, 2025 • 0 new comments
Kohya SS FLUX LoRA training is way faster on Linux than Windows any ideas to debug? Same settings, libraries and GPU
#134324 commented on Jun 2, 2025 • 0 new comments
MX basic dtypes in pytorch/pytorch
#146414 commented on May 28, 2025 • 0 new comments
[inductor][cpu] aoti shared memory run on multiple cores, performance will drop
#154094 commented on May 28, 2025 • 0 new comments
Update `torch/nn/modules/conv.py` to use Literal for support padding modes
#152280 commented on May 28, 2025 • 0 new comments
MPS: Placeholder tensor is empty!
#81180 commented on May 28, 2025 • 0 new comments
NFS errors during DataLoader shutdown when num_workers > 1 when temporary directory is on NFS
#143471 commented on May 28, 2025 • 0 new comments
CuDNN SDPA Issue Tracker
#141133 commented on May 28, 2025 • 0 new comments
`import torch` takes forever
#137260 commented on May 28, 2025 • 0 new comments
cuda_utils.so: failed to map segment from shared object
#123054 commented on May 28, 2025 • 0 new comments
inductor `full_like` decompositions give incorrect strides
#144699 commented on May 28, 2025 • 0 new comments
Performance Regression nightly 03/11→03/12, on nanogpt speedrun
#152823 commented on May 28, 2025 • 0 new comments
[nightly][jit] bad constant exponent (e+38.f) in default_program fused_mul_div_add
#107503 commented on May 29, 2025 • 0 new comments
[onnx] [njt] [feature request] Export NJT-enabled SDPA / MHA ops to ORT's PackingMode Attention
#140130 commented on May 29, 2025 • 0 new comments
with torch compile, bf16 gelu,silu, and mish are not deterministic in some sense.
#154150 commented on May 29, 2025 • 0 new comments
undefined symbol: __nvJitLinkCreate_12_8, version libnvJitLink.so.12
#152783 commented on May 29, 2025 • 0 new comments
GRU and LSTM fail for seq_len = 0
#50192 commented on May 29, 2025 • 0 new comments
[torch.export.load] failed while executing `pow_by_natural`
#136628 commented on May 29, 2025 • 0 new comments
MPS backend gradient correctness issues with large shapes
#153957 commented on May 29, 2025 • 0 new comments
A more flexible API for torch.compile fullgraph=True
#144908 commented on May 29, 2025 • 0 new comments
[Benchmark] High compilation time variance on benchmark dashboards
#152566 commented on May 29, 2025 • 0 new comments
Selective Activation Checkpointing on custom autograd.Function
#153334 commented on May 29, 2025 • 0 new comments
Support torch.func.grad for Flex Attention
#144810 commented on May 29, 2025 • 0 new comments
SourcelessBuilder.create does not know how to wrap <class '__main__.InFlexData'>
#154009 commented on May 30, 2025 • 0 new comments
torch.nn.functional.scaled_dot_product_attention returns NaN values after backward pass.
#126654 commented on May 30, 2025 • 0 new comments
Feature Request: CUDA torch.histogram (and histogramdd)
#69519 commented on May 30, 2025 • 0 new comments
Massive initial memory overhead GPU
#12873 commented on May 30, 2025 • 0 new comments
Device Error on vmap
#151591 commented on May 30, 2025 • 0 new comments
torch.compile fails for complex nested_tensor code
#130825 commented on May 30, 2025 • 0 new comments
[dtensor] ops coverage tracker
#119930 commented on Jun 18, 2025 • 0 new comments
RFC: The State of Custom CUDA extensions in PyTorch
#152032 commented on Jun 19, 2025 • 0 new comments
DISABLED test_memory_snapshot (__main__.TestCudaMallocAsync)
#126953 commented on Jun 19, 2025 • 0 new comments
Dynamo handling for all methods of torch.Generator
#88576 commented on Jun 19, 2025 • 0 new comments
Windows inductor generated zero size array code, and is not supported by MSVC(C2466).
#153180 commented on Jun 19, 2025 • 0 new comments
DISABLED test_mempool_ctx_multithread (__main__.TestMemPool)
#153460 commented on Jun 19, 2025 • 0 new comments
NotImplementedError: Could not run 'aten::log' with arguments from the 'SparseCUDA' backend.
#153497 commented on Jun 19, 2025 • 0 new comments
cpp wrapper calls back to python for custom op even when a C++ registration is made
#153478 commented on Jun 19, 2025 • 0 new comments
Segmentation fault when converting sparse COO tensor with complex values to dense
#153329 commented on Jun 19, 2025 • 0 new comments
`torch.sparse.log_softmax` output mismatch between CPU and CUDA
#152293 commented on Jun 19, 2025 • 0 new comments
NotImplementedError: Could not run 'aten::index.Tensor' with arguments from the 'SparseCUDA' backend.
#152226 commented on Jun 19, 2025 • 0 new comments
Sparse tensor indexing not implemented, but partially supported by using index_select
#150277 commented on Jun 19, 2025 • 0 new comments
Dynamo export: Fake tensor broadcast error
#129534 commented on Jun 19, 2025 • 0 new comments
DISABLED test_wait_tensor (__main__.CompileTest)
#148014 commented on Jun 19, 2025 • 0 new comments
Make streams used for NCCL operations configurable
#67158 commented on Jun 19, 2025 • 0 new comments
[ONNX] ONNX export of simple quantized model fails
#113817 commented on Jun 20, 2025 • 0 new comments
[torch.compile][Megatron] Error with Megatron with Pytorch v2.5.0 using `AOTAutograd` and `torch.compile`
#141783 commented on Jun 20, 2025 • 0 new comments
Remove redundant type aliases of _device for torch.Device
#152952 commented on Jun 20, 2025 • 0 new comments
The docstring linter should not force overridden methods to be documented
#151692 commented on Jun 20, 2025 • 0 new comments
DISABLED test_sdpa_mask_fp16_L6_S17_NH23_HS121 (__main__.TestSDPA)
#138905 commented on Jun 20, 2025 • 0 new comments
[ONNX] Simple torch.nn.Identity onnx export with dynamo=True does not load
#151017 commented on Jun 20, 2025 • 0 new comments
[ONNX] Use dlpack to transfer tensors when onnxruntime implements proper support
#151064 commented on Jun 20, 2025 • 0 new comments
[export] Decomp failure when running `aten.item.default`
#150823 commented on Jun 20, 2025 • 0 new comments
[ONNX Convert] Error when input to nn.AdaptiveAvgPool2d size 10000 is variable
#147720 commented on Jun 20, 2025 • 0 new comments
`torch.onnx.export` (dynamo=False) fails with uninformative error when exporting `apply_rotary_pos_emb`/`repeat_interleave`
#145100 commented on Jun 20, 2025 • 0 new comments
[ONNX] broadcast_in_dim: model (ReDimNet)
#138313 commented on Jun 20, 2025 • 0 new comments
[CI] [anaconda] CI Build and Test scripts Linux
#148336 commented on Jun 20, 2025 • 0 new comments
[DCP] Allow for rank-specific tensors with duplicate keys
#146566 commented on Jun 17, 2025 • 0 new comments
[feature request] Exact euclidean distance transform
#61509 commented on Jun 17, 2025 • 0 new comments
DISABLED test_run_decompositions_map_handle_to_new_nodes (__main__.TestNumericDebugger)
#144933 commented on Jun 17, 2025 • 0 new comments
Feature Request: Add a rounding mode to round
#55289 commented on Jun 17, 2025 • 0 new comments
[feature request] Rank-Revealing QR - Adding dgeqp3 support to torch.qr
#10454 commented on Jun 17, 2025 • 0 new comments
TorchInductor CPU Performance Dashboard
#93531 commented on Jun 17, 2025 • 0 new comments
torch.compile on MPS progress tracker
#150121 commented on Jun 17, 2025 • 0 new comments
[Async TP] Fuse all-gather-matmuls for float8 rowwise training
#149990 commented on Jun 17, 2025 • 0 new comments
Inductor Perf MX to_blocked
#153194 commented on Jun 17, 2025 • 0 new comments
[ONNX] dynamic_axes does not rename dynamic dimension in torch.onnx.export
#150544 commented on Jun 17, 2025 • 0 new comments
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float32 (__main__.TestForeachCUDA)
#153470 commented on Jun 17, 2025 • 0 new comments
The difference between input grad computed by channels last backward and the input grad computed by channels first backward of Hardswish on MPS is too large
#107214 commented on Jun 17, 2025 • 0 new comments
[NJT] can only chunk if the 2nd dimension is ragged
#153238 commented on Jun 17, 2025 • 0 new comments
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "/opt/pytorch/pytorch/c10/cuda/CUDACachingAllocator.cpp":830, please report a bug to PyTorch.
#123834 commented on Jun 17, 2025 • 0 new comments
Device check missing in torch.linalg.solve_triangular leading to hard crash
#142048 commented on Jun 17, 2025 • 0 new comments
UR Error when calling grid_sample
#153996 commented on Jun 18, 2025 • 0 new comments
DTensor RNG state for non CUDA backends
#138329 commented on Jun 18, 2025 • 0 new comments
xpu: implement aten::_linalg_eigvals for XPU backend (affecting HF Transformers v4.46.0-v4.48.0)
#140965 commented on Jun 18, 2025 • 0 new comments
DISABLED test_per_sample_api_compute_batch_size_not_pytreeable_cpu (__main__.TestExpandedWeightModuleCPU)
#146972 commented on Jun 18, 2025 • 0 new comments
DISABLED test_inductor_all_gather_into_tensor_single (__main__.CompileTest)
#147707 commented on Jun 18, 2025 • 0 new comments
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float64 (__main__.TestForeachCUDA)
#153544 commented on Jun 18, 2025 • 0 new comments
Timer benchmark stores only one time value, and therefore has broken mean/median/etc metrics
#106801 commented on Jun 18, 2025 • 0 new comments
[RFC][API-Unstable] Support 3rd party SYCL kernels with CPP Extension API
#153265 commented on Jun 18, 2025 • 0 new comments
Compile produces different result than eager for mutable custom op use case
#153389 commented on Jun 18, 2025 • 0 new comments
Escape hatch: way to dynamically add or remove tags from custom operators
#150972 commented on Jun 18, 2025 • 0 new comments
"RuntimeError: makeDeviceForHostname(): unsupported gloo device" with nightly torch 2.8
#150381 commented on Jun 18, 2025 • 0 new comments
Context Parallel -- unsharded output doesn't match output without CP.
#152261 commented on Jun 18, 2025 • 0 new comments
Compiled `nn.Module` with tensor subclass can't be moved to another device
#141548 commented on Jun 23, 2025 • 0 new comments
Flex Attention is incompatible with selective AC
#147879 commented on Jun 23, 2025 • 0 new comments
DISABLED test_cat_max_autotune_triton (__main__.TestMaxAutotune)
#145830 commented on Jun 23, 2025 • 0 new comments
DISABLED test_ranks_and_tag (__main__.CompileTest)
#147974 commented on Jun 23, 2025 • 0 new comments
Support sparse COO/CSR/CSC/BSR/BSC return values in gradcheck input function
#97825 commented on Jun 19, 2025 • 0 new comments
Support building pytorch using MKL ILP64 model.
#102613 commented on Jun 19, 2025 • 0 new comments
Automated submodule update: kineto
#106149 commented on Jun 19, 2025 • 0 new comments
[RFC] Tensordict integration
#112441 commented on May 28, 2025 • 0 new comments
[pytree] support PyStructSequence types for Python pytree
#113258 commented on Jun 19, 2025 • 0 new comments
Automated submodule update: FBGEMM
#115316 commented on Jun 23, 2025 • 0 new comments
[DO NOT MERGE] Test new ROCm CI Navi31 nodes
#124424 commented on May 25, 2025 • 0 new comments
refine fp32 precision api
#125888 commented on Jun 23, 2025 • 0 new comments
allow to use bf16 as fp32 internal precision for mkldnn conv
#126050 commented on Jun 23, 2025 • 0 new comments
allow to use bf16 as fp32 internal precision for mkldnn conv backward
#126054 commented on Jun 23, 2025 • 0 new comments
[AOTAutograd] tweak min-cut partitioner to avoid saving softmax output
#126348 commented on Jun 18, 2025 • 0 new comments
[inductor] enable bf32 test for mkldnn conv
#127293 commented on Jun 23, 2025 • 0 new comments
[inductor] enable bf32 for mkldnn linear pointwise/binary in inductor
#127294 commented on Jun 23, 2025 • 0 new comments
Fix numerical instability for norm
#129352 commented on Jun 22, 2025 • 0 new comments
[DTensor] decomposed sharding propagation
#130887 commented on Jun 19, 2025 • 0 new comments
Remove deprecated jit code
#131296 commented on Jun 20, 2025 • 0 new comments
[torch.special] Adding betainc, betaincc, betaincinv, betainccinv, betaln and beta with backward operation
#132135 commented on Jun 9, 2025 • 0 new comments
Avoid sqrt calculations with values less than zero
#136824 commented on Jun 21, 2025 • 0 new comments
Load cuda deps more aggressively
#137059 commented on Jun 12, 2025 • 0 new comments
Help fix numpy detection in cross compiled layouts
#137084 commented on Jun 7, 2025 • 0 new comments
[pytree] Add public pytree module `torch.utils.pytree`
#137400 commented on Jun 18, 2025 • 0 new comments
Add TORCH_CHECK_INDEX in convert_indices_from_coo_to_csr_cpu
#138068 commented on Jun 18, 2025 • 0 new comments
[pytree] add `treespec_{leaf,tuple,dict}` functions for args_spec modification
#138214 commented on Jun 18, 2025 • 0 new comments
[CI] [anaconda] CI Build and Test scripts Windows
#148338 commented on Jun 20, 2025 • 0 new comments
[Docs] [anaconda] Review and update
#148339 commented on Jun 20, 2025 • 0 new comments
[CI] [anaconda] CI Build and Test scripts MacOS
#148340 commented on Jun 20, 2025 • 0 new comments
[CI] [anaconda] Docker files have conda environment installed
#148335 commented on Jun 20, 2025 • 0 new comments
[release] Make pytorch source distribution package respect pep-0517
#150461 commented on Jun 20, 2025 • 0 new comments
CUDA 12.6 Inductor accuracy test failures
#148699 commented on Jun 20, 2025 • 0 new comments
[torch.export] Cannot export TorchVision fasterrcnn_mobilenet_v3_large_fpn
#146152 commented on Jun 20, 2025 • 0 new comments
DISABLED test_non_contiguous_input_mm_plus_mm (__main__.TestMaxAutotune)
#126867 commented on Jun 20, 2025 • 0 new comments
DISABLED test_slice_scatter_reinplace_cuda (__main__.GPUTests)
#145189 commented on Jun 20, 2025 • 0 new comments
DISABLED test_inductor_reduce_scatter_tensor_coalesced (__main__.CompileTest)
#147887 commented on Jun 20, 2025 • 0 new comments
UNSTABLE pull / cuda12.8-py3.10-gcc9-sm75 / test (pr_time_benchmarks)
#153987 commented on Jun 20, 2025 • 0 new comments
torch.export does not support torchaudio.transforms.Spectrogram
#112844 commented on Jun 20, 2025 • 0 new comments
MSE documentation is weak
#88327 commented on Jun 20, 2025 • 0 new comments
Division by zero in ONNX export with `dynamo=True` leading to NaN outputs
#150623 commented on Jun 21, 2025 • 0 new comments
Looking for valid compiling option for extension based on torch-2.1.0+cpu.cxx11.abi
#143780 commented on Jun 21, 2025 • 0 new comments
Tensor.lerp inconsistent when using -Infinity between MPS and CPU
#111374 commented on Jun 21, 2025 • 0 new comments
Segmentation fault when calling `torch.choose_qparams_optimized()` with empty tensors and extreme num_bins value
#153326 commented on Jun 21, 2025 • 0 new comments
CompiledFxGraph.current_callable is not thread-safe
#138961 commented on Jun 21, 2025 • 0 new comments
General MPS op coverage tracking issue
#77764 commented on Jun 21, 2025 • 0 new comments
[ONNX] exported nodes of Multi-head attention can be simplified
#151209 commented on Jun 21, 2025 • 0 new comments
MPS operator coverage tracking issue (2.6+ version)
#141287 commented on Jun 22, 2025 • 0 new comments
Online softmax is disabled on the fly
#153241 commented on Jun 22, 2025 • 0 new comments
foreach CUDA tests flaky on CUDA 12.6+ due to flaky profiler results
#148681 commented on Jun 22, 2025 • 0 new comments
Improve error message for wrong number of arguments in CachingAutotuner
#146018 commented on Jun 23, 2025 • 0 new comments
Docs are little bit outdated for torch logs
#137285 commented on Jun 23, 2025 • 0 new comments
[RFC] Use CUDA graphs by default on torch.compile
#121968 commented on Jun 23, 2025 • 0 new comments
`setup.py develop` command is disappearing soon from `setuptools`
#152276 commented on Jun 23, 2025 • 0 new comments
DISABLED test_inductor_inplace_op_on_view (__main__.CompileTest)
#147852 commented on Jun 11, 2025 • 0 new comments
Build pytorch for rocm failed
#148167 commented on Jun 11, 2025 • 0 new comments
DISABLED test_repeated_calling_cuda (__main__.AOTInductorTestABICompatibleGpu)
#146185 commented on Jun 11, 2025 • 0 new comments
DISABLED test_inductor_reduce_scatter_tensor_single (__main__.CompileTest)
#147911 commented on Jun 11, 2025 • 0 new comments
DISABLED test_foreach_l2_large_value_input__foreach_norm_cuda_float16 (__main__.TestForeachCUDA)
#150509 commented on Jun 11, 2025 • 0 new comments
When scoped_libary is destroyed the fake impls are not cleared
#152720 commented on Jun 11, 2025 • 0 new comments
Different Cholesky results between Windows & Linux
#131774 commented on Jun 11, 2025 • 0 new comments
Some files in sccache are owned by `hostmaster+pytorch`
#139143 commented on Jun 11, 2025 • 0 new comments
torch.compile fails in FSDP due to .data assignment with different floating type
#152162 commented on Jun 11, 2025 • 0 new comments
custom_op's backward changes can't invalidate `torch.compile` cache for backward
#144344 commented on Jun 11, 2025 • 0 new comments
Pytorch PP requires all parameters to have grad in backward
#153484 commented on Jun 11, 2025 • 0 new comments
MaxPool2D memory leakage on device MPS
#125217 commented on Jun 11, 2025 • 0 new comments
DISABLED test_inductor_reuse_buffer_after_inplace_collective (__main__.CompileTest)
#147950 commented on Jun 11, 2025 • 0 new comments
[PTD BE DAY]Burn Down Distributed Disabled Tests!!
#132845 commented on Jun 12, 2025 • 0 new comments
NCCL ISend is not asynchronous
#108378 commented on Jun 12, 2025 • 0 new comments
nn.InstanceNorm and nn.GroupNorm are affected by padding, so they need to masking
#81985 commented on Jun 12, 2025 • 0 new comments
[RFC] Proposed Changes to Feature Tracking & Classification for PyTorch Releases starting Release 2.8
#152134 commented on Jun 12, 2025 • 0 new comments
DISABLED test_remove_noop_slice1_cpu (__main__.CpuTests)
#151379 commented on Jun 12, 2025 • 0 new comments
DISABLED test_remove_noop_slice1_cuda (__main__.GPUTests)
#151381 commented on Jun 12, 2025 • 0 new comments
DISABLED test_remove_noop_slice_cuda (__main__.GPUTests)
#151383 commented on Jun 12, 2025 • 0 new comments
DISABLED test_remove_noop_slice_scatter_cpu (__main__.CpuTests)
#151382 commented on Jun 12, 2025 • 0 new comments
DISABLED test_inductor_all_gather_into_tensor_coalesced (__main__.CompileTest)
#146806 commented on Jun 12, 2025 • 0 new comments
DISABLED test_remove_noop_slice_scatter_cuda (__main__.GPUTests)
#151378 commented on Jun 12, 2025 • 0 new comments
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_bfloat16 (__main__.TestForeachCUDA)
#150932 commented on Jun 12, 2025 • 0 new comments
[ONNX] Flip `dynamo` default to True in torch.onnx.export
#151693 commented on Jun 12, 2025 • 0 new comments
[ONNX] Support torchvision ops
#146459 commented on Jun 12, 2025 • 0 new comments
torch.nn.functional.one_hot has inconsistent behavior between eager and torch.compile when num_classes=0
#146274 commented on Jun 12, 2025 • 0 new comments
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_float32 (__main__.TestForeachCUDA)
#153284 commented on Jun 10, 2025 • 0 new comments
DISABLED test_while_loop_schema_gen (__main__.TestHopSchema)
#141202 commented on Jun 10, 2025 • 0 new comments
DISABLED test_rng (__main__.TestCompilerBisector)
#139590 commented on Jun 10, 2025 • 0 new comments
Memory Corruption in `torch.batch_norm_update_stats`
#153967 commented on Jun 10, 2025 • 0 new comments
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_float64 (__main__.TestForeachCUDA)
#153395 commented on Jun 10, 2025 • 0 new comments
need to document `FlopCounterMode`
#145555 commented on Jun 10, 2025 • 0 new comments
torch.nn.functional.conv_transpose2d produces inconsistent output on CPU and CUDA
#153276 commented on Jun 10, 2025 • 0 new comments
Unexpected overflow behavior when using `torch.addcmul`
#152294 commented on Jun 10, 2025 • 0 new comments
`torch.nn.functional.conv_transpose2d` has inconsistent handling of `float16` overflow on CPU
#153700 commented on Jun 10, 2025 • 0 new comments
[doc] functionalities not documented
#9886 commented on Jun 10, 2025 • 0 new comments
Missing examples in some API docs
#103844 commented on Jun 10, 2025 • 0 new comments
torch.multiprocessing.Queue Zeroes Out Tensors on Retrieval
#149155 commented on Jun 10, 2025 • 0 new comments
DISABLED test_inductor_all_to_all_single (__main__.CompileTest)
#147795 commented on Jun 10, 2025 • 0 new comments
DISABLED test_inductor_all_reduce_non_contig_input (__main__.CompileTest)
#147733 commented on Jun 10, 2025 • 0 new comments
JVP: Option to Disable Gradient Caching for Tangents
#151782 commented on Jun 10, 2025 • 0 new comments
Unexpected Behavior when using torch.isclose()
#102400 commented on Jun 10, 2025 • 0 new comments
`Segmentation fault` in `torch.nn.utils.rnn.pad_packed_sequence` and `torch.nn.utils.rnn.unpack_sequence`
#149622 commented on Jun 10, 2025 • 0 new comments
[inductor] [cpu] `torch.nn.RReLU()` outputs different results with eager on cpp backend
#147255 commented on Jun 10, 2025 • 0 new comments
Setting the tensor of the values out of `[0,1]` to `target` argument of `nn.CrossEntropyLoss()` with class probabilities works against the doc
#134771 commented on Jun 10, 2025 • 0 new comments
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int16 (__main__.TestForeachCUDA)
#150309 commented on Jun 10, 2025 • 0 new comments
[dynamo] Try tracing into einops
#152480 commented on Jun 10, 2025 • 0 new comments
Tensor parallel for convolutions and groupnorm
#133221 commented on Jun 11, 2025 • 0 new comments
deepcopy of LazyLinear fails
#83168 commented on Jun 11, 2025 • 0 new comments
DISABLED test_comprehensive_nn_functional_conv_transpose3d_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#148853 commented on Jun 11, 2025 • 0 new comments
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_float64 (__main__.TestForeachCUDA)
#150562 commented on Jun 11, 2025 • 0 new comments
DISABLED test_foreach_l2_large_value_input__foreach_norm_cuda_bfloat16 (__main__.TestForeachCUDA)
#150467 commented on Jun 11, 2025 • 0 new comments
xpu: installed pytorch is missing aten xpu ops headers (ATen/ops/cat_xpu_dispatch.h and others)
#145902 commented on Jun 11, 2025 • 0 new comments
Error computing the norm of MaskedTensor
#117287 commented on Jun 14, 2025 • 0 new comments
CTCLoss gradient is incorrect
#52241 commented on Jun 14, 2025 • 0 new comments
Incompatible Torch and Torchvision while building from source for 2.6.0 and CUDA 12.6, RuntimeError: operator torchvision::nms does not exist
#146221 commented on Jun 15, 2025 • 0 new comments
Enable `torch.topk` to support `stable` flag
#88227 commented on Jun 15, 2025 • 0 new comments
bmm, topk, cholesky, linalg.norm, max with out variants set causing recompilations in torch.compile
#135859 commented on Jun 15, 2025 • 0 new comments
Enable TorchInductor to Generate Matmuls Natively via `tl.dot`
#151705 commented on Jun 15, 2025 • 0 new comments
Continuous calls to nn.Linear in fp32 on the 5090D cause severe performance degradation
#150725 commented on Jun 15, 2025 • 0 new comments
DISABLED test_remove_noop_view_dtype_cuda (__main__.GPUTests)
#151541 commented on Jun 16, 2025 • 0 new comments
Some Doc Issue about `torch.lobpcg()`
#152107 commented on Jun 16, 2025 • 0 new comments
DTensor + torch.compile on CPU: compiled matmul fails with multiple shape inputs
#154111 commented on Jun 16, 2025 • 0 new comments
DISABLED test_jacobian_vectorize_raises_no_warnings_logging_tensor (__main__.TestAutogradFunctional)
#153707 commented on Jun 16, 2025 • 0 new comments
TypeError when using torch.cuda.list_gpu_processes() on Windows with the WDDM driver
#64491 commented on Jun 16, 2025 • 0 new comments
[RFC][API-Unstable] Intel GPU distributed Backend integration in `torch-xpu-ops` and registration in PyTorch
#141741 commented on Jun 16, 2025 • 0 new comments
[Tracker] Nested tensor op coverage requests
#118107 commented on Jun 16, 2025 • 0 new comments
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float16 (__main__.TestForeachCUDA)
#153379 commented on Jun 16, 2025 • 0 new comments
Most requested ops for the MPS backend
#154052 commented on Jun 16, 2025 • 0 new comments
FSDP learning hangs when the program tries to save the model
#143536 commented on Jun 16, 2025 • 0 new comments
Cannot compile with latest LLVM-19
#139065 commented on Jun 16, 2025 • 0 new comments
Can't call torch.compile inside of a custom op
#151328 commented on Jun 16, 2025 • 0 new comments
Suggestion: integration of einops test suite
#146782 commented on Jun 16, 2025 • 0 new comments
Segmentation error for torch==2.2.1 on MacOs
#121101 commented on Jun 16, 2025 • 0 new comments
Graph Partition Issue Tracker
#151832 commented on Jun 16, 2025 • 0 new comments
Add support for MaxPool3D on the MPS backend
#100674 commented on Jun 16, 2025 • 0 new comments
DISABLED test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn)
#75648 commented on Jun 16, 2025 • 0 new comments
Process never ends when sending tensors through multiprocessing queues in Python 3.12+ on macOS
#153050 commented on Jun 16, 2025 • 0 new comments
DISABLED test_lowering_to_x86 (__main__.TestQuantizePT2EX86Inductor)
#153140 commented on Jun 17, 2025 • 0 new comments
DISABLED test_re_export_preserve_handle (__main__.TestNumericDebugger)
#144898 commented on Jun 17, 2025 • 0 new comments
RuntimeError: [1] is setting up NCCL communicator and retreiving ncclUniqueId from [0] via c10d key-value store by key '0', but store->get('0') got error: Timeout waiting for key: default_pg/0/0 after 1800000 ms
#82091 commented on Jun 12, 2025 • 0 new comments
DISABLED test_matrix_rank_basic_cuda_float32 (__main__.TestLinalgCUDA)
#150406 commented on Jun 12, 2025 • 0 new comments
DISABLED test_remove_noop_slice_cpu (__main__.CpuTests)
#151384 commented on Jun 12, 2025 • 0 new comments
Label tracking meta-issue (edit me to get automatically CC'ed on issues! cc bot)
#24422 commented on Jun 12, 2025 • 0 new comments
result_type doesn't take dtypes and doesn't match numpy
#51284 commented on Jun 13, 2025 • 0 new comments
mark_unbacked for strides.
#153204 commented on Jun 13, 2025 • 0 new comments
DISABLED test_foreach_check_stride_ignore_dims_of_one_cuda_float32 (__main__.TestForeachCUDA)
#150026 commented on Jun 13, 2025 • 0 new comments
[feature request] Provide FlexAttention as a new available/selectable backend for SDPA
#137574 commented on Jun 13, 2025 • 0 new comments
DISABLED test_is_isnot (__main__.TestScript)
#120694 commented on Jun 13, 2025 • 0 new comments
DISABLED test_int64_upsample3d_cuda_bfloat16 (__main__.TestTorchDeviceTypeCUDA)
#146007 commented on Jun 13, 2025 • 0 new comments
DISABLED test_hessian_vectorize_raises_no_warnings_logging_tensor (__main__.TestAutogradFunctional)
#153644 commented on Jun 13, 2025 • 0 new comments
[RFC][API-Unstable]Enable A16W4 on XPU Device
#153019 commented on Jun 13, 2025 • 0 new comments
DISABLED test_remove_noop_view_default_cpu (__main__.CpuTests)
#151512 commented on Jun 13, 2025 • 0 new comments
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_bool (__main__.TestForeachCUDA)
#151229 commented on Jun 13, 2025 • 0 new comments
Multihead Attention does not work with jagged tensors due to __torch_function__
#153472 commented on Jun 13, 2025 • 0 new comments
MPS Performance regressions on Sonoma 14.0
#111517 commented on Jun 13, 2025 • 0 new comments
DISABLED test_remove_noop_view_default_cuda (__main__.GPUTests)
#151511 commented on Jun 13, 2025 • 0 new comments
DISABLED test_remove_noop_view_dtype_cpu (__main__.CpuTests)
#151540 commented on Jun 13, 2025 • 0 new comments
[ONNX] Failed to export PyTorch-2-Export-Quantized model to onnx
#143474 commented on Jun 13, 2025 • 0 new comments
torch._dynamo.exc.Unsupported: builtin: bool [<class 'torch._dynamo.variables.tensor.SymNodeVariable'>] False
#136075 commented on Jun 13, 2025 • 0 new comments
reshape_view_helper is only used for fake tensor tracing but not proxy tracing.
#153303 commented on Jun 14, 2025 • 0 new comments
eval should handle (unhinted: (s77 > 3) | (u0 > 200)) when s77 has hint =5
#153227 commented on Jun 14, 2025 • 0 new comments
MPS Sparse Support
#129842 commented on Jun 14, 2025 • 0 new comments
Make tlparse able to show a summary of distinct graph breaks
#153669 commented on Jun 14, 2025 • 0 new comments
[torch.export] Torch Export produces incorrect program when python generators are used.
#130975 commented on Jun 14, 2025 • 0 new comments
DISABLED test_distributed_checkpoint_state_dict_type0_cuda (__main__.TestDistributedCheckpointCUDA)
#145807 commented on Jun 14, 2025 • 0 new comments
`INTERNAL ASSERT FAILED` in `interpolate` and `torch.import_ir_module`
#149737 commented on Jun 14, 2025 • 0 new comments
