Insights: pytorch/pytorch
Overview
1 Release published by 1 person
-
v2.7.1 PyTorch 2.7.1 Release, bug fix release
published
Jun 4, 2025
10 Pull requests merged by 4 people
-
Bump requests from 2.32.2 to 2.32.4 in /.github
#155491 merged
Jun 16, 2025 -
Bump pillow from 10.0.1 to 10.3.0 in /.github/requirements
#154416 merged
Jun 4, 2025 -
Revert "Temporarily disable sparse tensor validation when loading from external storage."
#154755 merged
May 30, 2025 -
Revert "Add optional check_pinning argument to _validate_sparse_compressed_tensor/coo_args"
#154751 merged
May 30, 2025 -
Add optional check_pinning argument to _validate_sparse_compressed_tensor/coo_args
#154617 merged
May 30, 2025 -
Temporarily disable sparse tensor validation when loading from external storage.
#154600 merged
May 29, 2025 -
set thread_work_size to 4 for unrolled kernel
#154541 merged
May 29, 2025 -
[c10d] Fix extra CUDA context created by barrier
#152834 merged
May 27, 2025 -
[c10d] Add more tests to prevent extra context
#154179 merged
May 27, 2025 -
[CI] Remove the xpu env source for linux binary validate
#154409 merged
May 27, 2025
529 Pull requests opened by 236 people
-
Revert D74898941 (#154188)
#154203 opened
May 23, 2025 -
[c10d] Add more tests to prevent extra context
#154204 opened
May 23, 2025 -
Fix for ISSUE #153069
#154211 opened
May 23, 2025 -
Support deterministic upsample trilinear backward
#154239 opened
May 23, 2025 -
[Dynamo] [FrozensetSubclass] Add support for user defined frozensets
#154263 opened
May 23, 2025 -
[MTIA Aten Backend][3/n] Migrate mm.out from out-of-tree to in-tree
#154277 opened
May 23, 2025 -
Add Support for transposed convolution with Padding Mode 'same' in C++
#154279 opened
May 23, 2025 -
[dynamo] control one_graph behavior additionally through config
#154283 opened
May 23, 2025 -
[dynamo] add set_fullgraph decorator/context manager
#154289 opened
May 23, 2025 -
[cuBLAS][cuBLASLt] Reduce scale of inputs for reduced precision reduction matmul test
#154293 opened
May 24, 2025 -
Aten vector default constructors set to 0, add fnmadd and fnmsub
#154298 opened
May 24, 2025 -
[DO NOT MERGE] Oboarding lab pt 3 4
#154315 opened
May 25, 2025 -
feat: add torch.export .save/.load to support `safetensors` and/or weights_only=True
#154330 opened
May 25, 2025 -
[MTIA Aten Backend][1.2/n] Migrate as_strided to in-tree, and add unit tests
#154334 opened
May 26, 2025 -
Add dispatch log for torch benchmark
#154338 opened
May 26, 2025 -
Enhance testing infrastructure to add half-precision support for `histc` on XPU
#154339 opened
May 26, 2025 -
[Reland] [Intel GPU] Make SDPA output has the same stride as Query.
#154340 opened
May 26, 2025 -
Allow decomposeK to fuse
#154349 opened
May 26, 2025 -
Draft subgraph fusion
#154350 opened
May 26, 2025 -
Add `torch.segment_reduce` docs
#154352 opened
May 26, 2025 -
Ensure Dynamo can trace through explicit dunder method call
#154366 opened
May 26, 2025 -
Fix: Ensure writeback handles NO_SHARD correctly by flattening tensors before copying
#154369 opened
May 26, 2025 -
Draft
#154388 opened
May 26, 2025 -
[WIP] Do not unroll reduction when reduced new size is empty.
#154389 opened
May 27, 2025 -
Don't need to handle PyTrace_EXCEPTION in pyProfileFn
#154392 opened
May 27, 2025 -
Updated padding validation in max_pool functions to account for dilation
#154395 opened
May 27, 2025 -
Convert torch.rst to .md
#154438 opened
May 27, 2025 -
[Not for review] Fix rebuild buckets when find_unused_parameters=True for use cases like GAN based models
#154447 opened
May 27, 2025 -
[Inductor] Fix remove_noop_ops pass where the types for the same_meta would differ
#154460 opened
May 27, 2025 -
[WIP] oblivious where
#154468 opened
May 27, 2025 -
[CI] [CUDA] Add CUDA 12.8 eager CI tests
#154469 opened
May 28, 2025 -
[easy] better copy_misaligned_inputs assertion failure message
#154472 opened
May 28, 2025 -
[dynamic shapes] guard_or_false should_swap
#154475 opened
May 28, 2025 -
Replace deprecated `is_compiling` method
#154476 opened
May 28, 2025 -
[cpp wrapper] add AOTI shim for collective ops
#154492 opened
May 28, 2025 -
Fix CI failures from inductor periodic
#154497 opened
May 28, 2025 -
Reland "Collect packages with importlib in collect_env #144616"
#154505 opened
May 28, 2025 -
[precompile] Package dynamo artifacts. [jjwu test copy]
#154510 opened
May 28, 2025 -
hack
#154511 opened
May 28, 2025 -
Fix Float16 CooperativeReduction Test Failure
#154516 opened
May 28, 2025 -
simplify modularindexing
#154523 opened
May 28, 2025 -
antoher test
#154535 opened
May 28, 2025 -
[Generator] Implement generator.__contains__
#154539 opened
May 28, 2025 -
add envvar to bisect number of graphs compiled
#154543 opened
May 28, 2025 -
[cpp_wrapper] Build main and kernel code in separate threads
#154551 opened
May 28, 2025 -
Fixes Issue #154491
#154561 opened
May 28, 2025 -
[dynamo] raise hard error if error is encountered while tracing resume function prologue
#154564 opened
May 28, 2025 -
Always set CPU affinity for benchmark jobs
#154569 opened
May 28, 2025 -
[WIP][export][cond] support exporting cond with unbacked symint shaped tensor
#154570 opened
May 28, 2025 -
Enable Leak Sanitizer
#154584 opened
May 29, 2025 -
Fix MKL error: Inconsistent configuration parameters
#154585 opened
May 29, 2025 -
Bump mimalloc to v3.0.3
#154594 opened
May 29, 2025 -
Add `scale` complex type check in `quantize_per_tensor`
#154601 opened
May 29, 2025 -
deprecate MTIA_WORKLOADD from pytorch
#154609 opened
May 29, 2025 -
[et/triton kernel] Assign correct stream_id from cudaStream
#154640 opened
May 29, 2025 -
[1/n]adding torch.distributed.run option to provide destination for event logging
#154644 opened
May 29, 2025 -
Add support for tracing vmap in pre-dispatch export
#154650 opened
May 29, 2025 -
Add pg transport and tests
#154653 opened
May 29, 2025 -
Extract DeviceType to a standalone header file
#154654 opened
May 29, 2025 -
[forward fix] add support for MemoryFormat after type tightening
#154658 opened
May 29, 2025 -
pin torchao in torchbench jobs
#154665 opened
May 29, 2025 -
[Graph Partition] Turn-on in OSS by default
#154667 opened
May 29, 2025 -
Experimental: torch.vulkan.compile_shader
#154676 opened
May 29, 2025 -
NOCOMMIT: hack to allow allocating SSBO-backed Vulkan Tensors from Python
#154677 opened
May 29, 2025 -
Experimental: also support SSBO calling convention in torch.vulkan.compile_shader
#154678 opened
May 29, 2025 -
[cp] test: add memory snapshot for flex_attention tests
#154691 opened
May 29, 2025 -
[vision hash update] update the pinned vision hash
#154694 opened
May 30, 2025 -
[WIP][export] assume standard slice for non-strict
#154699 opened
May 30, 2025 -
[Dynamo] Support torch dynamo for PrivateUse1 backend
#154702 opened
May 30, 2025 -
Improve `lr_scheduler` epoch type and description
#154706 opened
May 30, 2025 -
Type hints for distributions/constraints
#154711 opened
May 30, 2025 -
[DO NOT merge] windows ami test
#154729 opened
May 30, 2025 -
[Wheel Variant] Experimental Support
#154733 opened
May 30, 2025 -
Script for consolidation of sharded safetensor files
#154743 opened
May 30, 2025 -
Add NestedTensorHPU to to_padded_tensor and _nested_tensor_storage_offsets in native_functions.yaml
#154744 opened
May 30, 2025 -
[dynamo] fix debugging code_parts for relational guards
#154753 opened
May 30, 2025 -
Fix in pytorch do_bench_using_profiling
#154766 opened
May 30, 2025 -
[dynamo] fix selecting shape guards
#154772 opened
May 30, 2025 -
ci: Refactor reuse_old_whl to use python stdlib
#154777 opened
May 31, 2025 -
[dynamo] fix set_fullgraph for nested calls
#154782 opened
May 31, 2025 -
Fix setuptools
#154788 opened
May 31, 2025 -
Add note about using latest clangd for IDE support
#154790 opened
May 31, 2025 -
[dict] Add dict.popitem
#154793 opened
May 31, 2025 -
[dict] Allow Dynamo to trace through explicit dict dunder method call
#154794 opened
May 31, 2025 -
[BE][Ez]: Fully type nn.utils.clip_grad
#154801 opened
May 31, 2025 -
add fixes for missing headers, setup.py for local build
#154812 opened
Jun 1, 2025 -
[BE]: Try to enable LTO
#154819 opened
Jun 1, 2025 -
distributions/constraints type annotations + public classes + some refactoring
#154827 opened
Jun 1, 2025 -
Fix circular imports
#154832 opened
Jun 2, 2025 -
[dynamo] Refactor TorchCtxManagerClassVariable to use dispatch map fo…
#154833 opened
Jun 2, 2025 -
Add serialization support for register_constant
#154834 opened
Jun 2, 2025 -
[Inductor] Fix CUDAGraphTree input/output aliasing bug in torch.compile with inductor backend
#154839 opened
Jun 2, 2025 -
Fix DataLoader to Pass List to getitems When Using BatchSampler. Fixes Issue_#154810
#154844 opened
Jun 2, 2025 -
[TorchGen] Add explicit op level CPU fallback feature via env variables
#154854 opened
Jun 2, 2025 -
[ROCm] SDPA fix mem fault when dropout is enabled
#154864 opened
Jun 2, 2025 -
[CI] Removing --user flag from all pip install commands
#154900 opened
Jun 2, 2025 -
[dict] Implement dict.__eq__ and dict.__ne__
#154903 opened
Jun 2, 2025 -
[DO NOT MERGE] Test mi300 tests on IBM cluster
#154907 opened
Jun 2, 2025 -
[WIP] cast to bf16 before mul op in flex bwd
#154922 opened
Jun 2, 2025 -
[Inductor] Re-run torchgen/fuse/gen_patterns for DistilBert
#154923 opened
Jun 2, 2025 -
[inductor] Add CSE.get_prefix to allow customized behavior
#154924 opened
Jun 2, 2025 -
inductor: add wrap_expr_for_assignment_to_var
#154925 opened
Jun 2, 2025 -
RFC: Prototype Vulkan inductor backend
#154926 opened
Jun 2, 2025 -
Remove CUDA 11.8 CI code
#154937 opened
Jun 3, 2025 -
[dict] Implement dict.__eq__ and dict.__ne__
#154942 opened
Jun 3, 2025 -
[OrderedDict] Implement explicit OrderedDict dunder method call
#154943 opened
Jun 3, 2025 -
[DO NOT LAND] all_gather_copy_in for cpu offload
#154959 opened
Jun 3, 2025 -
Remove unsafe PyTorchError constructor
#154961 opened
Jun 3, 2025 -
update the baseline for nightly max_autotune tests
#154973 opened
Jun 3, 2025 -
[Intel GPU] Refactor Matmul integration: Modularize bias handling and memory creation
#154977 opened
Jun 3, 2025 -
Avoid differing results in `linalg.(tensor_)solve`
#154983 opened
Jun 3, 2025 -
change cpu_blas and cmakefile to support openblas bgemm
#155000 opened
Jun 3, 2025 -
[c10] buffered atomic stack
#155012 opened
Jun 3, 2025 -
Add functionalize util
#155042 opened
Jun 3, 2025 -
[cp] test: understand why flex_attention doesn't get dispatched in the assumed way
#155059 opened
Jun 3, 2025 -
Add type validation for alpha and inplace in nn.CELU
#155061 opened
Jun 3, 2025 -
[test] lintrunner thing
#155062 opened
Jun 3, 2025 -
Add a crash handler to async compile subprocesses
#155068 opened
Jun 3, 2025 -
[DO NOT MERGE] test mi300 workflows on vultr cluster
#155069 opened
Jun 3, 2025 -
Supporting compilation of distributed_c10d.send and distributed_c10d.recv
#155070 opened
Jun 3, 2025 -
[dict] Implement dict.__ior__ and fix return type in dict.__or__
#155072 opened
Jun 3, 2025 -
[WIP][dynamic shapes] guard_or_false for are_strides_like_channel_last
#155076 opened
Jun 3, 2025 -
[ca] cpp tensor pre hooks
#155082 opened
Jun 3, 2025 -
Deprecate c10::string
#155084 opened
Jun 4, 2025 -
Add device check in `mse_loss`
#155089 opened
Jun 4, 2025 -
docs: link to Nvidia Container Toolkit in README
#155102 opened
Jun 4, 2025 -
Adapting pipeline parallelism test cases to be device agnostic
#155108 opened
Jun 4, 2025 -
[Quant][CPU] fix fake_quantize_per_tensor_affine of inf values
#155109 opened
Jun 4, 2025 -
Fixes #154982: add missing to_result_dtype in vector_norm
#155111 opened
Jun 4, 2025 -
Issue warning with reference to user code rather than torch
#155112 opened
Jun 4, 2025 -
Fix conversion of values in libtorch agnostic tests
#155115 opened
Jun 4, 2025 -
[TESTING] [DO NOT MERGE] Updated triton commit pin
#155117 opened
Jun 4, 2025 -
[docs] Decorator to create a deprecation warning
#155127 opened
Jun 4, 2025 -
[MTIA Aten Backend] Migrate set_.source_Storage and set_.source_Tensor
#155144 opened
Jun 4, 2025 -
[OrderedDict] Implement `OrderedDict.move_to_end(key, last=False)`
#155152 opened
Jun 4, 2025 -
[OrderedDict] Implement `OrderedDict.popitem(last=...)`
#155153 opened
Jun 4, 2025 -
[dict] Implement `__eq__` for dict_items
#155154 opened
Jun 4, 2025 -
[MPS] Add device guard for MPS dispatch key
#155165 opened
Jun 4, 2025 -
[dynamo] handle fullgraph toggle using nested torch.compile
#155166 opened
Jun 4, 2025 -
[WIP][fake tensor] avoid nonzero memo
#155176 opened
Jun 5, 2025 -
adding arg values and arg types to Strobelight USDT
#155185 opened
Jun 5, 2025 -
[WIP][fake tensor] invalidate memos for PropagateUnbackedSymInts
#155187 opened
Jun 5, 2025 -
[PT] support custom all_gather and reduce_scatter comms
#155189 opened
Jun 5, 2025 -
Add UT for torch.accelerator memory-related API
#155200 opened
Jun 5, 2025 -
Add AMD AWS runners to inductor performance tests
#155206 opened
Jun 5, 2025 -
Add is_hidden_event method to KinetoEvent Python interface
#155214 opened
Jun 5, 2025 -
[PrivateUse1] Optimize 3rd party backend experiences
#155215 opened
Jun 5, 2025 -
Add more logging
#155219 opened
Jun 5, 2025 -
Add serialized_type_name to torch.return_types.* so we can dump them
#155245 opened
Jun 5, 2025 -
updated adafactor doc #154862
#155248 opened
Jun 5, 2025 -
Add STD_TORCH_CHECK to torch::standalone
#155253 opened
Jun 5, 2025 -
[PowerPC] Fixed build issue for vsx vec256 complexfloat and scaled_mm_out_cpu
#155255 opened
Jun 5, 2025 -
Add STD_TORCH_CHECK to torch::standalone [imported]
#155258 opened
Jun 5, 2025 -
[Dynamo] Add CPython default dict tests
#155263 opened
Jun 5, 2025 -
higher_order_ops.py unimplemented_v2 migration, part1
#155264 opened
Jun 5, 2025 -
[inductor] support linear & layer_norm unbacked
#155267 opened
Jun 5, 2025 -
[WIP][dynamic shapes] if-then-else for meta_select storage offset
#155269 opened
Jun 5, 2025 -
turn off reorder_for_peak_memory in case of collectives
#155271 opened
Jun 5, 2025 -
[#155034] Converted RST files to Markdown
#155287 opened
Jun 5, 2025 -
[BE] Deprecate `search_autotune_cache`
#155302 opened
Jun 6, 2025 -
Use expecttest in test_compiled_optimizers.py
#155308 opened
Jun 6, 2025 -
[einops] Ensure Dynamo can trace through einops
#155310 opened
Jun 6, 2025 -
Try adding bfloat16 to test_nn_lstm
#155338 opened
Jun 6, 2025 -
[oss] Add version to metadata
#155343 opened
Jun 6, 2025 -
[aotd] Support mutations of the same input in fw and bw
#155354 opened
Jun 6, 2025 -
[ATen][CPU][Sparse] Use Third-Party Eigen for sparse add and addmm
#155357 opened
Jun 6, 2025 -
Fix serialization of nans in torch.export
#155359 opened
Jun 6, 2025 -
[AOTInductor] Inherit extern kernels for runtime constant folding
#155361 opened
Jun 6, 2025 -
Try running test_foreach sequentially
#155366 opened
Jun 6, 2025 -
[aot] bw_module for ca: do not clone real buffers/params
#155370 opened
Jun 6, 2025 -
[export] inline into torch.jit.traced nn module
#155381 opened
Jun 6, 2025 -
[Precompile] Integrate PrecompileContext with CompilePackage
#155384 opened
Jun 7, 2025 -
Convert onnx torchscript rst to md
#155390 opened
Jun 7, 2025 -
Convert to markdown: jit_python_reference.rst, jit_unsupported.rst, jit_utils.rst, library.rst
#155404 opened
Jun 7, 2025 -
Support --inplace flag for tools/nightly.py
#155419 opened
Jun 8, 2025 -
[DTensor] Fix aten.all strategy with min instead of sum as the reduce_op
#155420 opened
Jun 8, 2025 -
[scan] Fix issues with scan on CPU and for autograd when implementing an RNN with multiple layers
#155422 opened
Jun 8, 2025 -
[PT2][partitioners] Add aten.split to view_ops list
#155424 opened
Jun 8, 2025 -
[inductor] Improve GEMM loggings
#155427 opened
Jun 8, 2025 -
[Inductor] Fix output discrepancy between Inductor and eager of mean with input of a large size tensor
#155428 opened
Jun 8, 2025 -
Clean up memory management in impl_func_norm
#155432 opened
Jun 9, 2025 -
Convert sparse rst to md
#155438 opened
Jun 9, 2025 -
Use unpack instructions for vec256 (de)interleave2
#155440 opened
Jun 9, 2025 -
[1/n] refactor the ring attention implementation
#155441 opened
Jun 9, 2025 -
[2/n] rewrite load balancing and sharding in context parallel
#155442 opened
Jun 9, 2025 -
[test][do not merge] test on MI300 for #125888
#155445 opened
Jun 9, 2025 -
[Profiler] Fix lost C call events problem in Python 3.12.0-3.12.4
#155446 opened
Jun 9, 2025 -
Update slow tests
#155448 opened
Jun 9, 2025 -
[test][do not merge] test on main on MI300, compared with for #125888
#155449 opened
Jun 9, 2025 -
Quiet Inductor #135521
#155450 opened
Jun 9, 2025 -
[Dynamo] Enable torch function dispatch on HOPs
#155452 opened
Jun 9, 2025 -
[ROCm] skip convolution tests on Navi, enable batch_norm_with_update
#155454 opened
Jun 9, 2025 -
[proxy_tensor] Do not clobber tensor proxies for inplace ops
#155456 opened
Jun 9, 2025 -
[PyTorch][NCCLX] expose nccl_nonblocking_timeout for NCCLX PG building to reuse NCCLUtils macro
#155487 opened
Jun 9, 2025 -
Add torch._C._log_api_usage_once to datapipes (mapper)
#155489 opened
Jun 9, 2025 -
XCCL changes for DDP
#155497 opened
Jun 9, 2025 -
[OrderedDict] Implement `hasattr(..., IteratorVariable)`
#155501 opened
Jun 9, 2025 -
[OrderedDict] Set the correct dict class in UserDefinedDictVariable
#155502 opened
Jun 9, 2025 -
[OrderedDict] Add `bool(OrderedDict)`
#155503 opened
Jun 9, 2025 -
[inductor] Fix propagating torch.utils._sympy.functions.Identity in IndexPropagation
#155504 opened
Jun 9, 2025 -
Making implicit packages explicit (torch)
#155505 opened
Jun 10, 2025 -
[WIP] Port dynamo test cases for xpu backend.
#155524 opened
Jun 10, 2025 -
Update MAIAHooksInterface to pin host memory in MAIA device
#155541 opened
Jun 10, 2025 -
[Inductor] Add decomposition for aten.mul
#155542 opened
Jun 10, 2025 -
[Misc] fix distributed/_tools/test_sac_ilp.py::TestSACILP::test_sac_i…
#155548 opened
Jun 10, 2025 -
FractionalMaxPool3d add kernel_size check
#155549 opened
Jun 10, 2025 -
fix: 155029 convert rst to md
#155554 opened
Jun 10, 2025 -
WIP add support for dynamic shapes
#155557 opened
Jun 10, 2025 -
Implement guard collectives
#155558 opened
Jun 10, 2025 -
[Misc] skip the case test_foreach_add_different_mesh if world size is…
#155563 opened
Jun 10, 2025 -
[ROCm][SymmetricMemory] Avoid bf16 to float conversion during reduce
#155587 opened
Jun 10, 2025 -
Compute contiguity symbolically to avoid dde, and introduce c++ is_contiguous_or_false.
#155590 opened
Jun 10, 2025 -
[do NOT land] torch_function_mode + flex_attention dispatch test
#155594 opened
Jun 10, 2025 -
[aoti][mps] Enable test_aot_inductor.py tests
#155598 opened
Jun 10, 2025 -
[do NOT land] DTensor + torch_function_mode + flex_attention dispatch test
#155600 opened
Jun 10, 2025 -
Remove unnecessary MPSStream initialization
#155602 opened
Jun 10, 2025 -
[DRAFT] Evaluate feasability of using FunctionalTensor for Example Value
#155606 opened
Jun 10, 2025 -
[dict] Implement dict subclass `fromkeys` classmethod
#155608 opened
Jun 10, 2025 -
[dynamo] added github_cli to detect unimplemented_v2 calls
#155610 opened
Jun 10, 2025 -
[Scripts] Add refresh script to clean, pull and build repo
#155639 opened
Jun 10, 2025 -
[WIP][PGO] exclude optimizer state from PGO whitelist
#155643 opened
Jun 10, 2025 -
[cond] preserve merged phs meta for subgraph
#155644 opened
Jun 10, 2025 -
[cond] auto_functionalize cond
#155645 opened
Jun 10, 2025 -
DOC: update CrossEntropyLoss with note and example of incorrect target specification
#155649 opened
Jun 11, 2025 -
Fix UB in BFloat16 round_to_nearest_even
#155650 opened
Jun 11, 2025 -
Make upsample accept list scale_factor
#155654 opened
Jun 11, 2025 -
Fix cudagraph record_stream memory leak
#155658 opened
Jun 11, 2025 -
[Misc] handle sys exit caused by skip_if_lt_x_gpu in test_composabili…
#155665 opened
Jun 11, 2025 -
[Optimus] add einsum_to_pointwise_pass pattern
#155666 opened
Jun 11, 2025 -
[inductor] Make size_hint fallback parameter required and add size_hi…
#155669 opened
Jun 11, 2025 -
Fixed NLLLoss 1D input crash with torch.compile
#155672 opened
Jun 11, 2025 -
Document `Flop Counter Mode` in torch.utils
#155673 opened
Jun 11, 2025 -
[Quant][CPU] Enable fp8 qlinear
#155678 opened
Jun 11, 2025 -
[fsdp] fix: fix optim_state_dict with FSDP model not on global rank 0
#155685 opened
Jun 11, 2025 -
Remove unused nonlocal declarations from checkpoint and library helper functions
#155686 opened
Jun 11, 2025 -
[CUDA][MAGMA][Linalg][WIP] Remove MAGMA
#155694 opened
Jun 11, 2025 -
[CI] Use `setup-python` from for Mac tests
#155698 opened
Jun 11, 2025 -
Clean up HF components
#155707 opened
Jun 11, 2025 -
docs: clean up docstring for clarity and correctness
#155712 opened
Jun 11, 2025 -
Remove actionable label from docathon label sync script
#155713 opened
Jun 11, 2025 -
[2/2] proxy_tensor do not clobber for mutating ops
#155716 opened
Jun 11, 2025 -
[FP8] Fix Benchmarking for certain Priors
#155722 opened
Jun 11, 2025 -
HOP py_impl register to tensor subclass cannot dispatch
#155726 opened
Jun 11, 2025 -
Adds number of channels check in PixelShuffle
#155728 opened
Jun 11, 2025 -
[MPS] Add regression test for memory leak in nn.MaxPool2d
#155730 opened
Jun 11, 2025 -
[CI] Remove conda from from windows
#155731 opened
Jun 11, 2025 -
Fix torch.export.export() GPU failure with RNN modules.
#155734 opened
Jun 11, 2025 -
[MPS] Activation kernels: do compute at float precision
#155735 opened
Jun 11, 2025 -
Overload `mul_overflows` for `size_t`
#155736 opened
Jun 11, 2025 -
torch.distributed TCP bind address
#155741 opened
Jun 11, 2025 -
[Torch Package] Make get names of OrderedImporters support fallback to importers
#155743 opened
Jun 11, 2025 -
Add Windows CUDA 12.9.1 build
#155748 opened
Jun 11, 2025 -
[refactor] simplify the implementation of check_input_alias_and_mutation_return_outputs
#155749 opened
Jun 11, 2025 -
[dynamo][guards] Skip dispatch key guards for requires_grad=False
#155756 opened
Jun 11, 2025 -
efficient zero_mask implementation for vec128_*_neon
#155766 opened
Jun 12, 2025 -
add `__annotations__` attribute to `OpOverload`
#155784 opened
Jun 12, 2025 -
add sfdp pattern
#155792 opened
Jun 12, 2025 -
[DONT MERGE][TESTING][1/2] xpu test runner
#155793 opened
Jun 12, 2025 -
[Do not merge] DCP ZOC Test Changes
#155802 opened
Jun 12, 2025 -
[CD] Move build_magma.bat to build_magma.py
#155804 opened
Jun 12, 2025 -
[VibeCoding] Replace clone.bat with clone.ps1
#155805 opened
Jun 12, 2025 -
[VibeCoding] Convert architecture specific batch scripts to PowerShell
#155807 opened
Jun 12, 2025 -
wip
#155810 opened
Jun 12, 2025 -
Refactor DynamoStore into disk and in memory implementations
#155818 opened
Jun 12, 2025 -
[doc] Updates to distributed.md for XCCL backend
#155834 opened
Jun 12, 2025 -
Skip FSDP tests if device count is less then requested world_size value
#155836 opened
Jun 12, 2025 -
[einops] Ensure Dynamo can trace through explicit set dunder method call
#155842 opened
Jun 12, 2025 -
[VibeCoding] build_pytorch.bat to build_pytorch.ps1
#155843 opened
Jun 12, 2025 -
[ATen MTIA backend] Use aten native CPU fallback function on MTIA
#155845 opened
Jun 12, 2025 -
assert statement to check if output_size is not None
#155846 opened
Jun 12, 2025 -
patch PR #151719
#155851 opened
Jun 12, 2025 -
[test][inductor] fix test_conv_cat failure
#155852 opened
Jun 12, 2025 -
[NCCL][P2P] Optionally avoid `recordStream`in P2P comms
#155854 opened
Jun 12, 2025 -
[BE][c10d/Store]add check in pyi
#155855 opened
Jun 12, 2025 -
Optionally avoid `record_streams` in autograd with `TORCH_AUTOGRAD_AVOID_RECORD_STREAMS=1`
#155857 opened
Jun 12, 2025 -
[dynamo] Support builtin bool on non-constant VTs
#155863 opened
Jun 12, 2025 -
[DONT MERGE] Diffusion models benchmarking for compile time
#155866 opened
Jun 13, 2025 -
[cutlass backend] compile and link for .so files
#155876 opened
Jun 13, 2025 -
[NOT FOR MERGE] Exploratory work on AOTInductor training
#155877 opened
Jun 13, 2025 -
Docs: Fix sphinx heading markup in `nn.rst`
#155883 opened
Jun 13, 2025 -
[hop] support torch.func.functional_call in hop subgraph
#155886 opened
Jun 13, 2025 -
[WIP][user triton] AOT inductor support for device-side TMA
#155896 opened
Jun 13, 2025 -
Default USE_PRIORITIZED_TEXT_FOR_LD=1 on Linux aarch64 via setup.py
#155901 opened
Jun 13, 2025 -
test without rocblas conv when using cudagraphs
#155902 opened
Jun 13, 2025 -
Mitigate upcoming removal of direct invocation of setup.py support
#155910 opened
Jun 13, 2025 -
[WIP] Automatic load/save
#155913 opened
Jun 13, 2025 -
Fix argument validation for torch.nn.attention.sdpa_kernel
#155922 opened
Jun 13, 2025 -
[dynamo] Add `-> bool` to functions named `is_*` or `_is_*`
#155923 opened
Jun 13, 2025 -
[inductor] Add `-> bool` to functions named `is_*` or `_is_*`
#155928 opened
Jun 13, 2025 -
updated matplotlib version in docs requirements
#155931 opened
Jun 13, 2025 -
[export] add _union_dataclass to support comparing dataclasses that inherits from union.
#155932 opened
Jun 13, 2025 -
Fix stride comparison max(512 - s, 1) vs. (512 - s)
#155938 opened
Jun 13, 2025 -
consolidate in finish step
#155940 opened
Jun 13, 2025 -
don't do a full deserialize on every file
#155942 opened
Jun 13, 2025 -
[ca] mark some sparse tests fixed by AccumulateGrad functionalization
#155948 opened
Jun 13, 2025 -
TopK workaround when tensor rank - sort axis > 4
#155950 opened
Jun 13, 2025 -
[cuDNN] Adding SDPA tests for cuDNN backend
#155951 opened
Jun 13, 2025 -
NotForLand: LOAF on by default
#155956 opened
Jun 13, 2025 -
[DRAFT][cuDNN][SDPA] Introduce `TORCH_CUDNN_SDPA_AVOID_RECOMPILE=1`
#155958 opened
Jun 13, 2025 -
[ca] default on in CI also for PYTORCH_TEST_WITH_INDUCTOR
#155960 opened
Jun 13, 2025 -
draft: [cp] context_parallel + flex_attention using torch_function and autograd function
#155962 opened
Jun 13, 2025 -
Fix issue with set_reduce_scatter_divide_factor errors and MixedPrecisionPolicy
#155964 opened
Jun 13, 2025 -
[CI] Remove redundant accuracy benchmarks for cpp_wrapper
#155966 opened
Jun 13, 2025 -
[CI][cpp_wrapper] Fix selection of CPU OpInfo tests
#155967 opened
Jun 13, 2025 -
draft: [cp] context_parallel + flex_attention_backward using torch_function and autograd function
#155970 opened
Jun 13, 2025 -
unify dynamic shapes API namings 3 (guard_int, guard_int_seq)
#155973 opened
Jun 14, 2025 -
Fixes for CPython int/float tests
#155978 opened
Jun 14, 2025 -
Handling overflow for long int overflow for the product of kernel_hei…
#155989 opened
Jun 14, 2025 -
[build] modernize build-backend: `setuptools.build_meta:__legacy__` -> `setuptools.build_meta`
#155998 opened
Jun 14, 2025 -
[BE][Ez]: Use ruff type inference to autotype parts of dynamo
#156001 opened
Jun 14, 2025 -
[BE][Ez]: Fix untyped decorator in dcp utils
#156003 opened
Jun 14, 2025 -
add enum for core Backend class
#156004 opened
Jun 14, 2025 -
Allow as_tensor to retain grad info
#156006 opened
Jun 14, 2025 -
feat(cmake): add NCCL version selection based on CUDA version
#156014 opened
Jun 15, 2025 -
[profiler] add more CUDA API for kernel launcher
#156016 opened
Jun 15, 2025 -
[BE] add a minimal linter to check `pyproject.toml` consistency
#156017 opened
Jun 15, 2025 -
[build] modernize build-frontend: `python setup.py develop/install` -> `[uv ]pip install[ -e] .`
#156027 opened
Jun 15, 2025 -
bmm, topk, cholesky, linalg.norm, max with out variants set causing r…
#156030 opened
Jun 15, 2025 -
[BE][Easy] set end-of-line for `.bat` file to CRLF in `.editorconfig`
#156032 opened
Jun 15, 2025 -
Fix atleast_{1,2,3}d() with no arguments description
#156042 opened
Jun 16, 2025 -
[BE][Easy][setup] wrap over long error messages and redirect them to `stderr` in `setup.py`
#156043 opened
Jun 16, 2025 -
[BE][Easy][setup] use `super().method(...)` in command subclasses in `setup.py`
#156044 opened
Jun 16, 2025 -
[build] remove upper version pin for `setuptools<80.0`
#156049 opened
Jun 16, 2025 -
Update URL for RPATH documentation
#156060 opened
Jun 16, 2025 -
Support transpose and pack for bit8
#156065 opened
Jun 16, 2025 -
Register hpu device to fake backend
#156076 opened
Jun 16, 2025 -
revamp dtype documentation for 2025
#156087 opened
Jun 16, 2025 -
Convert to markdown: jit.rst
#156094 opened
Jun 16, 2025 -
Add error to intercept crash in issue #154882 on maxpool2d with indices
#156101 opened
Jun 16, 2025 -
[opinfo] Exclude aten_name if its not actually a name
#156104 opened
Jun 16, 2025 -
[opinfo] add overloads to opinfo
#156109 opened
Jun 16, 2025 -
local load/save
#156110 opened
Jun 16, 2025 -
Add debug messages for deps issues during fx splits
#156111 opened
Jun 16, 2025 -
Bump transfomers version
#156118 opened
Jun 16, 2025 -
[ROCm][Inductor][CK] update API for gemm-multiD change
#156122 opened
Jun 16, 2025 -
Display a warning when overwriting `CMAKE_CUDA_ARCHITECTURES`
#156123 opened
Jun 16, 2025 -
[test] re-run CI with complex + Python dispatch key changes
#156131 opened
Jun 16, 2025 -
NOT-FOR-LAND: enable autochunker by default
#156132 opened
Jun 16, 2025 -
[WIP][ci][cutlass backend] Add ci for cutlass backend tests
#156136 opened
Jun 16, 2025 -
Templatize model_container
#156137 opened
Jun 16, 2025 -
Improve documentation for torch.lobpcg
#156139 opened
Jun 16, 2025 -
[cuDNN][64-bit indexing] update conv depthwise 64bit indexing dispatch condition to match native kernel
#156140 opened
Jun 17, 2025 -
[executorch hash update] update the pinned executorch hash
#156141 opened
Jun 17, 2025 -
[dcp_poc] Introduce a new simple rank local checkpointer
#156142 opened
Jun 17, 2025 -
[Docs] Fix indentations in cond.md
#156147 opened
Jun 17, 2025 -
[list] Raise exception in invalid list method call
#156148 opened
Jun 17, 2025 -
Convert bottleneck.rst to markdown
#156149 opened
Jun 17, 2025 -
Optimize dim description in torch.max
#156153 opened
Jun 17, 2025 -
Bump protobuf from 5.29.4 to 5.29.5 in /.ci/docker
#156157 opened
Jun 17, 2025 -
Optimize scatter/gather kernel for ARM.
#156161 opened
Jun 17, 2025 -
Deprecate CUDAAllocatorConfig, use AllocatorConfig instead
#156165 opened
Jun 17, 2025 -
draft: [cp] context_parallel + flex_attention using monkey_patch
#156170 opened
Jun 17, 2025 -
Implementation of a ScannedModule
#156172 opened
Jun 17, 2025 -
[WIP] Add a new API of allocator setting for accelerator
#156175 opened
Jun 17, 2025 -
[NVIDIA] Refactor Family Blackwell Support codegen
#156176 opened
Jun 17, 2025 -
[WIP] Remove legacy aarch64_linux builder in favor of Manylinux
#156178 opened
Jun 17, 2025 -
[TEST] Add Windows cuda 12.9.1 build
#156179 opened
Jun 17, 2025 -
[Native][CPU][TopK] Improve perf by reducing swap operations
#156183 opened
Jun 17, 2025 -
[Inductor] Subgraph as a choice symbolic expression as input
#156185 opened
Jun 17, 2025 -
[TEST] Triton 3.4.0 pin update
#156186 opened
Jun 17, 2025 -
[inductor] Quiesce Triton compile worker pool after each dynamo compile
#156187 opened
Jun 17, 2025 -
Engine reuse calling thread when only single device detected
#156188 opened
Jun 17, 2025 -
Add auto support
#156189 opened
Jun 17, 2025 -
[ROCm] [CK] Composable Kernel integration for ROCm
#156192 opened
Jun 17, 2025 -
[dynamo] allow symints in list.__setitem__
#156197 opened
Jun 17, 2025 -
[Codemod][Folly target clean up] 57
#156198 opened
Jun 17, 2025 -
Fix torch.clamp CPU overflow with float16 tensors
#156199 opened
Jun 17, 2025 -
[CUTLASS] [CUDA] SM100 GroupMM
#156203 opened
Jun 17, 2025 -
[DCP] OSS Zero Overhead Checkpointing Implementation
#156207 opened
Jun 17, 2025 -
[CI] Add prebuild command option, set prebuild command option for CI to build flash attention
#156236 opened
Jun 17, 2025 -
Fix constant folding pass for mutable buffer
#156239 opened
Jun 17, 2025 -
Fix `aten::index_put` args Dtensor type mismatch
#156240 opened
Jun 17, 2025 -
[list] Implement `list.remove`
#156242 opened
Jun 17, 2025 -
Extract CPU log_softmax kernels to header
#156243 opened
Jun 17, 2025 -
[EZ/Profiler] Change 'b' to 'B' in FunctionEvent Frontend
#156250 opened
Jun 17, 2025 -
[dynamo] fix some cross-graph-break refleaks in eval_frame
#156252 opened
Jun 17, 2025 -
[Test] Kineto Submodule Update
#156253 opened
Jun 17, 2025 -
Add size_hints_or_throw
#156255 opened
Jun 17, 2025 -
Consolidate stack trace in Tracer
#156257 opened
Jun 18, 2025 -
[invoke_subgraph] make same subgraph share get_attr target
#156260 opened
Jun 18, 2025 -
Convert quantization.rst to markdown
#156266 opened
Jun 18, 2025 -
Add fallback-aware device checking for MPS operations
#156267 opened
Jun 18, 2025 -
[Inductor][CPP backend] Optimize parallel depth algorithm [Don't merge]
#156268 opened
Jun 18, 2025 -
Implement list.__add__ and list.__iadd__
#156270 opened
Jun 18, 2025 -
[list] Add list.__mul__ and list.__imul__
#156271 opened
Jun 18, 2025 -
[Intel GPU] Enable training for SDPA XPU [WIP]
#156272 opened
Jun 18, 2025 -
[inductor] split out triton templates
#156276 opened
Jun 18, 2025 -
[inductor][tma template] subclass workspace arg for choice
#156277 opened
Jun 18, 2025 -
[inductor] add KernelTemplateParams
#156278 opened
Jun 18, 2025 -
[inductor] introduce kernel_inputs
#156279 opened
Jun 18, 2025 -
[inductor][1/2] break out TritonTemplate, TritonTemplateKernel, TritonTemplateCaller out of select_algorithm.py
#156280 opened
Jun 18, 2025 -
[inductor][2/2] break out TritonTemplate, TritonTemplateKernel, TritonTemplateCaller out of select_algorithm.py
#156281 opened
Jun 18, 2025 -
[inductor] heuristics based on kernel templates
#156282 opened
Jun 18, 2025 -
Introduce sync_cross_rank_decision
#156287 opened
Jun 18, 2025 -
[inductor] KernelTemplates report their own KernelParams
#156292 opened
Jun 18, 2025 -
Add cascade sum support for Inductor CPP backend
#156296 opened
Jun 18, 2025 -
[BE][1/16] fix typos in torch/
#156311 opened
Jun 18, 2025 -
[BE][2/16] fix typos in torch/ (torch/_*/)
#156312 opened
Jun 18, 2025 -
[BE][8/16] fix typos in torch/ (torch/csrc/jit/)
#156318 opened
Jun 18, 2025 -
[BE][10/16] fix typos in torch/ (torch/csrc/jit/)
#156320 opened
Jun 18, 2025 -
Address richard's comments on libtorch_stable_abi note
#156324 opened
Jun 18, 2025 -
[TEST] DO not commit
#156326 opened
Jun 18, 2025 -
Migrate c10/macros/cmake_macros.h.in
#156329 opened
Jun 18, 2025 -
Fix native static dispatch kernels
#156331 opened
Jun 18, 2025 -
Validate custom op support for compile_kernel
#156332 opened
Jun 18, 2025 -
[testing] test/run_test.py: Only shutdown pool if it was created
#156333 opened
Jun 18, 2025 -
Storage: add_delete_hook for deregistration
#156338 opened
Jun 18, 2025 -
[list] Add list.__delitem__
#156339 opened
Jun 18, 2025 -
Add private API to modify the tags for a custom operator
#156343 opened
Jun 18, 2025 -
[inductor] set config.min_num_split by default
#156345 opened
Jun 18, 2025 -
[invoke_subgraph] make collect_meta_analysis fake prop cachable
#156347 opened
Jun 18, 2025 -
Add User defined subclass handling to funcitonalize impl
#156349 opened
Jun 18, 2025 -
Build FBGEMM GenAI as part of PyTorch
#156355 opened
Jun 18, 2025 -
[wip]
#156356 opened
Jun 18, 2025 -
[BE] comments + try to get rid of secondary `make_autotune_fn`
#156358 opened
Jun 18, 2025 -
[Codemod][Folly target clean up] 28
#156365 opened
Jun 18, 2025 -
[Codemod][Folly target clean up] 22
#156366 opened
Jun 18, 2025 -
[iter] Update some of the tests to not call pickle
#156369 opened
Jun 18, 2025 -
[iter] exhaust `ListIterator` when `unpack_var_sequence` is called
#156370 opened
Jun 18, 2025 -
[iter] Add support for sequence protocol in `iter(..)`
#156371 opened
Jun 18, 2025 -
Add macos26 beta test runner
#156372 opened
Jun 18, 2025 -
[TSAN][live speech translation] Fix A data race in caffe2
#156378 opened
Jun 18, 2025 -
cub and compile_kernel composition
#156380 opened
Jun 19, 2025 -
Prevent cudaStreamSync when indexing GPU tensors with boolean CPU mask
#156384 opened
Jun 19, 2025 -
[InductorBench] Fix accuracy validation logic for MPS
#156385 opened
Jun 19, 2025 -
Bump urllib3 from 2.2.2 to 2.5.0 in /tools/build/bazel
#156390 opened
Jun 19, 2025 -
Use CMake wholearchive group
#156393 opened
Jun 19, 2025 -
Use CUDA::cupti target
#156396 opened
Jun 19, 2025 -
Added index 0 for ROCR_VISIBLE_DEVICES
#156398 opened
Jun 19, 2025 -
[ez] fix typo in comment
#156402 opened
Jun 19, 2025 -
[Codemod][Folly target clean up] 28 [A]
#156403 opened
Jun 19, 2025 -
[DONT MERGE][TESTING][2/2] test new xpu runner
#156410 opened
Jun 19, 2025 -
Fix storage_offset preservation in clone_preserve_strides
#156415 opened
Jun 19, 2025 -
[iter] support `iter(callable, sentinel)`
#156416 opened
Jun 19, 2025 -
Change t.is_cuda to t.device.type == 'cuda' in torch/utils/viz
#156418 opened
Jun 19, 2025 -
[cc][multi-kernel] attempt 1
#156421 opened
Jun 19, 2025 -
[dm][multi-kernel] attempt 1
#156422 opened
Jun 19, 2025 -
[dm][mk] attempt 2
#156423 opened
Jun 19, 2025 -
[cc][multi-kernel] attempt 2
#156427 opened
Jun 19, 2025 -
[br][mk] attempt 1
#156428 opened
Jun 19, 2025 -
[precompile] Detect source code changes for save/load.
#156432 opened
Jun 19, 2025 -
[dynamo] show frame information when recompilation is triggered on fail_on_recompile
#156433 opened
Jun 19, 2025 -
use cmake target torch instead of ${TORCH_LIBRARIES} in cpp installation docs
#156435 opened
Jun 19, 2025 -
[cc][multi-kernel] attempt 3
#156439 opened
Jun 19, 2025 -
[invoke_subgraph] Add config flag to control support of input mutation
#156450 opened
Jun 19, 2025 -
[cc][multi-kernel] attempt 4
#156452 opened
Jun 19, 2025 -
[WIP]Fallback to CPU for XPU FP64
#156456 opened
Jun 19, 2025 -
Fixes issue #156414: Fixes bug in implementation of _combine_histograms.
#156457 opened
Jun 20, 2025 -
wip Updates to scaled_mm code
#156458 opened
Jun 20, 2025 -
[iter] Wrap iter(..) call in a ObjectIteratorVariable
#156460 opened
Jun 20, 2025 -
[inductor] select_algorithm: add preprocessing fns
#156464 opened
Jun 20, 2025 -
[torchbench] update environment setup script
#156465 opened
Jun 20, 2025 -
WIP: Add `max_pool3d` for MPS
#156467 opened
Jun 20, 2025 -
Debug PR, no need to review
#156468 opened
Jun 20, 2025 -
Docs/update contributing rebase tip
#156469 opened
Jun 20, 2025 -
kernel arg munging attempt
#156470 opened
Jun 20, 2025 -
[wip][inductor] add kernel choice
#156477 opened
Jun 20, 2025 -
[Codemod][Folly target clean up] 22 [B]
#156478 opened
Jun 20, 2025 -
[ROCm][Windows] Fixing undefined symbol linker error after exposing MIOpen symbols
#156479 opened
Jun 20, 2025 -
[WIP] Add device_id to XPU device properties
#156481 opened
Jun 20, 2025 -
Fix torch.onnx.export parameter for onnx_shape_inference (#156480)
#156483 opened
Jun 20, 2025 -
[Profiler] Fix profile_all_threads in debug build
#156484 opened
Jun 20, 2025 -
Add regression test for UnicodeDecodeError in torch.compile with extreme values
#156485 opened
Jun 20, 2025 -
[ROCm][Windows] Skip using rocm-core on Windows case
#156486 opened
Jun 20, 2025 -
[DO NOT MERGE] Update trunk.yml to change the runner that the job runs-on
#156491 opened
Jun 20, 2025 -
[INIT DRAFT] setting up the build for torch/standalone
#156492 opened
Jun 20, 2025 -
Fix type annotations for dim parameter in torch.amin and torch.amax
#156493 opened
Jun 20, 2025 -
[MPS] Optimize cumsum/cumprod metal kernels
#156494 opened
Jun 20, 2025 -
cublaslt/hipblaslt persistent workspace
#156495 opened
Jun 20, 2025 -
add test_batchnorn_2D and 3D tests
#156498 opened
Jun 20, 2025 -
[ROCm] Bump AOTriton to 0.10b
#156499 opened
Jun 20, 2025 -
[Inductor] Fix epilogue fusion decision with 1 Triton caller as choice
#156500 opened
Jun 20, 2025 -
[MTIA Aten Backend] Migrate maximum.out / minimum.out / cos.out / erf.out / exp.out
#156502 opened
Jun 20, 2025 -
Organize BUCK for torch/standalone
#156503 opened
Jun 20, 2025 -
added stubs for jit tree views
#156504 opened
Jun 20, 2025 -
[nativert] Move PrimKernelRegistry to PyTorch core
#156506 opened
Jun 20, 2025 -
[nativert] Move HigherOrderKernel
#156507 opened
Jun 20, 2025 -
[nativert] move layout planner algorithms to libtorch
#156508 opened
Jun 20, 2025 -
[docs][typing] Document and type support for dim=None in torch.amin and torch.amax
#156510 opened
Jun 20, 2025 -
python definitely_contiguous-> is_contiguous_or_false
#156515 opened
Jun 20, 2025 -
Unify dynamic shapes APIs naming 2 (expect_true and check) attempt2
#156518 opened
Jun 20, 2025 -
[aoti] Check longlong upperbound for codegening input size check
#156522 opened
Jun 20, 2025 -
remove gso from set_storage_meta__symint
#156525 opened
Jun 20, 2025 -
[Inductor][CPP] Fix perf regression of functorch_maml_omniglot
#156526 opened
Jun 21, 2025 -
[dynamo] fix segfault due to dangling CacheEntry backend pointer
#156527 opened
Jun 21, 2025 -
[dynamo] Guard eagerly on list objects to avoid guard on getitem index
#156531 opened
Jun 21, 2025 -
Add RoPE (Rotary Positional Embedding) to PyTorch core
#156532 opened
Jun 21, 2025 -
[inductor] Quiesce Triton compile worker pool by default in OSS
#156534 opened
Jun 21, 2025 -
remove allow-untyped-defs from c10d_rendezvous_backend.py
#156536 opened
Jun 21, 2025 -
remove allow-untyped-defs from torch/ao/nn/sparse/quantized/linear.py
#156537 opened
Jun 21, 2025 -
[MTIA Aten Backend] Migrate _log_softmax.out / _log_softmax_backward_data.out
#156539 opened
Jun 21, 2025 -
avoid to declare an unknown bound array without any element
#156543 opened
Jun 21, 2025 -
Enable target-determination (TD) for ROCm CI
#156545 opened
Jun 21, 2025 -
[ddp] improve c++ reducer bucketing readability
#156550 opened
Jun 21, 2025 -
[CUDAGraph] add config `cudagraph_capture_sizes`
#156551 opened
Jun 21, 2025 -
Add fx_graph_runnable tests boilerplate
#156552 opened
Jun 21, 2025 -
[MTIA Aten Backend] Migrate isnan
#156554 opened
Jun 22, 2025 -
Clarify online softmax split reduction limitation and invite contributions (refs #153241)
#156556 opened
Jun 22, 2025 -
Don't use deprecated CUDA.cmake module
#156559 opened
Jun 22, 2025 -
typo
#156560 opened
Jun 22, 2025 -
Implement guard collectives (optimized version)
#156562 opened
Jun 22, 2025 -
[nativert] reland D76832891 remove designated initializer cpp20
#156565 opened
Jun 22, 2025 -
[MPSInductor][BE] Fix multistage reduction check
#156567 opened
Jun 22, 2025 -
[MTIA Aten Backend] Migrate max.dim_max / min.dim_min
#156568 opened
Jun 23, 2025 -
[nativert] Move call_torchbind_kernel
#156571 opened
Jun 23, 2025 -
[WIP][AOTI][Intel GPU] Add XPU quantization ops to AOT Inductor.
#156572 opened
Jun 23, 2025 -
[MTIA Aten Backend] Migrate ge.Tensor_out / ge.Scalar_out
#156573 opened
Jun 23, 2025 -
add torch.concat to normalization pass
#156574 opened
Jun 23, 2025 -
[WIP] Port three dynamo test to Intel GPU
#156575 opened
Jun 23, 2025 -
Fix UT failure on non-cuda backend
#156577 opened
Jun 23, 2025 -
Added philox based RNG context for HPU device in Dtensor scenarios
#156581 opened
Jun 23, 2025 -
[SymmMem] Rename all_to_all_vdev ops
#156582 opened
Jun 23, 2025 -
Update github first merge rule
#156583 opened
Jun 23, 2025 -
[xla hash update] update the pinned xla hash
#156584 opened
Jun 23, 2025 -
[CPU] Fix memory access for sbgemm bf16
#156585 opened
Jun 23, 2025 -
[Profiler] the doc of _ExperimentalConfig is incorrectly truncated by commas
#156586 opened
Jun 23, 2025 -
[OpenReg][1/N] Migrate cpp_extensions_open_device_registration to OpenReg
#156588 opened
Jun 23, 2025 -
[OpenReg][2/N] Migrate cpp_extensions_open_device_registration to OpenReg
#156589 opened
Jun 23, 2025 -
[Doc] remove WSL2 in support matrix for Intel GPU
#156590 opened
Jun 23, 2025 -
[ROCm][Windows] Fix rocsolver undefined symbol error
#156591 opened
Jun 23, 2025 -
[Inductor Dashboard] Enable deterministic algorithms for some models
#156592 opened
Jun 23, 2025 -
[Break XPU] Fix UT failures introduced by community.
#156594 opened
Jun 23, 2025 -
docstring_linter: Fix #151692 and other issues
#156596 opened
Jun 23, 2025 -
[ZENDNN] Integrate ZenDNN library, implement Linear op, add unit-tests
#156599 opened
Jun 23, 2025
704 Issues closed by 116 people
-
`set_reduce_scatter_divide_factor` is inconsistent between FSDP and HSDP
#155903 closed
Jun 23, 2025 -
FSDP2's `set_reduce_scatter_divide_factor` is inconsistent wrt reduce dtype
#155904 closed
Jun 23, 2025 -
DISABLED test_ind_worker_queue (__main__.TestIndividualWorkerQueue)
#68643 closed
Jun 23, 2025 -
DISABLED test_module_and_optimizer_ids (__main__.TestTorchTidyProfiler)
#87581 closed
Jun 23, 2025 -
Provide a way to allow dynamo to trace into an operator defined with `torch.library.custom_op`
#156322 closed
Jun 23, 2025 -
Dynamo benchmark test got failed torch.dtype object has no attribute '__name__'
#156482 closed
Jun 23, 2025 -
[RFC] Migrate to modern Python build system and replace `setup.py` commands with their modern alternatives
#156029 closed
Jun 23, 2025 -
[MPSInductor] Silently incorrect result with varmean+epilogue
#156426 closed
Jun 23, 2025 -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_complex128 (__main__.TestForeachCUDA)
#151300 closed
Jun 23, 2025 -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_bool (__main__.TestForeachCUDA)
#151268 closed
Jun 23, 2025 -
DISABLED test_graph_partition (__main__.TritonCodeGenTests)
#148957 closed
Jun 23, 2025 -
DISABLED test_mm_plus_mm (__main__.TestPatternMatcher)
#145335 closed
Jun 23, 2025 -
Segmentation fault (core dumped) in `torch.profiler.profile`
#156564 closed
Jun 22, 2025 -
UNSTABLE inductor / linux-jammy-cpu-py3.9-gcc11-inductor / test (inductor_torchbench_cpu_smoketest_perf)
#156521 closed
Jun 22, 2025 -
Is it possible to serialize a torch.cuda.CUDAGraph into disk or CPU memory
#125820 closed
Jun 22, 2025 -
`scaled_dot_product_attention` backwards: illegal memory access with large inputs
#150054 closed
Jun 21, 2025 -
Using `opset_version = 22` in `torch.onnx.export` with `dynamo=True` includes dropout nodes in the model
#156542 closed
Jun 21, 2025 -
`torch.distributed.pipelining.pipeline` error when initializing on meta device
#156541 closed
Jun 21, 2025 -
Add runtime profiler info for AOTDispatcher prologue
#155721 closed
Jun 21, 2025 -
UNSTABLE inductor-rocm-mi300 / rocm-py3.10-inductor-mi300 / test (inductor)
#154884 closed
Jun 21, 2025 -
UNSTABLE rocm-mi300 / linux-jammy-rocm-py3.10-mi300 / test (default)
#156360 closed
Jun 21, 2025 -
Support input mutations + aliasing with scan during training
#156337 closed
Jun 20, 2025 -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#151228 closed
Jun 20, 2025 -
Add @markDynamoStrictTest to all TestCase
#115671 closed
Jun 20, 2025 -
Loss with LBFGS not going down
#156501 closed
Jun 20, 2025 -
Can we have Dim.AUTO/Dim.DYNAMIC with an optional min & max?
#147483 closed
Jun 20, 2025 -
DTensor does not compose with Parameters Groups
#156453 closed
Jun 20, 2025 -
[ONNX] Support for grouped query attention
#151762 closed
Jun 20, 2025 -
[XPU] Support toggling profiler on/off for XPU.
#154898 closed
Jun 20, 2025 -
Sourceforge outage causing multiple CI failures
#108773 closed
Jun 20, 2025 -
pytorchbot erroneously thinks PR has already been merged as a different commit
#154427 closed
Jun 20, 2025 -
[ONNX] Inputs generated by onnx.export() with dynamo=False are not consistent with dynamo=True
#136179 closed
Jun 20, 2025 -
[Torch TO ONNX BUG] The right shift operation in torch is mapped as a division operation when converted to ONNX.
#139455 closed
Jun 20, 2025 -
[ONNX] 2.0 regression: dynamic shapes lost for an operator
#139463 closed
Jun 20, 2025 -
[ONNX] Document the registration API
#139499 closed
Jun 20, 2025 -
[ONNX] Run report_exportability when report=True
#139904 closed
Jun 20, 2025 -
Replace reduce(operator.mul) with math.prod for computing product of dimensions
#140888 closed
Jun 20, 2025 -
Exporting the operator 'aten::_transformer_encoder_layer_fwd' to ONNX opset version 17 is not supported
#144242 closed
Jun 20, 2025 -
Custom symbolic functions for ONNX export with None args causes SEGFAULT
#145261 closed
Jun 20, 2025 -
ONNX export failing when using `symbolic` functions and scripting
#146035 closed
Jun 20, 2025 -
Export HuggingFace mamba to ONNX
#146835 closed
Jun 20, 2025 -
[ONNX] BitwiseOr was generated for bool inputs (invalid)
#147854 closed
Jun 20, 2025 -
[ONNX] dynamic dims are not exported with the specified names
#148629 closed
Jun 20, 2025 -
[ONNX] How to export Llama4
#150891 closed
Jun 20, 2025 -
Exporting the operator 'aten::lift_fresh' to ONNX - not supported
#151932 closed
Jun 20, 2025 -
Exporting the operator 'aten::fft_fft2' to ONNX opset version 19 is not supported.
#153823 closed
Jun 20, 2025 -
[ONNX] Verify the translation of SDPA to Attention-23
#156105 closed
Jun 20, 2025 -
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_float64 (__main__.TestForeachCUDA)
#151214 closed
Jun 20, 2025 -
DISABLED test_triton_template_generated_code_caching (__main__.TestMaxAutotune)
#154108 closed
Jun 20, 2025 -
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_float32 (__main__.TestForeachCUDA)
#151136 closed
Jun 20, 2025 -
Inductor cpp_wrapper has performance regressions
#156037 closed
Jun 20, 2025 -
DISABLED test_export_opnames_interface (__main__.TestMisc)
#154986 closed
Jun 20, 2025 -
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_float16 (__main__.TestForeachCUDA)
#151114 closed
Jun 19, 2025 -
_flash_attention_forward accuracy drop from CUDA to ROCM implementation.
#154582 closed
Jun 19, 2025 -
xpu: AOT compilation does not happen with sycl extension (JIT fallback happens)
#156249 closed
Jun 19, 2025 -
Cannot install pytorch through official pip guidance
#156413 closed
Jun 19, 2025 -
Tensors with no explicit references are possible not freed timely with torch.compile
#155778 closed
Jun 19, 2025 -
Support C shim for customized OP
#150988 closed
Jun 19, 2025 -
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_complex64 (__main__.TestForeachCUDA)
#151099 closed
Jun 19, 2025 -
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_complex128 (__main__.TestForeachCUDA)
#151093 closed
Jun 19, 2025 -
FSDP + save optimizer dtype AssertionError
#156166 closed
Jun 19, 2025 -
DISABLED test_parity__foreach_acos_fastpath_outplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#151054 closed
Jun 19, 2025 -
`max_entries` parameter of `torch.cuda.memory._record_memory_history()`
#129674 closed
Jun 19, 2025 -
Indexing beyond end of array on ROCm build
#155045 closed
Jun 18, 2025 -
[ued][kokoro] torch.compile fails in kokoro (both fullgraph=True and False)
#149570 closed
Jun 18, 2025 -
Actual torch `ExportGraphSignature` does not match the example in the docs
#156184 closed
Jun 18, 2025 -
Certain operations cause implicity sync-points
#12461 closed
Jun 18, 2025 -
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_float64 (__main__.TestForeachCUDA)
#151019 closed
Jun 18, 2025 -
DISABLED test_comprehensive_pca_lowrank_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#139828 closed
Jun 18, 2025 -
DISABLED test_roi_align_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#103156 closed
Jun 18, 2025 -
NCCL init hits CUDA failure 'invalid argument' on 12.2 driver
#150852 closed
Jun 18, 2025 -
Schema version check fails in `torch.export.load`
#156354 closed
Jun 18, 2025 -
Windows Runners are not available on PyTorch CI/CD
#156352 closed
Jun 18, 2025 -
format_flamegraph failed to setup the script
#156309 closed
Jun 18, 2025 -
Cannot install >=2.7.0 on ubuntu 18.04, conflict with prerequisite
#156215 closed
Jun 18, 2025 -
[tracker] DTensor Operator Coverage
#156204 closed
Jun 18, 2025 -
Flip is much slower than advanced indexing
#16424 closed
Jun 18, 2025 -
Please implement the batching rule for torch.matrix_exp.
#115992 closed
Jun 18, 2025 -
Function 'MmBackward0' returned nan values in its 0th output.
#156015 closed
Jun 18, 2025 -
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_float32 (__main__.TestForeachCUDA)
#151003 closed
Jun 18, 2025 -
Status of support for ROCm 6.4.1
#155292 closed
Jun 18, 2025 -
DISABLED test_matmul_layer_norm_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#151835 closed
Jun 18, 2025 -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_float32 (__main__.TestForeachCUDA)
#150530 closed
Jun 18, 2025 -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_float16 (__main__.TestForeachCUDA)
#150510 closed
Jun 18, 2025 -
[FR] Expose CUDAGraph handle to allow customized modification on the graph
#155106 closed
Jun 18, 2025 -
Have compiled autograd config API support nested compilation
#152219 closed
Jun 18, 2025 -
Convert to markdown: rpc.rst, signal.rst, size.rst, sparse.rst, special.rst
#155033 closed
Jun 18, 2025 -
DISABLED test_randint_distribution_dynamic_shapes_xpu (__main__.DynamicShapesCodegenGPUTests)
#155689 closed
Jun 17, 2025 -
DISABLED test_randint_distribution_dynamic_shapes_xpu (__main__.DynamicShapesGPUTests)
#155692 closed
Jun 17, 2025 -
DISABLED test_serialize_by_key (__main__.PrecompileContextTests)
#156146 closed
Jun 17, 2025 -
DISABLED test_basic (__main__.PrecompileContextTests)
#156063 closed
Jun 17, 2025 -
`torch.ops.aten.index_put` returns different results on CUDA and CPU
#156173 closed
Jun 17, 2025 -
DISABLED test_grad_with_manual_interleaved_ScheduleClass0_use_new_runtime_True (__main__.ScheduleTest)
#154373 closed
Jun 17, 2025 -
DISABLED test_grad_with_manual_interleaved_ScheduleClass1_use_new_runtime_False (__main__.ScheduleTest)
#154391 closed
Jun 17, 2025 -
DISABLED test_grad_with_manual_interleaved_ScheduleClass1_use_new_runtime_True (__main__.ScheduleTest)
#154408 closed
Jun 17, 2025 -
DISABLED test_grad_with_manual_interleaved_ScheduleClass2_use_new_runtime_False (__main__.ScheduleTest)
#154443 closed
Jun 17, 2025 -
torch.where() can produce nan values for unselected branch during backward
#156212 closed
Jun 17, 2025 -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_bool (__main__.TestForeachCUDA)
#150468 closed
Jun 17, 2025 -
Failure of iOS Build Test: Build (default, 1, 1, macos-14-xlarge, SIMULATOR, arm64)
#136284 closed
Jun 17, 2025 -
[Testing] multigpu tests are still running against CUDA-11
#154119 closed
Jun 17, 2025 -
ONNX Dynamo Export - Unsupported FX nodes: {'call_function': ['aten._upsample_bilinear2d_aa.default']}.
#128818 closed
Jun 17, 2025 -
torch.compile fails to trace methods decorated with @lru_cache
#155841 closed
Jun 17, 2025 -
[FDSP2] express zero-1 with fully_shard
#155952 closed
Jun 17, 2025 -
MPS cumsum failure for 5D tensor or above
#154881 closed
Jun 17, 2025 -
get different result between conv1x1 and linear
#156154 closed
Jun 17, 2025 -
[dynamo] Add support for torch.cuda.FloatTensor()
#130722 closed
Jun 17, 2025 -
Convert to markdown: linalg.rst, logging.rst, masked.rst, meta.rst, miscellaneous_environment_variables.rst
#155025 closed
Jun 17, 2025 -
A mistake in PyTorch Docs for nn.RNN
#129446 closed
Jun 17, 2025 -
When calling torch.histc the CPU and CUDA implementations produce different outputs.
#156019 closed
Jun 17, 2025 -
When calling torch.cumprod on a float16 tensor, the CPU and CUDA implementations produce different outputs.
#156018 closed
Jun 17, 2025 -
Extra onnx::Neg_2 input after torch.onnx.export
#148655 closed
Jun 17, 2025 -
RuntimeError: CUDA driver error: operation not supported with test_stream_write_value32 and cuStreamWriteValue32
#154073 closed
Jun 17, 2025 -
DISABLED test_reentrant_parent_error_on_cpu_cuda (__main__.TestAutogradDeviceTypeCUDA)
#86735 closed
Jun 17, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int8 (__main__.TestForeachCUDA)
#150407 closed
Jun 17, 2025 -
[XPU] Upgrade the XPU support packages version to 2025.1 in CI/CD
#151097 closed
Jun 17, 2025 -
libtorch doesn't work with cuda 12.6 and 12.4
#132575 closed
Jun 17, 2025 -
DISABLED test_weight_norm_bwd_dynamic_shapes_cpu (__main__.DynamicShapesCodegenCpuTests)
#153803 closed
Jun 17, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int64 (__main__.TestForeachCUDA)
#150392 closed
Jun 17, 2025 -
DISABLED test_pattern_matcher_multi_user_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#134433 closed
Jun 17, 2025 -
Update ONNX Opset Version to Support Attention Operator
#153611 closed
Jun 17, 2025 -
DISABLED test_weight_norm_bwd_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#141484 closed
Jun 17, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_bool (__main__.TestForeachCUDA)
#150120 closed
Jun 17, 2025 -
[ONNX] Implement scan
#151327 closed
Jun 17, 2025 -
`TestCppExtensionOpenRgistration.test_base_device_registration` hangs during shutdown on MacOS
#155759 closed
Jun 16, 2025 -
ROCm: no HIP device available if device is already initialized
#152941 closed
Jun 16, 2025 -
Inductor CI failure due to Huggingface outage
#156113 closed
Jun 16, 2025 -
[Compiled_autograd] running nn.LayerNorm failed for torch.compile with compiled_autograd when deepspeed Zero3
#140091 closed
Jun 16, 2025 -
add x/0 gradient behaviour to documentation
#128796 closed
Jun 16, 2025 -
Stop special-casing einops in Dynamo
#142486 closed
Jun 16, 2025 -
None deterministic output of linear projection based on batch size and projection dimensions
#156084 closed
Jun 16, 2025 -
DISABLED test_tmp_not_defined_issue2_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#135219 closed
Jun 16, 2025 -
DISABLED test_grad_with_manual_interleaved_ScheduleClass2_use_new_runtime_True (__main__.ScheduleTest)
#154481 closed
Jun 16, 2025 -
DISABLED test_on_device_tma_store_old_api (__main__.MutationTests)
#155691 closed
Jun 16, 2025 -
torch.cuda.set_device(0) behaves differently from torch.cuda.set_device(1) in terms of cuda context
#155668 closed
Jun 16, 2025 -
DISABLED test_cache_hot_load_device_cuda_bfloat16_dynamic_False (__main__.AOTAutogradCacheTests)
#145334 closed
Jun 16, 2025 -
IInconsistent Error Handling in `torch.fused_moving_avg_obs_fake_quant` Between CPU and GPU Implementations
#153310 closed
Jun 16, 2025 -
DISABLED test_fake_registration (__main__.TestOpProfiles)
#151301 closed
Jun 16, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int32 (__main__.TestForeachCUDA)
#150350 closed
Jun 16, 2025 -
Label the Python version for older PyTorch releases
#156038 closed
Jun 16, 2025 -
In the docs for torch.amax/amin the note about min/max gradient behavior is outdated
#155048 closed
Jun 15, 2025 -
[feature request]: Update max onnx opset to 21 for compatability
#127167 closed
Jun 15, 2025 -
Update nccl 2.27.3 in pytorch nightly
#155052 closed
Jun 14, 2025 -
[DCP] failure case of save method
#152310 closed
Jun 14, 2025 -
The PyTorch version is too low and does not support 50 series GPUs
#155985 closed
Jun 14, 2025 -
[cutlass backend] Add cutlass 3x support ops config to control which ops to do cutlass lowerings on
#155718 closed
Jun 14, 2025 -
`lintrunner init` fails
#152999 closed
Jun 14, 2025 -
[inductor] [fake tensor] `torch.conj` crashes when `add` original complex tensor
#148950 closed
Jun 14, 2025 -
DistributedDataParallel with compile(..., mode="max-autotune") hangs in 2.5+
#140395 closed
Jun 14, 2025 -
[CI][CUDA][Distributed] test_assert_nan_float16 unit test hangs with certain Host OS + CUDA KMD 570.133.07
#153479 closed
Jun 13, 2025 -
Hide getitems in Dynamo bytecode profiling
#153372 closed
Jun 13, 2025 -
DISABLED test_sdpa_rewriter_12_cuda (__main__.SDPAPatternRewriterCudaTests)
#145187 closed
Jun 13, 2025 -
DISABLED test_sdpa_rewriter_12_cuda (__main__.SDPAPatternRewriterCudaDynamicTests)
#145188 closed
Jun 13, 2025 -
DISABLED test_sdpa_rewriter_11_cuda (__main__.SDPAPatternRewriterCudaTests)
#148525 closed
Jun 13, 2025 -
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_float16 (__main__.TestForeachCUDA)
#150985 closed
Jun 13, 2025 -
Convert to markdown: cuda.rst, cuda.tunable.rst, cudnn_persistent_rnn.rst, cudnn_rnn_determinism.rst, data.rst
#155016 closed
Jun 13, 2025 -
Tensor.backward type hints clarification
#81963 closed
Jun 13, 2025 -
DISABLED test_parity__foreach_ceil_fastpath_inplace_cuda_complex128 (__main__.TestForeachCUDA)
#155887 closed
Jun 13, 2025 -
DISABLED test_parity__foreach_ceil_fastpath_inplace_cuda_complex64 (__main__.TestForeachCUDA)
#155908 closed
Jun 13, 2025 -
`make_fx` error in nightly but not PyTorch 2.7.1
#155605 closed
Jun 13, 2025 -
Cannot build docs via `make html`
#155092 closed
Jun 13, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_float64 (__main__.TestForeachCUDA)
#150298 closed
Jun 13, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_float32 (__main__.TestForeachCUDA)
#150208 closed
Jun 13, 2025 -
[Upstream Triton] [ROCm] Accuracy issues in ```inductor/test_torchinductor_opinfo```
#155803 closed
Jun 13, 2025 -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#148966 closed
Jun 13, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_float16 (__main__.TestForeachCUDA)
#150173 closed
Jun 13, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_complex64 (__main__.TestForeachCUDA)
#150161 closed
Jun 13, 2025 -
Typing: Incorrect overload for boolean operators
#155701 closed
Jun 12, 2025 -
`python torchgen/gen.py --update-aoti-c-shim` should update all the c_shim_*_files
#155349 closed
Jun 12, 2025 -
module.cuda() doesn't work under FakeTensorMode
#148977 closed
Jun 12, 2025 -
Expand Examples for torch.autograd.functional.jacobian
#132140 closed
Jun 12, 2025 -
Github outage resulting in jobs failing at checkout
#155829 closed
Jun 12, 2025 -
MPS: Conv1d fails with NotImplementedError for output_channels > 65536
#152278 closed
Jun 12, 2025 -
NO support for torch 2.7.1 + cuda 12.4?
#155790 closed
Jun 12, 2025 -
GH200/GB200 NCCL Build Pytorch
#152182 closed
Jun 12, 2025 -
[Multiprocesing] missing `_release_ipc_counter` in rebuilding cuda ipc tensor with UntypedStorage
#155311 closed
Jun 12, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_complex128 (__main__.TestForeachCUDA)
#150141 closed
Jun 12, 2025 -
DISABLED test_variant_consistency_jit_linalg_lu_factor_cuda_float32 (__main__.TestJitCUDA)
#86839 closed
Jun 12, 2025 -
DISABLED test_comprehensive_nn_functional_nll_loss_cuda_float64 (__main__.TestDecompCUDA)
#118355 closed
Jun 12, 2025 -
DISABLED test_vmapjvpvjp_linalg_lu_cuda_float32 (__main__.TestOperatorsCUDA)
#86733 closed
Jun 12, 2025 -
DISABLED test_vmapvjpvjp_linalg_lu_factor_cuda_float32 (__main__.TestOperatorsCUDA)
#113850 closed
Jun 12, 2025 -
DISABLED test_vmapvjp_linalg_lu_cuda_float32 (__main__.TestOperatorsCUDA)
#86893 closed
Jun 12, 2025 -
DISABLED test_vmapvjpvjp_linalg_lu_cuda_float32 (__main__.TestOperatorsCUDA)
#86929 closed
Jun 12, 2025 -
DISABLED test_vmapjvpall_linalg_lu_cuda_float32 (__main__.TestOperatorsCUDA)
#86770 closed
Jun 12, 2025 -
DISABLED test_variant_consistency_jit_linalg_lu_cuda_complex64 (__main__.TestJitCUDA)
#87070 closed
Jun 12, 2025 -
DISABLED test_variant_consistency_jit_linalg_lu_factor_ex_cuda_complex64 (__main__.TestJitCUDA)
#86887 closed
Jun 12, 2025 -
TRACK: integral + floating inputs to an op with floating requiring grad result in INTERNAL_ASSERT
#78332 closed
Jun 12, 2025 -
`matmul, mm` triggers INTERNAL ASSERT FAILED when input requires grad
#78141 closed
Jun 12, 2025 -
`index_fill` will trigger INTERNAL ASSERT when float tensor requiring grad + int tensor
#78443 closed
Jun 12, 2025 -
`layer_norm` triggers INTERNAL ASSERT with input requiring grad + zero-size int tensor
#78444 closed
Jun 12, 2025 -
`addmv, mv` will trigger INTERNAL ASSERT FAILED when input requiring grad
#77814 closed
Jun 12, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_uint8 (__main__.TestForeachCUDA)
#150417 closed
Jun 12, 2025 -
DISABLED test_dict_contains (__main__.TestGuardSerialization)
#153530 closed
Jun 12, 2025 -
Convert to markdown: accelerator.rst, amp.rst, autograd.rst, backends.rst, benchmark_utils.rst
#155013 closed
Jun 12, 2025 -
DISABLED test_ddp_apply_optim_in_backward_ignored_params (__main__.TestDistBackendWithSpawn)
#106361 closed
Jun 12, 2025 -
ERROR: Unknown target name: "deepspeed"
#155158 closed
Jun 11, 2025 -
Convert to markdown: fsdp.rst, func.api.rst, func.batch_norm.rst, func.migrating.rst, func.rst
#155021 closed
Jun 11, 2025 -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_bfloat16 (__main__.TestForeachCUDA)
#150119 closed
Jun 11, 2025 -
Convert to markdown: draft_export.rst, export.ir_spec.rst, export.programming_model.rst, export.rst, fft.rst
#155020 closed
Jun 11, 2025 -
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_complex64 (__main__.TestForeachCUDA)
#150960 closed
Jun 11, 2025 -
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_complex128 (__main__.TestForeachCUDA)
#150933 closed
Jun 11, 2025 -
UNSTABLE pull / linux-jammy-py3-clang12-executorch / build
#150261 closed
Jun 11, 2025 -
[inductor] Improve GEMM loggings for torch.bmm
#155307 closed
Jun 11, 2025 -
Inductor's require_stride_order and require_exact_strides should pull from the graph sent to inductor
#137979 closed
Jun 11, 2025 -
get custom operators to use exact strides
#146210 closed
Jun 11, 2025 -
torch.library.custom_op string support
#152685 closed
Jun 11, 2025 -
Autograd doc does not mention torch.autograd.set_grad_enabled
#86718 closed
Jun 11, 2025 -
DISABLED test_multi_output_unbacked_custom_op_cuda (__main__.TestInductorDynamicCUDA)
#135755 closed
Jun 11, 2025 -
Enhanced Feedback for `load_state_dict` with `strict=False`
#141256 closed
Jun 11, 2025 -
DISABLED test_parity__foreach_acos_fastpath_inplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#150902 closed
Jun 11, 2025 -
DISABLED test_cublas_addmm_size_10000_backend_cublaslt_cuda_float32 (__main__.TestMatmulCudaCUDA)
#154498 closed
Jun 11, 2025 -
DISABLED test_vdd_clamp_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#134445 closed
Jun 11, 2025 -
Partitioner loses Inplace ops where source is constant
#155242 closed
Jun 11, 2025 -
[XPU] Kineto profiler fails on XPU with `PTI_ERROR_NOT_IMPLEMENTED`
#153632 closed
Jun 11, 2025 -
DISABLED test_reorder_peak_memory_dfs (__main__.TestOperatorReorderForPeakMemory)
#145183 closed
Jun 11, 2025 -
DISABLED test_ddp_comm_hook_sparse_gradients (__main__.DistributedDataParallelTest)
#146400 closed
Jun 11, 2025 -
DISABLED test_sparse_gradients (__main__.DistributedDataParallelTest)
#153142 closed
Jun 11, 2025 -
DISABLED test_reorder_peak_memory_lpmf (__main__.TestOperatorReorderForPeakMemory)
#145210 closed
Jun 11, 2025 -
DISABLED test_reorder_peak_memory_bfs (__main__.TestOperatorReorderForPeakMemory)
#147949 closed
Jun 11, 2025 -
DISABLED test_variant_consistency_jit_linalg_lu_cuda_float32 (__main__.TestJitCUDA)
#86711 closed
Jun 11, 2025 -
DISABLED test_variant_consistency_jit_linalg_lu_factor_cuda_complex64 (__main__.TestJitCUDA)
#86732 closed
Jun 11, 2025 -
DISABLED test_mm_concat_cuda (__main__.FreezingGpuTests)
#145186 closed
Jun 11, 2025 -
[XPU User Empathy Day] `torch.linalg.solve` support on XPU
#154182 closed
Jun 11, 2025 -
DISABLED test_smoke (__main__.TestCollectEnv)
#77345 closed
Jun 11, 2025 -
xpu: can't install latest nightly torch along with latest torchvision (torchvision depends on N-1 torch)
#154687 closed
Jun 11, 2025 -
DISABLED test_duplicate_registration_impl (__main__.TestOpProfiles)
#151281 closed
Jun 11, 2025 -
FP8 quantization causes bad precision with torch.compile
#154020 closed
Jun 11, 2025 -
Inconsistent result of torch.eye() in CPU vs GPU
#155661 closed
Jun 11, 2025 -
[doc] Long Function Name (C++) overlapping on the side menu
#66937 closed
Jun 11, 2025 -
deepcopy of Lazy modules returns an exception
#65051 closed
Jun 11, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_uint8 (__main__.TestForeachCUDA)
#150878 closed
Jun 11, 2025 -
torch.library.custom_op doesn't handle 1-element tuples returns
#150472 closed
Jun 11, 2025 -
Core dumpect when inspecting instantiation of a torch.autograd.Function (Future deprecation)
#154981 closed
Jun 11, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_int8 (__main__.TestForeachCUDA)
#150837 closed
Jun 10, 2025 -
Convert to markdown: onnx_verification.rst, onnx.rst, optim.rst, package.rst, profiler.rst
#155031 closed
Jun 10, 2025 -
Jacobian mismatch for `nn.functional.ctc_loss`
#67462 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_inline_and_install_strict (__main__.InlineAndInstallStrictExportTestExport)
#154951 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic (__main__.TestExport)
#154950 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_retraceability_strict (__main__.RetraceExportTestExport)
#154940 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_retraceability_nonstrict (__main__.RetraceExportNonStrictTestExport)
#154939 closed
Jun 10, 2025 -
DISABLED test_cond_contains_unbacked_no_escape_cpp_serdes (__main__.CppSerdesTestExport)
#154919 closed
Jun 10, 2025 -
DISABLED test_cond_contains_unbacked_no_escape_retraceability_strict (__main__.RetraceExportTestExport)
#154915 closed
Jun 10, 2025 -
DISABLED test_cond_contains_unbacked_no_escape_strict (__main__.StrictExportTestExport)
#154913 closed
Jun 10, 2025 -
DISABLED test_cond_contains_unbacked_no_escape_serdes_strict (__main__.SerDesExportTestExport)
#154912 closed
Jun 10, 2025 -
DISABLED test_cond_contains_unbacked_no_escape_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154911 closed
Jun 10, 2025 -
DISABLED test_reshape_view_helper_inline_and_install_strict (__main__.InlineAndInstallStrictExportTestExport)
#154877 closed
Jun 10, 2025 -
DISABLED test_reshape_view_helper_strict (__main__.StrictExportTestExport)
#154878 closed
Jun 10, 2025 -
DISABLED test_cond_contains_unbacked_no_escape (__main__.TestExport)
#154909 closed
Jun 10, 2025 -
DISABLED test_reshape_view_helper (__main__.TestExport)
#154876 closed
Jun 10, 2025 -
DISABLED test_reshape_view_helper_retraceability_strict (__main__.RetraceExportTestExport)
#154875 closed
Jun 10, 2025 -
DISABLED test_reshape_view_helper_retraceability_nonstrict (__main__.RetraceExportNonStrictTestExport)
#154874 closed
Jun 10, 2025 -
DISABLED test_reshape_view_helper_serdes_strict (__main__.SerDesExportTestExport)
#154873 closed
Jun 10, 2025 -
DISABLED test_reshape_view_helper_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154872 closed
Jun 10, 2025 -
DISABLED test_reshape_view_helper_cpp_serdes (__main__.CppSerdesTestExport)
#154871 closed
Jun 10, 2025 -
DISABLED test_reshape_view_helper_training_ir_to_decomp_strict (__main__.TrainingIRToRunDecompExportTestExport)
#154870 closed
Jun 10, 2025 -
DISABLED test_dim_hint_range_violations_serdes_strict (__main__.SerDesExportTestExport)
#154993 closed
Jun 10, 2025 -
DISABLED test_dim_hint_range_violations_strict (__main__.StrictExportTestExport)
#154994 closed
Jun 10, 2025 -
DISABLED test_dim_hint_range_violations_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154992 closed
Jun 10, 2025 -
DISABLED test_dim_hint_range_violations_retraceability_strict (__main__.RetraceExportTestExport)
#154991 closed
Jun 10, 2025 -
DISABLED test_dim_hint_range_violations_retraceability_nonstrict (__main__.RetraceExportNonStrictTestExport)
#154990 closed
Jun 10, 2025 -
DISABLED test_dim_hint_range_violations_cpp_serdes (__main__.CppSerdesTestExport)
#154989 closed
Jun 10, 2025 -
DISABLED test_dim_hint_range_violations (__main__.TestExport)
#154988 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_strict (__main__.StrictExportTestExport)
#154972 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_specialization_retraceability_nonstrict (__main__.RetraceExportNonStrictTestExport)
#154970 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_specialization_retraceability_strict (__main__.RetraceExportTestExport)
#154971 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_specialization_cpp_serdes (__main__.CppSerdesTestExport)
#154969 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_training_ir_to_decomp_strict (__main__.TrainingIRToRunDecompExportTestExport)
#154967 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_specialization (__main__.TestExport)
#154966 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_specialization_serdes_strict (__main__.SerDesExportTestExport)
#154965 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_specialization_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154964 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_cpp_serdes (__main__.CppSerdesTestExport)
#154957 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_serdes_nonstrict (__main__.SerDesExportNonStrictTestExport)
#154952 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_serdes_strict (__main__.SerDesExportTestExport)
#154953 closed
Jun 10, 2025 -
DISABLED test_dim_dynamic_specialization_strict (__main__.StrictExportTestExport)
#154954 closed
Jun 10, 2025 -
Integrate with ONNX 1.18.0 release branch
#151681 closed
Jun 10, 2025 -
[Profile] unexpected ops have been profiled from torch.profiler
#155188 closed
Jun 10, 2025 -
Tensor.long() produces inconsistent results for torch.inf between CPU and GPU
#154724 closed
Jun 10, 2025 -
Support Delay Loading of c10.dll in when using libtorch as a thirdparty library.
#105058 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_float32 (__main__.TestForeachCUDA)
#150747 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_uint8 (__main__.TestForeachCUDA)
#150662 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_int8 (__main__.TestForeachCUDA)
#150630 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_int64 (__main__.TestForeachCUDA)
#150617 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#150668 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_float64 (__main__.TestForeachCUDA)
#150752 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_bool (__main__.TestForeachCUDA)
#150680 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_int32 (__main__.TestForeachCUDA)
#150602 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_int64 (__main__.TestForeachCUDA)
#150822 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_int32 (__main__.TestForeachCUDA)
#150800 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_int16 (__main__.TestForeachCUDA)
#150590 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_int16 (__main__.TestForeachCUDA)
#150772 closed
Jun 10, 2025 -
DISABLED test_parity__foreach_abs_fastpath_outplace_cuda_float16 (__main__.TestForeachCUDA)
#150712 closed
Jun 10, 2025 -
[compile time][inductor] Quadratic compile time observed in Inductor fusion
#154652 closed
Jun 10, 2025 -
torch._dynamo.exc.Unsupported: Failed to trace builtin operator
#155436 closed
Jun 10, 2025 -
Potential Bug with HYBRID_SHARD and (n, 1) Device Mesh Falling Back to NO_SHARD
#154888 closed
Jun 9, 2025 -
Broken link in doc.
#132178 closed
Jun 9, 2025 -
Rprop does not work in MPS for noncontiguous tensors
#118117 closed
Jun 9, 2025 -
[cuBLAS] relax the restrictions on the use of cublasLt
#153590 closed
Jun 9, 2025 -
[Upstream Triton] experimental_tensormap_fenceproxy_acquire
#154692 closed
Jun 9, 2025 -
The compilation result is incorrect when torch.compile compiles code containing torch.nonzero.
#155324 closed
Jun 9, 2025 -
`weight` argument of `nn.CrossEntropyLoss()` works with `int`, `complex` and `bool` type
#134896 closed
Jun 9, 2025 -
[Inductor] Need a hash key for the underlying kernel for ChoiceCaller that doesn't depend on runtime params
#154467 closed
Jun 9, 2025 -
request for faster inductor kernels for blockwise reduction across dim1 -> write
#149982 closed
Jun 9, 2025 -
Sliced float16/bfloat16 tensors can have numerical errors > 1e-3 after passing through nn.Linear.
#154997 closed
Jun 9, 2025 -
Internal Assertion Failure in torch.tensor() in Restricted eval() Environment
#155224 closed
Jun 9, 2025 -
internal assert failed while trying to run dorado (ONT basecaller)
#155326 closed
Jun 9, 2025 -
DISABLED test_pending_fusions_multiple (__main__.TestPrologueFusion)
#152221 closed
Jun 9, 2025 -
[bug] the LTS torch==1.8.2 pip package is incomplete
#69689 closed
Jun 9, 2025 -
Adam is 30% slower than SGD on Apple Metal.
#78063 closed
Jun 9, 2025 -
Need support and testing for Adam optimizer for MPS
#105382 closed
Jun 9, 2025 -
DISABLED test_sparse_add_cuda_float64 (__main__.TestSparseCSRCUDA)
#145019 closed
Jun 9, 2025 -
Document garbage_collection_threshold default
#150917 closed
Jun 8, 2025 -
Convert to markdown: mps.rst, mtia.memory.rst, mtia.rst, multiprocessing.rst, name_inference.rst
#155027 closed
Jun 8, 2025 -
Segmentation fault (core dumped) in torch.concat
#155306 closed
Jun 8, 2025 -
I get different results on simple network operation on a computer with and without AVX512
#155423 closed
Jun 8, 2025 -
Export _in_spec different before and after load
#154674 closed
Jun 8, 2025 -
There is a performance drop because we have not yet implemented the batching rule for aten::matrix_exp.
#155400 closed
Jun 8, 2025 -
profile for torch.add(x, x) where x is a zero-sized tensor looks bogus
#151829 closed
Jun 7, 2025 -
Potential indexing issues in compile for large tensors
#154168 closed
Jun 7, 2025 -
Pytorch with Rocm nightly is missing a dependency
#155207 closed
Jun 7, 2025 -
DISABLED test_penalized_small_dim (__main__.TestTiling)
#155186 closed
Jun 7, 2025 -
torch.device context manager change doesn't show in torch.get_default_device
#131328 closed
Jun 7, 2025 -
[ONNX] Use onnx Attention operator for scaled_dot_product_attention
#149662 closed
Jun 7, 2025 -
[ONNX] rfftn/irfftn produces incorrect shapes
#125903 closed
Jun 7, 2025 -
Inductor doesn't move 0-D tensors to enable cudagraphs
#119241 closed
Jun 7, 2025 -
how to save the fx graph with output tensor shapes ?
#155391 closed
Jun 7, 2025 -
MPS missing erfc
#155337 closed
Jun 7, 2025 -
[TensorDict - compile] cuda.is_initialized compatibility
#129659 closed
Jun 7, 2025 -
DISABLED test_stft_xpu (__main__.AOTInductorTestABICompatibleGpu)
#154701 closed
Jun 6, 2025 -
[ROCm][TunableOp] Contents of untuned csv files are ignored during offline tuning
#153462 closed
Jun 6, 2025 -
Weight Shape Corruption in Sequential Modules During Stateful Execution
#155246 closed
Jun 6, 2025 -
Convert to markdown: torch.overrides.rst, type_info.rst, utils.rst, xpu.rst
#155041 closed
Jun 6, 2025 -
Cpp-wrapper mode issue tracker
#117363 closed
Jun 6, 2025 -
torch.compiled flex_attention + NJT raises `RuntimeError: Attempting to use FunctionalTensor on its own.`
#154556 closed
Jun 6, 2025 -
Release 2.7.1 validations checklist and cherry-picks
#154512 closed
Jun 6, 2025 -
Cannot Export Dynamic Shape Decoder to Onnx
#153955 closed
Jun 6, 2025 -
Better mergebot messages when reverting a PR
#139680 closed
Jun 6, 2025 -
"Automatically add `__all__`" tool and linter
#146242 closed
Jun 6, 2025 -
ImportError: cannot import name 'make_fx' from 'torch.fx
#155323 closed
Jun 6, 2025 -
Long CI Queue Times: VolumeLimitExceeded
#155265 closed
Jun 6, 2025 -
Unexpected behaviour when resuming from checkpoint using CosineAnnealingLR
#65342 closed
Jun 6, 2025 -
[feature request] Global GPU Flag
#7535 closed
Jun 6, 2025 -
TorchScript based ONNX exporter: Shape Inference failure
#153214 closed
Jun 6, 2025 -
Jobs failing with: "The job was not acquired by Runner of type", Internal server error
#155250 closed
Jun 5, 2025 -
torch._higher_order_ops.scan graph breaks and clamp error with compile/autograd
#153437 closed
Jun 5, 2025 -
autograd.grad in compiled function can't run with code cache hit
#154536 closed
Jun 5, 2025 -
[MPS] "Can't be indexed using 32-bit iterator" error as of 20250430 nightly cpu MacOS build
#154828 closed
Jun 5, 2025 -
Move all CI/CD workflows from focal to jammy
#154157 closed
Jun 5, 2025 -
Inconsistent behavior between CPU and GPU implementations of `torch.arange`
#153133 closed
Jun 5, 2025 -
AI On-device (Feature)
#155161 closed
Jun 5, 2025 -
[Upstream Triton] AttributeError: args ```test_deep_reentrant```
#154249 closed
Jun 5, 2025 -
[Upstream Triton] torch._dynamo.exc.BackendCompilerFailed ```test_conv_weight_layout_convert_cuda```
#154231 closed
Jun 5, 2025 -
[Upstream Triton] RuntimeError: Expected to not find ".run(" but found it ```test_low_precision```
#154228 closed
Jun 5, 2025 -
`torch.cross`'s behavior is different on cpu and gpu on torch 2.5.0.dev20240708+cu121
#132031 closed
Jun 5, 2025 -
DISABLED test_sparse_gradients_grad_is_view (__main__.DistributedDataParallelTest)
#140368 closed
Jun 5, 2025 -
[Upstream Triton] nvcc / ptx mismatch w/ AOTI
#154938 closed
Jun 5, 2025 -
[Windows UT] CpuTests.test_fractional_max_pool2d2_cpu failed with PyTorch 2025-05-25 nightly wheel
#154697 closed
Jun 5, 2025 -
Unexpected behavior when using dist.all_reduce(x, op=dist.ReduceOp.SUM)
#152300 closed
Jun 5, 2025 -
DISABLED test_deterministic_algorithms (__main__.TestGuardSerialization)
#154090 closed
Jun 5, 2025 -
hugging face transformer regression on `facebook/opt-125m`
#155168 closed
Jun 5, 2025 -
Non-negligible overhead of `OpOverloadPacket` dispatch w/o overload
#153626 closed
Jun 5, 2025 -
AI On-Device
#155159 closed
Jun 4, 2025 -
`2` and `-2` for `ord` argument of `linalg.norm()` should be explained more clearly
#136453 closed
Jun 4, 2025 -
[v2.7.1] Release Tracker
#152627 closed
Jun 4, 2025 -
quantize_fx.prepare_qat_fx, `get_default_qat_qconfig_mapping` is unused in code.
#144522 closed
Jun 4, 2025 -
DISABLED TCPStoreTest.testMultiTenantStoresUV (__main__.TCPStoreTest)
#139150 closed
Jun 4, 2025 -
DISABLED TCPStoreTest.testMultiTenantStores (__main__.TCPStoreTest)
#142030 closed
Jun 4, 2025 -
DISABLED AotInductorTest.FreeInactiveConstantBufferCuda (build.bin.test_aoti_inference)
#149495 closed
Jun 4, 2025 -
`lintrunner-noclang` job name is confusing
#126324 closed
Jun 4, 2025 -
Enable opting out of CI experiments
#139334 closed
Jun 4, 2025 -
torch.compile failed to handle a custom __delattr__ method correctly
#150765 closed
Jun 4, 2025 -
[Upstream Triton] AssertionError: Scalars are not equal! ```inductor.test_mkldnn_pattern_matcher```
#154225 closed
Jun 4, 2025 -
[Upstream Triton] AssertionError: Incorrect result from choice TritonTemplateCaller ```test_mm_dropout```
#154222 closed
Jun 4, 2025 -
[CPU][CUDA] `cpu` and `cuda` implementations of `log_softmax` backward seem to disagree on `dim > 2`
#155043 closed
Jun 4, 2025 -
[Upstream Triton] [HIP] torch._inductor.exc.InductorError ```inductor.test_custom_lowering```
#154242 closed
Jun 4, 2025 -
[Upstream Triton] [HIP] RuntimeError: built without cuda ```inductor.test_distributed_patterns```
#154233 closed
Jun 4, 2025 -
[Upstream Triton] AssertionError: Tensor-likes are not close! ```inductor.test_torchinductor_opinfo```
#154212 closed
Jun 4, 2025 -
Experiencing "429: Too Many Requests" on downloading actions
#155075 closed
Jun 4, 2025 -
[ROCm] AssertionError: Scalars are not equal! ```test_linear_with_input_of_flexible_layout```
#154246 closed
Jun 4, 2025 -
loading state dict with mismatching shapes error
#154597 closed
Jun 4, 2025 -
DISABLED test_ddp_apply_optim_in_backward (__main__.TestDistBackendWithSpawn)
#153266 closed
Jun 4, 2025 -
[Upstream Triton] AssertionError: False is not true ```test_cpu_repro``` and ```test_cuda_repro```
#154245 closed
Jun 4, 2025 -
[Upstream Triton] KeyError: Unknown key: ```'cubin'```
#154207 closed
Jun 4, 2025 -
[Dynamo] recompiles with empty (no reason) guard
#154707 closed
Jun 4, 2025 -
tensor_to_numpy symbol not exported in libtorch_python.so in pytorch 2.7
#154105 closed
Jun 4, 2025 -
`randint(max)` causes a graph break, but not `rand().mul(max).floor().to(torch.long)` (on CPU)
#135664 closed
Jun 4, 2025 -
MPS does not support sigmoid op with int64 input
#154895 closed
Jun 4, 2025 -
Some PyTorch tensor functions silently change the default locale encoding
#151442 closed
Jun 4, 2025 -
Mismatch of mixed precision `cast_fn` in FSDP and FSDP2
#153077 closed
Jun 4, 2025 -
DISABLED test_linear (__main__.TestAOTInductorPackageCpp_xpu)
#154689 closed
Jun 3, 2025 -
DISABLED test_multiple_methods (__main__.TestAOTInductorPackageCpp_xpu)
#154690 closed
Jun 3, 2025 -
DISABLED test_metadata (__main__.TestAOTInductorPackageCpp_xpu)
#154685 closed
Jun 3, 2025 -
DISABLED test_bool_input (__main__.TestAOTInductorPackageCpp_xpu)
#154683 closed
Jun 3, 2025 -
DISABLED test_specified_output_dir (__main__.TestAOTInductorPackageCpp_xpu)
#154681 closed
Jun 3, 2025 -
DISABLED test_add (__main__.TestAOTInductorPackageCpp_xpu)
#154682 closed
Jun 3, 2025 -
Improve sharding algorithm for ASAN (and maybe other jobs as well)
#74620 closed
Jun 3, 2025 -
on-pr docker build stuck with `user is not authorized to BatchGetImage`
#148771 closed
Jun 3, 2025 -
CI/CD: Figure out what to do with split build
#138750 closed
Jun 3, 2025 -
CUDA not found in NVIDIA runners
#153760 closed
Jun 3, 2025 -
[Mergebot] Adding ciflow/pull in PR without pull and lint workflows
#152718 closed
Jun 3, 2025 -
CI workflows being skipped on PR
#152697 closed
Jun 3, 2025 -
Wrong formula for CosineAnnealingLR
#152081 closed
Jun 3, 2025 -
[Upstream Triton] AssertionError: Scalars are not equal! ```inductor/test_codecache.py```
#154250 closed
Jun 3, 2025 -
[Upstream Triton] AssertionError: False is not true ```test_pad_3d_tensor```
#154224 closed
Jun 3, 2025 -
inductor codegen for masks is non-deterministic
#154741 closed
Jun 3, 2025 -
DISABLED test_sort_transpose_mps (__main__.GPUTests)
#153939 closed
Jun 3, 2025 -
DISABLED test_slice_mutation1_mps (__main__.GPUTests)
#153913 closed
Jun 3, 2025 -
Some `Improve Error Message` Bugs
#149625 closed
Jun 3, 2025 -
torch.compile regression: it cause recompile when int value changed
#154490 closed
Jun 3, 2025 -
torch.compile supported with GIL disabled
#147946 closed
Jun 3, 2025 -
[FSDP2] reshard_after_forward=True for root models
#154655 closed
Jun 3, 2025 -
[Upstream Triton] RuntimeError: Tried to register an operator ```test_item_to_inputs_kernel_nobreak_cpu```
#154216 closed
Jun 3, 2025 -
[Upstream Triton] AssertionError: False is not true ```test_inductor_profiling_triton_hooks```
#154223 closed
Jun 3, 2025 -
pytorchbot gets confused when ghstacked commit order are interactively rebased
#154461 closed
Jun 2, 2025 -
MPS does not support log1p op with int64 input
#154883 closed
Jun 2, 2025 -
[ONNX] Exporter improvement tasks
#129274 closed
Jun 2, 2025 -
[ONNX][low pri] Move old (non-public) implementation into legacy/ and schedule for deprecation
#129308 closed
Jun 2, 2025 -
ShardedGradScaler is not documented on the website.
#141543 closed
Jun 2, 2025 -
[FSDP2] mixed precision: auto turn off `cast_forward_inputs`
#146130 closed
Jun 2, 2025 -
Parameter not updating when FSDP2 model is used before optimizer creation
#149205 closed
Jun 2, 2025 -
torch.distributed.checkpoint CUDA OOM with broadcast_from_rank0
#149640 closed
Jun 2, 2025 -
[FSDP] Moving module's view tensor to device
#147321 closed
Jun 2, 2025 -
FSDP with AveragedModel
#149138 closed
Jun 2, 2025 -
[RFC] Deprecate silent fallback to aten logic in Inductor
#147479 closed
Jun 2, 2025 -
DISABLED test_dtypeview_int16_bfloat16_mps (__main__.GPUTests)
#153864 closed
Jun 2, 2025 -
DISABLED test_upsample_nearest1d_mps (__main__.GPUTests)
#153866 closed
Jun 2, 2025 -
[Upstream Triton] AssertionError: expected to fail, but actually passed ```test_unbacked_reduction_cpu```
#154217 closed
Jun 2, 2025 -
DISABLED AotInductorTest.BasicTestCuda (build.bin.test_aoti_inference)
#152888 closed
Jun 2, 2025 -
DISABLED AotInductorTest.BasicTestCpu (build.bin.test_aoti_inference)
#152889 closed
Jun 2, 2025 -
[dynamo] Improve final traceback frame format
#152867 closed
Jun 2, 2025 -
Slow-autograd tests regression
#154459 closed
Jun 2, 2025 -
Small numerical discrepancy in sum/mean after torch.block_diag
#154616 closed
Jun 2, 2025 -
Dynamo converts `Size + tuple` to `tuple` instead of `Size`
#154432 closed
Jun 2, 2025 -
`Inconsistent Results` from `torch.svd_lowrank` on CPU and CUDA
#154479 closed
Jun 2, 2025 -
Add MPS support for ConvTranspose3d
#154615 closed
Jun 2, 2025 -
Inconsistent result of torch.amin() in CPU vs GPU
#154792 closed
Jun 2, 2025 -
Update `pocketfft` in third party
#154843 closed
Jun 2, 2025 -
DISABLED test_resize_mps (__main__.GPUTests)
#153811 closed
Jun 2, 2025 -
DISABLED test_cat_upcasting_mps (__main__.GPUTests)
#153814 closed
Jun 2, 2025 -
DISABLED test_scalar_output_mps (__main__.GPUTests)
#153810 closed
Jun 2, 2025 -
DISABLED test_invalid_operand_issue1_mps (__main__.GPUTests)
#153813 closed
Jun 2, 2025 -
DISABLED test_to_dtype_mps (__main__.GPUTests)
#153812 closed
Jun 2, 2025 -
[Upstream Triton] AssertionError: Tensor-likes are not close! ```test_conv_with_as_strided_cpu```
#154232 closed
Jun 2, 2025 -
DISABLED test_split_cumsum_mps (__main__.GPUTests)
#153804 closed
Jun 2, 2025 -
DISABLED test_profiler_mark_wrapper_call_dynamic_shapes_cpu (__main__.DynamicShapesCpuTests)
#135294 closed
Jun 2, 2025 -
[Inductor] opacus_cifar10 test returns eager_fail_to_run after opacus 1.5.4 update
#154446 closed
Jun 2, 2025 -
Runtime assertions ignored in many cases
#153756 closed
Jun 2, 2025 -
[FSDP] Cannot writeback when the parameter shape changes
#151223 closed
Jun 1, 2025 -
Support for CUDA 12.8 (e.g., RTX 5090) in Previous PyTorch Versions (e.g., 2.3)
#154813 closed
Jun 1, 2025 -
there is a little color problem of the example code in sparse docs.
#154779 closed
Jun 1, 2025 -
empty_cache does not work for CUDAPluggableAllocator + MemPool
#145168 closed
May 31, 2025 -
torch.nextafter(0, 1) returns 0 on MPS device
#150027 closed
May 31, 2025 -
torch.compile on MPS: error running compiled RMSNorm
#150454 closed
May 31, 2025 -
aot_compile default configuration doesn't seem to appropriately setup support for unbacked SymInts
#118304 closed
May 31, 2025 -
[Indexing] Incoherent Tensor indexing for nested lists
#100080 closed
May 31, 2025 -
Indexing a tensor with a NumPy array sometimes works and sometimes doesn't
#65218 closed
May 31, 2025 -
Inconsistent list indexing behavior
#68595 closed
May 31, 2025 -
Cannot index into a tensor using indices from another device - regression from 1.12
#85450 closed
May 30, 2025 -
Mixed logical indexing / numerical indexing fails.
#60261 closed
May 30, 2025 -
Advanced indexing: allow combining Boolean & integer index
#46468 closed
May 30, 2025 -
Cannot use mask and slice assignment together
#140802 closed
May 30, 2025 -
Tensor slice copy across multiple devices fails silently
#84573 closed
May 30, 2025 -
DISABLED test_neg_index_cpu (__main__.CpuTests)
#154760 closed
May 30, 2025 -
Advanced indexing with uint8 tensor versus int64 tensor is inconsistent
#20149 closed
May 30, 2025 -
Named Tensors: Slicing based on name
#29023 closed
May 30, 2025 -
Advanced indexing slower than numpy
#14687 closed
May 30, 2025 -
DISABLED test_AllenaiLongformerBase_repro_cpu (__main__.CpuTests)
#102865 closed
May 30, 2025 -
Accessing elements of tensor with multi-dimensional index results `IndexError`
#43128 closed
May 30, 2025 -
Indexing assignment can have no effect on CUDA with deterministic algorithms
#76176 closed
May 30, 2025 -
Inconsistency between index_select and __get_item__
#83702 closed
May 30, 2025 -
Advanced Indexing does not trace correctly for tensor shape that has leading 1s
#49852 closed
May 30, 2025 -
Unexpected behaviour of 1.13.0
#90194 closed
May 30, 2025 -
CUDA error: device-side assert triggered
#120901 closed
May 30, 2025 -
Torch doesn't copy the assigned self-referential memory on `cpu` (inconsistent with `numpy` and `cuda`)
#126097 closed
May 30, 2025 -
index_put : INTERNAL ASSERT FAILED
#72053 closed
May 30, 2025 -
`torch.index_put` raise error when `accumulate=True`
#144539 closed
May 30, 2025 -
Add int32 support to torch.gather
#148119 closed
May 30, 2025 -
DISABLED test_var_mean_tile_reduction_False_mps (__main__.GPUTests)
#153762 closed
May 30, 2025 -
DISABLED test_tensor_index_put_slice_mps (__main__.GPUTests)
#153761 closed
May 30, 2025 -
[inductor] `proxy_tensor.py` throws `SyntaxError` when using `.random_`
#151432 closed
May 30, 2025 -
DISABLED test_mean_mps (__main__.GPUTests)
#153747 closed
May 30, 2025 -
Tensor.type_as() produces inconsistent results for torch.inf between CPU and GPU
#154730 closed
May 30, 2025 -
Tensor.char() produces inconsistent results for torch.inf between CPU and GPU
#154727 closed
May 30, 2025 -
Tensor.int() produces inconsistent results for torch.inf between CPU and GPU
#154726 closed
May 30, 2025 -
Doc for `assign` parameter of `load_state_dict` is not rendered correctly
#141364 closed
May 30, 2025 -
[export] torch.tensor constructor specializes on float value
#153411 closed
May 30, 2025 -
Cannot override __add__ in NamedTuple with __new__ + torch.compile
#133762 closed
May 30, 2025 -
DISABLED test_lerp_mps (__main__.GPUTests)
#153713 closed
May 30, 2025 -
DISABLED test_view_on_aliased_mps (__main__.GPUTests)
#153717 closed
May 30, 2025 -
DISABLED test_shape_padding_mps (__main__.GPUTests)
#153719 closed
May 30, 2025 -
DISABLED test_topk_mps (__main__.GPUTests)
#153718 closed
May 30, 2025 -
DISABLED test_resize_as_mps (__main__.GPUTests)
#153716 closed
May 30, 2025 -
DISABLED test_xblock_divides_xnumel_mps (__main__.GPUTests)
#153715 closed
May 30, 2025 -
DISABLED test_max_pool2d_with_indices_backward5_mps (__main__.GPUTests)
#153721 closed
May 30, 2025 -
DISABLED test_dtypeview_bfloat16_bfloat16_mps (__main__.GPUTests)
#153710 closed
May 30, 2025 -
DISABLED test_searchsorted_mps (__main__.GPUTests)
#153711 closed
May 30, 2025 -
DISABLED test_view_uint8_through_differing_bitwidths_mps (__main__.GPUTests)
#153720 closed
May 30, 2025 -
DISABLED test_where_with_logical_op_mps (__main__.GPUTests)
#153712 closed
May 30, 2025 -
DISABLED test_inductor_multiple_specializations_dynamic_shapes_cuda (__main__.DynamicShapesCodegenGPUTests)
#154717 closed
May 30, 2025 -
DISABLED test_zero_element_mutation_mps (__main__.GPUTests)
#153708 closed
May 30, 2025 -
DISABLED test_full_dtype (__main__.TestFull)
#138574 closed
May 30, 2025 -
DISABLED test_tensor1_mps (__main__.GPUTests)
#153709 closed
May 30, 2025 -
inductor: inductor conv2d get a different size and stride with eager mod when input channel is zero
#101356 closed
May 30, 2025 -
`torch.compile` fails with sparse embedding (`F.embedding(sparse=True)`)
#150656 closed
May 30, 2025 -
jit compilation returns an int rather than a bool when using math.isnan()
#107166 closed
May 30, 2025 -
llama model failed for dynamic shape path
#106110 closed
May 30, 2025 -
[FSDP2] relax uniform dtype assertion for requires_grad=False
#154082 closed
May 30, 2025 -
AttributeError: type object 'torch._C._distributed_c10d.BackendType' has no attribute 'XCCL'.
#147059 closed
May 30, 2025 -
DISABLED test_div_zero_dim_mps (__main__.GPUTests)
#153694 closed
May 30, 2025 -
DISABLED test_tmp_not_defined_issue2_mps (__main__.GPUTests)
#153693 closed
May 30, 2025 -
DISABLED test_dropout_trivial_1_mps (__main__.GPUTests)
#153695 closed
May 30, 2025 -
DISABLED test_views1_mps (__main__.GPUTests)
#153692 closed
May 30, 2025 -
DISABLED test_var_correction_mps (__main__.GPUTests)
#153691 closed
May 30, 2025 -
DISABLED test_linear (__main__.TestAOTInductorPackageCpp_xpu)
#154684 closed
May 30, 2025 -
speculate_subgraph cannot detect input mutation for a function wrapped with torch.func.functionalize
#154669 closed
May 29, 2025 -
Selective activation checkpointing causes memory leakage
#154642 closed
May 29, 2025 -
Cleanup autotune_fallback_to_aten post-deprecation
#153298 closed
May 29, 2025 -
flex_attention + NJT output inconsistent with non-NJT results
#154554 closed
May 29, 2025 -
DISABLED test_dict_keys_match (__main__.TestGuardSerialization)
#153617 closed
May 29, 2025 -
DISABLED test_seqential_batch_workers (__main__.TestDataLoader)
#81891 closed
May 29, 2025 -
AOTI packaged model can't be run on newly created tensor of same shape as tensor created from slice
#153992 closed
May 29, 2025 -
Silent incorrectness between static torch.compile vs eager
#152425 closed
May 29, 2025 -
CUDA 12.9, Compile pytorch's static library error in windows 11.
#154604 closed
May 29, 2025 -
[Infra] Jobs got frequently cancelled, sometimes mid-checkout
#151669 closed
May 29, 2025 -
DISABLED test_torch_manual_seed_seeds_cuda_devices (__main__.TestCuda)
#135218 closed
May 28, 2025 -
DISABLED test_fused_sdp_choice_privateuseone (__main__.TestSDPAPrivateUse1Only)
#134600 closed
May 28, 2025 -
DISABLED test_serialization_array_with_storage (__main__.TestCuda)
#134991 closed
May 28, 2025 -
DISABLED test_allocator_fuzz (__main__.TestCudaMallocAsync)
#135249 closed
May 28, 2025 -
DISABLED test_streaming_backwards_multiple_streams (__main__.TestCuda)
#135065 closed
May 28, 2025 -
DISABLED test_max_autotune_remote_caching_dynamic_False (__main__.TestMaxAutotuneRemoteCache)
#145361 closed
May 28, 2025 -
DISABLED test_record_stream (__main__.TestCuda)
#134746 closed
May 28, 2025 -
DISABLED test_set_per_process_memory_fraction (__main__.TestCuda)
#135115 closed
May 28, 2025 -
DISABLED test_random_no_reused_random_states_float32 (__main__.TestCuda)
#134944 closed
May 28, 2025 -
DISABLED test_streaming_backwards_sync (__main__.TestCuda)
#103494 closed
May 28, 2025 -
DISABLED test_garbage_collect_expandable (__main__.TestCudaMallocAsync)
#134811 closed
May 28, 2025 -
DISABLED test_sum_fp16 (__main__.TestCuda)
#99953 closed
May 28, 2025 -
DISABLED test_inplace_gradgrad_remainder_cuda_float64 (__main__.TestBwdGradientsCUDA)
#100675 closed
May 28, 2025 -
DISABLED test_max_split_expandable (__main__.TestCudaMallocAsync)
#134837 closed
May 28, 2025 -
DISABLED test_memory_plots_free_segment_stack (__main__.TestCudaMallocAsync)
#137223 closed
May 28, 2025 -
DISABLED test_random_no_reused_random_states_float64 (__main__.TestCuda)
#134958 closed
May 28, 2025 -
DISABLED test_threading (__main__.TestWithNCCL)
#141637 closed
May 28, 2025 -
DISABLED test_randint_randomness_for_large_range (__main__.TestCuda)
#134932 closed
May 28, 2025 -
DISABLED test_serialization_array_with_empty (__main__.TestCuda)
#134966 closed
May 28, 2025 -
DISABLED test_prod_large (__main__.TestCuda)
#134723 closed
May 28, 2025 -
DISABLED test_streaming_backwards_callback (__main__.TestCuda)
#135024 closed
May 28, 2025 -
DISABLED test_streams (__main__.TestCuda)
#135114 closed
May 28, 2025 -
torch.fft.hfft2 produces inconsistent results for infinite values between CPU and GPU
#154520 closed
May 28, 2025 -
torch.fft.ifft2 produces inconsistent imaginary components for infinite inputs between CPU and GPU
#154521 closed
May 28, 2025 -
DISABLED test_promotes_int_to_float_ldexp_cuda_int16 (__main__.TestCommonCUDA)
#154550 closed
May 28, 2025 -
modded-nanogpt flaky NCCL hang starting 3/30 nightly
#152623 closed
May 28, 2025 -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_int32 (__main__.TestForeachCUDA)
#153464 closed
May 28, 2025 -
Set `size` when `is_coalesced` is set in `torch.sparse_coo_tensor()`
#145371 closed
May 28, 2025 -
DISABLED test_parity__foreach_add_fastpath_outplace_cuda_bfloat16 (__main__.TestForeachCUDA)
#153537 closed
May 28, 2025 -
DISABLED test_extra_cuda_context (__main__.ProcessGroupNCCLGroupTest)
#139011 closed
May 28, 2025 -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_uint8 (__main__.TestForeachCUDA)
#153525 closed
May 28, 2025 -
Torch compile cache
#144859 closed
May 28, 2025 -
Issue while Building the Documentation
#151901 closed
May 28, 2025 -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_int8 (__main__.TestForeachCUDA)
#153512 closed
May 28, 2025 -
Test Release Highlight Feature
#154462 closed
May 27, 2025 -
Torch compile issue, AttributeError: 'NoneType' object has no attribute 'store_cubin'
#150980 closed
May 27, 2025 -
Use of @property on in-graph constructed NJT fails Dynamo tracing
#146932 closed
May 27, 2025 -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_int64 (__main__.TestForeachCUDA)
#153482 closed
May 27, 2025 -
Issue with different output with different torch versions.
#154411 closed
May 27, 2025 -
[MPS] Memory leak in `nn.Linear`
#132332 closed
May 27, 2025 -
Model export to onnx, but got "RuntimeError: Expected all tensors to be on the same device"
#154093 closed
May 27, 2025 -
TorchVision upgrade blocked by inductor regressions
#153985 closed
May 27, 2025 -
running my facebook/bart-base for summarization task : MPS does not support cumsum op with int64 input
#141786 closed
May 27, 2025 -
`Tensor._make_wrapper_subclass` is not listed in `torch/_C/__init__.pyi`
#153790 closed
May 27, 2025 -
Torch not found
#133076 closed
May 27, 2025 -
DISABLED test_sdpa_rewriter_11_cuda (__main__.SDPAPatternRewriterCudaDynamicTests)
#148631 closed
May 27, 2025 -
Multiheadattention module doesn't implement the function about kdim and vdim
#95712 closed
May 27, 2025 -
torch.distributed error with nccl backend
#154342 closed
May 27, 2025 -
AOTInductor: Artifact compiled on A10 (SM_86) fails on H20 (SM_90) despite torch._inductor.config.cuda.arch="90"
#153697 closed
May 27, 2025 -
UNSTABLE pull / linux-jammy-py3-clang12-executorch / test (executorch)
#144480 closed
May 27, 2025 -
Whether the transposed tensor is contiguous affects the results of the subsequent Linear layer.
#148939 closed
May 27, 2025 -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_int16 (__main__.TestForeachCUDA)
#153440 closed
May 27, 2025 -
MaxUnpool2d raise RuntimeError: view size is not compatible with input tensor's size and stride
#154341 closed
May 27, 2025 -
module 'torch' has no attribute 'get_default_device'
#154362 closed
May 27, 2025 -
[distributions] Creating a second instance of `Wishart` modifies the constraints on the first instance.
#154355 closed
May 26, 2025 -
Refactor MegaCache to make it generic
#152976 closed
May 26, 2025 -
DISABLED test_item_to_inputs_kernel_nobreak_cuda (__main__.TestInductorDynamicCUDA)
#119538 closed
May 26, 2025 -
Inconsistent behavior and misleading error message for `torch.nanmean()` with complex dtypes
#153132 closed
May 26, 2025 -
DISABLED test_dtensor_seq_par_shard_dim_1 (__main__.MicroPipelineTPTest)
#153223 closed
May 26, 2025 -
DISABLED test_sdpa_rewriter_14_cuda (__main__.SDPAPatternRewriterCudaDynamicTests)
#147600 closed
May 26, 2025 -
Deterministic `index_put` on CUDA fails when broadcasting is required
#79987 closed
May 25, 2025 -
Does Pytorch have the function that can obtain sub-matrix according to index?
#49278 closed
May 25, 2025 -
index_put_ take min when there are repeated indices
#19197 closed
May 25, 2025 -
[bug] inconsistent behavior of indexing
#14227 closed
May 25, 2025 -
pinned_use_background_threads will cause a coredump
#152008 closed
May 25, 2025 -
Index out of bound when running torch.gather
#107540 closed
May 25, 2025 -
Tensor __getitem__ not documented, sparse grad?
#101068 closed
May 25, 2025 -
scaled_dot_product_attention(): argument 'is_causal' must be bool, not SymBool
#154038 closed
May 24, 2025 -
DISABLED test_mutation_rename (__main__.TestMaxAutotune)
#154218 closed
May 24, 2025 -
How can I use inductor aot_compile to support a MoE network?
#148747 closed
May 24, 2025 -
Query Regarding Memory Release API in AOTInductor for PyTorch
#153363 closed
May 24, 2025 -
Wrong result for modulo if the dividend and divisor are int when using mps
#154171 closed
May 24, 2025 -
Add the capability to export GradMultiply to ONNX
#73354 closed
May 24, 2025 -
Torchdynamo with onnxrt backend generating fake tensor errors
#93502 closed
May 24, 2025 -
autoformat failures blocking merge
#154084 closed
May 23, 2025 -
UT failure in test_decompose_mem_bound_mm.py for Inductor
#153585 closed
May 23, 2025 -
DISABLED test_find_or_create_pg (__main__.TestPgTag)
#107278 closed
May 23, 2025 -
Sparse CSR layout GPU backend tracking issue
#60854 closed
May 23, 2025 -
Problem installing from source on CentOS 6.5
#28444 closed
May 23, 2025 -
Problem when installing pytorch 1.4 from source on Centos 6.3
#28497 closed
May 23, 2025 -
torch.where behaves differently from in place replacement
#96110 closed
May 23, 2025 -
test
#154278 closed
May 23, 2025 -
Mysterious Tensor Indexing Problem
#22013 closed
May 23, 2025 -
CUDA failure with deterministic fancy indexed assignment with broadcasting
#131933 closed
May 23, 2025 -
Gather backward is faster than integer indexing on GPU
#15245 closed
May 23, 2025 -
xpu: torch.nn.functional.scaled_dot_product_attention produces NaN on XPU
#154051 closed
May 23, 2025 -
`RNNBase` modules break parameter sharing due to `flatten_parameters()`
#154238 closed
May 23, 2025 -
Documentation issue about torch.finfo(x.dtype).eps
#154184 closed
May 23, 2025 -
Bugs encountered when installing the torch corresponding to CUDA12.8
#154200 closed
May 23, 2025 -
build pytorch2.3.0 cpu with mkldnn_acl 24.08 failed on aarch64
#148841 closed
May 23, 2025 -
Assertion Failure: TestBinaryUfuncsCPU.test_lerp_cpu_complex64 on Graviton 3
#146155 closed
May 23, 2025 -
Assertion Failure: TestMkldnnCPU.test_matmul_lower_precision_cpu_float16 on Graviton 2 & 3
#146484 closed
May 23, 2025
373 Issues opened by 222 people
-
DISABLED test_get_parameter_dtype (__main__.ReproTests)
#156598 opened
Jun 23, 2025 -
DISABLED test_add_sub_alpha_out (__main__.ReproTests)
#156597 opened
Jun 23, 2025 -
Python 3.14 and dynamo fails to build
#156595 opened
Jun 23, 2025 -
DeviceMesh's `_set_mesh_dim_group_options` ineffective for 1-dim meshes
#156593 opened
Jun 23, 2025 -
Set dependencies lower bound
#156587 opened
Jun 23, 2025 -
DISABLED test_dont_dce_rand (__main__.ReproTests)
#156580 opened
Jun 23, 2025 -
DISABLED test_add_complex_conj (__main__.ReproTests)
#156579 opened
Jun 23, 2025 -
Tracing with aot_eager/torch.compile produces wrong strides on HPU Meta dispatch
#156578 opened
Jun 23, 2025 -
DISABLED test_basic_fn_backend_eager_device_cuda (__main__.TestPackage)
#156576 opened
Jun 23, 2025 -
DISABLED test_dont_aggressively_write_assert (__main__.ReproTests)
#156570 opened
Jun 23, 2025 -
DISABLED test_nccl_symmem_alloc (__main__.NCCLSymmetricMemoryTest)
#156569 opened
Jun 23, 2025 -
Segmentation fault (core dumped) in `torch.profiler.profile`
#156563 opened
Jun 22, 2025 -
Using collections.namedtuple in the forward method of the model resulted in compilation failure
#156558 opened
Jun 22, 2025 -
[JIT] TorchScript fails to compile model using random.choice() with confusing error message
#156557 opened
Jun 22, 2025 -
When using Apple MPS, the bias of BatchNorm1d becomes extremely large.
#156555 opened
Jun 22, 2025 -
ConvertTritonGPUToLLVM pass fails on fused GroupNorm backward (SM 89) under torch.compile(…, backend='inductor')
#156549 opened
Jun 21, 2025 -
Inconsistent Model Results and Failures on Windows with CUDA vs. CPU PyTorch Builds
#156547 opened
Jun 21, 2025 -
`TorchScript` does not allow accessing methods of nested tensors
#156544 opened
Jun 21, 2025 -
FSDP2 - Tensor incompatibility
#156535 opened
Jun 21, 2025 -
`<<` and `>>` operators seem silently broken for DTensor operand 1 and scalar operand 2
#156533 opened
Jun 21, 2025 -
[ONNX] Update tests for attention
#156524 opened
Jun 20, 2025 -
[Dtensor] handle dtensor ops that only need to operate on certain shard without all_gather first.
#156523 opened
Jun 20, 2025 -
[user empathy] compile for `transformers` model
#156520 opened
Jun 20, 2025 -
Convenient way to create device with torch.accelerator and a specific device index
#156519 opened
Jun 20, 2025 -
DISABLED test_comprehensive_nn_functional_linear_cuda_float16 (__main__.TestInductorOpInfoCUDA)
#156514 opened
Jun 20, 2025 -
Native BFloat16 Mixed BatchNorm Train gives incorrect gradients
#156513 opened
Jun 20, 2025 -
functorch_maml_omniglot is a bad CPU performance smoketest model
#156511 opened
Jun 20, 2025 -
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int32 (__main__.TestForeachCUDA)
#156497 opened
Jun 20, 2025 -
mypy.ini deprecation: numpy.typing.mypy_plugin
#156489 opened
Jun 20, 2025 -
Add stub for mypy-torch._C._jit_tree_views
#156488 opened
Jun 20, 2025 -
SDPA FLASH_ATTENTION backend gets NaN values for IPEX on Intel CPU
#156487 opened
Jun 20, 2025 -
`torch.compile` fails with `UnicodeDecodeError` when model contains extreme value injection
#156451 opened
Jun 19, 2025 -
Tensor.is_pinned() raises error after renaming privateuseone backend.
#156444 opened
Jun 19, 2025 -
DISABLED test_inlined_optimized_graph (__main__.TestTEFuserDynamic)
#156438 opened
Jun 19, 2025 -
Accuracy minifier fails to minify anything
#156437 opened
Jun 19, 2025 -
DISABLED test_skip_grad_in_check (__main__.TestTEFuserDynamic)
#156436 opened
Jun 19, 2025 -
Suggest to use the torch cmake target instead of ${TORCH_LIBRARIES} in the c++ docs
#156434 opened
Jun 19, 2025 -
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_int16 (__main__.TestForeachCUDA)
#156430 opened
Jun 19, 2025 -
[CI] "Update viable/strict" job occasionally hangs for days
#156425 opened
Jun 19, 2025 -
research adding cuda-bindings to core
#156424 opened
Jun 19, 2025 -
DISABLED test_fake_crossref_backward_no_amp_cholesky_solve_cuda_float32 (__main__.TestFakeTensorCUDA)
#156419 opened
Jun 19, 2025 -
ShardedTensor breaks cycle detection
#156417 opened
Jun 19, 2025 -
A possible bug in HistogramObserver._combine_histograms()
#156414 opened
Jun 19, 2025 -
bmm max-autotune segfaults on x86 cpu
#156412 opened
Jun 19, 2025 -
[compile] torch._dynamo.exc.TorchRuntimeError: Failed running call_function aten.lift_fresh_copy.default
#156411 opened
Jun 19, 2025 -
DataParallel gather NaN across multiple gpus
#156392 opened
Jun 19, 2025 -
[compile][transformers] Recompilation with mark_static_address with cudagraphs
#156377 opened
Jun 18, 2025 -
DISABLED test_inplace_on_view_undefined_grad_output_cpu (__main__.TestAutogradDeviceTypeCPU)
#156363 opened
Jun 18, 2025 -
DISABLED test_schedule_with_native_zero_bubble_ScheduleClass1 (__main__.ScheduleTest)
#156328 opened
Jun 18, 2025 -
UNSTABLE periodic / linux-jammy-rocm-py3.10 / test (distributed)
#156327 opened
Jun 18, 2025 -
Composition of nested `torch.compile` calls is not well defined
#156308 opened
Jun 18, 2025 -
DISABLED test_inplace_on_view_then_no_grad_cpu (__main__.TestAutogradDeviceTypeCPU)
#156306 opened
Jun 18, 2025 -
Inductor error with Torch XPU optimizations to StableDiffusion3 Pipeline
#156303 opened
Jun 18, 2025 -
DISABLED test_inplace_on_view_of_view_cpu (__main__.TestAutogradDeviceTypeCPU)
#156289 opened
Jun 18, 2025 -
DISABLED test_inplace_on_view_non_contig_cpu (__main__.TestAutogradDeviceTypeCPU)
#156265 opened
Jun 18, 2025 -
DISABLED test_shape_env (__main__.TestGuardSerialization)
#156264 opened
Jun 18, 2025 -
torch._foreach_copy_ causing CUDA illegal memory access.
#156261 opened
Jun 18, 2025 -
[ONNX] Create a tutorial for exporting hf transformers model
#156258 opened
Jun 18, 2025 -
[dynamic shapes] translation validation failure under `fake_tensor_propagate_real_tensors`
#156251 opened
Jun 17, 2025 -
DISABLED test_name_match (__main__.TestGuardSerialization)
#156246 opened
Jun 17, 2025 -
Upgrade torch._scaled_grouped_mm to SM100+
#156238 opened
Jun 17, 2025 -
[Tracker] AutoParallel's feature request to DTensor
#156217 opened
Jun 17, 2025 -
DISABLED test_inplace_on_view_makes_base_require_grad_cpu (__main__.TestAutogradDeviceTypeCPU)
#156209 opened
Jun 17, 2025 -
When compiling submodules, AOTInductor is significantly slower with torch.export
#156206 opened
Jun 17, 2025 -
Upgrade torch._grouped_mm to SM100+
#156202 opened
Jun 17, 2025 -
Dynamo does not know how to trace method `__len__` of class `<unknown type>` with torch.logging calls
#156191 opened
Jun 17, 2025 -
[CD] Windows Wheel builds CUDA 12.9.1 Stack Overflow during build
#156181 opened
Jun 17, 2025 -
DISABLED test_inplace_on_view_backprop_view_of_view_cpu (__main__.TestAutogradDeviceTypeCPU)
#156180 opened
Jun 17, 2025 -
nn.Module._load_from_state_dict is always called with strict=True
#156177 opened
Jun 17, 2025 -
[Feature Request]: Native C++ API for ONNX Export in LibTorch
#156168 opened
Jun 17, 2025 -
DISABLED test_inplace_on_view_backprop_view_cpu (__main__.TestAutogradDeviceTypeCPU)
#156163 opened
Jun 17, 2025 -
torch2.7.1 issue for torch.compile numpy
#156162 opened
Jun 17, 2025 -
[SDPA] RTX5080 is different from CPU calculation result in backward with long seq
#156160 opened
Jun 17, 2025 -
Error shm.dll
#156159 opened
Jun 17, 2025 -
Inconsistent `torch.rsqrt` results on complex128 between CPU and CUDA
#156152 opened
Jun 17, 2025 -
DISABLED test_inplace_on_view_backprop_base_cpu (__main__.TestAutogradDeviceTypeCPU)
#156143 opened
Jun 17, 2025 -
[dynamo, dynamic shapes] .item() on Tensor created in the compiled region fails
#156135 opened
Jun 16, 2025 -
[DDP][FSDP2] add unit test to showcase DDP mixed precision with FSDP2 mixed precision
#156130 opened
Jun 16, 2025 -
[dynamo] Show carets in graph break stack traces
#156127 opened
Jun 16, 2025 -
[compile][torchtune] Full model compiled Qwen3 is 4x slower than eager
#156103 opened
Jun 16, 2025 -
UNSTABLE rocm / linux-jammy-rocm-py3.10 / test (default)
#156098 opened
Jun 16, 2025 -
DISABLED test_quantize (__main__.TestOpenReg)
#156089 opened
Jun 16, 2025 -
DISABLED test_schedule_with_native_zero_bubble_ScheduleClass0 (__main__.ScheduleTest)
#156088 opened
Jun 16, 2025 -
Idea: Add SBOM Generation (and optional vuln scan) for better supply chain insight
#156085 opened
Jun 16, 2025 -
`torch.logsumexp`: support `dim=None`
#156075 opened
Jun 16, 2025 -
[codespell] fix typos in the codebase
#156073 opened
Jun 16, 2025 -
[typing][docs] `torch.amin` and `torch.amax` do not document `dim=None`
#156072 opened
Jun 16, 2025 -
Docs incorrectly claim `torch.max` and `torch.logsumexp` accept `dim=None`
#156071 opened
Jun 16, 2025 -
torch.nn.functional.conv_transpose3d return inconsistent results when weight containing inf between CPU and GPU
#156062 opened
Jun 16, 2025 -
Dynamo trace an incorrect result on torch._C._storage_Use_Count
#156059 opened
Jun 16, 2025 -
torch.equal causes fallback to eager mode in torch.compile
#156057 opened
Jun 16, 2025 -
`torch.distributed.tensor.parallel.style.ColwiseParallel` introduce huge guard eval latency
#156054 opened
Jun 16, 2025 -
Ability to set device guard in Python
#156052 opened
Jun 16, 2025 -
[Upstream Triton] persistent mm + tma accuracy failures
#156028 opened
Jun 15, 2025 -
Error installing torch for CUDA 12.1 on GTX 1660 Ti
#156024 opened
Jun 15, 2025 -
torch.fft.ifft for complex64 produces inconsistent results between CPU and CUDA
#156020 opened
Jun 15, 2025 -
[ROCm] BF16 Context Parallelism MI300X Not Numerically Accurate
#156012 opened
Jun 15, 2025 -
Deprecation notice of `torch.norm` and `Tensor.norm` across the documentation
#156005 opened
Jun 14, 2025 -
DCP save only saves one shard of tensor parallel model when using DP + TP
#156002 opened
Jun 14, 2025 -
ONNX export via Dynamo sets `dft_length = 1` in `DFT`, breaking shape-inference for `torch.fft.rfft`
#155997 opened
Jun 14, 2025 -
Keep getting AssertionError: found no DeviceMesh from dtensor args for c10d.broadcast_.default!
#155993 opened
Jun 14, 2025 -
[feature request] Native checkpointing to/from `s3://`
#155992 opened
Jun 14, 2025 -
Vulkan interoperability
#155986 opened
Jun 14, 2025 -
as_tensor of list of tensors should keep grad history
#155983 opened
Jun 14, 2025 -
Floating point exception (core dumped) in torch.nn.functional.fold
#155981 opened
Jun 14, 2025 -
Including XPU and CUDA in ProfilerActivity causes XPU profiling to be ignored
#155957 opened
Jun 13, 2025 -
DISABLED test_call_count_tunableop_cuda_float32 (__main__.TestLinalgCUDA)
#155953 opened
Jun 13, 2025 -
test vLLM with PyTorch 2.8rc before releasing PyTorch 2.8
#155933 opened
Jun 13, 2025 -
Activation Checkpointing breaks "torch.distributed.checkpoint.state_dict._get_fqns"
#155924 opened
Jun 13, 2025 -
UNSTABLE inductor-rocm / rocm-py3.10-inductor / test (inductor)
#155917 opened
Jun 13, 2025 -
[`torch.distributed`] Watchdog monitor error handling should match watchdog error handling
#155916 opened
Jun 13, 2025 -
DISABLED test_non_pow_2_headdim_head_dim_24_float16_cuda_float16 (__main__.TestFlexDecodingCUDA)
#155905 opened
Jun 13, 2025 -
test_flex_attention unit test failures
#155894 opened
Jun 13, 2025 -
DISABLED test_non_pow_2_headdim_head_dim_17_float16_cuda_float16 (__main__.TestFlexDecodingCUDA)
#155893 opened
Jun 13, 2025 -
[RFC] `DeviceGroup`: a mixin of `ProcessGroup` and `Backend`
#155892 opened
Jun 13, 2025 -
[CUDA][CUTLASS] test_cutlass_backend.py unit test failures on SM90+
#155888 opened
Jun 13, 2025 -
RFC: Add Adaptive Entropy-Gated Contrastive Fusion (AECF) for Robust Multimodal Attention Pooling
#155878 opened
Jun 13, 2025 -
Migrating existing backend-MAIA integration toward PrivateUse1 / openReg
#155864 opened
Jun 12, 2025 -
Export Huggingface models with StaticCache
#155862 opened
Jun 12, 2025 -
[user triton][dynamo] Fix TMA Descriptor reconstruction
#155856 opened
Jun 12, 2025 -
intmm is being compiled inconsistently with errors.
#155838 opened
Jun 12, 2025 -
Running dispatch modes on compile-disabled regions of a compiled model
#155825 opened
Jun 12, 2025 -
DISABLED test_non_pow_2_headdim_head_dim_121_float16_cuda_float16 (__main__.TestFlexDecodingCUDA)
#155824 opened
Jun 12, 2025 -
DISABLED test_compare_cpu_nn_functional_conv1d_cuda_float32 (__main__.TestCommonCUDA)
#155822 opened
Jun 12, 2025 -
DISABLED test_non_pow_2_headdim_head_dim_94_float16_cuda_float16 (__main__.TestFlexDecodingCUDA)
#155808 opened
Jun 12, 2025 -
torch.distributions.kl_divergence Fails with MultivariateNormal in Dynamo Due to _infer_size Type Error
#155800 opened
Jun 12, 2025 -
`nn.RNN(...).to('cuda')` fails with `cuDNN error: CUDNN_STATUS_BAD_PARAM` on GPU, but works on CPU
#155798 opened
Jun 12, 2025 -
[MPS] Performance regression and visual bug with ComfyUI Flux dev since nightly 20250510
#155797 opened
Jun 12, 2025 -
Poor scaling on AArch64 at high thread counts
#155795 opened
Jun 12, 2025 -
Compilation issues with ROCm 6.4.1 on Debian 12
#155794 opened
Jun 12, 2025 -
DISABLED test_ddp_apply_optim_in_backward_ignored_params (__main__.TestDistBackendWithSpawn)
#155751 opened
Jun 11, 2025 -
DISABLED test_ddp_apply_optim_in_backward_grad_as_bucket_view_false (__main__.TestDistBackendWithSpawn)
#155750 opened
Jun 11, 2025 -
Tensor functionalization loses .grad information
#155725 opened
Jun 11, 2025 -
[rocm] HIP Graph capture raises segmentation fault on AMD GPU but CUDA Graph capture succeeds on Nvidia GPU
#155720 opened
Jun 11, 2025 -
DISABLED test_ddp_apply_optim_in_backward (__main__.TestDistBackendWithSpawn)
#155714 opened
Jun 11, 2025 -
DISABLED test_allgather_stress_cuda (__main__.ProcessGroupGlooFRTest)
#155711 opened
Jun 11, 2025 -
[pt2][precompile] Support ID_MATCH guards and torch.nn.Modules
#155705 opened
Jun 11, 2025 -
[pt2] [precompile] Support cudagraphs in BundledAOTAutogradCacheArtifact
#155703 opened
Jun 11, 2025 -
torch typing: register_buffer -> device
#155693 opened
Jun 11, 2025 -
torch.compile produces incorrect output
#155690 opened
Jun 11, 2025 -
[FSDP] optimizer_state_dict fails if FSDP model is not on global rank 0
#155680 opened
Jun 11, 2025 -
Torch compile CUDA graphs leads to a large number of CUDA streams
#155679 opened
Jun 11, 2025 -
torch.clamp throws overflow error on CPU but not on CUDA
#155671 opened
Jun 11, 2025 -
[libtorch] Debug and Release should be shipped together on Windows
#155667 opened
Jun 11, 2025 -
[Misc] skip_if decorators aborts the test process in distributed/test_composability.py
#155664 opened
Jun 11, 2025 -
[DTensor] DTensor is not well supported on older versions of GPUs, such as A10
#155657 opened
Jun 11, 2025 -
Hangs on torch.tril on Intel GPU
#155651 opened
Jun 11, 2025 -
Functional all_gather_into_tensor does not support stacking, fails when compiled
#155632 opened
Jun 10, 2025 -
DISABLED test_trust_repo_check_yes (__main__.TestHub)
#155617 opened
Jun 10, 2025 -
Additional Documentation About Size Checking with Custom Operators
#155616 opened
Jun 10, 2025 -
CUDA 12.6->12.8 slow and periodic failures
#155607 opened
Jun 10, 2025 -
Question in aot_autograd trace in torch.distributed case
#155599 opened
Jun 10, 2025 -
[libtorch] Crash on creating torch::optim::Adam optimizer for Windows
#155597 opened
Jun 10, 2025 -
pipeline() fails when a sub-module uses "no_grad()"; impacts RoPE implementation on HF models
#155589 opened
Jun 10, 2025 -
nn.init.trunc_normal_ Creates Massive Outliers with Small std Due to erfinv Instability
#155588 opened
Jun 10, 2025 -
[Upstream Triton] Handle user-specified triton.set_allocator function
#155584 opened
Jun 10, 2025 -
[Upstream Triton] Support new host-side TMA API in user-defined triton kernels
#155574 opened
Jun 10, 2025 -
[Misc] test_foreach_add_different_mesh cannot work on machines with less than 4 GPUs
#155562 opened
Jun 10, 2025 -
[MPS] Migrate torch.sort to Metal shader
#155560 opened
Jun 10, 2025 -
Reproducibility of results without AVX512 by setting ATEN_CPU_CAPABILITY=avx2
#155552 opened
Jun 10, 2025 -
get_ema_multi_avg_fn() equation is a little confused
#155551 opened
Jun 10, 2025 -
[torch.export] Cannot export TorchVision raft_small, raft_large
#155550 opened
Jun 10, 2025 -
[Misc] The keys of the sac_milp return dict were modified, but the test case was not updated
#155547 opened
Jun 10, 2025 -
Clarify default value of eps in RMSNorm documentation
#155527 opened
Jun 10, 2025 -
[Tracking] Triton 3.2 deprecation
#155519 opened
Jun 10, 2025 -
DISABLED test_weight_norm_bwd_dynamic_shapes_cuda (__main__.DynamicShapesGPUTests)
#155517 opened
Jun 10, 2025 -
DISABLED test_weight_norm_bwd_dynamic_shapes_cuda (__main__.DynamicShapesCodegenGPUTests)
#155516 opened
Jun 10, 2025 -
Unable to find caffe2 when building libtorch from the source and then trying to use it in a cmake project
#155512 opened
Jun 10, 2025 -
[cutlass backend] Arch in manifest is not handled correctly if we want to GenerateSM90 on Blackwell
#155511 opened
Jun 10, 2025 -
PP activation offloading
#155490 opened
Jun 9, 2025 -
sizevars.size_hint is an unbacked shapes footgun
#155484 opened
Jun 9, 2025 -
[RFC][c10d] Make it easier to act on group
#155472 opened
Jun 9, 2025 -
in partitioner avoid doing replacements on inputs when determining which symnodes to pass to backward.
#155468 opened
Jun 9, 2025 -
torch.distributed TCP init method binds socket to all interfaces in all cases
#155467 opened
Jun 9, 2025 -
AssertionError: found no DeviceMesh from dtensor args for c10d.broadcast_.default!
#155463 opened
Jun 9, 2025 -
High-performance LLM quantization on X86 CPU with native PyTorch
#155435 opened
Jun 9, 2025 -
Add SM100/B200 support for torch._grouped_mm
#155434 opened
Jun 9, 2025 -
[pt2] [Precompile] Store parameters in BundledAOTAutogradCacheEntry
#155433 opened
Jun 9, 2025 -
Backward fails with compiled attention on nested tensors
#155421 opened
Jun 8, 2025 -
record_stream + cudagraph + multiple streams leaks memory.
#155398 opened
Jun 7, 2025 -
crash in torch.histc
#155393 opened
Jun 7, 2025 -
OpOverloads should have annotations
#155386 opened
Jun 7, 2025 -
[custom_op] Custom ops created by @custom_op should get type hints propagated to their OpOverload
#155385 opened
Jun 7, 2025 -
RuntimeError: Could not find libnvrtc.so. Please make sure CUDA is installed.
#155378 opened
Jun 6, 2025 -
Change error to warning when tensor size mismatch while loading params
#155368 opened
Jun 6, 2025 -
[Graph Partition] use pinned memory and foreach when moving cpu scalar tensor to gpu
#155360 opened
Jun 6, 2025 -
[dynamo] Fixes to lru_cache Dynamo warning
#155352 opened
Jun 6, 2025 -
CUDA_HOME doesn't seem to work with setup script
#155350 opened
Jun 6, 2025 -
PyTorch CPP Extensions fail when same kernel is compiled more than once on ROCm servers
#155344 opened
Jun 6, 2025 -
[A100][cusparselt] Non-deterministic correctness issue with some algorithms
#155333 opened
Jun 6, 2025 -
compiler cache does not work
#155332 opened
Jun 6, 2025 -
Can't use torch.compile inside of a torch_dispatch mode
#155331 opened
Jun 6, 2025 -
IntraKernel Dispatcher
#155330 opened
Jun 6, 2025 -
Excessively restrictive dependencies
#155325 opened
Jun 6, 2025 -
Segmentation fault after manual tensor assignment with autograd enabled on CUDA
#155322 opened
Jun 6, 2025 -
mishandling torch.package.PackageExporter raises Aborted when given a tensor instead of an importer
#155321 opened
Jun 6, 2025 -
`torch.export.export()` fails on GPU with LSTM model: "Cannot access data pointer of Tensor"
#155309 opened
Jun 6, 2025 -
[MPS] 5D 4Mln rng attempts fail with internal error
#155293 opened
Jun 6, 2025 -
nn.NLLLoss Fails with 1D Inputs in Compiled Mode
#155247 opened
Jun 5, 2025 -
Dynamo Fails on torch_scatter.scatter_max with Fake Tensor Allocation Error During Graph Tracing
#155240 opened
Jun 5, 2025 -
Dynamo Fails on pad_packed_sequence Due to Fake Tensor Allocation Issue During FX Graph Tracing
#155238 opened
Jun 5, 2025 -
DISABLED test_Transformer_multilayer_coder_cuda_tf32 (__main__.TestNN)
#155235 opened
Jun 5, 2025 -
[scan] scan is broken in nightly
#155230 opened
Jun 5, 2025 -
canUse32BitIndexMath set to False with efficient net
#155225 opened
Jun 5, 2025 -
[FSDP2] set_reduce_scatter_divide_factor errors with non-trivial MixedPrecisionPolicy
#155223 opened
Jun 5, 2025 -
KeyError when using fx.split_module
#155220 opened
Jun 5, 2025 -
3D Conv Slow with Large Tensors (> 2**31) Elements
#155218 opened
Jun 5, 2025 -
DISABLED test_TransformerEncoderLayer_gelu_activation_cuda_tf32 (__main__.TestNN)
#155217 opened
Jun 5, 2025 -
DISABLED test_Linear_cuda_tf32 (__main__.TestNN)
#155216 opened
Jun 5, 2025 -
Profiler: Add hide metadata flag to skip events in key_averages() table
#155213 opened
Jun 5, 2025 -
torch.compile bug when using resize
#155209 opened
Jun 5, 2025 -
torch.compile failure in `all_to_all_single_grad` with dynamic splits
#155205 opened
Jun 5, 2025 -
support for cuDNN 9.8+
#155203 opened
Jun 5, 2025 -
Importing xgboost before torch + openmp causes seg fault
#155201 opened
Jun 5, 2025 -
make html failing while building docs
#155199 opened
Jun 5, 2025 -
Enable CUDA 12.9 binaries
#155196 opened
Jun 5, 2025 -
Should be ReLU6(Module)
#155193 opened
Jun 5, 2025 -
Graph break when modifying a list that contains symints.
#155174 opened
Jun 5, 2025 -
[AC] torch.utils.checkpoint.CheckpointError from HF qwen2
#155171 opened
Jun 4, 2025 -
[dynamic shapes] data-dependent error with conv1d
#155162 opened
Jun 4, 2025 -
Multi-worker dataloader for `IterableDataset` holds open process on macOS Python 3.12
#155157 opened
Jun 4, 2025 -
torch.profiler raises Aborted (core dumped) failure related to GIL (gilstate_tss_set)
#155147 opened
Jun 4, 2025 -
Segmentation fault when using torch.profiler
#155146 opened
Jun 4, 2025 -
[RFC] Experimental Wheel Variant Support
#155141 opened
Jun 4, 2025 -
some_tensor.to("cpu", non_blocking=True) becomes sync under PT2 while async in eager mode
#155121 opened
Jun 4, 2025 -
After `torch.export.export`, my model inference results in FakeTensor.
#155114 opened
Jun 4, 2025 -
Flex Attention and Nested Tensor: very high VRAM usage
#155065 opened
Jun 3, 2025 -
MPS Memory Leak
#155060 opened
Jun 3, 2025 -
Segfault after clearing Dynamo Cache
#155057 opened
Jun 3, 2025 -
[Upstream Triton] AOTI support w/ new TMA API
#155047 opened
Jun 3, 2025 -
ROCm: torch.cholesky_inverse raises Memory access fault for large tensor shapes
#155046 opened
Jun 3, 2025 -
Convert to markdown: storage.rst, tensor_attributes.rst, tensor_view.rst, tensorboard.rst, tensors.rst
#155034 opened
Jun 3, 2025 -
Convert to markdown: nn.attention.rst, nn.functional.rst, nn.init.rst, nn.rst, onnx_dynamo_memory_usage.rst
#155029 opened
Jun 3, 2025 -
Convert to markdown: jit_python_reference.rst, jit_unsupported.rst, jit_utils.rst, jit.rst, library.rst
#155024 opened
Jun 3, 2025 -
Convert to markdown: bottleneck.rst, checkpoint.rst, complex_numbers.rst, cond.rst, config_mod.rst
#155014 opened
Jun 3, 2025 -
Please support libtorch for XPU
#155011 opened
Jun 3, 2025 -
Internal Assertion Failure with Invalid Arguments in max_unpool1d under TorchScript
#155009 opened
Jun 3, 2025 -
Internal Assertion Failure with Invalid Arguments in max_pool1d under TorchScript
#155007 opened
Jun 3, 2025 -
torch.compile triton kernel errors when there are """ docblocks
#155006 opened
Jun 3, 2025 -
Internal Assertion Failure with Invalid Arguments in max_pool2d under TorchScript
#155004 opened
Jun 3, 2025 -
Internal Assertion Failure with Invalid Arguments in max_unpool2d under TorchScript
#155003 opened
Jun 3, 2025 -
The profiler does not seem to be able to record cuda runtime nodes
#155001 opened
Jun 3, 2025 -
[Inductor] Float division inside tl.load in codegen results in TypeError('unexpected type fp32')
#154996 opened
Jun 3, 2025 -
[FSDP2] Slower Convergence with fully_shard() Compared to DDP during Qwen2-VL Fine-Tuning
#154984 opened
Jun 3, 2025 -
torch.linalg.vector_norm fails during torch.compile due to mismatch expected out dtype tensor
#154982 opened
Jun 3, 2025 -
INTERNAL ASSERT FAILED in mse_loss when mixing CPU and CUDA tensors
#154978 opened
Jun 3, 2025 -
[FSDP2] all_gather_copy_in for cpu offload
#154960 opened
Jun 3, 2025 -
Invalid stride of output when use torch.cond
#154949 opened
Jun 3, 2025 -
[Inductor] why `TORCHINDUCTOR_MAX_AUTOTUNE_GEMM_BACKENDS=TRITON` does NOT work?
#154947 opened
Jun 3, 2025 -
[Tracker] Support flash attention fa3 ABI stable w/ libtorch
#154908 opened
Jun 2, 2025 -
load_state_dict with strict should be able to remove_duplicate?
#154906 opened
Jun 2, 2025 -
MPS does not support addmm for non-float input
#154901 opened
Jun 2, 2025 -
MPS topk failure for 5D tensor or above
#154890 opened
Jun 2, 2025 -
MPS batch_norm mixed dtype failure
#154887 opened
Jun 2, 2025 -
MPS max_pool2d_with_indices failure: destination values and indices length mismatch along axis
#154882 opened
Jun 2, 2025 -
NVLS algorithms are not disabled in PGNCCL in pytorch deterministic mode
#154880 opened
Jun 2, 2025 -
documentation of Adafactor does not match the implementation
#154862 opened
Jun 2, 2025 -
Add node.meta["stack_trace"] to make_fx
#154853 opened
Jun 2, 2025 -
Improve warning when specializations happen due to operator
#154851 opened
Jun 2, 2025 -
Torchrun should handle SIGUSR1 and SIGUSR2
#154849 opened
Jun 2, 2025 -
don't require recompiles when switching between torch.Tensor vs AsyncCollectiveTensor graph inputs
#154847 opened
Jun 2, 2025 -
[FSDP2] fix unit test test_all_gather_extension_outer_size_stride
#154836 opened
Jun 2, 2025 -
torch inductor cudagraph tree could incorrectly release some input nodes during replay
#154824 opened
Jun 1, 2025 -
In-place operations are reordered across the forward-backward in autograd function
#154820 opened
Jun 1, 2025 -
FP8 scaled mm lowering ignores scale_result argument
#154807 opened
May 31, 2025 -
[discussion] Specialized frontend method for computing Gram / covariance matrix
#154791 opened
May 31, 2025 -
Redistribute DTensor across different DeviceMeshes
#154787 opened
May 31, 2025 -
Turn gradient_accumulate into a separate aten op
#154767 opened
May 30, 2025 -
Inductor codegens invalid fp8 elementwise mul op
#154750 opened
May 30, 2025 -
torch.export seems to emit invalid code for Tensor.split when used with meta device
#154721 opened
May 30, 2025 -
DISABLED test_inductor_multiple_specializations_dynamic_shapes_cuda (__main__.DynamicShapesCodegenGPUTests)
#154718 opened
May 30, 2025 -
ONNX Exporter Support for aten::cartesian_prod
#154714 opened
May 30, 2025 -
DISABLED test_inductor_multiple_specializations_dynamic_shapes_cuda (__main__.DynamicShapesGPUTests)
#154710 opened
May 30, 2025 -
DISABLED test_inductor_multiple_specializations_cuda (__main__.GPUTests)
#154705 opened
May 30, 2025 -
[Inductor] Output discrepancy between Inductor and eager of mean with input of a large size tensor
#154703 opened
May 30, 2025 -
[FSDP2] offer public API to share communication context across fsdp roots
#154657 opened
May 29, 2025 -
Recreation of unbacked symint leads to "possible memo disaster" when running decompositions
#154647 opened
May 29, 2025 -
functional all_gather does not work with scalar tensor and fail with torch.compile
#154621 opened
May 29, 2025 -
NotImplementedError: argument of type: <class 'torch._C._TensorMeta'>
#154614 opened
May 29, 2025 -
`torch.jit.script` doesn't accept `axis` as alias for `dim`
#154613 opened
May 29, 2025 -
Internal Assertion Failure with torch.cuda.Stream in @script_method on CPU-only Systems
#154607 opened
May 29, 2025 -
`torch.cumprod` for complex128 produces inconsistent results between CPU and CUDA
#154606 opened
May 29, 2025 -
Cudnn attention is very slow when sequence length changed in every step
#154602 opened
May 29, 2025 -
[inductor][cpu]functorch_dp_cifar10 and opacus_cifar10 performance regression in 2025-05-24 nightly release
#154598 opened
May 29, 2025 -
inductor benchmark_compiled_module's random initialization will cause crash for data dependent indexing
#154592 opened
May 29, 2025 -
DISABLED test_profiler_remote_cuda (__main__.TensorPipeCudaRpcTest)
#154587 opened
May 29, 2025 -
DISABLED test_zero_bubble_with_model_kwargs_ScheduleClass1 (__main__.ScheduleTest)
#154579 opened
May 29, 2025 -
Buggy test in test/export/test_export.py
#154574 opened
May 28, 2025 -
DISABLED test_mempool_with_allocator (__main__.TestMemPool)
#154566 opened
May 28, 2025 -
Distributed Breakpoint doesn't exit safely
#154563 opened
May 28, 2025 -
Enable GB200 for dynamo/torchbench tests
#154560 opened
May 28, 2025 -
[cond] support for unbacked symbols in compiled region
#154559 opened
May 28, 2025 -
grouped_mm optional zero initialization of the output
#154557 opened
May 28, 2025 -
DISABLED test_zero_bubble_with_model_kwargs_ScheduleClass0 (__main__.ScheduleTest)
#154547 opened
May 28, 2025 -
[AOTI] view isn't guarded for output of a custom op
#154537 opened
May 28, 2025 -
`typing.get_type_hints` fails on TorchScript model after loading with `torch.jit.load`
#154502 opened
May 28, 2025 -
torch.fft.irfft(n) doesn’t handle non-Hermitian inputs in a consistent way
#154496 opened
May 28, 2025 -
INTERNAL ASSERTION ERROR using device='mkldnn' despite deprecation
#154491 opened
May 28, 2025 -
[wishlist item] Assume data-dependent info based on tensor construction
#154489 opened
May 28, 2025 -
`__jit_ignored_attributes__` is not respected in `torch.jit.trace_module`
#154478 opened
May 28, 2025 -
Multiple torch.fft APIs produce inconsistent results for infinite inputs between CPU and GPU
#154474 opened
May 28, 2025 -
Mega Cache/ Torchcompile cache Not working
#154463 opened
May 27, 2025 -
torch.compile tracing doesn't seem to be using the cache
#154456 opened
May 27, 2025 -
[Dynamo] Confusing re-raise exception handling graph break message
#154454 opened
May 27, 2025 -
16KB pagination support for PyTorch Torch Vision and PyTorch Lite
#154449 opened
May 27, 2025 -
[RFC] Removing ideep git submodule dependency from PyTorch & add oneDNN git submodule
#154444 opened
May 27, 2025 -
Inconsistent sin(x) output between CPU and CUDA for very large arguments
#154428 opened
May 27, 2025 -
`Floating point exception` in `torch.onnx.export` with PixelShuffle
#154425 opened
May 27, 2025 -
`Segmentation fault` in `torch.matmul` and `torch.sparse.addmm`
#154424 opened
May 27, 2025 -
`Segmentation fault` in `torch.jit.ignore`
#154423 opened
May 27, 2025 -
`Segmentation fault` in `torch.fx.experimental.partitioner_utils.map_arg`
#154422 opened
May 27, 2025 -
`torch.fmod` and `torch.remainder` crash in `Inductor`
#154420 opened
May 27, 2025 -
Crash in `torch.sparse.softmax`
#154419 opened
May 27, 2025 -
[JIT] torch.jit.script raises an exception with view(dtype)
#154407 opened
May 27, 2025 -
Dynamo cannot trace into wrap_triton
#154365 opened
May 26, 2025 -
`scaled_dot_product_attention` broadcasting (GQA) is a memory footgun
#154363 opened
May 26, 2025 -
Q.size(-1) == m INTERNAL ASSERT FAILED at "/pytorch/aten/src/ATen/native/BatchLinearAlgebra.cpp
#154356 opened
May 26, 2025 -
integer convolution
#154354 opened
May 26, 2025 -
bfloat16 Conv2d slower than float16 on 4090
#154351 opened
May 26, 2025 -
[RFC] : Remove Explicit Backend References from `torch.distributed` (`c10d`)
#154345 opened
May 26, 2025 -
Support jvp for flex attention
#154332 opened
May 25, 2025 -
MPS Memory Leak
#154329 opened
May 25, 2025 -
`torch.distributed.checkpoint.state_dict.get_model_state_dict` does not update the state_dict._metadata key
#154327 opened
May 25, 2025 -
torch.jit.script gives false results for autograd if complex data types are involved
#154324 opened
May 25, 2025 -
float16 → float32 conversion yields unexpected zero matrix for matrices > 43000 × 43000 in MPS
#154322 opened
May 25, 2025 -
Replacement for `export_modules_as_functions`
#154319 opened
May 25, 2025 -
[BUG] DataLoader low GPU utilization and extremely slow compared to manual batching
#154318 opened
May 25, 2025 -
torch.logdet produces incorrect results for singular matrices on CUDA vs CPU
#154312 opened
May 24, 2025 -
torch.inf.to(torch.int32) produces different values on CPU vs CUDA
#154311 opened
May 24, 2025 -
graph recording observed an input tensor deallocate during graph recording that did not occur during replay
#154306 opened
May 24, 2025 -
Inconsistent output of `torch.func.jvp` calculation
#154302 opened
May 24, 2025 -
'max-autotune' much slower than 'default' mode (run fused add_mul_activation kernel)
#154301 opened
May 24, 2025 -
ImportError: libcudnn.so.9: cannot open shared object file: No such file or directory
#154299 opened
May 24, 2025 -
Hangs and timeouts on dist.reduce_scatter on B200 GPU
#154297 opened
May 24, 2025 -
Using `compile` on `hessian(hessian)`
#154284 opened
May 23, 2025 -
LazyGraphModule causes graph breaks
#154282 opened
May 23, 2025 -
[FSDP2] for mixed precision, input casting can get blocked when cuda streams are full
#154272 opened
May 23, 2025 -
torch dynamo fails on GH200 with world size 5
#154266 opened
May 23, 2025 -
max_pool2d padding assert incorrect with dilation
#154262 opened
May 23, 2025 -
Pypi Support for Windows arm64
#154260 opened
May 23, 2025 -
torch.compile regression blocks torchvision/torchbench pin upgrade
#154259 opened
May 23, 2025 -
[RFC] Cuda support matrix for Release 2.8
#154257 opened
May 23, 2025 -
`RNNBase` modules break parameter sharing due to `flatten_parameters()`
#154241 opened
May 23, 2025 -
torch._check on 3 unbacked symints aren't resolving ddes
#154240 opened
May 23, 2025 -
MPS backend fails to detect nn.Embedding out-of-range error
#154235 opened
May 23, 2025 -
NotImplementedError when computing JVP of Attention
#154226 opened
May 23, 2025 -
AssertionError: AssertionError not raised ```test_mutable_custom_op_fixed_layout2```
#154219 opened
May 23, 2025 -
AssertionError: Scalars are not equal! ```test_require_stride_expanded_dynamic_shapes_cuda```
#154214 opened
May 23, 2025 -
Triton pin update for PyTorch 2.8 / Triton 3.4
#154206 opened
May 23, 2025
864 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Full static typing for `torch.distributions`
#144219 commented on
Jun 12, 2025 • 71 new comments -
[Set] Support sets in VariableBuilder
#153150 commented on
Jun 21, 2025 • 46 new comments -
Fused RMSNorm implementation
#153666 commented on
Jun 21, 2025 • 29 new comments -
[DLPack] Add support for missing keyword-arguments.
#150218 commented on
Jun 21, 2025 • 18 new comments -
Inductor logging + analysis of torch.profile
#149697 commented on
Jun 10, 2025 • 17 new comments -
[cp] dispatch flex_attention to CP impl in TorchDispatchMode
#151497 commented on
Jun 5, 2025 • 12 new comments -
Upgrade to DLPack 1.0.
#145000 commented on
Jun 23, 2025 • 10 new comments -
[list] Implement list.count
#153969 commented on
Jun 21, 2025 • 9 new comments -
Adding view and reduction tags
#153342 commented on
Jun 4, 2025 • 9 new comments -
[Inductor] auto-chunker
#136702 commented on
Jun 17, 2025 • 9 new comments -
[export] add serialized_artifact test
#152739 commented on
Jun 16, 2025 • 8 new comments -
autograd: Add VJP and JVP rules for aten::aminmax
#151186 commented on
Jun 22, 2025 • 8 new comments -
Fix full_like decomposition to preserve strides
#144765 commented on
Jun 20, 2025 • 8 new comments -
Enable the AMP precision with freezing for CPU nightly test
#152298 commented on
Jun 20, 2025 • 8 new comments -
[cp] dispatch flex_attention_backward to CP impl in TorchDispatchMode
#152311 commented on
Jun 3, 2025 • 7 new comments -
Build libgomp (gcc-11) from src on AArch64
#152361 commented on
Jun 22, 2025 • 7 new comments -
add fp8 scaled_mm for XPU
#140972 commented on
Jun 5, 2025 • 6 new comments -
Add unified memory APIs for torch.accelerator
#152932 commented on
Jun 23, 2025 • 6 new comments -
Fix clang-tidy bugprone* warnings
#148529 commented on
Jun 23, 2025 • 6 new comments -
[BE]: Update cudnn to 9.9 for cu128
#152782 commented on
Jun 4, 2025 • 6 new comments -
Upgrade MKL in CI
#154198 commented on
Jun 13, 2025 • 6 new comments -
[FrozenSet] Fixes for FrozenSet
#152991 commented on
Jun 21, 2025 • 5 new comments -
[dynamic shapes] unbacked safe conv1d
#154089 commented on
May 29, 2025 • 4 new comments -
docs: fix dead link in torch.compile docs
#152734 commented on
Jun 17, 2025 • 4 new comments -
Update OpenBLAS commit
#151547 commented on
Jun 20, 2025 • 4 new comments -
New Sampler: DistributedWeightedRandomSampler
#150182 commented on
Jun 21, 2025 • 4 new comments -
Deprecated pkg_resources and use distributions instead
#151915 commented on
Jun 22, 2025 • 4 new comments -
Add CPython string tests
#150793 commented on
Jun 17, 2025 • 4 new comments -
[HOP, map] Rework of map autograd to the new interface
#153343 commented on
Jun 11, 2025 • 4 new comments -
Fix DLPack stream logic.
#150217 commented on
May 30, 2025 • 3 new comments -
ci: Add sccache to manylinux images
#148419 commented on
May 31, 2025 • 3 new comments -
[MPS] Implement max_pool3d_with_indices
#154145 commented on
May 25, 2025 • 3 new comments -
[TESTING] Triton pin (Jun 13) 5389ed797016010543ef1c7b88efc50f7521cb4e
#153117 commented on
Jun 14, 2025 • 3 new comments -
NUMA Binding Integration with torchrun
#149334 commented on
Jun 16, 2025 • 3 new comments -
removed zero dim cpu logic from fake_tensor.py
#147501 commented on
Jun 5, 2025 • 2 new comments -
xpu: support custom ops with torch.library on xpu backend
#152879 commented on
Jun 10, 2025 • 2 new comments -
[inductor] Add typing to _inductor/ir.py
#149958 commented on
Jun 19, 2025 • 2 new comments -
Fix clang-tidy warnings of performance from uncovered files
#144542 commented on
Jun 19, 2025 • 2 new comments -
[ONNX] Don't link to third-party protobuf
#153920 commented on
Jun 20, 2025 • 2 new comments -
Make open device registration tests standalone
#153855 commented on
Jun 10, 2025 • 2 new comments -
Deprecate DataLoader pin_memory_device param
#146821 commented on
Jun 23, 2025 • 2 new comments -
[3/n][Optimus][Auto-AC] Support float8_e4m3fn quantization type and set scaling as the default
#153802 commented on
Jun 4, 2025 • 2 new comments -
Adjust CMake code for Eigen
#148628 commented on
Jun 23, 2025 • 2 new comments -
[PG/nccl] improvements to eager init
#154132 commented on
Jun 22, 2025 • 1 new comment -
[cuDNN][SDPA] cuDNN SDPA refactor/cleanup, nested tensor backward, test priority bump for `sm90`, `sm100`
#149282 commented on
Jun 17, 2025 • 1 new comment -
[jit] DeadCodeEliminator Mark(block) improvement
#152348 commented on
Jun 11, 2025 • 1 new comment -
Raise `BufferError` for DLPack buffer-related errors.
#150691 commented on
Jun 20, 2025 • 1 new comment -
[c10d] Use non-poisoning `current_accelerator()`
#154159 commented on
May 29, 2025 • 1 new comment -
[pytorch_146643] fixed max triton generation
#154056 commented on
May 28, 2025 • 1 new comment -
[CUDA] Allow cuDNN or flash attn in `test_activation_checkpointing` pattern match check
#153272 commented on
Jun 16, 2025 • 1 new comment -
[WIP][XPU] Update Triton commit
#153096 commented on
Jun 7, 2025 • 1 new comment -
[invoke_subgraph] Force the output stride to be same as eager
#152806 commented on
Jun 6, 2025 • 1 new comment -
Refactor CUDAAllocatorConfig to reuse AllocatorConfig
#150312 commented on
Jun 18, 2025 • 1 new comment -
[dict] Raise TypeError in dict methods
#154003 commented on
Jun 12, 2025 • 1 new comment -
Make `Adam`, `AdamW` work with nonzero-dim Tensor betas
#149939 commented on
Jun 2, 2025 • 1 new comment -
[reland][ROCm] remove caffe2 from hipify
#151845 commented on
Jun 19, 2025 • 1 new comment -
[nn.utils] scale_grad_ with for_each
#150033 commented on
Jun 21, 2025 • 1 new comment -
Add inductor backend to device interface; make minifier_tests more device agnostic
#151314 commented on
Jun 20, 2025 • 1 new comment -
[BE] Replace `std::runtime_error` with `TORCH_CHECK` [2/N]
#152080 commented on
Jun 10, 2025 • 1 new comment -
[CUTLASS][WIP] Gate rowwise matmul CUTLASS kernels by compute capability
#152642 commented on
Jun 10, 2025 • 1 new comment -
Random Batch Sampler Speedup
#147706 commented on
May 30, 2025 • 1 new comment -
[pytorch][triton] Enabling TMA for flex-attention for supported device types
#153662 commented on
Jun 12, 2025 • 1 new comment -
Updates contextlib with ParamSpec
#153623 commented on
Jun 7, 2025 • 1 new comment -
ROCm OCP Micro-scaling Format (mx-fp8/mx-fp4) Support
#151360 commented on
Jun 23, 2025 • 1 new comment -
Enable lazy cloning in `Tensor.to` between CPU and MPS
#150569 commented on
Jun 4, 2025 • 1 new comment -
[ROCm] update state check for test_trace_while_active*
#153545 commented on
Jun 19, 2025 • 1 new comment -
Add is_pinned to host allocator
#151439 commented on
Jun 11, 2025 • 1 new comment -
Work around MPSGraph issue in backward pass of nn.ReplicationPad1d/2d
#152094 commented on
Jun 12, 2025 • 1 new comment -
[DO NOT MERGE] Enable TMA persistent GEMM Template by default
#149427 commented on
May 30, 2025 • 0 new comments -
Add x86-simd-sort accelerated sorting
#149362 commented on
May 29, 2025 • 0 new comments -
Use mypy 1.15
#149426 commented on
Jun 20, 2025 • 0 new comments -
fix differentiable collectives under inference mode
#149411 commented on
Jun 8, 2025 • 0 new comments -
Fix B018 Useless Expressions in Multiple Files (#106571)
#149408 commented on
Jun 17, 2025 • 0 new comments -
Fix `SequentialLR` deprecate warning about invoke `step(epoch)`
#149392 commented on
Jun 3, 2025 • 0 new comments -
Batch Sampler Speedup
#149441 commented on
Jun 12, 2025 • 0 new comments -
[ROCm] Enable max_autotune run on inductor perf dashboard
#148672 commented on
May 24, 2025 • 0 new comments -
Remove shebang line from easy_install generated python scripts on Windows only
#148673 commented on
May 27, 2025 • 0 new comments -
Set /NODEFAULTLIB:vcomp for MSVC when linking caffe2::mkl with libiomp5md.lib
#148674 commented on
Jun 1, 2025 • 0 new comments -
Trunk workflow for Windows Arm64
#148753 commented on
Jun 17, 2025 • 0 new comments -
Re-introduce -Wmaybe-uninitialized
#148760 commented on
Jun 10, 2025 • 0 new comments -
remove guard_size_oblivious from unbind.
#148815 commented on
Jun 21, 2025 • 0 new comments -
Implement derivatives for nextafter operation
#148820 commented on
May 24, 2025 • 0 new comments -
Make `torch._check` support bool tensor as `cond` param
#148871 commented on
May 31, 2025 • 0 new comments -
[DRAFT] make reshape work for reshapeing 1dim unbacked non-contig to anything
#148899 commented on
Jun 12, 2025 • 0 new comments -
[testing only] Update torch.utils.checkpoint to stash and restore TLS state
#148919 commented on
Jun 8, 2025 • 0 new comments -
Support int step for nonfused optimizer
#148956 commented on
Jun 9, 2025 • 0 new comments -
[test] test for keep going
#149003 commented on
Jun 13, 2025 • 0 new comments -
Fix issue #149006: Added docstring for backward()
#149011 commented on
May 31, 2025 • 0 new comments -
[FlexAttention] Allow caching of backwards func
#149069 commented on
Jun 5, 2025 • 0 new comments -
Support return values in generators
#149108 commented on
Jun 2, 2025 • 0 new comments -
[Intel GPU] Allow XPU backend in Depthwise_conv2d&3d operators
#149114 commented on
Jun 21, 2025 • 0 new comments -
Update the heuristic for AArch64 bmm/baddbmm
#149122 commented on
Jun 12, 2025 • 0 new comments -
add keepdim to cosine similarity
#149134 commented on
Jun 11, 2025 • 0 new comments -
PaddedTensor Init
#149140 commented on
Jun 18, 2025 • 0 new comments -
[prototype] in memory checkpoint example
#149202 commented on
Jun 13, 2025 • 0 new comments -
Fix mps scaled dot attention
#149268 commented on
Jun 6, 2025 • 0 new comments -
Fix unexpected keyword argument 'mode' when calling `CompileCounterWithBackend`
#149271 commented on
May 28, 2025 • 0 new comments -
Refactoring Distributed test cases to be device agnostic [2/n]
#149317 commented on
Jun 6, 2025 • 0 new comments -
[WIP] rewrite pad_nd with guard_or_false
#149998 commented on
May 25, 2025 • 0 new comments -
Add min/max support in export
#150003 commented on
Jun 11, 2025 • 0 new comments -
[DRAFT] PR to regenerate docker images for nccl release
#150029 commented on
May 25, 2025 • 0 new comments -
[draft][FSDP2] Reorder FSDP2 pre_forward
#150044 commented on
May 27, 2025 • 0 new comments -
rework test_mem_get_info for single gpu case
#150065 commented on
May 30, 2025 • 0 new comments -
[export][schema_upgrader][refactor] create a folder that holds different major version schemas
#150069 commented on
May 30, 2025 • 0 new comments -
[StaticRuntime] Fuse SigridHash
#150072 commented on
May 26, 2025 • 0 new comments -
Add `_foreach_fill_` ops
#150092 commented on
May 26, 2025 • 0 new comments -
Fix `L1Loss`, `MSELoss`, `HuberLoss` missing `weight` param
#150097 commented on
Jun 17, 2025 • 0 new comments -
[cherry-pick] [Submodule] [cpuinfo] cpuinfo update (#149305)
#150105 commented on
May 26, 2025 • 0 new comments -
S390x: update more tests
#150116 commented on
Jun 5, 2025 • 0 new comments -
[issue][do not land] Add unit test to show composability issue between mp_policy and checkpoint()
#150139 commented on
May 27, 2025 • 0 new comments -
[fix] pin cmake to 3.31.6 in build requirements
#150201 commented on
May 30, 2025 • 0 new comments -
[DLPack] add NumPy exchange tests.
#150216 commented on
May 30, 2025 • 0 new comments -
[state dict] add strict check when there are more keys in global than local state
#150239 commented on
Jun 14, 2025 • 0 new comments -
Fix NVTX functions compatibility with torch.compile(fullgraph=True)
#150240 commented on
Jun 1, 2025 • 0 new comments -
[decomps] Add decomposition for linalg_vector_norm
#150241 commented on
Jun 2, 2025 • 0 new comments -
Add a test for checking that the CUDA stubs directory is not in libcaffe2_nvrts.so's RPATH or RUNPATH
#150268 commented on
Jun 2, 2025 • 0 new comments -
[clang-tidy] Get rid-off dangerouse clang-tidy option
#150292 commented on
May 30, 2025 • 0 new comments -
allow collectives to be DCEd during collective optimizations, fix bad partitioner save decision
#150302 commented on
Jun 9, 2025 • 0 new comments -
test dynamo
#150307 commented on
May 30, 2025 • 0 new comments -
[FlexAttention] Don't load invalid values from mask mod
#150331 commented on
Jun 15, 2025 • 0 new comments -
[WIP][draft_export] suppress pending unbacked for divisibility symbol
#151718 commented on
Jun 19, 2025 • 0 new comments -
[ROCm] support experimental CU carveout
#149466 commented on
Jun 21, 2025 • 0 new comments -
Refactor `test/test_torch.py` by moving testcase to `test_indexing.py`
#149469 commented on
Jun 13, 2025 • 0 new comments -
Update gen_data.py
#149573 commented on
May 25, 2025 • 0 new comments -
Generalize AllocatorConfig to be device-agnostic
#149601 commented on
Jun 18, 2025 • 0 new comments -
[Inductor] Restrict block analysis to only match integer dims and strides
#149615 commented on
Jun 23, 2025 • 0 new comments -
DO NOT MERGE: Testing sequential builds for cuda + cpu
#149675 commented on
May 25, 2025 • 0 new comments -
[TP] add support for fused QKV Sharding
#149701 commented on
Jun 3, 2025 • 0 new comments -
[fbcode]Removing `@NoIntBaseDeprecated` annotation in `caffe2.thrift` file (#149742)
#149744 commented on
May 27, 2025 • 0 new comments -
[Profiler] Give non-zero default values to start events
#149757 commented on
May 28, 2025 • 0 new comments -
[scan] Support None return in combine_fn
#149763 commented on
Jun 8, 2025 • 0 new comments -
fix inductor logging for torch._scaled_mm
#149769 commented on
Jun 14, 2025 • 0 new comments -
Fix `torch.cuda.MemPool()` internal assertion failure when changing devices
#149818 commented on
May 30, 2025 • 0 new comments -
cd: Add script for generating binary build matrix
#149830 commented on
Jun 16, 2025 • 0 new comments -
[draft] Add support in Flex for non-contiguous NJT
#149892 commented on
Jun 18, 2025 • 0 new comments -
add a util function _make_all_gather_out_tensor to reduce code duplication
#149912 commented on
May 29, 2025 • 0 new comments -
support scalar tensor for functional all_gather
#149913 commented on
May 29, 2025 • 0 new comments -
Replace c10::guts::is_fundamental with std::is_fundamental
#149925 commented on
May 30, 2025 • 0 new comments -
add weight 2D tensor for xpu
#149927 commented on
May 31, 2025 • 0 new comments -
[Minimizer] Better debugging message
#149931 commented on
May 25, 2025 • 0 new comments -
Test whether origin/main CI is broken
#149948 commented on
May 25, 2025 • 0 new comments -
[TESTING] triton WS version ee6a03d19db0de2148c2604994e0256eeaefc5bc
#149949 commented on
May 25, 2025 • 0 new comments -
AOTI freezing: fix test issues and enable by default
#149961 commented on
Jun 13, 2025 • 0 new comments -
[inductor] Fix mm logging for `torch._scaled_.mm`
#149967 commented on
May 25, 2025 • 0 new comments -
[BE][PYFMT] migrate PYFMT for `torch/[p-z]*/` to `ruff format`
#144552 commented on
Jun 23, 2025 • 0 new comments -
[BE][PYFMT] migrate PYFMT for `test/[a-h]*/` to `ruff format`
#144555 commented on
Jun 23, 2025 • 0 new comments -
[BE][PYFMT] migrate PYFMT for `test/[i-z]*/` to `ruff format`
#144556 commented on
Jun 23, 2025 • 0 new comments -
[BE][PYFMT] remove `black`: finish `black -> ruff format` migration
#144557 commented on
Jun 23, 2025 • 0 new comments -
[Reopen] [Intel GPU] Set higher tolerance for some models only on XPU Device
#144756 commented on
Jun 16, 2025 • 0 new comments -
Update fbgemm_gpu pin
#144905 commented on
May 28, 2025 • 0 new comments -
Replacing explicit backend search with api call
#144944 commented on
Jun 10, 2025 • 0 new comments -
Enable fp16 linear layers in PyTorch via ACL
#144992 commented on
Jun 6, 2025 • 0 new comments -
[inductor] [bug fix] Fix `conv` on processing uint
#145136 commented on
May 30, 2025 • 0 new comments -
[dtensor][cp] experiment: call flex_attention on DTensor
#145353 commented on
Jun 2, 2025 • 0 new comments -
removed check for ConvTranspose3D on MPS
#145366 commented on
Jun 5, 2025 • 0 new comments -
General Changes for multi accelerators
#145521 commented on
Jun 13, 2025 • 0 new comments -
NJT support for cat() on the ragged dim
#145778 commented on
Jun 18, 2025 • 0 new comments -
[AsyncMM] re-enable and adapt to cutlass 3.6.0 (#144011)
#145811 commented on
Jun 13, 2025 • 0 new comments -
Guard the CPU cpp wrapper tests on having a cpp wrapper
#145847 commented on
Jun 2, 2025 • 0 new comments -
[WIP] Allow generation of inductor backend specific tests using instantiate_device_type_tests
#145873 commented on
Jun 21, 2025 • 0 new comments -
Add future lazy clone setting and deprecate `torch.reshape` view
#145911 commented on
Jun 6, 2025 • 0 new comments -
Add MPS OpInfo db, rework test_mps to use OpInfo
#145955 commented on
Jun 16, 2025 • 0 new comments -
fix indirect broadcast
#145992 commented on
Jun 15, 2025 • 0 new comments -
[Win][CD] Install cmake and setuptools from PyPI
#146055 commented on
May 27, 2025 • 0 new comments -
Use device agnostic APIs for device_count and backend in common_fsdp
#146289 commented on
May 27, 2025 • 0 new comments -
[Testing] Reduce `test_exp` flakiness
#146436 commented on
May 31, 2025 • 0 new comments -
[Profiler] Enable CUPTI teardown to reduce profiler overhead
#146604 commented on
Jun 6, 2025 • 0 new comments -
Fix inductor non-stable argsort/sort test
#146622 commented on
Jun 7, 2025 • 0 new comments -
inference_mode Tensors do not always need to be guarded on
#148983 commented on
Jun 3, 2025 • 0 new comments -
[Docker] Create an independent dependecies layer
#138612 commented on
Jun 18, 2025 • 0 new comments -
Fix `USE_STATIC_MKL` lost functionality
#138996 commented on
Jun 19, 2025 • 0 new comments -
`has_triton`: Use the device interface for detecting Triton availability
#139171 commented on
Jun 16, 2025 • 0 new comments -
[Don't Review] Test CI
#139971 commented on
Jun 19, 2025 • 0 new comments -
Add torch._scaled_mm for CPU
#139975 commented on
Jun 9, 2025 • 0 new comments -
[Intel GPU] Enable mkldnn::_convolution.pointwise at XPU backend
#140372 commented on
May 24, 2025 • 0 new comments -
Enable C++ dynamic shape guards by default
#140756 commented on
Jun 13, 2025 • 0 new comments -
Implement cuda graphs implementation of torch.cond and torch.while_loop
#140979 commented on
Jun 6, 2025 • 0 new comments -
[ROCm][layer_norm] Use __builtin_amdgcn_rcpf(x) instead of 1.f/x
#141309 commented on
May 26, 2025 • 0 new comments -
Add AOT inductor support for _scaled_mm for CPU
#141961 commented on
Jun 13, 2025 • 0 new comments -
Fix undefined behavior
#142121 commented on
May 28, 2025 • 0 new comments -
[Testing only] Add python cycle detection
#143204 commented on
Jun 8, 2025 • 0 new comments -
[Draft][WIP] Enable XPU path for FlexAttention
#143553 commented on
Jun 10, 2025 • 0 new comments -
ci: Add scaffolding for buidling wheels sequentially
#143672 commented on
May 25, 2025 • 0 new comments -
Modify the tolerance level in TIMM benchmark for XPU PreCI
#143739 commented on
Jun 4, 2025 • 0 new comments -
Check F2C BLAS for OpenBLAS and other vendors
#143846 commented on
May 24, 2025 • 0 new comments -
Using acc_t for log_softmax
#143896 commented on
Jun 6, 2025 • 0 new comments -
Defaults to C++20 in CMake torch targets
#143959 commented on
Jun 15, 2025 • 0 new comments -
[ci] Add riscv opt-int build
#143979 commented on
Jun 17, 2025 • 0 new comments -
Enable several readability checks
#143987 commented on
May 29, 2025 • 0 new comments -
Fix dangling autogenerated sphinx source code links
#144052 commented on
Jun 6, 2025 • 0 new comments -
Support Swiglu for Module and functional
#144465 commented on
Jun 21, 2025 • 0 new comments -
[dynamo, nested graph breaks] add nested graph break tests
#144516 commented on
May 29, 2025 • 0 new comments -
[DTensor] add aten.as_strided.default op
#147514 commented on
May 27, 2025 • 0 new comments -
[dtensor][cp] experiment: register flex_attention to a custom fn on DTensor
#147515 commented on
May 27, 2025 • 0 new comments -
[dtensor][cp] experiment: register flex_attention to a custom fn within a custom dispatch mode
#147516 commented on
May 27, 2025 • 0 new comments -
[dtensor][cp] experiment: register flex_attention to a custom fn on DTensor within a custom dispatch mode
#147517 commented on
May 27, 2025 • 0 new comments -
Fix the shape check inside gnll loss
#147522 commented on
Jun 2, 2025 • 0 new comments -
[dtensor][cp] experiment: try e2e cp flex_attention
#147603 commented on
May 27, 2025 • 0 new comments -
torch.sort: Optimize memory usage with (dtype_indices: ScalarType, dynamic_indices_dtype: bool) options
#147629 commented on
Jun 9, 2025 • 0 new comments -
Update triton_heuristics.py
#147690 commented on
May 30, 2025 • 0 new comments -
[fix]: Offload OpenBLAS gemv calls to dedicated OpenBLAS kernel
#147858 commented on
May 31, 2025 • 0 new comments -
Increase reference count of state tensor in `THPGenerator_reduce` to avoid premature garbage collection in `multiprocessing` start method `"forkserver"` and `"spawn"`
#147907 commented on
Jun 7, 2025 • 0 new comments -
Custom ops support arbitrary input types by migrating to python dispatcher
#147927 commented on
May 30, 2025 • 0 new comments -
[ROCm] Skip gfx12 Row-Wise F8 Tests
#148037 commented on
May 28, 2025 • 0 new comments -
[pytree] add another simplified pytree module `torch.pytree`
#148180 commented on
Jun 18, 2025 • 0 new comments -
[BE][PYFMT] migrate PYFMT for `test/inductor/` to `ruff format`
#148186 commented on
Jun 23, 2025 • 0 new comments -
Move estimate runtime and pick loop order heuristics into choices.py
#148202 commented on
May 26, 2025 • 0 new comments -
[pytree] simplify public API exposition with `__module__`
#148328 commented on
Jun 18, 2025 • 0 new comments -
Enable `_lazy_clone` between CPU and MPS
#148408 commented on
May 31, 2025 • 0 new comments -
Disable flake8 advice C416
#148412 commented on
Jun 8, 2025 • 0 new comments -
Optimize `torch.distributions` Score function
#148429 commented on
May 28, 2025 • 0 new comments -
[BE][pytree] rename `NodeDef` member to match the type annotations: `*_fn -> *_func`
#148474 commented on
Jun 18, 2025 • 0 new comments -
[BE][pytree] rename argument name in register function to match the type annotations: `*_fn -> *_func`
#148484 commented on
Jun 18, 2025 • 0 new comments -
[triton hash update] update the pinned triton hash
#148492 commented on
Jun 23, 2025 • 0 new comments -
[BE][pytree] cleanup parameterized pytree tests
#148569 commented on
Jun 18, 2025 • 0 new comments -
[cuda] Add new faster gammabeta backward kernel
#148605 commented on
Jun 2, 2025 • 0 new comments -
gloo: fix building system gloo with CUDA/HIP
#146637 commented on
Jun 15, 2025 • 0 new comments -
Optimize isclose() for CPU and GPU by adding specific implementations
#146656 commented on
Jun 17, 2025 • 0 new comments -
Optimize LRScheduler docs
#146684 commented on
May 27, 2025 • 0 new comments -
Enable explicitly vectorized `_weight_int8pack_mm` op for FP16 dtype on x86_64 CPU
#146777 commented on
Jun 6, 2025 • 0 new comments -
implement Size.__radd__
#146834 commented on
Jun 8, 2025 • 0 new comments -
[Optimus][Inductor] Add full cat aten pattern
#146874 commented on
May 30, 2025 • 0 new comments -
Port distributed backend tests to Pytest
#146961 commented on
Jun 4, 2025 • 0 new comments -
Porting Pytorch to AIX Operating System.
#146983 commented on
Jun 19, 2025 • 0 new comments -
Use 2022 as default VC_YEAR for windows builds
#147053 commented on
Jun 1, 2025 • 0 new comments -
OpenReg: Fix releasing tensor issue when using pin_memory
#147066 commented on
Jun 13, 2025 • 0 new comments -
Fix the Problems About Defining Static Variable in Inline Function
#147095 commented on
Jun 23, 2025 • 0 new comments -
Add ppc64le wheel build support
#147194 commented on
Jun 20, 2025 • 0 new comments -
Periodic Activations Module
#147218 commented on
May 30, 2025 • 0 new comments -
fixed optimizer load_state_dict
#147289 commented on
Jun 10, 2025 • 0 new comments -
[MPS] Fix metallib embedding in static builds
#147324 commented on
Jun 16, 2025 • 0 new comments -
[MPS] Fix incorrect size for uint3 arg
#147325 commented on
Jun 16, 2025 • 0 new comments -
[DO NOT MERGE] Update submodule ideep for ideep matmul changes
#147359 commented on
Jun 6, 2025 • 0 new comments -
Replace `fw_metadata` info with trace log hint in hint message
#147365 commented on
May 26, 2025 • 0 new comments -
Add overflow check for large storage_offsets
#147398 commented on
May 28, 2025 • 0 new comments -
Small scheduler refactor
#147410 commented on
Jun 21, 2025 • 0 new comments -
To enable NCCL communication to support uint64 tensors
#147424 commented on
Jun 10, 2025 • 0 new comments -
[ONNX] Migrate onnx ops decomp functions
#147469 commented on
Jun 7, 2025 • 0 new comments -
[test] compile cmd
#147470 commented on
Jun 18, 2025 • 0 new comments -
Optimize `dynamo` typing
#147499 commented on
May 28, 2025 • 0 new comments -
Expand cache logging
#152026 commented on
Jun 11, 2025 • 0 new comments -
Test
#152055 commented on
Jun 19, 2025 • 0 new comments -
Cause `ceil_div` to accept values of differing types an upcast to the larger type
#152074 commented on
Jun 23, 2025 • 0 new comments -
Switch to standard pep517 sdist generation
#152098 commented on
Jun 20, 2025 • 0 new comments -
[inductor] propagate shapes in CSEVariable
#152198 commented on
Jun 14, 2025 • 0 new comments -
Improve error handling in CachingAutotuner for argument mismatches
#152215 commented on
Jun 4, 2025 • 0 new comments -
Add `padding="same"` for transposed convolution
#152228 commented on
May 23, 2025 • 0 new comments -
Updates to build on Noble (Ubuntu24.04) and py3.12
#152240 commented on
Jun 5, 2025 • 0 new comments -
complex.pow(2) on GPU by replacing with complex * complex to avoid numerical instability
#152373 commented on
Jun 22, 2025 • 0 new comments -
Relax tolerance for test_quick_baddbmm_cpu_complex64
#152424 commented on
Jun 3, 2025 • 0 new comments -
[2/N] Deprecate c10::string_view and c10::string
#152509 commented on
Jun 4, 2025 • 0 new comments -
ci: Switch benchmark dependency to use pip
#152545 commented on
May 31, 2025 • 0 new comments -
Implemented `Size.__radd__`
#152554 commented on
Jun 23, 2025 • 0 new comments -
[pytree] make `tree_*` functions accept both Python and C++ `PyTreeSpec`
#152624 commented on
Jun 18, 2025 • 0 new comments -
[ROCm] Initial AITER Integration for mha_bwd asm kernels
#152630 commented on
Jun 17, 2025 • 0 new comments -
[do-not-land][ca] default on for CI
#152646 commented on
Jun 6, 2025 • 0 new comments -
[Inductor] Pattern matcher support for mutable ops with non-view inputs
#152775 commented on
May 30, 2025 • 0 new comments -
[Don't merge] Debug
#152940 commented on
May 30, 2025 • 0 new comments -
[dtensor] add privateuse1 SDPA op support to DTensor
#152949 commented on
Jun 11, 2025 • 0 new comments -
[ROCm] Ck gemm architecture guard
#152951 commented on
Jun 17, 2025 • 0 new comments -
🌠 Add Muon optimizer
#153048 commented on
Jun 2, 2025 • 0 new comments -
Bump triton pin and update setup.py path
#153165 commented on
Jun 4, 2025 • 0 new comments -
Don't print hinted expression if statically known.
#153173 commented on
Jun 14, 2025 • 0 new comments -
Tensor .cuda() very slow with specific array sizes
#153176 commented on
May 28, 2025 • 0 new comments -
Update __init__.py
#151751 commented on
Jun 22, 2025 • 0 new comments -
Refactor duplicate code into a utility function in pytorch/torch/nn/functional.py
#151752 commented on
Jun 20, 2025 • 0 new comments -
torch.testing._internal.optests - MPS Support
#151758 commented on
Jun 21, 2025 • 0 new comments -
[MPS] Implement upsample_nearest3d_vec operator
#151760 commented on
Jun 22, 2025 • 0 new comments -
enable windows inductor UT in CI
#151777 commented on
Jun 21, 2025 • 0 new comments -
Normalize dynamic size symbols in template codegen cache key.
#151778 commented on
Jun 4, 2025 • 0 new comments -
Horizontal
#151780 commented on
Jun 13, 2025 • 0 new comments -
Deduplicate library deletion
#151795 commented on
Jun 20, 2025 • 0 new comments -
Add assert_on_assumption on to guard_or_true, and guard_or_false
#151854 commented on
Jun 21, 2025 • 0 new comments -
[draft export] normalize sympy expressions for data-dependent counting
#151856 commented on
Jun 21, 2025 • 0 new comments -
Add BufferDict works like ParameterDict
#151870 commented on
Jun 21, 2025 • 0 new comments -
Add `LinearLR` compute lr formula in doc
#151894 commented on
May 26, 2025 • 0 new comments -
[cp] dispatch flex_attention on DTensor to cp implementation
#151900 commented on
Jun 23, 2025 • 0 new comments -
[WIP] Deprecate AcceleratorHooksInterface isPinnedPtr, use at::getHostAllocator()->is_pinned instead
#151916 commented on
Jun 11, 2025 • 0 new comments -
[WIP]: track remaining runtime time asserts for backward coddgen instead of trying to regenerate all
#151919 commented on
Jun 4, 2025 • 0 new comments -
[Observability][Optimus] Fix the tlparse name
#151935 commented on
Jun 22, 2025 • 0 new comments -
[WIP][dynamic shapes] whitelist at dim-level
#151941 commented on
Jun 22, 2025 • 0 new comments -
[profiler] use inspect.getattr_static to avoid importing inductor
#151946 commented on
Jun 22, 2025 • 0 new comments -
add tlpare logs
#151948 commented on
Jun 22, 2025 • 0 new comments -
[inductor][profiler] lazily import things in standalone_compile
#151956 commented on
Jun 22, 2025 • 0 new comments -
Make `aten.embedding` do not wrap negative index
#151967 commented on
May 28, 2025 • 0 new comments -
[WIP][recompiles] verbose logging for tensor guard checks
#151971 commented on
Jun 23, 2025 • 0 new comments -
[Don't merge] Upgrade oneDNN to v3.8-rc for XPU build
#152001 commented on
Jun 23, 2025 • 0 new comments -
[WIP] fix reinplacing bug
#152011 commented on
Jun 22, 2025 • 0 new comments -
[RPC] fix deserialize doesn't respect user pickler
#153821 commented on
Jun 5, 2025 • 0 new comments -
[c10d] make register_backend device handling more robust
#153824 commented on
Jun 4, 2025 • 0 new comments -
[PP] Fix double backward error in stage_backward
#153893 commented on
May 28, 2025 • 0 new comments -
[CPU Generator] Remove the unused CPUGeneratorImplStateLegacy in set_state
#153934 commented on
Jun 12, 2025 • 0 new comments -
Fix #153942
#153943 commented on
May 28, 2025 • 0 new comments -
[Dynamo] Fixes for exceptions
#153966 commented on
May 28, 2025 • 0 new comments -
[draft][do not review] H-FSDP prototype
#154000 commented on
Jun 18, 2025 • 0 new comments -
Add MPS implementation of CTC Loss based on CUDA version
#154044 commented on
May 27, 2025 • 0 new comments -
Add pyrefly lint adaptor
#154059 commented on
May 26, 2025 • 0 new comments -
[Dynamo] [Set] Implement some binop operators for dict/set/frozenset/dict_keys
#154063 commented on
Jun 21, 2025 • 0 new comments -
[Dynamo] [Set] Raise TypeError if object is unhashable
#154064 commented on
Jun 21, 2025 • 0 new comments -
[Dynamo] [Set] Raise TypeError in set.union(...) and "__or__"
#154065 commented on
Jun 21, 2025 • 0 new comments -
[Dynamo] [Set] Add comparison for set subclass
#154066 commented on
Jun 21, 2025 • 0 new comments -
force computation in opmath_t for CUDA fused optimizers
#154069 commented on
Jun 2, 2025 • 0 new comments -
[dynamo] remove recursive cell/freevar in instruction tx
#154078 commented on
May 29, 2025 • 0 new comments -
dynamo_time: Allways log to pt2_compile_events
#154081 commented on
May 24, 2025 • 0 new comments -
Update the UT of test_decompose_mm_cpu
#154100 commented on
May 28, 2025 • 0 new comments -
[BE]: Update pybind11 submodule to 3.0.0rc
#154115 commented on
Jun 19, 2025 • 0 new comments -
[do not merge] Test out long filename for libtorch_agnostic c++ extension test in Windows
#154139 commented on
May 28, 2025 • 0 new comments -
Add basic xor_sum op
#154149 commented on
May 28, 2025 • 0 new comments -
[cuBLASLt][cuBLAS] Support 2D bias and `beta != 1.0` in cuBLASLt
#154170 commented on
Jun 17, 2025 • 0 new comments -
[PG/nccl] Simplify uniqueHash management
#154185 commented on
May 26, 2025 • 0 new comments -
[cond] support gen_schema for cond
#154193 commented on
Jun 17, 2025 • 0 new comments -
implement MKLGenerator
#154199 commented on
Jun 18, 2025 • 0 new comments -
Adding XPU support to DTensor examples
#153213 commented on
May 25, 2025 • 0 new comments -
Fix integer overflow bug in triu/tril for large diagonal values
#153240 commented on
May 26, 2025 • 0 new comments -
fix dtensor and tensor inconsistent compute mesh
#153268 commented on
Jun 2, 2025 • 0 new comments -
Fix lcm_ crash with int16 scalar and large int32 tensor
#153314 commented on
May 23, 2025 • 0 new comments -
CMake: update FindCUDAToolkit.cmake, use torch::nvtx3 if present, mod…
#153339 commented on
Jun 15, 2025 • 0 new comments -
[ATen][CUDA][CUB] Implement changes to CCCL (CUB/Thrust/LibCUDACXX) usage in ATen
#153373 commented on
Jun 9, 2025 • 0 new comments -
[AOTI Debugging] Add Environment Variable to control output path
#153391 commented on
Jun 11, 2025 • 0 new comments -
[BE]: Enable RUFF TRY400 rule - log.exception
#153473 commented on
Jun 2, 2025 • 0 new comments -
Clean PR: Replace _device_t with torch.types.Device and fix lint issues (#152952)
#153493 commented on
Jun 4, 2025 • 0 new comments -
[AUTOCAST] FEAT: Allow passing a `torch.device` object to autocast
#153539 commented on
Jun 10, 2025 • 0 new comments -
[Ez][BE]: Remove accidental classvar
#153540 commented on
May 28, 2025 • 0 new comments -
[BE]: Update CUTLASS submodule to 4.0.0rc
#153541 commented on
Jun 13, 2025 • 0 new comments -
[Dynamo] [SetSubclass] Add support for user defined sets
#153553 commented on
Jun 21, 2025 • 0 new comments -
[PP] wip, allow grad to be None
#153557 commented on
May 28, 2025 • 0 new comments -
[caffe2] Allow the elimination of implicit calls to strlen when using the RECORD_FUNCTION macros
#153567 commented on
Jun 10, 2025 • 0 new comments -
support scaled mm on inductor
#153602 commented on
May 26, 2025 • 0 new comments -
[BE] Use latest mkl-include and mkl-devel on Windows CI
#153684 commented on
Jun 15, 2025 • 0 new comments -
Use magma 2.9.0
#153703 commented on
Jun 2, 2025 • 0 new comments -
[partitioner] Fix _broadcast_on_rank0 to use deterministic hash function
#153734 commented on
Jun 5, 2025 • 0 new comments -
[not for land] small compile-on-one-rank example
#153743 commented on
May 24, 2025 • 0 new comments -
Add TORCH_CHECK for group < channels for native_channel_shuffle
#153781 commented on
Jun 6, 2025 • 0 new comments -
Fix `LLONG_MIN` errors in `torch.jit.script`
#153793 commented on
Jun 6, 2025 • 0 new comments -
Ignore url lint in install_xpu.sh
#153796 commented on
Jun 16, 2025 • 0 new comments -
Build clang20 image for ASAN tests
#153806 commented on
Jun 22, 2025 • 0 new comments -
Copy native runtime code to OSS.
#150338 commented on
Jun 15, 2025 • 0 new comments -
[torchrun] Fix: Use Correctly Reachable Host Address in c10d Rendezvous
#150533 commented on
Jun 6, 2025 • 0 new comments -
[export] Refactor strict to pass fake tensors to dynamo
#150546 commented on
Jun 1, 2025 • 0 new comments -
Fix link formatting in cpp_extension.py
#150552 commented on
Jun 14, 2025 • 0 new comments -
Initial Implementation of Padded Tensor
#150567 commented on
Jun 18, 2025 • 0 new comments -
[cuda] Added CUDA kernels for RMSNorm
#150576 commented on
Jun 10, 2025 • 0 new comments -
[test] DTensor moe compile fixes for dynamic shapes
#150582 commented on
Jun 6, 2025 • 0 new comments -
fix dynamic shapes for kwargs
#150583 commented on
Jun 19, 2025 • 0 new comments -
[WIP] try always splitting in reshape view
#150584 commented on
Jun 2, 2025 • 0 new comments -
[CUDA] include nvtx3 header in wheel so downstream torch extension can find it
#150591 commented on
Jun 4, 2025 • 0 new comments -
suppress neon missing message on armv8 build
#150595 commented on
Jun 4, 2025 • 0 new comments -
Make `nn.MultiLabelMarginLoss` error message user friendly
#150606 commented on
Jun 3, 2025 • 0 new comments -
Make error message descriptive
#150627 commented on
Jun 2, 2025 • 0 new comments -
tutorial example for cp
#150641 commented on
Jun 3, 2025 • 0 new comments -
Split up cub-RadixSortPairs-scalars.cu to parallelize compilation
#150678 commented on
Jun 3, 2025 • 0 new comments -
Revert "[ATen][CUDA] Implement 128 bit vectorization v2 (#145746)"
#150679 commented on
Jun 3, 2025 • 0 new comments -
cd: Introduce new binary build workflows (cpu)
#150713 commented on
Jun 16, 2025 • 0 new comments -
[wip] support tracing async collectives
#150720 commented on
Jun 6, 2025 • 0 new comments -
Avoid overwriting COW data in MPS code
#150721 commented on
Jun 4, 2025 • 0 new comments -
[Inductor] Set the default value of min_chunk_size to 512
#150762 commented on
Jun 19, 2025 • 0 new comments -
Add CPython exception tests
#150789 commented on
Jun 3, 2025 • 0 new comments -
Add CPython generator/contextlib tests
#150796 commented on
Jun 5, 2025 • 0 new comments -
Pin all root requirements to major versions
#150833 commented on
Jun 18, 2025 • 0 new comments -
Fix the Problems About Defining Static Variable in Inline Function
#150841 commented on
Jun 7, 2025 • 0 new comments -
not-for-landing add logs for debugging chunk metadata
#150886 commented on
Jun 8, 2025 • 0 new comments -
[DO NOT REVIEW] Update _fsdp_param_group.py
#150349 commented on
May 31, 2025 • 0 new comments -
test enummeta
#150351 commented on
May 31, 2025 • 0 new comments -
Memory leak base tests for compile
#150353 commented on
May 31, 2025 • 0 new comments -
support nested compile when inner compile is inside of __torch_dispatch__
#150355 commented on
Jun 6, 2025 • 0 new comments -
Build MacOS CI with MKLDNN
#150365 commented on
May 31, 2025 • 0 new comments -
bound sympy accuracy
#150383 commented on
Jun 3, 2025 • 0 new comments -
Add `mse_loss_backward_out` type promotion
#150384 commented on
Jun 16, 2025 • 0 new comments -
Test layout_opt_default set to 0
#150411 commented on
Jun 3, 2025 • 0 new comments -
[Inductor] Fix scaled_mm template migration missing endif block
#150415 commented on
May 31, 2025 • 0 new comments -
Test self hosted GPU runner
#150422 commented on
Jun 3, 2025 • 0 new comments -
[dynamo] Lazily import fsdp-related modules
#150429 commented on
Jun 3, 2025 • 0 new comments -
[WIP][dynamic shapes] guard_or_false rewrite for fake_impls.py:infer_size, compute_contiguous
#150431 commented on
Jun 1, 2025 • 0 new comments -
Faster way to test self hosted GPU runner
#150434 commented on
Jun 4, 2025 • 0 new comments -
caffe2: Fix lint errors in native/CPUFallback.cpp
#150443 commented on
Jun 1, 2025 • 0 new comments -
caffe2: Fix lint errors in FlashAttentionKernel
#150445 commented on
Jun 2, 2025 • 0 new comments -
[dynamic shapes] oblivious rewrite for meta_select
#150455 commented on
Jun 2, 2025 • 0 new comments -
[dynamic shapes] guard_or_false rewrite for scatter, gather, index metas
#150481 commented on
Jun 1, 2025 • 0 new comments -
caffe2: Fix lint errors in runtime/register_prim_ops.cpp
#150501 commented on
Jun 1, 2025 • 0 new comments -
caffe2: Fix lint errors in native/int4mm_kernel
#150503 commented on
Jun 1, 2025 • 0 new comments -
caffe2: Fix lint errors in native/quantized/TensorAdvancedIndexing
#150504 commented on
Jun 1, 2025 • 0 new comments -
caffe2: Fix lint errors in native/RNN.cpp
#150505 commented on
Jun 1, 2025 • 0 new comments -
caffe2: Fix lint errors in native/TensorAdvancedIndexing.cpp
#150506 commented on
Jun 1, 2025 • 0 new comments -
caffe2: Fix lint errors in native/TensorShape.cpp
#150507 commented on
Jun 1, 2025 • 0 new comments -
Fix CPU bitwise shifts for out-of-limit values in VSX-vec
#150524 commented on
Jun 2, 2025 • 0 new comments -
[WIP][dynamic shapes] lru cache bound_sympy
#151271 commented on
Jun 16, 2025 • 0 new comments -
[WIP] Generalize device caching allocator
#151298 commented on
Jun 23, 2025 • 0 new comments -
Update docker image names for s390x release
#151429 commented on
Jun 16, 2025 • 0 new comments -
Implement fast exp for AVX2 and AVX512 for the flash attention
#151441 commented on
Jun 16, 2025 • 0 new comments -
Allow to byteswap data when reading saved torch jit data
#151447 commented on
Jun 3, 2025 • 0 new comments -
update fx.Interpreter error logging to check if submodules are GraphModules
#151451 commented on
Jun 17, 2025 • 0 new comments -
Add default value for `serialization_format` in `_write_item` function for better compatibility
#151452 commented on
Jun 16, 2025 • 0 new comments -
[ROCm] Initial plumbing for CK Gemm Perf Improvement
#151465 commented on
Jun 19, 2025 • 0 new comments -
inductor.config.descriptive_names = False is not actually supported (#145523) (#146051)
#151481 commented on
Jun 18, 2025 • 0 new comments -
[dtensor][view_op] add as_strided op support to DTensor in FakeTensorMode
#151495 commented on
Jun 23, 2025 • 0 new comments -
Use device agnostic APIs and variable names for dtensor
#151527 commented on
Jun 22, 2025 • 0 new comments -
[WIP] Deprecate getPinnedMemoryAllocator use getHostAllocator instead
#151531 commented on
Jun 11, 2025 • 0 new comments -
Fix normalize mypy warning with tuple dim
#151553 commented on
Jun 18, 2025 • 0 new comments -
[autodeps2] Replace third-party/pyqt5 with third-party/pypi/pyqt5
#151557 commented on
Jun 16, 2025 • 0 new comments -
[bazel] Fix aten generator directory path
#151580 commented on
Jun 19, 2025 • 0 new comments -
[DRAFT] fix issues related to deferred assertion on unabcked floats
#151604 commented on
Jun 17, 2025 • 0 new comments -
added six and pyyaml to requirements.txt to fix missing module error …
#151605 commented on
Jun 17, 2025 • 0 new comments -
distributed: add distributed P2P TensorQueue and TensorStore
#151631 commented on
Jun 17, 2025 • 0 new comments -
[demo] Verify test runner integration
#151645 commented on
Jun 21, 2025 • 0 new comments -
Update link to NVIDIA cuDNN Support Matrix
#151647 commented on
Jun 19, 2025 • 0 new comments -
Add a custom profiler configuration option
#151656 commented on
Jun 23, 2025 • 0 new comments -
[aot] Set config partitioner recompute_views True by default
#151676 commented on
Jun 17, 2025 • 0 new comments -
[test] log
#151700 commented on
Jun 6, 2025 • 0 new comments -
Remove unnecessary recompile
#151711 commented on
Jun 18, 2025 • 0 new comments -
Fix StrictMinMaxConstraint issue
#150924 commented on
Jun 9, 2025 • 0 new comments -
Introduce test skip markers for Sandcastle
#150934 commented on
Jun 2, 2025 • 0 new comments -
update benchamark result due to <1% regression
#150937 commented on
Jun 10, 2025 • 0 new comments -
Turn optree warning into error
#150938 commented on
Jun 8, 2025 • 0 new comments -
all_reduce autograd
#150942 commented on
Jun 22, 2025 • 0 new comments -
Add complex logaddexp
#150946 commented on
Jun 15, 2025 • 0 new comments -
Add complex logaddexp2
#150947 commented on
Jun 14, 2025 • 0 new comments -
[ONNX] Migrate DORT to use the new exporter
#150950 commented on
Jun 8, 2025 • 0 new comments -
[dynamo][fsdp] Do not consider fsdp modules as specialized
#150954 commented on
Jun 9, 2025 • 0 new comments -
Add additional MacOS test runners for MPS
#150964 commented on
Jun 15, 2025 • 0 new comments -
move set_rotate_method to public namespace
#150968 commented on
Jun 13, 2025 • 0 new comments -
Add `pad_to_multiple_of` to `pad_sequence`
#150990 commented on
Jun 10, 2025 • 0 new comments -
Fix index broadcast
#151009 commented on
Jun 9, 2025 • 0 new comments -
Add `pad_to_multiple_of` to `pad_sequence` (C++ only)
#151021 commented on
Jun 11, 2025 • 0 new comments -
Tune linalg_eigh_cusolver: better heuristic for syevj_batched selection on cuda
#151118 commented on
Jun 11, 2025 • 0 new comments -
Reland prologue transposed changes
#151120 commented on
Jun 11, 2025 • 0 new comments -
[hop] Make base_hop share utils with control flow ops in backward
#151146 commented on
Jun 6, 2025 • 0 new comments -
Fix TypeIndex.h signature extraction
#151150 commented on
Jun 12, 2025 • 0 new comments -
Fix DWConv in QNNPACK for aarch32
#151191 commented on
Jun 3, 2025 • 0 new comments -
[ZCH vNext] Bucket offsets and sizes in torchrec shard metadata for bucket wise sharding
#151192 commented on
Jun 14, 2025 • 0 new comments -
Fix `MaskedTensor` to device ignored mask
#151205 commented on
Jun 3, 2025 • 0 new comments -
Implement MKLGenerator
#151218 commented on
Jun 13, 2025 • 0 new comments -
update visualizer with compare two schedules method
#151249 commented on
Jun 13, 2025 • 0 new comments -
[AMD][FA] Block mem efficient attention if backward head_dim > 128 in CK backend
#151258 commented on
Jun 16, 2025 • 0 new comments -
Per channel weight observer for ConvTranspose
#54816 commented on
Jun 3, 2025 • 0 new comments -
[Inductor][Schedule][Fusion] Ops are not fused due to reduction unroll
#153346 commented on
Jun 3, 2025 • 0 new comments -
test_gradient_all Device Type test regression with Numpy >= 2.0.0
#132450 commented on
Jun 4, 2025 • 0 new comments -
Representation string of a meta tensor is not a valid `tensor` call
#147643 commented on
Jun 4, 2025 • 0 new comments -
AttributeError: Can't pickle local object 'make_opaque_bitwise_fn.<locals>.BitwiseFn'
#147841 commented on
Jun 4, 2025 • 0 new comments -
Return type annotation of `Tensor.long()` etc is not narrowed down to dtype-specific names `LongTensor` etc
#148552 commented on
Jun 4, 2025 • 0 new comments -
torch.compile of simple loop takes 34 seconds
#111441 commented on
Jun 4, 2025 • 0 new comments -
torch.jit.script persistently changes default from utf-8 to ascii
#111480 commented on
Jun 4, 2025 • 0 new comments -
Triangular matrix storage + matmul
#122454 commented on
Jun 4, 2025 • 0 new comments -
Major perf regression with `BatchNorm2d` + `torch.compile` with `reduce-overhead` + DDP
#139207 commented on
Jun 4, 2025 • 0 new comments -
DISABLED test_byte_tensor_assignment (__main__.TestAdvancedIndexing)
#137028 commented on
Jun 4, 2025 • 0 new comments -
DISABLED test_reduce_stress_cuda (__main__.ProcessGroupGlooTest)
#152367 commented on
Jun 4, 2025 • 0 new comments -
DISABLED test_reduce_stress_cuda (__main__.ProcessGroupGlooLazyInitTest)
#152201 commented on
Jun 4, 2025 • 0 new comments -
Inconsistent behavior when indexing a Tensor with a list of lists
#119548 commented on
Jun 4, 2025 • 0 new comments -
non-strict export should detect fake tensor leakage
#153062 commented on
Jun 4, 2025 • 0 new comments -
Support loading and executing a ExportedProgram from torch.export in C++ environment
#144663 commented on
Jun 4, 2025 • 0 new comments -
Torch profiler corrupted names with Python 3.11
#121219 commented on
Jun 5, 2025 • 0 new comments -
Support for `uint16`, `uint32`, and `uint64`
#58734 commented on
Jun 5, 2025 • 0 new comments -
`torch==2.6` broke `nn.Module.dtype` typing
#152292 commented on
Jun 5, 2025 • 0 new comments -
Stack trace from pytest is very far away and far too find on some tests
#141204 commented on
Jun 5, 2025 • 0 new comments -
Obscure error: Expected a value of type 'List[int]' for argument 'sizes' but instead found type 'immutable_list'
#122129 commented on
Jun 5, 2025 • 0 new comments -
[Feature request] Exclusive prefix sum, `torch.cumsum(input, dim=0, exclusive=True)`
#76191 commented on
Jun 5, 2025 • 0 new comments -
Fx Graph cache hit generates guards that does not exists in the original cached program causing recompilations only at cache hit.
#152435 commented on
Jun 5, 2025 • 0 new comments -
Numerical inaccuracies in "ddp_apply_optim_in_backward" unit tests for gloo backend
#111834 commented on
Jun 5, 2025 • 0 new comments -
redundant recompilation caused by duplicated Sym()
#144068 commented on
Jun 5, 2025 • 0 new comments -
MPS backend appears to be limited to 32 bits
#84520 commented on
Jun 2, 2025 • 0 new comments -
Floating point exception when autocast is enabled
#154014 commented on
Jun 2, 2025 • 0 new comments -
[RFC] [Feature] Intra-Device Heterogeneous Memory Allocation Support
#153745 commented on
Jun 2, 2025 • 0 new comments -
[inline_inbuilt_nn_modules] Move export to inline_inbuilt_nn_modules
#147030 commented on
Jun 2, 2025 • 0 new comments -
torch.cuda.memory_reserved always returns 0 bytes
#103243 commented on
Jun 2, 2025 • 0 new comments -
[RFC] dropping CUDA 11.8 support in CI/CD
#147383 commented on
Jun 2, 2025 • 0 new comments -
[dynamo] Graph breaks from copy.deepcopy
#115122 commented on
Jun 2, 2025 • 0 new comments -
`copy_()` fails with HSDP in FSDP2
#147568 commented on
Jun 2, 2025 • 0 new comments -
[ONNX] Create unit tests for the new export path by adapting all existing tests
#129279 commented on
Jun 2, 2025 • 0 new comments -
SequentialLR does not work correctly with multiple ConstantLR
#82684 commented on
Jun 3, 2025 • 0 new comments -
Inconsistent export behavior for nonzero+grid_sample between CUDA and CPU/MPS backends
#152791 commented on
Jun 3, 2025 • 0 new comments -
pin_memory crashes for big tensors and leaks page locked memory
#152335 commented on
Jun 3, 2025 • 0 new comments -
Triton Compilation Error in Generated Code due to possible float division in index
#153375 commented on
Jun 3, 2025 • 0 new comments -
[feature request] Discover actually loaded shared libraries at runtime
#82098 commented on
Jun 3, 2025 • 0 new comments -
[RFC] Supporting Eager Mode via torch.compile
#115545 commented on
Jun 3, 2025 • 0 new comments -
Implement einsum backprop rather than decomposing
#149133 commented on
Jun 3, 2025 • 0 new comments -
NCCL out of memory error after updating to PyTorch 2.7
#152302 commented on
Jun 3, 2025 • 0 new comments -
matmul uses excessive memory in batch cases with more than 3 dimensions
#154128 commented on
Jun 3, 2025 • 0 new comments -
add `FlopCounterMode` documentation
#123800 commented on
Jun 3, 2025 • 0 new comments -
```grad_mode.py``` torch imported as NoneType, cannot import ```torch._jit_internal```
#154114 commented on
Jun 3, 2025 • 0 new comments -
[Feature Request] Memory optimization for backward propagation in GPU
#150698 commented on
Jun 3, 2025 • 0 new comments -
[triton pin update] Run Inductor CI on pin updates for Triton and the PyTorch nightly branch
#152608 commented on
Jun 3, 2025 • 0 new comments -
Add extensions `flash_attention` and `vllm` as test of new PyTorch releases for known issues of compat of their binaries and of possibility of compiling these from source
#155066 commented on
Jun 3, 2025 • 0 new comments -
Auto format lint not making suggestions
#153273 commented on
Jun 3, 2025 • 0 new comments -
UNSTABLE Build manywheel docker images for s390x / build-docker-cpu-s390x
#154074 commented on
Jun 3, 2025 • 0 new comments -
Unable to build with ATEN_THREADING=TBB option
#144767 commented on
Jun 3, 2025 • 0 new comments -
LibTorch build error on Windows for CUDA version (debug/release)
#139108 commented on
Jun 3, 2025 • 0 new comments -
[ONNX] Support while HOP
#146674 commented on
Jun 7, 2025 • 0 new comments -
[ONNX] Implement aten.stft
#147052 commented on
Jun 7, 2025 • 0 new comments -
[Dynamo][Custombackend]: rms_norm find inplace op when using aot_export_joint_simple
#154195 commented on
Jun 7, 2025 • 0 new comments -
Intel MKL DFTI ERROR
#120986 commented on
Jun 8, 2025 • 0 new comments -
MPS incompatibility: Calls into the C++ engine to run the backward pass
#143123 commented on
Jun 8, 2025 • 0 new comments -
torch.compile does not work with Flash attention 3
#144540 commented on
Jun 8, 2025 • 0 new comments -
Please consider adding MIG (MI-rror with G-radient modification) to torch.nn
#122680 commented on
Jun 8, 2025 • 0 new comments -
[feature request] `torch.scan` (also port `lax.fori_loop` / `lax.while_loop` / `lax.associative_scan` and hopefully parallelized associative scans)
#50688 commented on
Jun 8, 2025 • 0 new comments -
torch._higher_order_ops.scan incorrect/mismatched gradients for non-trailing layers with torch.compile
#153679 commented on
Jun 8, 2025 • 0 new comments -
aten::nonzero calls taking a huge amount of time when using MPS backend vs CPU
#124850 commented on
Jun 8, 2025 • 0 new comments -
profiler.export_stacks doesn't return stack trace unless experimental_config is provided
#100253 commented on
Jun 8, 2025 • 0 new comments -
Python 3.12 "from functorch.einops import rearrange" fails with "RuntimeError: First class dim doesn't work with python 3.12"
#142032 commented on
Jun 9, 2025 • 0 new comments -
Inductor aten.clone lowering ignores Conjugate and Negative dispatch keys
#145093 commented on
Jun 9, 2025 • 0 new comments -
`RuntimeError: UR error` with XPU
#149953 commented on
Jun 9, 2025 • 0 new comments -
`torch.set_default_tensor_type() is deprecated as of PyTorch 2.1` appearing in logs even when not using this function
#120584 commented on
Jun 9, 2025 • 0 new comments -
Mixed precision causes NaN loss
#40497 commented on
Jun 9, 2025 • 0 new comments -
DDP doesn't work with retain_graph = True
#47260 commented on
Jun 9, 2025 • 0 new comments -
[DDP] doesn't support multiple backwards when static_graph=True
#80832 commented on
Jun 9, 2025 • 0 new comments -
[ONNX] pad_sequence() is not exportable, with neither legacy onnx.export nor with dynamo_export
#127153 commented on
Jun 9, 2025 • 0 new comments -
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int64 (__main__.TestForeachCUDA)
#149735 commented on
Jun 9, 2025 • 0 new comments -
DISABLED test_binary_op_with_scalar_self_support__foreach_pow_is_fastpath_True_cuda_int32 (__main__.TestForeachCUDA)
#149628 commented on
Jun 9, 2025 • 0 new comments -
Docs Update `wrap_triton`
#152870 commented on
Jun 9, 2025 • 0 new comments -
Make implicit packages (PEP420) explicit PyTorch
#153546 commented on
Jun 9, 2025 • 0 new comments -
Torchrun does not handle worker failure gracefully
#146371 commented on
Jun 9, 2025 • 0 new comments -
Triton has removed the experimental descriptor API
#154162 commented on
Jun 9, 2025 • 0 new comments -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_float16 (__main__.TestForeachCUDA)
#153250 commented on
Jun 10, 2025 • 0 new comments -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_complex64 (__main__.TestForeachCUDA)
#151313 commented on
Jun 10, 2025 • 0 new comments -
Error : torch/utils/_sympy/interp.py:176] [0/2] failed while executing pow_by_natural([VR1, int_oo], VR[-1, -1]])
#148003 commented on
Jun 5, 2025 • 0 new comments -
GroupNorm compilation errors on UNet-based architecture on torch >= 2.6.0
#152185 commented on
Jun 5, 2025 • 0 new comments -
fbgemm packages are compiled in torchinductor torchbench tests
#152024 commented on
Jun 5, 2025 • 0 new comments -
"RuntimeError: CUDA error: operation not supported" fixed by downgrading toolkit version
#135126 commented on
Jun 5, 2025 • 0 new comments -
Unexpected float32 overflow for amp training with torch.compile
#153044 commented on
Jun 5, 2025 • 0 new comments -
Dump bytecode of resumption frames in tlparse
#136038 commented on
Jun 5, 2025 • 0 new comments -
PyTorch Docathon H1 2025
#153952 commented on
Jun 5, 2025 • 0 new comments -
torch.compile() within TorchDispatchMode always causes an unknown guard failure.
#144787 commented on
Jun 5, 2025 • 0 new comments -
Enable 12.8.1
#152922 commented on
Jun 6, 2025 • 0 new comments -
unbacked inputs not being preserved in backwards graph
#153778 commented on
Jun 6, 2025 • 0 new comments -
Compile + torch.autograd.grad returns no gradients
#132929 commented on
Jun 6, 2025 • 0 new comments -
PGO errors out in dynamo main path for sparse tensors
#154161 commented on
Jun 6, 2025 • 0 new comments -
Allow creation of pseudo devices for testing purposes
#61654 commented on
Jun 6, 2025 • 0 new comments -
RuntimeError: Shared memory manager connection has timed out
#129656 commented on
Jun 6, 2025 • 0 new comments -
'torch.sparse.to_sparse_semi_structured' significantly worsens performance on H100 GPUs
#153825 commented on
Jun 6, 2025 • 0 new comments -
DISABLED test_nn_module (__main__.TestGuardSerialization)
#153120 commented on
Jun 6, 2025 • 0 new comments -
FP8 Support for FlexAttention
#151695 commented on
Jun 6, 2025 • 0 new comments -
Can pytorch add sparse linear solvers like scipy.sparse.linalg.gmres, scipy.sparse.linalg.bicg etc.
#133676 commented on
Jun 6, 2025 • 0 new comments -
☂️ MPS support for large tensors
#149325 commented on
Jun 6, 2025 • 0 new comments -
`bytes(...)` support of torch tensor does not match numpy + it would be nice to support tensor.tobytes() as alias
#108565 commented on
Jun 6, 2025 • 0 new comments -
compilation fails `error: invalid argument '-std=c++17' not allowed with 'C'`
#103222 commented on
Jun 6, 2025 • 0 new comments -
Divergence of handling python del in dynamo vs eager
#153701 commented on
Jun 6, 2025 • 0 new comments -
FSDP2 "got mixed torch.Tensor and DTensor"
#153354 commented on
Jun 6, 2025 • 0 new comments -
[export] fail to export joint graph of a model with tied weights using experimental `_export_forward_backward` API
#147380 commented on
Jun 6, 2025 • 0 new comments -
LowRankMultivariateNormal doesn't work with 0 diagonal
#75173 commented on
Jun 7, 2025 • 0 new comments -
Cannot Convert Pytorch model with fft_rfftn layers to ONNX using latest torch.onnx.dynamo_export
#133785 commented on
Jun 7, 2025 • 0 new comments -
[ONNX] Migrate torchlib from onnxscript
#139301 commented on
Jun 7, 2025 • 0 new comments -
☂️ Update submodule dependencies to supported version of Cmake
#150328 commented on
May 26, 2025 • 0 new comments -
FPE in `torch.remainder`
#153919 commented on
May 27, 2025 • 0 new comments -
Inductor inappropriately tries to fuse scalar views of a CPU tensor into GPU kernels.
#140457 commented on
May 27, 2025 • 0 new comments -
Error after successful build: No module named 'torch._C._distributed_c10d'
#152285 commented on
May 27, 2025 • 0 new comments -
DISABLED test_tensor_subclasses (__main__.TestScript)
#119949 commented on
May 27, 2025 • 0 new comments -
associative scan is incorrect for certain shapes/kwargs
#137943 commented on
May 27, 2025 • 0 new comments -
2.2.0+ regresses SDPA performance on Windows
#125070 commented on
May 27, 2025 • 0 new comments -
Slow performance when running torch.jit traced model with Flash Attention using libtorch on Windows
#109770 commented on
May 27, 2025 • 0 new comments -
Deprecate and remove usage of from __future__ import annotations in codebase
#117449 commented on
May 27, 2025 • 0 new comments -
`torch.compile` and complex numbers
#125718 commented on
May 27, 2025 • 0 new comments -
Building extensions with CMake
#115937 commented on
May 27, 2025 • 0 new comments -
AssertionError: Guard check failed: 0/1: x.size()[0] == y.size()[0] # (unknown source x.size()[0], please file a bug)
#153923 commented on
May 27, 2025 • 0 new comments -
Using Inductor always throws a warning
#154160 commented on
May 27, 2025 • 0 new comments -
The "eager" and "aot_eager" backends have different behavior for the expected gradient tensor of the torch.expend_as operator
#151884 commented on
May 27, 2025 • 0 new comments -
Poor-quality random numbers generated by torch.poisson on gpus
#136750 commented on
May 27, 2025 • 0 new comments -
Add deterministic support for upsample_trilinear3d_backward_out_cuda
#154183 commented on
May 27, 2025 • 0 new comments -
torch.distributed.nn.all_reduce incorrectly scales the gradient
#58005 commented on
May 27, 2025 • 0 new comments -
[Dynamo][Inductor] `detectron2_fcos_r_50_fpn` in export config failure on dashboard
#154137 commented on
May 27, 2025 • 0 new comments -
[Release improvements] Have cherry-pick bot always add the current release to the PR
#152212 commented on
May 27, 2025 • 0 new comments -
[MPS] Possible persistent infinite loop in `nn.ReplicationPad1d`
#135442 commented on
May 27, 2025 • 0 new comments -
Flex Attention doesn't scale with custom bias
#152593 commented on
May 28, 2025 • 0 new comments -
Undefined Symobl: pybind11::detail::type_caster<at::Tensor, void>::load(pybind11::handle, bool)
#108041 commented on
May 28, 2025 • 0 new comments -
`vmap` not working on `torch.arange`, `torch.scalar_tensor`, and `torch.ones`
#152295 commented on
May 28, 2025 • 0 new comments -
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "/opt/conda/conda-bld/pytorch_1695392020201/work/c10/cuda/CUDACachingAllocator.cpp":1154, please report a bug to PyTorch.
#112377 commented on
May 28, 2025 • 0 new comments -
[XPU User Empathy Day] [Windows] First 'import torch' takes long time on Arc
#154180 commented on
May 28, 2025 • 0 new comments -
[Intel GPU][XPU] Slow DDP training using oneCCL backend
#153438 commented on
May 28, 2025 • 0 new comments -
SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats - PyTorch compile fails with Python 3.12
#153737 commented on
May 28, 2025 • 0 new comments -
RuntimeError prompting a bug report.
#154197 commented on
May 23, 2025 • 0 new comments -
Add clamped FP8 (E4M3) cast for overflow-safe inference
#154028 commented on
May 23, 2025 • 0 new comments -
[CUDA] test_c10d_nccl test_extra_cuda_context failure due to _helper_test_extra_cuda_context_by_memory
#153122 commented on
May 23, 2025 • 0 new comments -
Trying to use forward AD with _scaled_dot_product_flash_attention that does not support it because it has not been implemented yet.
#128971 commented on
May 23, 2025 • 0 new comments -
[inductor][triton] Block ptrs are being removed from Triton
#154025 commented on
May 23, 2025 • 0 new comments -
Dynamic compilation fails with torch 2.7
#153937 commented on
May 23, 2025 • 0 new comments -
Sum difference for equal channels of tensor
#153564 commented on
May 23, 2025 • 0 new comments -
[Feature request] `torch.export` .save/.load could support `safetensors` and/or `weights_only=True`
#153410 commented on
May 23, 2025 • 0 new comments -
`detectron2_maskrcnn` OOMs on eager with A100 40G.
#120115 commented on
May 23, 2025 • 0 new comments -
[feature request] "Batched" index_select (i.e. simplified torch.gather with not specifying full index)
#64208 commented on
May 23, 2025 • 0 new comments -
Error on padding 0-sized tensors
#152750 commented on
May 23, 2025 • 0 new comments -
[feature request] `torch.to(obj, device, dtype)` supporting recursive lists/dicts/tuples of tensors probably by uplifting/promoting `torch.distributed.utils._recursive_to`
#69431 commented on
May 23, 2025 • 0 new comments -
[feature request] [discussion] Baseline ONNX interpreter / executor in python / PyTorch
#130114 commented on
May 24, 2025 • 0 new comments -
pybind11 loading for c10::Scalar NYI
#154187 commented on
May 24, 2025 • 0 new comments -
`torch.ldexp` goes out of range when `2**other` is out of range
#153069 commented on
May 24, 2025 • 0 new comments -
torch.compile raise JSONDecodeError("Extra data", s, end) while using Ray with Ulysses + 4 GPUs
#153791 commented on
May 24, 2025 • 0 new comments -
[ued][gemma3] HF + torch.compile - torch.compile on Gemma3
#149574 commented on
May 24, 2025 • 0 new comments -
Inductor C++ Compile Error
#154127 commented on
May 24, 2025 • 0 new comments -
NotImplementedError: Output channels > 65536 not supported at the MPS device.
#144445 commented on
May 24, 2025 • 0 new comments -
Immutable (read-only) tensors
#44027 commented on
May 24, 2025 • 0 new comments -
A bunch of fft ops fails the size/strides assert
#145977 commented on
May 25, 2025 • 0 new comments -
Improve typing of args and kwargs with ParamSpec
#142306 commented on
May 25, 2025 • 0 new comments -
Segfault, possibly due to recursion limit
#127622 commented on
May 25, 2025 • 0 new comments -
Add option for custom ops to automatically get a FakeTensor kernel (during static shapes)
#127337 commented on
May 25, 2025 • 0 new comments -
`collect_env.py` fails with `'NoneType' object has no attribute 'splitlines'` if pytorch is installed without pip
#144615 commented on
May 25, 2025 • 0 new comments -
inductor unbacked codegen results in undefined inputs
#154146 commented on
May 25, 2025 • 0 new comments -
Invalid handling of nans in compiled torch.quantile / torch.nanquantile on cuda
#152423 commented on
May 26, 2025 • 0 new comments -
[BUG] `functionalize` silent data corruption with pre-strided tensors
#153861 commented on
May 30, 2025 • 0 new comments -
Conda Pytorch set processor affinity to the first physical core after fork
#99625 commented on
May 30, 2025 • 0 new comments -
importing torch._dynamo under meta device fails
#153330 commented on
May 30, 2025 • 0 new comments -
torch.compile on torch.vmap function gives different shape to torch.vmap alone when using jacrev
#154036 commented on
May 30, 2025 • 0 new comments -
[FSDP+TP] RuntimeError: 'weight' must be 2-D
#124019 commented on
May 30, 2025 • 0 new comments -
Unexpected incorrect size error in GaussianNLLLoss
#147521 commented on
May 30, 2025 • 0 new comments -
AOTAutograd export path does not support training graphs with parameters that do not receive gradients.
#101192 commented on
May 30, 2025 • 0 new comments -
`__setitem__` with bool mask and dtype mismatch fails
#150017 commented on
May 30, 2025 • 0 new comments -
`index_copy` has different index behavior with `index_fill`
#73501 commented on
May 30, 2025 • 0 new comments -
Bracket indexing not working
#145143 commented on
May 30, 2025 • 0 new comments -
torch.compile joint trace materializes constant tensors
#154083 commented on
May 30, 2025 • 0 new comments -
[ONNX] export() with dynamic shapes fails where dynamo_export(dynamic_shapes=True) succeeds
#126607 commented on
May 31, 2025 • 0 new comments -
Stable C bindings for libtorch
#145656 commented on
May 31, 2025 • 0 new comments -
Non-blocking GPU to CPU copy of complex numbers with the different conj status produces wrong results
#146286 commented on
May 31, 2025 • 0 new comments -
torch.nn.functional.scaled_dot_product_attention is_causal fails for kv-cache case (sequential and further parallel attention)
#144858 commented on
May 31, 2025 • 0 new comments -
【Pytorch mobile android】torch jit forward fail,Calling torch.geqrf on a CPU tensor requires compiling PyTorch with LAPACK. Please use PyTorch built with LAPACK support.
#130309 commented on
May 31, 2025 • 0 new comments -
Torch 2.1 compile + FSDP (mixed precision) + LlamaForCausalLM: `RuntimeError: attempting to assign a gradient with dtype 'c10::BFloat16' to a tensor with dtype 'float'.`
#111317 commented on
Jun 1, 2025 • 0 new comments -
dist.init_process_group not work
#154102 commented on
Jun 1, 2025 • 0 new comments -
torch.compile doesnot support index with tensor
#151997 commented on
Jun 1, 2025 • 0 new comments -
Async NCCL communication blocks CUDA kernel in the first run
#136248 commented on
Jun 2, 2025 • 0 new comments -
MPS Backend Error: ComplexDouble (complex128) Conversion Fails When Diffusers Transformer Creates 64‐bit Complex Tensors
#148670 commented on
Jun 2, 2025 • 0 new comments -
RAM leak during data loading with multiprocessing and Conv3d on CPU in Dataset __getitem__
#150612 commented on
Jun 2, 2025 • 0 new comments -
Add aten::empty.memory_format for SparseMPS
#87886 commented on
Jun 2, 2025 • 0 new comments -
ByteTensor fails under FakeTensorMode()
#146635 commented on
Jun 2, 2025 • 0 new comments -
addition of muon optimizer to torch.optim
#148819 commented on
Jun 2, 2025 • 0 new comments -
Crash when testing Libtorch example
#129819 commented on
Jun 2, 2025 • 0 new comments -
Kohya SS FLUX LoRA training is way faster on Linux than Windows any ideas to debug? Same settings, libraries and GPU
#134324 commented on
Jun 2, 2025 • 0 new comments -
MX basic dtypes in pytorch/pytorch
#146414 commented on
May 28, 2025 • 0 new comments -
[inductor][cpu] aoti shared memory run on multiple cores, performance will drop
#154094 commented on
May 28, 2025 • 0 new comments -
Update `torch/nn/modules/conv.py` to use Literal for support padding modes
#152280 commented on
May 28, 2025 • 0 new comments -
MPS: Placeholder tensor is empty!
#81180 commented on
May 28, 2025 • 0 new comments -
NFS errors during DataLoader shutdown when num_workers > 1 when temporary directory is on NFS
#143471 commented on
May 28, 2025 • 0 new comments -
CuDNN SDPA Issue Tracker
#141133 commented on
May 28, 2025 • 0 new comments -
`import torch` takes forever
#137260 commented on
May 28, 2025 • 0 new comments -
cuda_utils.so: failed to map segment from shared object
#123054 commented on
May 28, 2025 • 0 new comments -
inductor `full_like` decompositions give incorrect strides
#144699 commented on
May 28, 2025 • 0 new comments -
Performance Regression nightly 03/11→03/12, on nanogpt speedrun
#152823 commented on
May 28, 2025 • 0 new comments -
[nightly][jit] bad constant exponent (e+38.f) in default_program fused_mul_div_add
#107503 commented on
May 29, 2025 • 0 new comments -
[onnx] [njt] [feature request] Export NJT-enabled SDPA / MHA ops to ORT's PackingMode Attention
#140130 commented on
May 29, 2025 • 0 new comments -
with torch compile, bf16 gelu,silu, and mish are not deterministic in some sense.
#154150 commented on
May 29, 2025 • 0 new comments -
undefined symbol: __nvJitLinkCreate_12_8, version libnvJitLink.so.12
#152783 commented on
May 29, 2025 • 0 new comments -
GRU and LSTM fail for seq_len = 0
#50192 commented on
May 29, 2025 • 0 new comments -
[torch.export.load] failed while executing `pow_by_natural`
#136628 commented on
May 29, 2025 • 0 new comments -
MPS backend gradient correctness issues with large shapes
#153957 commented on
May 29, 2025 • 0 new comments -
A more flexible API for torch.compile fullgraph=True
#144908 commented on
May 29, 2025 • 0 new comments -
[Benchmark] High compilation time variance on benchmark dashboards
#152566 commented on
May 29, 2025 • 0 new comments -
Selective Activation Checkpointing on custom autograd.Function
#153334 commented on
May 29, 2025 • 0 new comments -
Support torch.func.grad for Flex Attention
#144810 commented on
May 29, 2025 • 0 new comments -
SourcelessBuilder.create does not know how to wrap <class '__main__.InFlexData'>
#154009 commented on
May 30, 2025 • 0 new comments -
torch.nn.functional.scaled_dot_product_attention returns NaN values after backward pass.
#126654 commented on
May 30, 2025 • 0 new comments -
Feature Request: CUDA torch.histogram (and histogramdd)
#69519 commented on
May 30, 2025 • 0 new comments -
Massive initial memory overhead GPU
#12873 commented on
May 30, 2025 • 0 new comments -
Device Error on vmap
#151591 commented on
May 30, 2025 • 0 new comments -
torch.compile fails for complex nested_tensor code
#130825 commented on
May 30, 2025 • 0 new comments -
[dtensor] ops coverage tracker
#119930 commented on
Jun 18, 2025 • 0 new comments -
RFC: The State of Custom CUDA extensions in PyTorch
#152032 commented on
Jun 19, 2025 • 0 new comments -
DISABLED test_memory_snapshot (__main__.TestCudaMallocAsync)
#126953 commented on
Jun 19, 2025 • 0 new comments -
Dynamo handling for all methods of torch.Generator
#88576 commented on
Jun 19, 2025 • 0 new comments -
Windows inductor genarated zero size array code, and is not supported by MSVC(C2466).
#153180 commented on
Jun 19, 2025 • 0 new comments -
DISABLED test_mempool_ctx_multithread (__main__.TestMemPool)
#153460 commented on
Jun 19, 2025 • 0 new comments -
NotImplementedError: Could not run 'aten::log' with arguments from the 'SparseCUDA' backend.
#153497 commented on
Jun 19, 2025 • 0 new comments -
cpp wrapper calls back to python for custom op even when a C++ registration is made
#153478 commented on
Jun 19, 2025 • 0 new comments -
Segmentation fault when converting sparse COO tensor with complex values to dense
#153329 commented on
Jun 19, 2025 • 0 new comments -
`torch.sparse.log_softmax` output mismatch between CPU and CUDA
#152293 commented on
Jun 19, 2025 • 0 new comments -
NotImplementedError: Could not run 'aten::index.Tensor' with arguments from the 'SparseCUDA' backend.
#152226 commented on
Jun 19, 2025 • 0 new comments -
Sparse tensor indexing not implemented, but partially supported by using index_select
#150277 commented on
Jun 19, 2025 • 0 new comments -
Dynamo export: Fake tensor broadcast error
#129534 commented on
Jun 19, 2025 • 0 new comments -
DISABLED test_wait_tensor (__main__.CompileTest)
#148014 commented on
Jun 19, 2025 • 0 new comments -
Make streams used for NCCL operations configurable
#67158 commented on
Jun 19, 2025 • 0 new comments -
[ONNX] ONNX export of simple quantized model fails
#113817 commented on
Jun 20, 2025 • 0 new comments -
[torch.compile][Megatron] Error with Megatron with Pytorch v2.5.0 using `AOTAutograd` and `torch.compile`
#141783 commented on
Jun 20, 2025 • 0 new comments -
Remove redundant type aliases of _device for torch.Device
#152952 commented on
Jun 20, 2025 • 0 new comments -
The docstring linter should not force overridden methods to be documented
#151692 commented on
Jun 20, 2025 • 0 new comments -
DISABLED test_sdpa_mask_fp16_L6_S17_NH23_HS121 (__main__.TestSDPA)
#138905 commented on
Jun 20, 2025 • 0 new comments -
[ONNX] Simple torch.nn.Identity onnx export with dynamo=True does not load
#151017 commented on
Jun 20, 2025 • 0 new comments -
[ONNX] Use dlpack to transfer tensors when onnxruntime implements proper support
#151064 commented on
Jun 20, 2025 • 0 new comments -
[export] Decomp failure when running `aten.item.default`
#150823 commented on
Jun 20, 2025 • 0 new comments -
[ONNX Convert] Error when input to nn.AdaptiveAvgPool2d size is variable
#147720 commented on
Jun 20, 2025 • 0 new comments -
`torch.onnx.export` (dynamo=False) fails with uninformative error when exporting `apply_rotary_pos_emb`/`repeat_interleave`
#145100 commented on
Jun 20, 2025 • 0 new comments -
[ONNX] broadcast_in_dim: model (ReDimNet)
#138313 commented on
Jun 20, 2025 • 0 new comments -
[CI] [anaconda] CI Build and Test scripts Linux
#148336 commented on
Jun 20, 2025 • 0 new comments -
[DCP] Allow for rank-specific tensors with duplicate keys
#146566 commented on
Jun 17, 2025 • 0 new comments -
[feature request] Exact euclidean distance transform
#61509 commented on
Jun 17, 2025 • 0 new comments -
DISABLED test_run_decompositions_map_handle_to_new_nodes (__main__.TestNumericDebugger)
#144933 commented on
Jun 17, 2025 • 0 new comments -
Feature Request: Add a rounding mode to round
#55289 commented on
Jun 17, 2025 • 0 new comments -
[feature request] Rank-Revealing QR - Adding dgeqp3 support to torch.qr
#10454 commented on
Jun 17, 2025 • 0 new comments -
TorchInductor CPU Performance Dashboard
#93531 commented on
Jun 17, 2025 • 0 new comments -
torch.compile on MPS progress tracker
#150121 commented on
Jun 17, 2025 • 0 new comments -
[Async TP] Fuse all-gather-matmuls for float8 rowwise training
#149990 commented on
Jun 17, 2025 • 0 new comments -
Inductor Perf MX to_blocked
#153194 commented on
Jun 17, 2025 • 0 new comments -
[ONNX] dynamic_axes does not rename dynamic dimension in torch.onnx.export
#150544 commented on
Jun 17, 2025 • 0 new comments -
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float32 (__main__.TestForeachCUDA)
#153470 commented on
Jun 17, 2025 • 0 new comments -
The difference between input grad computed by channels last backward and the input grad computed by channels first backward of Hardswish on MPS is too large
#107214 commented on
Jun 17, 2025 • 0 new comments -
[NJT] can only chunk if the 2nd dimension is ragged
#153238 commented on
Jun 17, 2025 • 0 new comments -
RuntimeError: NVML_SUCCESS == r INTERNAL ASSERT FAILED at "/opt/pytorch/pytorch/c10/cuda/CUDACachingAllocator.cpp":830, please report a bug to PyTorch.
#123834 commented on
Jun 17, 2025 • 0 new comments -
Device check missing in torch.linalg.solve_triangular leading to hard crash
#142048 commented on
Jun 17, 2025 • 0 new comments -
UR Error when calling grid_sample
#153996 commented on
Jun 18, 2025 • 0 new comments -
DTensor RNG state for non CUDA backends
#138329 commented on
Jun 18, 2025 • 0 new comments -
xpu: implement aten::_linalg_eigvals for XPU backend (affecting HF Transformers v4.46.0-v4.48.0)
#140965 commented on
Jun 18, 2025 • 0 new comments -
DISABLED test_per_sample_api_compute_batch_size_not_pytreeable_cpu (__main__.TestExpandedWeightModuleCPU)
#146972 commented on
Jun 18, 2025 • 0 new comments -
DISABLED test_inductor_all_gather_into_tensor_single (__main__.CompileTest)
#147707 commented on
Jun 18, 2025 • 0 new comments -
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float64 (__main__.TestForeachCUDA)
#153544 commented on
Jun 18, 2025 • 0 new comments -
Timer benchmark stores only one time value, and therefore has broken mean/median/etc metrics
#106801 commented on
Jun 18, 2025 • 0 new comments -
[RFC][API-Unstable] Support 3rd party SYCL kernels with CPP Extension API
#153265 commented on
Jun 18, 2025 • 0 new comments -
Compile produces different result than eager for mutable custom op use case
#153389 commented on
Jun 18, 2025 • 0 new comments -
Escape hatch: way to dynamically add or remove tags from custom operators
#150972 commented on
Jun 18, 2025 • 0 new comments -
"RuntimeError: makeDeviceForHostname(): unsupported gloo device" with nightly torch 2.8
#150381 commented on
Jun 18, 2025 • 0 new comments -
Context Parallel -- unsharded output doesn't match output without CP.
#152261 commented on
Jun 18, 2025 • 0 new comments -
Compiled `nn.Module` with tensor subclass can't be moved to another device
#141548 commented on
Jun 23, 2025 • 0 new comments -
Flex Attention is incompatible with selective AC
#147879 commented on
Jun 23, 2025 • 0 new comments -
DISABLED test_cat_max_autotune_triton (__main__.TestMaxAutotune)
#145830 commented on
Jun 23, 2025 • 0 new comments -
DISABLED test_ranks_and_tag (__main__.CompileTest)
#147974 commented on
Jun 23, 2025 • 0 new comments -
Support sparse COO/CSR/CSC/BSR/BSC return values in gradcheck input function
#97825 commented on
Jun 19, 2025 • 0 new comments -
Support building pytorch using MKL ILP64 model.
#102613 commented on
Jun 19, 2025 • 0 new comments -
Automated submodule update: kineto
#106149 commented on
Jun 19, 2025 • 0 new comments -
[RFC] Tensordict integration
#112441 commented on
May 28, 2025 • 0 new comments -
[pytree] support PyStructSequence types for Python pytree
#113258 commented on
Jun 19, 2025 • 0 new comments -
Automated submodule update: FBGEMM
#115316 commented on
Jun 23, 2025 • 0 new comments -
[DO NOT MERGE] Test new ROCm CI Navi31 nodes
#124424 commented on
May 25, 2025 • 0 new comments -
refine fp32 precision api
#125888 commented on
Jun 23, 2025 • 0 new comments -
allow to use bf16 as fp32 internal precision for mkldnn conv
#126050 commented on
Jun 23, 2025 • 0 new comments -
allow to use bf16 as fp32 internal precision for mkldnn conv backward
#126054 commented on
Jun 23, 2025 • 0 new comments -
[AOTAutograd] tweak min-cut partitioner to avoid saving softmax output
#126348 commented on
Jun 18, 2025 • 0 new comments -
[inductor] enable bf32 test for mkldnn conv
#127293 commented on
Jun 23, 2025 • 0 new comments -
[inductor] enable bf32 for mkldnn linear pointwise/binary in inductor
#127294 commented on
Jun 23, 2025 • 0 new comments -
Fix numerical instability for norm
#129352 commented on
Jun 22, 2025 • 0 new comments -
[DTensor] decomposed sharding propagation
#130887 commented on
Jun 19, 2025 • 0 new comments -
Remove deprecated jit code
#131296 commented on
Jun 20, 2025 • 0 new comments -
[torch.special] Adding betainc, betaincc, betaincinv, betainccinv, betaln and beta with backward operation
#132135 commented on
Jun 9, 2025 • 0 new comments -
Avoid sqrt calculations with values less than zero
#136824 commented on
Jun 21, 2025 • 0 new comments -
Load cuda deps more aggressively
#137059 commented on
Jun 12, 2025 • 0 new comments -
Help fix numpy detection in cross compiled layouts
#137084 commented on
Jun 7, 2025 • 0 new comments -
[pytree] Add public pytree module `torch.utils.pytree`
#137400 commented on
Jun 18, 2025 • 0 new comments -
Add TORCH_CHECK_INDEX in convert_indices_from_coo_to_csr_cpu
#138068 commented on
Jun 18, 2025 • 0 new comments -
[pytree] add `treespec_{leaf,tuple,dict}` functions for args_spec modification
#138214 commented on
Jun 18, 2025 • 0 new comments -
[CI] [anaconda] CI Build and Test scripts Windows
#148338 commented on
Jun 20, 2025 • 0 new comments -
[Docs] [anaconda] Review and update
#148339 commented on
Jun 20, 2025 • 0 new comments -
[CI] [anaconda] CI Build and Test scripts MacOS
#148340 commented on
Jun 20, 2025 • 0 new comments -
[CI] [anaconda] Docker files have conda environment installed
#148335 commented on
Jun 20, 2025 • 0 new comments -
[release] Make pytorch source distribution package respect pep-0517
#150461 commented on
Jun 20, 2025 • 0 new comments -
CUDA 12.6 Inductor accuracy test failures
#148699 commented on
Jun 20, 2025 • 0 new comments -
[torch.export] Cannot export TorchVision fasterrcnn_mobilenet_v3_large_fpn
#146152 commented on
Jun 20, 2025 • 0 new comments -
DISABLED test_non_contiguous_input_mm_plus_mm (__main__.TestMaxAutotune)
#126867 commented on
Jun 20, 2025 • 0 new comments -
DISABLED test_slice_scatter_reinplace_cuda (__main__.GPUTests)
#145189 commented on
Jun 20, 2025 • 0 new comments -
DISABLED test_inductor_reduce_scatter_tensor_coalesced (__main__.CompileTest)
#147887 commented on
Jun 20, 2025 • 0 new comments -
UNSTABLE pull / cuda12.8-py3.10-gcc9-sm75 / test (pr_time_benchmarks)
#153987 commented on
Jun 20, 2025 • 0 new comments -
torch.export does not support torchaudio.transforms.Spectrogram
#112844 commented on
Jun 20, 2025 • 0 new comments -
MSE documentation is weak
#88327 commented on
Jun 20, 2025 • 0 new comments -
Division by zero in ONNX export with `dynamo=True` leading to NaN outputs
#150623 commented on
Jun 21, 2025 • 0 new comments -
Looking for valid compiling option for extension based on torch-2.1.0+cpu.cxx11.abi
#143780 commented on
Jun 21, 2025 • 0 new comments -
Tensor.lerp inconsistent when using -Infinity between MPS and CPU
#111374 commented on
Jun 21, 2025 • 0 new comments -
Segmentation fault when calling `torch.choose_qparams_optimized()` with empty tensors and extreme num_bins value
#153326 commented on
Jun 21, 2025 • 0 new comments -
CompiledFxGraph.current_callable is not thread-safe
#138961 commented on
Jun 21, 2025 • 0 new comments -
General MPS op coverage tracking issue
#77764 commented on
Jun 21, 2025 • 0 new comments -
[ONNX] exported nodes of Multi-head attention can be simplified
#151209 commented on
Jun 21, 2025 • 0 new comments -
MPS operator coverage tracking issue (2.6+ version)
#141287 commented on
Jun 22, 2025 • 0 new comments -
Online softmax is disabled on the fly
#153241 commented on
Jun 22, 2025 • 0 new comments -
foreach CUDA tests flaky on CUDA 12.6+ due to flaky profiler results
#148681 commented on
Jun 22, 2025 • 0 new comments -
Improve error message for wrong number of arguments in CachingAutotuner
#146018 commented on
Jun 23, 2025 • 0 new comments -
Docs are little bit outdated for torch logs
#137285 commented on
Jun 23, 2025 • 0 new comments -
[RFC] Use CUDA graphs by default on torch.compile
#121968 commented on
Jun 23, 2025 • 0 new comments -
`setup.py develop` command is disappearing soon from `setuptools`
#152276 commented on
Jun 23, 2025 • 0 new comments -
DISABLED test_inductor_inplace_op_on_view (__main__.CompileTest)
#147852 commented on
Jun 11, 2025 • 0 new comments -
Build pytorch for rocm failed
#148167 commented on
Jun 11, 2025 • 0 new comments -
DISABLED test_repeated_calling_cuda (__main__.AOTInductorTestABICompatibleGpu)
#146185 commented on
Jun 11, 2025 • 0 new comments -
DISABLED test_inductor_reduce_scatter_tensor_single (__main__.CompileTest)
#147911 commented on
Jun 11, 2025 • 0 new comments -
DISABLED test_foreach_l2_large_value_input__foreach_norm_cuda_float16 (__main__.TestForeachCUDA)
#150509 commented on
Jun 11, 2025 • 0 new comments -
When scoped_libary is destroyed the fake impls are not cleared
#152720 commented on
Jun 11, 2025 • 0 new comments -
Different Cholesky results between Windows & Linux
#131774 commented on
Jun 11, 2025 • 0 new comments -
Some files in sccache are owned by `hostmaster+pytorch`
#139143 commented on
Jun 11, 2025 • 0 new comments -
torch.compile fails in FSDP due to .data assignment with different floating type
#152162 commented on
Jun 11, 2025 • 0 new comments -
custom_op's backward changes can't invalidate `torch.compile` cache for backward
#144344 commented on
Jun 11, 2025 • 0 new comments -
Pytorch PP requires all parameters to have grad in backward
#153484 commented on
Jun 11, 2025 • 0 new comments -
MaxPool2D memory leakage on device MPS
#125217 commented on
Jun 11, 2025 • 0 new comments -
DISABLED test_inductor_reuse_buffer_after_inplace_collective (__main__.CompileTest)
#147950 commented on
Jun 11, 2025 • 0 new comments -
[PTD BE DAY]Burn Down Distributed Disabled Tests!!
#132845 commented on
Jun 12, 2025 • 0 new comments -
NCCL ISend is not asynchronous
#108378 commented on
Jun 12, 2025 • 0 new comments -
nn.InstanceNorm and nn.GroupNorm are affected by padding, so they need to masking
#81985 commented on
Jun 12, 2025 • 0 new comments -
[RFC] Proposed Changes to Feature Tracking & Classification for PyTorch Releases starting Release 2.8
#152134 commented on
Jun 12, 2025 • 0 new comments -
DISABLED test_remove_noop_slice1_cpu (__main__.CpuTests)
#151379 commented on
Jun 12, 2025 • 0 new comments -
DISABLED test_remove_noop_slice1_cuda (__main__.GPUTests)
#151381 commented on
Jun 12, 2025 • 0 new comments -
DISABLED test_remove_noop_slice_cuda (__main__.GPUTests)
#151383 commented on
Jun 12, 2025 • 0 new comments -
DISABLED test_remove_noop_slice_scatter_cpu (__main__.CpuTests)
#151382 commented on
Jun 12, 2025 • 0 new comments -
DISABLED test_inductor_all_gather_into_tensor_coalesced (__main__.CompileTest)
#146806 commented on
Jun 12, 2025 • 0 new comments -
DISABLED test_remove_noop_slice_scatter_cuda (__main__.GPUTests)
#151378 commented on
Jun 12, 2025 • 0 new comments -
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_bfloat16 (__main__.TestForeachCUDA)
#150932 commented on
Jun 12, 2025 • 0 new comments -
[ONNX] Flip `dynamo` default to True in torch.onnx.export
#151693 commented on
Jun 12, 2025 • 0 new comments -
[ONNX] Support torchvision ops
#146459 commented on
Jun 12, 2025 • 0 new comments -
torch.nn.functional.one_hot has inconsistent behavior between eager and torch.compile when num_classes=0
#146274 commented on
Jun 12, 2025 • 0 new comments -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_float32 (__main__.TestForeachCUDA)
#153284 commented on
Jun 10, 2025 • 0 new comments -
DISABLED test_while_loop_schema_gen (__main__.TestHopSchema)
#141202 commented on
Jun 10, 2025 • 0 new comments -
DISABLED test_rng (__main__.TestCompilerBisector)
#139590 commented on
Jun 10, 2025 • 0 new comments -
Memory Corruption in `torch.batch_norm_update_stats`
#153967 commented on
Jun 10, 2025 • 0 new comments -
DISABLED test_parity__foreach_add_fastpath_inplace_cuda_float64 (__main__.TestForeachCUDA)
#153395 commented on
Jun 10, 2025 • 0 new comments -
need to document `FlopCounterMode`
#145555 commented on
Jun 10, 2025 • 0 new comments -
torch.nn.functional.conv_transpose2d produces inconsistent output on CPU and CUDA
#153276 commented on
Jun 10, 2025 • 0 new comments -
Unexpected overflow behavior when using `torch.addcmul`
#152294 commented on
Jun 10, 2025 • 0 new comments -
`torch.nn.functional.conv_transpose2d` has inconsistent handling of `float16` overflow on CPU
#153700 commented on
Jun 10, 2025 • 0 new comments -
[doc] functionalities not documented
#9886 commented on
Jun 10, 2025 • 0 new comments -
Missing examples in some API docs
#103844 commented on
Jun 10, 2025 • 0 new comments -
torch.multiprocessing.Queue Zeroes Out Tensors on Retrieval
#149155 commented on
Jun 10, 2025 • 0 new comments -
DISABLED test_inductor_all_to_all_single (__main__.CompileTest)
#147795 commented on
Jun 10, 2025 • 0 new comments -
DISABLED test_inductor_all_reduce_non_contig_input (__main__.CompileTest)
#147733 commented on
Jun 10, 2025 • 0 new comments -
JVP: Option to Disable Gradient Caching for Tangents
#151782 commented on
Jun 10, 2025 • 0 new comments -
Unexpected Behavior when using torch.isclose()
#102400 commented on
Jun 10, 2025 • 0 new comments -
`Segmentation fault` in `torch.nn.utils.rnn.pad_packed_sequence` and `torch.nn.utils.rnn.unpack_sequence`
#149622 commented on
Jun 10, 2025 • 0 new comments -
[inductor] [cpu] `torch.nn.RReLU()` outputs different resutls with eager on cpp backend
#147255 commented on
Jun 10, 2025 • 0 new comments -
Setting the tensor of the values out of `[0,1]` to `target` argument of `nn.CrossEntropyLoss()` with class probabilities works against the doc
#134771 commented on
Jun 10, 2025 • 0 new comments -
DISABLED test_foreach_copy_with_multi_dtypes__foreach_copy_cuda_int16 (__main__.TestForeachCUDA)
#150309 commented on
Jun 10, 2025 • 0 new comments -
[dynamo] Try tracing into einops
#152480 commented on
Jun 10, 2025 • 0 new comments -
Tensor parallel for convolutions and groupnorm
#133221 commented on
Jun 11, 2025 • 0 new comments -
deepcopy of LazyLinear fails
#83168 commented on
Jun 11, 2025 • 0 new comments -
DISABLED test_comprehensive_nn_functional_conv_transpose3d_cuda_float32 (__main__.TestInductorOpInfoCUDA)
#148853 commented on
Jun 11, 2025 • 0 new comments -
DISABLED test_parity__foreach_abs_fastpath_inplace_cuda_float64 (__main__.TestForeachCUDA)
#150562 commented on
Jun 11, 2025 • 0 new comments -
DISABLED test_foreach_l2_large_value_input__foreach_norm_cuda_bfloat16 (__main__.TestForeachCUDA)
#150467 commented on
Jun 11, 2025 • 0 new comments -
xpu: installed pytorch is missing aten xpu ops headers (ATen/ops/cat_xpu_dispatch.h and others)
#145902 commented on
Jun 11, 2025 • 0 new comments -
Error computing the norm of MaskedTensor
#117287 commented on
Jun 14, 2025 • 0 new comments -
CTCLoss gradient is incorrect
#52241 commented on
Jun 14, 2025 • 0 new comments -
Incompatible Torch and Torchvision while building from source for 2.6.0 and CUDA 12.6, RuntimeError: operator torchvision::nms does not exist
#146221 commented on
Jun 15, 2025 • 0 new comments -
Enable `torch.topk` to support `stable` flag
#88227 commented on
Jun 15, 2025 • 0 new comments -
bmm, topk, cholesky, linalg.norm, max with out variants set causing recompilations in torch.compile
#135859 commented on
Jun 15, 2025 • 0 new comments -
Enable TorchInductor to Generate Matmuls Natively via `tl.dot`
#151705 commented on
Jun 15, 2025 • 0 new comments -
Continuous calls to nn.Linear in fp32 on the 5090D cause severe performance degradation
#150725 commented on
Jun 15, 2025 • 0 new comments -
DISABLED test_remove_noop_view_dtype_cuda (__main__.GPUTests)
#151541 commented on
Jun 16, 2025 • 0 new comments -
Some Doc Issue about `torch.lobpcg()`
#152107 commented on
Jun 16, 2025 • 0 new comments -
DTensor + torch.compile on CPU: compiled matmul fails with multiple shape inputs
#154111 commented on
Jun 16, 2025 • 0 new comments -
DISABLED test_jacobian_vectorize_raises_no_warnings_logging_tensor (__main__.TestAutogradFunctional)
#153707 commented on
Jun 16, 2025 • 0 new comments -
TypeError when using torch.cuda.list_gpu_processes() on Windows with the WDDM driver
#64491 commented on
Jun 16, 2025 • 0 new comments -
[RFC][API-Unstable] Intel GPU distributed Backend integration in `torch-xpu-ops`and registeration in PyTorch
#141741 commented on
Jun 16, 2025 • 0 new comments -
[Tracker] Nested tensor op coverage requests
#118107 commented on
Jun 16, 2025 • 0 new comments -
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_float16 (__main__.TestForeachCUDA)
#153379 commented on
Jun 16, 2025 • 0 new comments -
Most requested ops for the MPS backend
#154052 commented on
Jun 16, 2025 • 0 new comments -
FSDP learning hangs when the program tries to save the model
#143536 commented on
Jun 16, 2025 • 0 new comments -
Cannot compile with latest LLVM-19
#139065 commented on
Jun 16, 2025 • 0 new comments -
Can't call torch.compile inside of a custom op
#151328 commented on
Jun 16, 2025 • 0 new comments -
Suggestion: integration of einops test suite
#146782 commented on
Jun 16, 2025 • 0 new comments -
Segmentation error for torch==2.2.1 on MacOs
#121101 commented on
Jun 16, 2025 • 0 new comments -
Graph Partition Issue Tracker
#151832 commented on
Jun 16, 2025 • 0 new comments -
Add support for MaxPool3D on the MPS backend
#100674 commented on
Jun 16, 2025 • 0 new comments -
DISABLED test_ddp_uneven_inputs (__main__.TestDistBackendWithSpawn)
#75648 commented on
Jun 16, 2025 • 0 new comments -
Process never ends when sending tensors through multiprocessing queues in Python 3.12+ on macOS
#153050 commented on
Jun 16, 2025 • 0 new comments -
DISABLED test_lowering_to_x86 (__main__.TestQuantizePT2EX86Inductor)
#153140 commented on
Jun 17, 2025 • 0 new comments -
DISABLED test_re_export_preserve_handle (__main__.TestNumericDebugger)
#144898 commented on
Jun 17, 2025 • 0 new comments -
DISABLED test_matrix_rank_basic_cuda_float32 (__main__.TestLinalgCUDA)
#150406 commented on
Jun 12, 2025 • 0 new comments -
DISABLED test_remove_noop_slice_cpu (__main__.CpuTests)
#151384 commented on
Jun 12, 2025 • 0 new comments -
Label tracking meta-issue (edit me to get automatically CC'ed on issues! cc bot)
#24422 commented on
Jun 12, 2025 • 0 new comments -
result_type doesn't take dtypes and doesn't match numpy
#51284 commented on
Jun 13, 2025 • 0 new comments -
mark_unbacked for strides.
#153204 commented on
Jun 13, 2025 • 0 new comments -
DISABLED test_foreach_check_stride_ignore_dims_of_one_cuda_float32 (__main__.TestForeachCUDA)
#150026 commented on
Jun 13, 2025 • 0 new comments -
[feature request] Provide FlexAttention as a new available/selectable backend for SDPA
#137574 commented on
Jun 13, 2025 • 0 new comments -
DISABLED test_is_isnot (__main__.TestScript)
#120694 commented on
Jun 13, 2025 • 0 new comments -
DISABLED test_int64_upsample3d_cuda_bfloat16 (__main__.TestTorchDeviceTypeCUDA)
#146007 commented on
Jun 13, 2025 • 0 new comments -
DISABLED test_hessian_vectorize_raises_no_warnings_logging_tensor (__main__.TestAutogradFunctional)
#153644 commented on
Jun 13, 2025 • 0 new comments -
[RFC][API-Unstable]Enable A16W4 on XPU Device
#153019 commented on
Jun 13, 2025 • 0 new comments -
DISABLED test_remove_noop_view_default_cpu (__main__.CpuTests)
#151512 commented on
Jun 13, 2025 • 0 new comments -
DISABLED test_foreach_reduce_large_input__foreach_max_w_empty_False_cuda_bool (__main__.TestForeachCUDA)
#151229 commented on
Jun 13, 2025 • 0 new comments -
Multihead Attention does not work with jagged tensors due to __torch_function__
#153472 commented on
Jun 13, 2025 • 0 new comments -
MPS Performance regressions on Sonoma 14.0
#111517 commented on
Jun 13, 2025 • 0 new comments -
DISABLED test_remove_noop_view_default_cuda (__main__.GPUTests)
#151511 commented on
Jun 13, 2025 • 0 new comments -
DISABLED test_remove_noop_view_dtype_cpu (__main__.CpuTests)
#151540 commented on
Jun 13, 2025 • 0 new comments -
[ONNX] Failed to export PyTorch-2-Export-Quantized model to onnx
#143474 commented on
Jun 13, 2025 • 0 new comments -
torch._dynamo.exc.Unsupported: builtin: bool [<class 'torch._dynamo.variables.tensor.SymNodeVariable'>] False
#136075 commented on
Jun 13, 2025 • 0 new comments -
reshape_view_helper is only used for fake tensor tracing but not proxy tracing.
#153303 commented on
Jun 14, 2025 • 0 new comments -
eval should handle (unhinted: (s77 > 3) | (u0 > 200)) when s77 has hint =5
#153227 commented on
Jun 14, 2025 • 0 new comments -
MPS Sparse Support
#129842 commented on
Jun 14, 2025 • 0 new comments -
Make tlparse able to show a summary of distinct graph breaks
#153669 commented on
Jun 14, 2025 • 0 new comments -
[torch.export] Torch Export produces incorrect program when python generators are used.
#130975 commented on
Jun 14, 2025 • 0 new comments -
DISABLED test_distributed_checkpoint_state_dict_type0_cuda (__main__.TestDistributedCheckpointCUDA)
#145807 commented on
Jun 14, 2025 • 0 new comments -
`INTERNAL ASSERT FAILED` in `interpolate` and `torch.import_ir_module`
#149737 commented on
Jun 14, 2025 • 0 new comments