[Upstream Triton] AssertionError: Tensor-likes are not close! ```inductor.test_torchinductor_opinfo``` · Issue #154212 · pytorch/pytorch · GitHub
[Upstream Triton] AssertionError: Tensor-likes are not close! inductor.test_torchinductor_opinfo #154212
Closed
@iupaikov-amd

Description

🐛 Describe the bug

Testing top-of-tree (ToT) Triton ahead of release/2.8 to assess the issues.

This test fails locally on AMD GPUs; it is not confirmed whether the failure also occurs on CUDA.

Affected tests: 42

reproducer: PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=0 python test/inductor/test_torchinductor_opinfo.py TestInductorOpInfoCPU.test_comprehensive__batch_norm_with_update_cpu_float16

Sample error:

message:

Exception: Caused by sample input at index 0: SampleInput(input=Tensor[size=(5, 5, 5), device="cpu", dtype=torch.float16], args=(Tensor[size=(5,), device="cpu", dtype=torch.float16],Tensor[size=(5,), device="cpu", dtype=torch.float16],Tensor[size=(5,), device="cpu", dtype=torch.float16],Tensor[size=(5,), device="cpu", dtype=torch.float16],0.5,0.6), kwargs={}, broadcasts_input=False, name='')

To execute this test, run the following from the base repo dir:
    PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=0 python test/inductor/test_torchinductor_opinfo.py TestInductorOpInfoCPU.test_comprehensive__batch_norm_with_update_cpu_float16

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0

text:

Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1135, in test_wrapper
    return test(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1434, in only_fn
    return fn(self, *args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 2291, in wrapper
    fn(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1215, in dep_fn
    return fn(slf, *args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1215, in dep_fn
    return fn(slf, *args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1215, in dep_fn
    return fn(slf, *args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 1612, in wrapper
    fn(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 1534, in wrapper
    fn(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/unittest/mock.py", line 1379, in patched
    return func(*newargs, **newkeywargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/envs/py_3.10/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/opt/conda/envs/py_3.10/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/tmp/pytorch/test/inductor/test_torchinductor_opinfo.py", line 962, in inner
    raise e
  File "/tmp/pytorch/test/inductor/test_torchinductor_opinfo.py", line 954, in inner
    fn(self, device, dtype, op)
  File "/tmp/pytorch/test/inductor/test_torchinductor_opinfo.py", line 1207, in test_comprehensive
    raise e
  File "/tmp/pytorch/test/inductor/test_torchinductor_opinfo.py", line 1189, in test_comprehensive
    self.check_model(
  File "/tmp/pytorch/test/inductor/test_torchinductor.py", line 539, in check_model
    self.assertEqual(
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 4102, in assertEqual
    raise error_metas.pop()[0].to_error(  # type: ignore[index]
AssertionError: Tensor-likes are not close!

Mismatched elements: 4 / 125 (3.2%)
Greatest absolute difference: 0.0010986328125 at index (3, 1, 4) (up to 1e-05 allowed)
Greatest relative difference: 0.00571441650390625 at index (3, 1, 4) (up to 0.001 allowed)

The failure occurred for item [0]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 3154, in wrapper
    method(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 426, in instantiated_test
    result = test(self, **param_kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_utils.py", line 1612, in wrapper
    fn(*args, **kwargs)
  File "/opt/conda/envs/py_3.10/lib/python3.10/site-packages/torch/testing/_internal/common_device_type.py", line 1147, in test_wrapper
    raise e_tracked from e
Exception: Caused by sample input at index 0: SampleInput(input=Tensor[size=(5, 5, 5), device="cpu", dtype=torch.float16], args=(Tensor[size=(5,), device="cpu", dtype=torch.float16],Tensor[size=(5,), device="cpu", dtype=torch.float16],Tensor[size=(5,), device="cpu", dtype=torch.float16],Tensor[size=(5,), device="cpu", dtype=torch.float16],0.5,0.6), kwargs={}, broadcasts_input=False, name='')

To execute this test, run the following from the base repo dir:
    PYTORCH_OPINFO_SAMPLE_INPUT_INDEX=0 python test/inductor/test_torchinductor_opinfo.py TestInductorOpInfoCPU.test_comprehensive__batch_norm_with_update_cpu_float16

This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
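To put the reported numbers in context, here is a rough sketch in plain Python of the closeness criterion documented for `torch.testing.assert_close`, namely `|actual - expected| <= atol + rtol * |expected|`. It is an assumption that the `assertEqual` call in the traceback applies exactly this rule. The helper `is_close` and the magnitude inferred from the two reported differences are illustrative only and are not taken from the test suite.

```python
def is_close(actual, expected, rtol, atol):
    """Scalar form of the documented rule: |a - e| <= atol + rtol * |e|."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# Values copied from the error message above.
abs_diff = 0.0010986328125
rel_diff = 0.00571441650390625
rtol, atol = 1e-3, 1e-5

# Assumption: the relative difference is abs_diff / |expected|, so the
# magnitude of the expected element can be estimated from the two numbers.
expected_mag = abs_diff / rel_diff

# Largest deviation the rule would accept at that magnitude.
allowed = atol + rtol * expected_mag
exceeds = abs_diff > allowed

print(f"estimated |expected| ~ {expected_mag:.4f}")
print(f"allowed deviation    ~ {allowed:.6f}")
print(f"observed deviation   = {abs_diff}")
print(f"exceeds tolerance    = {exceeds}")
```

Under these assumptions the observed deviation is several times larger than the allowed one, which is consistent with the element at index (3, 1, 4) being reported as a mismatch.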

Full list of failing tests:

test_comprehensive__batch_norm_with_update_cpu_float16
test_comprehensive__native_batch_norm_legit_cpu_float16
test_comprehensive__softmax_backward_data_cpu_float16
test_comprehensive_addr_cpu_float16
test_comprehensive_complex_cpu_float16
test_comprehensive_cross_cpu_float16
test_comprehensive_histc_cpu_float16
test_comprehensive_linalg_cross_cpu_float16
test_comprehensive_linalg_vecdot_cpu_float16
test_comprehensive_log_softmax_cpu_float16
test_comprehensive_masked_log_softmax_cpu_float16
test_comprehensive_masked_var_cpu_float16
test_comprehensive_nanmean_cpu_float16
test_comprehensive_nansum_cpu_float16
test_comprehensive_native_batch_norm_cpu_float16
test_comprehensive_native_layer_norm_cpu_float16
test_comprehensive_nn_functional_batch_norm_cpu_float16
test_comprehensive_nn_functional_binary_cross_entropy_with_logits_cpu_float16
test_comprehensive_nn_functional_cosine_embedding_loss_cpu_float16
test_comprehensive_nn_functional_cosine_similarity_cpu_float16
test_comprehensive_nn_functional_grid_sample_cpu_float16
test_comprehensive_nn_functional_hinge_embedding_loss_cpu_float16
test_comprehensive_nn_functional_huber_loss_cpu_float16
test_comprehensive_nn_functional_instance_norm_cpu_float16
test_comprehensive_nn_functional_interpolate_bicubic_cpu_float16
test_comprehensive_nn_functional_interpolate_linear_cpu_float16
test_comprehensive_nn_functional_interpolate_trilinear_cpu_float16
test_comprehensive_nn_functional_multilabel_soft_margin_loss_cpu_float16
test_comprehensive_nn_functional_soft_margin_loss_cpu_float16
test_comprehensive_sub_cpu_float16
test_comprehensive_trapezoid_cpu_float16
test_comprehensive_trapz_cpu_float16
test_comprehensive_view_as_complex_cpu_float16
test_comprehensive_div_floor_rounding_cuda_float16
test_comprehensive_div_trunc_rounding_cuda_float16
test_comprehensive_floor_divide_cuda_float16
test_comprehensive_max_pool2d_with_indices_backward_cuda_float16
test_comprehensive_nanquantile_cuda_float64
test_comprehensive_remainder_cuda_float16
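To rerun any entry from the list, the reproducer command above can be rebuilt per test name. The sketch below is a hypothetical helper (`repro_command` is not part of PyTorch) and assumes that the test class is `TestInductorOpInfo` followed by the upper-cased device name, as in the `TestInductorOpInfoCPU` reproducer. The `TestInductorOpInfoCUDA` name for the CUDA tests is inferred from that pattern and should be checked against the test file.

```python
import re


def repro_command(test_name, sample_index=None):
    """Build a reproducer command line for one failing test name.

    Assumes names end in _<device>_<dtype>, e.g. ..._cpu_float16.
    """
    match = re.search(r"_(cpu|cuda)_[a-z0-9]+$", test_name)
    if match is None:
        raise ValueError(f"cannot infer device from {test_name!r}")
    # Assumed class naming: TestInductorOpInfo + upper-cased device.
    suite = "TestInductorOpInfo" + match.group(1).upper()
    # The failing sample index is only known for the sample error above,
    # so the environment variable is optional.
    env = ""
    if sample_index is not None:
        env = f"PYTORCH_OPINFO_SAMPLE_INPUT_INDEX={sample_index} "
    return f"{env}python test/inductor/test_torchinductor_opinfo.py {suite}.{test_name}"


if __name__ == "__main__":
    print(repro_command("test_comprehensive__batch_norm_with_update_cpu_float16", 0))
    print(repro_command("test_comprehensive_remainder_cuda_float16"))
```

Omitting `sample_index` runs every sample input for the test, which is slower but does not presuppose which sample fails.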

Versions

upstream pytorch + triton commit: triton-lang/triton@2ec711b

cc @chauhang @penguinwu @bertmaher @int3 @davidberard98 @nmacchioni @chenyang78 @embg @peterbell10 @aakhundov

Metadata

Assignees

No one assigned

    Labels

    oncall: pt2, triaged, upstream triton