install codegen header to torch/include by guangyey · Pull Request #1405 · intel/torch-xpu-ops · GitHub

install codegen header to torch/include #1405


Merged
merged 3 commits from guangyey/codegen into main on Mar 7, 2025

Conversation

@guangyey (Contributor) commented Feb 25, 2025

Motivation

This PR addresses a code generation issue related to XPU. Currently, there are two separate codegen paths for XPU:

  1. Stock PyTorch – Generates code for oneDNN ops.
  2. torch-xpu-ops – Generates code for SYCL kernel ops.

The corresponding build directories are:

  1. build/aten/src/ATen (for stock PyTorch)
  2. build/xpu/ATen (for torch-xpu-ops)

However, in the torch-xpu-ops codegen, we mistakenly omitted installing XPU op headers from build/xpu/ATen/ops to build/aten/src/ATen/ops. This PR fixes the issue and also removes some unnecessary code for better maintainability.

Solution

We copy the code generated by torch-xpu-ops into the stock PyTorch build tree, so the XPU op headers are installed alongside the other backends' headers.
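A minimal sketch of the idea in CMake follows; it is illustrative only, and the target name xpu_codegen is an assumption, not the PR's actual code:

  # Assumed sketch: after the torch-xpu-ops codegen target runs, mirror
  # the generated XPU op headers into stock PyTorch's codegen output
  # directory, from which they are installed to torch/include.
  add_custom_command(
    TARGET xpu_codegen POST_BUILD  # hypothetical codegen target name
    COMMAND ${CMAKE_COMMAND} -E copy_directory
            "${CMAKE_BINARY_DIR}/xpu/ATen/ops"
            "${CMAKE_BINARY_DIR}/aten/src/ATen/ops")

Whatever the exact wiring, the key requirement from the motivation above is that headers generated under build/xpu/ATen/ops also end up under build/aten/src/ATen/ops before the header install step runs.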

Additional Context

Fixes pytorch/pytorch#145902


@guangyey force-pushed the guangyey/codegen branch 15 times, most recently from ccff7d5 to 8620737 on February 25, 2025 12:20
@guangyey requested a review from EikanWang on February 25, 2025 12:54
@dvrogozh (Contributor)

@guangyey: this does not seem to work for me; I still don't get headers installed to torch/include/. Do we need any changes on the PyTorch side as well (outside of torch-xpu-ops)?

$ find . -name cat_xpu_dispatch.h
./build/xpu/ATen/ops/cat_xpu_dispatch.h
./build/aten/src/ATen/ops/cat_xpu_dispatch.h
$ find . -name cat_cuda_dispatch.h
./build/aten/src/ATen/ops/cat_cuda_dispatch.h
./torch/include/ATen/ops/cat_cuda_dispatch.h

@guangyey force-pushed the guangyey/codegen branch 8 times, most recently from d6aa4e3 to 242ad4e on February 26, 2025 08:29
dvrogozh added a commit to dvrogozh/pytorch that referenced this pull request Feb 26, 2025
This commit extends the existing CUDA test to cover the XPU SyclExtension case for the same feature, `py_limited_api`.

NOTE: THE CHANGE CAN NOT BE MERGED AS IS

Change requires update of the commit pin for torch-xpu-ops.

Requires: intel/torch-xpu-ops#1405
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
@dvrogozh (Contributor)

this does not seem to work for me; I still don't get headers installed to torch/include/

@guangyey: I see this resolved now after the latest changes.

@@ -1,89 +1,95 @@
-if(Codegen_GPU_cmake_included)
+if(Codegen_XPU_cmake_included)
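
The renamed variable is CMake's conventional include guard. As a minimal sketch of the idiom, assuming Codegen.cmake follows the common pattern, the top of the file reads roughly:

  # Process this file at most once, even if it is included from
  # several places during the configure step.
  if(Codegen_XPU_cmake_included)
    return()
  endif()
  set(Codegen_XPU_cmake_included true)

The rename from Codegen_GPU_cmake_included presumably just makes the guard's name match the file's XPU-specific purpose.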
@dvrogozh (Contributor) commented on the diff:

That's quite a change. I worry we might see more issues as torch code gets updated and these two things drift out of sync:

  • torch-xpu-ops' Codegen.cmake and related scripts vs. the torch-side versions of the same
  • ops code in the native/xpu folder, which you've modified for some include files

Any chance we can start bringing pieces of this code into the torch codebase itself? For example, could we stop keeping Codegen.cmake here on the torch-xpu-ops side?

Note: don't take the above as a request to do that in this PR; I am just trying to start a discussion.

@guangyey (Contributor, Author) replied:

Thanks, @dvrogozh, for your verification. I agree that these two components may get out of sync. However, upstreaming this code to stock PyTorch could be challenging since torch-xpu-ops is out of tree. We can track any out-of-sync issues through our nightly builds and CI unit tests, but for now we may need to address and maintain them ourselves.

@EikanWang (Contributor)

@guangyey, overall, the PR looks good to me. However, I would prefer to separate the code refinement from the header file installation. Does that make sense to you?

@EikanWang (Contributor)

@guangyey, by the way, are the CI failures related to this PR?

@guangyey (Contributor, Author) commented Mar 3, 2025

@guangyey, overall, the PR looks good to me. However, I would prefer to separate the code refinement from the header file installation. Does that make sense to you?

I take your point. I will open a separate PR to install the header files; this PR will focus on the code refinement.

@guangyey (Contributor, Author) commented Mar 3, 2025

@guangyey, by the way, are the CI failures related to this PR?

I suspect they are unrelated, but I am not sure. I will check.

@guangyey mentioned this pull request Mar 3, 2025
github-merge-queue bot pushed a commit that referenced this pull request Mar 5, 2025
# Motivation
Following the comments [here](#1405 (comment)), this PR intends to refine the codegen-related code and remove redundant code.
@guangyey force-pushed the guangyey/codegen branch 2 times, most recently from 3970602 to 0a8ed2e on March 5, 2025 08:37
@guangyey (Contributor, Author) commented Mar 5, 2025

@EikanWang, I have separated this PR and rebased it onto the latest main branch. May I know whether I have addressed your comments?

@guangyey (Contributor, Author) commented Mar 6, 2025

The failures are unrelated to this PR. @EikanWang, any comments?

@guangyey (Contributor, Author) commented Mar 7, 2025

These failures are unrelated to this PR.

Failed cases in op_ut with skip:
=========================================================================
test_transformers_xpu.py::TestTransformersXPU::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_2_key_padding_mask_dim_2_bool_xpu
test_transformers_xpu.py::TestTransformersXPU::test_multiheadattention_fastpath_attn_mask_attn_mask_dim_3_key_padding_mask_dim_2_bool_xpu
test_transformers_xpu.py::TestTransformersXPU::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_False_d_model_12_xpu
test_transformers_xpu.py::TestTransformersXPU::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_False_use_autocast_True_d_model_12_xpu
test_transformers_xpu.py::TestTransformersXPU::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_False_d_model_12_xpu
test_transformers_xpu.py::TestTransformersXPU::test_transformerencoder_fastpath_use_torchscript_False_enable_nested_tensor_True_use_autocast_True_d_model_12_xpu
test_linalg_xpu.py::TestLinalgXPU::test_gemm_bias_offline_tunableop_xpu_bfloat16
test_meta_xpu.py::TestMetaXPU::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_bfloat16
test_meta_xpu.py::TestMetaXPU::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_float16
test_meta_xpu.py::TestMetaXPU::test_dispatch_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_float32
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_all_strides_nn_functional_scaled_dot_product_attention_xpu_float32
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_bfloat16
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_float16
test_meta_xpu.py::TestMetaXPU::test_dispatch_symbolic_meta_outplace_nn_functional_scaled_dot_product_attention_xpu_float32

@guangyey added this pull request to the merge queue Mar 7, 2025
Merged via the queue into main with commit b275be6 Mar 7, 2025
16 of 18 checks passed
@guangyey deleted the guangyey/codegen branch March 7, 2025 03:04
@guangyey added the enhancement label Mar 7, 2025
Labels
enhancement

Successfully merging this pull request may close these issues:
xpu: installed pytorch is missing aten xpu ops headers (ATen/ops/cat_xpu_dispatch.h and others)