RFC for vector length agnostic SVE Vectorized class #153471
Labels
enhancement
Not as big of a feature, but technically not a bug. Should be easy to fix
module: arm
Related to ARM architectures builds of PyTorch. Includes Apple M1
module: vectorization
Related to SIMD vectorization, e.g., Vec256
triaged
This issue has been looked at by a team member, triaged, and prioritized into an appropriate module
🚀 The feature, motivation and pitch
This is a proposal for a vector length agnostic Vectorized class to replace the current implementation of the SVE Vectorized class. It would allow us to support all SVE vector lengths without code duplication and without regressions compared to the current SVE Vectorized class.
The main idea is to replace the vector member of the Vectorized class with an array, which each op loads from and stores back to. By relying on compiler optimizations to elide the redundant loads and stores, we can make the Vectorized class vector length agnostic without any regressions.
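Below is a minimal sketch of the array-backed idea, assuming an SVE-enabled toolchain (e.g. `-march=armv8-a+sve`). `VectorizedF32` and its members are hypothetical stand-ins used for illustration, not the actual `at::vec::Vectorized<float>` API:

```cpp
// Sketch only: array-backed, vector length agnostic Vectorized class.
#include <arm_sve.h>
#include <cstdint>

struct VectorizedF32 {
  // Backing storage sized for the largest architectural SVE vector
  // (2048 bits = 64 floats); only the runtime vector length is used.
  alignas(16) float values[64];

  // Number of active lanes: a runtime value, no longer a constexpr.
  static int64_t size() { return static_cast<int64_t>(svcntw()); }

  static VectorizedF32 loadu(const float* ptr) {
    VectorizedF32 out;
    svbool_t pg = svptrue_b32();
    svst1_f32(pg, out.values, svld1_f32(pg, ptr));
    return out;
  }

  void store(float* ptr) const {
    svbool_t pg = svptrue_b32();
    svst1_f32(pg, ptr, svld1_f32(pg, values));
  }

  // Each op loads from the array, computes in an SVE register, and
  // stores back; chained ops rely on the compiler to remove the
  // intermediate load/store pairs.
  VectorizedF32 operator+(const VectorizedF32& other) const {
    VectorizedF32 out;
    svbool_t pg = svptrue_b32();
    svfloat32_t a = svld1_f32(pg, values);
    svfloat32_t b = svld1_f32(pg, other.values);
    svst1_f32(pg, out.values, svadd_f32_x(pg, a, b));
    return out;
  }
};
```

Backing the class with a plain array sidesteps the fact that SVE register types (e.g. `svfloat32_t`) are sizeless and cannot be stored as class members.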
The drawback of this approach is that the size() function can no longer be a constexpr, as SVE vector lengths are not known at compile time. This would require rewriting a number of existing functions, but it should not introduce any performance regressions.
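As a hedged sketch of what callers would look like once size() is a runtime value (one svcntw() worth of lanes per step), the loop below is illustrative only and not an existing PyTorch kernel; code that currently uses size() in compile-time contexts (e.g. as an array bound) would need a similar rewrite:

```cpp
#include <arm_sve.h>
#include <cstdint>

void add_arrays(float* out, const float* a, const float* b, int64_t n) {
  // Lanes per vector: known only at runtime on SVE hardware.
  const int64_t step = static_cast<int64_t>(svcntw());
  int64_t i = 0;
  for (; i + step <= n; i += step) {
    svbool_t pg = svptrue_b32();
    svst1_f32(pg, out + i,
              svadd_f32_x(pg, svld1_f32(pg, a + i), svld1_f32(pg, b + i)));
  }
  // Scalar tail for the remaining elements.
  for (; i < n; ++i) {
    out[i] = a[i] + b[i];
  }
}
```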
Additional context
An RFC has been created at pytorch/rfcs#73 which goes into more detail.
cc @malfet @snadampal @milpuz01 @aditew01 @nikhil-arm @fadara01