One option is to have a `linalg.tensor_pad` op. If the `linalg.tensor_pad` op is the consumer, we can lower it to the buffer world by adding a `linalg.fill` op and a `subview` op before the `linalg.generic` op, and passing the result of the `subview` op to the `linalg.generic` op as its output argument. For example:
%0 = linalg.generic ... %arg { ... } (tensor<>) -> tensor<>
%1 = linalg.tensor_pad %0, %init
return %1
->
linalg.fill %0_buffer, %init
%sub_buffer = subview %0_buffer ...
linalg.generic ... %arg_buffer, %sub_buffer { ... } (memref<>, memref<>) -> ()
This only works when the pad is the "last op" (that is, when a buffer already exists for the result tensor of the pad op), because we need to take a subview of that buffer and pass it to the generic op as its output.
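To make the aliasing argument concrete, here is a minimal 1-D sketch in Python of the fill-then-subview lowering described above. It is only an analogy, not IREE or MLIR code: `fill`, `generic_double`, and `padded_generic` are hypothetical names, the "generic" body (doubling each element) is an arbitrary stand-in, and a `memoryview` slice plays the role of the `subview` because it aliases the underlying buffer rather than copying it.

```python
from array import array


def fill(buffer, value):
    """Analogue of linalg.fill: write the padding value to every element."""
    for i in range(len(buffer)):
        buffer[i] = value


def generic_double(src, dst):
    """Hypothetical stand-in for the linalg.generic body: dst[i] = 2 * src[i]."""
    for i in range(len(src)):
        dst[i] = 2 * src[i]


def padded_generic(arg, low, high, pad_value):
    # The result buffer of the pad op must already exist; here we allocate it.
    out = array("d", [0.0] * (low + len(arg) + high))
    out_view = memoryview(out)

    # Step 1: fill the whole result buffer with the padding value.
    fill(out_view, pad_value)

    # Step 2: take a subview of the interior. Slicing a memoryview does not
    # copy, so writes through `sub` land directly in `out`.
    sub = out_view[low:low + len(arg)]

    # Step 3: run the generic op with the subview as its output argument.
    generic_double(arg, sub)
    return out.tolist()


result = padded_generic(array("d", [1.0, 2.0, 3.0]), low=1, high=2, pad_value=-1.0)
print(result)
```

The interior of the result holds the doubled inputs and the border keeps the padding value, without any intermediate buffer for the unpadded result. This is also why the trick depends on the padded buffer existing up front: the generic op has nowhere to write until the subview can be taken.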
I don't yet have an idea for the lowering when a `linalg.tensor_pad` op is a producer. We probably need the feature from #2782, so that we can have temporary memory in the middle.
Tagging @nicolasvasilache for visibility. I will discuss this approach with @nicolasvasilache.