Do not wrap single unset_encoding op into dispatch region. by hanhanW · Pull Request #20377 · iree-org/iree · GitHub

Do not wrap single unset_encoding op into dispatch region. #20377


Conversation

@hanhanW (Contributor) commented Mar 25, 2025

There are a few pieces to plumb this through the software stack. What the revision does is:

- Do not treat the unset_encoding op as a root op, so it does not get fused into dispatches.
- Add unset_encoding ops to the clonable list, so the CloneProducersIntoDispatchRegions pass can pull them into the consumer dispatch.
- If there is no consumer dispatch, the unset_encoding op is converted to a flow.tensor.encode op (see the sketch after this list).
- Like other tensor encode ops, the op is folded away if an identity layout is recognized. Otherwise, the MaterializeEncodingsPass materializes unique executables for stream.tensor.encode ops and replaces them with dispatches to those executables.
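As a rough sketch of the third and fourth bullets (the assembly syntax and the encoding attribute below are schematic, not copied from this PR's IR):

```mlir
// Before: a trailing unset_encoding with no consumer dispatch.
// #enc stands in for an #iree_encoding.encoding attribute; the textual
// form shown here is approximate.
%before = iree_encoding.unset_encoding %encoded
    : tensor<?x?xf32, #enc> -> tensor<?x?xf32>{%d0, %d1}

// After: rewritten to a flow.tensor.encode op, which folds away when the
// target resolves the encoding to an identity layout, and otherwise is
// materialized into its own executable later in the Stream pipeline.
%after = flow.tensor.encode %encoded
    : tensor<?x?xf32, #enc>{%d0, %d1} -> tensor<?x?xf32>{%d0, %d1}
```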

The [IR dump](https://gist.github.com/hanhanW/581d4ae9cd8ce26e47c723a27192f7d9) shows that there are no additional dispatches when the backend does not support encodings.

To repro, run

```bash
iree-compile \
  --output-format=vm-bytecode \
  --iree-hal-target-backends=vulkan-spirv \
  --iree-global-opt-enable-early-materialization=false \
  --mlir-print-ir-after=iree-stream-materialize-encodings \
  --mlir-print-ir-before=iree-stream-materialize-encodings \
  --mlir-disable-threading \
  matmul_f32.mlir -o /tmp/matmul_f32.vmfb
```
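To spot-check the dispatch count independently of the full IR dump, one possible (untested) variant stops compilation at the Flow phase and greps for dispatch ops; the flags mirror the command above, and --compile-to is a standard iree-compile option:

```bash
# Sketch only: count dispatch ops in the Flow-level IR. The grep pattern is
# a rough heuristic, not an exact match for every dispatch form.
iree-compile \
  --compile-to=flow \
  --iree-hal-target-backends=vulkan-spirv \
  --iree-global-opt-enable-early-materialization=false \
  matmul_f32.mlir | grep -c "flow.dispatch"
```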

MLIR Source:

```mlir
func.func @main(%lhs: tensor<?x?xf32>, %rhs: tensor<?x?xf32>) -> tensor<?x?xf32> {
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %M = tensor.dim %lhs, %c0 : tensor<?x?xf32>
  %N = tensor.dim %rhs, %c1 : tensor<?x?xf32>
  %cst = arith.constant 0.0 : f32
  %init = tensor.empty(%M, %N) : tensor<?x?xf32>
  %fill = linalg.fill ins(%cst : f32) outs(%init : tensor<?x?xf32>) -> tensor<?x?xf32>
  %op = linalg.matmul
      ins(%lhs, %rhs : tensor<?x?xf32>, tensor<?x?xf32>)
      outs(%fill : tensor<?x?xf32>) -> tensor<?x?xf32>
  return %op : tensor<?x?xf32>
}
```

The revision also deletes two old tests from dispatch_linalg_on_tensors.mlir because the functionality is covered by the individual pass tests and the pipeline tests.

Fixes #20187

@MaheshRavishankar (Contributor) left a comment:

Nice!

@hanhanW hanhanW merged commit 2c0a24c into iree-org:main Mar 26, 2025
43 checks passed
@hanhanW hanhanW deleted the convert-single-unset-encoding-to-flow-tensor-encode branch March 26, 2025 00:50
Development

Successfully merging this pull request may close these issues.

Eliminate set_encoding dispatch if the encoding is dropped