Correctness issue related to EmplaceAllocations

This patch (957ae60) revealed a correctness issue for SDXL when --iree-dispatch-creation-enable-fuse-horizontal-contractions is disabled. Commenting out Stream::EmplaceAllocationsPass fixes the numerical issues. The incorrect results are also non-deterministic: they contain no NaNs and appear to rotate through a few possible sets of values.
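To pin down the "rotates through a few possible sets of values" observation, one option is to hash the raw output of each run and group runs by digest. The following is a minimal sketch, assuming the output buffer of each run has already been captured as bytes; `group_runs` and the sample data are hypothetical and not part of IREE.

```python
import hashlib
from collections import defaultdict


def group_runs(outputs):
    """Group run indices by the SHA-256 digest of each run's raw output bytes.

    `outputs` is a list of bytes objects, one per run (hypothetical input:
    how the buffers are captured is outside the scope of this sketch).
    Runs that share a digest produced bit-identical results.
    """
    groups = defaultdict(list)
    for run_index, raw in enumerate(outputs):
        digest = hashlib.sha256(raw).hexdigest()
        groups[digest].append(run_index)
    return dict(groups)


if __name__ == "__main__":
    # Stand-in data: five runs that alternate between two distinct results.
    fake_outputs = [b"\x00\x01", b"\x00\x02", b"\x00\x01", b"\x00\x02", b"\x00\x01"]
    for digest, runs in group_runs(fake_outputs).items():
        print(digest[:12], runs)
```

A small, stable number of groups across many runs would be consistent with a race between a few fixed orderings rather than reads of arbitrary uninitialized memory, though that is only a heuristic.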

Dumping the IR before and after EmplaceAllocations, with and without the above patch, shows only minimal differences. Here is a snippet of the only part of the IR that differs:
Before 957ae: https://gist.github.com/qedawkins/56b861ada7c76fc7919f5fe12ca951cf // Correct
After 957ae: https://gist.github.com/qedawkins/2d6bcc8a4655330b6ac42149658f809b // Wrong

Full before and after dumps can be found here: https://gist.github.com/qedawkins/4fb2be9c832866ff5fb573bb3d92d4da and https://gist.github.com/qedawkins/316ad40b33a61096c5fb3051fdab5022
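For anyone repeating the comparison on the full dumps, a line-level diff narrows things down quickly. This is a rough sketch using only the Python standard library; `ir_diff` is a hypothetical helper, and it assumes both dumps have been read into strings. It does no normalization of SSA value names, so renumbered values will show up as differences.

```python
import difflib


def ir_diff(before_text, after_text):
    """Return only the added/removed lines between two IR dumps.

    Removed lines are prefixed with '-', added lines with '+'.
    Context lines and hunk headers are dropped.
    """
    diff = list(
        difflib.unified_diff(
            before_text.splitlines(),
            after_text.splitlines(),
            fromfile="before",
            tofile="after",
            n=0,          # no context lines
            lineterm="",  # inputs have no trailing newlines after splitlines()
        )
    )
    # The first two entries are the '--- before' / '+++ after' file headers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


if __name__ == "__main__":
    # Placeholder text standing in for the real dumps.
    before = "line one\nline two\nline three\n"
    after = "line one\nline 2\nline three\n"
    for line in ir_diff(before, after):
        print(line)
```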

I'm not sure that EmplaceAllocations is the actual root cause of the bug: scanning the IR by hand, I could not find anything obviously wrong.

The artifacts for (f16) SDXL can be found here: https://github.com/iree-org/iree/blob/main/experimental/regression_suite/shark-test-suite-models/sdxl/test_unet.py#L107. The following compilation command reproduces the full correctness issue:

iree-compile sdxl.mlir \
--iree-hal-target-backends=rocm \
--iree-hip-target=gfx942 \
--iree-vm-bytecode-module-output-format=flatbuffer-binary \
--iree-dispatch-creation-enable-aggressive-fusion \
--iree-dispatch-creation-enable-fuse-horizontal-contractions=false \
--iree-opt-aggressively-propagate-transposes=true \
--iree-codegen-llvmgpu-use-vector-distribution=true \
--iree-opt-outer-dim-concat=true \
--iree-opt-data-tiling=false \
--iree-hip-legacy-sync=true \
--iree-codegen-gpu-native-math-precision=true \
--iree-vm-target-truncate-unsupported-floats \
--iree-global-opt-propagate-transposes=true \
--iree-opt-const-eval=false \
--iree-llvmgpu-enable-prefetch=true \
--iree-execution-model=async-external \
--iree-preprocessing-pass-pipeline="builtin.module(util.func(iree-global-opt-raise-special-ops, iree-flow-canonicalize), iree-preprocessing-transpose-convolution-pipeline, iree-preprocessing-pad-to-intrinsics, util.func(iree-preprocessing-generalize-linalg-matmul-experimental))" \
--iree-codegen-transform-dialect-library=/home/qdawkins/data/current_spec.mlir \
-o sdxl.vmfb
