Correctness issue related to EmplaceAllocations #19355
Closed
@qedawkins

This patch (957ae60) revealed a correctness issue for SDXL when --iree-dispatch-creation-enable-fuse-horizontal-contractions is disabled. Commenting out Stream::EmplaceAllocationsPass fixes the numerical issues. Additionally, the results are non-deterministic but contain no NaNs, and they appear to rotate through a few possible sets of values.
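
One way to confirm the "rotating through a few sets of values" behavior is to collect the outputs of several runs and group the runs that agree with each other. The sketch below is only an illustration: it assumes each run's output has already been flattened into a plain list of floats, and the names `has_nan` and `group_runs` and the tolerances are hypothetical, not part of IREE.

```python
import math


def has_nan(values):
    """Return True if any element of a flattened output is NaN."""
    return any(math.isnan(v) for v in values)


def group_runs(runs, rel_tol=1e-3, abs_tol=1e-5):
    """Group run indices whose outputs agree elementwise within tolerance.

    `runs` is a list of flattened outputs (lists of floats), one per run.
    Each run is compared against the first run of every existing group.
    Note that NaN never compares close to anything, so a run containing
    NaN always ends up in a group of its own.
    """
    groups = []  # list of (representative output, [run indices])
    for idx, values in enumerate(runs):
        for rep, members in groups:
            if len(rep) == len(values) and all(
                math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
                for a, b in zip(rep, values)
            ):
                members.append(idx)
                break
        else:
            # No existing group matched, so this run starts a new one.
            groups.append((values, [idx]))
    return [members for _, members in groups]
```

If the results are deterministic, `group_runs` returns a single group; a small number of groups, each with several members, would match the rotation described above.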

Dumping the IR before and after EmplaceAllocations, with and without the above patch, shows only minimal meaningful differences. Here is a snippet of the only differing part of the IR:
Before 957ae: https://gist.github.com/qedawkins/56b861ada7c76fc7919f5fe12ca951cf // Correct
After 957ae: https://gist.github.com/qedawkins/2d6bcc8a4655330b6ac42149658f809b // Wrong

Full before and after dumps can be found here: https://gist.github.com/qedawkins/4fb2be9c832866ff5fb573bb3d92d4da + https://gist.github.com/qedawkins/316ad40b33a61096c5fb3051fdab5022
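
For anyone comparing the full dumps, a unified diff narrows things down quickly. This is a minimal sketch using Python's standard `difflib`; it assumes both dumps have been saved and read as text, and the helper name `ir_diff` is hypothetical. It does not normalize SSA value names, so purely cosmetic renumbering will still show up as differences.

```python
import difflib


def ir_diff(before_text, after_text, context=3):
    """Return the unified diff between two IR dumps as a list of lines.

    An empty list means the two dumps are textually identical.
    """
    before_lines = before_text.splitlines()
    after_lines = after_text.splitlines()
    return list(
        difflib.unified_diff(
            before_lines,
            after_lines,
            fromfile="before",
            tofile="after",
            n=context,     # number of unchanged context lines per hunk
            lineterm="",   # inputs have no trailing newlines
        )
    )
```

Calling `print("\n".join(ir_diff(before, after)))` on the two dump strings prints only the hunks that differ.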

I'm not sure that EmplaceAllocations is the actual root cause of the bug, as I could not find anything obviously wrong when scanning the IR by hand.

The artifacts for (f16) SDXL can be found here: https://github.com/iree-org/iree/blob/main/experimental/regression_suite/shark-test-suite-models/sdxl/test_unet.py#L107. The following compilation command reproduces the full correctness issue:

iree-compile sdxl.mlir \
--iree-hal-target-backends=rocm \
--iree-hip-target=gfx942 \
--iree-vm-bytecode-module-output-format=flatbuffer-binary \
--iree-dispatch-creation-enable-aggressive-fusion \
--iree-dispatch-creation-enable-fuse-horizontal-contractions=false \
--iree-opt-aggressively-propagate-transposes=true \
--iree-codegen-llvmgpu-use-vector-distribution=true \
--iree-opt-outer-dim-concat=true \
--iree-opt-data-tiling=false \
--iree-hip-legacy-sync=true \
--iree-codegen-gpu-native-math-precision=true \
--iree-vm-target-truncate-unsupported-floats \
--iree-global-opt-propagate-transposes=true \
--iree-opt-const-eval=false \
--iree-llvmgpu-enable-prefetch=true \
--iree-execution-model=async-external \
--iree-preprocessing-pass-pipeline="builtin.module(util.func(iree-global-opt-raise-special-ops, iree-flow-canonicalize), iree-preprocessing-transpose-convolution-pipeline, iree-preprocessing-pad-to-intrinsics, util.func(iree-preprocessing-generalize-linalg-matmul-experimental))" \
--iree-codegen-transform-dialect-library=/home/qdawkins/data/current_spec.mlir \
-o sdxl.vmfb
