Insights: iree-org/iree
Overview
29 Releases published by 2 people
-
iree-3.4.0rc20250428 iree candidate iree-3.4.0rc20250428
published
Apr 28, 2025 -
iree-3.4.0rc20250429 iree candidate iree-3.4.0rc20250429
published
Apr 29, 2025 -
iree-3.4.0rc20250430 iree candidate iree-3.4.0rc20250430
published
Apr 30, 2025 -
iree-3.4.0rc20250502 iree candidate iree-3.4.0rc20250502
published
May 2, 2025 -
iree-3.4.0rc20250503 iree candidate iree-3.4.0rc20250503
published
May 3, 2025 -
iree-3.4.0rc20250504 iree candidate iree-3.4.0rc20250504
published
May 4, 2025 -
iree-3.4.0rc20250505 iree candidate iree-3.4.0rc20250505
published
May 5, 2025 -
v3.4.0 Release v3.4.0
published
May 5, 2025 -
iree-3.5.0rc20250506 iree candidate iree-3.5.0rc20250506
published
May 6, 2025 -
iree-3.5.0rc20250507 iree candidate iree-3.5.0rc20250507
published
May 7, 2025 -
iree-3.5.0rc20250508 iree candidate iree-3.5.0rc20250508
published
May 8, 2025 -
iree-3.5.0rc20250509 iree candidate iree-3.5.0rc20250509
published
May 9, 2025 -
iree-3.5.0rc20250510 iree candidate iree-3.5.0rc20250510
published
May 10, 2025 -
iree-3.5.0rc20250511 iree candidate iree-3.5.0rc20250511
published
May 11, 2025 -
iree-3.5.0rc20250512 iree candidate iree-3.5.0rc20250512
published
May 12, 2025 -
iree-3.5.0rc20250513 iree candidate iree-3.5.0rc20250513
published
May 13, 2025 -
iree-3.5.0rc20250514 iree candidate iree-3.5.0rc20250514
published
May 14, 2025 -
iree-3.5.0rc20250515 iree candidate iree-3.5.0rc20250515
published
May 15, 2025 -
iree-3.5.0rc20250516 iree candidate iree-3.5.0rc20250516
published
May 16, 2025 -
iree-3.5.0rc20250517 iree candidate iree-3.5.0rc20250517
published
May 17, 2025 -
iree-3.5.0rc20250518 iree candidate iree-3.5.0rc20250518
published
May 18, 2025 -
iree-3.5.0rc20250519 iree candidate iree-3.5.0rc20250519
published
May 19, 2025 -
iree-3.5.0rc20250520 iree candidate iree-3.5.0rc20250520
published
May 20, 2025 -
iree-3.5.0rc20250521 iree candidate iree-3.5.0rc20250521
published
May 21, 2025 -
iree-3.5.0rc20250522 iree candidate iree-3.5.0rc20250522
published
May 22, 2025 -
iree-3.5.0rc20250523 iree candidate iree-3.5.0rc20250523
published
May 23, 2025 -
iree-3.5.0rc20250524 iree candidate iree-3.5.0rc20250524
published
May 24, 2025 -
iree-3.5.0rc20250525 iree candidate iree-3.5.0rc20250525
published
May 25, 2025 -
iree-3.5.0rc20250527 iree candidate iree-3.5.0rc20250527
published
May 27, 2025
171 Pull requests merged by 37 people
-
[Codegen] Add pass for specializing executable variants
#20771 merged
May 27, 2025 -
[NFC] Rename load_from/store_to_memref to load_from/store_to_buffer
#20897 merged
May 27, 2025 -
Adding --iree-rocm-container-type= flag.
#20902 merged
May 27, 2025 -
Add support for dynamic unit trip scf.for to scf.if
#20880 merged
May 27, 2025 -
Integrate llvm-project@7a8090c037255b54895d61df2eb141fee48d6d83
#20873 merged
May 27, 2025 -
[Codegen][NFC] Make namespace usage follow IREE::[Encoding|Codegen].
#20894 merged
May 23, 2025 -
Fix Link error when `IREECompiler.lib` hits 4GiB
#20892 merged
May 22, 2025 -
Adding support for `#hal.device.optimal<...>` through to runtime.
#20879 merged
May 22, 2025 -
Adding tryLookupResourceUsageAffinity.
#20891 merged
May 22, 2025 -
Temporary automatic reference counting(ish) pass for inserting async deallocations.
#20765 merged
May 22, 2025 -
[Dispatch Creation] Fix GatherFusionPattern crash
#20887 merged
May 22, 2025 -
[Flow] Dump affinity info in DumpDispatchGraph pass.
#20888 merged
May 22, 2025 -
[Codegen] Add patterns to fold reshapes into load_from/store_to_memref
#20881 merged
May 22, 2025 -
[Preprocessing] Add SinkReshapesPass in MakeSingleDispatchPassPipeline
#20882 merged
May 22, 2025 -
[LinalgExt] Fold unit dims for iree_linalg_ext.gather
#20877 merged
May 22, 2025 -
[NFC][Codegen] Rename early bufferization op operands
#20874 merged
May 21, 2025 -
[LinalgExt] Canonicalize gather to an extract_slice
#20878 merged
May 21, 2025 -
Align iree_hal_sync_device_t allocation to 16 bytes.
#20773 merged
May 21, 2025 -
[VectorExt] Fix illegal transfer_read during gather vectorization
#20876 merged
May 21, 2025 -
[Codegen] Add ukernel support for argmax on BF16 and enable optional max value return
#20768 merged
May 21, 2025 -
[Codegen] Support multiple forall ops in ReconcileTranslationInfo
#20848 merged
May 21, 2025 -
[AMDGPU] Rewrite some gpu.shuffle xor to ds_swizzle, per upstream
#20868 merged
May 21, 2025 -
[Encoding][NFC] Move Encoding Utils out from IR definition.
#20871 merged
May 21, 2025 -
[GPU] Vector distribution support for multiple stores
#20816 merged
May 21, 2025 -
[CodeGen] Fix a MemoryEffectsOpInterface bug in FuseConsumerOp.
#20869 merged
May 20, 2025 -
Fix padding to nop encoding specialization
#20837 merged
May 20, 2025 -
[Codegen][GPU] Support padding in CombineLayoutTransformation
#20797 merged
May 20, 2025 -
[LinalgExt] Add map_scatter e2e tests for CPU and VMVX backends.
#20861 merged
May 20, 2025 -
[Dispatch Creation] Clone iree_linalg_ext.gather for attn
#20866 merged
May 20, 2025 -
[VectorExt] Vectorize `iree_linalg_ext.gather`
#20807 merged
May 20, 2025 -
[Codegen] Fix dominance issue in collapse shape fusion
#20864 merged
May 20, 2025 -
[VectorExt] Fix transfer_gather printer
#20860 merged
May 20, 2025 -
[Dispatch Creation] Handle `linalg.fill` in collapse dimensions
#20863 merged
May 19, 2025 -
Fix logic for yieldReplacements in tileDispatchUsingForall
#20844 merged
May 19, 2025 -
[LinalgExt] Clone iree_linalg_ext.gather (5/5)
#20563 merged
May 19, 2025 -
[TensorExt] Drop space from count_from_slice printer
#20850 merged
May 19, 2025 -
[Codegen][NFC] Refresh remove_single_iteration_loop.mlir test.
#20842 merged
May 18, 2025 -
Integrate llvm-project@faf5d747f174cc
#20828 merged
May 16, 2025 -
[Codegen][ROCDL] Drop nominal support for dynamic shared mem
#20805 merged
May 16, 2025 -
Cleaning up iree_hal_module_debug_sink_t destroy.
#20841 merged
May 16, 2025 -
[Codegen] Make ReconcileTranslationInfo work with multiple exports
#20801 merged
May 16, 2025 -
[Codegen][GPU] Add support for allocating private memory for unused DPS results
#20793 merged
May 16, 2025 -
Sink cast-like flow ops across flow.tensor.transfer/barrier.
#20839 merged
May 16, 2025 -
Update `hanhanW` for CODEOWNERS based on recent activities.
#20840 merged
May 16, 2025 -
[LinalgExt] Add a canonicalization pattern to drop unused results from sort op
#20827 merged
May 16, 2025 -
[iree-test-suite] Update the sharktank models benchmark time
#20830 merged
May 16, 2025 -
[DispatchCreation] Set limits on when padding encoding is applied.
#20732 merged
May 15, 2025 -
[DispatchCreation] Remove CollapseReductionDimensionsPass
#20829 merged
May 15, 2025 -
[NFC] Cleaning up flow canonicalize pass.
#20826 merged
May 15, 2025 -
[GlobalOptimization] Do not hoist fill-like operations
#19719 merged
May 15, 2025 -
[Codegen] split-k on argmax op
#20717 merged
May 15, 2025 -
[PJRT] Fix tensor element type for signed integers
#19496 merged
May 15, 2025 -
[GPU] Increase the VAE benchmark threshold
#20809 merged
May 15, 2025 -
[NFC] Refactor duplicated getEncodingInfo logic
#20820 merged
May 15, 2025 -
[Encoding] Add getOffsetSizesStrides interface for load/store materialization
#20741 merged
May 15, 2025 -
Removing invalid folder for vm.add + vm.sub ops.
#20808 merged
May 14, 2025 -
[ROCm] Set ABI version control variable correctly
#20800 merged
May 14, 2025 -
[DispatchCreation] Remove unit dim from attn mask
#20796 merged
May 13, 2025 -
[Flow] Set known dimensions on concat output
#20795 merged
May 13, 2025 -
[GPU] Enable vector distribute on reduction operations by default
#20751 merged
May 13, 2025 -
[Codegen][LLVMGPU] Optionally linearize the number of workgroups specified
#20787 merged
May 13, 2025 -
[NFC][Encoding] Move convertType to LayoutAttrInterface
#20794 merged
May 13, 2025 -
[LinalgExt] Remove region from LinalgExt::GatherOp
#20776 merged
May 13, 2025 -
[NFC] Move LayoutAttrInterface to Encoding
#20782 merged
May 13, 2025 -
[NFC][Codegen] Move getEncodingInfo to PackedLayoutAttrInterface
#20780 merged
May 13, 2025 -
[CPU][NFC] Lit tests cleanup and improvements.
#20790 merged
May 13, 2025 -
[DispatchCreation] White list ops that can be cloned.
#20791 merged
May 12, 2025 -
[Codegen][NFC] Move bufferization test out from LLVMCPU/test.
#20789 merged
May 12, 2025 -
[HAL] Add F8E8M0FNU
#20783 merged
May 12, 2025 -
[NFC][Codegen] Move EncodingNop LayoutAttrInterface to external model
#20778 merged
May 12, 2025 -
Workaround for stack overflow in stream refine usage.
#20749 merged
May 12, 2025 -
Emit a warning when one of the iree-input-demote-* passes is used.
#20784 merged
May 12, 2025 -
[AMDGPU] Define gfx950 target and its MFMAs
#20623 merged
May 9, 2025 -
Metal HAL: remove shadowed variable
#20760 merged
May 9, 2025 -
Fix iree-codegen-llvmcpu-configuration-pipeline registration
#20761 merged
May 9, 2025 -
Integrate LLVM to llvm/llvm-project@8404b29
#20757 merged
May 9, 2025 -
Adding quality and benchmark config docs
#20759 merged
May 9, 2025 -
[GPU] Enable vector distribute pipeline for Matvecs by default
#20706 merged
May 8, 2025 -
[Flow] Improve DumpDispatchGraph pass for programs at model level.
#20756 merged
May 8, 2025 -
e2e matmul test improvements: faster diagnostics, finer control with environment variables
#20755 merged
May 8, 2025 -
[hip] Add flag for disabling caching for async allocations.
#20753 merged
May 8, 2025 -
[Encoding] Use struct directive for encodingAttr assembly format
#20746 merged
May 8, 2025 -
Add regression test for #20740 / #20736
#20750 merged
May 8, 2025 -
Pingpong: add medium-sized expanded-shape FP8
#20735 merged
May 8, 2025 -
[LinalgExt][NFC] Move transformation method declarations to Transforms.h
#20747 merged
May 7, 2025 -
Adding a `hal.executable.export` condition region.
#20739 merged
May 7, 2025 -
Continue trying executable loaders when a loader reports NOT_FOUND.
#20745 merged
May 7, 2025 -
[DispatchCreation] Use patterns to bubble up expand shape across collapse shapes.
#20648 merged
May 6, 2025 -
[AMDGPU] Drop dynamic M bounds checks for pingpong
#20738 merged
May 6, 2025 -
[Codegen] Fix invalid use of iterators in `PropagateReshapesByExpansion`
#20740 merged
May 6, 2025 -
Integrate llvm/llvm-project@15f7c6e
#20725 merged
May 6, 2025 -
[Docs] Make op/attr/type summary styles consistent
#20726 merged
May 6, 2025 -
Adding mimalloc v3 as an optional system allocator.
#20730 merged
May 6, 2025 -
[NFC] Move pack/unpack e2e tests to linalg/.
#20728 merged
May 6, 2025 -
[Codegen] Add pass to combine layout transformations
#20655 merged
May 6, 2025 -
[DispatchCreation] Avoid hoisting set encoding operations with padding encodings
#20733 merged
May 6, 2025 -
[Codegen][Common] Add transform op to check for lowering configs when matching
#20724 merged
May 6, 2025 -
[NFC] Use ShapedType::isDynamicShape when possible.
#20731 merged
May 6, 2025 -
Print opt flags to more accurately reproduce (.linked -> .optimized)
#20716 merged
May 6, 2025 -
Added missing dependencies for the bazel build
#20719 merged
May 6, 2025 -
Adding user-defined IREE_ALLOCATOR_SYSTEM support.
#20727 merged
May 5, 2025 -
[NFC][Encoding] Move materializeEncodingValueFn to type converter
#20720 merged
May 5, 2025 -
[LinalgExt] Implement tiling interface for map_scatter
#20688 merged
May 5, 2025 -
Bump version to 3.5.0 after 3.4.0 release.
#20721 merged
May 5, 2025 -
[Encoding] Drop resolver interface implementation for SpecializedEncodingAttr
#20718 merged
May 5, 2025 -
[Codegen][AMDGPU] Add pingpong to default gfx942 tuning
#20678 merged
May 5, 2025 -
[Codegen][GPU] Keep range and divisibility annotations on push constants
#19348 merged
May 5, 2025 -
[Encoding] Add convertType interface to generalize type conversion
#20700 merged
May 5, 2025 -
Fixing missing allocator arg on the vulkan dynamic symbol table.
#20712 merged
May 2, 2025 -
[Codegen][DerivedConfig] Add support to set outermost tile size as vector size
#20692 merged
May 2, 2025 -
Runtime float conversion helpers: add FP6, FP4 and E8M0 types.
#20707 merged
May 2, 2025 -
[Im2col] Fix bug when there is no batch dimension
#20711 merged
May 2, 2025 -
[AMDGPU] Support mask optimization for multiple users
#20697 merged
May 2, 2025 -
Adding pass documentation for IREE dialects and pipelines.
#20705 merged
May 2, 2025 -
[Codegen][GPU] Improve intrinsic based Attention heuristics
#20695 merged
May 2, 2025 -
[VectorDistribution] Improve vector.broadcast distribution
#20652 merged
May 2, 2025 -
[LLVMGPU] Vector distribute config to handle dyn dims
#20603 merged
May 2, 2025 -
[NFC] Converting the VM dialect to use tablegen passes.
#20698 merged
May 1, 2025 -
Runtime float type conversion helpers: Fix handling of denormals.
#20676 merged
May 1, 2025 -
[Im2col] Remain batch dimension untiled during decomposition when it is contiguous and innermost
#20633 merged
May 1, 2025 -
[Codegen] Drop read_only from LoadFromMemrefOp.
#20693 merged
May 1, 2025 -
[GPU] Cross lane reduction rather than serial
#20680 merged
May 1, 2025 -
[DispatchCreation] Set padding encodings on intermediate tensors.
#20634 merged
May 1, 2025 -
[iree-benchmark] Ensure destructors run before `IREE_TRACE_APP_EXIT`
#20694 merged
Apr 30, 2025 -
[Codegen] Clean up TilingInterfaceUtils. NFC.
#20661 merged
Apr 30, 2025 -
[Encoding] Allow `PadEncodingAttribute` to support dynamic padding.
#20662 merged
Apr 30, 2025 -
Add relative error for buffer comparison
#19464 merged
Apr 30, 2025 -
Fold iree_tensor_ext.dispatch.workload.ordinal on constants.
#20687 merged
Apr 30, 2025 -
[Dispatch Creation] Fix infinite reshape loop
#20162 merged
Apr 30, 2025 -
Add `flow.tensor.bitcast` for torch view as complex/real.
#20689 merged
Apr 30, 2025 -
Revert "Update workflows to run on macOS 15 (#20675)"
#20690 merged
Apr 30, 2025 -
Revert "Call IREE_TRACE_APP_ENTER/EXIT in compiler tool main functions."
#20691 merged
Apr 30, 2025 -
Call IREE_TRACE_APP_ENTER/EXIT in compiler tool main functions.
#20686 merged
Apr 30, 2025 -
[Codegen][GPU] Add placeholder op for buffer casts on tensors
#20589 merged
Apr 30, 2025 -
Integrate llvm/llvm-project@7b70fc7
#20674 merged
Apr 30, 2025 -
Properly handle unaligned refs in VM ABI marshaling.
#20671 merged
Apr 30, 2025 -
Raise an error in demotion passes if illegal extern funcs are present.
#20679 merged
Apr 30, 2025 -
[Codegen] Add pass to bufferize dispatch.tensor.load/store ops
#20627 merged
Apr 30, 2025 -
Add MathToROCDL patterns in ConvertToROCLPass.
#20684 merged
Apr 30, 2025 -
Move ASM to the end of languages list in CMakeLists.txt
#19781 merged
Apr 30, 2025 -
Fixing hip-hal-driver.md typos.
#20683 merged
Apr 30, 2025 -
[Encoding] Add interfaces for encoding propagation
#20567 merged
Apr 30, 2025 -
[runtime][python] Allow benchmark to accept file path, not just a VmModule
#19793 merged
Apr 30, 2025 -
[PJRT] Fix stablehlo attribute parameters for buffer transpose and broadcast
#19488 merged
Apr 30, 2025 -
Update workflows to run on macOS 15
#20675 merged
Apr 29, 2025 -
Bump the github-actions group with 3 updates
#20658 merged
Apr 29, 2025 -
[Codegen] Support bufferization for load_from/store_to_memref ops
#20626 merged
Apr 29, 2025 -
Tolerate NaN == NaN in numerical checks.
#20677 merged
Apr 29, 2025 -
[LinalgExt] Add map_scatter op and verifier
#20640 merged
Apr 29, 2025 -
[LinalgExt] Enforce pure tensor or buffer semantics for LinalgExtOps
#20670 merged
Apr 29, 2025 -
[Codegen] Use isIdentityLayout instead of isNonZeroPadding
#20666 merged
Apr 29, 2025 -
Fixes for FP8 pingpong for Llama expanded shape.
#20639 merged
Apr 29, 2025 -
Use static build path in local temp disk vs. shared
#20665 merged
Apr 29, 2025 -
[docs] Add TensorExt dialect
#20669 merged
Apr 29, 2025 -
Integrate llvm/llvm-project@5953f19
#20657 merged
Apr 29, 2025 -
Raise ValueError when output is not expected
#20575 merged
Apr 29, 2025 -
Replace loop with more efficient memcmp
#20664 merged
Apr 29, 2025 -
Structuring hal.executable.export workgroup count region.
#20659 merged
Apr 29, 2025 -
[Codegen] Clean up prints with `llvm::interleaved`. NFC.
#20649 merged
Apr 28, 2025 -
Updating SHA for iree-test-suites
#20656 merged
Apr 28, 2025 -
[Flow] Add folders for chained flow.tensor.transfer ops.
#20337 merged
Apr 28, 2025 -
Revert "[Dispatch Creation] Disable broken reshape fusion by collapsi…
#20653 merged
Apr 28, 2025 -
[DT][GPU] Retire experimental AMDGPU data-tiling flag.
#20644 merged
Apr 28, 2025 -
[runtime] Fix missing zone names in GPU Tracy profiling
#20297 merged
Apr 28, 2025 -
Fix IREE_FILE_IO_ENABLE check
#20573 merged
Apr 28, 2025 -
Remove include(CMakeParseArguments)
#20543 merged
Apr 28, 2025 -
e2e matmul tests: log early the presence of a numerical error
#20638 merged
Apr 28, 2025 -
[NFC] Restructure the lit tests for materialize encoding pass.
#20629 merged
Apr 28, 2025 -
[build] Fix Bazel dependencies for shared library builds
#20596 merged
Apr 28, 2025 -
Reduce verbosity of HAL translation failure errors
#20614 merged
Apr 28, 2025 -
[AMDGPU] Optimize masked transfer read in presence of fat raw buffers
#20604 merged
Apr 28, 2025
17 Pull requests opened by 10 people
-
[Stream] Avoid clone on single user in CloneToConsumers
#20667 opened
Apr 29, 2025 -
[docs] Update sharktuner documentation prior release
#20704 opened
May 1, 2025 -
[VectorDistribution] Add support for distributing vector.constant_mask
#20708 opened
May 2, 2025 -
Convert all Conv2Ds to Winograd when no coop matrices available.
#20754 opened
May 8, 2025 -
[VectorDistribution] Add pattern to distribute transfer_gather ops
#20764 opened
May 9, 2025 -
Export ExecutionEngine through python bindings
#20777 opened
May 12, 2025 -
[DispatchCreation] Relax unit dim expansion cycle check
#20781 opened
May 12, 2025 -
Start LLVM integrate integrates/llvm-20250514
#20799 opened
May 14, 2025 -
Preserve preprocessing passpipeline attribute during `WrapEntryPointsPass`.
#20818 opened
May 15, 2025 -
[Util] Move Attribute based pipeline to Util dialect.
#20832 opened
May 15, 2025 -
Shared/integrate 20250516
#20838 opened
May 16, 2025 -
[Encoding] Teach specialize encodings to handle pad encodings.
#20845 opened
May 16, 2025 -
Stream: Add topology attribute
#20885 opened
May 22, 2025 -
[Codegen] Enable reshape into buffer folding in BlockDynamicDimensions
#20898 opened
May 23, 2025 -
[Codegen] Propagate relayout ops before combining
#20901 opened
May 27, 2025 -
[GPU] Fix reduction kernel config for vectordistribute
#20903 opened
May 27, 2025 -
Prefetch shared memory in presence of scf.if
#20904 opened
May 27, 2025
68 Issues closed by 15 people
-
llvmcpu_mobilenet_v3-large_uint8.run starts failing after bumping tf-nightly
#14830 closed
May 22, 2025 -
[DT][GPU] Support codegen for pack/unpack fusion and mmt4d-like ops
#17720 closed
May 22, 2025 -
[CPU] Inefficient img2col tensor kernel
#17413 closed
May 22, 2025 -
[GPU][DT] Enable e2e tests for tensor.pack op on GPU
#17186 closed
May 22, 2025 -
[GPU][DT] Enable e2e tests for tensor.unpack op on GPU
#17187 closed
May 22, 2025 -
[GPU][DT] matmul microbenchmarks for GPU data-tiling path
#17189 closed
May 22, 2025 -
Redundant buffer allocations in LinalgExt/Linalg ops when there are constants in outs
#9813 closed
May 22, 2025 -
CI - Windows x64 MSVC failing
#20763 closed
May 22, 2025 -
[HAL] Add HAL allocator selection based on optimal affinity for a particular usage + API.
#20857 closed
May 22, 2025 -
MLIR's memref.load/store should produce `nuw` and `inbounds` GEPs
#20483 closed
May 22, 2025 -
Fix encoding dialect header/tablegen dependency.
#20681 closed
May 21, 2025 -
[VectorExt] `transfer_gather` parser fails to round-trip valid operation emitted by printer
#20802 closed
May 20, 2025 -
Correctness issue related to EmplaceAllocations
#19355 closed
May 19, 2025 -
[Codegen][AMDGPU Backend] Correctness issue for conv_2d_ngchw_gfchw
#18798 closed
May 19, 2025 -
[LLVMCPU] Data tiling of group quantized matmul on CPU
#14337 closed
May 19, 2025 -
[Flow] Improve DetachElementwiseFromNamedOpPass to match codegen expectations
#12080 closed
May 19, 2025 -
Not able to set atol in check tests
#5414 closed
May 19, 2025 -
Util constant analysis should not depend on LinalgExt dialect
#14887 closed
May 19, 2025 -
[Regression] The number of dispatches increases 12.28% after #14505
#14531 closed
May 19, 2025 -
Implement a listener for non-pattern based passes
#12858 closed
May 19, 2025 -
Missing documentation for build and test iree-dialects
#8361 closed
May 19, 2025 -
Missing fusion for winograd transform ops with their consumers
#17487 closed
May 19, 2025 -
Winograd transform generates bad memory accessing pattern for CPU
#17485 closed
May 19, 2025 -
[CPU] Explore ukernels for winograd transform ops
#17491 closed
May 19, 2025 -
[DT][Fusion] Move set encoding after forming dispatch region
#17718 closed
May 19, 2025 -
[Stream] Add compile-time attr queries for affinity resource compatibility.
#20852 closed
May 19, 2025 -
[GPU] Gather -> matmul fusion support
#18457 closed
May 19, 2025 -
topk: Failed to bufferize op
#20699 closed
May 16, 2025 -
[Codegen] Migrate more floordiv-y maps to affine.delinearize/affine.linearize
#19627 closed
May 16, 2025 -
Incorrect results from imported ONNX file model x - (x + 1)
#20803 closed
May 14, 2025 -
macOS build failure: "archive member '/' not a mach-o file" during linking
#20806 closed
May 14, 2025 -
Does iree-runtime support free input arguments of a function?
#20798 closed
May 14, 2025 -
Getting Started Instructions Fail for MobileNet
#14553 closed
May 14, 2025 -
[HIP] Runtime failure for forward conv
#20766 closed
May 13, 2025 -
Fail to compile scatter_add with --iree-opt-level=O3 due to exceeding shared memory limit
#20767 closed
May 12, 2025 -
Let the bufferization layout attribute be any memref layout
#20714 closed
May 12, 2025 -
Function with f64 tensor argument is compiled to use f32
#20774 closed
May 12, 2025 -
[HAL][Metal] Runtime returns incorrect tensor values
#19530 closed
May 9, 2025 -
[GPU] Decompose Im2col failed for padding + rank-reduce weight backward convs
#20729 closed
May 8, 2025 -
RFC: Defer Encoding Fusion To Stream
#20703 closed
May 8, 2025 -
Enable customization of executable ordinal resolution.
#20660 closed
May 7, 2025 -
Cannot use two static library loaders simultaneously
#20744 closed
May 7, 2025 -
Use of invalid iterator in `iree-codegen-propagate-reshapes-by-expansion`
#20736 closed
May 6, 2025 -
'linalg.copy' op expected operand rank (5) to match the result rank of indexing_map #0 (4)
#20734 closed
May 6, 2025 -
Release tracker - 3.4.0 (2025-05-05)
#20361 closed
May 5, 2025 -
[iree-opt] error: expected SSA operand
#20594 closed
May 5, 2025 -
Define a `amdgpu.scaling_mfma` wrapper
#20616 closed
May 2, 2025 -
[ROCM] Failure during rocdl buffer instruction optimization for wrw conv
#20709 closed
May 2, 2025 -
VM: Support for 64-bit aligned access in register marshaling
#20663 closed
Apr 30, 2025 -
DemoteF64ToF32Pass corrupts signature of imported functions
#12987 closed
Apr 30, 2025 -
iree.abi.output does not work as expected for complex<f32>
#15652 closed
Apr 30, 2025 -
[compiler] Deepseek-V3 compile fails with codegen issues
#20581 closed
Apr 30, 2025 -
[CPU] i4 pack op fails to compile
#16285 closed
Apr 29, 2025 -
[CPU] Remove redundant stack allocations in multi-level tiling
#14305 closed
Apr 29, 2025 -
[CodeGen] ConvertToDPS pass can't replace tensor.empty ops with flow.tensor.load.* ops
#14316 closed
Apr 29, 2025 -
Hang in iree-run-module on ONNX dequantizelinear test case
#16666 closed
Apr 29, 2025 -
Fills/dispatches when padding not getting folded into consumers/producers.
#11049 closed
Apr 29, 2025 -
Fuse multi-consumer insert_slices into dispatch regions as in-place operations.
#11102 closed
Apr 29, 2025 -
Extensibility: improve ergonomics/connections for custom dispatch code.
#11289 closed
Apr 29, 2025 -
Tensor cores not utilised when using `iree-run-module --device=cuda`
#11887 closed
Apr 29, 2025 -
Accuracy issues for outputs with double precision
#11924 closed
Apr 29, 2025 -
Untangle type conversion from mhlo->linalg pass.
#10897 closed
Apr 29, 2025 -
Error when mixing f32 and f64 in a model
#10348 closed
Apr 29, 2025 -
Error out on inputs with f64 values and tensors.
#8826 closed
Apr 29, 2025 -
Add RenderDoc API calls to profile Vulkan backend (non-presenting)
#20558 closed
Apr 28, 2025
64 Issues opened by 30 people
-
Correctness Issue on e2e CDNA3 matmul tests
#20905 opened
May 27, 2025 -
Slow iree.runtime.VmContext creation through python
#20900 opened
May 26, 2025 -
[CPU] `transpose -> pack` folding pattern inhibits fusion
#20896 opened
May 23, 2025 -
Add AMDGPU dialect ops for scaled fp conversions
#20890 opened
May 22, 2025 -
[Codegen] m=1, k=2, n=1 Matmul fails to compile
#20889 opened
May 22, 2025 -
How to Customize and Dynamically Adapt Tiling Strategy in IREE Based on Target Core Count
#20883 opened
May 22, 2025 -
VAE compilation failure with aggressive fusion enabled
#20875 opened
May 21, 2025 -
[CodeGen][SPIRV] Lowering for clustered reduce not implemented
#20872 opened
May 21, 2025 -
Op verification regression in tensor-parallel toy Resnet block
#20870 opened
May 20, 2025 -
build error when Generating check_cuda_ukernel_ukernel_example.mlir_module.vmfb
#20865 opened
May 20, 2025 -
[LinalgExt] Split the op definition between pure ops and LinalgExt ops
#20862 opened
May 19, 2025 -
[HIP] Support for `IREE_HAL_EXTERNAL_TIMEPOINT_TYPE_WAIT_PRIMITIVE`.
#20859 opened
May 19, 2025 -
[HAL] Support external semaphores in local-task/local-sync implementations.
#20858 opened
May 19, 2025 -
[HAL] Set HAL allocation usage bits (EXPORT/MAPPING/etc) based on affinities.
#20856 opened
May 19, 2025 -
[Stream] Add a mechanism for denoting device-device link topology for transfer elision when NUMA.
#20854 opened
May 19, 2025 -
[Stream] Elide transfers in the stream dialect when resources are accessible on participating affinities.
#20853 opened
May 19, 2025 -
Implement initial heterogeneous support MVP for CPU (+ maybe something else).
#20851 opened
May 19, 2025 -
[dlpack][runtime] Memory leak with dlpack capsules
#20849 opened
May 19, 2025 -
[Encoding] Revisit the need of EncodingNopLayout resolver
#20846 opened
May 17, 2025 -
GPU codegen bug: incorrect results on matmul with 3D expanded LHS
#20843 opened
May 16, 2025 -
[Mistral] Performance degradation with VMFB containing prefill functions of multiple batch sizes
#20836 opened
May 16, 2025 -
Tracking issue for pad-based encoding
#20835 opened
May 16, 2025 -
[GPU] Correctness issue for RoPE + Scatter dispatch when config is set on RoPE
#20834 opened
May 16, 2025 -
[CPU] Need to clean up strategy selection lit tests
#20833 opened
May 16, 2025 -
Improve SortOp canonicalization pattern to also drop unused results in buffer semantics
#20831 opened
May 15, 2025 -
Reduce unnecessary IRs from CodeGen encoding materialization tests
#20825 opened
May 15, 2025 -
[CPU] RoPE kernel producing incorrect results
#20824 opened
May 15, 2025 -
[Attention] Support for post-softmax fp8 scaling
#20823 opened
May 15, 2025 -
Failure in PadDynamicAllocPass for arith.select op
#20822 opened
May 15, 2025 -
Notes on lowering `arith.scaling_extf` and `arith.scaling_truncf` to AMDGPU
#20821 opened
May 15, 2025 -
Incorrect concatenation with both Turbine and ONNX compilation
#20819 opened
May 15, 2025 -
Add optional affinity to `stream.timepoint.join` and partitioning pass.
#20817 opened
May 15, 2025 -
Batch (some?) HAL queue operations.
#20815 opened
May 14, 2025 -
Generalize `iree_gpu.multi_mma` to allow arbitrarily-many inputs
#20814 opened
May 14, 2025 -
Dispatch inhibiting fusion
#20812 opened
May 14, 2025 -
Avoid IREE internal allocation for "Bag of Ops" mode of deploying IREE kernels
#20811 opened
May 14, 2025 -
VAE time regressed while enabling vector distribute by default.
#20810 opened
May 14, 2025 -
[ROCM][Tracker] Wan2.1 Autoencoder3d performance - MI300x
#20804 opened
May 14, 2025 -
[CPU] TileRootAndFuseProducerConsumer causes redundant stack allocation
#20792 opened
May 13, 2025 -
Add a local timepoint elision pass prior to the full global one.
#20788 opened
May 12, 2025 -
[CPU] CPU backend produces unfused IR in mmt4d->unpack->elem dispatch
#20786 opened
May 12, 2025 -
[CPU] Fail to vectorize ukernel + unpack
#20785 opened
May 12, 2025 -
ONNX MGP-STR model >100x slower compared to onnxruntime and >3x slower on CUDA vs CPU
#20775 opened
May 12, 2025 -
Slow performance for `topk`
#20772 opened
May 9, 2025 -
The adds when lowering an inbounds vector.transfer_read/vector.transfer_write should be `nuw nsw`
#20769 opened
May 9, 2025 -
[compiler] Deepseek tp8 compile fails with broadcast issue
#20762 opened
May 9, 2025 -
heterogeneous multi-device support
#20752 opened
May 8, 2025 -
DFX updateAfterInit can cause stack overflows on long tied value chains.
#20748 opened
May 7, 2025 -
Refresh Encoding dialect doc
#20742 opened
May 6, 2025 -
Tracking issue for revert of https://github.com/llvm/llvm-project/pull/137930
#20737 opened
May 6, 2025 -
[CPU] linalg.pack op is not fused in forall distribution
#20723 opened
May 5, 2025 -
Release tracker - 3.5.0 (2025-06-10)
#20722 opened
May 5, 2025 -
Add, use getPermutationAndOffset() to memref layouts
#20713 opened
May 2, 2025 -
[ROCM] Shared memory exceeded for bwd non-unit stride convs
#20710 opened
May 2, 2025 -
Define `arith.scaling_extf` and `arith.scaling_truncf`
#20702 opened
May 1, 2025 -
Plumb scaling_mfma through to IREE
#20701 opened
May 1, 2025 -
Python Runtime bindings build dependency conflict with iree-turbine
#20696 opened
May 1, 2025 -
[ONNX Zoo Models] [Regression] One or more operations with large vector sizes (32768 bytes) were found
#20685 opened
Apr 30, 2025 -
Support ml_dtypes extended numpy types in runtimes numpy I/O utilities.
#20682 opened
Apr 30, 2025 -
[tuner] `--iree-codegen-link-tuning-specs` fails to deduplicate symbol names
#20673 opened
Apr 29, 2025 -
Failure to compile quark quantized Mistral
#20672 opened
Apr 29, 2025 -
error: One or more operations with large vector sizes (8192 bytes) were found:
#20651 opened
Apr 27, 2025
43 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Lower `linalg.copy` to direct global load
#20568 commented on
May 27, 2025 • 123 new comments -
[AMDGPU] Implement gpu.subgroup_reduce with DPP intrinsics on AMD GPUs
#20468 commented on
May 22, 2025 • 6 new comments -
Add crash reproducer instrumentation
#19238 commented on
May 15, 2025 • 3 new comments -
[Bazel] Migrate to Bzlmod
#19892 commented on
Apr 30, 2025 • 3 new comments -
Support cloning operand-less dispatches
#20025 commented on
Apr 30, 2025 • 2 new comments -
Enable the linalg.mmt4d operation and add mmt4d, pack, and unpack microkernels for the riscv64
#20263 commented on
May 26, 2025 • 1 new comment -
[iree][llvm]error in converting an linalg mlir into llvm using nvvm
#20635 commented on
Apr 29, 2025 • 0 new comments -
[DispatchCreation] Move the logic to transpose indexing maps into dispatch formation logic.
#19412 commented on
Apr 30, 2025 • 0 new comments -
[Util] Fix an assert getting reached for certain nested loops in `HoistIntoGlobals`
#19576 commented on
May 15, 2025 • 0 new comments -
[CMake] Refactor finding Python
#19592 commented on
May 15, 2025 • 0 new comments -
[Flow] Always choose attention as the best op for dispatch annotation
#19696 commented on
Apr 30, 2025 • 0 new comments -
[DispatchCreation] Make truncate operations fuse with producers.
#19847 commented on
May 15, 2025 • 0 new comments -
[compiler] Fix reproducer generation/run
#20094 commented on
May 15, 2025 • 0 new comments -
[Dispatch] Only bubble reshapes when possibly blocking fusion
#20108 commented on
May 19, 2025 • 0 new comments -
[LLVMGPUVectorDistribute] VectorDistribution support for unaligned shapes
#20144 commented on
May 2, 2025 • 0 new comments -
Convert data-tiling write-ups into blog posts.
#20184 commented on
May 15, 2025 • 0 new comments -
Use caching_allocator for async allocation/deallocation in HIP
#20186 commented on
May 15, 2025 • 0 new comments -
[DT] Disable early materialization by default.
#20323 commented on
May 1, 2025 • 0 new comments -
Elide Redundant barriers
#20349 commented on
May 15, 2025 • 0 new comments -
[WIP] Unroll and flatten
#20586 commented on
May 7, 2025 • 0 new comments -
[VectorExt] Add patterns to unroll iree_vector_ext.transfer_gather
#20609 commented on
May 9, 2025 • 0 new comments -
nonlocal device driver
#20620 commented on
Apr 28, 2025 • 0 new comments -
Disable `iree-input-demote-f64-to-f32` by default?
#15830 commented on
Apr 29, 2025 • 0 new comments -
[Codegen][AMDGPU] Vectorization of stores in matmuls causes regressions
#20032 commented on
Apr 30, 2025 • 0 new comments -
Upstream narrow type emulation is breaking iree test
#20645 commented on
Apr 30, 2025 • 0 new comments -
RFC: Encoding Propagation Interfaces
#20179 commented on
Apr 30, 2025 • 0 new comments -
Data-Tiling: Missing layout transfer support in materialization patterns
#19896 commented on
May 1, 2025 • 0 new comments -
Llama_405b_tp8 OOM w/ Long Input Prompt
#19832 commented on
May 5, 2025 • 0 new comments -
Collapse MaterializeEncodingIntoPadding into the generic pass
#20160 commented on
May 13, 2025 • 0 new comments -
[Tuner] Expose root op selection logic to Python bindings
#20292 commented on
May 14, 2025 • 0 new comments -
iree-compile LLVM ERROR: operation destroyed but still has uses on ScheduleExecutionPass
#20354 commented on
May 16, 2025 • 0 new comments -
[RFC] Formally supporting some suite of "-O*" type flags
#19072 commented on
May 16, 2025 • 0 new comments -
`local-task` giving segmentation fault on `linalg.pack` when tile size is 512x32
#20595 commented on
May 19, 2025 • 0 new comments -
torch.aten.argmax kernel is slow
#20650 commented on
May 21, 2025 • 0 new comments -
[tosa] failed to legalize operation 'arith.extsi'
#19402 commented on
May 22, 2025 • 0 new comments -
Dependency graph of data-tiling issues
#17608 commented on
May 22, 2025 • 0 new comments -
[EPIC][GPU][DT] Bring up GPU data-tiling with reasonable performance
#17181 commented on
May 22, 2025 • 0 new comments -
[DT] Dependency graph of data-tiling fusion and GPU data-tiling issues
#17722 commented on
May 22, 2025 • 0 new comments -
can't build on Mac Sequoia 15.4
#20472 commented on
May 23, 2025 • 0 new comments -
error: failed to solve for affinity analysis on sharded toy llama model
#20436 commented on
May 27, 2025 • 0 new comments -
[DataTiling][Codegen] Improve layout transformation codegen
#20530 commented on
May 27, 2025 • 0 new comments -
Add a pass for simplifying numeric/index arithmetic.
#18701 commented on
Apr 30, 2025 • 0 new comments -
Add end-to-end tests for `iree_linalg_ext.custom_op`.
#18764 commented on
May 13, 2025 • 0 new comments