Insights: iree-org/iree
Overview
7 Releases published by 1 person
-
iree-3.5.0rc20250519 iree candidate iree-3.5.0rc20250519
published
May 19, 2025 -
iree-3.5.0rc20250520 iree candidate iree-3.5.0rc20250520
published
May 20, 2025 -
iree-3.5.0rc20250521 iree candidate iree-3.5.0rc20250521
published
May 21, 2025 -
iree-3.5.0rc20250522 iree candidate iree-3.5.0rc20250522
published
May 22, 2025 -
iree-3.5.0rc20250523 iree candidate iree-3.5.0rc20250523
published
May 23, 2025 -
iree-3.5.0rc20250524 iree candidate iree-3.5.0rc20250524
published
May 24, 2025 -
iree-3.5.0rc20250525 iree candidate iree-3.5.0rc20250525
published
May 25, 2025
32 Pull requests merged by 13 people
-
[Codegen][NFC] Make namespace usage follow IREE::[Encoding|Codegen].
#20894 merged
May 23, 2025 -
Fix Link error when `IREECompiler.lib` hits 4GiB
#20892 merged
May 22, 2025 -
Adding support for `#hal.device.optimal<...>` through to runtime.
#20879 merged
May 22, 2025 -
Adding tryLookupResourceUsageAffinity.
#20891 merged
May 22, 2025 -
Temporary automatic reference counting(ish) pass for inserting async deallocations.
#20765 merged
May 22, 2025 -
[Dispatch Creation] Fix GatherFusionPattern crash
#20887 merged
May 22, 2025 -
[Flow] Dump affinity info in DumpDispatchGraph pass.
#20888 merged
May 22, 2025 -
[Codegen] Add patterns to fold reshapes into load_from/store_to_memref
#20881 merged
May 22, 2025 -
[Preprocessing] Add SinkReshapesPass in MakeSingleDispatchPassPipeline
#20882 merged
May 22, 2025 -
[LinalgExt] Fold unit dims for iree_linalg_ext.gather
#20877 merged
May 22, 2025 -
[NFC][Codegen] Rename early bufferization op operands
#20874 merged
May 21, 2025 -
[LinalgExt] Canonicalize gather to an extract_slice
#20878 merged
May 21, 2025 -
Align iree_hal_sync_device_t allocation to 16 bytes.
#20773 merged
May 21, 2025 -
[VectorExt] Fix illegal transfer_read during gather vectorization
#20876 merged
May 21, 2025 -
[Codegen] Add ukernel support for argmax on BF16 and enable optional max value return
#20768 merged
May 21, 2025 -
[Codegen] Support multiple forall ops in ReconcileTranslationInfo
#20848 merged
May 21, 2025 -
[AMDGPU] Rewrite some gpu.shuffle xor to ds_swizzle, per upstream
#20868 merged
May 21, 2025 -
[Encoding][NFC] Move Encoding Utils out from IR definition.
#20871 merged
May 21, 2025 -
[GPU] Vector distribution support for multiple stores
#20816 merged
May 21, 2025 -
[CodeGen] Fix a MemoryEffectsOpInterface bug in FuseConsumerOp.
#20869 merged
May 20, 2025 -
Fix padding to nop encoding specialization
#20837 merged
May 20, 2025 -
[Codegen][GPU] Support padding in CombineLayoutTransformation
#20797 merged
May 20, 2025 -
[LinalgExt] Add map_scatter e2e tests for CPU and VMVX backends.
#20861 merged
May 20, 2025 -
[Dispatch Creation] Clone iree_linalg_ext.gather for attn
#20866 merged
May 20, 2025 -
[VectorExt] Vectorize `iree_linalg_ext.gather`
#20807 merged
May 20, 2025 -
[Codegen] Fix dominance issue in collapse shape fusion
#20864 merged
May 20, 2025 -
[VectorExt] Fix transfer_gather printer
#20860 merged
May 20, 2025 -
[Dispatch Creation] Handle `linalg.fill` in collapse dimensions
#20863 merged
May 19, 2025 -
Fix logic for yieldReplacements in tileDispatchUsingForall
#20844 merged
May 19, 2025 -
[LinalgExt] Clone iree_linalg_ext.gather (5/5)
#20563 merged
May 19, 2025 -
[TensorExt] Drop space from count_from_slice printer
#20850 merged
May 19, 2025 -
[Codegen][NFC] Refresh remove_single_iteration_loop.mlir test.
#20842 merged
May 18, 2025
7 Pull requests opened by 6 people
-
Integrate llvm-project@7a8090c037255b54895d61df2eb141fee48d6d83
#20873 opened
May 21, 2025 -
Add support for dynamic unit trip scf.for to scf.if
#20880 opened
May 21, 2025 -
Stream: Add topology attribute
#20885 opened
May 22, 2025 -
[CPU] Disable scf.forall distribution by default.
#20893 opened
May 22, 2025 -
Revert "[GPU] Vector distribution support for multiple stores (#20816)"
#20895 opened
May 22, 2025 -
[NFC] Rename load_from/store_to_memref to load_from/store_to_buffer
#20897 opened
May 23, 2025 -
[Codegen] Enable reshape into buffer folding in BlockDynamicDimensions
#20898 opened
May 23, 2025
27 Issues closed by 7 people
-
llvmcpu_mobilenet_v3-large_uint8.run starts failing after bumping tf-nightly
#14830 closed
May 22, 2025 -
[DT][GPU] Support codegen for pack/unpack fusion and mmt4d-like ops
#17720 closed
May 22, 2025 -
[CPU] Inefficient img2col tensor kernel
#17413 closed
May 22, 2025 -
[GPU][DT] Enable e2e tests for tensor.pack op on GPU
#17186 closed
May 22, 2025 -
[GPU][DT] Enable e2e tests for tensor.unpack op on GPU
#17187 closed
May 22, 2025 -
[GPU][DT] matmul microbenchmarks for GPU data-tiling path
#17189 closed
May 22, 2025 -
Redundant buffer allocations in LinalgExt/Linalg ops when there are constants in outs
#9813 closed
May 22, 2025 -
CI - Windows x64 MSVC failing
#20763 closed
May 22, 2025 -
[HAL] Add HAL allocator selection based on optimal affinity for a particular usage + API.
#20857 closed
May 22, 2025 -
MLIR's memref.load/store should produce `nuw` and `inbounds` GEPs
#20483 closed
May 22, 2025 -
Fix encoding dialect header/tablegen dependency.
#20681 closed
May 21, 2025 -
[VectorExt] `transfer_gather` parser fails to round-trip valid operation emitted by printer
#20802 closed
May 20, 2025 -
Correctness issue related to EmplaceAllocations
#19355 closed
May 19, 2025 -
[Codegen][AMDGPU Backend] Correctness issue for conv_2d_ngchw_gfchw
#18798 closed
May 19, 2025 -
[LLVMCPU] Data tiling of group quantized matmul on CPU
#14337 closed
May 19, 2025 -
[Flow] Improve DetachElementwiseFromNamedOpPass to match codegen expectations
#12080 closed
May 19, 2025 -
Not able to set atol in check tests
#5414 closed
May 19, 2025 -
Util constant analysis should not depend on LinalgExt dialect
#14887 closed
May 19, 2025 -
[Regression] The number of dispatches increases 12.28% after #14505
#14531 closed
May 19, 2025 -
Implement a listener for non-pattern based passes
#12858 closed
May 19, 2025 -
Missing documentation for build and test iree-dialects
#8361 closed
May 19, 2025 -
Missing fusion for winograd transform ops with their consumers
#17487 closed
May 19, 2025 -
Winograd transform generates bad memory accessing pattern for CPU
#17485 closed
May 19, 2025 -
[CPU] Explore ukernels for winograd transform ops
#17491 closed
May 19, 2025 -
[DT][Fusion] Move set encoding after forming dispatch region
#17718 closed
May 19, 2025 -
[Stream] Add compile-time attr queries for affinity resource compatibility.
#20852 closed
May 19, 2025 -
[GPU] Gather -> matmul fusion support
#18457 closed
May 19, 2025
17 Issues opened by 10 people
-
[CPU] `transpose -> pack` folding pattern inhibits fusion
#20896 opened
May 23, 2025 -
Add AMDGPU dialect ops for scaled fp conversions
#20890 opened
May 22, 2025 -
[Codegen] m=1, k=2, n=1 Matmul fails to compile
#20889 opened
May 22, 2025 -
How to Customize and Dynamically Adapt Tiling Strategy in IREE Based on Target Core Count
#20883 opened
May 22, 2025 -
VAE compilation failure with aggressive fusion enabled
#20875 opened
May 21, 2025 -
[CodeGen][SPIRV] Lowering for clustered reduce not implemented
#20872 opened
May 21, 2025 -
Op verification regression in tensor-parallel toy Resnet block
#20870 opened
May 20, 2025 -
build error when Generating check_cuda_ukernel_ukernel_example.mlir_module.vmfb
#20865 opened
May 20, 2025 -
[LinalgExt] Split the op definition between pure ops and LinalgExt ops
#20862 opened
May 19, 2025 -
[HIP] Support for `IREE_HAL_EXTERNAL_TIMEPOINT_TYPE_WAIT_PRIMITIVE`.
#20859 opened
May 19, 2025 -
[HAL] Support external semaphores in local-task/local-sync implementations.
#20858 opened
May 19, 2025 -
[HAL] Set HAL allocation usage bits (EXPORT/MAPPING/etc) based on affinities.
#20856 opened
May 19, 2025 -
[Stream] Add a mechanism for denoting device-device link topology for transfer elision when NUMA.
#20854 opened
May 19, 2025 -
[Stream] Elide transfers in the stream dialect when resources are accessible on participating affinities.
#20853 opened
May 19, 2025 -
Implement initial heterogeneous support MVP for CPU (+ maybe something else).
#20851 opened
May 19, 2025 -
[dlpack][runtime] Memory leak with dlpack capsules
#20849 opened
May 19, 2025
28 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Lower `linalg.copy` to direct global load
#20568 commented on
May 23, 2025 • 94 new comments -
[Encoding] Teach specialize encodings to handle pad encodings.
#20845 commented on
May 19, 2025 • 0 new comments -
[Util] Move Attribute based pipeline to Util dialect.
#20832 commented on
May 19, 2025 • 0 new comments -
Preserve preprocessing passpipeline attribute during `WrapEntryPointsPass`.
#20818 commented on
May 19, 2025 • 0 new comments -
Export ExecutionEngine through python bindings
#20777 commented on
May 24, 2025 • 0 new comments -
[Codegen] Add pass for specializing executable variants
#20771 commented on
May 19, 2025 • 0 new comments -
[AMDGPU] Implement gpu.subgroup_reduce with DPP intrinsics on AMD GPUs
#20468 commented on
May 22, 2025 • 0 new comments -
[Dispatch] Only bubble reshapes when possibly blocking fusion
#20108 commented on
May 19, 2025 • 0 new comments -
error: failed to solve for affinity analysis on sharded toy llama model
#20436 commented on
May 25, 2025 • 0 new comments -
can't build on Mac Sequoia 15.4
#20472 commented on
May 23, 2025 • 0 new comments -
[CPU] TileRootAndFuseProducerConsumer causes redundant stack allocation
#20792 commented on
May 22, 2025 • 0 new comments -
[DT] Dependency graph of data-tiling fusion and GPU data-tiling issues
#17722 commented on
May 22, 2025 • 0 new comments -
[EPIC][GPU][DT] Bring up GPU data-tiling with reasonable performance
#17181 commented on
May 22, 2025 • 0 new comments -
Dependency graph of data-tiling issues
#17608 commented on
May 22, 2025 • 0 new comments -
[tosa] failed to legalize operation 'arith.extsi'
#19402 commented on
May 22, 2025 • 0 new comments -
[CPU] linalg.pack op is not fused in forall distribution
#20723 commented on
May 22, 2025 • 0 new comments -
[CPU] CPU backend produces unfused IR in mmt4d->unpack->elem dispatch
#20786 commented on
May 21, 2025 • 0 new comments -
[DataTiling][Codegen] Improve layout transformation codegen
#20530 commented on
May 21, 2025 • 0 new comments -
torch.aten.argmax kernel is slow
#20650 commented on
May 21, 2025 • 0 new comments -
[CPU] RoPE kernel producing incorrect results
#20824 commented on
May 20, 2025 • 0 new comments -
[Attention] Support for post-softmax fp8 scaling
#20823 commented on
May 20, 2025 • 0 new comments -
Avoid IREE internal allocation for "Bag of Ops" mode of deploying IREE kernels
#20811 commented on
May 20, 2025 • 0 new comments -
GPU codegen bug: incorrect results on matmul with 3D expanded LHS
#20843 commented on
May 20, 2025 • 0 new comments -
heterogeneous multi-device support
#20752 commented on
May 20, 2025 • 0 new comments -
Incorrect concatenation with both Turbine and ONNX compilation
#20819 commented on
May 20, 2025 • 0 new comments -
[CPU] Fail to vectorize ukernel + unpack
#20785 commented on
May 19, 2025 • 0 new comments -
`local-task` giving segmentation fault on `linalg.pack` when tile size is 512x32
#20595 commented on
May 19, 2025 • 0 new comments -
[Encoding] Revisit the need of EncodingNopLayout resolver
#20846 commented on
May 19, 2025 • 0 new comments