StarRocks Roadmap 2025 #55526

Dshadowzh · 2025-02-04T05:00:07Z

Refer to the previous roadmaps: 2024, 2023, 2022.

Execution Engine

Query Stability
- Query Plan Manager: Enhance the robustness of the query plan generator to minimize plan instability.
- Data Skew Handling: Develop dynamic algorithms to detect and adjust for data skew, optimizing query execution.
- Cache Resilience: Implement smarter caching mechanisms to reduce query jitter during CN (compute node) changes.
Performance Tuning
- Operator Improvements: Introduce poller-free execution and runtime filter pushdown to the storage layer.
- History-Based Optimizer: Leverage query feedback to refine optimization strategies.
- ARM Performance Tuning: Resolve performance bottlenecks and edge cases for ARM architectures.
Query Optimizer
- Improve NDV (Number of Distinct Values) Accuracy: Enhance the precision of NDV statistical information.
- Improve Multi-Column Statistics Accuracy: Optimize the accuracy of statistics for multi-column data.
- Optimize Sampling Estimation Algorithms: Refine algorithms for estimating statistics through sampling.
- Column Property Propagation Refactoring.
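
To make the sampling-estimation item concrete, here is a minimal sketch of one well-known sample-based NDV estimator, GEE (the Guaranteed-Error Estimator). It is only an illustration of the problem space: the function name `gee_ndv_estimate` and the toy data are hypothetical, and nothing here is meant to describe the estimator StarRocks uses or plans to use.

```python
from collections import Counter
from math import sqrt


def gee_ndv_estimate(sample, table_rows):
    """Estimate the number of distinct values in a table from a row sample.

    GEE scales up only the values seen exactly once in the sample, since
    those are the ones most likely to stand in for unseen distinct values.
    Assumes a non-empty, uniformly drawn sample (hypothetical helper).
    """
    value_counts = Counter(sample)                    # value -> occurrences in sample
    freq_of_freq = Counter(value_counts.values())     # occurrences -> number of values
    singletons = freq_of_freq.get(1, 0)               # values seen exactly once
    repeated = sum(n for f, n in freq_of_freq.items() if f >= 2)
    scale = sqrt(table_rows / len(sample))
    return scale * singletons + repeated


# Toy sample of 8 rows drawn from a table assumed to hold 800 rows.
sample = ["a", "a", "b", "c", "c", "c", "d", "e"]
estimate = gee_ndv_estimate(sample, table_rows=800)
print(estimate)
```

The estimate depends heavily on how many singletons the sample happens to contain, which is one reason sampled NDV can be unstable on skewed columns.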
Batch Processing
- Adaptive Concurrency: Dynamically adjust the number of concurrent tasks based on system load and resource availability.
- Query Queue and Spill Stability: Improve stability and efficiency for large-scale batch processing on 1000+ core clusters.
Materialized Views
- Incremental MV Framework: Reduce full recomputation costs by enabling incremental updates for materialized views.
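
As a rough illustration of why incremental updates cost less than full recomputation, the sketch below maintains a `SUM(amount) GROUP BY key` view by folding in only the changed rows. The function names and data are hypothetical, and this is not a description of the planned framework.

```python
from collections import defaultdict


def full_recompute(rows):
    """Rebuild SUM(amount) GROUP BY key by scanning every base row."""
    totals = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


def apply_delta(view, inserted=(), deleted=()):
    """Fold only the changed rows into an existing aggregate view.

    Simplification: a group whose rows are all deleted stays in the view
    with a sum of 0; a real system would also track a per-group row count.
    """
    updated = dict(view)
    for key, amount in inserted:
        updated[key] = updated.get(key, 0) + amount
    for key, amount in deleted:
        updated[key] = updated.get(key, 0) - amount
    return updated


base = [("a", 10), ("b", 5), ("a", 7)]
view = full_recompute(base)

# Two new rows arrive; only they are processed, not the whole base table.
delta_insert = [("b", 3), ("c", 1)]
view_incremental = apply_delta(view, inserted=delta_insert)
view_full = full_recompute(base + delta_insert)
print(view_incremental == view_full)
```

The incremental path touches two rows, while the full path rescans all five. The gap grows with the size of the base table.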
Data Types
- New Data Types: Support advanced data types such as BigString and Datetime/Timestamp with time zone, and possibly Geo.
Functions
- Trino-Compatible Functions: Expand function compatibility with Trino (see #40894).
- Causal Inference: Introduce functions for causal analysis and inference.
- Others

LakeHouse

Iceberg as a Fully Featured LakeHouse
- Performant and Cost-Effective Query Engine: Enhance statistics collection, indexing, and materialized view support.
- Iceberg V3 Spec Compliance: Support for Variant, deletion vectors, geo types, and auth specifications.
- Full Operation Support: Enable DDL, DML, procedures, and seamless table migration.
- Compaction and Layout Optimization: Introduce compaction services and automatic layout arrangement.
Paimon as a Fully Featured Streaming LakeHouse
- Query: Metadata optimization, manifest cache, indexing, and point lookup optimization.
- Full Operation Support: Time travel, tag and branch management, DDL, DML...
- Paimon New Features: Variant type, views, materialized views, and incremental MV.
Other Open Lake Formats
For other formats, we will prioritize query performance improvements:
- Hudi: Enhance RLI (Record-Level Indexing), bloom filters on Parquet, and metadata table support.
- Delta Lake: Implement optimizations as needed based on user demand.

Shared Data

Make shared-data the default architecture. Focus on improving stability and real-time/search capabilities.
- Resolve stability issues in batch data ingestion.
- Reduce costs for both batch and streaming data ingestion.
Enhanced Functionality
- Time Travel and Snapshots: Improve support for time travel and snapshot functionality (see Snapshot for shared-data #53999).
- Merge Into: Enable efficient data merging operations.
- Hybrid Search: Improve mixed vector, full-text, and scalar search capabilities.
Real-Time Storage Engine
- Data Freshness: Improve data freshness with readable memtables.
- Compaction Optimization: Optimize compaction for time-series data.
- Better Pipe: Expand the use of Pipe for continuous data ingestion.
Multi-Statement Transactions
- Enhance multi-statement transactions to support delete and update, and to handle transaction conflicts better.

kateshaowanjou · 2025-02-20T08:29:25Z

Anyone interested in Paimon can also see this doc: StarRocks Community 2025 Roadmap for Paimon

jimdowling · 2025-05-02T05:33:18Z

I see that ASOF Join, which was on the 2024 roadmap, is no longer here.
That is an important join for using StarRocks to create training data for AI systems:
https://www.hopsworks.ai/dictionary/point-in-time-correct-joins
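
For readers unfamiliar with the term, here is a minimal sketch of what an ASOF (point-in-time-correct) join computes: each label row is matched with the latest feature row whose timestamp is not after the label's timestamp. The function name `asof_join` and the data are hypothetical, and this only illustrates the semantics, not StarRocks syntax or behavior.

```python
from bisect import bisect_right


def asof_join(left, right):
    """Attach to each (ts, label) the latest right value with right_ts <= ts.

    Rows with no earlier feature get None, so no future data leaks into
    the training example (hypothetical helper for illustration).
    """
    right_sorted = sorted(right)                 # order feature rows by timestamp
    right_ts = [ts for ts, _ in right_sorted]
    joined = []
    for ts, label in left:
        idx = bisect_right(right_ts, ts) - 1     # last feature at or before ts
        feature = right_sorted[idx][1] if idx >= 0 else None
        joined.append((ts, label, feature))
    return joined


labels = [(5, "click"), (12, "purchase"), (1, "view")]
features = [(2, 0.1), (10, 0.7), (11, 0.9)]
result = asof_join(labels, features)
print(result)
```

The label at timestamp 1 gets no feature because the earliest feature arrives at timestamp 2. That gap is what keeps the join point-in-time correct.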

Dshadowzh added the type/enhancement Make an enhancement to StarRocks label Feb 4, 2025

Dshadowzh pinned this issue Feb 10, 2025
