disable parquet predicate pruning for decimal types by richox · Pull Request #1033 · kwai/blaze · GitHub
Merged (1 commit), Jun 24, 2025
24 changes: 23 additions & 1 deletion native-engine/datafusion-ext-plans/src/parquet_exec.rs
@@ -20,6 +20,7 @@
 use std::{any::Any, fmt, fmt::Formatter, ops::Range, pin::Pin, sync::Arc};

 use arrow::datatypes::SchemaRef;
+use arrow_schema::DataType;
 use blaze_jni_bridge::{
     conf, conf::BooleanConf, jni_call_static, jni_new_global_ref, jni_new_string,
 };
@@ -37,7 +38,7 @@ use datafusion::{
         errors::ParquetError,
         file::metadata::ParquetMetaData,
     },
-    physical_expr::EquivalenceProperties,
+    physical_expr::{EquivalenceProperties, PhysicalExprRef},
     physical_optimizer::pruning::PruningPredicate,
     physical_plan::{
         metrics::{ExecutionPlanMetricsSet, MetricBuilder, MetricsSet},
@@ -97,6 +98,15 @@ impl ParquetExec {
                     }
                 }
             })
+            .filter(|p| {
+                // https://github.com/kwai/blaze/issues/1032
+                // predicate pruning is buggy for decimal type, so we need to
+                // temporarily disable predicate pruning for decimal type
+                matches!(
+                    expr_contains_decimal_type(p.predicate_expr(), file_schema),
+                    Ok(false)
+                )
+            })
             .filter(|p| !p.always_true());

         let page_pruning_predicate = predicate
@@ -488,3 +498,15 @@ impl AsyncFileReader for ParquetFileReaderRef {
             .boxed()
     }
 }
+
+fn expr_contains_decimal_type(expr: &PhysicalExprRef, schema: &SchemaRef) -> Result<bool> {
+    if matches!(expr.data_type(schema)?, DataType::Decimal128(..)) {
+        return Ok(true);
+    }
+    for child_expr in expr.children().iter() {
+        if expr_contains_decimal_type(&child_expr, schema)? {
+            return Ok(true);
+        }
+    }
+    Ok(false)
+}
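
Note on the behaviour of the new filter: the pruning predicate is kept only when the recursive check returns `Ok(false)`. A predicate whose root is a boolean comparison is still rejected if any operand is a decimal, and a type-resolution error also disables pruning, because `matches!(.., Ok(false))` is false for `Err`. The sketch below reproduces that logic with toy types so it can be run without DataFusion. `ToyType`, `ToyExpr`, `contains_decimal`, `keep_for_pruning` and the three sample predicates are hypothetical stand-ins introduced for illustration; they are not part of blaze, Arrow or DataFusion.

```rust
// Toy stand-ins for illustration only; NOT the Arrow/DataFusion types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int64,
    Boolean,
    Decimal128(u8, i8), // (precision, scale)
}

#[allow(dead_code)]
#[derive(Debug)]
enum ToyExpr {
    Column(&'static str, ToyType),
    Literal(ToyType),
    Gt(Box<ToyExpr>, Box<ToyExpr>),
    And(Box<ToyExpr>, Box<ToyExpr>),
    // Assumption: models a column whose type cannot be resolved.
    Unresolved(&'static str),
}

impl ToyExpr {
    fn data_type(&self) -> Result<ToyType, String> {
        match self {
            ToyExpr::Column(_, ty) | ToyExpr::Literal(ty) => Ok(ty.clone()),
            // Comparisons and conjunctions evaluate to Boolean.
            ToyExpr::Gt(..) | ToyExpr::And(..) => Ok(ToyType::Boolean),
            ToyExpr::Unresolved(name) => Err(format!("unknown column: {name}")),
        }
    }

    fn children(&self) -> Vec<&ToyExpr> {
        match self {
            ToyExpr::Gt(l, r) | ToyExpr::And(l, r) => vec![&**l, &**r],
            _ => Vec::new(),
        }
    }
}

// Same shape as expr_contains_decimal_type: check this node, then recurse.
fn contains_decimal(expr: &ToyExpr) -> Result<bool, String> {
    if matches!(expr.data_type()?, ToyType::Decimal128(..)) {
        return Ok(true);
    }
    for child in expr.children() {
        if contains_decimal(child)? {
            return Ok(true);
        }
    }
    Ok(false)
}

// Same shape as the new `.filter(..)`: only Ok(false) keeps the predicate.
fn keep_for_pruning(expr: &ToyExpr) -> bool {
    matches!(contains_decimal(expr), Ok(false))
}

// id > 10 (integers only)
fn int_predicate() -> ToyExpr {
    ToyExpr::Gt(
        Box::new(ToyExpr::Column("id", ToyType::Int64)),
        Box::new(ToyExpr::Literal(ToyType::Int64)),
    )
}

// in_stock AND (price > <decimal literal>)
fn decimal_predicate() -> ToyExpr {
    ToyExpr::And(
        Box::new(ToyExpr::Column("in_stock", ToyType::Boolean)),
        Box::new(ToyExpr::Gt(
            Box::new(ToyExpr::Column("price", ToyType::Decimal128(10, 2))),
            Box::new(ToyExpr::Literal(ToyType::Decimal128(10, 2))),
        )),
    )
}

// id > <unresolved column>
fn broken_predicate() -> ToyExpr {
    ToyExpr::Gt(
        Box::new(ToyExpr::Column("id", ToyType::Int64)),
        Box::new(ToyExpr::Unresolved("missing")),
    )
}
```

With these definitions, `keep_for_pruning(&int_predicate())` is true, while `keep_for_pruning(&decimal_predicate())` and `keep_for_pruning(&broken_predicate())` are both false. Treating errors the same as "contains decimal" is the conservative choice here: skipping pruning only costs extra I/O, whereas pruning with a wrong predicate could drop rows.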