-
Notifications
You must be signed in to change notification settings - Fork 161
Remove superfluous DISTINCT #2561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Angeber. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only minor things.
@@ -103,6 +91,34 @@ void DependentGroupByReductionRule::_apply_to_plan_without_subqueries( | |||
}; | |||
auto group_by_columns = fetch_group_by_columns(); | |||
|
|||
// Early exit (i): If this is an AggregateNode for SELECT DISTINCT (i.e., it has no aggregates) and the requested | |||
// columns are already distinct, remove the whole node. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entire?
Could you elaborate a little bit more here?
- Why do we replace an aggregate with a projection?
- Why do we use the group by count? Why does it need to be the same? (maybe explain for
select distinct name from lineitem;
?)
I would have expected to see something with "DISTINCT" here.
src/test/lib/optimizer/strategy/dependent_group_by_reduction_rule_test.cpp
Show resolved
Hide resolved
src/lib/optimizer/strategy/dependent_group_by_reduction_rule.cpp
Outdated
Show resolved
Hide resolved
src/lib/optimizer/strategy/dependent_group_by_reduction_rule.hpp
Outdated
Show resolved
Hide resolved
src/test/lib/optimizer/strategy/dependent_group_by_reduction_rule_test.cpp
Outdated
Show resolved
Hide resolved
src/test/lib/optimizer/strategy/dependent_group_by_reduction_rule_test.cpp
Outdated
Show resolved
Hide resolved
@Bouncner better? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor note on a strange sounding sentencing. Otherwise ready to zünd.
Implements a pattern that removes the DISTINCT clause if the result is unique by definition or previous GROUP BY clauses.
Implements a pattern that removes the DISTINCT clause if the result is unique by definition or previous GROUP BY clauses.
Thanks to @Bouncner, I read the following note in a paper:
That's true and it does not look like they will even merge it soon: https://commitfest.postgresql.org/35/2433/
Thus, I was triggered to implement it in Hyrise and prove we can do this in less than half an hour. It does not change anything in benchmarks, but now, we have it.
(To be fair: The Postgres contributors had to implement passing UCCs in the LQP and tried multiple versions. Also, I needed more than half an hour with all tests and an edge case I forgot to cover.)