[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizer convert hash semi join with nulleq condition to cross semi join with other condition #57583

Open
windtalker opened this issue Nov 21, 2024 · 0 comments

Comments

@windtalker
Copy link
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!
In TiDB, it will convert intersect/except into a hash join with nulleq join condition:

mysql> create table t1(id int, v1 int, v2 int, v3 int);
Query OK, 0 rows affected (1.47 sec)

mysql> create table t2(id int, v1 int, v2 int, v3 int);
Query OK, 0 rows affected (1.19 sec)
mysql> explain select id from t1 intersect select id from t1;
+------------------------------+----------+-----------+---------------+-------------------------------------------------------------+
| id                           | estRows  | task      | access object | operator info                                               |
+------------------------------+----------+-----------+---------------+-------------------------------------------------------------+
| HashJoin_7                   | 6400.00  | root      |               | semi join, equal:[nulleq(test.t1.id, test.t1.id)]           |
| ├─TableReader_18(Build)      | 10000.00 | root      |               | data:TableFullScan_17                                       |
| │ └─TableFullScan_17         | 10000.00 | cop[tikv] | table:t1      | keep order:false, stats:pseudo                              |
| └─HashAgg_12(Probe)          | 8000.00  | root      |               | group by:test.t1.id, funcs:firstrow(test.t1.id)->test.t1.id |
|   └─TableReader_13           | 8000.00  | root      |               | data:HashAgg_8                                              |
|     └─HashAgg_8              | 8000.00  | cop[tikv] |               | group by:test.t1.id,                                        |
|       └─TableFullScan_11     | 10000.00 | cop[tikv] | table:t1      | keep order:false, stats:pseudo                              |
+------------------------------+----------+-----------+---------------+-------------------------------------------------------------+

As you can see, HashJoin_7 is a hash join with nulleq join condition
However, if the query is a little bit complex, TiDB optimizer will generate a CARTESIAN semi join

mysql> explain select t1.id from t1 join t2 on t1.v1 = t2.v2 intersect select t1.id from t1 join t2 on t1.v1 = t2.v2;
+----------------------------------+----------+-----------+---------------+----------------------------------------------------------------+
| id                               | estRows  | task      | access object | operator info                                                  |
+----------------------------------+----------+-----------+---------------+----------------------------------------------------------------+
| HashJoin_15                      | 6393.60  | root      |               | CARTESIAN semi join, other cond:nulleq(test.t1.id, test.t1.id) |
| ├─HashJoin_26(Build)             | 12487.50 | root      |               | inner join, equal:[eq(test.t1.v1, test.t2.v2)]                 |
| │ ├─TableReader_33(Build)        | 9990.00  | root      |               | data:Selection_32                                              |
| │ │ └─Selection_32               | 9990.00  | cop[tikv] |               | not(isnull(test.t2.v2))                                        |
| │ │   └─TableFullScan_31         | 10000.00 | cop[tikv] | table:t2      | keep order:false, stats:pseudo                                 |
| │ └─TableReader_30(Probe)        | 9990.00  | root      |               | data:Selection_29                                              |
| │   └─Selection_29               | 9990.00  | cop[tikv] |               | not(isnull(test.t1.v1))                                        |
| │     └─TableFullScan_28         | 10000.00 | cop[tikv] | table:t1      | keep order:false, stats:pseudo                                 |
| └─HashAgg_16(Probe)              | 7992.00  | root      |               | group by:test.t1.id, funcs:firstrow(test.t1.id)->test.t1.id    |
|   └─HashJoin_17                  | 12487.50 | root      |               | inner join, equal:[eq(test.t1.v1, test.t2.v2)]                 |
|     ├─TableReader_24(Build)      | 9990.00  | root      |               | data:Selection_23                                              |
|     │ └─Selection_23             | 9990.00  | cop[tikv] |               | not(isnull(test.t2.v2))                                        |
|     │   └─TableFullScan_22       | 10000.00 | cop[tikv] | table:t2      | keep order:false, stats:pseudo                                 |
|     └─TableReader_21(Probe)      | 9990.00  | root      |               | data:Selection_20                                              |
|       └─Selection_20             | 9990.00  | cop[tikv] |               | not(isnull(test.t1.v1))                                        |
|         └─TableFullScan_19       | 10000.00 | cop[tikv] | table:t1      | keep order:false, stats:pseudo                                 |
+----------------------------------+----------+-----------+---------------+----------------------------------------------------------------+

The direct cause seems in
https://github.com/pingcap/tidb/tree/c091dba89718599dc1b3d3e45f9b0308a8d49ef0/pkg/planner/core/operator/logicalop/logical_join.go#L200

For inner/semi join, it will collect all join's condition together with the predicate, and call LogicalJoin::extractOnCondition to re-extract the condition, and in
https://github.com/pingcap/tidb/tree/c091dba89718599dc1b3d3e45f9b0308a8d49ef0/pkg/planner/core/operator/logicalop/logical_join.go#L1214
only eq is treated as equal condition:
https://github.com/pingcap/tidb/tree/c091dba89718599dc1b3d3e45f9b0308a8d49ef0/pkg/planner/core/operator/logicalop/logical_join.go#L1257-L1261

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants