MDEV-39014: FULL JOIN Phase 2#4940

Open
DaveGosselin-MariaDB wants to merge 27 commits into main from 13.2-mdev-39014-full-join-p2

Conversation

@DaveGosselin-MariaDB
Member

In phase 1, FULL [OUTER] JOIN was only supported when simplify_joins()
could rewrite it into an equivalent LEFT, RIGHT, or INNER JOIN based
on NULL-rejecting WHERE predicates.  Queries that could not be
rewritten raised ER_NOT_SUPPORTED_YET.  (Phase 1 was not released.)

This commit removes that restriction by adding proper support for FULL
JOIN.  Execution first runs a "LEFT JOIN pass" that emits matched rows
and unmatched left rows (null-complemented on the right), then a second
"null-complement" pass that rescans the right table to emit the right
rows that were never matched (null-complemented on the left).
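The two passes described above can be modelled in a few lines of Python. This is only a toy sketch over in-memory tuples, not the server's executor: the function name `full_join`, the `matched_right` set (standing in for whatever the server uses to remember matched right rows), and the sample tables are all assumptions made for illustration.

```python
def full_join(left, right, on, left_width, right_width):
    """Toy two-pass FULL JOIN over lists of tuples (not the server code)."""
    matched_right = set()   # positions of right rows already emitted by pass 1
    out = []

    # Pass 1 ("LEFT JOIN pass"): matched rows, plus unmatched left rows
    # padded with NULLs on the right.
    for l in left:
        found = False
        for pos, r in enumerate(right):
            if on(l, r):
                out.append(l + r)
                matched_right.add(pos)
                found = True
        if not found:
            out.append(l + (None,) * right_width)

    # Pass 2 ("null-complement pass"): rescan the right table and emit every
    # row that pass 1 never matched, padded with NULLs on the left.
    for pos, r in enumerate(right):
        if pos not in matched_right:
            out.append((None,) * left_width + r)
    return out


t1 = [(1,), (2,), (3,)]
t2 = [(2,), (3,), (4,)]
result = full_join(t1, t2, lambda l, r: l[0] == r[0], 1, 1)
print(result)  # [(1, None), (2, 2), (3, 3), (None, 4)]
```

The second pass only needs to know which right rows were matched, which is why a per-right-row record is enough for a single FULL JOIN in this model.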

FULL JOIN supports nested joins on the left of the FULL JOIN,
NATURAL FULL JOIN, semi-joins, CTEs / derived tables (kept
materialized when they participate in a FULL JOIN), prepared
statements, stored procedures, and aggregates.  Examples:

  SELECT * FROM (d1 FULL JOIN d2 ON d1.a = d2.a)
              FULL JOIN t3 ON d1.a = t3.a;

  SELECT * FROM t1 NATURAL FULL JOIN t2;

  SELECT * FROM t1 INNER JOIN t2 FULL JOIN t3 ON t1.a = t3.a;

  PREPARE st FROM
    'SELECT COUNT(*) FROM t1 FULL JOIN t2 ON t1.a = t2.a';

Limitations:
  - The join cache is disabled whenever a FULL JOIN is present, which
    can regress plans for large FULL JOINs compared to the rewritten
    cases.  A follow-up will re-enable it where safe.
  - Statistics and cost estimates for the null-complement pass have
    not been fully implemented; the optimizer may under- or
    over-estimate FULL JOIN costs in plans involving multiple
    FULL JOINs.  Again, a follow-up will optimize the cost calculations.
  - Optimizations for constant tables are not fully supported.
  - Nested tables on the right side of a FULL JOIN are not yet supported.

Comment thread sql/sql_select.cc Outdated
@spetrunia
Member

spetrunia commented May 15, 2026

Putting the NULL-complemented record generation into the branch where end_of_records=1 seems wrong.
Consider a testcase building on top of the previous (crashing testcase).

create table t10 (a int, b int, index(a));
create table t11 (a int, b int, index(a));
insert into t10 select seq, seq from seq_1_to_10;
insert into t11 select seq*2, seq*2 from seq_1_to_10;
create table t20 (a varchar(100), b varchar(100), index(a));
create table t21 (a varchar(100), b varchar(100), index(a));
insert into t20 values('match','match'), ('no-match-t20', 'no-match-t20');
insert into t21 values('match','match'), ('no-match-t21', 'no-match-t21');

Building block one (no error) here:

select * from t10 full outer join t11 on t10.a=t11.a;
+------+------+------+------+
| a    | b    | a    | b    |
+------+------+------+------+
|    1 |    1 | NULL | NULL |
|    2 |    2 |    2 |    2 |
|    3 |    3 | NULL | NULL |
|    4 |    4 |    4 |    4 |
|    5 |    5 | NULL | NULL |
|    6 |    6 |    6 |    6 |
|    7 |    7 | NULL | NULL |
|    8 |    8 |    8 |    8 |
|    9 |    9 | NULL | NULL |
|   10 |   10 |   10 |   10 |
| NULL | NULL |   12 |   12 |
| NULL | NULL |   14 |   14 |
| NULL | NULL |   16 |   16 |
| NULL | NULL |   18 |   18 |
| NULL | NULL |   20 |   20 |
+------+------+------+------+
15 rows in set (0.004 sec)

Building block two (all is fine here, too):

select * from t20 full outer join t21 on t20.a=t21.a;
+--------------+--------------+--------------+--------------+
| a            | b            | a            | b            |
+--------------+--------------+--------------+--------------+
| match        | match        | match        | match        |
| no-match-t20 | no-match-t20 | NULL         | NULL         |
| NULL         | NULL         | no-match-t21 | no-match-t21 |
+--------------+--------------+--------------+--------------+
3 rows in set (0.002 sec)

The problem query:

select * from (t10 full outer join t11 on t10.a=t11.a) , (t20 full outer join t21 on t20.a=t21.a);
+------+------+------+------+--------------+--------------+--------------+--------------+
| a    | b    | a    | b    | a            | b            | a            | b            |
+------+------+------+------+--------------+--------------+--------------+--------------+
|    1 |    1 | NULL | NULL | match        | match        | match        | match        |
|    2 |    2 |    2 |    2 | match        | match        | match        | match        |
|    3 |    3 | NULL | NULL | match        | match        | match        | match        |
|    4 |    4 |    4 |    4 | match        | match        | match        | match        |
|    5 |    5 | NULL | NULL | match        | match        | match        | match        |
|    6 |    6 |    6 |    6 | match        | match        | match        | match        |
|    7 |    7 | NULL | NULL | match        | match        | match        | match        |
|    8 |    8 |    8 |    8 | match        | match        | match        | match        |
|    9 |    9 | NULL | NULL | match        | match        | match        | match        |
|   10 |   10 |   10 |   10 | match        | match        | match        | match        |

The row combination with (NULL-NULL-12-12-match-match-match-match) is missing, along with "similar ones" (imprecise wording but hopefully it's clear)

|    1 |    1 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    2 |    2 |    2 |    2 | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    3 |    3 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    4 |    4 |    4 |    4 | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    5 |    5 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    6 |    6 |    6 |    6 | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    7 |    7 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    8 |    8 |    8 |    8 | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    9 |    9 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|   10 |   10 |   10 |   10 | no-match-t20 | no-match-t20 | NULL         | NULL         |

Similar question here.

|    1 |    1 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    2 |    2 |    2 |    2 | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    3 |    3 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    4 |    4 |    4 |    4 | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    5 |    5 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    6 |    6 |    6 |    6 | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    7 |    7 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    8 |    8 |    8 |    8 | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    9 |    9 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|   10 |   10 |   10 |   10 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   12 |   12 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   14 |   14 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   16 |   16 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   18 |   18 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   20 |   20 | NULL         | NULL         | no-match-t21 | no-match-t21 |
+------+------+------+------+--------------+--------------+--------------+--------------+
35 rows in set (0.006 sec)

This one is evaluated at the end of execution so here we get the (NULL-NULL-12-12, ... ).
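As a cross-check of the expected result size, the sketch below rebuilds the four test tables in Python and computes the cross product of the two FULL JOIN results. The helper `full_join` is a hypothetical in-memory model, not server code; the point is only that 15 x 3 = 45 rows are expected, against the 35 shown above.

```python
from itertools import product


def full_join(left, right, on, lw, rw):
    """Toy FULL JOIN: LEFT JOIN pass, then unmatched right rows."""
    seen, out = set(), []
    for l in left:
        hits = [(i, r) for i, r in enumerate(right) if on(l, r)]
        seen.update(i for i, _ in hits)
        out += [l + r for _, r in hits] or [l + (None,) * rw]
    out += [(None,) * lw + r for i, r in enumerate(right) if i not in seen]
    return out


t10 = [(n, n) for n in range(1, 11)]
t11 = [(2 * n, 2 * n) for n in range(1, 11)]
t20 = [("match", "match"), ("no-match-t20", "no-match-t20")]
t21 = [("match", "match"), ("no-match-t21", "no-match-t21")]
same_a = lambda l, r: l[0] == r[0]

block_one = full_join(t10, t11, same_a, 2, 2)
block_two = full_join(t20, t21, same_a, 2, 2)
expected = [a + b for a, b in product(block_one, block_two)]

print(len(block_one), len(block_two), len(expected))  # 15 3 45
```

The ten rows absent from the server output are the five (NULL, NULL, 12..20) rows of the first block paired with the "match" and "no-match-t20" rows of the second.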

@spetrunia
Member

spetrunia commented May 15, 2026

What about possible join orders?

Take t10 and t11 from the previous testcase.

create table two (c int);
insert into two values (1),(2);
explain select * from two, (t10 full outer join t11 on t10.a=t11.a);
+------+-------------+-------+------+---------------+------+---------+------------+------+------------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref        | rows | Extra                              |
+------+-------------+-------+------+---------------+------+---------+------------+------+------------------------------------+
|    1 | SIMPLE      | t10   | ALL  | a             | NULL | NULL    | NULL       | 8    |                                    |
|    1 | SIMPLE      | t11   | ref  | a             | a    | 5       | test.t10.a | 1    | Using where                        |
|    1 | SIMPLE      | two   | ALL  | NULL          | NULL | NULL    | NULL       | 2    | Using join buffer (flat, BNL join) |
+------+-------------+-------+------+---------------+------+---------+------------+------+------------------------------------+

The join order is t10,t11, two regardless of the table access costs.

This happens because, with an empty join prefix, JOIN::get_allowed_nj_tables() calls
restrict_to_unplaced_fj_tables(), which returns {t10,t11}.

But why would 'two' not be allowed as the first table?

@spetrunia
Member

(I think I wrote this before somewhere but writing here to not forget).

We should consider putting some FULL OUTER JOIN code into its own file(s).
How about the file sql/opt_full_join.{h,cc} and the first one to put there: class full_join_duplicate_filter.

Some logic is of course all over the place in simplify_joins() or NL-join runtime and so is hard to move...

Comment thread sql/sql_select.cc Outdated
@DaveGosselin-MariaDB
Member Author

(I think I wrote this before somewhere but writing here to not forget).

We should consider putting some FULL OUTER JOIN code into its own file(s). How about the file sql/opt_full_join.{h,cc} and the first one to put there: class full_join_duplicate_filter.

Some logic is of course all over the place in simplify_joins() or NL-join runtime and so is hard to move...

I agree with this. Let's shake out all the other changes first, then make that code movement be the last step.

@spetrunia
Member

Let's try this also:

@spetrunia
Member

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements support for FULL [OUTER] JOIN and NATURAL FULL JOIN, including parser updates, optimizer logic for null-complementation, and extensive test coverage. Feedback identifies a logic error in the NATURAL FULL JOIN column coalescing loop that skips the last element and an incorrect implementation of the peek_ref() iterator method. Additionally, it is recommended to maintain the original bit value for the JOIN_TYPE_OUTER constant to avoid breaking existing logic.

Comment thread sql/sql_base.cc
Comment thread sql/sql_list.h
Comment thread sql/table.h
@spetrunia
Member

spetrunia commented May 18, 2026

Consider this:

create table t1 (
  a int, 
  b int,
  index(a),
  index(b)
);
create table t2 like t1;
insert into t1 select seq, seq from seq_1_to_100; 
insert into t2 select seq, seq from seq_95_to_195;

(Cross-database script: https://gist.github.com/spetrunia/43f3df610e5cbcd15a2f50e465edfb43)

explain
select * from t1 full outer join t2 on (t1.a=t2.a and t1.b>90 and t2.b<110)

gives

+------+-------------+-------+-------+---------------+------+---------+---------+------+-----------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref     | rows | Extra                 |
+------+-------------+-------+-------+---------------+------+---------+---------+------+-----------------------+
|    1 | SIMPLE      | t1    | range | a,b           | b    | 5       | NULL    | 10   | Using index condition |
|    1 | SIMPLE      | t2    | ref   | a,b           | a    | 5       | j2.t1.a | 1    | Using where           |
+------+-------------+-------+-------+---------------+------+---------+---------+------+-----------------------+

So, we will use range access for table t1.
But let's take for example the row in t1 with (a,b)= (1,1).
ON expression will produce nothing, so the row must be in the query output.

Running the query, I see

MariaDB [j2]> select * from t1 full outer join t2 on (t1.a=t2.a and t1.b>90 and t2.b<110);
+------+------+------+------+
| a    | b    | a    | b    |
+------+------+------+------+
|   91 |   91 | NULL | NULL |
|   92 |   92 | NULL | NULL |
|   93 |   93 | NULL | NULL |
|   94 |   94 | NULL | NULL |
|   95 |   95 |   95 |   95 |
|   96 |   96 |   96 |   96 |
|   97 |   97 |   97 |   97 |
|   98 |   98 |   98 |   98 |
|   99 |   99 |   99 |   99 |
|  100 |  100 |  100 |  100 |
| NULL | NULL |  101 |  101 |
| NULL | NULL |  102 |  102 |
| NULL | NULL |  103 |  103 |
...
| NULL | NULL |  195 |  195 |
+------+------+------+------+
105 rows in set (3.539 sec)

That is, the row is not there.
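The expected row count can be checked with a small in-memory model. The sketch below is hypothetical Python, not server code: `full_join` is a toy helper, and the "pre-filtered" variant is only a guess at the mechanism (applying the t1.b>90 part of the ON clause as a scan filter on t1), chosen because it reproduces the 105 rows shown above.

```python
def full_join(left, right, on, lw, rw):
    """Toy FULL JOIN: LEFT JOIN pass, then unmatched right rows."""
    seen, out = set(), []
    for l in left:
        hits = [(i, r) for i, r in enumerate(right) if on(l, r)]
        seen.update(i for i, _ in hits)
        out += [l + r for _, r in hits] or [l + (None,) * rw]
    out += [(None,) * lw + r for i, r in enumerate(right) if i not in seen]
    return out


t1 = [(n, n) for n in range(1, 101)]     # seq_1_to_100
t2 = [(n, n) for n in range(95, 196)]    # seq_95_to_195
on = lambda l, r: l[0] == r[0] and l[1] > 90 and r[1] < 110

# In a FULL JOIN the ON clause decides matching only; it must not drop rows.
correct = full_join(t1, t2, on, 2, 2)

# Assumed faulty plan: t1 is restricted to b > 90 before the join runs.
prefiltered = full_join([row for row in t1 if row[1] > 90], t2, on, 2, 2)

print(len(correct), len(prefiltered))  # 195 105
```

In this model the correct result has 6 matched rows, 94 unmatched t1 rows, and 95 unmatched t2 rows; the pre-filtered variant loses the 90 rows of t1 with b <= 90, including (1, 1, NULL, NULL).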

@DaveGosselin-MariaDB
Member Author

DaveGosselin-MariaDB commented May 18, 2026

What about possible join orders?

Take t10 and t11 from the previous testcase.

create table two (c int);
insert into two values (1),(2);
explain select * from two, (t10 full outer join t11 on t10.a=t11.a);
+------+-------------+-------+------+---------------+------+---------+------------+------+------------------------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref        | rows | Extra                              |
+------+-------------+-------+------+---------------+------+---------+------------+------+------------------------------------+
|    1 | SIMPLE      | t10   | ALL  | a             | NULL | NULL    | NULL       | 8    |                                    |
|    1 | SIMPLE      | t11   | ref  | a             | a    | 5       | test.t10.a | 1    | Using where                        |
|    1 | SIMPLE      | two   | ALL  | NULL          | NULL | NULL    | NULL       | 2    | Using join buffer (flat, BNL join) |
+------+-------------+-------+------+---------------+------+---------+------------+------+------------------------------------+

The join order is t10,t11, two regardless of the table access costs.

This happens because, with an empty join prefix, JOIN::get_allowed_nj_tables() calls restrict_to_unplaced_fj_tables(), which returns {t10,t11}.

But why would 'two' not be allowed as the first table?

After the "LEFT JOIN" pass completes, we start a second pass to generate the null-complement rows. Walking the join_tabs, we look for one with a fj_dups filter on the right side of a FULL JOIN. run_fj_null_complement_pass does a full rescan, and for each row whose rowid is not in fj_dups, evaluate_fj_null_complement_row marks the FULL JOIN's left operand as null and forwards the row through next_select to the rest of the join. This forwarding goes in only one direction. Tables after the FULL JOIN's right side are reached through next_select, but tables before it are never iterated again; their row buffers stay frozen at whatever value the "LEFT JOIN" pass last read. When a table that is not part of any FULL JOIN sits before the FULL JOIN, each null-complement row gets paired only with that frozen value, losing every other combination. restrict_to_unplaced_fj_tables enforces the prefix order to keep this from happening.

In principle, we could repeat the entire join in "null-complement mode", and that change is (probably) needed for phase 3's support for nested joins on the right side of a FULL JOIN (but one of the biggest obstacles is that fj_dups is keyed by the right side's rowid alone). Another idea is to use materialization: if the FULL JOIN nest writes its result into a temp table, then the enclosing join can treat it like a base table, so the optimizer is free to place it anywhere in the join order. We could also implement another FULL JOIN strategy like the one Postgres uses, namely a form of hash join, for cases like this.

At the moment I'm writing this, the current tip of this branch allows you to use straight_join to force that join order. But, if you do this, then we get a wrong result (I will make a fix to prevent this from happening):

MariaDB [test]> select * from two, (t10 full outer join t11 on t10.a=t11.a);
+------+------+------+------+
| c    | a    | a    | b    |
+------+------+------+------+
|    1 |    1 |    1 |    5 |
|    1 |    2 |    2 |   15 |
|    1 |    3 | NULL | NULL |
|    1 | NULL |    4 |   25 |
|    2 |    1 |    1 |    5 |
|    2 |    2 |    2 |   15 |
|    2 |    3 | NULL | NULL |
|    2 | NULL |    4 |   25 |
+------+------+------+------+
8 rows in set (0.003 sec)

MariaDB [test]> select straight_join * from two, (t10 full outer join t11 on t10.a=t11.a);
+------+------+------+------+
| c    | a    | a    | b    |
+------+------+------+------+
|    1 |    1 |    1 |    5 |
|    1 |    2 |    2 |   15 |
|    1 |    3 | NULL | NULL |
|    2 |    1 |    1 |    5 |
|    2 |    2 |    2 |   15 |
|    2 |    3 | NULL | NULL |
|    2 | NULL |    4 |   25 |
+------+------+------+------+
7 rows in set (0.002 sec)
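The difference between the two outputs can be reproduced with a toy nested-loop model. Everything in this Python sketch is an assumption for illustration: the table contents are inferred from the result sets shown above, and the function names are hypothetical. It contrasts running the null-complement pass once per prefix row with running it once at the end, when the prefix table's row buffer holds its last value.

```python
two = [1, 2]
t10 = [1, 2, 3]                      # column a
t11 = [(1, 5), (2, 15), (4, 25)]     # columns a, b


def left_join_pass(c, matched, out):
    """Emit matched rows and unmatched t10 rows for one row of `two`."""
    for a in t10:
        hit = False
        for pos, (ra, rb) in enumerate(t11):
            if a == ra:
                out.append((c, a, ra, rb))
                matched.add(pos)
                hit = True
        if not hit:
            out.append((c, a, None, None))


def null_complement_pass(c, matched, out):
    """Emit t11 rows that were never matched, with t10 set to NULL."""
    for pos, (ra, rb) in enumerate(t11):
        if pos not in matched:
            out.append((c, None, ra, rb))


def per_prefix_row():
    out = []
    for c in two:
        matched = set()
        left_join_pass(c, matched, out)
        null_complement_pass(c, matched, out)
    return out


def once_at_end():
    out, matched, c = [], set(), None
    for c in two:
        left_join_pass(c, matched, out)
    null_complement_pass(c, matched, out)   # c is frozen at its last value
    return out


print(len(per_prefix_row()), len(once_at_end()))  # 8 7
```

In this model the only row lost is (1, NULL, 4, 25), the same row that is absent from the straight_join output.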

For now, I think it's best to emit a Warning when the user specifies both straight_join and FULL JOIN and drop the straight_join from the query, unless the straight_join forces the tables participating in the FULL JOIN to be first in the join order (then it's fine).

In general, to summarize a bit of the above:

  • The phase 1 and 2 work has limitations with the placement of FULL JOINs in queries
  • We keep hitting these limitations in the issues you and Pavithra have found
  • These limitations are partly due to the lack of support of nested joins on the right side of a FULL JOIN, or FULL JOINs on the inner side of another join type.
  • Limitations will be overcome by phase 3 work which will require FULL JOIN materialization or perhaps another join strategy to implement
  • Gatekeeping of FULL JOINs implemented now will be removed later as we finish the implementation.

@DaveGosselin-MariaDB
Member Author

DaveGosselin-MariaDB commented May 18, 2026

Putting the NULL-complemented record generation into the branch where end_of_records=1 seems wrong. Consider a testcase building on top of the previous (crashing testcase).

create table t10 (a int, b int, index(a));
create table t11 (a int, b int, index(a));
insert into t10 select seq, seq from seq_1_to_10;
insert into t11 select seq*2, seq*2 from seq_1_to_10;
create table t20 (a varchar(100), b varchar(100), index(a));
create table t21 (a varchar(100), b varchar(100), index(a));
insert into t20 values('match','match'), ('no-match-t20', 'no-match-t20');
insert into t21 values('match','match'), ('no-match-t21', 'no-match-t21');

Building block one (no error) here:

select * from t10 full outer join t11 on t10.a=t11.a;
+------+------+------+------+
| a    | b    | a    | b    |
+------+------+------+------+
|    1 |    1 | NULL | NULL |
|    2 |    2 |    2 |    2 |
|    3 |    3 | NULL | NULL |
|    4 |    4 |    4 |    4 |
|    5 |    5 | NULL | NULL |
|    6 |    6 |    6 |    6 |
|    7 |    7 | NULL | NULL |
|    8 |    8 |    8 |    8 |
|    9 |    9 | NULL | NULL |
|   10 |   10 |   10 |   10 |
| NULL | NULL |   12 |   12 |
| NULL | NULL |   14 |   14 |
| NULL | NULL |   16 |   16 |
| NULL | NULL |   18 |   18 |
| NULL | NULL |   20 |   20 |
+------+------+------+------+
15 rows in set (0.004 sec)

Building block two (all is fine here, too):

select * from t20 full outer join t21 on t20.a=t21.a;
+--------------+--------------+--------------+--------------+
| a            | b            | a            | b            |
+--------------+--------------+--------------+--------------+
| match        | match        | match        | match        |
| no-match-t20 | no-match-t20 | NULL         | NULL         |
| NULL         | NULL         | no-match-t21 | no-match-t21 |
+--------------+--------------+--------------+--------------+
3 rows in set (0.002 sec)

The problem query:

select * from (t10 full outer join t11 on t10.a=t11.a) , (t20 full outer join t21 on t20.a=t21.a);
+------+------+------+------+--------------+--------------+--------------+--------------+
| a    | b    | a    | b    | a            | b            | a            | b            |
+------+------+------+------+--------------+--------------+--------------+--------------+
|    1 |    1 | NULL | NULL | match        | match        | match        | match        |
|    2 |    2 |    2 |    2 | match        | match        | match        | match        |
|    3 |    3 | NULL | NULL | match        | match        | match        | match        |
|    4 |    4 |    4 |    4 | match        | match        | match        | match        |
|    5 |    5 | NULL | NULL | match        | match        | match        | match        |
|    6 |    6 |    6 |    6 | match        | match        | match        | match        |
|    7 |    7 | NULL | NULL | match        | match        | match        | match        |
|    8 |    8 |    8 |    8 | match        | match        | match        | match        |
|    9 |    9 | NULL | NULL | match        | match        | match        | match        |
|   10 |   10 |   10 |   10 | match        | match        | match        | match        |

The row combination with (NULL-NULL-12-12-match-match-match-match) is missing, along with "similar ones" (imprecise wording but hopefully it's clear)

|    1 |    1 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    2 |    2 |    2 |    2 | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    3 |    3 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    4 |    4 |    4 |    4 | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    5 |    5 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    6 |    6 |    6 |    6 | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    7 |    7 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    8 |    8 |    8 |    8 | no-match-t20 | no-match-t20 | NULL         | NULL         |
|    9 |    9 | NULL | NULL | no-match-t20 | no-match-t20 | NULL         | NULL         |
|   10 |   10 |   10 |   10 | no-match-t20 | no-match-t20 | NULL         | NULL         |

Similar question here.

|    1 |    1 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    2 |    2 |    2 |    2 | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    3 |    3 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    4 |    4 |    4 |    4 | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    5 |    5 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    6 |    6 |    6 |    6 | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    7 |    7 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    8 |    8 |    8 |    8 | NULL         | NULL         | no-match-t21 | no-match-t21 |
|    9 |    9 | NULL | NULL | NULL         | NULL         | no-match-t21 | no-match-t21 |
|   10 |   10 |   10 |   10 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   12 |   12 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   14 |   14 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   16 |   16 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   18 |   18 | NULL         | NULL         | no-match-t21 | no-match-t21 |
| NULL | NULL |   20 |   20 | NULL         | NULL         | no-match-t21 | no-match-t21 |
+------+------+------+------+--------------+--------------+--------------+--------------+
35 rows in set (0.006 sec)

This one is evaluated at the end of execution so here we get the (NULL-NULL-12-12, ... ).

Hi @spetrunia, with a bit of hacking on my end I can see how we can lift this up from the end_of_records logic and make the null-complement generation fire during the normal "LEFT JOIN" scan. I just need a bit more time to experiment with it. I think that will cause the missing rows to appear because it would allow tables that need to be re-scanned for the null-complement pass to get scanned again, rather than "locking" them at the last read row (which is the case now since the null-complement pass is tied to the end_of_records pass).

@DaveGosselin-MariaDB
Member Author

@spetrunia hmm perhaps lifting the null-complement pass out of end_of_records will solve the other case ("What about possible join orders?") too.......

@spetrunia
Member

spetrunia commented May 19, 2026

A question about NATURAL JOIN processing.

create table t30 (
  a int not null,
  t30val varchar(32)
);
insert into t30 values 
  ('1', 't30-1'),
  ('2', 't30-2-nomatch');

create table t31 (
  a int not null,
  t31val varchar(32)
);

insert into t31 values 
  ('1', 't31-1'),
  ('3', 't31-3-nomatch');
select * from t30 natural full join t31;
+------+---------------+---------------+
| a    | t30val        | t31val        |
+------+---------------+---------------+
|    1 | t30-1         | t31-1         |
|    2 | t30-2-nomatch | NULL          |
|    3 | NULL          | t31-3-nomatch |
+------+---------------+---------------+

Correct.

select * from t30 natural full join t31  where a < 5;
+------+---------------+--------+
| a    | t30val        | t31val |
+------+---------------+--------+
|    1 | t30-1         | t31-1  |
|    2 | t30-2-nomatch | NULL   |
+------+---------------+--------+

Incorrect.

It seems that where a < 5 is interpreted as where t30.a < 5.
The actual meaning should be where natural_join_output.a < 5, that is, where coalesce(t30.a, t31.a) < 5.
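The two interpretations can be compared on the three joined rows from the first query. This Python sketch is only a model: the helper names `coalesce` and `sql_lt` are hypothetical, and `sql_lt` encodes the SQL rule that a comparison with NULL is not true, so WHERE drops the row.

```python
# Rows of t30 FULL JOIN t31 before column coalescing:
# (t30.a, t30val, t31.a, t31val)
joined = [
    (1, "t30-1", 1, "t31-1"),
    (2, "t30-2-nomatch", None, None),
    (None, None, 3, "t31-3-nomatch"),
]


def coalesce(*values):
    """Return the first non-NULL value, or None if all are NULL."""
    return next((v for v in values if v is not None), None)


def sql_lt(value, bound):
    """SQL-style '<' as seen by WHERE: NULL < n is not true."""
    return value is not None and value < bound


by_left_column = [r for r in joined if sql_lt(r[0], 5)]
by_coalesce = [r for r in joined if sql_lt(coalesce(r[0], r[2]), 5)]

print(len(by_left_column), len(by_coalesce))  # 2 3
```

Filtering on t30.a alone drops the row that exists only in t31, which matches the two-row output reported above.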

@DaveGosselin-MariaDB DaveGosselin-MariaDB force-pushed the 13.2-mdev-39014-full-join-p2 branch from c6bfb11 to 8c390c9 Compare May 21, 2026 23:49
Syntax support for FULL JOIN, FULL OUTER JOIN, NATURAL FULL JOIN, and
NATURAL FULL OUTER JOIN in the parser.

While we accept full join syntax, such joins are not yet supported.
Queries specifying any of the above joins will fail with
ER_NOT_SUPPORTED_YET.
Allow FULL OUTER JOIN queries to proceed through name resolution.

Permits limited EXPLAIN EXTENDED support so tests can prove that the
JOIN_TYPE_* table markings are reflected when the query is echoed back by the
server.  This happens in at least two places:  via a Warning message during
EXPLAIN EXTENDED and during VIEW .frm file creation.

While the query plan output is mostly meaningless at this point, this
limited EXPLAIN support improves the SELECT_LEX print function for the new
JOIN types.

TODO: fix PS protocol before end of FULL OUTER JOIN development
Rewrite FULL OUTER JOIN queries as either LEFT, RIGHT, or INNER JOIN
by checking if and how the WHERE clause rejects nulls.

For example, the queries in each of the following pairs are
equivalent.  In the first pair, the WHERE condition rejects rows in
which the left table is NULL-complemented and still allows matches in
the right table (or NULLs from the right table) for the remaining
rows.  In the second pair, the WHERE condition rejects
NULL-complemented rows from both sides:

  SELECT * FROM t1 FULL JOIN t2 ON t1.v = t2.v WHERE t1.v IS NOT NULL;
  SELECT * FROM t1 LEFT JOIN t2 ON t1.v = t2.v WHERE t1.v IS NOT NULL;

  SELECT * FROM t1 FULL JOIN t2 ON t1.v = t2.v WHERE t1.a=t2.a;
  SELECT * FROM t1 INNER JOIN t2 ON t1.v = t2.v WHERE t1.a=t2.a;
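The first equivalence can be checked on toy data. The Python sketch below is a hypothetical in-memory model (the helper `outer_join` and the sample rows are assumptions), and it applies the same WHERE t1.v IS NOT NULL filter to both the FULL JOIN and the LEFT JOIN, since a LEFT JOIN without that filter would still return t1 rows whose v is NULL.

```python
def outer_join(left, right, on, lw, rw, keep_unmatched_right):
    """Toy LEFT JOIN; also emits unmatched right rows when asked (FULL JOIN)."""
    seen, out = set(), []
    for l in left:
        hits = [(i, r) for i, r in enumerate(right) if on(l, r)]
        seen.update(i for i, _ in hits)
        out += [l + r for _, r in hits] or [l + (None,) * rw]
    if keep_unmatched_right:
        out += [(None,) * lw + r for i, r in enumerate(right) if i not in seen]
    return out


t1 = [(1, 1), (2, 2), (3, None)]         # (id, v)
t2 = [(10, 2), (11, 3), (12, None)]      # (id, v)
on_v = lambda l, r: l[1] is not None and l[1] == r[1]   # NULL = NULL is not true
t1_v_not_null = lambda row: row[1] is not None          # WHERE t1.v IS NOT NULL

full_where = [r for r in outer_join(t1, t2, on_v, 2, 2, True) if t1_v_not_null(r)]
left_where = [r for r in outer_join(t1, t2, on_v, 2, 2, False) if t1_v_not_null(r)]
left_plain = outer_join(t1, t2, on_v, 2, 2, False)

print(full_where == left_where, len(left_plain))  # True 3
```

The filter removes both the right-only rows (where every t1 column is NULL) and the t1 row whose own v is NULL, so the FULL JOIN and the LEFT JOIN agree once the WHERE is kept.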
FULL JOIN yields result sets with columns from both tables participating in
the join (for the sake of explanation, assume base tables).  However,
NATURAL FULL JOIN should show unique columns in the output.

Given the following query:
  SELECT * FROM t1 NATURAL FULL JOIN t2;
transform it into:
  SELECT COALESCE(t1.f_1, t2.f_1), ..., COALESCE(t1.f_n, t2.f_n) FROM
    t1 NATURAL FULL JOIN t2;

This change applies only in the case of NATURAL FULL JOIN.  Otherwise,
NATURAL JOINs work as they have in the past, which is using columns
from the left table for the resulting column set.
Prevent elimination of tables participating in a FULL OUTER JOIN during
eliminate_tables as part of phase one FULL OUTER JOIN development.

Move the functionality gate for FULL JOIN further into the codebase: convert
LEX::has_full_outer_join to a counter so we can see how many FULL JOINs
remain, which makes the gate work correctly after simplify_joins and
eliminate_tables are called.

Fixes an old bug where, when running the server as a debug build and in
debug mode, a null pointer dereference in
Dep_analysis_context::dbug_print_deps would cause a crash.
Move the temporary gate against FULL OUTER JOIN deeper into the
codebase, which causes the FULL OUTER JOIN query plans to have
more relevant information (hence the change).  In some cases, the
join order of nested INNER JOINs within the FULL OUTER JOIN changed.

Small cleanups in get_sargable_cond ahead of the feature work in
the next commit.
Fetches the ON condition from the FULL OUTER JOIN as the sargable condition.
We ignore the WHERE clause here because we don't want accidental conversions
from FULL JOIN to INNER JOIN during, for example, range analysis, as that
would produce wrong results.

GCOV shows that existing FULL OUTER JOIN tests exercise this new codepath.
its tables flattened out of their nest, leading to a join ordering
problem and allowing the optimizer to interleave the outer table
into the join order incorrectly (possibly between FULL JOIN'd tables).

Disallow FULL JOIN on the right side of a left or right join and
prevent FULL JOIN on right in VIEWs.

Do not merge VIEWs that have a FULL JOIN in them and let simplify_joins
do any derived table merging as it relates to FULL JOINs.
Ignore JOIN_ORDER hints that force FULL JOINs to the inner
side of other JOINs.

A FULL JOIN must be placed before any table outside of the FULL JOIN
in the join order; FULL JOINs are not allowed on the inner side of an
enclosing LEFT or RIGHT JOIN.  Prevent the user from supplying a
JOIN_ORDER() hint that would otherwise force this invalid order, and
emit a warning.  When we implement phase 3 of FULL JOIN support, this
restriction will be relaxed.

If a table that's in a FULL OUTER JOIN is found to be a const
table, then don't allow the constant table optimization to
take place.

Later, when we support FULL OUTER JOIN on the inner side of
other join types, we may be able to relax this restriction.

Prevent simplify_joins from rewriting a chained FULL JOIN into a query
where a FULL JOIN could end up on the inner side of another outer
join.  Of course, this means that we will have a null complement pass
that the rewritten query would have avoided.  Once we support FULL
JOINs on the inner side of outer joins in phase 3, we can relax
this constraint.

simplify_joins keeps a nested join intact when it contains
FULL JOIN tables and has siblings in its parent join list.
Flattening would let the optimizer interleave outside tables
between the FULL JOIN tables, which the null complement
algorithm cannot handle.  Such a nest carries neither on_expr
nor sj_on_expr, so when it reached table elimination it hit an
assertion that required one or the other.

Replace the assertion with a recursion into the nest.  Pass
on_expr=NULL so the nest itself is not eliminated, and set
all_eliminated to FALSE so an enclosing call does not eliminate
the parent.  Inner LEFT JOIN tables can still be eliminated by
the recursive call.

The count of JOIN_TAB instances passed to alloc_full_join_duplicate_filters
was the total count, including the constant tables.  But start_tab points
to the first JOIN_TAB after the constant tables, so it was walking off the
end of the JOIN_TAB instances in memory.  Just take the starting point from
the JOIN directly and include the constant tables (which we'll need to do
in phase 3 anyway).
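
The mismatch between the starting point and the count can be shown
with plain index arithmetic.  This is a toy model in Python, not the
server code: `visited_indexes` and the table counts are hypothetical
values chosen for illustration.

```python
def visited_indexes(first_index, count):
    """Indexes touched when walking `count` entries starting at `first_index`."""
    return list(range(first_index, first_index + count))


total_tabs = 5   # hypothetical: 2 constant tables followed by 3 others
const_tabs = 2

# Buggy pairing: start after the constant tables, but count every table.
buggy = visited_indexes(const_tabs, total_tabs)

# Fixed pairing: start at the first entry and count every table.
fixed = visited_indexes(0, total_tabs)

# Indexes that fall outside the array of `total_tabs` entries.
overrun = [i for i in buggy if i >= total_tabs]
```

In this model the buggy walk reads `const_tabs` entries past the end of
the array, while the fixed walk stays inside it.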
…able

Allocate the full_join_duplicate_filter with operator new.  Add a comment
to the assertion and condense a couple of conditionals together.

In run_fj_null_complement_pass, the FULL JOIN null-complement pass
called ha_end_keyread() on the right table to switch from index access
before rescanning, but never restored keyread after it finished.
Restore it once the pass completes.

compute_full_join_nest_tables built the bitmap of tables that must
stay adjacent in the join order by iterating leaf_tables and OR'ing in
each enclosing nest's direct_children_map.  That missed FULL JOIN
tables that are themselves a nest of inner joins.  With those tables
missing from full_join_nest_tables, the adjacency restriction in
get_allowed_nj_tables let the optimizer descend into an inner nest
whose remaining tables depended on a table outside it, triggering the
found_tables>0 assertion in best_extension_by_limited_search.

Replace the leaf walk with a recursive walk over top_join_list.  For
every TABLE_LIST flagged JOIN_TYPE_FULL, OR in its
nested_join->used_tables (for a nest) or table->map (for a base
table), and recurse into every nested_join so a FULL JOIN table that
is itself a nest contributes all of its leaf tables.
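
The recursive walk described above can be sketched as follows.  This
is a simplified Python model, not the server's data structures:
`TableRef`, its `table_map`, `is_full_join`, and `children` fields, and
the `full_join_nest_tables` function are hypothetical stand-ins for
TABLE_LIST, table->map, the JOIN_TYPE_FULL flag, and nested_join.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TableRef:
    """Hypothetical stand-in for a TABLE_LIST: a base table or a nest."""
    name: str
    table_map: int = 0              # one bit for a base table, 0 for a nest
    is_full_join: bool = False      # models the JOIN_TYPE_FULL flag
    children: List["TableRef"] = field(default_factory=list)

    def used_tables(self) -> int:
        """Bitmap of every leaf table at or below this node."""
        if not self.children:
            return self.table_map
        bits = 0
        for child in self.children:
            bits |= child.used_tables()
        return bits


def full_join_nest_tables(join_list) -> int:
    """OR together the leaf tables of every FULL JOIN operand in the tree."""
    bits = 0
    for ref in join_list:
        if ref.is_full_join:
            bits |= ref.used_tables()           # nest or base table
        if ref.children:
            bits |= full_join_nest_tables(ref.children)
    return bits


# t1 FULL JOIN (t2 JOIN t3), with t4 outside the FULL JOIN.
t1 = TableRef("t1", table_map=1, is_full_join=True)
t2 = TableRef("t2", table_map=2)
t3 = TableRef("t3", table_map=4)
inner = TableRef("(t2 JOIN t3)", is_full_join=True, children=[t2, t3])
fj_nest = TableRef("(t1 FULL JOIN ...)", children=[t1, inner])
t4 = TableRef("t4", table_map=8)

bits = full_join_nest_tables([fj_nest, t4])
```

In this model the operand that is itself a nest contributes both t2
and t3, and t4 stays out of the bitmap.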
A FULL JOIN produces its result in two stages.  The first stage walks
the joined tables like a LEFT JOIN, emitting every left row paired
with its matches on the right (or with NULLs if there were none).  The
second stage walks the right table again and emits the right rows that
never found a match, padded with NULLs on the left.

Until now the second stage ran exactly once per query, after the whole
first stage had finished.  That worked only when the FULL JOIN was the
entire query.  As soon as another table sat outside the FULL JOIN, the
result became wrong, because the "right rows with no matches" needed
to be paired with the outer table's rows just like the matched rows
were.  To avoid producing wrong answers, the optimizer was forced to
put FULL JOIN tables first.

This change moves the second stage so that it runs as part of the same
nested loop that produced the first stage's matches.  Each time the
outer loop advances to a new row, the right table is rescanned for its
unmatched rows, and those rows flow through the rest of the query
exactly the way ordinary join output does.

With the two stages now interleaved, the optimizer no longer has to
force the FULL JOIN to the front of the join order.  Tables outside
the FULL JOIN can sit on either side of it, and the FULL JOIN itself
only needs to stay together as one contiguous group.
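
A small model shows why the second stage has to run inside the outer
loop.  The sketch below is Python written for illustration, not the
executor's code; `full_join_for_outer_row`, `outer_then_full_join`, and
the per-outer-row `matched` list are assumptions of the model.

```python
def full_join_for_outer_row(left, right, on):
    """Run both FULL JOIN stages once, yielding (left, right) pairs."""
    matched = [False] * len(right)
    # Stage 1: LEFT JOIN style walk.
    for l in left:
        found = False
        for i, r in enumerate(right):
            if on(l, r):
                matched[i] = found = True
                yield (l, r)
        if not found:
            yield (l, None)
    # Stage 2: rescan the right table for unmatched rows.
    for i, r in enumerate(right):
        if not matched[i]:
            yield (None, r)


def outer_then_full_join(outer, left, right, on):
    """An outer table placed before (left FULL JOIN right) in the join order."""
    out = []
    for o in outer:
        # Both stages run inside this iteration, so the unmatched right
        # rows are paired with the current outer row like any other output.
        for l, r in full_join_for_outer_row(left, right, on):
            out.append((o, l, r))
    return out


rows = outer_then_full_join(["x", "y"], [1, 2], [2, 3], lambda l, r: l == r)
```

Every outer row is paired with the unmatched right row (3 in this
data), which a single second stage run after the whole first stage
could not do on its own.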
Early on in FULL OUTER JOIN development, it seemed prudent to
attach the ON condition to both tables during FULL OUTER JOIN
parsing, because either table in the join could be considered
the inner or outer table, depending on perspective.  This
decision ended up complicating both simplify_joins and the
FULL JOIN rewrites to other join types, as well as dependent
table propagation during make_join_statistics.  Dependent
table propagation was not correct because both tables in the
FULL JOIN carried the ON expression which confused
make_join_statistics.

We could update make_join_statistics to propagate dependent
tables for both sides of a join, but that's essentially changing
the mainstream join processing path for the sake of one
case.  Instead, just stop attaching the ON condition to both
tables in a FULL OUTER JOIN, which simplifies the FULL JOIN
rewrites and relies on pre-existing table dependency propagation
in make_join_statistics.

Previously the FULL JOIN tables had to come first in the join
order, before any other table.  They still have to stay next to
each other, but they can now appear before, after, or between
other tables in the FROM clause.

Update the JOIN_ORDER, JOIN_PREFIX, and JOIN_SUFFIX hint conflict
check to match.  The new check only rejects a hint when it would force
a non-FULL JOIN table to sit between two FULL JOIN tables, which is
the case that actually breaks the null-complement rescan.  It finds
this by computing the set of predecessors of FULL JOIN tables and the
set of successors of FULL JOIN tables, after the hint's dependencies
have been added, and rejects the hint when those sets overlap on a
non-FULL-JOIN, non-constant table.
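
The overlap test can be sketched with sets.  This is a Python model
under stated assumptions, not the server's check: `must_precede` (a map
from each table to the tables forced before it), `closure`, and
`hint_splits_full_join` are hypothetical names, and transitive
reachability is assumed to be how predecessors and successors are
collected.

```python
def closure(start, edges):
    """All tables reachable from `start` by following `edges` transitively."""
    seen, stack = set(), list(start)
    while stack:
        t = stack.pop()
        for nxt in edges.get(t, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def hint_splits_full_join(must_precede, fj_tables, const_tables=frozenset()):
    """True when some other table is forced between two FULL JOIN tables.

    must_precede[t] is the set of tables that have to come before t,
    with the hint's dependencies already added.
    """
    # Invert the map to get "tables that have to come after t".
    must_follow = {}
    for t, befores in must_precede.items():
        for b in befores:
            must_follow.setdefault(b, set()).add(t)

    predecessors = closure(fj_tables, must_precede)
    successors = closure(fj_tables, must_follow)
    offenders = (predecessors & successors) - set(fj_tables) - set(const_tables)
    return bool(offenders)


fj = {"t1", "t2"}
# Hint forces t1, t3, t2: t3 lands between the FULL JOIN tables.
between = hint_splits_full_join({"t3": {"t1"}, "t2": {"t3"}}, fj)
# Hint forces t3, t1, t2: t3 only precedes the FULL JOIN tables.
before = hint_splits_full_join({"t1": {"t3"}, "t2": {"t1"}}, fj)
# Same as the first hint, but t3 is a constant table.
const_between = hint_splits_full_join(
    {"t3": {"t1"}, "t2": {"t3"}}, fj, const_tables={"t3"})
```

Only the first hint is rejected in this model: t3 is both a successor
of t1 and a predecessor of t2.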
The crash happened because we computed an incorrect stopping point
for the JOIN_TAB walk when finding the left-most JOIN_TAB for a
FULL JOIN in a bush child.  When inspecting bush children, the
eligible JOIN_TABs are those of the particular bush, so walk over
those.  Previously, we took the end point as JOIN::join_tab, which
was incorrect, causing the walk to go into arbitrary memory.

A FULL JOIN with a constant ON expression did not encode
the outer table dependency in the on_expr because the
constant references no tables.  The enclosing nest's
dep_tables, however, still carries that dependency, so
propagate it.

When an IN-subquery in the ON clause of a JOIN to a FULL JOIN nest
becomes a semijoin, the new semijoin nest is placed inside the FULL JOIN
nest.  Normally that enclosing nest would be flattened, but a FULL
JOIN nest is not flattenable, so the semijoin nest remains.

check_interleaving_with_nj used to skip sj-nests when updating
counters, so every semijoin inner table also bumped the enclosing
nest's counter.  With two inner tables and two FULL JOIN tables, the
enclosing nest's counter went to 4 against a limit of 3 and the
assertion fired.

Count the semijoin nest as one child of its parent.  Bump the semijoin
nest's own counter on each semijoin inner table, and only bump the
parent once the semijoin nest is fully placed.  restore_prev_nj_state
is changed the same way for backtracking.

DaveGosselin-MariaDB force-pushed the 13.2-mdev-39014-full-join-p2 branch from 8c390c9 to 09d471f on May 22, 2026 20:43

When an 'inner' table of a FULL JOIN sat inside a nest that was
itself the left side of an enclosing FULL JOIN,
make_outerjoin_info skipped that embedding while building the
outer join scope chain and left the inner table's first_upper
unlinked.  When make_join_select later pushed an ON condition to
that table, add_found_match_trig_cond walked off the broken
chain and dereferenced NULL.

Fix make_outerjoin_info to set first_upper for a tab that
carries its own outer join scope (first_inner == tab) when its
immediate embedding did not link it.  Point first_upper at the
nested_join->first_nested of the first enclosing outer join nest
the embedding walk reaches.

The same query exposed a second bug.  The outermost FULL JOIN's right
operand was a nested join rather than a single base table.  The parser
places the nest on the right when the outermost FULL JOIN's ON is the
last one written, because the parser keeps the outermost FULL JOIN
pending until its ON arrives, and the inner FULL JOINs reduce first
into a nest that becomes the right operand.
alloc_full_join_duplicate_filters allocates the fj_dups filter on a
JOIN_TAB carrying JOIN_TYPE_FULL | JOIN_TYPE_RIGHT, so with the
FULL|RIGHT bits on the nest, which is never a JOIN_TAB, no filter was
allocated and the null complement pass never fired.  The unmatched
rows from the right side were never emitted, producing a result with
fewer rows than the SQL:2016 standard requires.

Add swap_full_join_sides, called from rewrite_full_outer_joins
when a FULL JOIN survives simplify_joins with a leaf on the left
and a nested join on the right.  FULL JOIN is symmetric on its
operands, so swapping does not change query semantics; after the
swap the leaf carries the FULL|RIGHT bits and the rescan target
is a single base table.
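
The symmetry argument can be checked on a small model.  The Python
below is an illustration only, not swap_full_join_sides itself:
`full_join`, `swapped_full_join`, and the sample rows are assumptions,
and results are compared as multisets because row order is not part of
the join's meaning.

```python
from collections import Counter


def full_join(left, right, on):
    """Two-pass FULL JOIN model; None marks a null-complemented side."""
    matched = [False] * len(right)
    out = []
    for l in left:
        hits = [i for i, r in enumerate(right) if on(l, r)]
        for i in hits:
            matched[i] = True
            out.append((l, right[i]))
        if not hits:
            out.append((l, None))
    out.extend((None, r) for i, r in enumerate(right) if not matched[i])
    return out


def swapped_full_join(left, right, on):
    """Join with the operands exchanged, then restore the column order."""
    flipped = full_join(right, left, lambda r, l: on(l, r))
    return [(l, r) for r, l in flipped]


leaf = [2, 4]                   # stands in for a single base table
nest = [(1, 10), (2, 20)]       # stands in for the rows of a nested join

def on(l, n):
    return l == n[0]

original = full_join(leaf, nest, on)          # rescans the nest
swapped = swapped_full_join(leaf, nest, on)   # rescans the leaf instead
```

Both calls produce the same multiset of rows; only the operand that
gets rescanned in the second pass differs.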