## Who's online

There are currently 0 users and 39 guests online.

# Join

## "Cost-free" joins - 2

In the previous post I've demonstrated an unexpected Nested Loop Join caused by an extreme data distribution. Although unexpected at first sight, the performance of the execution plan selected by the optimizer is decent - provided the estimates are in the right ballpark.Here is another case of an unexpected execution plan, this time about Merge Joins.

### Merge Joins

In order to appreciate why the execution plan encountered is unexpected, first a quick summary about how Merge Joins work:A Merge Join is essentially a Nested Loop operation from one sorted row source into another sorted row source.

## "Cost-free" joins - 1

Recently I came across some interesting edge cases regarding the costing of joins. They all have in common that they result in (at first sight) unexpected execution plans, but only some of them are actual threats to performance.

### Outer Joins

The first one is about outer joins with an extreme data distribution. Consider the following data setup:

create table t1
as
select
rownum as id
, case when rownum > 1e6 then rownum end as null_fk
from
dual
connect by
level <= 1e6
;

exec dbms_stats.gather_table_stats(null, 't1')

create table t2
as
select
rownum as id
from
dual
connect by
level <= 1e6
;

## Table Functions And Join Cardinality Estimates

If you consider the usage of Table Functions then you should be aware of some limitations to the optimizer calculations, in particular when considering a join between a Table Function and other row sources.

As outlined in one of my previous posts you can and should help the optimizer to arrive at a reasonable cardinality estimate when dealing with table functions, however doing so doesn't provide all necessary inputs to the join cardinality calculation that are useful and available from the statistics when dealing with regular tables.

Therefore even when following the recommended practice regarding the cardinality estimates it is possible to end up with some inaccuracies. This post will explain why.

Join Cardinality Basics

## Multi-column joins

Consider the following scenario with four tables. Two of them represent master data, the third one uses a concatenated primary key consisting of foreign keys to the first two, and the fourth one has a foreign key to the third one.

drop table t1 cascade constraints purge;

drop table t2 cascade constraints purge;

drop table t3 cascade constraints purge;

drop table t4 cascade constraints purge;

create table t1 (
t1_id integer not null constraint pk_t1 primary key,
filler1 varchar2(40),
filler2 varchar2(40)
);

create table t2 (
t2_id integer not null constraint pk_t2 primary key,
filler1 varchar2(40),
filler2 varchar2(40)
);

create table t3 (
t1_id integer not null,
t2_id integer not null,
filler1 varchar2(40),
filler2 varchar2(40),
constraint pk_t3 primary key (t1_id, t2_id) using index (
create index pk_t3 on t3 (t1_id, t2_id)
),
constraint fk_t3_1 foreign key (t1_id) references t1 (t1_id),
constraint fk_t3_2 foreign key (t2_id) references t2 (t2_id));

create table t4 (
t4_id integer not null constraint pk_t4 primary key,
t1_id integer,
t2_id integer,
filler1 varchar2(40),
filler2 varchar2(40),
constraint t4_fk_1 foreign key (t1_id, t2_id) references t3 (t1_id, t2_id)
);

Notice that the primary key of "t3" is using a non-unique index, which is supported and can be used e.g. for deferrable constraints or when loading data into tables that might be non-unique so that the constraint can be disabled without dropping the (unique) index. This allows to simply re-enable the constraint after cleaning up the non-unique rows instead of re-creating an unique index (and the risk of losing the index if anything goes wrong).

Now when using an uncorrelated data set for the concatenated keys, Oracle's default (join) selectivity formulas apply and the estimated cardinalities are correct. Table "t1" has 10,000 rows, "t2" 3 rows, table "t3" holds 30,000 rows combining "t1" and "t2" data. "t4" has 300,000 rows.

-- non-correlated column values
exec dbms_stats.set_table_stats(null, 't1', numrows=>10000, numblks=>100, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't2', numrows=>3, numblks=>1, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't3', numrows=>30000, numblks=>300, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't4', numrows=>300000, numblks=>3000, avgrlen=>100)

exec dbms_stats.set_column_stats(null, 't1', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't2', 't2_id', distcnt=>3, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't3', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't3', 't2_id', distcnt=>3, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't4_id', distcnt=>300000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't2_id', distcnt=>3, nullcnt=>0)

exec dbms_stats.set_index_stats(null, 'pk_t1', numdist=>10000, clstfct=>10000, indlevel=>2, numlblks=>10, numrows=>10000)

exec dbms_stats.set_index_stats(null, 'pk_t2', numdist=>3, clstfct=>3, indlevel=>1, numlblks=>1, numrows=>3)

exec dbms_stats.set_index_stats(null, 'pk_t3', numdist=>30000, clstfct=>30000, indlevel=>2, numlblks=>30, numrows=>30000)

exec dbms_stats.set_index_stats(null, 'pk_t4', numdist=>300000, clstfct=>300000, indlevel=>2, numlblks=>300, numrows=>30000)

Joining t4 to t3 results in a correct estimate of 300K rows:

select
count(*)
from
t4
, t3
where
t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | | 1456 (3)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 52 | | | |
|* 2 | HASH JOIN | | 300K| 14M| 1120K| 1456 (3)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
----------------------------------------------------------------------------------------

Things look however different if we have the awkward situation of correlated column values for the concatenated keys:

-- correlated column values
exec dbms_stats.set_table_stats(null, 't1', numrows=>10000, numblks=>100, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't2', numrows=>20000, numblks=>200, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't3', numrows=>30000, numblks=>300, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't4', numrows=>300000, numblks=>3000, avgrlen=>100)

exec dbms_stats.set_column_stats(null, 't1', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't2', 't2_id', distcnt=>20000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't3', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't3', 't2_id', distcnt=>20000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't4_id', distcnt=>300000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't2_id', distcnt=>20000, nullcnt=>0)

exec dbms_stats.set_index_stats(null, 'pk_t1', numdist=>10000, clstfct=>10000, indlevel=>2, numlblks=>10, numrows=>10000)

exec dbms_stats.set_index_stats(null, 'pk_t2', numdist=>20000, clstfct=>20000, indlevel=>2, numlblks=>20, numrows=>20000)

exec dbms_stats.set_index_stats(null, 'pk_t3', numdist=>30000, clstfct=>30000, indlevel=>2, numlblks=>30, numrows=>30000)

exec dbms_stats.set_index_stats(null, 'pk_t4', numdist=>300000, clstfct=>300000, indlevel=>2, numlblks=>300, numrows=>30000)

Here we simulate 20,000 distinct values in one column, 10,000 distinct values in the second one, but only 30,000 distinct values for the combination of both columns. In this case Oracle's default selectivity formula underestimates the cardinality since it is assuming uncorrelated values:

select /*+ opt_param('_optimizer_join_sel_sanity_check', 'false') */
count(*)
from
t4
, t3
where
t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | | 1455 (3)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 52 | | | |
|* 2 | HASH JOIN | | 45 | 2340 | 1120K| 1455 (3)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
----------------------------------------------------------------------------------------

Oracle simply multiplies the selectivity of the two columns and arrives at a join cardinality of 45 rows (1/20,000*1/10,000*300,000*30,000).

You'll notice that I had to use a undocumented optimizer parameter to arrive at that default selectivity. If you run an EXPLAIN PLAN for the same statement without the hint, you'll get the following estimate:

select
count(*)
from
t4
, t3
where
t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | | 1455 (3)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 52 | | | |
|* 2 | HASH JOIN | | 30000 | 1523K| 1120K| 1455 (3)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
----------------------------------------------------------------------------------------

It can be seen from a 10053 optimizer trace file that Oracle uses a "Multi-column cardinality sanity check" by default in cases where the calculated multi-column selectivity falls below a certain limit, obviously using the smaller selectivity available from the different 1/num_rows of the tables/row sources involved in the join, arriving at an estimate 30,000 rows in this particular case.

Changing the non-unique index used for the primary key on "t3" to a unique index will bring another sanity check into the picture: The "concatenated index" sanity check that uses the number of distinct values of an unique index that corresponds exactly to the join columns used.

create table t3 (
t1_id integer not null,
t2_id integer not null,
filler1 varchar2(40),
filler2 varchar2(40),
constraint pk_t3 primary key (t1_id, t2_id) using index (
create unique index pk_t3 on t3 (t1_id, t2_id)
),
constraint fk_t3_1 foreign key (t1_id) references t1 (t1_id),
constraint fk_t3_2 foreign key (t2_id) references t2 (t2_id));

With this unique index in place Oracle uses the number of distinct keys from this index to calculate the selectivity of the join and therefore arrives at the correct cardinality again:

select
count(*)
from
t4
, t3
where
t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | | 1455 (3)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 52 | | | |
|* 2 | HASH JOIN | | 300K| 14M| 1120K| 1455 (3)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
----------------------------------------------------------------------------------------

So this is another case where the uniqueness of an index makes a significant difference for optimizer calculations.

Note that from 11g on there more options to help the optimizer to come up with a better estimate even with the non-unique index on (t3.t1_id, t3.t2_id). Obviously 11g introduced extended statistics on column groups, so we can do the following:

variable ext_name varchar2(30)

exec :ext_name := dbms_stats.create_extended_stats(null, 't3', '(t1_id, t2_id)')

exec dbms_stats.set_column_stats(null, 't3', :ext_name, distcnt=>30000, nullcnt=>0)

This allows to derive the correct selectivity for these correlated column values using the extended statistics set.

Another option in 11g is adding an index on (t4.t1_id, t4.t2_id), like that:

create index ix_t4 on t4 (t1_id, t2_id);

exec dbms_stats.set_index_stats(null, 'ix_t4', numdist=>30000, clstfct=>30000, indlevel=>2, numlblks=>30, numrows=>300000)

Having now two non-unique indexes Oracle 11g comes up again with the correct join cardinality of 300K. Notice that this doesn't work in pre-11g. Pre-11g versions require the index on t3 to be unique to take advantage of the "concatenated index" sanity check.

Having demonstrated all these sanity checks available for multi-column joins (the general multi-column and the concatenated index sanity check), let's see what happens when joining three tables:

select
count(*)
from
t1
, t3
, t4
where
t4.t1_id = t1.t1_id
and t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | | 1468 (4)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 65 | | | |
|* 2 | HASH JOIN | | 300K| 18M| | 1468 (4)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN | PK_T1 | 10000 | 126K| | 4 (0)| 00:00:01 |
|* 4 | HASH JOIN | | 300K| 14M| 1120K| 1455 (3)| 00:00:18 |
| 5 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
-----------------------------------------------------------------------------------------

All I've done is to add "t1", in this case joining to "t4" on "t1_id". Thanks to the concatenated index sanity check (or the extended column group statistics in 11g) the calculated join cardinality is still 300K.

Now what happens if one decides to join "t3" to "t1" on "t1_id" instead of "t4.t1_id"? From a logical point of view this should lead to exactly the same result, since we can deduce that if "t4.t1_id" = "t1.t1_id" and "t3.t1_id = t1.t1_id" then "t3.t1_id = t4.t1_id".

select
count(*)
from
t1
, t3
, t4
where
t4.t1_id = t1.t1_id
and t3.t1_id = t1.t1_id
and t3.t2_id = t4.t2_id;

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | | 1475 (4)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 65 | | | |
|* 2 | HASH JOIN | | 45 | 2925 | | 1475 (4)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN | PK_T1 | 10000 | 126K| | 4 (0)| 00:00:01 |
|* 4 | HASH JOIN | | 450K| 22M| 1120K| 1459 (3)| 00:00:18 |
| 5 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
-----------------------------------------------------------------------------------------

The result is astonishing. By making this simple change we have effectively disabled all available sanity checks and arrive at the result based on the the default, uncorrelated selectivity.

So whenever you perform multi-column joins and the column data is correlated, be very careful how you join the tables - it might make a significant difference to the calculations of the optimizer.