In a recent note I wrote about index joins I made a passing comment about limitations in the optimizer’s available strategies that might make you choose to write your code to emulate an index join through explicit SQL references.
Here are two SQL similar SQL statements (with execution plans) that demonstrate the initial problem – the first is just a restatement of the basic example I supplied in the first article:
create table indjoin
as
select
rownum id,
rownum val1,
rownum val2,
rpad('x',500) padding
from all_objects where rownum <= 3000
;
-- collect stats, compute, no histograms
create unique index ij_v1 on indjoin(id, val1);
create unique index ij_v2 on indjoin(id, val2);
select
val1, val2
from
indjoin ij
where
val1 between 100 and 200
and val2 between 50 and 150
;
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 24 | 24 |
|* 1 | VIEW | index$_join$_001 | 3 | 24 | 24 |
|* 2 | HASH JOIN | | | | |
|* 3 | INDEX FAST FULL SCAN| IJ_V1 | 3 | 24 | 11 |
|* 4 | INDEX FAST FULL SCAN| IJ_V2 | 3 | 24 | 11 |
---------------------------------------------------------------------------
select
val1, val2, rowid
from
indjoin ij
where
val1 between 100 and 200
and val2 between 50 and 150
;
-----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 60 | 17 |
|* 1 | TABLE ACCESS BY INDEX ROWID| INDJOIN | 3 | 60 | 17 |
|* 2 | INDEX FULL SCAN | IJ_V1 | 102 | | 9 |
-----------------------------------------------------------------------
When we include the rowid in the query the optimizer stops using the index join – and it won’t even use the mechanism if we hint it. Apparently, for the purposes of analysing the query, Oracle doesn’t recognise the rowid as a column in the table and this automatically precludes the possibility of using the index join as the access method. So we have to use the manual rewrites I introduced in an earlier article.
You might wonder why this matters – but consider a case where a “perfect” index doesn’t exist for the following query:
select padding from indjoin ij where val1 between 100 and 200 and val2 between 50 and 150 ;
The only access path available to the optimizer at this point is a fulll tablescan – but what if the two indexes are very small compared to the table; wouldn’t it be a good idea to use an index hash join between the two indexes to get a list of rowids and visit the table only for those rows. Unfortunately isn’t a path the optimizer can derive – so we might try something like:
select t.padding from ( select /*+ index_join(ij ij_v1 ij_v2) no_merge */ rowid from indjoin ij where val1 between 100 and 200 and val2 between 50 and 150 ) v1, indjoin t where t.rowid = v1.rowid ;
But, as we’ve just seen, you can’t do an index join if you select the rowid, so this code won’t follow the strategy we want. (In fact, when I tried it, there was something distinctly bug-like about the plan – but I won’t go into that now). But we can do the following:
select
t.padding
from
(
select
rowid
from
indjoin ij
where
val1 between 100 and 200
) v1,
(
select
rowid
from
indjoin ij
where
val2 between 50 and 150
) v2,
indjoin t
where
v2.rowid = v1.rowid
and t.rowid = v2.rowid
;
-----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 1632 | 10 |
| 1 | NESTED LOOPS | | 3 | 1632 | 10 |
|* 2 | HASH JOIN | | 3 | 96 | 7 |
|* 3 | INDEX FAST FULL SCAN | IJ_V1 | 102 | 1632 | 3 |
|* 4 | INDEX FAST FULL SCAN | IJ_V2 | 102 | 1632 | 3 |
| 5 | TABLE ACCESS BY USER ROWID| INDJOIN | 1 | 512 | 1 |
-----------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("V2".ROWID="V1".ROWID)
3 - filter("VAL1">=100 AND "VAL1"<=200)
4 - filter("VAL2">=50 AND "VAL2"<=150)
It’s amazing what you can make the optimizer do (even without hinting) if you think about the mechanics underneath the basic operations.
Recent comments
17 weeks 5 days ago
27 weeks 3 days ago
29 weeks 1 day ago
32 weeks 2 days ago
34 weeks 4 days ago
44 weeks 1 day ago
45 weeks 5 days ago
46 weeks 5 days ago
46 weeks 6 days ago
49 weeks 4 days ago