Search

Top 60 Oracle Blogs

Recent comments

Index join – 4

In a recent note I wrote about index joins I made a passing comment about limitations in the optimizer’s available strategies that might make you choose to write your code to emulate an index join through explicit SQL references.

Here are two SQL similar SQL statements (with execution plans) that demonstrate the initial problem – the first is just a restatement of the basic example I supplied in the first article:

create table indjoin
as
select
	rownum	id,
	rownum	val1,
	rownum	val2,
	rpad('x',500) padding
from all_objects where rownum <= 3000
;

-- collect stats, compute, no histograms

create unique index ij_v1 on indjoin(id, val1);
create unique index ij_v2 on indjoin(id, val2);

select
	val1, val2
from
	indjoin		ij
where
	val1 between 100 and 200
and	val2 between 50 and 150
;

---------------------------------------------------------------------------
| Id  | Operation              | Name             | Rows  | Bytes | Cost  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                  |     3 |    24 |    24 |
|*  1 |  VIEW                  | index$_join$_001 |     3 |    24 |    24 |
|*  2 |   HASH JOIN            |                  |       |       |       |
|*  3 |    INDEX FAST FULL SCAN| IJ_V1            |     3 |    24 |    11 |
|*  4 |    INDEX FAST FULL SCAN| IJ_V2            |     3 |    24 |    11 |
---------------------------------------------------------------------------

select
	val1, val2, rowid
from
	indjoin		ij
where
	val1 between 100 and 200
and	val2 between 50 and 150
;

-----------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     3 |    60 |    17 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| INDJOIN |     3 |    60 |    17 |
|*  2 |   INDEX FULL SCAN           | IJ_V1   |   102 |       |     9 |
-----------------------------------------------------------------------

When we include the rowid in the query the optimizer stops using the index join – and it won’t even use the mechanism if we hint it. Apparently, for the purposes of analysing the query, Oracle doesn’t recognise the rowid as a column in the table and this automatically precludes the possibility of using the index join as the access method. So we have to use the manual rewrites I introduced in an earlier article.

You might wonder why this matters – but consider a case where a “perfect” index doesn’t exist for the following query:

select
	padding
from
	indjoin		ij
where
	val1 between 100 and 200
and	val2 between 50 and 150
;

The only access path available to the optimizer at this point is a fulll tablescan – but what if the two indexes are very small compared to the table; wouldn’t it be a good idea to use an index hash join between the two indexes to get a list of rowids and visit the table only for those rows. Unfortunately isn’t a path the optimizer can derive – so we might try something like:


select
	t.padding
from
	(
	select
		/*+
			index_join(ij ij_v1 ij_v2)
			no_merge
		*/
		rowid
	from
		indjoin		ij
	where
		val1 between 100 and 200
	and	val2 between 50 and 150
	)	v1,
	indjoin	t
where
	t.rowid = v1.rowid
;

But, as we’ve just seen, you can’t do an index join if you select the rowid, so this code won’t follow the strategy we want. (In fact, when I tried it, there was something distinctly bug-like about the plan – but I won’t go into that now). But we can do the following:


select
	t.padding
from
	(
	select
		rowid
	from
		indjoin		ij
	where
		val1 between 100 and 200
	)	v1,
	(
	select
		rowid
	from
		indjoin		ij
	where
		val2 between 50 and 150
	)	v2,
	indjoin	t
where
	v2.rowid = v1.rowid
and	t.rowid = v2.rowid
;

-----------------------------------------------------------------------
| Id  | Operation                   | Name    | Rows  | Bytes | Cost  |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT            |         |     3 |  1632 |    10 |
|   1 |  NESTED LOOPS               |         |     3 |  1632 |    10 |
|*  2 |   HASH JOIN                 |         |     3 |    96 |     7 |
|*  3 |    INDEX FAST FULL SCAN     | IJ_V1   |   102 |  1632 |     3 |
|*  4 |    INDEX FAST FULL SCAN     | IJ_V2   |   102 |  1632 |     3 |
|   5 |   TABLE ACCESS BY USER ROWID| INDJOIN |     1 |   512 |     1 |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("V2".ROWID="V1".ROWID)
   3 - filter("VAL1">=100 AND "VAL1"<=200)
   4 - filter("VAL2">=50 AND "VAL2"<=150)

It’s amazing what you can make the optimizer do (even without hinting) if you think about the mechanics underneath the basic operations.

[Further reading on Index Joins]