Search

OakieTags

Who's online

There are currently 0 users and 35 guests online.

Recent comments

Affiliations

Subquery Factoring (4)

I’ve written before about the effects of subquery factoring (common table expressions – or CTEs) on the optimizer, and the way that the optimizer can “lose” some strategies when you start factoring out subquery expressions. Here’s another example I came across quite recently. It involved a join of about 15 tables so I’ve only extracted a few lines from the SQL and resulting execution plans.

We start with the original query, which had factored out an aggregate subquery then used it in place of an inline view:

with max_cust_comm as (
	select
		ccm.order_id,
		max(ccm.comm_date)
	from
		customer_communications ccm
	group by
		ccm.order_id
)
select
	...
from
	...
left join
	max_cust_comm	mcc
on	mcc.order_id = ord.order_id
...

The execution path for this query included the following lines:

|   6 |     HASH JOIN OUTER            |                            |     1 |
|   7 |      NESTED LOOP               |                            |     1 |
               ...
|  41 |      VIEW                      |                            |   798K|
|  42 |       HASH GROUP BY            |                            |   798K|
|  43 |        TABLE ACCESS FULL       | CUSTOMER_COMMUNICATIONS    |   798K|

You can see that the optimizer has created a result set (VIEW) at line 41 by scanning the entire customer_communications table, for a total of about 800,000 rows, aggregating the data by order_id. This is not very efficient becauase (a) I happen to have a very useful index on the customer_communications table that contains all the data I need, and (b) there are just a few input rows where I need to find this max(comm_date).

Note, by the way, that we have an outer join into the aggregate view, so Oracle cannot use complex view merging (CVM) to avoid the need to create the result set – but it could, in principle, use join predicate pushdown (JPPD) to create a result once for each input value.

So now we change the code to copy the subquery back to the inline position:

select
	...
from
	...
left join
	(
	select
		ccm.order_id,
		max(ccm.comm_date)
	from
		customer_communications ccm
	group by
		ccm.order_id
	) mcc
on	mcc.order_id = ord.order_id
...

This, with no further intervention, is what happened to the plan.

|   6 |     NESTED LOOP OUTER                   |                    |     1 |
|   7 |      NESTED LOOP                        |                    |     1 |
               ...
|  43 |      VIEW PUSHED PREDICATE              |                    |     1 |
|  44 |       SORT AGGREGATE                    |                    |     1 |
|  45 |        FIRST ROW                        |                    |     1 |
|* 46 |         INDEX RANGE SCAN (MIN/MAX)      | CST_CMM_IX2        |     1 |

Notice that the hash join outer has changed to a nested loop outer, and the full tablescan has changed to an optimal index-only access using the ideal (for this query) index range scan (min/max). Moving the inline view into a common table expression at the head of the SQL was enough to stop Oracle from investigating this option – even though the optimizer immediately moved it back inline.

So be careful the next time you think about using subquery factoring to make your queries look cleaner and simpler – it’s a good idea, and a nice piece of technology, but sometimes it will cause your execution plans to change for the worse.

Related Articles:

Complex View Merging
Subquery Factoring (1)
Subquery Factoring (2)
Subquery Factoring (3)
Pushing Predicates (1)
Pushing Predicates (2)

Footnote: This error, and the error described in the previous note, may be related to (unpublished) bug 9850588