Search

Top 60 Oracle Blogs

Recent comments

ANSI – argh

I’m not keen on ANSI standard SQL – even though it is, technically, the strategic option and even though you have to use it for full outer joins and partitioned outer joins.

One reason for disliking it is that it “separates join predicates from filter predicates” – a reason often given in praise of the syntax which, to my mind, claims a spurious distinction and introduces a mechanism that makes it harder to keep mental track of what’s going to happen as you walk  through the join order.

The other reason for disliking ANSI SQL in Oracle databases is that sometimes it really is necessary to add hints to the SQL to make the optimizer do what needs to be done – and ANSI makes it so much harder and messier to add hints to code. Here’s a wonderful example that Tony Hasler presented in our recent debate “Does Oracle Ignore Hints” at the UKOUG annual conference:

WITH q1 as (
	SELECT /*+ qb_name(q1block) */
		*
	FROM	t1
	JOIN	t2
	ON	t1_i1 = t2_i1
	AND	t1_i1 < 10
),
q2 AS (
	SELECT
		/*+ qb_name(q2block) */
		*
	FROM
		t3
	JOIN	t4
	ON	t3_i1 = t4_i1
	AND	t3_i1 < 10
)
SELECT
	/*+
		no_merge(@q1block)
		no_merge(@q2block)
		leading (@q1block t2)
		use_nl  (@q1block t1)
	*/
	*
FROM
	q1
JOIN
	q2
ON	t1_i1 + t2_i1 = t3_i1 + t4_i1
;

Just to make life really hard, he’s included a couple of “factored subqueries” – and there are a few outstanding optimizer defects with handling subquery factoring – so when he claimed that this was an example of Oracle ignoring hints I had two different directions of investigation to worry about.

Here’s the execution plan (from my 10.2.0.3 system with the data generation, constraints and indexing that Tony supplied):

------------------------------------------------------------------------
| Id  | Operation                      | Name  | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |       |   250K|    33M|    54 |
|*  1 |  HASH JOIN                     |       |   250K|    33M|    54 |
|   2 |   VIEW                         |       |  5000 |   327K|    11 |
|*  3 |    HASH JOIN                   |       |  5000 |   581K|    11 |
|   4 |     TABLE ACCESS BY INDEX ROWID| T4    |  5000 |   253K|     3 |
|*  5 |      INDEX RANGE SCAN          | T4_I1 |   900 |       |     2 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T3    |  5000 |   327K|     3 |
|*  7 |      INDEX RANGE SCAN          | T3_I1 |   900 |       |     2 |
|   8 |   VIEW                         |       |  5000 |   361K|    12 |
|*  9 |    HASH JOIN                   |       |  5000 |   615K|    12 |
|  10 |     TABLE ACCESS BY INDEX ROWID| T1    |  5000 |   297K|     3 |
|* 11 |      INDEX RANGE SCAN          | T1_I1 |   900 |       |     2 |
|  12 |     TABLE ACCESS BY INDEX ROWID| T2    |  5000 |   317K|     3 |
|* 13 |      INDEX RANGE SCAN          | T2_I1 |   900 |       |     2 |
------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$16C51A37
   2 - SEL$4C69CCA2 / Q2@SEL$1
   3 - SEL$4C69CCA2
   4 - SEL$4C69CCA2 / T4@SEL$2
   5 - SEL$4C69CCA2 / T4@SEL$2
   6 - SEL$4C69CCA2 / T3@SEL$2
   7 - SEL$4C69CCA2 / T3@SEL$2
   8 - SEL$7939585E / Q1@SEL$1
   9 - SEL$7939585E
  10 - SEL$7939585E / T1@SEL$3
  11 - SEL$7939585E / T1@SEL$3
  12 - SEL$7939585E / T2@SEL$3
  13 - SEL$7939585E / T2@SEL$3

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1_I1"+"T2_I1"="T3_I1"+"T4_I1")

As you can see, Oracle has copied the two factored subqueries inline (they appear just once each in the body of the query so this is – probably – inevitable). Then Oracle has obeyed the no_merge() hints – which I could check by deleting the hints and watching the plan change. So why, in lines 10 through 13, has Oracle not obeyed the leading() and use_nl() hints ?

By changing the ANSI syntax to traditional Oracle syntax, I got a different plan:

-------------------------------------------------------------------------
| Id  | Operation                       | Name  | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |       |   250K|    33M| 10045 |
|*  1 |  HASH JOIN                      |       |   250K|    33M| 10045 |
|   2 |   VIEW                          |       |  5000 |   327K|    11 |
|*  3 |    HASH JOIN                    |       |  5000 |   581K|    11 |
|   4 |     TABLE ACCESS BY INDEX ROWID | T4    |  5000 |   253K|     3 |
|*  5 |      INDEX RANGE SCAN           | T4_I1 |   900 |       |     2 |
|   6 |     TABLE ACCESS BY INDEX ROWID | T3    |  5000 |   327K|     3 |
|*  7 |      INDEX RANGE SCAN           | T3_I1 |   900 |       |     2 |
|   8 |   VIEW                          |       |  5000 |   361K| 10003 |
|   9 |    TABLE ACCESS BY INDEX ROWID  | T1    |     1 |    61 |     2 |
|  10 |     NESTED LOOPS                |       |  5000 |   615K| 10003 |
|  11 |      TABLE ACCESS BY INDEX ROWID| T2    |  5000 |   317K|     3 |
|* 12 |       INDEX RANGE SCAN          | T2_I1 |   900 |       |     2 |
|* 13 |      INDEX RANGE SCAN           | T1_I1 |     1 |       |     1 |
-------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - Q2BLOCK / Q2@SEL$1
   3 - Q2BLOCK
   4 - Q2BLOCK / T4@Q2BLOCK
   5 - Q2BLOCK / T4@Q2BLOCK
   6 - Q2BLOCK / T3@Q2BLOCK
   7 - Q2BLOCK / T3@Q2BLOCK
   8 - Q1BLOCK / Q1@SEL$1
   9 - Q1BLOCK / T1@Q1BLOCK
  11 - Q1BLOCK / T2@Q1BLOCK
  12 - Q1BLOCK / T2@Q1BLOCK
  13 - Q1BLOCK / T1@Q1BLOCK

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1_I1"+"T2_I1"="T3_I1"+"T4_I1")
   3 - access("T3_I1"="T4_I1")
   5 - access("T4_I1"<10)
   7 - access("T3_I1"<10)
  12 - access("T2_I1"<10)
  13 - access("T1_I1"="T2_I1")
       filter("T1_I1"<10)

Notice how the optimizer is now obeying the leading() and use_nl() hints.

The problem is this: Oracle doesn’t optimise ANSI SQL, it transforms it then optimises it. Transformation can change query blocks, and Tony’s hints apply to specific query blocks. After a little testing and checking I worked out what the SQL looked like AFTER transformation and BEFORE optimisation; and it’s this:

select
	/*+ qb_name(sel$4) */
	*
from
	(
	SELECT
		/*+ qb_name(sel$1) */
		Q1.T1_I1 T1_I1, Q1.T1_I2 T1_I2, Q1.T1_D1 T1_D1, Q1.T2_I1 T2_I1, Q1.T2_I2 T2_I2, Q1.T2_TS T2_TS,
		Q2.T3_I1 T3_I1, Q2.T3_I2 T3_I2, Q2.T3_TSTZ T3_TSTZ, Q2.T4_I1 T4_I1, Q2.T4_I2 T4_I2
	FROM
		(
		SELECT
			/*+ NO_MERGE QB_NAME (Q1BLOCK) */
			from$_subquery$_003.T1_I1_0 T1_I1, from$_subquery$_003.T1_I2_1 T1_I2,
			from$_subquery$_003.T1_D1_2 T1_D1, from$_subquery$_003.T2_I1_3 T2_I1,
			from$_subquery$_003.T2_I2_4 T2_I2, from$_subquery$_003.T2_TS_5 T2_TS
		FROM	(
			SELECT
				/*+ qb_name(sel$3) */
				T1.T1_I1 T1_I1_0, T1.T1_I2 T1_I2_1, T1.T1_D1 T1_D1_2,
				T2.T2_I1 T2_I1_3, T2.T2_I2 T2_I2_4, T2.T2_TS T2_TS_5
			FROM
				TEST_USER.T1 T1,
				TEST_USER.T2 T2
			WHERE
				T1.T1_I1 = T2.T2_I1
			AND	T1.T1_I1 < 10
			)	from$_subquery$_003
		)	Q1,
		(
		SELECT
			/*+ NO_MERGE QB_NAME (Q2BLOCK) */
			from$_subquery$_006.T3_I1_0 T3_I1, from$_subquery$_006.T3_I2_1 T3_I2,
			from$_subquery$_006.T3_TSTZ_2 T3_TSTZ, from$_subquery$_006.T4_I1_3 T4_I1,
			from$_subquery$_006.T4_I2_4 T4_I2
		FROM
			(
			SELECT
				/*+ qb_name(sel$2) */
				T3.T3_I1 T3_I1_0, T3.T3_I2 T3_I2_1, T3.T3_TSTZ T3_TSTZ_2,
				T4.T4_I1 T4_I1_3, T4.T4_I2 T4_I2_4 FROM TEST_USER.T3 T3, TEST_USER.T4 T4
			WHERE
				T3.T3_I1 = T4.T4_I1
			AND	T3.T3_I1 < 10
			)	from$_subquery$_006
		)	Q2
	WHERE
		Q1.T1_I1 + Q1.T2_I1 = Q2.T3_I1 + Q2.T4_I1
	)
;

I got most of this from the “Query Block Name / Object Alias” section of the ANSI execution plan (there are some important clues there, like ‘T1@SEL$3′) and the “unparsed” SQL from the 10053 trace.

Notice how the query blocks q1block and q2block still exist – that’s why the no_merge() hints can survive the transformation. Notice, though, that the transformation engines has introduced a layer of inline views inside q1block and q2block - which is why the leading(@q1block t2) and use_nl(@q1block t1) hints are no longer valid: they reference objects which are not in q1block. To get his hints to work at the global level, Tony would have to change the last two hints to reference sel$3 rather than q1block.

So, next time you write a complicated piece of ANSI, make sure you think carefully about what you’re going to have to do if you subsequently have to add hints to force a particular execution plan.  (And bear in mind that one day the transformation engine might be modified to transform the query differently.)

[Further reading on "ignoring hints"]