Search

Top 60 Oracle Blogs

Recent comments

ANSI Standard

ANSI bug

The following note is about a script that I found on my laptop while I was searching for some details about a bug that appears when you write SQL using the ANSI style format rather than traditional Oracle style. The script is clearly one that I must have cut and pasted from somewhere (possibly the OTN/ODC database forum) many years ago without making any notes about its source or resolution. All I can say about it is that the file has a creation date of July 2012 and I can’t find any reference to a problem through Google searches – though the tables and even a set of specific insert statements appears in a number of pages that look like coursework for computer studies and MoS has a similar looking bug “fixed in 11.2”.

Here’s the entire script:

Table order

Over the last few days I’ve highlighted on Twitter a couple of older posts showing how a change in the order that tables appear in the from clause could affect the execution plan of a query. In one case the note was purely theoretical describing a feature of the way the optimizer works with simple query blocks, in the other case the note was about an anomaly with table elimination that could appear with both “ANSI” and “traditional” Oracle syntax.

ANSI bug

In almost all cases the SQL you write using the ANSI standard syntax is tranformed into a statement using Oracle’s original syntax before being optimised – and there are still odd cases where the translation is not ideal.  This can result in poor performance, it can result in wrong results. The following examples arrived in my in-tray a couple of weeks ago:

Invalidation

Someone who attended my sessions at the Bucharest Oracle Summit earlier on this year sent me an example of a quirky little bug, possibly related to the newer “fine-grained” invalidation mechanisms, possibly related to ANSI syntax SQL, that’s very easy to reproduce. (That’s always nice for Oracle support – a perfect test case.)

All it takes is two tables and a packaged procedure that queries those tables. The package is coded to do something that should not be allowed in production code; but “should not” and “is not” are very different things. For anyone who wants to play with the example, here’s the script to create the necessary objects:

ANSI expansion

Here’s a quirky little bug that appeared on the OTN database forum in the last 24 hours which (in 12c, at least) produces an issue which I can best demonstrate with the following cut-n-paste:

ROWID

Here’s a suggestion to help you avoid wasting time. If you ever include the rowid in a query – not that that should happen very commonly – make sure you give it an alias, especially if you’re using ANSI SQL. If you don’t, you may find yourself struggling to work out why you’re getting an irrational error message. Here’s an example that appeared recently on the OTN forum, with the output cut-n-pasted from a system running 11.1.0.7:

select 
	'1' 
from 
	dual a
left join 
	(
	select	c.dummy, b.rowid
	from	dual b
	join	dual c
	on b.dummy = c.dummy
	) d
on 	a.dummy = d.dummy
;

select
*
ERROR at line 1:
ORA-01445: cannot select ROWID from, or sample, a join view without a key-preserved table

The error doesn’t really seem to fit the query, does it?
If you want to bypass the problem all you have to do is give b.rowid (line 7) an alias like rid.

ANSI Outer 2

A comment on a recent post of mine pointed me to a question on the OTN SQL and PL/SQL Forum where someone had presented a well-written test case of an odd pattern of behaviour in ANSI SQL. I made a couple of brief comments on the thread, but thought it worth highlighting here as well. The scripts to create the required tables (plus a few extras) are all available on OTN. If you create only the four tables needed and all their indexes you will need about 1.3GB of space.

ANSI Outer

Here’s an example of ANSI SQL that does something in a fashion that arguably looks a little tidier than the strategy you have to adopt in Oracle. As so often when I compare Oracle syntax and ANSI syntax it’s an exanple that relates to an outer join. We start with two tables – as usual I have locally managed tablespaces, 8KB blocks, 1MB uniform extents and freelist management. I’ve also disabled system statistics (CPU costing):


create table t1
as
select
	rownum - 1			id,
	mod(rownum - 1,20)		n1,
	lpad(rownum - 1,10,'0')		v1,
	rpad('x',100)			padding
from
	all_objects
where
	rownum <= 4000
;

create table t2
as
select
	rownum - 1			id,
	mod(rownum - 1,20)		n1,
	lpad(rownum - 1,10,'0')		v1,
	rpad('x',100)			padding
from
	all_objects
where
	rownum <= 4000
;

create index t1_i1 on t1(id);
create index t2_i1 on t2(id);

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T1',
		estimate_percent => 100,
		method_opt 	 => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'T2',
		estimate_percent => 100,
		method_opt 	 => 'for all columns size 1'
	);

end;
/

If you're familiar with ANSI SQL you won't need more than a couple of moments to interpret the following query - but I have to admit that I had to think about it carefully before I figured out what it was trying to achieve:


select
	/*+ gather_plan_statistics */
	t1.id,
	t1.n1,
	t1.v1,
	t2.n1
from
	t1
left join
	t2
on
	t2.id = t1.n1
and	t1.n1 in (7, 11, 13)
where
	t1.id = 15
;

        ID         N1 V1                 N1
---------- ---------- ---------- ----------
        15         15 0000000015

--------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   1 |  NESTED LOOPS OUTER            |       |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   TABLE ACCESS BY INDEX ROWID  | T1    |      1 |      1 |      1 |00:00:00.01 |       4 |
|*  3 |    INDEX RANGE SCAN            | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       3 |
|   4 |   VIEW                         |       |      1 |      1 |      0 |00:00:00.01 |       0 |
|*  5 |    FILTER                      |       |      1 |        |      0 |00:00:00.01 |       0 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T2    |      0 |      1 |      0 |00:00:00.01 |       0 |
|*  7 |      INDEX RANGE SCAN          | T2_I1 |      0 |      1 |      0 |00:00:00.01 |       0 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."ID"=15)
   5 - filter(("T1"."N1"=7 OR "T1"."N1"=11 OR "T1"."N1"=13))
   7 - access("T2"."ID"="T1"."N1")
       filter(("T2"."ID"=7 OR "T2"."ID"=11 OR "T2"."ID"=13))

Look carefully at the query. The ON clause includes a reference to the t1 table that is NOT a join condition. This condition means that Oracle will try to find a matching row in t2 only if the n1 value in t1 is in 7,11, or 13.

Since the join is left outer, any t1 row where id = 15 will be returned, but the n1 column from t2 will be reported only if the t1.n1 value is in 7,11, or 13 and there is a t2 row with a matching id value.

How would you express the same requirment in standard Oracle syntax ? Here's one possibility:

select
	/*+ gather_plan_statistics */
	t1.id,
	t1.n1,
	t1.v1,
	t2.n1
from
	t1, t2
where
	t1.id = 15
and	t2.id(+) = case
		when t1.n1 not in (7, 11, 13)
		 	then null
			else t1.n1
	end
;

        ID         N1 V1                 N1
---------- ---------- ---------- ----------
        15         15 0000000015

------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
------------------------------------------------------------------------------------------------
|   1 |  NESTED LOOPS OUTER          |       |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   TABLE ACCESS BY INDEX ROWID| T1    |      1 |      1 |      1 |00:00:00.01 |       4 |
|*  3 |    INDEX RANGE SCAN          | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       3 |
|   4 |   TABLE ACCESS BY INDEX ROWID| T2    |      1 |      1 |      0 |00:00:00.01 |       0 |
|*  5 |    INDEX RANGE SCAN          | T2_I1 |      1 |      1 |      0 |00:00:00.01 |       0 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."ID"=15)
   5 - access("T2"."ID"=CASE  WHEN (("T1"."N1"<>7) AND ("T1"."N1"<>11) AND
              ("T1"."N1"<>13)) THEN NULL ELSE "T1"."N1" END )

You will have noticed that I used the /*+ gather_plan_statistics */ hint in both queries, and the plans I've shown are the ones that I pulled from memory with their last execution statistics included.

The plans are clearly just a little different from each other - but are they equally efficient ?

Both plans start the same way - for each relevant row in t1 they call line 4 - and in both cases there is only one relevant row in t1, hence one call to line 4.

In the first plan we call the subroutine to create a rowsource (the VIEW operator), and this calls the FILTER operation. The filter operation is an example of a "conditional" filter - i.e. if the test in line 5 is true then line 6 is called - and in this case the test is false so line 6 is never called (Starts = 0) and Oracle doesn't try to access table t2. So we fetch one row from t1, then call two subroutines that between them do a "case" test but access no data blocks.

In the second plan line 4 calls the table access operation, which calls the index range scan operation in line 5 - but line 5 starts with a call to the case statement that returns NULL - so even though we call the index range scan operation, we don't access any data blocks, which means we don't pass any rowids to the table access in line 4, which means that that operation doesn't access any data blocks. The net workload is that lines 4 and 5 in the second plan also represent nothing more than two subroutine calls and a "case" test.

The two plans are virtually identical in resource usage - so your choice of which to use probably comes down to aesthetic appeal and comfort factor. Thanks to my previous habits I think the Oracle variant is "obviously" much easier to understand - but the client probably felt the opposite because they spent most of their time writing ANSI.

Is there another way to write the query - yes, there is, but for some reason it's not supported. If you look at the 10053 trace file for the ANSI example you'll find that Oracle has rewritten it with a LATERAL subquery before optimising it. So, presumably, we can appply the same transformation manually:


select
	/*+ gather_plan_statistics */
	t1.id,
	t1.n1,
	t1.v1,
	t2.n1
from
	t1,
	lateral (
		(
		select
			t2.n1
		from
			t2
		where
			t1.n1 in (7, 11, 13)
		and	t2.id = t1.n1
		)
	)(+) t2
where
	t1.id = 15
;

        lateral (
                *
ERROR at line 9:
ORA-00933: SQL command not properly ended

On second thoughts perhaps we can't - but it was a nice idea.

Oracle uses the table() operator with collections (or pipelined functions) as the equivalent of the ANSI LATERAL() operator, but doesn't let you use it with queries (you get Oracle error "ORA-22905: cannot access rows from a non-nested table item" if you try).

The concept is simple - the table() or lateral() operator allows you to introduce a subquery in the FROM clause that references columns from a table that appears earlier in the same FROM clause.

I don't know why Oracle doesn't support the lateral() operator in end-user code - but actually, for experimental purposes only of course, if you want to play with it you can always set event 22829:


alter session set events '22829 trace name context forever';

-- execute lateral query, and get this plan

--------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   1 |  NESTED LOOPS OUTER            |       |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   TABLE ACCESS BY INDEX ROWID  | T1    |      1 |      1 |      1 |00:00:00.01 |       4 |
|*  3 |    INDEX RANGE SCAN            | T1_I1 |      1 |      1 |      1 |00:00:00.01 |       3 |
|   4 |   VIEW                         |       |      1 |      1 |      0 |00:00:00.01 |       0 |
|*  5 |    FILTER                      |       |      1 |        |      0 |00:00:00.01 |       0 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T2    |      0 |      1 |      0 |00:00:00.01 |       0 |
|*  7 |      INDEX RANGE SCAN          | T2_I1 |      0 |      1 |      0 |00:00:00.01 |       0 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T1"."ID"=15)
   5 - filter(("T1"."N1"=7 OR "T1"."N1"=11 OR "T1"."N1"=13))
   7 - access("T2"."ID"="T1"."N1")
       filter(("T2"."ID"=7 OR "T2"."ID"=11 OR "T2"."ID"=13))

The plan is identical to the plan for the ANSI after transformation. I'll leave it to you to decide whether the code is easier to understand than the ANSI or ordinary Oracle versions - but I'd like to see it made legal, even if I didn't find many cases where I needed it.

ANSI – argh

I’m not keen on ANSI standard SQL – even though it is, technically, the strategic option and even though you have to use it for full outer joins and partitioned outer joins.

One reason for disliking it is that it “separates join predicates from filter predicates” – a reason often given in praise of the syntax which, to my mind, claims a spurious distinction and introduces a mechanism that makes it harder to keep mental track of what’s going to happen as you walk  through the join order.

The other reason for disliking ANSI SQL in Oracle databases is that sometimes it really is necessary to add hints to the SQL to make the optimizer do what needs to be done – and ANSI makes it so much harder and messier to add hints to code. Here’s a wonderful example that Tony Hasler presented in our recent debate “Does Oracle Ignore Hints” at the UKOUG annual conference:

WITH q1 as (
	SELECT /*+ qb_name(q1block) */
		*
	FROM	t1
	JOIN	t2
	ON	t1_i1 = t2_i1
	AND	t1_i1 < 10
),
q2 AS (
	SELECT
		/*+ qb_name(q2block) */
		*
	FROM
		t3
	JOIN	t4
	ON	t3_i1 = t4_i1
	AND	t3_i1 < 10
)
SELECT
	/*+
		no_merge(@q1block)
		no_merge(@q2block)
		leading (@q1block t2)
		use_nl  (@q1block t1)
	*/
	*
FROM
	q1
JOIN
	q2
ON	t1_i1 + t2_i1 = t3_i1 + t4_i1
;

Just to make life really hard, he’s included a couple of “factored subqueries” – and there are a few outstanding optimizer defects with handling subquery factoring – so when he claimed that this was an example of Oracle ignoring hints I had two different directions of investigation to worry about.

Here’s the execution plan (from my 10.2.0.3 system with the data generation, constraints and indexing that Tony supplied):

------------------------------------------------------------------------
| Id  | Operation                      | Name  | Rows  | Bytes | Cost  |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |       |   250K|    33M|    54 |
|*  1 |  HASH JOIN                     |       |   250K|    33M|    54 |
|   2 |   VIEW                         |       |  5000 |   327K|    11 |
|*  3 |    HASH JOIN                   |       |  5000 |   581K|    11 |
|   4 |     TABLE ACCESS BY INDEX ROWID| T4    |  5000 |   253K|     3 |
|*  5 |      INDEX RANGE SCAN          | T4_I1 |   900 |       |     2 |
|   6 |     TABLE ACCESS BY INDEX ROWID| T3    |  5000 |   327K|     3 |
|*  7 |      INDEX RANGE SCAN          | T3_I1 |   900 |       |     2 |
|   8 |   VIEW                         |       |  5000 |   361K|    12 |
|*  9 |    HASH JOIN                   |       |  5000 |   615K|    12 |
|  10 |     TABLE ACCESS BY INDEX ROWID| T1    |  5000 |   297K|     3 |
|* 11 |      INDEX RANGE SCAN          | T1_I1 |   900 |       |     2 |
|  12 |     TABLE ACCESS BY INDEX ROWID| T2    |  5000 |   317K|     3 |
|* 13 |      INDEX RANGE SCAN          | T2_I1 |   900 |       |     2 |
------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$16C51A37
   2 - SEL$4C69CCA2 / Q2@SEL$1
   3 - SEL$4C69CCA2
   4 - SEL$4C69CCA2 / T4@SEL$2
   5 - SEL$4C69CCA2 / T4@SEL$2
   6 - SEL$4C69CCA2 / T3@SEL$2
   7 - SEL$4C69CCA2 / T3@SEL$2
   8 - SEL$7939585E / Q1@SEL$1
   9 - SEL$7939585E
  10 - SEL$7939585E / T1@SEL$3
  11 - SEL$7939585E / T1@SEL$3
  12 - SEL$7939585E / T2@SEL$3
  13 - SEL$7939585E / T2@SEL$3

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1_I1"+"T2_I1"="T3_I1"+"T4_I1")

As you can see, Oracle has copied the two factored subqueries inline (they appear just once each in the body of the query so this is – probably – inevitable). Then Oracle has obeyed the no_merge() hints – which I could check by deleting the hints and watching the plan change. So why, in lines 10 through 13, has Oracle not obeyed the leading() and use_nl() hints ?

By changing the ANSI syntax to traditional Oracle syntax, I got a different plan:

-------------------------------------------------------------------------
| Id  | Operation                       | Name  | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |       |   250K|    33M| 10045 |
|*  1 |  HASH JOIN                      |       |   250K|    33M| 10045 |
|   2 |   VIEW                          |       |  5000 |   327K|    11 |
|*  3 |    HASH JOIN                    |       |  5000 |   581K|    11 |
|   4 |     TABLE ACCESS BY INDEX ROWID | T4    |  5000 |   253K|     3 |
|*  5 |      INDEX RANGE SCAN           | T4_I1 |   900 |       |     2 |
|   6 |     TABLE ACCESS BY INDEX ROWID | T3    |  5000 |   327K|     3 |
|*  7 |      INDEX RANGE SCAN           | T3_I1 |   900 |       |     2 |
|   8 |   VIEW                          |       |  5000 |   361K| 10003 |
|   9 |    TABLE ACCESS BY INDEX ROWID  | T1    |     1 |    61 |     2 |
|  10 |     NESTED LOOPS                |       |  5000 |   615K| 10003 |
|  11 |      TABLE ACCESS BY INDEX ROWID| T2    |  5000 |   317K|     3 |
|* 12 |       INDEX RANGE SCAN          | T2_I1 |   900 |       |     2 |
|* 13 |      INDEX RANGE SCAN           | T1_I1 |     1 |       |     1 |
-------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
   1 - SEL$1
   2 - Q2BLOCK / Q2@SEL$1
   3 - Q2BLOCK
   4 - Q2BLOCK / T4@Q2BLOCK
   5 - Q2BLOCK / T4@Q2BLOCK
   6 - Q2BLOCK / T3@Q2BLOCK
   7 - Q2BLOCK / T3@Q2BLOCK
   8 - Q1BLOCK / Q1@SEL$1
   9 - Q1BLOCK / T1@Q1BLOCK
  11 - Q1BLOCK / T2@Q1BLOCK
  12 - Q1BLOCK / T2@Q1BLOCK
  13 - Q1BLOCK / T1@Q1BLOCK

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1_I1"+"T2_I1"="T3_I1"+"T4_I1")
   3 - access("T3_I1"="T4_I1")
   5 - access("T4_I1"<10)
   7 - access("T3_I1"<10)
  12 - access("T2_I1"<10)
  13 - access("T1_I1"="T2_I1")
       filter("T1_I1"<10)

Notice how the optimizer is now obeying the leading() and use_nl() hints.

The problem is this: Oracle doesn’t optimise ANSI SQL, it transforms it then optimises it. Transformation can change query blocks, and Tony’s hints apply to specific query blocks. After a little testing and checking I worked out what the SQL looked like AFTER transformation and BEFORE optimisation; and it’s this:

select
	/*+ qb_name(sel$4) */
	*
from
	(
	SELECT
		/*+ qb_name(sel$1) */
		Q1.T1_I1 T1_I1, Q1.T1_I2 T1_I2, Q1.T1_D1 T1_D1, Q1.T2_I1 T2_I1, Q1.T2_I2 T2_I2, Q1.T2_TS T2_TS,
		Q2.T3_I1 T3_I1, Q2.T3_I2 T3_I2, Q2.T3_TSTZ T3_TSTZ, Q2.T4_I1 T4_I1, Q2.T4_I2 T4_I2
	FROM
		(
		SELECT
			/*+ NO_MERGE QB_NAME (Q1BLOCK) */
			from$_subquery$_003.T1_I1_0 T1_I1, from$_subquery$_003.T1_I2_1 T1_I2,
			from$_subquery$_003.T1_D1_2 T1_D1, from$_subquery$_003.T2_I1_3 T2_I1,
			from$_subquery$_003.T2_I2_4 T2_I2, from$_subquery$_003.T2_TS_5 T2_TS
		FROM	(
			SELECT
				/*+ qb_name(sel$3) */
				T1.T1_I1 T1_I1_0, T1.T1_I2 T1_I2_1, T1.T1_D1 T1_D1_2,
				T2.T2_I1 T2_I1_3, T2.T2_I2 T2_I2_4, T2.T2_TS T2_TS_5
			FROM
				TEST_USER.T1 T1,
				TEST_USER.T2 T2
			WHERE
				T1.T1_I1 = T2.T2_I1
			AND	T1.T1_I1 < 10
			)	from$_subquery$_003
		)	Q1,
		(
		SELECT
			/*+ NO_MERGE QB_NAME (Q2BLOCK) */
			from$_subquery$_006.T3_I1_0 T3_I1, from$_subquery$_006.T3_I2_1 T3_I2,
			from$_subquery$_006.T3_TSTZ_2 T3_TSTZ, from$_subquery$_006.T4_I1_3 T4_I1,
			from$_subquery$_006.T4_I2_4 T4_I2
		FROM
			(
			SELECT
				/*+ qb_name(sel$2) */
				T3.T3_I1 T3_I1_0, T3.T3_I2 T3_I2_1, T3.T3_TSTZ T3_TSTZ_2,
				T4.T4_I1 T4_I1_3, T4.T4_I2 T4_I2_4 FROM TEST_USER.T3 T3, TEST_USER.T4 T4
			WHERE
				T3.T3_I1 = T4.T4_I1
			AND	T3.T3_I1 < 10
			)	from$_subquery$_006
		)	Q2
	WHERE
		Q1.T1_I1 + Q1.T2_I1 = Q2.T3_I1 + Q2.T4_I1
	)
;

I got most of this from the “Query Block Name / Object Alias” section of the ANSI execution plan (there are some important clues there, like ‘T1@SEL$3′) and the “unparsed” SQL from the 10053 trace.

Notice how the query blocks q1block and q2block still exist – that’s why the no_merge() hints can survive the transformation. Notice, though, that the transformation engines has introduced a layer of inline views inside q1block and q2block - which is why the leading(@q1block t2) and use_nl(@q1block t1) hints are no longer valid: they reference objects which are not in q1block. To get his hints to work at the global level, Tony would have to change the last two hints to reference sel$3 rather than q1block.

So, next time you write a complicated piece of ANSI, make sure you think carefully about what you’re going to have to do if you subsequently have to add hints to force a particular execution plan.  (And bear in mind that one day the transformation engine might be modified to transform the query differently.)

[Further reading on "ignoring hints"]