Search

OakieTags

Who's online

There are currently 0 users and 30 guests online.

Recent comments

Affiliations

Troubleshooting

A little new feature for shared pool geeks :-)

If you’ve taken any shared pool dumps from 11g+ databases lately, you might have wondered about this:

http://tech.e2sn.com/oracle/troubleshooting/shared-pool

Share

Another (secret) hacking session with me – using Oracle Session Snapper for flexible troubleshooting (and fun)

And this time we have audio !!! (Wow!)

Following the huge success of my last hacking session, planned while drinking beer at Graham Woods OOW pre-party and delivered from Miracle’s massive Oracle Closed World event in Thirsty Bear (between drinking beers), I’m announcing another hacking session:

What: Using Session Snapper for flexible Oracle Performance Troubleshooting

When: Wednesday 27th Oct 9:00-10:00 AM PDT (US West coast / California time). Check what’s this in your time zone here

Where: Internet! -> Sign up here: http://tech.e2sn.com/secret

You’ll need to register fast and be “there” on time as my current GotoWebinar account only allow 100 attendees to log on… last time over 100 people signed up, but “luckily” less actually showed up, so nobody got left outside!

BTW, I have figured out what went wrong with audio last time and caused my voice in the end of presentation disappear). A program, which I accidentally launched via a keyboard shortcut, grabbed my Mic input to itself, so gotowebinar’s app couldn’t access it anymore.

See you soon!

Share

COUNT STOPKEY operation (the where ROWNUM

I was running tests on some very large tables on an Exadata cluster and found an interesting bug.

Instead of having to query 4Billion row tables, I have reproduced this case with a cartesian join…

Check this. I’m generating 8 Billion rows using a 3-way cartesian join of set of 2000 rows. So, this results in 2000 * 2000 * 2000 rows, which is 8 billion rows.

SQL>  with sq as (select null from dual connect by level <= 2000)
      select count(*)
      from sq a, sq b, sq c;

COUNT(*)
----------
8000000000

Everything worked well as you see. All 8 billion rows were nicely counted. Let’s modify this query a bit, by adding a WHERE rownum <= 8 000 000 000 predicate, which shouldn’t modify the outcome of my query as 8 billion rows is exactly what I have:

SQL> with sq as (select null from dual connect by level <= 2000)
     select count(*)
     from sq a, sq b, sq c
     where rownum <= 8000000000;

COUNT(*)
----------
4294967295

Ding! (as I’m unable to come up with a better word to present this problem :)

Seems like we have a problem! The COUNT operation thinks that we have only a bit over 4 billion rows returned from the query plan branches below it.

We clearly have a(n easily reproducible ) bug here!

What happens here is that whenever you use ROWNUM < N operator in a query predicate like I have done, you will introduce an additional step to the query plan (COUNT STOPKEY).

Check this simple query:

SELECT * FROM t WHERE ROWNUM <= 1000

Plan hash value: 508354683

--------------------------------------------
| Id  | Operation          | Name | E-Rows |
--------------------------------------------
|   0 | SELECT STATEMENT   |      |        |
|*  1 |  COUNT STOPKEY     |      |        |
|   2 |   TABLE ACCESS FULL| T    |     41 |
--------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=1000)

As the above plan explains, the “WHERE rownum <” predicate is applied in the execution plan line (row source) 1 – COUNT STOPKEY. You won’t see this line when you are just counting rows without any “ROWNUM <” predicate.

How this works is that the COUNT STOPKEY knows how many rows you want and will just keep calling its child function under it in the execution plan tree to get more and more rows, until the required amount of rows have been returned. And then the COUNT STOPKEY function just stops calling its child row sources and declares the end-of-data condition.

And here’s the catch – due to a bug, the variable used to keep track of number of processed rows by COUNT STOPKEY is only 4 bytes, even on 64bit platforms. So, it can hold values up to 4294967295 in it (the count returned above), which is 2 ^ 32 – 1. That wouldn’t be so much of a problem in practical real world applications, but what I don’t like is that the operation will silently return wrong data – it will just stop fetching more rows, even though we haven’t reached the ROWNUM <= 8 000 000 000 limit yet and there are rows to fetch from the child row-source, but COUNT STOPKEY declares end-of-data condition and returns ~4B as a result.

This is (obviously a bug) and now there’s a bug number for that as well (thanks to Greg Rahn for letting me know) – Bug# 10214991 (unfortunately it’s not public in MOS).

Now, there’s a reason why this bug has gone unnoticed for so long despite that 4+ billion-row tables have existed for long time (I worked first with such Oracle database in year 2000 – on Windows NT :).

  1. A real-life business query with a construct of WHERE ROWNUM <= X makes sense only when the data is ordered by some meaningful business attribute (a column). Otherwise you’ll get the query results in quite random order. Note that I’m talking about real, business queries here.
  2. The only right way to order data in SQL is via an ORDER BY clause. There is no other right way to do it, even though some hacks sometimes work (and will stop working after the next upgrade)
  3. Nobody (or not many people) have written queries like: give me the sum of 5 billion biggest purchases ordered by the selling price or give me the sum of last 5 billion purchases up to this date. If you replace the billion by just ten, hundred, or just a thousand, then hell yeah, such types of queries are being executed every day (or should I say minute, second).
    Whether using ROWNUM in your business queries (instead of actual date range or purchase price in $$$) is a good or bad design is a completely different question – the fact is that ROWNUM is used in such SQLs.

Now we’ll just need to wait until the data volumes (and reports) get so big that someone actually hits this problem in real life!

So, better patch your database before the end of this decade! :-)

Share

History

A little while ago I noticed a couple of page views that had come from the AskTom website – and I’m always interested to see what question has prompted a link and visit to my blog. In this case it was a question that raised a piece of (nearly ancient) history. In a question dated July 2009 someone was asking about a comment I had made in “Practical Oracle 8i”. (Despite the book being over 10 years old I’m still getting royalty cheques on it – small ones, but they keep on coming – and it’s still a book worth reading).

The question was about the following text:

Avoid naming tablespaces according to time-periods; this is particularly relevant to large, time-based data warehouses where very old data is ultimately dropped off the back end of the database.

There is an unfortunate quirk of tablespace naming – Oracle never forgets a tablespace – in other words it never deletes it from the data dictionary table TS$ (I think this is some sort of hangover from Trusted Oracle where tablespaces have security labels associated with them). Unfortunately, one of the permanently running Oracle processes called SMON (the System monitor) scans this list of tablespaces every 3 minutes looking for tablespaces that might have some free space to coalesce; the size and cost of this scan will grow indefinitely unless you adopt a strategy of recycling tablespace names. It is actually surprising how many databases suffer a massive I/O load on the SYSTEM tablespace because of SMON.

The problem will be reduced somewhat by the introduction of locally managed tablespaces, since the scanning problem is actually exacerbated by the number of free extents that gets generated in poorly managed systems in a way that can’t happen with locally managed tablespaces; nevertheless you need to make sure that the list of tablespace names does not get out of hand.

You’ll be pleased to hear that this is no longer a problem. If you have access to Metalink (MOS), you can find information about (unpublished) bug 5861536: “Slow DDL due to Tablespace lookup with large number of tablespaces” dated 24th Sept 2008; and see in note 401436.1 (10.2.0.4 Patch Set – List of bug fixes by Problem Type) that this problem has now been addressed.

Frequency Histograms – 6

In an earlier post on frequency histograms I described how Oracle creates an approximate histogram when dealing with character columns, and warned you that the strategy could lead to a couple of anomalies if you were unlucky. I’ve already published a note about one such anomaly that can occur with fairly long character strings, this note describes another anomaly that could appear in less extreme cases. Again, we start by constructing a data set.


create table t1  (v1 varchar2(42));

insert	into t1
select	'next day'
from	all_objects
where 	rownum <= 150
;

insert into t1
select	'same day'
from 	all_objects
where 	rownum <= 350
;

insert into t1
select 	'priority high'
from 	all_objects
where 	rownum <= 500
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'t1',
		estimate_percent => 100,
		method_opt 	 => 'for all columns size 254'
	);
end;
/

select
        endpoint_number,
        endpoint_number - nvl(prev_endpoint,0)  frequency,
        hex_val,
        chr(to_number(substr(hex_val, 2,2),'XX')) ||
        chr(to_number(substr(hex_val, 4,2),'XX')) ||
        chr(to_number(substr(hex_val, 6,2),'XX')) ||
        chr(to_number(substr(hex_val, 8,2),'XX')) ||
        chr(to_number(substr(hex_val,10,2),'XX')) ||
        chr(to_number(substr(hex_val,12,2),'XX')),
        endpoint_actual_value
from    (
        select
                endpoint_number,
                lag(endpoint_number,1) over(
                        order by endpoint_number
                )                                                       prev_endpoint,
                to_char(endpoint_value,'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')hex_val,
                endpoint_actual_value
        from
                user_tab_histograms
        where
                table_name = 'T1'
        and     column_name = 'V1'
        )
order by
        endpoint_number
;

ENDPOINT_NUMBER  FREQUENCY HEX_VAL                         CHR(TO ENDPOINT_ACTUAL_VALUE
--------------- ---------- ------------------------------- ------ ------------------------------------------
            150        150  6E65787420644D6D50DD04A6400000 next d
            650        500  7072696F72698970D6651967200000 priori
           1000        350  73616D6520645C36B7AD479D600000 same d

3 rows selected.

This example captures a histogram at a point when my data set has only three popular values in it, and those values are sufficiently different that their “6-byte encoding” is enough for Oracle to tell the difference between them so the histogram doesn’t use the endpoint_actual_value column.

But I have a problem. During the course of the working day I introduce some data with different status codes, and all the queries I run are about these status codes, but by the end of the day (when, possibly, the stats will be collected again by the automatic stats collection job) all the “interesting” data has been updated to one of the three popular values.

What does Oracle do about queries for my “rare” values ? It depends what they look like. Here’s a couple of examples from 10.2.0.3:

select
	count(*)
from
	t1
where
	v1 = 'overnight'
;

select
	count(*)
from
	t1
where
	v1 = 'priority low'
;

set autotrace off

Neither of these values is in the histogram, so I would like Oracle to give a low cardinality estimate to them. But one of the values is very similar to a value that DOES exist in the histogram. This leads to an interesting variation in the numbers that appear in the execution plans:

-----------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    11 |     2 |
|   1 |  SORT AGGREGATE    |      |     1 |    11 |       |
|*  2 |   TABLE ACCESS FULL| T1   |     1 |    11 |     2 |
-----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("V1"='overnight')

Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953

-----------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    11 |     2 |
|   1 |  SORT AGGREGATE    |      |     1 |    11 |       |
|*  2 |   TABLE ACCESS FULL| T1   |   500 |  5500 |     2 |
-----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("V1"='priority low')

In the first case (‘overnight’) the optimizer has decided that the value doesn’t exist in the histogram. Since this database is running 10.2.0.3 it has set the resulting cardinality to 1 (this would be 75 in 10.2.0.4 or 11g – half the lowest known value in the histogram). But in the second case (‘priority low’) the optimizer has done its 6-byte manipulation and decided that ‘priority low’ matches ‘priority high’, and given it a cardinality of 500.

If Oracle had seen ‘priority low’ when creating the histogram it would have used the endpoint_actual_value column and generated two separate rows for ‘priority low’ and ‘priority high’. But it is perfectly feasible that because of timing (the interesting data exists only briefly during the day) or sampling (the interesting data is tiny compared to the popular values) you could see Oracle taking a “large data” execution path when you know that you’re only expecting to handle a small data set.

The solution, of course, is to write code to create a histogram that represents the situation as you want Oracle to see it.

Read currently running SQL statement’s bind variable values using V$SQL_MONITOR.BIND_XML in Oracle 11.2

The title pretty much says it. In Oracle 11.2, if you have the Diag+Tuning Pack licenses and the SQL monitoring kicks in for your SQL statement, then instead of the old fashioned ERRORSTACK dump reading you can just query the V$SQL_MONITOR.BIND_XML to find the values and metadata of your SQL statement’s bind variables.

I’ve written an example here:

And a related comment – V$SQL_BIND_CAPTURE is not a reliable way for identifying the current bind variable values in use. Oracle’s bind capture mechanism does not capture every single bind variable into SGA (it would slow down apps which run lots of short statements with bind variables). The bind capture only selectively samples bind values, during the first execution of a new cursor and then every 15 minutes from there (controlled by _cursor_bind_capture_interval parameter), assuming that new executions of that same cursor are still being started (the capture happens only when execution starts, not later during the execution).

Share

Announcing the first E2SN Virtual Conference with Jonathan Lewis, Cary Millsap, Kerry Osborne and me – 18-19 Nov!

Yes, it’s official. I’m organizing a virtual conference with some of THE top speakers in the world. The topic is Systematic Oracle SQL Optimization (in real world)

The dates are 18-19 November, the conference lasts for 4 hours on both days, so you’ll be able to still get some work done as well (and immediately apply the knowledge acquired!).

Well, none of the speakers need introduction, but just in case you’ve lived in space for last 20 years, here are the links to their blogs :)

I can tell you, (at least the first 3) people in the above list ROCK!!!

And all of them are OakTable members too :)

This conference will have 4 x 1.5 hour sessions, each delivered by a separate speaker. We aim to systematically cover the path of:

  1. Finding out where is the performance problem (and which SQLs cause it)
  2. Finding out what is the problem SQL execution plan doing and which part of it is slow
  3. How to write and fix your code so that the optimizer wouldn’t hate your SQL
  4. How to fix the SQL execution plan performance problem when you can’t touch the application code!

And as this is the first (pilot) virtual conference, then the price is awesome, especially if you get the early bird rate by signing up before 1. November!

So, check out the abstracts, details, agenda and sign up here!

#000000;">#0000ff;">http://tech.e2sn.com/virtual-conferences

P.S. I expect this event to be awesome!

Share

Frequency Histogram 5

In an earlier post on frequency histograms I described how Oracle creates an approximate histogram when dealing with character columns, and warned you  that the strategy could lead to a couple of anomalies if you were unlucky. This note describes one such anomaly. We start with a slightly curious data set:

create table t1  (v1 varchar2(42));

insert	into t1
select	'short'
from	all_objects
where 	rownum <= 100
;

insert into t1
select	'shorter'
from 	all_objects
where 	rownum <= 300
;

insert into t1
select 	'shortest'
from 	all_objects
where 	rownum <= 500
;

insert into t1
select 	'shortest_thing_in_the_recorded_data_not'
from 	all_objects
where 	rownum <= 700
;

insert into t1
select 	'shortest_thing_in_the_recorded_data_really'
from 	all_objects
where 	rownum <= 900
;

begin
	dbms_stats.gather_table_stats(
		ownname		 => user,
		tabname		 =>'t1',
		estimate_percent => 100,
		method_opt 	 => 'for all columns size 254'
	);
end;
/

If we run the script from the earlier posting to see what Oracle has stored, we get the following:

ENDPOINT_NUMBER  FREQUENCY HEX_VAL                         CHR(TO ENDPOINT_ACTUAL_VALUE
--------------- ---------- ------------------------------- ------ ------------------------------------------
            100        100  73686F72740018721DBD93B2E00000 short  short
            400        300  73686F727465871679E2CF40400000 shorte shorter
            900        500  73686F727465871679E2CF40400000 shorte shortest
           2500       1600  73686F727465871679E2CF40400000 shorte shortest_thing_in_the_recorded_d

Because two of our values were identical to the first 32 characters Oracle has recorded them as the same, and stored them under the same (incomplete) entry. So what happens when you query for a value that isn’t in the table, and doesn’t get mentioned in the histogram – but looks similar to the other two long items ?


set autotrace traceonly explain

select
	count(*)
from
	t1
where
	v1 = 'shortest_thing_in_the_recorded_data'
;

select
	count(*)
from
	t1
where
	v1 = 'shortest_thing_in_the_data'
;

set autotrace off

-----------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    29 |     4 |
|   1 |  SORT AGGREGATE    |      |     1 |    29 |       |
|*  2 |   TABLE ACCESS FULL| T1   |  1600 | 46400 |     4 |
-----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("V1"='shortest_thing_in_the_recorded_data')

-----------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost  |
-----------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    29 |     4 |
|   1 |  SORT AGGREGATE    |      |     1 |    29 |       |
|*  2 |   TABLE ACCESS FULL| T1   |    50 |  1450 |     4 |
-----------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("V1"='shortest_thing_in_the_data')

Neither of the two strings in my predicate appear in the data or in the histogram – but the first string (after applying the histogram algorithm) is a match for an entry in the histogram so Oracle reports the value derived from the matching endpoint_number as the cardinality. The second string doesn’t appear even after applying the algorithm – so Oracle has supplied a cardinality of 50, which is half the cardinality of the least frequently occuring value that it can find in the histogram. (Note – if you are running 10.2.0.3 or earlier the cardinality for this case would be 1; the optimizer changed in 10.2.0.4).

This means we have demonstrated a situation where two “rare” values which should be treated identically end up with dramatically different cardinalities estimates and could, therefore, end up being subject to dramatically different execution plans.

Conclusion: If you have fairly long, and similar, strings in a column that is a good candidate for a frequency histogram (e.g. a very descriptive status column) then you have a problem if a value that is very rare looks identical to a very popular value up to the first 32 characters. You may find that the only solution is to change the list of legal values (although various strategies involving virtual columns or function-based indexes can bypass the problem).

Reading Parallel Execution Plans With Bloom Pruning And Composite Partitioning

You’ve probably heard sayings like “sometimes things aren’t always what they seem” and “people lie”. Well, sometimes execution plans lie. It’s not really by intent, but it is sometimes difficult (or impossible) to represent everything in a query execution tree in nice tabular format like dbms_xplan gives. One of the optimizations that was introduced back in 10gR2 was the use of bloom filters. Bloom filters can be used in two ways: 1) for filtering or 2) for partition pruning (bloom pruning) starting with 11g. Frequently the data models used in data warehousing are dimensional models (star or snowflake) and most Oracle warehouses use simple range (or interval) partitioning on the fact table date key column as that is the filter that yields the largest I/O reduction from partition pruning (most queries in a time series star schema include a time window, right!). As a result, it is imperative that the join between the date dimension and the fact table results in partition pruning. Let’s consider a basic two table join between a date dimension and a fact table. For these examples I’m using STORE_SALES and DATE_DIM which are TPC-DS tables (I frequently use TPC-DS for experiments as it uses a [...]

Distributed Objects

I recently came across a tidy solution to a common problem – how to minimise code maintenance in a procedure while maximising flexibility of the procedure. The task was fairly simple – create a ref cursor for a calling program to return data that (a) followed complex selection rules and (b) allowed the user to specify numerous types of input.

The principle was simple – the final ref cursor was driven by a list of (say) order ids – and the details to be returned about those orders required some fairly complex SQL to execute. To separate the complexity of constructing the list of columns from the complexity of identifying the required rows the developers had split the procedure into two stages. First, select the list of relevant order ids using one of several possible statements – the appropriate statement being derived from analysis of the inputs to the procedure; secondly open a ref cursor using that list of order ids. In this way if a new set of rules for selection appeared the only new code needed was a new query to select the ids – the main body of code didn’t need to be modified and re-optimised.

They could have inserted the list of order ids into a global temporary table (GTT), of course; but what they actually did was to “bulk collect” them into an object table type, and then use the table() operator to reference them in their ref cursor. Here’s a very simple example demonstrating the concept:


create type number_type_table as table of number;
/

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		rownum <= 10000
)
select
	cast(rownum as number(8,0))	id,
	lpad(rownum,10,'0')		small_vc,
	rpad('x',100)			padding
from
	generator	v1,
	generator	v2
where
	rownum <= 100000
;

alter table t1 add constraint t1_pk primary key(id);

-- collect some stats on the table

declare
	m_n1_tab	number_type_table;

begin
	select
		id
	bulk collect into m_n1_tab
	from
		t1
	where
		rownum <= 10
	;

	for c1 in (
		select
			/*+
				cardinality(l1 10)
			*/
			r1.small_vc
		from
			table(m_n1_tab)		l1,
			t1			r1
		where
			r1.id = l1.column_value
	) loop
		dbms_output.put_line(c1.small_vc);
	end loop;
end;
/

I’ve used the table t1 both as the source of a few id values and as the target for the final select – the client code was obviously more subtle and complex.

Key details – from the top down – are: the “table type”, the efficiency of using a “bulk collect”, and the table(m_n1_tab) that appears in the cursor loop. (I’ve used a simple cursor loop in my demonstration rather than opening a ref cursor). You’ll notice that I’ve included a /*+ cardinality */ hint to give the optimizer a rough idea of the volume of data in the collection.

It’s clean, it’s simple, and it works well – until you start using distributed queries, and then sometimes the performance is dire. In this example my target table has a primary key, and I’m joining 10 rows on the primary key – the code is very fast and the plan (running 11.1.0.6) is as follows:

SELECT /*+     cardinality(l1 10)    */ R1.SMALL_VC FROM TABLE(:B1 )
L1, T1 R1 WHERE R1.ID = L1.COLUMN_VALUE

Plan hash value: 609144676

---------------------------------------------------------------------------------------------
| Id  | Operation                           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |       |       |       |    39 (100)|          |
|   1 |  NESTED LOOPS                       |       |       |       |            |          |
|   2 |   NESTED LOOPS                      |       |    10 |   180 |    39   (0)| 00:00:01 |
|   3 |    COLLECTION ITERATOR PICKLER FETCH|       |       |       |            |          |
|*  4 |    INDEX UNIQUE SCAN                | T1_PK |     1 |       |     0   (0)|          |
|   5 |   TABLE ACCESS BY INDEX ROWID       | T1    |     1 |    16 |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("R1"."ID"=VALUE(KOKBF$))

But when I changed the code so that the second use of t1 was referenced (through a loopback database link) as a remote table the query took about four minutes to complete, with a plan that looked like this:

SELECT /*+     cardinality(l1 10)    */ R1.SMALL_VC FROM TABLE(:B1 )
L1, T1@D11G@LOOPBACK R1 WHERE R1.ID = L1.COLUMN_VALUE

Plan hash value: 3896059973

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |      |       |       |    39 (100)|          |        |      |
|   1 |  NESTED LOOPS                      |      |    10 |   220 |    39   (0)| 00:00:01 |        |      |
|   2 |   COLLECTION ITERATOR PICKLER FETCH|      |       |       |            |          |        |      |
|*  3 |   FILTER                           |      |     1 |    20 |     1   (0)| 00:00:01 |        |      |
|   4 |    REMOTE                          | T1   |       |       |            |          | D11G@~ | R->S |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("R1"."ID"=VALUE(KOKBF$))
   4 - SELECT "ID","SMALL_VC" FROM "T1" "R1" (accessing 'D11G@LOOPBACK' )

(Notice that the second use of table t1 is now “t1@d11g@loopback” – a remote table).

The optimizer has decided to take the same nested loop execution path, presumably because it can see the high precision indexed access path – but then something has gone wrong. The SQL that gets sent to the remote database doesn’t include an access predicate on table t1 – so the remote database does a full tablescan for each row in the collection. The local database pulls the entire set of rows across the database link and then uses a filter operation at line 3 to eliminate all but the one row it needed.

What’s gone wrong ? The problem (I think) is that the value we want to pass to the remote database comes from an abstract data type (ADT) and although we know that the base type is a simple number type the code isn’t designed to deduce that. Since the remote database may not know how to treat our abstract data types we can’t possibly send them across the network. (The client’s immediate problem was that remote remote table was 1.2M blocks long – ca. 10GB – and it took about 90 seconds to get one row by tablescan.)

The client could have switched to using GTTs in the cases that went wrong, of course, but didn’t really want to have two different code mechanisms. Fortunately, once I’d modelled the problem I found a workaround. It’s a method that you may not be able to take advantage of in all cases but it was perfect for the client. Since the basic problem is that we don’t seem to be allowed to send abstract types across the network all we have to do is turn the abstract type into an Oracle base type:

begin
	select
		id
	bulk collect into m_n1_tab
	from
		t1
	where
		rownum <= 10
	;

	for c1 in (
		select
			r1.small_vc
		from
			(
			select
				/*+
					no_merge
					cardinality(l1 10)
				*/
				cast(column_value as number(8,0))	v1
			from
				table(m_n1_tab)		l1
			)				l1,
		t1@d11g@loopback	r1
		where
			r1.id = l1.v1
	) loop
		dbms_output.put_line(c1.small_vc);
	end loop;
end;
/

We replace the naked table() operator with a non-mergeable inline view that selects the absract data type from the table() operator then casts it into an Oracle base type. With this code change the query runs quickly again and we see the following execution plan.

SELECT R1.SMALL_VC FROM ( SELECT /*+      no_merge      cardinality(l1
10)     */ CAST(COLUMN_VALUE AS NUMBER(8,0)) V1 FROM TABLE(:B1 ) L1 )
L1, T1@D11G@LOOPBACK R1 WHERE R1.ID = L1.V1

Plan hash value: 966067288

------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Inst   |IN-OUT|
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |      |       |       |    39 (100)|          |        |      |
|   1 |  NESTED LOOPS                       |      |    10 |   330 |    39   (0)| 00:00:01 |        |      |
|   2 |   VIEW                              |      |    10 |   130 |    29   (0)| 00:00:01 |        |      |
|   3 |    COLLECTION ITERATOR PICKLER FETCH|      |       |       |            |          |        |      |
|   4 |   REMOTE                            | T1   |     1 |    20 |     1   (0)| 00:00:01 | D11G@~ | R->S |
------------------------------------------------------------------------------------------------------------

Remote SQL Information (identified by operation id):
----------------------------------------------------
   4 - SELECT "ID","SMALL_VC" FROM "T1" "R1" WHERE "ID"=:1 (accessing 'D11G@LOOPBACK' )

As you can see from the view operator at line 2, we have created a result set from the collection operation, and the query that goes to the remote site at line 4 now includes our high-precision predicate. It’s also rather convenient that we can include the cardinality hint in the in-line view – the client loves ANSI SQL, and for complex statements it can be quite difficult getting the hints right.