Who's online

There are currently 0 users and 43 guests online.

Recent comments


Give Me a Hint – How were These Autotrace Execution Statistics Achieved?

June 27, 2011 I recently received an email asking why different performance is achieved when a FIRST_ROWS hint, FIRST_ROWS(100) hint, and an unhinted version of the query are executed.  This seems to be a simple problem, yet it might also be an interesting problem.  I thought that it might be helpful to transform my response into a blog article (allowing the [...]


Here’s a video of the presentation that I did as the keynote for the Turkish Oracle Users’ Group conference earlier on this year. Recording courtesy of the people at Formspider.

The topic is about tuning a two-table join, and the video is about an hour long (so there’s a lot more that could be  said on the topic, and one day I might do a half  day “masterclass” on it).


A question came up on Oracle-L recently about the difference in work done by the following two queries:



Before saying anything else, I should point out that these two queries are NOT logically equivalent unless you can guarantee that the table() operator returns a unique set of values – and Oracle doesn’t allow uniqueness to be enforced on collections.

However, the author of the question had ensured that the results from the table() operator contained no duplicates so the two versions of the query returned the same result set. The question is, why does one query do fewer buffer visits than the other – as evidenced by the results from sql_trace and tkprof.

Rule-based Join

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.01       0.01          0          0          0           0
Fetch        1      0.86       0.86          0     200047          0      115195
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.87       0.87          0     200047          0      115195

Rows     Row Source Operation
-------  ---------------------------------------------------
 115195  NESTED LOOPS  (cr=200047 pr=0 pw=0 time=6355 us)
  99704   COLLECTION ITERATOR PICKLER FETCH (cr=0 pr=0 pw=0 time=284 us)
 115195   INDEX RANGE SCAN DOM_NAME_IDX (cr=200047 pr=0 pw=0 time=3191 us)(object id 54309)

CBO with subquery

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.01       0.01          0          0          0           0
Fetch        1      0.78       0.78          0     157986          0      115195
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      0.80       0.80          0     157986          0      115195

Rows     Row Source Operation
-------  ---------------------------------------------------
 115195  NESTED LOOPS (cr=157986 pr=0 pw=0 time=4720 us cost=536 size=6141 card=267)
  99704   SORT UNIQUE (cr=0 pr=0 pw=0 time=417 us)
  99704    COLLECTION ITERATOR PICKLER FETCH (cr=0 pr=0 pw=0 time=259 us)
 115195   INDEX RANGE SCAN DOM_NAME_IDX (cr=157986 pr=0 pw=0 time=3353 us cost=2 size=21 card=1)(object id 54309)

Notice how the optimizer has unnested the subquery and introduced a “sort unique” – this demonstrates the optimizer’s recognition that the two queries are not logically identical. However, the number of rows from the collection is the same (99,704) and the number of rows after joining is the same (115,195) – the data sets have been rigged so that it is a fair comparison. So why does the explicit join take 200,047 buffer visits when the transformed  subquery approach take only 157,986 buffer visits.

The answer is that you can visit buffers without doing “logical I/Os” – and this benefit accrues most frequently to indexed access paths. Thanks to the way Oracle has sorted the collection before doing the join the nested loop follows the second index (dom_name_idx) in order – this increases the chance that index blocks that were used in the previous cycle of the loop will be re-used in the next cycle, which means that Oracle need not release the pins on the index blocks, but can revisit them incrementing the statitsic: “buffer is pinned count”.

I have to say that I haven’t proved that this is exactly what was happening, but when I suggested to the person who had asked the question that he re-run and check to see if the decrease in “session logical reads” was balanced by an increase in “buffer is pinned count” the answer seemed to confirm the hypothesis. [Correction (see comment #1 from Randolf): the results didn't confirm the hypothesis - but I think that' s because of the change in "pinning" that Randolf describes, so I'll have to find a way to confirm the hypothesis some other time.]


HOWTO: XML Partitioning and Multiple XMLIndex Structures

Although not a “pure” XML partitioning example, that is partitioning data on criteria within the XML document, and before I forget to mention this exercise, I would like to point out the following URL:

This small exercise was setup based on questions / comments from a reader on this blog regarding the ”
Structured XMLIndex (Part 3) – Building Multiple XMLIndex Structures” content after heaving trouble to setup structured and unstructured local XMLIndexes.

The forum link demonstrates howto:

  • Register a XML Schema for use with Binary XML storage
  • Create a RANGE partitioned table with a XMLType column (Binary XML Securefile storage)
  • Create a Unstructured LOCAL Partitioned XMLIndex (UXI)
  • Create multiple Structured local partitioned XMLIndexes (SXI)
  • Create secondary indexes on the Content Tables created by the SXI structures
  • The effects of different queries and their explain plan output making use of the UXI, SXI and partitioning


Real-World Performance Videos on YouTube – OLTP

In addition, here are some OLTP demos that demonstrate how much performance and throughput can be wasted by poor design and suboptimal database programming.

OLTP Performance – The Trouble with Parsing

width="640" height="510" src="" frameborder="0" allowfullscreen>

OLTP Performance – Concurrent Mid-Tier Connections

width="640" height="510" src="" frameborder="0" allowfullscreen>

Real-World Performance Videos on YouTube – Data Warehousing

Here are some videos of a data warehouse demo that the Real-World Performance Group has been running for a while now and we thought it was time to put them on YouTube. Hope you find them informative.

Migrate a 1TB Data warehouse in 20 Minutes (Part 1)

width="640" height="510" src="" frameborder="0" allowfullscreen>

Migrate a 1TB Data warehouse in 20 Minutes (Part 2)

width="640" height="510" src="" frameborder="0" allowfullscreen>

Migrate a 1TB Data warehouse in 20 Minutes (Part 3)

width="640" height="510" src="" frameborder="0" allowfullscreen>

Migrate a 1TB Data warehouse in 20 Minutes (Part 4)

width="640" height="510" src="" frameborder="0" allowfullscreen>

Quiz Night

Here’s an interesting question from the OTN database forum:

“If I delete 90% of the rows from a table which has a few indexes, without rebuildling or coalescing indexes afterwards, will this improve the performance of index range scans ?”

The thing that makes it interesting is the scope it gives you for imagining reasons why the performance won’t change, or might get better, or could get worse. So how about it – can you think of an argument for each of the three possibilities ?

Update – 15th June

The first thing you need to do when you see a question like this is start asking for more information – as the first comment did.

  • How big is the table – percentage are generally useless unless you also know the scale, and it’s possible that pure size might make a difference.
  • What’s the pattern of data in the table – that probably matters
  • Is the table partitioned – and does the delete pattern vary with partition
  • What’s the pattern of data across the indexes, that could make a lot of difference
  • Are any of the indexes partitioned (whether or not the table is)
  • Are all the indexes plain B-tree indexes – or bitmaps, or context, or spatial; and function-based indexes, reverse key indexes …
  • Are you deleting the data in lots of tiny batches with commits, or one big delete
  • And yes, as asked in the first comment – are you going to collect statistics before running queries, because that could make a big difference.
  • Timing is quite important – are you thinking only in the very short term (delayed block cleanout effects), or in the longer, stabilised term.
  • Are you assuming the data you delete is data that was not previously returned in the queries – but would that always matter anyway
  • And why are you asking about range scans, not general purpose queries – that rather avoids the problem of how query plans can change as stats change.

Maybe we could make some assumptions – though that can have dangerous consequences. We might guess that the driving idea behind this question is that there’s a load of old data that isn’t being queried any more and if we simply delete it our queries will, at worst, still show the same performance. So I’m going to take the line that (a) we’re expecting any queries to return the same result and (b) we’re going to be inserting new data over time.

Just picking up a few of the comments that have appeared so far we can see how many details you might have to consider to think this question properly.

Comment 1 says: “how about collecting stats” – it seems likely that new stats would appear soon (90% deletion is more than a 10% change, so if auto stats collection hasn’t been disabled …).  What if, for some indexes, 100% of the data has been deleted from 90% of the blocks, then the leaf_block count of an index will change by a factor of 10 (and that’s effectively a bug, by the way): a possible consequence is that Oracle will decide that an index fast full scan is appropriate in cases where it previously took a different path.

Comment 3 picks up the case that if you had an index based on columns where the deleted data had null values in the relevant columns, then the 90% table deletion wouldn’t affect that index at all. This may sound a little extreme – but as a minor variation on the theme, the idea of creating function-based indexes  which hold values for “recent” data is quote well-known and very effective; the argument might be totally appropriate.

Comment 5 makes a fairly subtle and important point. If I used to have a range scan that examined 1,000 rows and returned 100 and the 90% of rows I deleted was exactly the 900 rows I didn’t return then there are two outcomes. If I had to visit the table 1,000 times to identify the 100 rows then the query will now be quicker; if I had been able to identify the 100 rows by examining 1,000 index entries then the query performance will be pretty much unchanged.

Comment 5 also has a rather nice thought on stopkeys (rownum <= N) – but it’s similar to Richard Foote’s min() example in comment 4 – I think it would only apply if the result were going to change.

Comment 9 makes an interesting point. If you delete a lot of data from a table you have no idea (until you look at the data patterns carefully, or until after you’ve collected the stats) of how the clustering_factor of the index will change. It will probably drop – though technically it could stay the same – but how far it drops compared to the change in the number of rows left in the table could make a big difference to the “selectivity * clustering_factor” bit of the index range scan calculation. Bear in mind that if your queries are returning the same result before and after the delete then your selectivities must have gone up by a factor of 10 because you’re returning the same volume of data from a total volume that has decreased by a factor of 10. (I’m assuming that the optimizer gets its stats right when I say this, of course).

Comment 6 brings up an important feature of bitmap indexes. When you delete a row from a table you delete the corresponding row from a b-tree index (eventually), but you update a bitmap index chunk, which means generating a new bitmap index chunk with one bit changed. So deleting 90% of the table data, however you did it, could result in a doubling of every bitmap index on the table. (Results will vary with version of Oracle and the strategy used to delete the data)

Several comments picked up the issue of “delayed block cleanout” that’s likely to have an impact for a while after the big delete. There might be other intermittent effects later on, depending where new data goes and how many leaf blocks were completely emptied by the delete – it is possible for some odd locking waits to appear as Oracle tries to find an empty block that can legally be moved for the split.

Personally, and ignoring the bitmap index threat, I think it’s safe to say that if your indexes are very well designed (so that eliminates most systems I’ve seen), then most queries will probably perform almost as efficiently after the delete (and delayed block cleanout) as they did before - provided the optimizer still picks the same execution plan: but the  moment you collect up to date statistics you may see some staggering changes (in either direction) for some queries.  (And if you don’t collect up to date statistics at some point you will also seem some staggering changes as Oracle starts bring in “out of range” adjustments to costs.

Bottom line: If you’re going to delete 90% of the data in a large table then you’ve got to think careful about how to minimise the side effects.


Cache buffers chains latch contention troubleshooting using latchprofx.sql example

Laurent Demaret has written a good article about how he systematically troubleshooted cache buffers chains latch contention, starting from wait interface and drilling down into details with my latchprofx tool:

A common cause for cache buffers chains latch contention is that some blocks are visited and re-visited way too much by a query execution. This usually happens due to nested loops joins or FILTER loops retrieving many rows from their outer (driving) row sources and then visiting the inner row-source again for each row from driving row source. Once you manage to fix your execution plan (perhaps by getting a hash join instead of the loop), then the blocks will not be re-visited so much and the latches will be hammered much less too.

The moral of the story is that if you have latch contention in a modern Oracle database, you don’t need to start tweaking undocumented latching parameters, but reduce the latch usage instead. And Laurent has done a good job with systematically identifying the SQL that needs to be fixed.

Good stuff!

If you don’t know what LatchProfX is, read this:


Why is my SYSAUX Tablespace so Big? Statistics_level=ALL

One of my most popular postings is about why your SYSTEM tablespace could be rather large. Recently I’ve had issues with a SYSAUX tablespace being considerably larger than I expected, so I thought I would do a sister posting on the reason.

The client I was working with at the time was about to go live with a new application and database. For various reasons I was a little anxious about how the Java application (the User Interface) would actually call the stored PL/SQL code I had helped develop. Initial workloads would be low and so I asked that the STATISTICS_LEVEL be set to ALL, so that bind variables (amongst other things) would be gathered. This is on version, btw, enterprise edition and 4-node RAC.

We went live, issues were encountered and resolved, the usual way these things work. Then, a few weeks in and when everything was still very “exciting” from a problem resolution perspective, I got an odd email from the DBA team. Would they like us to add another datafile to the SYSAUX tablespace. Huh? I checked. I’d been watching the size of our application’s tablespaces but not the others {well, I was not supposed to be a DBA and I was watching an awful lot of other things}. Our SYSAUX tablespace was around 160GB in size, having pretty much filled it’s 5th datafile. Why? I checked to see what was taking up the space in the tablespace:

 select * from
 (select owner,segment_name||'~'||partition_name segment_name,bytes/(1024*1024) size_m
 from dba_segments
 where tablespace_name = 'SYSAUX'
 where rownum < 40

OWNER              SEGMENT_NAME                                             SIZE_M
------------------ -------------------------------------------------- ------------
SYS                WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_3911            27,648
SYS                WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_3911         26,491
SYS                WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_3537            23,798
SYS                WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_3537         22,122
SYS                WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_4296            17,378
SYS                WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_4296         16,818
SYS                WRH$_ACTIVE_SESSION_HISTORY~WRH$_ACTIVE_14459270_3          136
SYS                WRH$_SQLSTAT~WRH$_SQLSTA_14459270_3911                       96
SYS                WRH$_SQLSTAT~WRH$_SQLSTA_14459270_3537                       72
SYS                WRH$_SQLSTAT~WRH$_SQLSTA_14459270_4296                       47
SYS                WRH$_LATCH_MISSES_SUMMARY_PK~WRH$_LATCH__14459270_           45
SYS                I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST~                              41
SYS                WRH$_SYSMETRIC_SUMMARY~                                      40
SYS                WRH$_LATCH_MISSES_SUMMARY_PK~WRH$_LATCH__14459270_           37

As you can see, almost all the space is being taken up by WRH$_LATCH_CHILDREN and WRH$_LATCH_CHILDREN_PK partitions. They are massive compared to other objects. A quick goggle did not come up with much other than many hits just listing what is in SYSAUX and the odd person also seeing SYSAUX being filled up with these objects and suggested ways to clear down space, nothing about the cause.

I had a chat with the DBAs and we quickly decided that this was going to be something to do with AWR given the name of objects – “WRH$_” objects are the things underlying AWR. The DBA suggested my settings of 15 minute intervals and 35 day retention was too aggressive. I knew this was not the case, I’ve had more aggressive snapshot intervals and longer retention periods on far busier systems than this. I did not have access to Metalink at that point so I asked the DBAs to raise a ticket, which they duly did.

Oracle support cogitated for a couple of days and came back with the advice to reduce the retention period. Hmmmm. Via the DBA I asked Oracle support to explain why those objects were so large when I had not seen this issue on several other systems. Was it a bug? I had by now corroborated with a friend from a previous site with 5 minute snapshot intervals and two months retention period and their SYSAUX tablespace was about 10GB all in. I did not want to go changing things if we did not know it would fix the issue as we really wanted to stop the growth of SYSAUX as soon as possible, not just try a hunch.

As you probably realise from the title of this blog, the issue was not the snapshot interval or retention period but the STATISTICS_LEVEL=ALL. The one and only hit you get in metalink if you search on WRH$_LATCH_CHILDREN is note 874518.1. From V10.1.0.2 to V11.1.0.7 setting this parameter to ALL is known to create a lot of data about Latch children and not clear it down when the AWR data is purged (Bug 8289729). The advice was to change STATISTICS_LEVEL and make the snapshot interval larger. I’d suggest you just need to alter the STATISTICS_LEVEL, unless you really, really need that extra information gathered. It seemed to take Oracle Support an extra day or two to find that note for us. {I’ve since checked out Metalink directly to confirm all this}.

So with a known issue we felt confident that altering the initialisation parameter would solve the issue. It took a while for us to change the STATISTICS_LEVEL on the production system – Change Control for that site is rather robust. This allowed us to see some other impacts of this issue.

The mmon process which looks after AWR data was becoming a top session in our OEM performance screens. In particular, a statement with SQL id 2prbzh4qfms7u that inserted into the WRH$_LATCH_CHILDREN table was taking several seconds to run each time and was running quite often {I include the SQL ID as it may be the same on many oracle V10 systems as it is internal code}:

The internal SQL inserting into wrh$_latch_children was becoming demanding

This was doing a lot of IO, by far the majority of the IO on our system at the time – it was a new system and we had been able to tune out a lot of the physical IO.

The physical IO requirements and 15-20 second elapsed time made this out most demanding statement on the system

We also now started to have issues with mmon running out of undo space when it ran at the same time as our daily load. This was particularly unfortunate as it coincided in a period of “intense management interest” in the daily load…

What was happening to the size of the SYSAUX tablespace?

Enter the tablespace (or leave null)> sys

TS_NAME              ORD      SUM_BLKS        SUM_K  MAX_CHNK_K NUM_CHNK
-------------------- ----- ----------- ------------ ----------- --------
SYSAUX               alloc  58,187,904  465,503,232  33,553,408       14
                     free       10,728       85,824      21,504       20
SYSTEM               alloc     128,000    1,024,000   1,024,000        1
                     free       68,360      546,880     546,752        3

4 rows selected.

select * from
(select owner,segment_name||'~'||partition_name segment_name,bytes/(1024*1024) size_m
 from dba_segments
 where tablespace_name = 'SYSAUX'
where rownum < 40

OWNER    SEGMENT_NAME                                                     SIZE_M
-------- ------------------------------------------------------------ ----------
SYS      WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_6201                     30262
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_5817                     29948
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_5435                     28597
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_4675                     28198
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_3911                     27648
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_5817                  27144
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_6585                     26965
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_6201                  26832
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_4675                  26741
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_3911                  26491
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_4296                     26307
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_5435                  26248
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_4296                  25430
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_6585                  25064
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_5058                     24611
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_5058                  23161
         WRH$_LATCH_CHILDREN~WRH$_LATCH__14459270_6966                      9209
         WRH$_LATCH_CHILDREN_PK~WRH$_LATCH__14459270_6966                   8462
         WRH$_SYSMETRIC_SUMMARY~                                             152
         WRH$_ACTIVE_SESSION_HISTORY~WRH$_ACTIVE_14459270_3911               136
         WRH$_SQLSTAT~WRH$_SQLSTA_14459270_3911                               96


OWNER                          OBJ_PART_NAME                                SIZE_M
------------------------------ ---------------------------------------- ----------
SYS                            WRH$_LATCH_CHILDREN-WRH                  231745.063
SYS                            WRH$_LATCH_CHILDREN_PK-WRH               215573.063
SYS                            WRH$_SQLSTAT-WRH                           711.0625
SYS                            WRH$_LATCH_MISSES_SUMMARY_PK-WRH           439.0625
SYS                            WRH$_ACTIVE_SESSION_HISTORY-WRH            437.0625
SYS                            WRH$_LATCH_PARENT-WRH                      292.0625
SYS                            WRH$_LATCH-WRH                             276.0625
SYS                            WRH$_LATCH_MISSES_SUMMARY-WRH              273.0625
SYS                            WRH$_SEG_STAT-WRH                          268.0625
SYS                            WRH$_LATCH_PARENT_PK-WRH                   239.0625
SYS                            WRH$_SYSSTAT_PK-WRH                        237.0625

Yes, that is close to half a terabyte of SYSAUX and it is all used, more partitions have appeared and the total size of the largest segments in SYSAUX show how WRH$_LATCH_CHILDREN and WRH$_LATCH_CHILDREN_PK make up the vast majority of the space used.

Shortly after, we finally got permission to change the live system. The impact was immediate, mmon dropped from being the most demanding session, that SQL code dropped down the rankings and the issues with running out of undo ceased.

I was anxious to see if the old data got purged, as the Metalink note had suggested the data would not be purged. Thankfully, that was not the case. The space was slowly released as normal purging of data outside the retention period took place and after just over a month, the SYSAUX tablespace contained a lot less information and was mostly free space:

OWNER      OBJ_PART_NAME                                SIZE_M
---------- ---------------------------------------- ----------
SYS        WRH$_LATCH_MISSES_SUMMARY_PK-WRH           512.0625
SYS        WRH$_LATCH_MISSES_SUMMARY-WRH              350.0625
SYS        WRH$_LATCH-WRH                             304.0625
SYS        WRH$_SQLSTAT-WRH                           280.0625
SYS        WRH$_LATCH_PK-WRH                          259.0625
SYS        WRH$_SYSSTAT_PK-WRH                        247.0625
SYS        WRH$_SERVICE_STAT_PK-WRH                   228.0625
SYS        WRH$_PARAMETER_PK-WRH                      201.0625
SYS        WRH$_PARAMETER-WRH                         169.0625
SYS        WRH$_SYSSTAT-WRH                           169.0625
SYS        WRH$_SEG_STAT-WRH                          161.0625
SYS        WRH$_SYSTEM_EVENT_PK-WRH                   156.0625
SYS        WRH$_SYSMETRIC_SUMMARY-                         152
SYS        WRH$_SYSTEM_EVENT-WRH                      133.0625
SYS        WRH$_SERVICE_STAT-WRH                      123.0625
SYS        WRH$_ACTIVE_SESSION_HISTORY-WRH            115.0625

TS_NAME              ORD      SUM_BLKS        SUM_K  MAX_CHNK_K NUM_CHNK
-------------------- ----- ----------- ------------ ----------- --------
SYSAUX               alloc  58,251,904  466,015,232  33,553,408       15
                     free   57,479,400  459,835,200   4,063,232    1,208
SYSTEM               alloc     128,000    1,024,000   1,024,000        1
                     free       68,048      544,384     544,320        2

Now, how do we get that space back? I left that with the DBA team to resolve.

Oh, one last thing. I mentioned the above to a couple of the Oaktable lot in the pub a few weeks back. Their instant response was to say “You set STATISTICS_LEVEL to ALL on a live system?!? You are mad!”

{Update, I’ve just spotted this posting by Colbran which is related. Last time I googled this I just got a stub with no information}

Row Values to Comma Separated Lists, an Overly Complicated Use Case Example

May 26, 2011 In a previous article I showed a simple method to convert values found in multiple table rows into a comma delimited list.  The method works very well, until the situation in which the approach is to be used becomes a bit more complex. Assume for a moment that the following table structure [...]