Recent thread in the OakTable mailing list prompted me to create a poll and ask about the ways DBAs use system statistics in real systems. If you struggle to understand what system statistics is and what are the available options, here is the suggested reading:
Documentation – System Statistics
Best Practices for Gathering Optimizer Statistics, Oracle whitepaper
System Statistics – Troubleshooting Oracle Performance
A hint is an instruction to the optimizer
a code that tells a computer to perform a particular operation
Which means Oracle CBO must obey the hints and must perform particular operation. The latter is hard to define correctly and explain precisely because it involves the logic of the block-box (what Cost Based Optimizer is). Some of the operations are mentioned in the standard Oracle documentation, some of them scattered across different places, and there are exceptions as usual. I think I’ll list here these cases which could lead to “ignoring hints” with the links to documentation/blogs.
Here’s a summary of a recent posting on OTN:
I have two indexes (REFNO, REFTYPESEQNO) and (REFNO,TMSTAMP,REFTYPESEQNO). When I run the following query the optimizer uses the second index rather than the first index – which is an exact match for the predicates, unless I hint it otherwise:
Issues with OICA/OIC (OPTIMIZER_INDEX_COST_ADJ/OPTIMIZER_INDEX_CACHING) parameters have already been mentioned many times. Recently I’ve noticed one more and I think I didn’t see this case somewhere else on the Internet so I’ll share it.
Here’s a simple table T1 holding 1M rows with the indexed column X having 10 distinct values:
In the previous post I've demonstrated an unexpected Nested Loop Join caused by an extreme data distribution. Although unexpected at first sight, the performance of the execution plan selected by the optimizer is decent - provided the estimates are in the right ballpark.Here is another case of an unexpected execution plan, this time about Merge Joins.
In order to appreciate why the execution plan encountered is unexpected, first a quick summary about how Merge Joins work:A Merge Join is essentially a Nested Loop operation from one sorted row source into another sorted row source.
Recently I came across some interesting edge cases regarding the costing of joins. They all have in common that they result in (at first sight) unexpected execution plans, but only some of them are actual threats to performance.
The first one is about outer joins with an extreme data distribution. Consider the following data setup:
create table t1
rownum as id
, rpad('x', 100) as filler
, case when rownum > 1e6 then rownum end as null_fk
level <= 1e6
exec dbms_stats.gather_table_stats(null, 't1')
create table t2
rownum as id
, rpad('x', 100) as filler
level <= 1e6
When performing aggregate GROUP BY operations an additional filter on the aggregates can be applied using the HAVING clause.Usually aggregates are one of the last steps executed before the final result set is returned to the client.However there are various reasons, why a GROUP BY operation might be somewhere in the middle of the execution plan operation, for example it might be part of a view that cannot be merged (or was hinted not to be merged using the NO_MERGE hint), or in the more recent releases (11g+) the optimizer decided to use the GROUP BY PLACEMENT transformation that deliberately can move the GROUP BY operation to a different execution step of the plan.In such cases, when the GROUP BY operation will be input to some other operation, it becomes essential for the overall efficiency of the execution plan preferred by the optimizer that the cardinality estimates are in the right ballpark, as it will influe
Here’s a question that is NOT a trick question, it’s demonstrating an example of optimizer behaviour that might come as a surprise.
I have an index (addr_id0050, effective_date), the first column is numeric, the second is a date. Here’s a query with an execution plan that uses that index:
A recent question on OTN asked how you could model a case where Oracle had the choice between a “perfect” index for a range scan and an index that could be used for an index skip scan and choose the latter path even though it was clearly (to the human eye) the less sensible choice. There have been a number of wierd and wonderful anomalies with the index skip scan and bad choice over the years, and this particular case is just one of many oddities I have seen in the past – so I didn’t think it would be hard to model one (in fact, I thought I already had at least two examples somewhere in my library – but I couldn’t find them).
One of the most useful information that the Oracle kernel attaches to plans in the library cache are measures of various resource consumption figures, such as elapsed time, consistent and current gets, disk reads, etcetera. These can be made available for each plan line (aka "row source operation").
These figures are always cumulative, that is, include both the resource consumed by the line itself and all of its progeny. It is very often extremely useful to exclude the progeny from the measure, to get what we could name the "self" figure (following, of course, the terminology introduced by Cary Millsap and Jeff Holt in their famous book Optimizing Oracle Performance).