Oracle 11g added Extended Statistics support for column groups in order to detect correlated columns for filter predicates using an equal comparison.
Note that Oracle 11g also added the ability to use the number of distinct keys of a composite index as an upper limit for the cardinality estimates for matching column predicates, which means that the optimizer is now capable of detecting correlated columns without the explicit addition of Extended Statistics / Column Groups.
Oracle 11.2 introduced a set of new Query Transformations, among others the ability to coalesce subqueries which means that multiple correlated subqueries can be merged into a number of less subqueries.
Timur Akhmadeev already demonstrated the basic principles in a blog entry, but when I was recently involved into supporting a TPC-H benchmark for a particular storage vendor I saw a quite impressive application of this optimization that I would like to share here.
Here is an odd little bug that was discussed a couple of weeks ago on the OTN forums.
It's about queries on join views by ROWID that fail with "ORA-01410: invalid ROWID" under certain circumstances. The bug can only be reproduced when using the 11.2 code base. In fact the same setup will cause an internal error in 11.1, but 10.2 will return correct results.
It's probably not a very common scenario but it is an interesting example of how features that work fine by themselves can cause problems when used together.
First of all (hopefully) some of you may ask: How is it possible to query from a join view by ROWID, since the view is based on multiple objects and hence doesn't have a simple one-to-one mapping to a ROWID of a single table?
There is a lot more to say about Dynamic Sampling and indexes, and I'll try to cover these basics in my Dynamic Sampling series on AllThingsOracle.com, but two recent discussions on the OTN forums and on Charles Hooper's blog prompted me to publish this blog post.
These discussions revolved around the following issues with Dynamic Sampling and indexes:
1. CREATE INDEX On Empty Table
This is just a short note that Oracle has added several nice details to 126.96.36.199 and 188.8.131.52 respectively that can be helpful for troubleshooting.
ASH, PGA Memory And TEMP Consumption
Since 184.108.40.206 the V$ACTIVE_SESSION_HISTORY view (that requires Enterprise Edition plus Diagnostic License) contains the PGA_ALLOCATED and TEMP_SPACE_ALLOCATED columns.
In particular the latter closes an instrumentation gap that always bothered me in the past: So far it wasn't easy to answer the question which session used to allocate TEMP space in the past. Of course it is easy to answer while the TEMP allocation was still held by a session by looking at the corresponding V$ views like V$SORT_USAGE, but once the allocation was released answering questions like why was my TEMP space exhausted three hours ago was something that couldn't be told by looking at the information provided by Oracle.
Here is a summary of the findings while evaluating Incremental Partition Statistics that have been introduced in Oracle 11g.
The most important point to understand is that Incremental Partition Statistics are not "cost-free", so anyone who is telling you that you can gather statistics on the lowest level (partition or sub-partition in case of composite partitioning) without any noticeable overhead in comparison to non-incremental statistics (on the lowest level) is not telling you the truth.
Although this might be obvious I've already personally heard someone making such claims so it's probably worth to mention.
In principle you need to test on your individual system whether the overhead that is added to each statistics update on the lowest level outweighs the overhead of actually gathering statistics on higher levels, of course in particular on global level.
In a recent OTN thread I've been reminded of two facts about Dynamic Sampling that I already knew but had forgotten in the meantime:
1. The table level dynamic sampling hint uses a different number of blocks for sampling than the session / cursor level dynamic sampling. So even if for both for example level 5 gets used the number of sampled blocks will be different for most of the 10 levels available (obviously level 0 and 10 are exceptions)
2. The Dynamic Sampling code uses a different approach for partitioned objects if it is faced with the situation that there are more partitions than blocks to sample according to the level (and type table/cursor/session) of Dynamic Sampling
Note that all this here applies to the case where no statistics have been gathered for the table - I don't cover the case when Dynamic Sampling gets used on top of existing statistics.
If you consider the usage of Table Functions then you should be aware of some limitations to the optimizer calculations, in particular when considering a join between a Table Function and other row sources.
As outlined in one of my previous posts you can and should help the optimizer to arrive at a reasonable cardinality estimate when dealing with table functions, however doing so doesn't provide all necessary inputs to the join cardinality calculation that are useful and available from the statistics when dealing with regular tables.
Therefore even when following the recommended practice regarding the cardinality estimates it is possible to end up with some inaccuracies. This post will explain why.
Join Cardinality Basics
This is just a short note about one of the potential side-effects of the new Auto Degree Of Parallelism (DOP) feature introduced in 11.2.
If you happen to have Parallel DML enabled in your session along with Auto DOP (and here I refer to the PARALLEL_DEGREE_POLICY = AUTO setting, not LIMITED) then it might take you by surprise that INSERT statements that are neither decorated with a parallel hint nor use any parallel enabled objects can be turned into direct-path inserts.
Preface (with apologies to Kevin Closson)
This blog post is too long
In the previous part of this series I've already demonstrated that the logical I/O optimization of the Table Prefetching feature depends on the order of the row sources - and 11g takes this approach a big step further.
It is very interesting that 11g does not require any particular feature like Table Prefetching or Nested Loop Join Batching (another new feature introduced in 11g) to take advantage of the Logical I/O optimization - it seems to be available even with the most basic form of a Nested Loop join.