In a recent OTN thread I've been reminded of two facts about Dynamic Sampling that I already knew but had forgotten in the meantime:
1. The table level dynamic sampling hint uses a different number of blocks for sampling than the session / cursor level dynamic sampling. So even if for both for example level 5 gets used the number of sampled blocks will be different for most of the 10 levels available (obviously level 0 and 10 are exceptions)
2. The Dynamic Sampling code uses a different approach for partitioned objects if it is faced with the situation that there are more partitions than blocks to sample according to the level (and type table/cursor/session) of Dynamic Sampling
Note that all this here applies to the case where no statistics have been gathered for the table - I don't cover the case when Dynamic Sampling gets used on top of existing statistics.
A few days ago I published an example of the optimizer failing to handle an updateable join view because it didn’t recognise that a particular type of aggregate subquery would guarantee key-preservation. Here’s another example where the human eye can see key-preservation, but the optimizer can’t (even in 22.214.171.124). As usual we’ll start with some sample data – in this case two tables since I want to update from one table to the other.
create table t1 ( id1 number, id2 number, val number, constraint t1_pk primary key (id1, id2) ); insert into t1 values (1,1,99); commit; create table t2 ( id1 number, id2 number, id3 number, val number, constraint t2_pk primary key (id1, id2, id3) ); insert into t2 values (1,1,1,200); insert into t2 values (1,1,2,200); commit;
A recent comment on a note I wrote some time ago about faking histograms asked about the calculations of selectivity in the latest versions of Oracle. As I read the question, I realised that I had originally supplied a formula for calculating cardinality, rather than selectivity, so I thought I’d supply a proper example.
We’ll start with a script to create some data and stats – and I’m going to start with a script I wrote in Jan 2001 (which is why it happens to use the analyze command rather than dbms_stats.gather_table_stats, even though this example comes from an instance of 126.96.36.199).
Here’s a simple update statement that identifies a few rows in a table then updates a column where a matching value can be derived from another table – it’s an example of an update by correlated subquery:
update t1 set small_vc = ( select max(small_vc) from t2 where t2.id = t1.id ) where mod(id,100) = 0 and exists ( select null from t2 where t2.id = t1.id ) ;
Last year I wrote a few articles for Simpletalk, a web service created by Redgate for users of SQL Server. This year, Redgate is setting up a similar service called “All things Oracle” (I’ve added a link in my blogroll) for Oracle users, and I’ve volunteered to write articles for them occasionally.
Some of the stuff they publish will be complete articles on their website, some will be short introductions with links to the authors’ own websites. My first article for them has just been posted – it’s an article that captures a couple of key points from the optimizer presentation I did at the UKOUG conference a couple of weeks ago.
In our application we extensively use a function-based index on an important table. Couple of days ago I’ve seen an interesting issue associated with this FBI, view and a GROUP BY query. I have to say I don’t have an explanation what exactly it is and how I should call it properly, hence just “trouble” in the subject line.
Here are a few thoughts on dbms_stats – in particular the procedure gather_index_stats.
The procedure counts the number of used leaf blocks and the number of distinct keys using a count distinct operation, which means you get an expensive aggregation operation when you gather stats on a large index. It would be nice efficiency feature if Oracle changed the code to use the new Approximate NDV mechanism for these counts.
If you consider the usage of Table Functions then you should be aware of some limitations to the optimizer calculations, in particular when considering a join between a Table Function and other row sources.
As outlined in one of my previous posts you can and should help the optimizer to arrive at a reasonable cardinality estimate when dealing with table functions, however doing so doesn't provide all necessary inputs to the join cardinality calculation that are useful and available from the statistics when dealing with regular tables.
Therefore even when following the recommended practice regarding the cardinality estimates it is possible to end up with some inaccuracies. This post will explain why.
Join Cardinality Basics
And you should remember that. Here is a nice example how Cost Based Optimizer can miss an obvious option (which is available to human eye and Oracle run-time with a hint) while searching for the best plan. CBO simply doesn’t consider Index Skip Scan with constant ‘in list’ predicates in the query, although it costs skip scan for a join. Such bits are always popping up here and there, so you just can’t say “The Cost Based Optimizer examines all of the possible plans for a SQL statement …”, even if Optimizer Team tells you CBO should do so. There will always be places where CBO will do less than possible to come to the best plan and will need a help from your side, such as re-written SQL or a hint.
If you want to make use of Oracle's cunning Star Transformation feature then you need to be aware of the fact that the star transformation logic - as the name implies - assumes that you are using a proper star schema.
Here is a nice example of what can happen if you attempt to use star transformation but your model obviously doesn't really correspond to what Oracle expects:
purge table d;
drop table t;
purge table t;
create table t
rownum as id
, mod(rownum, 100) + 1 as fk1
, 1000 + mod(rownum, 10) + 1 as fk2
, 2000 + mod(rownum, 100) + 1 as fk3
, rpad('x', 100) as filler
level <= 1000000
exec dbms_stats.gather_table_stats(null, 't')
create bitmap index t_fk1 on t (fk1);