I’ll probably have to file this one under “Optimizer ignoring hints” – except that it should also go under “bugs”, and that’s one of the get-out clauses I use in my “hints are not hints” argument.
Sometimes an invisible index isn’t completely invisible.
Here’s a demonstration from 220.127.116.11 showing something which, to my mind, is a very annoying problem. The objects are in a tablespace that has been created with uniform extents of 1MB on an 8KB block size, using freelist management. I’ve rigged the Hakan factor to ensure that I get exactly 40 rows per block, and I’ve set the system statistics to ensure that a relatively small swing in cost results in a change in execution plan.
Cost Based Oracle – Fundamentals (November 2005)
But the most interesting function for our purposes is sys_op_countchg(). Judging from its name, this function is probably counting changes, and the first input parameter is the block ID portion (object_id, relative file number, and block number) of the table’s rowid, so the function is clearly matching our notional description of how the clustering_factor is calculated. But what is that 1 we see as the second parameter?
I don’t think this is likely to happen on a production system (until 12c) – but look what you can do if you try hard enough:
1 select 2 index_name, column_name from user_ind_columns 3 where 4 table_name = 'T1' 5 order by 6* index_name , column_position SQL> / INDEX_NAME COLUMN_NAME -------------------- -------------------- T1_I1 N1 V1 T1_I2 N1 V1 4 rows selected.
That’s a straight cut-n-paste from an Oracle 18.104.22.168 SQL*Plus session. (You can tell I typed it in real time because I missed the return before the FROM, and couldn’t be bothered to go back and do it again ;) )
This is the text of an article I published in the UKOUG magazine a few years ago, but it caught my eye while I was browsing through my knowledge base recently, and it’s still relevant. I haven’t altered the original apart from adding a couple of very brief comments in brackets [Ed: like this].
One of the strengths of a relational database is that you should be able to throw any reasonable query (and even some unreasonable queries) at it and it will be able to return the right answer without being told how to navigate through the data.
There’s no guarantee, though, that you’ll get the answer quickly unless you’ve given the database some help by turning your logical model into a sensible physical implementation. Part of the physical implementation will be the choice of indexes – and this article reviews one of the commonest indexing issues that I see in OLTP systems
Here’s a question that is NOT a trick question, it’s demonstrating an example of optimizer behaviour that might come as a surprise.
I have an index (addr_id0050, effective_date), the first column is numeric, the second is a date. Here’s a query with an execution plan that uses that index:
When I introduced Connor McDonald’s blog a few days ago, it was because we had exchanged a couple of email messages (through the Oak Table Network) about how to minimise the resource consumption when copying a load of data from one IOT to another of the same structure. His problem was the way in which the obvious way of copying the data resulted in a massive sort even though, in principle, it should not have been necessary to sort anything since the data could have been extracted in order by walking the existing IOT.
Here’s a funny little glitch – typical of the sort of oddity that creeps into the data dictionary from time to time – cut-n-pasted from 22.214.171.124:
SQL> select count(*) from user_indexes; COUNT(*) ---------- 1074 1 row selected. SQL> select count(*) from user_objects where object_type = 'INDEX'; COUNT(*) ---------- 917 1 row selected.
We seem to have 157 indexes that aren’t indexes – what could they be. Perhaps a small enhancement to our query on user_indexes (the one that has the larger result) will help:
A recent question on OTN asked how you could model a case where Oracle had the choice between a “perfect” index for a range scan and an index that could be used for an index skip scan and choose the latter path even though it was clearly (to the human eye) the less sensible choice. There have been a number of wierd and wonderful anomalies with the index skip scan and bad choice over the years, and this particular case is just one of many oddities I have seen in the past – so I didn’t think it would be hard to model one (in fact, I thought I already had at least two examples somewhere in my library – but I couldn’t find them).
Another little detail that Hermann Baer mentioned in his presentation yesterday was the ability to create multiple indexes with the same column definition – something which currently gets you Oracle error “ORA-01408: such column list already indexed.”
No details, and there’s always the “safe harbour” slide of course – the one which says seomthing about the presentation being only an indication of current thinking and nothing is guaranteed to appear.
Occasionally I come across complaints that dbms_stats is not obeying the estimate_percent when sampling data and is therefore taking more time than it “should” when gathering stats. The complaint, when I have seen it, always seems to be about the sample size Oracle chose for indexes.
There is a simple but (I believe) undocumented reason for this: because indexes are designed to collate similar data values they are capable of accentuating any skew in the data distribution, which means a sample taken from a small number of leaf blocks can be highly misleading as a guide to the whole index – so Oracle aims for a minimum sample size for gathering index stats.