A little while ago I highlighted a special case with the MINUS operator (that one of the commentators extended to include the INTERSECT operator) relating to the way the second subquery would take place even if the first subquery produced no rows. I’ve since had an email from an Oracle employee letting me know that the developers looked at this case and decided that it wasn’t feasible to address it because – taking a wider view point – if the query were to run parallel they would need a mechanism that allowed some synchronisation between slaves so that every slave could find out that none of the slaves had received no rows from the first subquery, and this was going to lead to hanging problems.
Common internet question: does the order of predicates in the where clause make a difference.
General answer: It shouldn’t, but sometimes it will thanks to defects in the optimizer.
There’s a nicely presented example on the OTN database forum where predicate order does matter (between 10.1.x.x and 220.127.116.11). Notnne particularly – there’s a script to recreate the issue; note, also, the significance of the predicate section of the execution plan.
It’s bug 6782665, fixed in 18.104.22.168
I’m afraid this is one of my bad puns again – an example of the optimizer making a real hash of the index hash join. I’m going to create a table with several indexes (some of them rather similar to each other) and execute a query that should do an index join between the obvious two indexes. To show how obvious the join should be I’m going to start with a couple of queries that show the cost of simple index fast full scans.
Here’s the data generating code:
When you upgrade you often find that some little detail (of the optimizer) that didn’t receive a lot of attention in the “New Features” manuals introduces a few dramatic changes in execution plans. Here’s one example of a detail that is likely to catch a few unlucky people. We start with a very simple table which is just and id column with some padding, and then show the effect of a change in the handling of “constant subqueries”. Here’s my data set:
Here’s the output I got from a 10.2.0.5 system after generating a stored outline on a query – then dropping the index that was referenced by the stored outline and creating an alternative index. Spot the problem:
When the optimizer has to estimate the data volume (the BYTES column in the plan output), it usually bases this information on the column statistics, if applicable and available (think of complex expressions).However, whenever there is a VIEW operator in an execution plan, that represents an unmerged view, the optimizer obviously "loses" this information and starts applying defaults that are based on the column definition.Depending on the actual content of the columns this can lead to dramatic differences in data volume estimates.Both, under- and overestimates are possible, because for character based columns these defaults seem to be based on an assumed 50% fill grade, so a VARCHAR2(100 BYTE) column counts as 50 bytes data volume.For multi-byte character sets the same rule applies based on the maximum width of a column using the "char" semantics, so a VARCHAR2(1000 CHAR) column counts as 2000 byte
It probably won’t surprise many people to hear me say that the decode() function can be a bit of a nuisance; and I’ll bet that quite a lot of people have had trouble occasionally trying to get function-based indexes that use this function to behave properly. So (to put it all together and support the general directives that case is probably a better choice than decode() and that the cast() operator is an important thing to learn) here’s an example of how function-based indexes don’t always allow you to work around bad design/code. (Note: this is a model of a problem I picked up at a client site, stripped to a minimum – you have to pretend that I’m not allowed to fix the problem by changing code).
First we create some data and indexes, and gather all relevant stats:
Or – to be more accurate – real statistics on a virtual column.
This is one of the “10 top tips” that I came up with for my session with Maria Colgan at OOW13. A method of giving more information that might improve execution plans when you can change the code. I’ll start with a small data set including a virtual column (running 22.214.171.124), and a couple of problem queries:
Here’s a little demo cut-n-pasted from a session running Oracle 126.96.36.199 (it works on 11g, too). All it does is create a table by copying from a well-known table, gather extended stats on a column group, then show you the resulting column names by querying view user_tab_cols.
I’ve written notes about the different joins in the past – but such things are always worth revisiting, so here’s an accumulated bundle of comments about hash joins.