Some time ago on the OTN forum the following table layout was part of a discussion regarding performance issues and it revealed an interesting anomaly regarding list partition pruning:
If you're using list partitioning with partitions that use multiple values that map to a single list partition then the optimizer obviously uses a questionable approach when you're using multiple values on the partition key to prune to a single partition.
Consider the following table layout:
I'm now going to populate that table using this sample data:
The result is the following distribution:
As you can see the data is evenly distributed except for the default partition which holds 5,000 rows for the value 'AA'.
I'm now going to gather statistics, but deliberately only on partition level:
So I'm simulating here an approach where I assume that I'm going to prune to a single partition and therefore don't need high quality global level table statistics, but gather only statistics on partition level. The global level statistics are aggregated/derived statistics.
Of course the whole approach regarding partition statistics changes with 11g and its incremental partition statistics features (see e.g. http://structureddata.org/2008/07/16/oracle-11g-incremental-global-stati...), and therefore in 11g you don't suffer from the usual overhead caused by maintaining high quality global level statistics.
I'm getting the following statistics using above DBMS_STATS call:
Everything works fine if I only use a single value to prune to one of the list partitions, in this case the skewed default partition:
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3795892923
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5000 | 590K| 26 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE| | 5000 | 590K| 26 (0)| 00:00:01 | KEY | KEY |
|* 2 | TABLE ACCESS FULL | XYZ | 5000 | 590K| 26 (0)| 00:00:01 | 4 | 4 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='AA')
14 rows selected.
SQL> -- non-existent value in default partition
SQL> explain plan for
2 SELECT
3 * from XYZ
4 WHERE exch_code IN ('AB');
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3795892923
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 121 | 26 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE| | 1 | 121 | 26 (0)| 00:00:01 | KEY | KEY |
|* 2 | TABLE ACCESS FULL | XYZ | 1 | 121 | 26 (0)| 00:00:01 | 4 | 4 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='AB')
14 rows selected.
But look what happens if I use multiple values that still prune to a single list partition, again the default partition:
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3676732184
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 556 | 67276 | 26 (0)| 00:00:01 | | |
| 1 | PARTITION LIST INLIST| | 556 | 67276 | 26 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 2 | TABLE ACCESS FULL | XYZ | 556 | 67276 | 26 (0)| 00:00:01 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='AA' OR "EXCH_CODE"='AB')
14 rows selected.
SQL> -- non-existent values
SQL> explain plan for
2 SELECT
3 * from XYZ
4 WHERE exch_code IN ('AB', 'BC');
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3676732184
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 556 | 67276 | 26 (0)| 00:00:01 | | |
| 1 | PARTITION LIST INLIST| | 556 | 67276 | 26 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 2 | TABLE ACCESS FULL | XYZ | 556 | 67276 | 26 (0)| 00:00:01 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='AB' OR "EXCH_CODE"='BC')
14 rows selected.
There are at least two noteworthy points to read from these plans: The cardinality estimate has changed significantly, but the cost has not.
This seems to be odd, the partition operation has changed to PARTITION LIST INLIST and it shows a KEY(I) operation for the partitions pruned. This could suggest that the optimizer is now using the global level statistics but then the cost should change, too.
Looking at the corresponding 10053 optimizer trace files reveals some interesting details. This is what we get for when specifying a single value for the default partition:
***************************************
and this is what we get if we use multiple values for the default partition:
***************************************
So the oddity that shows up here is, that for the cardinality estimate the global level statistics are used ("(Using composite stats)"), but on the other hand the optimizer is clearly able to work out the pruning information to come to the conclusion that 90 blocks from the default partition need to be read, so the cost estimate is the same 26 as in the first case.
This looks like an odd mixture of global level and partition level statistics and the obvious question is why the optimizer doesn't use the partition level statistics for the cardinality estimate if it's possible to use the same statistics for the cost estimate.
The same happens for non-default partitions:
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3795892923
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 60 | 7260 | 10 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE| | 60 | 7260 | 10 (0)| 00:00:01 | KEY | KEY |
|* 2 | TABLE ACCESS FULL | XYZ | 60 | 7260 | 10 (0)| 00:00:01 | 2 | 2 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='UO')
14 rows selected.
SQL>
SQL> explain plan for
2 SELECT
3 * from XYZ
4 WHERE exch_code IN ('UO', 'US');
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3676732184
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 556 | 67276 | 10 (0)| 00:00:01 | | |
| 1 | PARTITION LIST INLIST| | 556 | 67276 | 10 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 2 | TABLE ACCESS FULL | XYZ | 556 | 67276 | 10 (0)| 00:00:01 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='UO' OR "EXCH_CODE"='US')
14 rows selected.
In order to see what the estimates would look like if the partition level statistics were used, we can make use of the explicit partition pruning by name:
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3795892923
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5000 | 590K| 26 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE| | 5000 | 590K| 26 (0)| 00:00:01 | KEY | KEY |
|* 2 | TABLE ACCESS FULL | XYZ | 5000 | 590K| 26 (0)| 00:00:01 | 4 | 4 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='AA' OR "EXCH_CODE"='AB')
14 rows selected.
SQL>
SQL> explain plan for
2 SELECT
3 * from XYZ partition (BBO_EURO)
4 WHERE exch_code IN ('AB', 'BC');
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3795892923
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 121 | 26 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE| | 1 | 121 | 26 (0)| 00:00:01 | KEY | KEY |
|* 2 | TABLE ACCESS FULL | XYZ | 1 | 121 | 26 (0)| 00:00:01 | 4 | 4 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='AB' OR "EXCH_CODE"='BC')
14 rows selected.
SQL>
SQL> explain plan for
2 SELECT
3 * from XYZ partition (BBO_NAMR)
4 WHERE exch_code IN ('UO', 'US');
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3795892923
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 114 | 13794 | 10 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE| | 114 | 13794 | 10 (0)| 00:00:01 | KEY | KEY |
|* 2 | TABLE ACCESS FULL | XYZ | 114 | 13794 | 10 (0)| 00:00:01 | 2 | 2 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='UO' OR "EXCH_CODE"='US')
14 rows selected.
You can see that in all cases the cardinality estimates are quite different (and very accurate by the way since I have a histogram on the EXCH_CODE column on partition level), and the cost estimate is still the same.
Just for completeness, dynamic sampling can help in this case, but only if used explicitly on table level, since the predicate is not considered as "guess", therefore with statistics in place the dynamic sampling doesn't get used.
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3676732184
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 556 | 67276 | 10 (0)| 00:00:01 | | |
| 1 | PARTITION LIST INLIST| | 556 | 67276 | 10 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 2 | TABLE ACCESS FULL | XYZ | 556 | 67276 | 10 (0)| 00:00:01 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='UO' OR "EXCH_CODE"='US')
14 rows selected.
SQL> -- dynamic sampling explicitly specified for table
SQL> explain plan for
2 SELECT /*+ dynamic_sampling(xyz, 4) */
3 * from XYZ
4 WHERE exch_code IN ('UO', 'US');
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3676732184
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 114 | 13794 | 10 (0)| 00:00:01 | | |
| 1 | PARTITION LIST INLIST| | 114 | 13794 | 10 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 2 | TABLE ACCESS FULL | XYZ | 114 | 13794 | 10 (0)| 00:00:01 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='UO' OR "EXCH_CODE"='US')
Note
-----
- dynamic sampling used for this statement
18 rows selected.
SQL>
So this odd mixture of global and partition level statistics requires to have high-quality global level statistics including histograms where necessary to get accurate cardinality estimates:
Now the results look different:
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3676732184
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5018 | 592K| 26 (0)| 00:00:01 | | |
| 1 | PARTITION LIST INLIST| | 5018 | 592K| 26 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 2 | TABLE ACCESS FULL | XYZ | 5018 | 592K| 26 (0)| 00:00:01 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='AA' OR "EXCH_CODE"='AB')
14 rows selected.
SQL>
SQL> explain plan for
2 SELECT
3 * from XYZ
4 WHERE exch_code IN ('AB', 'BC');
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3676732184
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 35 | 4235 | 26 (0)| 00:00:01 | | |
| 1 | PARTITION LIST INLIST| | 35 | 4235 | 26 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 2 | TABLE ACCESS FULL | XYZ | 35 | 4235 | 26 (0)| 00:00:01 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='AB' OR "EXCH_CODE"='BC')
14 rows selected.
SQL>
SQL> explain plan for
2 SELECT
3 * from XYZ
4 WHERE exch_code IN ('UO', 'US');
Explained.
SQL>
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3676732184
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 114 | 13794 | 10 (0)| 00:00:01 | | |
| 1 | PARTITION LIST INLIST| | 114 | 13794 | 10 (0)| 00:00:01 |KEY(I) |KEY(I) |
|* 2 | TABLE ACCESS FULL | XYZ | 114 | 13794 | 10 (0)| 00:00:01 |KEY(I) |KEY(I) |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EXCH_CODE"='UO' OR "EXCH_CODE"='US')
14 rows selected.
But this comes at the price of the overhead to maintain global level statistics, which wasn't necessary if the pruning would be handled consistently in case the process is designed to prune to a single list partition.
I haven't tested yet in its entirety if the new incremental statistics approach of 11g or its corresponding 10.2.0.4 feature "APPROX_GLOBAL AND PARTITION" (see e.g. here) maintain histograms on global level without the need to gather global level statistics, but the present approach of the optimizer towards list partition pruning to a single partition seems to be questionable.
What I can confirm is that the traditional statistics collection on partition level including histograms doesn't generate histograms on aggregate levels, so with aggregate statistics you don't get histograms on global or partition level (in case of composite partitioning).
This test case result applies to 9.2.0.8, 10.2.0.4 and 11.1.0.7, tests were run on Windows XP 32bit.
Full name
Randolf Geist
My company
http://www.sqltools-plusplus.org:7676/
Recent comments
1 week 3 days ago
1 week 5 days ago
1 week 5 days ago
1 week 5 days ago
6 weeks 5 days ago
6 weeks 6 days ago
8 weeks 6 hours ago
11 weeks 6 days ago
17 weeks 4 days ago
20 weeks 2 days ago