Dynamic sampling and partitioned tables

------------------------------------------------------------------------------

Update January 2010: A thread on OTN mentioned this blog post together with another blog post by Asif Momen that contradicts it.

So why do these two blog posts come to different conclusions regarding dynamic sampling and partitions with missing statistics?

This is the good thing about documented test cases: I reproduced what Asif had done and found that the significant difference between the two test cases is the existence of global-level statistics.

In my test case below, I explicitly gathered statistics at partition level only, so there are no statistics at global/table level (as can be seen from the output of the query against user_tab_statistics below).

Asif actually gathered statistics at global/table level, as can be seen from his blog post.

So the conclusion seems to be: if you prune to a single partition that has no statistics, dynamic sampling is used only if no global/table-level statistics are available. If global/table-level statistics are available, the optimizer does not perform dynamic sampling and falls back to those global/table-level statistics instead.

Oddly, this does not seem to apply to the subpartition/partition case: repeating a similar setup with subpartitions that have no statistics while partition-level statistics are available, dynamic sampling was still used (tested on 11.1.0.7 Win32).
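
Since the conclusion above hinges on whether the table-level row carries statistics, it can help to check that directly. The following is a minimal sketch, assuming the WR_TEST table from the test case below. The OBJECT_TYPE and LAST_ANALYZED columns of USER_TAB_STATISTICS distinguish the table-level row from the partition-level rows. The commented-out call shows one way to add global statistics only, which should make the two test cases comparable. It is an illustration and not part of the original test.

```sql
-- Which levels of WR_TEST have statistics?
-- OBJECT_TYPE is TABLE, PARTITION or SUBPARTITION;
-- LAST_ANALYZED is NULL where no statistics have been gathered.
select
       object_type
     , partition_name
     , subpartition_name
     , num_rows
     , last_analyzed
from
       user_tab_statistics
where
       table_name = 'WR_TEST'
order by
       partition_position nulls first
     , subpartition_position nulls first;

-- Optional (illustration only): gather global/table-level statistics
-- exec dbms_stats.gather_table_stats(null, 'WR_TEST', granularity => 'GLOBAL')
```

If the row with OBJECT_TYPE = 'TABLE' shows a non-null LAST_ANALYZED, the setup corresponds to Asif's test case rather than to the one shown below.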

------------------------------------------------------------------------------

Dynamic sampling for tables with missing statistics is enabled by default from Oracle 10g on (OPTIMIZER_DYNAMIC_SAMPLING = 2). By the way, you can get the same behaviour in Oracle 9i by raising the dynamic sampling level from its default of 1 to at least 2, either at system or session level (OPTIMIZER_DYNAMIC_SAMPLING parameter) or at statement level (DYNAMIC_SAMPLING hint). For more information, see the documentation.

An interesting question is what happens if you have a partitioned table where statistics are missing for only some of the partitions or subpartitions, while others have statistics gathered.

Does dynamic sampling kick in selectively, depending on which partition is accessed, or does the optimizer simply check whether the table itself has statistics?

The following test case, which works only on 11.1 and later because it uses list/range composite partitioning for the subpartition-specific tests, shows the results from 11.1.0.7 on Win32:
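
As a rough sketch of the three levels mentioned above, the statements below set the dynamic sampling level to 2. The WR_TEST table and the predicate are borrowed from the test case further down, purely for illustration. The hint is shown in its cursor-level form; the table-level form of the hint behaves differently and is not covered here.

```sql
-- Session level: affects statements parsed by this session
alter session set optimizer_dynamic_sampling = 2;

-- System level: requires the ALTER SYSTEM privilege
alter system set optimizer_dynamic_sampling = 2;

-- Statement level: cursor-level hint, applies to this query only
select /*+ dynamic_sampling(2) */
       *
from
       wr_test
where
       trade_date = date '2009-03-01';
```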

SQL> exec dbms_random.seed(0)

PL/SQL procedure successfully completed.

SQL>
SQL> -- Range partitioning testcase
SQL> CREATE TABLE wr_test
2 ( test_id
3 , trade_date
4 , CONSTRAINT test_pk PRIMARY KEY (trade_date, test_id) using index local)
5 PARTITION BY RANGE (trade_date)
6 ( PARTITION p_jan VALUES LESS THAN (DATE '2009-02-01')
7 , PARTITION p_feb VALUES LESS THAN (DATE '2009-03-01')
8 , PARTITION p_mar VALUES LESS THAN (DATE '2009-04-01') )
9 AS
10 SELECT ROWNUM AS test_id
11 , DATE '2009-02-01' + trunc(dbms_random.value(0, 59)) as trade_date
12 FROM dual
13 connect by level <= 1000;

Table created.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'WR_TEST', partname=>'p_feb', granularity=>'partition')

PL/SQL procedure successfully completed.

SQL>
SQL> select
2 partition_name
3 , num_rows
4 from
5 user_tab_statistics
6 where
7 table_name = 'WR_TEST';

PARTITION_NAME NUM_ROWS
------------------------------ ----------

P_JAN
P_FEB 491
P_MAR

4 rows selected.

SQL>
SQL> -- Dynamic sampling is selectively used on partitions without statistics
SQL> explain plan for
2 select * from wr_test where trade_date = date '2009-03-01';

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1136113187

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 16 | 352 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 16 | 352 | 2 (0)| 00:00:01 | 3 | 3 |
|* 2 | TABLE ACCESS FULL | WR_TEST | 16 | 352 | 2 (0)| 00:00:01 | 3 | 3 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("TRADE_DATE"=TO_DATE(' 2009-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Note
-----
- dynamic sampling used for this statement

18 rows selected.

SQL>
SQL> -- No dynamic sampling with statistics in place
SQL> explain plan for
2 select * from wr_test where trade_date = date '2009-02-01';

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3091737428

--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 18 | 198 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 18 | 198 | 2 (0)| 00:00:01 | 2 | 2 |
|* 2 | INDEX RANGE SCAN | TEST_PK | 18 | 198 | 2 (0)| 00:00:01 | 2 | 2 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("TRADE_DATE"=TO_DATE(' 2009-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

14 rows selected.

SQL>
SQL> drop table wr_test purge;

Table dropped.

SQL>
SQL> exec dbms_random.seed(0)

PL/SQL procedure successfully completed.

SQL>
SQL> -- composite partitioning testcase
SQL> CREATE TABLE wr_test
2 ( test_id
3 , trade_date
4 , CONSTRAINT test_pk PRIMARY KEY (trade_date, test_id) using index local)
5 partition by list (test_id)
6 SUBPARTITION BY RANGE (trade_date)
7 (
8 partition p_default values (default)
9 ( SUBPARTITION p_jan VALUES LESS THAN (DATE '2009-02-01')
10 , SUBPARTITION p_feb VALUES LESS THAN (DATE '2009-03-01')
11 , SUBPARTITION p_mar VALUES LESS THAN (DATE '2009-04-01') )
12 )
13 AS
14 SELECT ROWNUM AS test_id
15 , DATE '2009-02-01' + trunc(dbms_random.value(0, 59)) as trade_date
16 FROM dual
17 connect by level <= 1000;

Table created.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'WR_TEST', partname=>'p_feb', granularity=>'subpartition')

PL/SQL procedure successfully completed.

SQL>
SQL> select
2 partition_name
3 , subpartition_name
4 , num_rows
5 from
6 user_tab_statistics
7 where
8 table_name = 'WR_TEST';

PARTITION_NAME SUBPARTITION_NAME NUM_ROWS
------------------------------ ------------------------------ ----------

P_DEFAULT
P_DEFAULT P_JAN
P_DEFAULT P_FEB 491
P_DEFAULT P_MAR

5 rows selected.

SQL>
SQL> -- Dynamic sampling also is selectively used on SUBpartitions without statistics
SQL> explain plan for
2 select * from wr_test where trade_date = date '2009-03-01';

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1060835009

---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 16 | 352 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE | | 16 | 352 | 2 (0)| 00:00:01 | 1 | 1 |
| 2 | PARTITION RANGE SINGLE| | 16 | 352 | 2 (0)| 00:00:01 | 3 | 3 |
|* 3 | TABLE ACCESS FULL | WR_TEST | 16 | 352 | 2 (0)| 00:00:01 | 3 | 3 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("TRADE_DATE"=TO_DATE(' 2009-03-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Note
-----
- dynamic sampling used for this statement

19 rows selected.

SQL>
SQL> -- No dynamic sampling with statistics in place
SQL> explain plan for
2 select * from wr_test where trade_date = date '2009-02-01';

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1060835009

---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 18 | 198 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE | | 18 | 198 | 2 (0)| 00:00:01 | 1 | 1 |
| 2 | PARTITION RANGE SINGLE| | 18 | 198 | 2 (0)| 00:00:01 | 2 | 2 |
|* 3 | TABLE ACCESS FULL | WR_TEST | 18 | 198 | 2 (0)| 00:00:01 | 2 | 2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("TRADE_DATE"=TO_DATE(' 2009-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

15 rows selected.

SQL>
SQL> -- Different treatment of subpartitions in pre-10.2.0.4
SQL> alter session set optimizer_features_enable = '10.2.0.3';

Session altered.

SQL>
SQL> -- Now uses partition-level statistics
SQL> -- These are missing
SQL> -- Therefore dynamic sampling
SQL> -- Although the subpartition accessed has statistics
SQL> -- But these are not used by pre-10.2.0.4 optimizer code
SQL> explain plan for
2 select * from wr_test where trade_date = date '2009-02-01';

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1060835009

---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 23 | 506 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE | | 23 | 506 | 2 (0)| 00:00:01 | 1 | 1 |
| 2 | PARTITION RANGE SINGLE| | 23 | 506 | 2 (0)| 00:00:01 | 2 | 2 |
|* 3 | TABLE ACCESS FULL | WR_TEST | 23 | 506 | 2 (0)| 00:00:01 | 2 | 2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("TRADE_DATE"=TO_DATE(' 2009-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Note
-----
- dynamic sampling used for this statement

19 rows selected.

SQL>
SQL> -- Gathering statistics on partition level
SQL> exec dbms_stats.gather_table_stats(null, 'WR_TEST', partname=>'p_default', granularity=>'partition', method_opt=>'for all columns size 1')

PL/SQL procedure successfully completed.

SQL>
SQL> -- No longer using dynamic sampling
SQL> explain plan for
2 select * from wr_test where trade_date = date '2009-02-01';

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3944471208

---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 187 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION LIST SINGLE | | 17 | 187 | 2 (0)| 00:00:01 | 1 | 1 |
| 2 | PARTITION RANGE SINGLE| | 17 | 187 | 2 (0)| 00:00:01 | 2 | 2 |
|* 3 | INDEX RANGE SCAN | TEST_PK | 17 | 187 | 2 (0)| 00:00:01 | 2 | 2 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("TRADE_DATE"=TO_DATE(' 2009-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

15 rows selected.

SQL>

So, as can be seen, in 11.1 dynamic sampling is used selectively, depending on what kind of partition pruning the optimizer recognizes at parse time and on whether statistics have been gathered for that partition; this also applies to the subpartition level. The same can be seen in 10.2.0.4, apart from the severe bug regarding single subpartition pruning in the 10.2.0.4 patch set release as shown here.

As already mentioned a couple of times on this blog, the optimizer code of pre-10.2.0.4 versions doesn't use subpartition-level statistics even when pruned to a single subpartition; it always reverts to the partition level.
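
To spot tables that could be affected by this pre-10.2.0.4 behaviour, one possible check is to look for partitions that have no statistics of their own although at least one of their subpartitions does. The query below is a minimal sketch under that assumption. It relies only on the OBJECT_TYPE and LAST_ANALYZED columns of USER_TAB_STATISTICS and is not taken from the original test case.

```sql
-- Partitions without statistics whose subpartitions do have statistics.
-- On pre-10.2.0.4 optimizer code these are candidates for dynamic sampling
-- even when the query prunes to an analyzed subpartition.
select
       p.table_name
     , p.partition_name
from
       user_tab_statistics p
where
       p.object_type = 'PARTITION'
and    p.last_analyzed is null
and    exists (
         select null
         from   user_tab_statistics s
         where  s.table_name     = p.table_name
         and    s.partition_name = p.partition_name
         and    s.object_type    = 'SUBPARTITION'
         and    s.last_analyzed is not null
       );
```

For the composite-partitioned WR_TEST table above, this should return P_DEFAULT before the partition-level statistics are gathered and no rows afterwards.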