Search

OakieTags

Who's online

There are currently 0 users and 36 guests online.

Recent comments

Affiliations

partitioning

Online Materialized View Complete Refresh With Partitions

The other day I was posed with an interesting problem regarding the refresh of the materialized views. First, a little background. MVs are nothing new in Oracle; they have been around since Oracle 7 (called snapshots then). Essentially they are similar to tables (as they actually store data) but populated by running a query on some source tables. This query is the defining query of the MV. So, the user can select from the MV instead of executing the defining query – reducing the execution time significantly.

However, the MV and the query would both give the same result if the source tables have not changed. If the data has changed, the MV will not have known about it and will give a stale result. Therefore, you need to refresh the MV from time to time. Usually the MVs are refreshed by the following method:

begin
dbms_mview.refresh(‘’);
end;

So, what’s the problem? The problem is, during complete refreshes, the MV remains unavailable to the users. This duration of refresh depends on the size of the MV, the execution time of the underlying query and so on; and the unavailability is a nuisance for most applications. In some cases the applications are designed to timeout after some wait; and they do so quite consistently in this case – making it more than just a nuisance.

So, I was asked to devise a solution to make the MVs available during the complete refreshes. I accomplished that with a not-so-common use of partitions. Here is how I did it.

Prebuilt Table

First, as an architectural standard, I advocate the use of prebuilt tables in MVs. Here is how an MV is usually created:

create materialized view mv_bookings_01
refresh complete on demand
enable query rewrite
as
select hotel_id, sum(book_txn) tot_txn
from bookings
group by hotel_id
/

This creates the segment mv_bookings_01 during this process. Instead of this approach, I recommend first creating and populating a regular table and then converting that to an MV. In this revised approach, you would first create the table:

create table mv_bookings_01
as
select HOTEL_ID, sum(book_txn) tot_txn
from bookings
group by HOTEL_ID
/


Then, you would create the MV using the prebuilt table option:

create materialized view mv_bookings_01
on prebuilt table
refresh complete on demand
enable query rewrite
as
select HOTEL_ID, sum(book_txn) tot_txn
from bookings
group by HOTEL_ID
/

This does not do anything to the functionality of the MV itself. The MV feels, looks and smells like it was before; but with two significant advantages:

(1) You can easily alter the columns
(2) You can manipulate the table

I blogged about it at http://arup.blogspot.com/2009/01/ultra-fast-mv-alteration-using-prebuilt.html. Anyway, back to the issue at hand. Using the prebuilt trick, I can create a partitioned MV as well. In this specific case, I will add a column to partition on. Remember, this column does not actually participate in the application; it simply creates an anchor point for the partition.

create table mv_bookings_01
(
ref_type number,
hotel_id number(3),
tot_txn number
)
partition by list (ref_type)
(
partition p0 values (0),
partition p1 values (1)
)
/

After the table is created, I can insert data into it:

insert /*+ append */ into mv_bookings_01
select 0, hotel_id, sum(book_txn) tot_txn
from bookings
group by 0, hotel_id
/

Of course I could have selected the data from the source tables directly while creating the prebuilt table; but I just wanted to show a different approach of data loading. The column REF_TYPE is not part of the application; but I have added it to divide the table along a value by list partitioning. The column can hold only two values – 0 and 1, and hence the table has only two partitions.

Once the table is created, I can use the prebuilt table option to create the MV as shown above. However, the presence of the new column makes it a little different. My MV creation script now looks like this:

create materialized view mv_bookings_01
on prebuilt table
enable query rewrite
as
select 0 as ref_type, HOTEL_ID, sum(book_txn) tot_txn
from bookings
group by 0, HOTEL_ID
/

Note that I have used “0 as ref_type” in the select clause, i.e. the ref_type will always be 0. This is not going to a problem as you can see later. When the MV is first created, the value of ref_type is 0; hence the partition p0 is the one is that is populated; not the partition p1. We can confirm that:

SQL> select count(1) from mv_bookings_01 partition (p0);

COUNT(1)
----------
2

SQL> select count(1) from mv_bookings_01 partition (p1);

COUNT(1)
----------
0

Now that the partitions are in place, let’s see how we refresh this MV. We will no longer use the dbms_mview approach. Rather, we will use this:

(1) Create a temporary table
(2) Create all the indexes
(3) Exchange the partition that is not used with this temporary table

Step 1: Temporary table

First we create a temporary table that is identical to the prebuilt table of the MV in structure; but just not partitioned. We will use the value of ref_type column as 0 or 1 based on the value already in the table, in fact exactly opposite of what is in the table. Since we have 0 in the table, we will use 1 in the temporary table. This temporary table will contain the data that we need refreshed as.

create table tmp1
as
select 1 as ref_type, hotel_id, sum(book_txn) tot_txn
from bookings
group by 1, hotel_id
/

We also need to create another temporary table with no data.

create table tmp0
(
ref_type number,
hotel_id number(3),
tot_txn number
)
/

Step 2: Create Indexes

In this case we don’t have any indexes on the MV; but if we had, we would have created them here on the TMP0 and TMP1 tables.

Step 3: Exchange Partitions

When the temporary table is ready, we can exchange the inactive partition with this.

alter table mv_bookings_01
exchange partition p1
with table tmp1
including indexes
without validation
update global indexes
/
alter table mv_bookings_01
exchange partition p0
with table tmp0
including indexes
without validation
update global indexes
/

This SQL performs a switch: the segment that was called partition P1 in the table mv_bookings_01 is now called TMP1 and former segment TMP1 is now called partition P1 in the table mv_bookings_01. This occurs at the data dictionary level; no actual data is ever transferred from one segment to the other. As a result this is very quick. A lock is required on the table only for that instance is such an insignificant amount of time that it may not be even noticeable. Similarly the segment used to be called P0 partition is now known as TMP0 table and TMP0 is called P0.

There are several key things to note here. Note the clause “without validation”, which tells Oracle not to bother checking inside the TMP0 table that it will confirm to the partition specification. This is critical for the performance. Since we built the table with “1” as a hardcoded value, we know that the REF_TYPE column will definitely contain 1, satisfying the partition requirement for P1. A further checking is not required.

The “including indexes” clause switched the local index partitions as well. If there is a global index on the MV’s prebuilt table, that would have been invalidated and needed rebuilding afterwards. But, we avoided that invalidation by using a “update global indexes” clause in the alter statement.

Once the switchover is complete, we can check the MV to see if the data has been visible.

SQL> select count(1) from mv_bookings_01 partition (p0)
2 /

COUNT(1)
----------
0

SQL> select count(1) from mv_bookings_01 partition (p1)
2 /

COUNT(1)
----------
2

This completes the refreshing of the MV and the data is visible to end users. Let’s see the timeline

(1) Building of temporary table
(2) Building of indexes on the temporary table
(3) Exchange partitions

Steps 1 and 2 take most of the time; but that is done offline, without affecting the MV itself. So, long time there does not affect the availability of the MV itself. The step 3 is where the availability is impacted; but that is miniscule.

The table TMP0 now has the rows from partition P0 of the MV. To confirm that:

SQL> select count(1) from tmp0;

COUNT(1)
----------
2

You can drop the table by “drop table tmp0 purge”. You should also drop TMP1 since that table will contain the previous contents of the partition P1, which is useless now.

The next time this MV needs refreshing, you have to repeat the process; but with a very important distinction – the partition P0 needs exchange now. I used a script that checks the value of REF_TYPE in the MV now and use the other value passed to an SQL script with a positional parameter that accepts 0 or 1. Using this approach you can refresh a MV very quickly.

Oracle 11g Variation

In Oracle 11g, there is a new feature – Virtual Column. You can define a column as virtual in the table, which is not actually stored in the table; but calculated during runtime. What’s more, this virtual column can also be used as a partition key. You can define the REF_TYPE as a virtual column in that case.

Takeaways

(1) Just because there is an Materialized View, it does not have to be refreshed by the usual approach, i.e. dbms_mview.refresh. There are other options as well.
(2) Partitioning is not just for performance or purge reasons; it can be used in clever ways such as MV refreshes.
(3) In this specific case, we didn’t have a column that could have been used as a partitioning key; so we had to resort to adding a column. In many cases, the table may already have a column for that purpose, making the process even simpler.

The Core Performance Fundamentals Of Oracle Data Warehousing – Partitioning

[back to Introduction] Partitioning is an essential performance feature for an Oracle data warehouse because partition elimination (or partition pruning) generally results in the elimination of a significant amount of table data to be scanned. This results in a need for less system resources and improved query performance. Someone once told me “the fastest I/O is the one that never happens.” This is precisely the reason that partitioning is a must for Oracle data warehouses – it’s a huge I/O eliminator. I frequently refer to partition elimination as the anti-index. An index is used to find a small amount data that is required; partitioning is used to eliminate vasts amounts of data that is not required. Main Uses For Partitioning I would classify the main reasons to use partitioning in your Oracle data warehouse into these four areas: Data Elimination Partition-Wise Joins Manageability (Partition Exchange Load, Local Indexes, etc.) Information Lifecycle Management (ILM) Partitioning Basics The most common partitioning design pattern found in Oracle data warehouses is to partition the fact tables by range (or interval) on the event date/time column. This allows for partition elimination of all the data not in the desired time window in queries. For example: If I have a [...]

Does CREATE INDEX Gather Global Statistics?

You can add the COMPUTE STATISTICS clause to the CREATE INDEX statement. It instructs the SQL statement to gather and store index statistics in the data dictionary, while creating the index. This is useful because the overhead associated with the gathering of statistics while executing this SQL statement is negligible. In Oracle9i, the gathering of [...]

Zero-Size Unusable Indexes and the Query Optimizer

Zero-size unusable indexes and index partions is a small but useful feature of Oracle Database 11g Release 2. Simply put, its aim is to save space in the database by immediately releasing the segment associated to unusable indexes or index partitions. To illustrate this, let’s have a look to an example…

Create a partitioned table, insert [...]

PLAN_HASH_VALUE - How equal (and stable?) are your execution plans - part 1

Oracle provides in recent releases the PLAN_HASH_VALUE information, which according to the documentation, is the following:

"Numerical representation of the SQL plan for the cursor. Comparing one PLAN_HASH_VALUE to another easily identifies whether or not two plans are the same (rather than comparing the two plans line by line)."

So according to the documentation the PLAN_HASH_VALUE can be used as a shortcut to quickly and easily determine if two execution plans are the "same".

I think that the statement "the two execution plans are the same" suggests that execution plans having the same PLAN_HASH_VALUE yield the same or at least similar performance at runtime. An interesting point that I would like to cover here in the following test cases.

This raises the interesting question, what exactly is the PLAN_HASH_VALUE based upon? Obviously it is a hash value calculated using the execution plan information as input. Apart from the fact that hash values are in theory always subject to potential hash collisions, which means that two different inputs can lead to the same hash value, the more interesting question is, which attributes of the execution plan are used as input?

Having this information at hand allows us to get a better understanding if two execution plans with the same PLAN_HASH_VALUE actually have to a yield similar execution profile (which Oracle doesn't say but I assume is a common assumption - or may be misconception?).

Let's start with a simple example to determine what makes the PLAN_HASH_VALUE different. All results shown below come from an 11.1.0.7 Win32 instance with a 8KB default block size, a MSSM 8KB LMT tablespace and default system statistics.

SQL>
SQL> drop table plan_hash_value_test1 purge;

Table dropped.

SQL>
SQL> drop table plan_hash_value_test2 purge;

Table dropped.

SQL>
SQL> drop table plan_hash_value_test3 purge;

Table dropped.

SQL>
SQL> drop user test_user_plan_hash_value cascade;

User dropped.

SQL>
SQL> create user test_user_plan_hash_value
2 identified by test_user_plan_hash_value
3 default tablespace test_8k
4 quota unlimited on test_8k;

User created.

SQL>
SQL> create table plan_hash_value_test1
2 as
3 select
4 id as id1
5 , mod(id, 2) as id2
6 , rpad('x', 20) as small_vc
7 from
8 (
9 select
10 level as id
11 from
12 dual
13 connect by
14 level <= 1000
15 );

Table created.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'plan_hash_value_test1')

PL/SQL procedure successfully completed.

SQL>
SQL> create index plan_hash_value_test1_idx1 on plan_hash_value_test1 (id1, id2);

Index created.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id1 = 1
8 and rownum <= 1;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 3u4sfg1kzqtct, child number 0
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id1 = 1
and rownum <= 1

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 1 | 27 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 1 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
3 - access("ID1"=1)

23 rows selected.

SQL>
SQL> select
2 object#, object_owner, object_name
3 from
4 v$sql_plan
5 where
6 operation = 'INDEX'
7 and sql_id = (select
8 sql_id
9 from
10 v$sql
11 where
12 sql_text like 'select /*+ find_me%'
13 );

OBJECT# OBJECT_OWNER OBJECT_NAME
---------- ------------------------------ ------------------------------
76267 CBO_TEST PLAN_HASH_VALUE_TEST1_IDX1

SQL>
SQL> pause

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> drop index plan_hash_value_test1_idx1;

Index dropped.

SQL>
SQL> create index test_user_plan_hash_value.plan_hash_value_test1_idx1 on plan_hash_value_test1 (id2, id1);

Index created.

SQL>
SQL> exec dbms_stats.set_index_stats('test_user_plan_hash_value', 'plan_hash_value_test1_idx1', numdist=>0, numrows=>0)

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = 1
8 and rownum <= 2;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x
3 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID dsjuqc6fkfffj, child number 0
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id2 = 1
and rownum <= 2

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 2 | 54 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 1 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=2)
3 - access("ID2"=1)

23 rows selected.

SQL>
SQL> select
2 object#, object_owner, object_name
3 from
4 v$sql_plan
5 where
6 operation = 'INDEX'
7 and sql_id = (select
8 sql_id
9 from
10 v$sql
11 where
12 sql_text like 'select /*+ find_me%'
13 );

OBJECT# OBJECT_OWNER OBJECT_NAME
---------- ------------------------------ ------------------------------
76268 TEST_USER_PLAN_HASH_VALUE PLAN_HASH_VALUE_TEST1_IDX1

SQL>
SQL> pause

SQL>
SQL> drop index test_user_plan_hash_value.plan_hash_value_test1_idx1;

Index dropped.

SQL>
SQL> create index plan_hash_value_test1_idx1 on plan_hash_value_test1 (id2, id1);

Index created.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = 1
8 and rownum <= 2;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x
3 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID dsjuqc6fkfffj, child number 0
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id2 = 1
and rownum <= 2

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 2 | 54 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 500 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=2)
3 - access("ID2"=1)

23 rows selected.

SQL>
SQL> select
2 object#, object_owner, object_name
3 from
4 v$sql_plan
5 where
6 operation = 'INDEX'
7 and sql_id = (select
8 sql_id
9 from
10 v$sql
11 where
12 sql_text like 'select /*+ find_me%'
13 );

OBJECT# OBJECT_OWNER OBJECT_NAME
---------- ------------------------------ ------------------------------
76269 CBO_TEST PLAN_HASH_VALUE_TEST1_IDX1

SQL>

Note the subtle differences: We have an index on a simple table, but these are actually three different objects, once owned by a different user and once with a different definition, but all three statements get the same PLAN_HASH_VALUE.

So this simple example already raises some of the most important points:

- The OBJECT_ID of an object obviously doesn't get used for the PLAN_HASH_VALUE calculation. Although present in the V$SQL_PLAN view, checking the PLAN_TABLE definition suggests why: It's not part of the PLAN_TABLE definition. This means that e.g. replacing an index with a different one but keeping the name will result in the same PLAN_HASH_VALUE if the operations and their order remain the same, like demonstrated (Although I have to admit that doing so might be called bad practice).

- The owner of an object doesn't get used either. Why this is so, remains unclear, since it seems to be part of all related object definitions (OBJECT_OWNER attribute in PLAN_TABLE/V$SQL_PLAN), but might be an very important point if you have multiple schemas with objects of the same name but representing different data or data volume. As you can see from the example the PLAN_HASH_VALUE is the same but the number of rows estimated is different due to the different underlying object (and its intentionally manipulated statistics).

- One of the most crucial aspects is also demonstrated: The FILTER_PREDICATES and ACCESS_PREDICATES information is NOT part of the PLAN_HASH_VALUE. This means that two executions plans can have the same PLAN_HASH_VALUE but behave significantly differently at actual execution time.

- What also is obvious from this example is that the ROWS and BYTES information is not used as part of the PLAN_HASH_VALUE. More on this later.

Let's move on to the next example:

SQL>
SQL> variable b2 number
SQL>
SQL> exec :b2 := 1;

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = :b2
8 and rownum <= 1;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 4y6b2pu1hzs3h, child number 0
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id2 = :b2
and rownum <= 1

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 1 | 27 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 500 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
3 - access("ID2"=:B2)

23 rows selected.

SQL>
SQL> variable b2 varchar2(20)
SQL>
SQL> exec :b2 := '1';

PL/SQL procedure successfully completed.

SQL>
SQL> select /*+ find_me
2 index(a, plan_hash_value_test1_idx1) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = :b2
8 and rownum <= 1;

ID1 ID2 SMALL_VC
---------- ---------- --------------------
1 1 x

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 4y6b2pu1hzs3h, child number 1
-------------------------------------
select /*+ find_me index(a, plan_hash_value_test1_idx1) */
* from plan_hash_value_test1 a where id2 = :b2
and rownum <= 1

Plan hash value: 2655295642

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST1 | 1 | 27 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST1_IDX1 | 500 | | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(ROWNUM<=1)
3 - access("ID2"=TO_NUMBER(:B2))

23 rows selected.

SQL>

Whereas the previous examples mainly used different SQLs (and therefore had different SQL_HASH_VALUEs or SQL_IDs) this example uses the same SQL and demonstrates the following:

- Bad application behaviour (in this case different types of bind variables) or other reasons can lead to unshared cursors, i.e. multiple child cursors for the same SQL (not actually the main topic here)

- But although the different cursors have different access predicates (in this case the implicit type conversion), as already shown, different access predicates don't lead to different PLAN_HASH_VALUEs

For this particular statement the difference in the predicates very likely doesn't represent a threat, but there are more subtle cases where these differences can lead to significant changes in behaviour. Possible reasons are different order of the predicate evaluation (e.g. when having system statistics enabled, which is the default from 10g on) which can make a significant difference in CPU usage or general resource consumption depending on the actual cost of the predicate evaluation (e.g. a costly PL/SQL function call), or even the evaluation of (filter) predicates at different steps of the same execution plan which could lead to significant differences in the number of rows generated by each operation step of the execution plan, and therefore make an execution plan much more inefficient due to the larger number of rows processed.

What about the actual estimates associated with the particular operations of an execution plan?

SQL>
SQL> explain plan for
2 select /*+ full(a) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3758120161

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 500 | 13500 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST1 | 500 | 13500 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID2"=1)

13 rows selected.

SQL>
SQL> exec dbms_stats.set_table_stats(null, 'plan_hash_value_test1', numrows=>100000, numblks=>10000)

PL/SQL procedure successfully completed.

SQL>
SQL> explain plan for
2 select /*+ full(a) */
3 *
4 from
5 plan_hash_value_test1 a
6 where
7 id2 = 1;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3758120161

-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 50000 | 1318K| 2733 (1)| 00:00:33 |
|* 1 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST1 | 50000 | 1318K| 2733 (1)| 00:00:33 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID2"=1)

13 rows selected.

SQL>

It's obvious that none of the ROWS, BYTES, COST nor derived information like TIME in the later releases are used to calculate the PLAN_HASH_VALUE. So again this makes clear that the same PLAN_HASH_VALUE of two statements doesn't say anything about the similarity of the runtime performance.

Another example that demonstrates this point with a slightly more complex plan:

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'plan_hash_value_test1')

PL/SQL procedure successfully completed.

SQL>
SQL> create table plan_hash_value_test2
2 as
3 select
4 id as id1
5 , mod(id, 2) as id2
6 , rpad('x', 20) as small_vc
7 from
8 (
9 select
10 level as id
11 from
12 dual
13 connect by
14 level <= 1000
15 );

Table created.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'plan_hash_value_test2')

PL/SQL procedure successfully completed.

SQL>
SQL> create index plan_hash_value_test2_idx1 on plan_hash_value_test2 (id2, id1);

Index created.

SQL>
SQL> explain plan for
2 select /*+
3 use_nl(a, b)
4 index(b)
5 leading(a)
6 */
7 *
8 from
9 plan_hash_value_test1 a
10 , plan_hash_value_test2 b
11 where
12 a.id1 = b.id1
13 and a.id2 = b.id2
14 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 458854847

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 486 | 23 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 9 | 486 | 23 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST2_IDX1 | 1 | | 1 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST2 | 1 | 27 | 2 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
4 - access("A"."ID2"="B"."ID2" AND "A"."ID1"="B"."ID1")
filter("B"."ID1"<=10 AND "B"."ID1">=1)

19 rows selected.

SQL>
SQL> explain plan for
2 select /*+
3 use_nl(a, b)
4 index(b)
5 leading(a)
6 */
7 *
8 from
9 plan_hash_value_test1 a
10 , plan_hash_value_test2 b
11 where
12 a.id2 = b.id2
13 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 458854847

-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5005 | 263K| 74 (2)| 00:00:01 |
| 1 | NESTED LOOPS | | | | | |
| 2 | NESTED LOOPS | | 5005 | 263K| 74 (2)| 00:00:01 |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | PLAN_HASH_VALUE_TEST2_IDX1 | 500 | | 2 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| PLAN_HASH_VALUE_TEST2 | 500 | 13500 | 7 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
4 - access("A"."ID2"="B"."ID2")

18 rows selected.

SQL>

Notice how one statement performs an effective join using appropriate join predicates whereas the other one generates duplicate records. Again, since the filter and access predicates are not evaluated, these two plans get the same PLAN_HASH_VALUE, although we can say that one of them is potentially suboptimal (and usually would result in a significantly different execution plan which in turn would have different PLAN_HASH_VALUEs but there are certainly more complex scenarios where the optimizer goes wrong for whatever reason resulting in such potentially inefficient execution plans).

What about extended execution plan information like partitioning and parallel execution?

Let's first address partitioning:

SQL> create table plan_hash_value_test3
2 (
3 invoice_no number,
4 sale_year integer not null,
5 sale_month integer not null,
6 sale_day integer not null
7 )
8 partition by range (invoice_no)
9 (
10 partition part_001 values less than (100),
11 partition part_002 values less than (400),
12 partition part_003 values less than (800),
13 partition part_004 values less than (maxvalue)
14 );

Table created.

SQL>
SQL> exec dbms_random.seed(0)

PL/SQL procedure successfully completed.

SQL>
SQL> insert into plan_hash_value_test3 (
2 invoice_no,
3 sale_year,
4 sale_month,
5 sale_day
6 )
7 select rownum,
8 2000 + round(dbms_random.value(0, 8)) as sale_year,
9 trunc(dbms_random.value(1, 13)) as sale_month,
10 trunc(dbms_random.value(1, 29)) as sale_day
11 from dual
12 connect by level <= 1000;

1000 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'plan_hash_value_test3')

PL/SQL procedure successfully completed.

SQL>
SQL> exec dbms_stats.set_table_stats(null, 'plan_hash_value_test3', partname=>'part_002', numblks=>100000)

PL/SQL procedure successfully completed.

SQL>
SQL> select
2 partition_name
3 , blocks
4 from
5 user_tab_statistics
6 where
7 table_name = 'PLAN_HASH_VALUE_TEST3';

PARTITION_NAME BLOCKS
------------------------------ ----------
5
PART_001 1
PART_002 100000
PART_003 2
PART_004 1

SQL>
SQL> explain plan for
2 select
3 *
4 from
5 plan_hash_value_test3
6 where
7 invoice_no < 100;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4079248530

----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 99 | 1188 | 2 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 99 | 1188 | 2 (0)| 00:00:01 | 1 | 1 |
| 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST3 | 99 | 1188 | 2 (0)| 00:00:01 | 1 | 1 |
----------------------------------------------------------------------------------------------------------------

9 rows selected.

SQL>
SQL> explain plan for
2 select
3 *
4 from
5 plan_hash_value_test3
6 where
7 invoice_no >= 100 and invoice_no < 400;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 4079248530

----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 300 | 3900 | 27256 (1)| 00:05:28 | | |
| 1 | PARTITION RANGE SINGLE| | 300 | 3900 | 27256 (1)| 00:05:28 | 2 | 2 |
| 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST3 | 300 | 3900 | 27256 (1)| 00:05:28 | 2 | 2 |
----------------------------------------------------------------------------------------------------------------

9 rows selected.

SQL>
SQL> variable b1 number
SQL> variable b2 number
SQL>
SQL> exec :b1 := 0;

PL/SQL procedure successfully completed.

SQL> exec :b2 := 100;

PL/SQL procedure successfully completed.

SQL>
SQL> select
2 count(*)
3 from
4 plan_hash_value_test3
5 where
6 invoice_no >= :b1 and invoice_no < :b2;

COUNT(*)
----------
99

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 23m1jk6dbsvbx, child number 0
-------------------------------------
select count(*) from plan_hash_value_test3 where
invoice_no >= :b1 and invoice_no < :b2

Plan hash value: 2447373661

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)| | | |
| 1 | SORT AGGREGATE | | 1 | 3 | | | | |
|* 2 | FILTER | | | | | | | |
| 3 | PARTITION RANGE ITERATOR| | 99 | 297 | 2 (0)| 00:00:01 | KEY | KEY |
|* 4 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST3 | 99 | 297 | 2 (0)| 00:00:01 | KEY | KEY |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(:B1<:B2)
4 - filter(("INVOICE_NO">=:B1 AND "INVOICE_NO"<:B2))

23 rows selected.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> exec :b1 := 100;

PL/SQL procedure successfully completed.

SQL> exec :b2 := 400;

PL/SQL procedure successfully completed.

SQL>
SQL> select
2 count(*)
3 from
4 plan_hash_value_test3
5 where
6 invoice_no >= :b1 and invoice_no < :b2;

COUNT(*)
----------
300

SQL>
SQL> select * from table(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
SQL_ID 23m1jk6dbsvbx, child number 0
-------------------------------------
select count(*) from plan_hash_value_test3 where
invoice_no >= :b1 and invoice_no < :b2

Plan hash value: 2447373661

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 27256 (100)| | | |
| 1 | SORT AGGREGATE | | 1 | 4 | | | | |
|* 2 | FILTER | | | | | | | |
| 3 | PARTITION RANGE ITERATOR| | 300 | 1200 | 27256 (1)| 00:05:28 | KEY | KEY |
|* 4 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST3 | 300 | 1200 | 27256 (1)| 00:05:28 | KEY | KEY |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(:B1<:B2)
4 - filter(("INVOICE_NO">=:B1 AND "INVOICE_NO"<:B2))

23 rows selected.

SQL>

The PSTART and PSTOP information is not used either for the PLAN_HASH_VALUE, so different execution plans accessing different partitions of the same object might get the same PLAN_HASH_VALUE, but again the runtime performance might be dramatically different.

By the way above example demonstrates that the CBO peeks at the binds and uses then the partition level statistics of the corresponding partition defined by the bind values (in case the bind values prune to a single partition), although the execution plan might be executed with different bind values actually accessing different partitions at runtime.

What about parallel execution?

SQL>
SQL> explain plan for
2 select /*+
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, none, broadcast)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1365899609

-------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 486 | 5 (20)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 9 | 486 | 5 (20)| 00:00:01 | Q1,01 | P->S | QC (RAND) |
|* 3 | HASH JOIN | | 9 | 486 | 5 (20)| 00:00:01 | Q1,01 | PCWP | |
| 4 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWC | |
|* 5 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
| 6 | BUFFER SORT | | | | | | Q1,01 | PCWC | |
| 7 | PX RECEIVE | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
| 8 | PX SEND BROADCAST | :TQ10000 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | P->P | BROADCAST |
| 9 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWC | |
|* 10 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST2 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWP | |
-------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
5 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
10 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

24 rows selected.

SQL>
SQL> explain plan for
2 select /*+
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, broadcast, none)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3903716067

------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 486 | 5 (20)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 9 | 486 | 5 (20)| 00:00:01 | Q1,01 | P->S | QC (RAND) |
|* 3 | HASH JOIN | | 9 | 486 | 5 (20)| 00:00:01 | Q1,01 | PCWP | |
| 4 | PX RECEIVE | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
| 5 | PX SEND BROADCAST | :TQ10000 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | P->P | BROADCAST |
| 6 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWC | |
|* 7 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST1 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWP | |
| 8 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWC | |
|* 9 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST2 | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
7 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
9 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

23 rows selected.

SQL>
SQL> explain plan for
2 select /*+
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, hash, hash)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 904614956

------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 486 | 5 (20)| 00:00:01 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10002 | 9 | 486 | 5 (20)| 00:00:01 | Q1,02 | P->S | QC (RAND) |
|* 3 | HASH JOIN BUFFERED | | 9 | 486 | 5 (20)| 00:00:01 | Q1,02 | PCWP | |
| 4 | PX RECEIVE | | 10 | 270 | 2 (0)| 00:00:01 | Q1,02 | PCWP | |
| 5 | PX SEND HASH | :TQ10000 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | P->P | HASH |
| 6 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWC | |
|* 7 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST1 | 10 | 270 | 2 (0)| 00:00:01 | Q1,00 | PCWP | |
| 8 | PX RECEIVE | | 10 | 270 | 2 (0)| 00:00:01 | Q1,02 | PCWP | |
| 9 | PX SEND HASH | :TQ10001 | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | P->P | HASH |
| 10 | PX BLOCK ITERATOR | | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWC | |
|* 11 | TABLE ACCESS FULL| PLAN_HASH_VALUE_TEST2 | 10 | 270 | 2 (0)| 00:00:01 | Q1,01 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
7 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
11 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

25 rows selected.

SQL>

The example uses different parallel distribution options for the same execution plan. In 10g and later this is reflected in different operations (like PX SEND BROADCAST) and this suggests that the PLAN_HASH_VALUEs are going to be different due to the different operations.

Running a similar test case on 9.2.0.8 shows it seems that actually some of the attributes related to parallel execution are also used to calculate the PLAN_HASH_VALUE:

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> explain plan for
2 select /*+ find_me
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, none, broadcast)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 270 | 6 (34)| | | |
|* 1 | HASH JOIN | | 5 | 270 | 6 (34)| 55,01 | P->S | QC (RAND) |
|* 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (34)| 55,01 | PCWP | |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST2 | 10 | 270 | 3 (34)| 55,00 | P->P | BROADCAST |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
2 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
3 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

16 rows selected.

SQL>
SQL> rollback;

Rollback complete.

SQL>
SQL> select /*+ find_me
2 full(a)
3 full(b)
4 use_hash(a, b)
5 parallel(a)
6 parallel(b)
7 pq_distribute(b, none, broadcast)
8 */
9 *
10 from
11 plan_hash_value_test1 a
12 , plan_hash_value_test2 b
13 where
14 a.id1 = b.id1
15 and a.id2 = b.id2
16 and a.id1 between 1 and 10;

ID1 ID2 SMALL_VC ID1 ID2 SMALL_VC
---------- ---------- -------------------- ---------- ---------- --------------------
1 1 x 1 1 x
2 0 x 2 0 x
3 1 x 3 1 x
4 0 x 4 0 x
5 1 x 5 1 x
6 0 x 6 0 x
7 1 x 7 1 x
8 0 x 8 0 x
9 1 x 9 1 x
10 0 x 10 0 x

10 rows selected.

SQL>
SQL> select
2 plan_hash_value
3 , hash_value
4 from
5 v$sql
6 where
7 sql_text like 'select /*+ find_me%';

PLAN_HASH_VALUE HASH_VALUE
--------------- ----------
1709875781 508205717

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> explain plan for
2 select /*+ find_me
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, broadcast, none)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 270 | 6 (34)| | | |
|* 1 | HASH JOIN | | 5 | 270 | 6 (34)| 57,01 | P->S | QC (RAND) |
|* 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (34)| 57,00 | P->P | BROADCAST |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST2 | 10 | 270 | 3 (34)| 57,01 | PCWP | |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
2 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
3 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

16 rows selected.

SQL>
SQL> rollback;

Rollback complete.

SQL>
SQL> select /*+ find_me
2 full(a)
3 full(b)
4 use_hash(a, b)
5 parallel(a)
6 parallel(b)
7 pq_distribute(b, broadcast, none)
8 */
9 *
10 from
11 plan_hash_value_test1 a
12 , plan_hash_value_test2 b
13 where
14 a.id1 = b.id1
15 and a.id2 = b.id2
16 and a.id1 between 1 and 10;

ID1 ID2 SMALL_VC ID1 ID2 SMALL_VC
---------- ---------- -------------------- ---------- ---------- --------------------
1 1 x 1 1 x
2 0 x 2 0 x
3 1 x 3 1 x
4 0 x 4 0 x
5 1 x 5 1 x
6 0 x 6 0 x
7 1 x 7 1 x
8 0 x 8 0 x
9 1 x 9 1 x
10 0 x 10 0 x

10 rows selected.

SQL>
SQL> select
2 plan_hash_value
3 , hash_value
4 from
5 v$sql
6 where
7 sql_text like 'select /*+ find_me%';

PLAN_HASH_VALUE HASH_VALUE
--------------- ----------
441284116 3983849749

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> alter system flush shared_pool;

System altered.

SQL>
SQL> explain plan for
2 select /*+ find_me
3 full(a)
4 full(b)
5 use_hash(a, b)
6 parallel(a)
7 parallel(b)
8 pq_distribute(b, hash, hash)
9 */
10 *
11 from
12 plan_hash_value_test1 a
13 , plan_hash_value_test2 b
14 where
15 a.id1 = b.id1
16 and a.id2 = b.id2
17 and a.id1 between 1 and 10;

Explained.

SQL>
SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 270 | 6 (34)| | | |
|* 1 | HASH JOIN | | 5 | 270 | 6 (34)| 59,02 | P->S | QC (RAND) |
|* 2 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST1 | 10 | 270 | 3 (34)| 59,00 | P->P | HASH |
|* 3 | TABLE ACCESS FULL | PLAN_HASH_VALUE_TEST2 | 10 | 270 | 3 (34)| 59,01 | P->P | HASH |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("A"."ID1"="B"."ID1" AND "A"."ID2"="B"."ID2")
2 - filter("A"."ID1"<=10 AND "A"."ID1">=1)
3 - filter("B"."ID1"<=10 AND "B"."ID1">=1)

16 rows selected.

SQL>
SQL> rollback;

Rollback complete.

SQL>
SQL> select /*+ find_me
2 full(a)
3 full(b)
4 use_hash(a, b)
5 parallel(a)
6 parallel(b)
7 pq_distribute(b, hash, hash)
8 */
9 *
10 from
11 plan_hash_value_test1 a
12 , plan_hash_value_test2 b
13 where
14 a.id1 = b.id1
15 and a.id2 = b.id2
16 and a.id1 between 1 and 10;

ID1 ID2 SMALL_VC ID1 ID2 SMALL_VC
---------- ---------- -------------------- ---------- ---------- --------------------
1 1 x 1 1 x
3 1 x 3 1 x
5 1 x 5 1 x
8 0 x 8 0 x
10 0 x 10 0 x
2 0 x 2 0 x
4 0 x 4 0 x
6 0 x 6 0 x
7 1 x 7 1 x
9 1 x 9 1 x

10 rows selected.

SQL>
SQL> select
2 plan_hash_value
3 , hash_value
4 from
5 v$sql
6 where
7 sql_text like 'select /*+ find_me%';

PLAN_HASH_VALUE HASH_VALUE
--------------- ----------
353704519 1797852549

SQL>

Although the main operations and their order of execution stays the same, the calculated PLAN_HASH_VALUE is different in all three cases.

So in summary the following conclusions can be made:

- The same PLAN_HASH_VALUE is merely an indicator that the same operations on objects of the same name are performed in the same order.

- It tells nothing about the similarity of the expected runtime performance of the execution plan, due to various reasons as demonstrated. The most significant information that is not covered by the PLAN_HASH_VALUE are the filter and access predicates, but there are other attributes, too, that are not part of the hash value calculation.

- Still it's very likely that for the same SQL statement and the same unchanged underlying data the same PLAN_HASH_VALUE indicates similar expected runtime performance, except for some particular cases where the execution plan itself already contains "conditional" execution paths which might be taken by the runtime engine and therefore lead to significantly different runtimes. Some of these examples have already been covered by Jonathan Lewis: Conditional Plan and Hierarchical Queries in some versions of Oracle.

In the second part I'll demonstrate a possible approach how to calculate your own PLAN_HASH_VALUE that covers some of the information omitted if you suspect that you might have encountered a scenario where the same PLAN_HASH_VALUE suggests similar runtime performance but there are significant differences in the execution plans that you would like to have covered by the hash value calculation to detect these plan changes easily.

Dynamic sampling and partitioned tables

------------------------------------------------------------------------------

Update January 2010: A thread on OTN mentioned this blog post and another, actually contradicting blog post by Asif Momen.

So why are these two blog posts coming to different conclusions regarding Dynamic Sampling and partitions with missing statistics?

This is the good thing with documented test cases - I reproduced what Asif has done and found out that the significant difference between these two test cases is the existence of global level statistics.

In my test case below, I have explicitly gathered statistics only on partition level, and there are no statistics on global level/table (which can be seen from the output of the query against user_tab_statistics below).

Asif has actually gathered statistics on global/table level which can be seen from his blog post.

So the conclusion seems to be: If you prune to a single partition, but this partition has no statistics, then Dynamic sampling will be used if no global/table level statistics are available. If global/table level statistics are available, the optimizer won't perform dynamic sampling and revert to these global/table level statistics instead.

Oddly this obviously doesn't apply to the subpartition/partition level case: Repeating a similar setup with subpartitions having no statistics, but statistics on partition level are available, Dynamic Sampling still was used (tested on 11.1.0.7 Win32).

------------------------------------------------------------------------------