Search

OakieTags

Who's online

There are currently 0 users and 23 guests online.

Recent comments

Affiliations

performance

Speaking engagements 2011

Despite my attempts to not travel (much) anymore, I’m still going to speak at two events in first half of 2011.

The first event is at the Suncoast Oracle User Group meeting in Tampa, FL on Thursday, 27th January. I’ll speak about Advanced Oracle Troubleshooting in 60 minutes.

I will deliver the second presentation at Hotsos Symposium 2011 near Dallas, TX on Wednesday 9th March. I will talk about Troubleshooting the Most Complex Oracle Performance Problem I’ve ever seen. If you think that this performance problem was complex, then be prepared to see an even worse one there!

Share

Describing Performance Improvements (Beware of Ratios)

Recently, I received into my Spam folder an ad claiming that a product could “...improve performance 1000%.” Claims in that format have bugged me for a long time, at least as far back as the 1990s, when some of the most popular Oracle “tips & techniques” books of the era used that format a lot to state claims.

Beware of claims worded like that.

Whenever I see “...improve performance 1000%,” I have to do extra work to decode what the author has encoded in his tidy numerical package with a percent-sign bow. The two performance improvement formulas that make sense to me are these:

Viewing Runtime Load Balancing Events

Yesterday I have run a benchmark on a 2 node RAC cluster (ProLiant BL685c G6 with 4 Six-Core AMD Opteron 8431) and 32G RAM each. It’s running Oracle Grid Infrastructure 11.2.0.2 as well as an Oracle 11.2.0.2 database on Oracle Enterprise Linux 5.5 64bit and device-mapper-multipath.

I was testing how the system would react under load but also wanted to see if the Runtime Load Balancing was working. The easiest way to check this is to view the AQ events that are generated for a service if AQ HA notifications is set to true. They can either be dequeued from the database as described in chapter 11 of Pro Oracle Database 11g RAC on Linux or alternatively queried from the database. The latter is the quicker method and this article will focus on it.

Before you can make use of Runtime Load Balancing you need to set at least 2 properties in your service:

  • Connection Load Balancing Goal (either SHORT or LONG)
  • Runtime Load Balancing Goal (SERVICE_TIME or THROUGHPUT)

.Net applications require AQ HA notifications to be set to true as these can’t directly make use of Fast Application Notification (FAN) events as said in the introduction. My JDBC application is fully capable of using the FAN events, however as you will see later I am using the AQ notifications anyway to view the events.

Connected as the owner of the Oracle binaries, I created a new service to make use of both instances:

$ srvctl add service -d TEST -s TESTSRV -r TEST1,TEST2 -P BASIC  \
> -l PRIMARY -y MANUAL -q true -x false -j short -B SERVICE_TIME \
> -e SESSION -m BASIC -z 0 -w 0

The service TESTSRV for database TEST has TEST1 and TEST2 as preferred instances, and the service should be started (manually) when the database is in the primary role. AQ Notifications are enabled, and I chose the connection load balancing goal to be “short” (usually ok with web applications and connection pooling) and a runtime load balancing goal of service time (should also be appropriate for many short transactions typical for a web environment). The remaining paramters define Transparent Application Failover. Please refer to the output of “srvctl add service -h” for more information about the command line parameters.

The result of this endavour can be viewed with srvctl config service:

$ srvctl config service -d TEST -s TESTSRV
Service name: TESTSRV
Service is enabled
Server pool: TEST_TESTSRV
Cardinality: 2
Disconnect: false
Service role: PRIMARY
Management policy: MANUAL
DTP transaction: false
AQ HA notifications: true
Failover type: SESSION
Failover method: BASIC
TAF failover retries: 0
TAF failover delay: 0
Connection Load Balancing Goal: SHORT
Runtime Load Balancing Goal: SERVICE_TIME
TAF policy specification: BASIC
Edition:
Preferred instances: TEST1,TEST2
Available instances:

So to begin with I created the order entry schema (SOE) in preparation of a swingbench run. (I know that Swingbench’s Order Entry is probably not the best benchmark out there but my client knows and likes it). Once about 10G of data were generated I started a swingbench run with 300 users, and reasonably low think time (min transaction time 20ms and max of 60ms). The connect string was //scan1.example.com:1800/TESTSRV

A query against gv$session showed an even balance of sessions, which was good:

select count(inst_id), inst_id
 from gv$session
where username = 'SOE'
group by inst_id

However, whatever I did I couldn’t get the Runtime Load Balancing in sys.sys$service_metrics_tab to chanage. They always looked like this (column user_data):

{instance=TEST1 percent=50 flag=UNKNOWN aff=TRUE}{instance=TEST2 percent=50 flag=UNKNOWN aff=TRUE} }

That sort of made sense as none of the nodes broke into a sweat-the system was > 50% idle with a load average of about 12. So that wouldn’t cut it. Instead of trying to experiment with the Swingbench parameters, I decided to revert back to the silly CPU burner: a while loop which generates random numbers. I wasn’t interested in I/O at this stage, and created this minimal script:

$ cat dothis.sql
declare
 n number;
begin
 while (true) loop
 n:= dbms_random.random();
 end loop;
end;
/

A simple for loop can be used to start the load test:

$ for i in $(seq 30); do
> sqlplus soe/soe@scan1.example.com:1800/TESTSRV @dothis &
done

This created an even load on both nodes. I then started another 20 sessions on node1 against TEST1 to trigger the change in behaviour. And fair enough, the top few lines of “top” revealed the difference. The output for node 1 was as follows:


top - 10:59:30 up 1 day, 21:16,  6 users,  load average: 42.44, 20.23, 10.07
Tasks: 593 total,  48 running, 545 sleeping,   0 stopped,   0 zombie
Cpu(s): 99.9%us,  0.1%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32960688k total, 11978912k used, 20981776k free,   441260k buffers
Swap: 16777208k total,        0k used, 16777208k free,  8078336k cached

Whereas node 2 was relatively idle.

top - 10:59:22 up 5 days, 17:45,  4 users,  load average: 15.80, 10.53, 5.74
Tasks: 631 total,  16 running, 605 sleeping,  10 stopped,   0 zombie
Cpu(s): 58.8%us,  0.6%sy,  0.0%ni, 40.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32960688k total, 11770080k used, 21190608k free,   376672k buffers
Swap: 16777208k total,        0k used, 16777208k free,  7599496k cached

Would that imbalance finally make a difference? It did, as the user_data column (truncated here for better readability) reveals:

SQL> select user_data
2  from sys.sys$service_metrics_tab
3  order by enq_time desc;

{instance=TEST1 percent=4 flag=GOOD aff=TRUE}{instance=TEST2 percent=96 flag=GOOD aff=TRUE} } timestamp=2011-01-20 11:01:16')
{instance=TEST1 percent=6 flag=GOOD aff=TRUE}{instance=TEST2 percent=94 flag=GOOD aff=TRUE} } timestamp=2011-01-20 11:00:46')
{instance=TEST1 percent=10 flag=GOOD aff=TRUE}{instance=TEST2 percent=90 flag=GOOD aff=TRUE} } timestamp=2011-01-20 11:00:16')
{instance=TEST1 percent=18 flag=GOOD aff=TRUE}{instance=TEST2 percent=82 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:59:46')
{instance=TEST1 percent=28 flag=GOOD aff=TRUE}{instance=TEST2 percent=72 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:59:16')
{instance=TEST1 percent=35 flag=GOOD aff=TRUE}{instance=TEST2 percent=65 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:58:46')
{instance=TEST1 percent=40 flag=GOOD aff=TRUE}{instance=TEST2 percent=60 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:58:16')
{instance=TEST1 percent=43 flag=GOOD aff=TRUE}{instance=TEST2 percent=57 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:57:46')
{instance=TEST1 percent=44 flag=GOOD aff=TRUE}{instance=TEST2 percent=56 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:57:16')
{instance=TEST1 percent=48 flag=GOOD aff=TRUE}{instance=TEST2 percent=52 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:56:46')
{instance=TEST1 percent=49 flag=GOOD aff=TRUE}{instance=TEST2 percent=51 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:56:16')
{instance=TEST1 percent=50 flag=GOOD aff=TRUE}{instance=TEST2 percent=50 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:55:46')
{instance=TEST1 percent=50 flag=GOOD aff=TRUE}{instance=TEST2 percent=50 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:55:16')
{instance=TEST1 percent=50 flag=UNKNOWN aff=TRUE}{instance=TEST2 percent=50 flag=UNKNOWN aff=TRUE} } timestamp=2011-01-20 10:54:46')
{instance=TEST1 percent=50 flag=UNKNOWN aff=TRUE}{instance=TEST2 percent=50 flag=UNKNOWN aff=TRUE} } timestamp=2011-01-20 10:54:16')

Where it was initially even at 50-50 it soon became imbalanced, and TEST2 would be preferred after a few minutes in the test.  So everything was working as expected, I just didn’t manage to put enough load on the system intially.

New cursor_bind_capture_destination parameter in Oracle 11.2.0.2

I just noticed that there’s a new cursor_bind_capture_destination parameter in Oracle 11.2.0.2 (which is really more like Oracle 11gR3 version because of the large amount of new features in it, as opposed to just bugfixes).

This parameter allows you to save some SYSAUX tablespace disk space – if the occasionally captured bind variable values (from V$SQL_BIND_DATA) take too much space. Normally these bind values (in a packed RAW form) are visible in DBA_HIST_SQLSTAT.BIND_DATA column, which can take up to 2kB per statement in a snapshot – it’s stored as RAW(2000). Of course a more convenient way to query the actual bind values is to use DBA_HIST_SQLBIND (you can also use DBMS_SQLTUNE.EXTRACT_BIND function for translating the raw payload to meaningful values).

So, if you choose to capture a lot of SQL statements per AWR snapshot (it’s configurable) and don’t really care about the sampled bind variable values and want to save the disk space, then you can set cursor_bind_capture_destination = MEMORY or OFF (if you don’t want to capture bind variable values at all).

I’m using my pvalid.sql script for checking its valid values (it’s based on the X$ table underlying V$PARAMETER_VALID_VALUES view, so I could see undocumented parameter valid values too):

SQL> @pvalid cursor_bind_capture_destination
Display valid values for multioption parameters matching “cursor_bind_capture_destination”…

  PAR# PARAMETER                                                 ORD VALUE         
—— ————————————————– ———- —————
  2062 cursor_bind_capture_destination                             2 MEMORY
       cursor_bind_capture_destination                             3 MEMORY+DISK
       cursor_bind_capture_destination                             1 OFF

The default is MEMORY+DISK (this is essentially what you get before 11.2.0.2 and you can’t turn it off unless you turn off the AWR flushing of the whole SQLSTATS metrics).

Share

Solaris Eye for the Linux Guy… Part II (oprofile, Dtrace, and Oracle Event Trace)

Proper tool for the job

My grandfather used to say to me: “Use the proper tool for the job”.  This is important to keep in mind when faced with performance issues.  When I am faced with performance problems in Oracle, I typically start at a high level with AWR reports or Enterprise Manager to get a high level understanding of the workload.   To drill down further, the next step is to use Oracle “10046 event” tracing.  Cary Millsap created a methodology around event tracing called “Method-R” which shows how to focus in on the source of a performance problem by analyzing the components that contribute to response time.   These are all fine places to start to analyze performance problems from the “user” or “application” point of view.  But what happens if the OS is in peril?

Hash Aggregation

Oracle introduced in Oracle 10g the hash aggregation as a new feature. It can be used for both GROUP BY and UNIQUE operations (HASH GROUP BY and HASH UNIQUE respectively) and is by default the preferred aggregation method if there is no particular reason that lets the cost based optimizer prefer the sort based aggregation (SORT GROUP BY and SORT UNIQUE), for example if the GROUP BY is followed by an ORDER BY on the same expression (using the SORT GROUP BY in such cases is not always beneficial by the way, see Guy Harrison's blog for an example).

Ever since its introduction from time to time I've heard complaints about performance degradations of aggregation operations that are based on the new hash aggregation algorithm compared to the previously used sort based aggregations.

Now there may be many potential reasons for such performance degradations (and possibly many of them might not have anything to do with the hash aggregation) but here is a surprising revelation that might explain why some of them were indeed caused by the switch to the new algorithm: The hash aggregation operation does not work very well together with the automatic PGA management (WORKAREA_SIZE_POLICY = AUTO, default since 9i). The fundamental defect is that it is not able to dynamically resize to a larger workarea size when using automatic PGA management and therefore remains more or less at its initial expected size based on the estimates at optimization time.

This effectively means that the efficiency of the hash aggregation operation when using automatic PGA management is heavily dependant on the cardinality estimates at optimization time - in case of estimates in the right ballpark, the memory used at execution time will correspond to the actual requirements at runtime, but in case of bad estimates, the operation potentially uses far less memory than available and unnecessarily spills to disk.

Let's start with a simple script to demonstrate the issue:

set echo on timing on

show parameter pga
show parameter processes

-- alter session set workarea_size_policy = manual sort_area_size = 40000000;

drop table t1 purge;

drop table t2 purge;

create table t1
as
select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

create table t2
as
select distinct user_id from t1;

exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

column low_val new_value low_value
column high_val new_value high_value

select
max(user_id) + 1 as low_val
, max(user_id) + max(user_id) - min(user_id) + 1 as high_val
from
t2;

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

-- alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

-- alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_hash(@inner t1)
no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner)
*/
max(cnt)
from
(
select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
group by t1.object_id, t3.user_id
)
;

-- alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

set pagesize 14

-- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
declare
srec dbms_stats.statrec;
novals dbms_stats.numarray;
distcnt number;
avgclen number;
nullcnt number;
density number;
srec2 dbms_stats.statrec;
begin
dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
srec2.epc := 2;
novals := dbms_stats.numarray(
&low_value,
&high_value
);
srec2.bkvals := null;
dbms_stats.prepare_column_values(srec2,novals);
dbms_stats.set_column_stats(
ownname=>null,
tabname=>'t1',
colname=>'user_id',
distcnt=>distcnt,
nullcnt=>nullcnt,
srec=>srec2,
avgclen=>avgclen,
density=>density
);
end;
/

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

-- alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

-- alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_hash(@inner t1)
no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner)
*/
max(cnt)
from
(
select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
group by t1.object_id, t3.user_id
)
;

-- alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

I first start with showing the current PGA_AGGREGATE_TARGET (P_A_T) and PROCESSES setting.

I create then two tables as sample data that will be joined and already collect information about the minimum and maximum value of the join column.

Any cursors that are dependent on table T1 will then be invalidated to make sure that a reoptimization of the next query will take place (of course running the complete script invalidates such cursors anyway due to the drop and re-create of the tables).

I then run a simple join followed by a group by operation. I've added a cartesian join by the way, but its only purpose is to ensure that the generated row source is sufficiently large to give the group by something to do.

I have used hints to ensure that I always get the same execution plan even if I later on manipulate the statistics to see the effect of incorrect cardinality estimates.

After getting the actual runtime execution plan along with execution statistics I modify the column statistics of one of the tables' join column in such a way that the optimizer thinks that there is no overlap between the join column values and therefore computes a very low join cardinality.

After making sure again that a reoptimization will take place I run the same statement again with the same data volume and the same hints in place.

And this is what I get for all major versions that are currently out there (10.2.0.4, 10.2.0.5, 11.1.0.7, 11.2.0.1 and 11.2.0.2):

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150
SQL>
SQL> -- alter session set workarea_size_policy = manual sort_area_size = 40000000;
SQL>
SQL> drop table t1 purge;
drop table t1 purge
*
ERROR at line 1:
ORA-00942: table or view does not exist

Elapsed: 00:00:00.01
SQL>
SQL> drop table t2 purge;
drop table t2 purge
*
ERROR at line 1:
ORA-00942: table or view does not exist

Elapsed: 00:00:00.00
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.07
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
27 53

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.04
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_correct_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.71
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner
t2 t3 t1) use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner
t1) use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */ count(*)
as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id,
t3.user_id )

Plan hash value: 3134842094

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.72 | 139 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.72 | 139 | | | |
| 2 | VIEW | | 1 | 570K| 810K|00:00:00.43 | 139 | | | |
| 3 | HASH GROUP BY | | 1 | 570K| 810K|00:00:00.43 | 139 | 35M| 5521K| 38M (0)|
|* 4 | HASH JOIN | | 1 | 807K| 810K|00:00:00.02 | 139 | 1348K| 1348K| 1093K (0)|
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29907 | 30000 |00:00:00.01 | 55 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

29 rows selected.

Elapsed: 00:00:00.12
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 27,
old 14: &high_value
new 14: 53

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.03
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:01.37
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1) use_hash_aggregation(@inner)
*/ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id =
t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 3134842094

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.37 | 139 | 2715 | 2715 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.37 | 139 | 2715 | 2715 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.62 | 139 | 2715 | 2715 | | | | |
| 3 | HASH GROUP BY | | 1 | 1 | 810K|00:00:00.62 | 139 | 2715 | 2715 | 35M| 5521K| 7010K (1)| 23552 |
|* 4 | HASH JOIN | | 1 | 1 | 810K|00:00:00.01 | 139 | 0 | 0 | 1348K| 1348K| 1167K (0)| |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29907 | 30000 |00:00:00.01 | 55 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

28 rows selected.

Elapsed: 00:00:00.03
SQL>

As it can be seen with an estimate in the right ballpark the HASH GROUP BY operation completes in memory (very close to the 20% PGA_AGGREGATE_TARGET maximum size of a single workarea from 10.2 on with automatic PGA management for PGA_AGGREGATE_TARGET < 500M, for more information see Joze Senegacnik's paper on the internals of automatic PGA management).

However repeating exactly the same operation with the fudged statistics it spills to disk and uses only 7M, although it could have used up to 40M (given that there is no concurrent workload).

The same does not happen when repeating this experiment with other operations that use a workarea - the most obvious one being a SORT GROUP BY, as can be seen from this output:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 15) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.03
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.09
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
15 29

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_correct_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.54
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 94tqnspjuzk8x, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner
t2 t3 t1) use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner
t1) no_use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */
count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by
t1.object_id, t3.user_id )

Plan hash value: 2068941295

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.53 | 103 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.53 | 103 | | | |
| 2 | VIEW | | 1 | 315K| 450K|00:00:00.37 | 103 | | | |
| 3 | SORT GROUP BY | | 1 | 315K| 450K|00:00:00.37 | 103 | 28M| 1913K| 25M (0)|
|* 4 | HASH JOIN | | 1 | 446K| 450K|00:00:00.01 | 103 | 1348K| 1348K| 1097K (0)|
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29787 | 30000 |00:00:00.01 | 55 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

29 rows selected.

Elapsed: 00:00:00.03
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 15,
old 14: &high_value
new 14: 29

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';
SQL>
SQL> -- alter session set "_smm_trace" = 65535;
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.56
SQL>
SQL> -- alter session set "_smm_trace" = 0;
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 94tqnspjuzk8x, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner
t2 t3 t1) use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner
t1) no_use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */
count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by
t1.object_id, t3.user_id )

Plan hash value: 2068941295

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.55 | 103 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.55 | 103 | | | |
| 2 | VIEW | | 1 | 1 | 450K|00:00:00.38 | 103 | | | |
| 3 | SORT GROUP BY | | 1 | 1 | 450K|00:00:00.38 | 103 | 28M| 1913K| 25M (0)|
|* 4 | HASH JOIN | | 1 | 1 | 450K|00:00:00.01 | 103 | 1348K| 1348K| 1053K (0)|
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29787 | 30000 |00:00:00.01 | 55 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

29 rows selected.

Elapsed: 00:00:00.03
SQL>

All I've done is to reduce the data set, because the SORT GROUP BY in this case required more memory for the same amount of data than the HASH GROUP BY, in order to prevent the operation to spill to disk with a 200M PGA_AGGREGATE_TARGET setting and change the USE_HASH_AGGREGATION hint to NO_USE_HASH_AGGREGATION.

As you can see, the operation completes both times in memory and uses the same amount of memory no matter what the estimates look like.

I've tested the serial execution of the most common workarea based operations like HASH JOIN, SORT ORDER BY, WINDOW SORT, SORT UNIQUE and all of them were able to dynamically resize the workarea in cases where the initial estimated size was too small.

If you carefully check then you'll notice that I haven't mentioned the HASH UNIQUE operation yet, and later on you'll see why.

A cunning feature of the automatic PGA management comes to help, however, which is a kind of "feedback loop" for the workareas based on statistics maintained by the automatic PGA memory management, and indeed when the HASH GROUP BY cursor based on the incorrect cardinality estimate gets shared (not invalidated) and re-executed, the next execution will look like this:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality_repeated_execution';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.76
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id,
t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 3134842094

------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.75 | 139 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.75 | 139 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.44 | 139 | | | | |
| 3 | HASH GROUP BY | | 1 | 1 | 810K|00:00:00.44 | 139 | 35M| 5521K| 38M (0)| |
|* 4 | HASH JOIN | | 1 | 1 | 810K|00:00:00.01 | 139 | 1348K| 1348K| 1412K (0)| |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29836 | 30000 |00:00:00.01 | 55 | | | | |
------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

28 rows selected.

Elapsed: 00:00:00.03
SQL>

So Oracle this time based the workarea memory requirements on the feedback from the previous execution and therefore allocated sufficient memory to complete the operation without spilling to disk.

Notice however that the feedback loop unfortunately does not work as desired when the workarea execution fails due to insufficient TEMP space. Ideally the feedback loop should allow the subsequent executions to grab more memory specifically in such cases, but at present it doesn't - the failure obviously prevents an update of the statistics and therefore subsequent executions continue to fail since they still use the same amount of memory.

You can test this by simply assigned a very small TEMP tablespace to the user executing the query so that the second execution fails due to insufficient TEMP space. If you repeat the execution in this case, it will fail again and keeps doing so.

What happens if I invalidate the cursor and repeat the execution?

SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.01
SQL> set echo on timing on
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality_repeated_exec_invalid';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;
1

Elapsed: 00:00:01.29
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1) use_hash_aggregation(@inner)
*/ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id =
t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 3134842094

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.28 | 139 | 2715 | 2715 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.28 | 139 | 2715 | 2715 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.61 | 139 | 2715 | 2715 | | | | |
| 3 | HASH GROUP BY | | 1 | 1 | 810K|00:00:00.61 | 139 | 2715 | 2715 | 35M| 5521K| 7067K (1)| 23552 |
|* 4 | HASH JOIN | | 1 | 1 | 810K|00:00:00.01 | 139 | 0 | 0 | 1348K| 1348K| 1129K (0)| |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
| 8 | TABLE ACCESS FULL | T1 | 1 | 29836 | 30000 |00:00:00.01 | 55 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T1"."USER_ID"="T2"."USER_ID")

28 rows selected.

Elapsed: 00:00:00.03
SQL>

Back to square one - with the invalidation the statistics are gone, too, so the bad cardinality estimate again lead to the suboptimal execution of the HASH GROUP BY.

So you might think, why bother? Only the first execution of a cursor without workarea execution statistics will be affected by the problem, subsequent execution of the same cursor will benefit from the statistics from the previous executions.

The problem however is, that this is fine for applications that share cursors. Unfortunately applications that peform heavy duty aggregations like data warehouses typically do not share cursors, since they do not care about the optimization overhead and deliberately use literals to provide as much information to the optimizer as possible.

Also these heavy duty aggregations usually use Parallel Execution features, and as you'll see from the output of the same test case, if I run the HASH GROUP BY in parallel by simply setting table T1 to parallel degree 2, a similar problem occurs - so Parallel Execution is affected as well.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 200M
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 150
SQL>
SQL> -- alter session set workarea_size_policy = manual sort_area_size = 40000000;
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.01
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> alter table t1 parallel 2;

Table altered.

Elapsed: 00:00:00.03
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.03
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
27 53

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:00.42
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_hash(@inner t1) no_swap_join_inputs(@inner t1)
use_hash_aggregation(@inner) */ max(cnt) from ( select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id,
t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 464898991

--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | O/1/M |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.43 | 87 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.43 | 87 | | | |
| 2 | PX COORDINATOR | | 1 | | 2 |00:00:00.43 | 87 | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10002 | 0 | 1 | 0 |00:00:00.01 | 0 | | | |
| 4 | SORT AGGREGATE | | 2 | 1 | 2 |00:00:00.81 | 0 | | | |
| 5 | VIEW | | 2 | 575K| 810K|00:00:00.48 | 0 | | | |
| 6 | HASH GROUP BY | | 2 | 575K| 810K|00:00:00.48 | 0 | 35M| 5521K| 2/0/0|
| 7 | PX RECEIVE | | 2 | 814K| 810K|00:00:00.01 | 0 | | | |
| 8 | PX SEND HASH | :TQ10001 | 0 | 814K| 0 |00:00:00.01 | 0 | | | |
|* 9 | HASH JOIN | | 2 | 814K| 810K|00:00:00.01 | 89 | 1348K| 1348K| 2/0/0|
| 10 | BUFFER SORT | | 2 | | 1458 |00:00:00.01 | 0 | 29696 | 29696 | 2/0/0|
| 11 | PX RECEIVE | | 2 | 729 | 1458 |00:00:00.01 | 0 | | | |
| 12 | PX SEND BROADCAST | :TQ10000 | 0 | 729 | 0 |00:00:00.01 | 0 | | | |
| 13 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | | | |
| 14 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | | | |
| 15 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | | | |
| 16 | PX BLOCK ITERATOR | | 2 | 30164 | 30000 |00:00:00.01 | 89 | | | |
|* 17 | TABLE ACCESS FULL | T1 | 18 | 30164 | 30000 |00:00:00.01 | 89 | | | |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

9 - access("T1"."USER_ID"="T2"."USER_ID")
17 - access(:Z>=:Z AND :Z<=:Z)

38 rows selected.

Elapsed: 00:00:00.03
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 27,
old 14: &high_value
new 14: 53

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_hash(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 use_hash_aggregation(@inner)
9 */
10 max(cnt)
11 from
12 (
13 select /*+ qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 group by t1.object_id, t3.user_id
17 )
18 ;

MAX(CNT)
----------
1

Elapsed: 00:00:01.17
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS'));
SQL_ID 2s1sdfhvhajv3, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2) full(@inner t3) leading(@inner t2 t3 t1) use_nl(@inner t3)
use_hash(@inner t1) no_swap_join_inputs(@inner t1) use_hash_aggregation(@inner) */ max(cnt) from ( select /*+
qb_name(inner) */ count(*) as cnt, t1.object_id, t3.user_id from t1, t2, t2 t3 where t1.user_id = t2.user_id group by t1.object_id, t3.user_id )

Plan hash value: 464898991

------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | O/1/M | Max-Tmp |
------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.16 | 87 | 0 | 0 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.16 | 87 | 0 | 0 | | | | |
| 2 | PX COORDINATOR | | 1 | | 2 |00:00:01.16 | 87 | 0 | 0 | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10002 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | 0 | | | | |
| 4 | SORT AGGREGATE | | 2 | 1 | 2 |00:00:02.25 | 0 | 2460 | 2460 | | | | |
| 5 | VIEW | | 2 | 1 | 810K|00:00:01.53 | 0 | 2460 | 2460 | | | | |
| 6 | HASH GROUP BY | | 2 | 1 | 810K|00:00:01.53 | 0 | 2460 | 2460 | 35M| 5521K| | 11264 |
| 7 | PX RECEIVE | | 2 | 1 | 810K|00:00:00.01 | 0 | 0 | 0 | | | | |
| 8 | PX SEND HASH | :TQ10001 | 0 | 1 | 0 |00:00:00.01 | 0 | 0 | 0 | | | | |
|* 9 | HASH JOIN | | 2 | 1 | 810K|00:00:00.01 | 89 | 0 | 0 | 1348K| 1348K| 2/0/0| |
| 10 | BUFFER SORT | | 2 | | 1458 |00:00:00.01 | 0 | 0 | 0 | 29696 | 29696 | 2/0/0| |
| 11 | PX RECEIVE | | 2 | 729 | 1458 |00:00:00.01 | 0 | 0 | 0 | | | | |
| 12 | PX SEND BROADCAST | :TQ10000 | 0 | 729 | 0 |00:00:00.01 | 0 | 0 | 0 | | | | |
| 13 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 14 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 15 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
| 16 | PX BLOCK ITERATOR | | 2 | 30164 | 30000 |00:00:00.01 | 89 | 0 | 0 | | | | |
|* 17 | TABLE ACCESS FULL | T1 | 18 | 30164 | 30000 |00:00:00.01 | 89 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

9 - access("T1"."USER_ID"="T2"."USER_ID")
17 - access(:Z>=:Z AND :Z<=:Z)

37 rows selected.

Elapsed: 00:00:00.04
SQL>

So, if your database uses automatic PGA management and

- uses the hash aggregation HASH GROUP BY (or HASH UNIQUE, more on that in a moment)
- and does not share cursors

every execution that is based on bad cardinality estimates potentially has a problem with the hash aggregation because it might not make efficient use of the available memory.

The same applies to applications that share cursors, however in that case only the first execution after re-optimization / invalidation is affected.

So you might want to carefully check the runtime execution statistics of your critial hash aggregations.

Mind you, things could be worse, and that is where the HASH UNIQUE operation comes into the picture.

When I realised the issue with the HASH GROUP BY operation I was quite certain that the HASH UNIQUE operation will be affected in a similar way, since internally Oracle seems to use the same mechanism for both operations (In the SMM trace files both are called HASH GROUP BY).

To my surprise I noticed that in 10g versions below 10.2.0.5 and 11g versions below 11.2.0.1 (which means in this case 10.2.0.4 and 11.1.0.7 respectively, I didn't test other versions) the HASH UNIQUE operation suffers from an even more dramatic problem: The cardinality estimate is not considered and the initial workarea size is always based on minimum assumptions.

In passing, according to the SMM trace files that can be activated using the undocumented "_smm_trace" parameter it looks like that many of the sort-based workareas like SORT ORDER BY or WINDOW SORT seem to suffer from the same defect, since they however are able to dynamically resize and make use of the workarea statistics feedback they effectively work as expected, they just start with a too low workarea size estimate every time they are executed for the first time after re-optimization / invalidation.

The combination of the two issues - inability to dynamically resize and ignoring the optimizer estimates - leads to a dire result: Every first execution of a HASH UNIQUE operation in those versions will only use the minimum amount of memory. The following test case shows the problem:

set echo on timing on

show parameter pga
show parameter processes

-- alter session set workarea_size_policy = manual sort_area_size = 40000000;

drop table t1 purge;

drop table t2 purge;

create table t1
as
select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

create table t2
as
select distinct user_id from t1;

exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

column low_val new_value low_value
column high_val new_value high_value

select
max(user_id) + 1 as low_val
, max(user_id) + max(user_id) - min(user_id) + 1 as high_val
from
t2;

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_nl(@inner t1)
use_hash_aggregation(@inner)
*/
max(user_id)
from
(
select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
)
;

alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

set pagesize 14

-- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
declare
srec dbms_stats.statrec;
novals dbms_stats.numarray;
distcnt number;
avgclen number;
nullcnt number;
density number;
srec2 dbms_stats.statrec;
begin
dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
srec2.epc := 2;
novals := dbms_stats.numarray(
&low_value,
&high_value
);
srec2.bkvals := null;
dbms_stats.prepare_column_values(srec2,novals);
dbms_stats.set_column_stats(
ownname=>null,
tabname=>'t1',
colname=>'user_id',
distcnt=>distcnt,
nullcnt=>nullcnt,
srec=>srec2,
avgclen=>avgclen,
density=>density
);
end;
/

-- Invalidate any cursors using T1
comment on table t1 is '';

alter session set statistics_level = all;

alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

alter session set "_smm_trace" = 65535;

select /*+ full(@inner t1)
full(@inner t2)
full(@inner t3)
leading(@inner t2 t3 t1)
use_nl(@inner t3)
use_nl(@inner t1)
use_hash_aggregation(@inner)
*/
max(user_id)
from
(
select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
from t1, t2, t2 t3
where t1.user_id = t2.user_id
)
;

alter session set "_smm_trace" = 0;

set pagesize 0 linesize 200 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

Obviously it is very similar to the HASH GROUP BY test case, this time however I've used a DISTINCT clause and replaced the HASH JOIN with a NESTED LOOP which makes the generated trace files easier to read, since there is exactly one workarea involved in this execution.

And this is what I get from 11.1.0.7, this time using AMM, and therefore the PGA_AGGREGAT_TARGET has been set to 0 (You'll get the same result from 10.2.0.4 with a corresponding P_A_T setting):

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 0
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
global_txn_processes integer 1
job_queue_processes integer 1000
log_archive_max_processes integer 4
processes integer 150
SQL>
SQL> -- alter session set workarea_size_policy = manual sort_area_size = 40000000;
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.01
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.00
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 27) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.04
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.01
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
27 53

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 use_hash_aggregation(@inner)
8 */
9 max(user_id)
10 from
11 (
12 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
13 from t1, t2, t2 t3
14 where t1.user_id = t2.user_id
15 )
16 ;

MAX(USER_ID)
------------
26

Elapsed: 00:00:01.79
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 7cnqz02uwcs1a, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
use_hash_aggregation(@inner) */ max(user_id) from ( select /*+
qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from t1,
t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 1541846686

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.78 | 40179 | 1470 | 1470 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.78 | 40179 | 1470 | 1470 | | | | |
| 2 | VIEW | | 1 | 572K| 810K|00:00:00.97 | 40179 | 1470 | 1470 | | | | |
| 3 | HASH UNIQUE | | 1 | 572K| 810K|00:00:00.97 | 40179 | 1470 | 1470 | 25M| 3296K| 6029K (1)| 13312 |
| 4 | NESTED LOOPS | | 1 | 810K| 810K|00:00:00.01 | 40179 | 0 | 0 | | | | |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
|* 8 | TABLE ACCESS FULL | T1 | 729 | 1111 | 810K|00:00:00.03 | 40095 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")

30 rows selected.

Elapsed: 00:00:00.03
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 27,
old 14: &high_value
new 14: 53

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 use_hash_aggregation(@inner)
8 */
9 max(user_id)
10 from
11 (
12 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
13 from t1, t2, t2 t3
14 where t1.user_id = t2.user_id
15 )
16 ;

MAX(USER_ID)
------------
26

Elapsed: 00:00:01.26
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID 7cnqz02uwcs1a, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
use_hash_aggregation(@inner) */ max(user_id) from ( select /*+
qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from t1,
t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 1541846686

------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp|
------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:01.26 | 40179 | 1470 | 1470 | | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:01.26 | 40179 | 1470 | 1470 | | | | |
| 2 | VIEW | | 1 | 1 | 810K|00:00:00.98 | 40179 | 1470 | 1470 | | | | |
| 3 | HASH UNIQUE | | 1 | 1 | 810K|00:00:00.98 | 40179 | 1470 | 1470 | 25M| 3296K| 6093K (1)| 13312 |
| 4 | NESTED LOOPS | | 1 | 1 | 810K|00:00:00.01 | 40179 | 0 | 0 | | | | |
| 5 | NESTED LOOPS | | 1 | 729 | 729 |00:00:00.01 | 84 | 0 | 0 | | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 27 | 27 |00:00:00.01 | 3 | 0 | 0 | | | | |
| 7 | TABLE ACCESS FULL| T2 | 27 | 27 | 729 |00:00:00.01 | 81 | 0 | 0 | | | | |
|* 8 | TABLE ACCESS FULL | T1 | 729 | 1 | 810K|00:00:00.01 | 40095 | 0 | 0 | | | | |
------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")

30 rows selected.

Elapsed: 00:00:00.03
SQL>

As you can see no matter what the estimates looks like, on this system the first execution of a HASH UNIQUE will not get more than 6M - subsequent executions of the same cursor benefit from the statistics from the previous run as seen before.

Again, switching to a SORT UNIQUE and reducing the data set accordingly, the problem can not be reproduced:

Connected to:
Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL>
SQL> show parameter pga

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 0
SQL> show parameter processes

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 1
gcs_server_processes integer 0
global_txn_processes integer 1
job_queue_processes integer 1000
log_archive_max_processes integer 4
processes integer 150
SQL>
SQL> drop table t1 purge;

Table dropped.

Elapsed: 00:00:00.15
SQL>
SQL> drop table t2 purge;

Table dropped.

Elapsed: 00:00:00.07
SQL>
SQL> create table t1
2 as
3 select mod(rownum, 15) as user_id, rownum as object_id from dual connect by level <= 30000;

Table created.

Elapsed: 00:00:00.06
SQL>
SQL> create table t2
2 as
3 select distinct user_id from t1;

Table created.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't1', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.04
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't2', method_opt => 'for all columns size 1')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.03
SQL>
SQL> column low_val new_value low_value
SQL> column high_val new_value high_value
SQL>
SQL> select
2 max(user_id) + 1 as low_val
3 , max(user_id) + max(user_id) - min(user_id) + 1 as high_val
4 from
5 t2;

LOW_VAL HIGH_VAL
---------- ----------
15 29

Elapsed: 00:00:00.00
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_correct_cardinality';

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(user_id)
11 from
12 (
13 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 )
17 ;

MAX(USER_ID)
------------
14

Elapsed: 00:00:00.73
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID ckb2sbz3y2z14, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
no_swap_join_inputs(@inner t1)
no_use_hash_aggregation(@inner) */ max(user_id) from ( select
/*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from
t1, t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 3828303002

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.71 | 12423 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.71 | 12423 | | | |
| 2 | VIEW | | 1 | 318K| 450K|00:00:00.62 | 12423 | | | |
| 3 | SORT UNIQUE | | 1 | 318K| 450K|00:00:00.62 | 12423 | 22M| 1744K| 20M (0)|
| 4 | NESTED LOOPS | | 1 | 450K| 450K|00:00:00.01 | 12423 | | | |
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
|* 8 | TABLE ACCESS FULL | T1 | 225 | 2000 | 450K|00:00:00.01 | 12375 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")

31 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> set pagesize 14
SQL>
SQL> -- Fudge the statistics so that the join cardinality between t1 and t2 will be 0 (rounded up to 1)
SQL> declare
2 srec dbms_stats.statrec;
3 novals dbms_stats.numarray;
4 distcnt number;
5 avgclen number;
6 nullcnt number;
7 density number;
8 srec2 dbms_stats.statrec;
9 begin
10 dbms_stats.get_column_stats(null, 't1', 'user_id', distcnt => distcnt, avgclen => avgclen, nullcnt => nullcnt, density => density, srec => srec);
11 srec2.epc := 2;
12 novals := dbms_stats.numarray(
13 &low_value,
14 &high_value
15 );
16 srec2.bkvals := null;
17 dbms_stats.prepare_column_values(srec2,novals);
18 dbms_stats.set_column_stats(
19 ownname=>null,
20 tabname=>'t1',
21 colname=>'user_id',
22 distcnt=>distcnt,
23 nullcnt=>nullcnt,
24 srec=>srec2,
25 avgclen=>avgclen,
26 density=>density
27 );
28 end;
29 /
old 13: &low_value,
new 13: 15,
old 14: &high_value
new 14: 29

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.06
SQL>
SQL> -- Invalidate any cursors using T1
SQL> comment on table t1 is '';

Comment created.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set statistics_level = all;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> alter session set tracefile_identifier = 'smm_trace_wrong_cardinality';

Session altered.

Elapsed: 00:00:00.01
SQL>
SQL> alter session set "_smm_trace" = 65535;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> select /*+ full(@inner t1)
2 full(@inner t2)
3 full(@inner t3)
4 leading(@inner t2 t3 t1)
5 use_nl(@inner t3)
6 use_nl(@inner t1)
7 no_swap_join_inputs(@inner t1)
8 no_use_hash_aggregation(@inner)
9 */
10 max(user_id)
11 from
12 (
13 select /*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id
14 from t1, t2, t2 t3
15 where t1.user_id = t2.user_id
16 )
17 ;

MAX(USER_ID)
------------
14

Elapsed: 00:00:00.74
SQL>
SQL> alter session set "_smm_trace" = 0;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> set pagesize 0 linesize 200 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID ckb2sbz3y2z14, child number 0
-------------------------------------
select /*+ full(@inner t1) full(@inner t2)
full(@inner t3) leading(@inner t2 t3 t1)
use_nl(@inner t3) use_nl(@inner t1)
no_swap_join_inputs(@inner t1)
no_use_hash_aggregation(@inner) */ max(user_id) from ( select
/*+ qb_name(inner) no_merge */ distinct t1.object_id, t3.user_id from
t1, t2, t2 t3 where t1.user_id = t2.user_id )

Plan hash value: 3828303002

--------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.73 | 12423 | | | |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.73 | 12423 | | | |
| 2 | VIEW | | 1 | 1 | 450K|00:00:00.61 | 12423 | | | |
| 3 | SORT UNIQUE | | 1 | 1 | 450K|00:00:00.61 | 12423 | 22M| 1744K| 20M (0)|
| 4 | NESTED LOOPS | | 1 | 1 | 450K|00:00:00.01 | 12423 | | | |
| 5 | NESTED LOOPS | | 1 | 225 | 225 |00:00:00.01 | 48 | | | |
| 6 | TABLE ACCESS FULL| T2 | 1 | 15 | 15 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL| T2 | 15 | 15 | 225 |00:00:00.01 | 45 | | | |
|* 8 | TABLE ACCESS FULL | T1 | 225 | 1 | 450K|00:00:00.01 | 12375 | | | |
--------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - filter("T1"."USER_ID"="T2"."USER_ID")

31 rows selected.

Elapsed: 00:00:00.01
SQL>
SQL> set doc off
SQL> doc
SQL>

In 10.2.0.5 and 11.2.0.1/11.2.0.2 this issue is apparently fixed and the cardinality estimates are used for an initial workarea size calculation - however this doesn't mean that the dynamic resize problem is fixed - it simply means that the HASH UNIQUE in those versions behaves exactly as the HASH GROUP BY and the memory usage for the first execution depends on the cardinality estimates.

Summary

If your database uses automatic PGA management then for the hash aggregation HASH GROUP BY / HASH UNIQUE operations every initial execution that is based on bad cardinality estimates potentially has a problem because it might not make efficient use of the available memory.

The same applies to applications that share cursors, however in that case only the initial execution after re-optimization / invalidation is affected, subsequent executions benefit from the workarea statistics feedback mechanism.

Furthermore in 10g versions below 10.2.0.5 and 11g versions below 11.2.0.1 the initial execution of a HASH UNIQUE operation ignores the cardinality estimates and will always be based on minimum assumptions.

So you might want to carefully check the runtime execution statistics of your critical hash aggregations.

Possible Workarounds

Different strategies are available as workarounds, depending on the situation:

- Upgrading to versions where the HASH UNIQUE operation at least considers the optimizer estimates might be beneficial

- Obviously good cardinality estimates are crucial, in that case and when using the correct Oracle versions you should be fine. Using the undocumented OPT_ESTIMATE hint (or the deprecated undocumented CARDINALITY hint) might help in cases where other options like manually crafted statistics are not able to help the optimizer to come up with reasonable cardinality estimates.

- Applications that are able to share cursors might not be too much affected due to the ability to use the workarea executions for subsequent executions of the same cursor

- The described problems disappear when switching to manual workarea size policy and allowing for sufficient memory of workarea. Interestingly the less obvious SORT_AREA_SIZE is used for the manual control of the HASH GROUP BY operation, and not the HASH_AREA_SIZE. Of course using manual PGA memory management system-wide is highly unrecommended, so this might only be a workaround in certain cases of large batch jobs where manual workarea sizes are used anyway. Also note, when switching to manual workareas be aware of a nasty bug that was introduced in the 10.2.0.3 patchset. For more information see MOS note "6053134.8: ALTER SESSION to set SORT_AREA_SIZE no honoured" and for example mine and Jonathen Lewis' post about the issue. According to the MOS note the bug has been fixed in 10.2.0.4.3, 10.2.0.5 and 11.1.0.7 / 11.2

- Obviously the hash aggregation can be avoided on statement level using the NO_USE_HASH_AGGREGATION hint or on session / system level using the _GBY_HASH_AGGREGATION_ENABLED parameter. Since the sort based aggregation can be less efficient (but note that it doesn't have to be, it depends on the individual grouping and data pattern) again this doesn't necessarily solve the problem since aggregate workareas might spill to disk which wouldn't be necessary when hash aggregation was used and worked as expected.

Final Note

Of course there are a lot of things that I haven't touched yet or only briefly, like Parallel Execution of the hash aggregations, workareas used for index sorts, and all the other details when the workarea spills to disk.

If you want to get an idea what kind of nasty things can happen in that case you might want to read Jonathan Lewis' post on analytic functions.

I also haven't tried yet to fiddle around with the undocumented _smm* related parameters if setting them to non-default values allows to work around the issue.

And of course then there are bugs like this one: Bug 6817844 - Multi pass sort with auto memory management even with plenty of PGA [ID 6817844.8], which is also mentioned in Jonathan's post.

As a final note, the odd TEMP tablespace I/O pattern issue that Jonathan describes in his post (Bug 9041800) is marked as fixed in the 11.2.0.2 patch set, but I haven't tested this yet.

Adding Comments to SQL Statements Improves Performance?

January 15, 2011 While reading the “Pro Oracle SQL” book I learned something interesting.  Commenting your work can improve database performance.  You certainly are aware that thoroughly documenting what you do could prevent hours of headaches that might appear later when trying to investigate problems or add additional features to an existing procedure (I think that was [...]

New paper "Mastering Performance with Extended SQL Trace"

Happy New Year.

It’s been a busy few weeks. I finally have something tangible to show for it: “Mastering Performance with Extended SQL Trace” is the new paper I’ve written for this year’s RMOUG conference. Think of it a 15-page update to chapter 5 of Optimizing Oracle Performance.

There’s lots of new detail in there. Some highlights:

Performance Stories from Exadata Migrations

Here are my UKOUG 2010 slides about Exadata migration performance, this is real life stuff, not repeating the marketing material:
View more presentations from tanelp.

Share

PQ_DISTRIBUTE enhancement – Part 1

Oracle 11g has introduced a hell lot of new features related to Parallel Execution – like statement queuing, in-memory PX and auto-DOP, to name a few. There are also small bits of enhancements here and there – like default value of the PARALLEL_EXECUTION_MESSAGE_SIZE and an introduction of an easy way to stick PX slaves to a current RAC node with PARALLEL_FORCE_LOCAL.