
That Pictures demo in full

Note - features in this post require the Diagnostics Pack license

With so many potential technical posts in my pile, it was initially difficult to decide where to start again but I figured I should avoid the stats series until I'm back into the swing of things ;-) Instead I decided to fulfill a commitment I made to myself (and others, whether they knew about it or not) almost three months ago.

When I gave the evening demo session in the Amis offices I think the 2 hours went pretty well but, as usual with the OEM presentations, I got a little carried away and didn't conclude the demo properly. (This is also the demo I *would* have done at Hotsos last year if the damn thing had worked first time ;-)) It was a shame because as well as showing the neat and useful side of OEM Performance Pages, it also illustrates one of the common pitfalls in interpreting what the graphs are showing you.

I began by running a 4 concurrent user Sales Order Entry (SOE) test using Dominic Giles' Swingbench utility. I won't go into the details of the SOE test because I don't think it's particularly relevant here, but you can always download and/or read about Swingbench for yourself at Dominic's website.

I ran the test for a fixed period of 5 minutes using no think-time delay.

Using the capability to look at ASH data in the recent past, the OEM Top Activity page looks like this.

- There was a fairly consistent average of 4-5 active sessions over the 5-minute period and, looking at the Top Sessions panel in the bottom right of the screen, these were four SOE sessions with similar activity levels and the LGWR process.

- The majority of time was spent on User I/O, System I/O and Commit Wait Class activity, with a little CPU.

- Three PL/SQL blocks were responsible for most of the Commit activity.

- The LGWR process was responsible for most of the System I/O activity.

I'll leave it there for now and won't drill down into any more detail.
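For anyone following along without the OEM screens in front of them, here's a minimal sketch of roughly the same breakdown pulled straight from ASH (again, Diagnostics Pack required). The 5-minute window and the samples-to-average-active-sessions arithmetic are my assumptions rather than an exact reproduction of the Top Activity page.

-- Approximate the Top Activity wait class breakdown for the last 5 minutes.
-- Each V$ACTIVE_SESSION_HISTORY row is a once-per-second sample of an active
-- session, so samples / elapsed seconds ~= average active sessions.
select case when session_state = 'ON CPU' then 'CPU'
            else wait_class
       end                           as activity,
       count(*)                      as ash_samples,
       round(count(*) / (5 * 60), 1) as approx_avg_active_sessions
from   v$active_session_history
where  sample_time > systimestamp - interval '5' minute
group by case when session_state = 'ON CPU' then 'CPU' else wait_class end
order by ash_samples desc;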

In terms of optimising the performance of this test, what might I consider doing?

The most important aspect is to optimise the application to reduce resource consumption to the minimum required to achieve our objectives - there's a whole bunch of User I/O activity here that could perhaps be eliminated. But I'm going to ask you to accept the big assumption that this application has already been optimised and that I'm just using a Swingbench test as an illustration of the type of system-wide problem you could see. In that case, my eye is drawn to the Commit activity.

When I'm teaching this stuff, I'm usually deliberately simplistic (at least at the end of the process) and highlight that what I'm interested in 'tuning' is whatever most sessions are waiting on according to the ASH samples this screen uses. I used to explain how I'd look for the biggest areas of colour, drill down into those and identify what's going on. Sadly, I later heard that someone (I think it was JB at Oracle*) had already come up with a nifty acronym for this - COBS. Click on the Big Stuff! One day I will come up with a nifty acronym for something too, but you shouldn't hold your breath waiting.

So, if I click on the big stuff here, I can see that the Commit Class waits are log file sync.

How might I reduce the time that sessions are waiting for log file sync? Here are a few reasons why the test sessions might be waiting on log file sync more often or for longer than I'd like.

- Application design - committing too frequently
- CPU starvation
- Slow I/O to online redo log files

Whether waits are predominantly the result of CPU overload or slow I/O can be determined by looking at the underlying log file parallel write wait times on the LGWR process, but that's a bigger subject for another time.
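That said, a very rough first pass is only a query away. This is just a sketch against V$SYSTEM_EVENT - instance-wide averages since startup, so treat the numbers with suspicion on anything but a quiet test system.

-- Compare the average "log file sync" wait (what foreground sessions see)
-- with LGWR's "log file parallel write" (the redo write itself). Similar
-- averages suggest slow redo I/O; a much larger "log file sync" average
-- points more towards CPU starvation or LGWR scheduling delays.
select event,
       total_waits,
       round(time_waited_micro / nullif(total_waits, 0) / 1000, 2) as avg_wait_ms
from   v$system_event
where  event in ('log file sync', 'log file parallel write');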

You can look into all of these in more depth - and should - but as this is designed to be a fun demo of the pretty pictures (it used to be 'the USB stick demo'), I'll simply try to eliminate that activity and re-run the same test. Here's how Top Activity looks now.

Oh. Maybe that wasn't what you expected? OK, the LGWR activity has disappeared, but the system seems almost as busy as it was before and the main bottleneck is now User I/O activity. That's often the way, though - you eliminate one bottleneck in a system and it just shows up somewhere else. It must be good to get rid of log file sync waits though, right? User I/O seems like more productive work and I've managed to make the LGWR activity disappear completely.

But then if you were to look at this graph in terms of Average Active Sessions or DB Time or (as it's more likely to be expressed) how big that spike looks, the two tests would look similarly busy from a system-wide perspective. They were, but the real question is - busy doing *what*? There's some important information missing here and Swingbench is able to provide it.

TotalCompletedTransactions 22,868

Mmmm, so I wonder what that value was for the first run?

TotalCompletedTransactions 12,232

Woo-hoo! *That's* what I call tuning a benchmark - processing almost twice the number of transactions in the same 5-minute period.

So it turns out that the sessions in the database *were* just as busy during the second run (not too surprising seeing as the test has no user think time so keeps hammering the database with as many requests as it can handle) but that they were busy doing the more productive work of reading and processing data rather than just waiting for COMMITs to complete.

I raised this issue of DB Time not showing activity details with Graham Wood* at Oracle in relation to a previous blog post. I think he made the point to me that that's why the OEM Performance Home Page is *not* the Top Activity page. If I take a look at that home page, it shows me the same information as the Swingbench results output did, albeit not as clearly.

Looking at the Throughput graph, I can see that the second test processed around double the number of Transactions per second for the same test running on the same system.
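If you don't have the Swingbench results (or the Throughput graph) to hand, a similar comparison can be sketched from the 'user commits' deltas between the AWR snapshots bracketing each run. The snapshot IDs and the fixed 300-second run length below are placeholders for this particular test, not a general-purpose script.

-- Rough transactions-per-second figure for one 5-minute test run, from the
-- "user commits" delta between its begin and end AWR snapshots.
select round((e.value - b.value) / 300) as approx_txn_per_sec
from   dba_hist_sysstat b,
       dba_hist_sysstat e
where  b.stat_name       = 'user commits'
and    e.stat_name       = 'user commits'
and    b.snap_id         = &begin_snap
and    e.snap_id         = &end_snap
and    b.dbid            = e.dbid
and    b.instance_number = e.instance_number;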

To wrap up (and be a little defensive) ...

- Yes, I could have traced one or more sessions and generated a complete and detailed response time profile that should have led me to the same conclusion.

- Yes, as this is a controlled test environment and I'm the only 'user', AWR/Statspack would have been an even more powerful analysis tool in the right hands.

- The Top Activity page is not the most appropriate tool for this job but it is handy for illustrating concepts.

- Lest I seem a slavish pictures fan, I'm showing how people might misuse or misunderstand ASH/Top Activity. In this case, the Performance Home Page is a much better tool because we're looking at system-wide data and not drilling into session or SQL details.

Oh, and what is my Top Secret Magic Silver Bullet Tuning Tip for OLTP-type applications? (only to be used by Advanced Oracle Performance Wizards)

alter system set commit_write='BATCH, NOWAIT';

Done! In fact, why not just use this on all of your systems, just in case people are waiting on log file sync?
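(For the Wizards playing along at home - and as a sketch of the syntax rather than a recommendation - the same non-tip can be applied at session or statement level, and from 11g onwards COMMIT_WRITE is deprecated in favour of COMMIT_LOGGING and COMMIT_WAIT. You're trading commit durability for speed, which is exactly why this is a joke and not a tip.)

-- Session-level version of the same non-tip ...
alter session set commit_write = 'BATCH, NOWAIT';

-- ... or per statement:
commit write batch nowait;

-- 11g-style equivalents of the deprecated COMMIT_WRITE parameter:
-- alter system set commit_logging = batch;
-- alter system set commit_wait = nowait;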

(Leaves space below for angry responses and my withering humorous retorts)

* This is not name-dropping, this is giving due credit to the people who really know what they're talking about

Time Model Bug

Tasks that are performed via jobs in the database are double-accounted in the system time model introduced with Oracle 10g.

So if you execute significant workload via DBMS_JOB or DBMS_SCHEDULER, any system time model statistic recorded for that workload - DB Time, DB CPU etc. - gets double-accounted.
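To see which figures we're talking about, here's a minimal sketch of the instance-wide time model statistics involved (V$SYS_TIME_MODEL reports values in microseconds; the filter list is just the usual suspects):

-- The instance-wide time model figures that get inflated for job workload.
select stat_name,
       round(value / 1e6, 2) as seconds
from   v$sys_time_model
where  stat_name in ('DB time', 'DB CPU',
                     'sql execute elapsed time',
                     'PL/SQL execution elapsed time')
order by seconds desc;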

This bug is not particularly critical since your top workloads will still be the same top workloads: all other statistics (like Elapsed Time, CPU, Buffer Gets etc.) are not affected by it.

I mention it here because, at the time of writing, the bug (see below for details) can't yet be found in the My Oracle Support bug database, and because I recently came across several AWR reports where the majority of the workload was generated via job processes and the time model statistics were therefore effectively doubled.

It might also offer a viable explanation if you sometimes wonder why an AWR or Statspack report only accounts for 50% or less of the recorded total DB Time or DB CPU and where the unaccounted time has gone. If a significant part of the workload during the reporting period was performed by sessions controlled via DBMS_JOB or DBMS_SCHEDULER, then most of that time is probably not really unaccounted - the time model statistics are simply wrong.

So if you have such otherwise unexplainable unaccounted DB Time / DB CPU, you might want to check whether significant workload during the reporting period was executed via the job system. Note that I don't say that this is the only possible explanation of such unaccounted time - there might be other reasons like uninstrumented waits, other bugs etc.
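A crude check - a sketch only, and the LIKE pattern assumes the usual Jnnn job slave program naming - is to see how much of the recent ASH activity came from job slave processes:

-- How much of the last hour's ASH activity came from job slaves (Jnnn)?
select count(*) as total_samples,
       sum(case when program like '%(J0%' then 1 else 0 end) as job_slave_samples
from   v$active_session_history
where  sample_time > systimestamp - interval '60' minute;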

Of course, all the percentages shown in AWR / ADDM / Statspack reports that refer to "Percentage of DB Time" or "Percentage of DB CPU" will be too small in such cases.

If the majority of workload during the reporting period has been generated by jobs then you can safely assume that the time model statistics have to be divided by 2 (and the percentages have to be doubled). If you have a mixture of jobs and regular foreground sessions then it will be harder to derive the correct time model statistics.

Note that the "Active Session History" (ASH) is not affected by the bug - in my tests the ASH reports were always consistent regarding the DB Time (that is, the number of samples) and CPU time information.
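(If you want to cross-check the time model figures against ASH for a given window, an ASH report can be generated directly - a sketch, assuming a Diagnostics Pack licence; adjust the begin/end times to bracket your test:)

-- Generate an ASH report for roughly the last 10 minutes.
select output
from   table(
         dbms_workload_repository.ash_report_html(
           (select dbid from v$database)
         , (select instance_number from v$instance)
         , sysdate - interval '10' minute
         , sysdate
         )
       );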

The following simple test case can be used to reproduce the issue at will. Ideally you should have exclusive access to the test system since any other concurrent activity will affect the test results.

You might want to adjust the 1,000,000,000 iterations of the simple PL/SQL loop to your particular CPU speed - on my test system this takes approx. 46 seconds to complete.

The first version assumes that a PERFSTAT user with an installed STATSPACK is present in the database since STATSPACK doesn't require an additional license. An AWR variant follows below.

alter session set nls_language = american nls_territory = america;

store set .settings replace

set echo on timing on define on

define iter="1000000000"

variable snap1 number

exec :snap1 := statspack.snap

declare
n_cnt binary_integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
end;
/

variable snap2 number

exec :snap2 := statspack.snap

/* Uncomment this if you want to test via DBMS_JOB
variable job_id number

begin
dbms_job.submit(:job_id, '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
');
end;
/

commit;
*/

/* Uncomment this if you want to test via DBMS_SCHEDULER */
begin
dbms_scheduler.create_job(
job_name => dbms_scheduler.generate_job_name
, job_type => 'PLSQL_BLOCK'
, job_action => '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
' , enabled => true);
end;
/

declare
pipe_status integer;
begin
pipe_status := dbms_pipe.receive_message('bg_job_complete');
end;
/

declare
pipe_id integer;
begin
pipe_id := dbms_pipe.remove_pipe('bg_job_complete');
end;
/

variable snap3 number

exec :snap3 := statspack.snap

rem set heading off pagesize 0 feedback off linesize 500 trimspool on termout off echo off verify off

prompt Enter PERFSTAT password

connect perfstat

column dbid new_value dbid noprint

select dbid from v$database;

column instance_number new_value inst_num noprint

select instance_number from v$instance;

column b_id new_value begin_snap noprint
column e_id new_value end_snap noprint

select :snap1 as b_id, :snap2 as e_id from dual;
define report_name=sp_foreground.txt

@?/rdbms/admin/sprepins

column dbid new_value dbid noprint

select dbid from v$database;

column instance_number new_value inst_num noprint

select instance_number from v$instance;

column b_id new_value begin_snap noprint
column e_id new_value end_snap noprint

select :snap2 as b_id, :snap3 as e_id from dual;
define report_name=sp_background.txt

@?/rdbms/admin/sprepins

undefine iter

@.settings

set termout on

Here is the same test case but with AWR reports (requires the additional Diagnostics Pack license).

alter session set nls_language = american nls_territory = america;

store set .settings replace

set echo on timing on define on

define iter="1000000000"

column snap1 new_value awr_snap1 noprint

select dbms_workload_repository.create_snapshot as snap1 from dual;

declare
n_cnt binary_integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
end;
/

column snap2 new_value awr_snap2 noprint

select dbms_workload_repository.create_snapshot as snap2 from dual;

/* Uncomment this if you want to test via DBMS_JOB
variable job_id number

begin
dbms_job.submit(:job_id, '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
');
end;
/

commit;
*/

/* Uncomment this if you want to test via DBMS_SCHEDULER */
begin
dbms_scheduler.create_job(
job_name => dbms_scheduler.generate_job_name
, job_type => 'PLSQL_BLOCK'
, job_action => '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
' , enabled => true);
end;
/

declare
pipe_status integer;
begin
pipe_status := dbms_pipe.receive_message('bg_job_complete');
end;
/

declare
pipe_id integer;
begin
pipe_id := dbms_pipe.remove_pipe('bg_job_complete');
end;
/

column snap3 new_value awr_snap3 noprint

select dbms_workload_repository.create_snapshot as snap3 from dual;

set heading off pagesize 0 feedback off linesize 500 trimspool on termout off echo off verify off

spool awr_foreground.html

select
output
from
table(
sys.dbms_workload_repository.awr_report_html(
(select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap1
, &awr_snap2
)
);

spool off

spool awr_background.html

select
output
from
table(
sys.dbms_workload_repository.awr_report_html(
(select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap2
, &awr_snap3
)
);

spool off

spool awr_diff.html

select
output
from
table(
sys.dbms_workload_repository.awr_diff_report_html(
(select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap1
, &awr_snap2
, (select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap2
, &awr_snap3
)
);

spool off

undefine awr_snap1
undefine awr_snap2
undefine awr_snap3

undefine iter

column snap1 clear
column snap2 clear
column snap3 clear

@.settings

set termout on

And here is a sample snippet from a generated Statspack report on a single-CPU system with nothing else running:

Normal foreground execution:

STATSPACK report for

Database DB Id Instance Inst Num Startup Time Release RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
orcl112 1 05-Aug-10 08:21 11.2.0.1.0 NO

Host Name Platform CPUs Cores Sockets Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
XXXX Microsoft Windows IA ( 1 0 0 2.0

Snapshot Snap Id Snap Time Sessions Curs/Sess Comment
~~~~~~~~ ---------- ------------------ -------- --------- ------------------
Begin Snap: 13 05-Aug-10 08:34:17 25 1.2
End Snap: 14 05-Aug-10 08:35:05 25 1.2
Elapsed: 0.80 (mins) Av Act Sess: 1.1
DB time: 0.87 (mins) DB CPU: 0.80 (mins)

Cache Sizes Begin End
~~~~~~~~~~~ ---------- ----------
Buffer Cache: 104M Std Block Size: 8K
Shared Pool: 128M Log Buffer: 6,076K

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~ ------------------ ----------------- ----------- -----------
DB time(s): 1.1 2.4 0.09 3.99
DB CPU(s): 1.0 2.2 0.08 3.68

Execution via Job/Scheduler:

STATSPACK report for

Database DB Id Instance Inst Num Startup Time Release RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
orcl112 1 05-Aug-10 08:21 11.2.0.1.0 NO

Host Name Platform CPUs Cores Sockets Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
XXXX Microsoft Windows IA ( 1 0 0 2.0

Snapshot Snap Id Snap Time Sessions Curs/Sess Comment
~~~~~~~~ ---------- ------------------ -------- --------- ------------------
Begin Snap: 14 05-Aug-10 08:35:05 25 1.2
End Snap: 15 05-Aug-10 08:35:53 24 1.3
Elapsed: 0.80 (mins) Av Act Sess: 1.9
DB time: 1.55 (mins) DB CPU: 1.54 (mins)

Cache Sizes Begin End
~~~~~~~~~~~ ---------- ----------
Buffer Cache: 104M Std Block Size: 8K
Shared Pool: 128M Log Buffer: 6,076K

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~ ------------------ ----------------- ----------- -----------
DB time(s): 1.9 92.8 0.79 7.74
DB CPU(s): 1.9 92.1 0.78 7.68

As you might have guessed, my single-CPU test system did not gain a second CPU when performing the same task via DBMS_SCHEDULER / DBMS_JOB, yet in that case the time model reports (almost) 2 seconds of DB Time / DB CPU per second - and almost 2 average active sessions.

I have reproduced the bug on versions 10.2.0.4, 11.1.0.7 and 11.2.0.1, but very likely all versions supporting the time model are affected.

A (non-public) bug "9882245 - DOUBLE ACCOUNTING OF SYS MODEL TIMINGS FOR WORKLOAD RUN THROUGH JOBS" has been filed for it, but the fix is not yet available and, as far as I know, it is therefore not yet part of any available patch set / PSU.

Note that there seems to be a different issue with the DB CPU time model component: if you have a system that reports more CPUs than sockets (for example a Power5, Power6 or Power7 based IBM server that reports 16 sockets / 32 CPUs), then the DB CPU component gets reduced by approximately 50% - in other words, it is divided by 2.

In combination with the above bug, this means you end up with a doubled DB Time component for tasks executed via jobs, while the DB CPU component ends up in the right ballpark, since the doubled DB CPU time gets divided by 2 again.
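A quick way to check whether your platform falls into that category is to compare the CPU, core and socket counts the instance itself reports - a simple sketch against V$OSSTAT:

-- Does the instance see more CPUs than sockets (e.g. SMT / Hyper-Threading)?
select stat_name, value
from   v$osstat
where  stat_name in ('NUM_CPUS', 'NUM_CPU_CORES', 'NUM_CPU_SOCKETS');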

I don't know if the bug fix also covers this issue, so you might want to keep this in mind when checking any time model based information.