Search

Top 60 Oracle Blogs

Recent comments

Exadata

Essential tools for Exadata performance experiments

Like I said in a previous post I have started working in the Exadata performance field, which is really exciting, especially after you get it to work really fast!

Also, finding out what your session is spending time on is important if you are just getting started.

I found the following indispensable tools for Exadata performance analysis:

  • Session Snapper from Tanel Poder, available from his website
  • The Tuning and Diagnostic Pack license
  • The Real Time SQL Monitor package
  • A good understanding of DBMS_XPLAN
  • Oracle 10046 traces
  • collectl – I have blogged about it before here and mention it for reference only

There are probably a lot more than those, but these are the bare essentials. Let’s have a look at them in a bit more detail.

Snapper

I don’t think I have to lose a single word about snapper-it’s an established tool for performance diagnostics tought by Tanel in his advanced troubleshooting course. I use snapper primarily for analysis of smart scans. Below is an example for a session using serial execution to scan a table (best viewed with Firefox):

SQL> select count(1) from order_items;

SQL> @snapper all 5 1 243
Sampling SID 243 with interval 5 seconds, taking 1 snapshots...
setting stats to all due option = all

-- Session Snapper v3.52 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

-------------------------------------------------------------------------------------------------------------------------------------
SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
243, SOE       , STAT, session logical reads                                     ,     95.02k,        19k,
243, SOE       , STAT, user I/O wait time                                        ,        331,       66.2,
243, SOE       , STAT, non-idle wait time                                        ,        332,       66.4,
243, SOE       , STAT, non-idle wait count                                       ,      1.51k,      302.2,
243, SOE       , STAT, physical read total IO requests                           ,        776,      155.2,
243, SOE       , STAT, physical read total multi block requests                  ,        746,      149.2,
243, SOE       , STAT, physical read requests optimized                          ,         30,          6,
243, SOE       , STAT, physical read total bytes                                 ,    778.67M,    155.73M,
243, SOE       , STAT, cell physical IO interconnect bytes                       ,    778.67M,    155.73M,
243, SOE       , STAT, consistent gets                                           ,     95.06k,     19.01k,
243, SOE       , STAT, consistent gets from cache                                ,          7,        1.4,
243, SOE       , STAT, consistent gets from cache (fastpath)                     ,          7,        1.4,
243, SOE       , STAT, consistent gets direct                                    ,     95.05k,     19.01k,
243, SOE       , STAT, physical reads                                            ,     95.05k,     19.01k,
243, SOE       , STAT, physical reads direct                                     ,     95.05k,     19.01k,
243, SOE       , STAT, physical read IO requests                                 ,        776,      155.2,
243, SOE       , STAT, physical read bytes                                       ,    778.67M,    155.73M,
243, SOE       , STAT, calls to kcmgcs                                           ,          7,        1.4,
243, SOE       , STAT, file io wait time                                         ,        394,       78.8,
243, SOE       , STAT, Number of read IOs issued                                 ,        776,      155.2,
243, SOE       , STAT, no work - consistent read gets                            ,     95.15k,     19.03k,
243, SOE       , STAT, cell flash cache read hits                                ,         30,          6,
243, SOE       , TIME, DB CPU                                                    ,      1.96s,   391.94ms,    39.2%, |@@@@      |
243, SOE       , TIME, sql execute elapsed time                                  ,         6s,       1.2s,   120.0%, |@@@@@@@@@@|
243, SOE       , TIME, DB time                                                   ,         6s,       1.2s,   120.0%, |@@@@@@@@@@|
243, SOE       , WAIT, direct path read                                          ,       3.3s,   660.48ms,    66.0%, |@@@@@@@   |
243, SOE       , WAIT, kfk: async disk IO                                        ,        5ms,        1ms,      .1%, |          |
--  End of Stats snap 1, end=2011-08-23 16:20:03, seconds=5

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
74% | 1p7y7pvzmxx3y   | direct path read          | User I/O
26% | 1p7y7pvzmxx3y   | ON CPU                    | ON CPU

--  End of ASH snap 1, end=2011-08-23 16:20:03, seconds=5, samples_taken=47

Watch out for the cell smart entries to figure out of a smart scan is happening. In the above example, there wasn’t one.

The same query again, but this time making use of session offloading:

SQL> select /*+full(t)*/ count(1) from order_items t;

SQL> @snapper all 5 1 243
Sampling SID 243 with interval 5 seconds, taking 1 snapshots...
setting stats to all due option = all

-- Session Snapper v3.52 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

-------------------------------------------------------------------------------------------------------------------------------------
SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
243, SOE       , STAT, session logical reads                                     ,    427.69k,     85.54k,
243, SOE       , STAT, user I/O wait time                                        ,        165,         33,
243, SOE       , STAT, non-idle wait time                                        ,        167,       33.4,
243, SOE       , STAT, non-idle wait count                                       ,      3.52k,        703,
243, SOE       , STAT, enqueue waits                                             ,         12,        2.4,
243, SOE       , STAT, enqueue requests                                          ,         10,          2,
243, SOE       , STAT, enqueue conversions                                       ,         15,          3,
243, SOE       , STAT, enqueue releases                                          ,         10,          2,
243, SOE       , STAT, global enqueue gets sync                                  ,         25,          5,
243, SOE       , STAT, global enqueue releases                                   ,         10,          2,
243, SOE       , STAT, physical read total IO requests                           ,      4.56k,      911.6,
243, SOE       , STAT, physical read total multi block requests                  ,      4.31k,      862.2,
243, SOE       , STAT, physical read total bytes                                 ,       3.5G,     700.6M,
243, SOE       , STAT, cell physical IO interconnect bytes                       ,      1.52G,    303.95M,
243, SOE       , STAT, ges messages sent                                         ,         12,        2.4,
243, SOE       , STAT, consistent gets                                           ,    427.54k,     85.51k,
243, SOE       , STAT, consistent gets from cache                                ,         75,         15,
243, SOE       , STAT, consistent gets from cache (fastpath)                     ,         75,         15,
243, SOE       , STAT, consistent gets direct                                    ,    427.47k,     85.49k,
243, SOE       , STAT, physical reads                                            ,    427.47k,     85.49k,
243, SOE       , STAT, physical reads direct                                     ,    427.47k,     85.49k,
243, SOE       , STAT, physical read IO requests                                 ,      4.56k,        911,
243, SOE       , STAT, physical read bytes                                       ,       3.5G,    700.36M,
243, SOE       , STAT, calls to kcmgcs                                           ,         75,         15,
243, SOE       , STAT, file io wait time                                         ,     17.45k,      3.49k,
243, SOE       , STAT, cell physical IO bytes eligible for predicate offload     ,      3.49G,       699M,
243, SOE       , STAT, cell smart IO session cache lookups                       ,          5,          1,
243, SOE       , STAT, cell smart IO session cache hits                          ,          5,          1,
243, SOE       , STAT, cell physical IO interconnect bytes returned by smart scan,      1.52G,    303.25M,
243, SOE       , STAT, cell session smart scan efficiency                        ,         -3,        -.6,
243, SOE       , STAT, table scans (long tables)                                 ,          5,          1,
243, SOE       , STAT, table scans (direct read)                                 ,          5,          1,
243, SOE       , STAT, table scan rows gotten                                    ,    129.59M,     25.92M,
243, SOE       , STAT, table scan blocks gotten                                  ,    422.53k,     84.51k,
243, SOE       , STAT, cell scans                                                ,          5,          1,
243, SOE       , STAT, cell blocks processed by cache layer                      ,    501.97k,    100.39k,
243, SOE       , STAT, cell blocks processed by txn layer                        ,    501.97k,    100.39k,
243, SOE       , STAT, cell blocks processed by data layer                       ,     426.9k,     85.38k,
243, SOE       , STAT, cell blocks helped by minscn optimization                 ,    501.91k,    100.38k,
243, SOE       , STAT, cell simulated session smart scan efficiency              ,       3.5G,    699.17M,
243, SOE       , STAT, cell IO uncompressed bytes                                ,       3.5G,    699.17M,
243, SOE       , TIME, DB CPU                                                    ,      3.98s,   796.68ms,    79.7%, |@@@@@@@@  |
243, SOE       , TIME, sql execute elapsed time                                  ,      6.31s,      1.26s,   126.2%, |@@@@@@@@@@|
243, SOE       , TIME, DB time                                                   ,      6.31s,      1.26s,   126.2%, |@@@@@@@@@@|
243, SOE       , WAIT, enq: KO - fast object checkpoint                          ,     2.01ms,      402us,      .0%, |          |
243, SOE       , WAIT, cell smart table scan                                     ,      1.65s,   329.33ms,    32.9%, |@@@@      |
243, SOE       , WAIT, events in waitclass Other                                 ,     5.69ms,     1.14ms,      .1%, |          |
--  End of Stats snap 1, end=2011-08-23 16:21:16, seconds=5

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
66% | adhm3mbjfzysd   | ON CPU                    | ON CPU
34% | adhm3mbjfzysd   | cell smart table scan     | User I/O

--  End of ASH snap 1, end=2011-08-23 16:21:16, seconds=5, samples_taken=44

The enq: KO fast object checkpoint and the “cell smart table scan” are giveaways for a smart scan.

DBMS_XPLAN

Another proven, useful tool in the arsenal of the performance analyst. For Exadata you might want to use the display_cursor function. It can take a sql_id, a child number and a format parameter. I like the format ‘ALL’ best, as it gives me most information about a statement.

Note that even though you might get the word “storage” in the execution plan, it doesn’t mean you actually see a smart scan in the end! Always check the session wait events and session counters for the word “smart” to make sure one is occurring.

SQL> select * from table(dbms_xplan.display_cursor('3xjhbw9m5u9qu',format=>'ALL'))
2  /

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------
SQL_ID  3xjhbw9m5u9qu, child number 0
-------------------------------------
select count(*) from order_items where product_id > 10

Plan hash value: 2209137760

-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |       |       |   249K(100)|          |
|   1 |  SORT AGGREGATE               |                 |     1 |     4 |            |          |
|*  2 |   INDEX STORAGE FAST FULL SCAN| ITEM_PRODUCT_IX |   338M|  1293M|   249K  (2)| 00:50:00 |
-------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$1
2 - SEL$1 / ORDER_ITEMS@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("PRODUCT_ID">10)
filter("PRODUCT_ID">10)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - (#keys=0) COUNT(*)[22]

31 rows selected.

This leads me to the next tool, SQL Monitoring, and it rocks.

Real Time SQL Monitoring

This is yet another awesome tool in your repository if you have the license. It allows you to check what’s happening during the execution of a SQL statement, which is more than useful in larger data warehouse style queries.

The easiest and most accessible way is to use OEM Grid Control or database console to view the report. I personally like to run the report in a putty session on a large screen. To make it easier to run the report based on a SQL ID I created the below script “report.sql”:

set trim on long 2000000 longchunksize 2000000
set trimspool on  pagesize 0 echo off timing off linesize 1000
select DBMS_SQLTUNE.REPORT_SQL_MONITOR(
sql_id => '&1',
report_level=>'ALL')
from dual;

With this at hand you can query v$sesssion for a SQL_ID and feed it into the code snippet.

Unfortunately I can’t upload text files to wordpress for security reasons, and posting the output here isn’t possible since the report is rather wide. I decided to run the report with HTML output and type “EM” instead an post the print screens. The command I used is sjhown below:

SQL> select dbms_sqltune.report_sql_monitor(sql_id=>’&1′, type=>’EM’, report_level=>’+histogram’) from dual;

The output of the script would be far too wide so I moved it into the attachement of the EM report type here for an insert /*+ append */ into table select ….

You can use this to even look at statistics gathering-in the below two examples I executed dbms_stats-this report has been created for a stats gathering session with a DOP of 16

What I like best about the monitoring report is that it points out where you are in the execution plan, and it provides an estimate for how much work has already been done. More intelligent people have written lots about using these tools, off the top of my head I’d check Greg Rahn’s as well as Doug Burn’s weblogs for more information.

Oracle 10046 trace

Cary Millsap has written the standard about extended trace files and I strongly recommend reading his book and papers on the subject.

Whilst you can run queries in isolation and easily identify them during testing, the matter gets considerably more difficult as soon as you have to trace a session from a connection pool or other third party application. Most of them are not instrumented making tracing by

Of course you could use the MRTools to enable tracing on a wider scale and then trawl through the trace files with their help, but it might be difficult to get permission to do so on a busy production system.

The lesson learned for me was how to enable tracing in parallel query: this MOS note “Master Note; How to Get a 10046 trace for a Parallel Query [ID 1102801.1]” has the answer. It becomes more difficult if your PQ spans instances, but there are ways around this. You also should consider using DBMS_MONITOR to enable session trace if you can’t use “alter session set events” for some reason. Be sure to read all of the note, especially the bit at the bottom which explains the new 11g interface using “alter session set sql_trace” syntax.

I personally liked setting the tracefile identifier to something more easily identifiable. The ADRCI has a nice option to show trace files-if your ADR is consolidated in a cluster file system you could simply run “adcri > show tracefile -t” to list them all and pick the ones with your custom tracefile identifier.

What about command line utilities?

In the past I liked to use strace to see what a process was doing while “on cpu”. Unfortunately this isn’t really an option with Exadata, as most of the calls are purely network related. After all, RDS is a network protocol. Consider this output from strace during a 10 second query (taken on a database node):

#  strace -cp 29031
Process 29031 attached - interrupt to quit
Process 29031 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
88.85    0.033064           2     14635         5 poll
5.47    0.002035           0     14754      7377 setsockopt
2.68    0.000999         999         1           munmap
2.43    0.000906           0     22075      7321 recvmsg
0.56    0.000209           0      7399           sendmsg
0.00    0.000000           0         8         6 read
0.00    0.000000           0         2           write
0.00    0.000000           0         3           mmap
0.00    0.000000           0        60           rt_sigprocmask
0.00    0.000000           0        15         5 rt_sigreturn
0.00    0.000000           0        30           setitimer
0.00    0.000000           0         1           semctl
0.00    0.000000           0       262           getrusage
0.00    0.000000           0       112           times
0.00    0.000000           0         1           io_setup
0.00    0.000000           0         1           io_destroy
0.00    0.000000           0        23           semtimedop
------ ----------- ----------- --------- --------- ----------------
100.00    0.037213                 59382     14714 total

Not too enlightening-all I/O happens in the cells.

Summary

These are the tools I found very useful in getting started with Exadata performance analysis, and they are very good for the job at hand. Note that snapper can be used for getting information about parallel query and SQL Monitor is a blessing when it comes to looking at huge parallel queries involving lots of tables.

Why is my Exadata smart scan not offloading?

After receiving the excellent “Expert Oracle Exadata” book I decided to spend some time looking into Exadata performance after I having spent most of the time previously on infrastructure related questions such as “how can I prevent someone from overwriting my DR system with UAT data”, patching etc.

Now there is one thing to keep in mind with Exadata-you need lots of data before it breaks into a little sweat. Luckily for me, one of my colleagues has performed some testing on this environment. For reasons unknown the swingbench order entry benchmark (version 2.3) has been chosen. For those who don’t know the OE benchmark: it’s a heavy OLTP style workload simulating a web shop where users browse products, place orders etc. OE is optimised for single block I/O, and despite what you may have heard, Exadata doesn’t provide a noticable benefit for these queries.

Anyway, what I liked was the fact that the order_items table had about 350 million rows organised in about 8GB. From discussions with Frits Hoogland I know that a full scan of such a table takes between 40 and 60 seconds depending on system load.

SOE.ORDER_ITEMS

Here are some of the interesting facts around the table:

SQL> select partition_name,num_rows,blocks,avg_space,chain_cnt,global_stats,user_stats,stale_stats
2  from dba_tab_statistics
3  where table_name = 'ORDER_ITEMS'
4  /

PARTITION_NAME                   NUM_ROWS     BLOCKS  AVG_SPACE  CHAIN_CNT GLO USE STA
------------------------------ ---------- ---------- ---------- ---------- --- --- ---
349990815    1144832          0          0 YES NO  NO
SYS_P341                         21866814      71297          0          0 YES NO  NO
SYS_P342                         21889112      72317          0          0 YES NO  NO
SYS_P343                         21877726      71297          0          0 YES NO  NO
SYS_P344                         21866053      71297          0          0 YES NO  NO
SYS_P345                         21870127      71297          0          0 YES NO  NO
SYS_P346                         21887971      72317          0          0 YES NO  NO
SYS_P347                         21875056      71297          0          0 YES NO  NO
SYS_P348                         21891454      72317          0          0 YES NO  NO
SYS_P349                         21883576      72317          0          0 YES NO  NO
SYS_P350                         21859704      71297          0          0 YES NO  NO
SYS_P351                         21866820      71297          0          0 YES NO  NO
SYS_P352                         21865681      71297          0          0 YES NO  NO
SYS_P353                         21865239      71297          0          0 YES NO  NO
SYS_P354                         21870373      71297          0          0 YES NO  NO
SYS_P355                         21882656      71297          0          0 YES NO  NO
SYS_P356                         21872453      71297          0          0 YES NO  NO

17 rows selected.

It has 4 indexes, out of which some are reverse key indexes:

SQL> select index_name,index_type from user_indexes where table_name = 'ORDER_ITEMS';

INDEX_NAME                     INDEX_TYPE
------------------------------ ---------------------------
ITEM_ORDER_IX                  NORMAL/REV
ORDER_ITEMS_UK                 NORMAL
ORDER_ITEMS_PK                 NORMAL
ITEM_PRODUCT_IX                NORMAL/REV

SQL>

This is going to be interesting (I have to admit I didn’t check initially for reverse key indexes). All I wanted was a huge table to see if a smart scan really is so mind blowingly fast. These examples can easily be reproduced by generating the SOE schema with the oewizard-just make sure you select the use of partitioning (you should have a license for it).

Performance Testing

My plan was to start off with a serial execution, then use parallel query and check for execution times.As with all performance tuning of this kind you should have a copy of session snapper from Tanel Poder available. At the time of this writing, the latest version was 3.52 available from Tanel’s blog.

I also wanted to see when a smart scan kicked in. Here’s the first test with serial execution:

10:44:37 SQL> select count(*) from order_items
10:44:40   2  /

COUNT(*)
----------
350749016

Elapsed: 00:00:47.54

OK, that doesn’t look like a smart scan has happened, the 47 seconds are a little too slow. As always, check using snapper to confirm:

SQL> @snapper all 5 1 243
Sampling SID 243 with interval 5 seconds, taking 1 snapshots...
setting stats to all due option = all

-- Session Snapper v3.52 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

-------------------------------------------------------------------------------------------------------------------------------------
SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
243, SOE       , STAT, session logical reads                                     ,     81.07k,     16.21k,
243, SOE       , STAT, user I/O wait time                                        ,        299,       59.8,
243, SOE       , STAT, non-idle wait time                                        ,        300,         60,
243, SOE       , STAT, non-idle wait count                                       ,      1.26k,        251,
243, SOE       , STAT, physical read total IO requests                           ,        634,      126.8,
243, SOE       , STAT, physical read total multi block requests                  ,        634,      126.8,
243, SOE       , STAT, physical read total bytes                                 ,    664.14M,    132.83M,
243, SOE       , STAT, cell physical IO interconnect bytes                       ,    664.14M,    132.83M,
243, SOE       , STAT, consistent gets                                           ,     81.07k,     16.21k,
243, SOE       , STAT, consistent gets from cache                                ,          1,         .2,
243, SOE       , STAT, consistent gets from cache (fastpath)                     ,          1,         .2,
243, SOE       , STAT, consistent gets direct                                    ,     81.07k,     16.21k,
243, SOE       , STAT, physical reads                                            ,     81.07k,     16.21k,
243, SOE       , STAT, physical reads direct                                     ,     81.07k,     16.21k,
243, SOE       , STAT, physical read IO requests                                 ,        634,      126.8,
243, SOE       , STAT, physical read bytes                                       ,    664.14M,    132.83M,
243, SOE       , STAT, calls to kcmgcs                                           ,          1,         .2,
243, SOE       , STAT, file io wait time                                         ,        395,         79,
243, SOE       , STAT, Number of read IOs issued                                 ,        635,        127,
243, SOE       , STAT, no work - consistent read gets                            ,      81.1k,     16.22k,
243, SOE       , TIME, DB CPU                                                    ,      1.19s,   237.96ms,    23.8%, |@@@       |
243, SOE       , TIME, sql execute elapsed time                                  ,      4.01s,   801.37ms,    80.1%, |@@@@@@@@  |
243, SOE       , TIME, DB time                                                   ,      4.01s,   801.37ms,    80.1%, |@@@@@@@@  |
243, SOE       , WAIT, direct path read                                          ,      2.99s,   598.63ms,    59.9%, |@@@@@@    |
243, SOE       , WAIT, kfk: async disk IO                                        ,     4.25ms,    849.4us,      .1%, |          |
--  End of Stats snap 1, end=2011-08-17 10:30:47, seconds=5

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
62% | b0hcgjs21yrq9   | direct path read          | User I/O
38% | b0hcgjs21yrq9   | ON CPU                    | ON CPU

--  End of ASH snap 1, end=2011-08-17 10:30:47, seconds=5, samples_taken=42

PL/SQL procedure successfully completed.

Right-no smart scan; this puzzled me. To recap, smart scans happen only if:

  • Direct path reads are used
  • The row source can be offloaded
  • The parameter cell_offload_processing is set to true (I think it’s “always” in 11.2)
  • There are no chained or migrated rows

Now let’s check these.

I can clearly see that direct path reads have happened from the snapper output-check. It’s also worth remembering that the decision to perform direct path reads is made on a segment basis-and each partition is a segment: keep that in mind!

You can check the execution plan to find out if the row source can be offloaded as in this example:

SQL> select * from table(dbms_xplan.display(format=>'+projection'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 2209137760

-----------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |     1 |   249K  (1)| 00:49:52 |
|   1 |  SORT AGGREGATE               |                 |     1 |            |          |
|   2 |   INDEX STORAGE FAST FULL SCAN| ITEM_PRODUCT_IX |   349M|   249K  (1)| 00:49:52 |
-----------------------------------------------------------------------------------------

Column Projection Information (identified by operation id):

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------

1 - (#keys=0) COUNT(*)[22]

14 rows selected.

You can see that the INDEX STORAGE FAST FULL SCAN clause is used, in other words the source can be offloaded. Strange that it didn’t happen though. The parameter cell_offload_processing was set to true on my system. Do you remember that the index is a reverse key index? I’m wondering if that has anything to do with it.

To rule out that there was a problem with the direct path reads I set “_serial_direct_read”=true and tried again, but it didn’t make a difference.

Another way to check for smart scans in a live system is the use of v$sql-columns IO_CELL_UNCOMPRESSED_BYTES and  IO_CELL_OFFLOAD_RETURNED_BYTES are cumulative counters for smart scan activity. However if they are 0 like in my case, they indicate some sort of issue.

This continued with parallel query: 8 or 64 slaves and still no smart scan. I even traced the execution, but there was not a single pxxx trace file with the word “smart” in it (and I made sure I captured the waits)

What was going on?

SQL_ID  4vkkk105tny60, child number 0
-------------------------------------
select /*+parallel(64)*/ count(*) from order_items

Plan hash value: 544438321

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                         | Name            | Rows  | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                  |                 |       |  4306 (100)|          |        |      |            |
|   1 |  SORT AGGREGATE                   |                 |     1 |            |          |        |      |            |
|   2 |   PX COORDINATOR                  |                 |       |            |          |        |      |            |
|   3 |    PX SEND QC (RANDOM)            | :TQ10000        |     1 |            |          |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE                |                 |     1 |            |          |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR            |                 |   349M|  4306   (1)| 00:00:52 |  Q1,00 | PCWC |            |
|*  6 |       INDEX STORAGE FAST FULL SCAN| ITEM_PRODUCT_IX |   349M|  4306   (1)| 00:00:52 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$1
6 - SEL$1 / ORDER_ITEMS@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

6 - storage(:Z>=:Z AND :Z<=:Z)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - (#keys=0) COUNT()[22]
2 - SYS_OP_MSR()[10]
3 - (#keys=0) SYS_OP_MSR()[10]
4 - (#keys=0) SYS_OP_MSR()[10]

Note
-----
- automatic DOP: Computed Degree of Parallelism is 64

Ok then I got fed up with that ITEM_PRODUCT_IX and forced a full table scan and volia-a smart scan happened.

SQL> select * from table(dbms_xplan.display_cursor('5a1x1v72ujf8s', format=>'ALL'));

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  5a1x1v72ujf8s, child number 0
-------------------------------------
select /*+ full(t) parallel(t, 8) */ count(*) from order_items t

Plan hash value: 661298821

-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name        | Rows  | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |             |       | 43358 (100)|          |       |       |        |      |            |
|   1 |  SORT AGGREGATE                |             |     1 |            |          |       |       |        |      |            |
|   2 |   PX COORDINATOR               |             |       |            |          |       |       |        |      |            |
|   3 |    PX SEND QC (RANDOM)         | :TQ10000    |     1 |            |          |       |       |  Q1,00 | P->S | QC (RAND)  |
|   4 |     SORT AGGREGATE             |             |     1 |            |          |       |       |  Q1,00 | PCWP |            |
|   5 |      PX BLOCK ITERATOR         |             |   349M| 43358   (1)| 00:08:41 |     1 |    16 |  Q1,00 | PCWC |            |
|*  6 |       TABLE ACCESS STORAGE FULL| ORDER_ITEMS |   349M| 43358   (1)| 00:08:41 |     1 |    16 |  Q1,00 | PCWP |            |
-----------------------------------------------------------------------------------------------------------------------------------

Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------

1 - SEL$1
6 - SEL$1 / T@SEL$1

Predicate Information (identified by operation id):
---------------------------------------------------

6 - storage(:Z>=:Z AND :Z<=:Z)

Column Projection Information (identified by operation id):
-----------------------------------------------------------

1 - (#keys=0) COUNT()[22]
2 - SYS_OP_MSR()[10]
3 - (#keys=0) SYS_OP_MSR()[10]
4 - (#keys=0) SYS_OP_MSR()[10]

37 rows selected.

To be sure that was correct, I retried with serial execution (check the time):

11:44:10 SQL> select /*+ full(t) single */ count(*) from order_items t;

COUNT(*)
----------
350749016

Elapsed: 00:00:12.02

@snapper all 5 1 241

Sampling SID 243 with interval 5 seconds, taking 1 snapshots...
setting stats to all due option = all

-- Session Snapper v3.52 by Tanel Poder @ E2SN ( http://tech.e2sn.com )

-------------------------------------------------------------------------------------------------------------------------------------
SID, USERNAME  , TYPE, STATISTIC                                                 ,     HDELTA, HDELTA/SEC,    %TIME, GRAPH
-------------------------------------------------------------------------------------------------------------------------------------
243, SOE       , STAT, session logical reads                                     ,    482.88k,     96.58k,
243, SOE       , STAT, application wait time                                     ,          1,         .2,
243, SOE       , STAT, user I/O wait time                                        ,        158,       31.6,
243, SOE       , STAT, non-idle wait time                                        ,        159,       31.8,
243, SOE       , STAT, non-idle wait count                                       ,      3.72k,      744.8,
243, SOE       , STAT, enqueue waits                                             ,         15,          3,
243, SOE       , STAT, enqueue requests                                          ,         14,        2.8,
243, SOE       , STAT, enqueue conversions                                       ,         21,        4.2,
243, SOE       , STAT, enqueue releases                                          ,         14,        2.8,
243, SOE       , STAT, global enqueue gets sync                                  ,         35,          7,
243, SOE       , STAT, global enqueue releases                                   ,         14,        2.8,
243, SOE       , STAT, physical read total IO requests                           ,      5.23k,      1.05k,
243, SOE       , STAT, physical read total multi block requests                  ,      4.91k,        982,
243, SOE       , STAT, physical read total bytes                                 ,      3.95G,    790.99M,
243, SOE       , STAT, cell physical IO interconnect bytes                       ,      1.72G,    343.34M,
243, SOE       , STAT, ges messages sent                                         ,         21,        4.2,
243, SOE       , STAT, consistent gets                                           ,    482.88k,     96.58k,
243, SOE       , STAT, consistent gets from cache                                ,         97,       19.4,
243, SOE       , STAT, consistent gets from cache (fastpath)                     ,         97,       19.4,
243, SOE       , STAT, consistent gets direct                                    ,    482.78k,     96.56k,
243, SOE       , STAT, physical reads                                            ,    482.78k,     96.56k,
243, SOE       , STAT, physical reads direct                                     ,    482.78k,     96.56k,
243, SOE       , STAT, physical read IO requests                                 ,      5.23k,      1.05k,
243, SOE       , STAT, physical read bytes                                       ,      3.95G,    790.99M,
243, SOE       , STAT, calls to kcmgcs                                           ,         97,       19.4,
243, SOE       , STAT, file io wait time                                         ,      19.7k,      3.94k,
243, SOE       , STAT, cell physical IO bytes eligible for predicate offload     ,      3.95G,    790.89M,
243, SOE       , STAT, cell smart IO session cache lookups                       ,          7,        1.4,
243, SOE       , STAT, cell smart IO session cache hits                          ,          7,        1.4,
243, SOE       , STAT, cell physical IO interconnect bytes returned by smart scan,      1.72G,    343.38M,
243, SOE       , STAT, table scans (long tables)                                 ,          7,        1.4,
243, SOE       , STAT, table scans (direct read)                                 ,          7,        1.4,
243, SOE       , STAT, table scan rows gotten                                    ,    147.19M,     29.44M,
243, SOE       , STAT, table scan blocks gotten                                  ,    480.07k,     96.01k,
243, SOE       , STAT, cell scans                                                ,          7,        1.4,
243, SOE       , STAT, cell blocks processed by cache layer                      ,    576.41k,    115.28k,
243, SOE       , STAT, cell blocks processed by txn layer                        ,    576.41k,    115.28k,
243, SOE       , STAT, cell blocks processed by data layer                       ,    485.06k,     97.01k,
243, SOE       , STAT, cell blocks helped by minscn optimization                 ,    576.28k,    115.26k,
243, SOE       , STAT, cell simulated session smart scan efficiency              ,      3.97G,     794.6M,
243, SOE       , STAT, cell IO uncompressed bytes                                ,      3.97G,    794.93M,
243, SOE       , TIME, DB CPU                                                    ,      2.63s,   526.72ms,    52.7%, |@@@@@@    |
243, SOE       , TIME, sql execute elapsed time                                  ,      4.01s,   801.13ms,    80.1%, |@@@@@@@@  |
243, SOE       , TIME, DB time                                                   ,      4.01s,   801.13ms,    80.1%, |@@@@@@@@  |
243, SOE       , WAIT, enq: KO - fast object checkpoint                          ,     2.19ms,    438.8us,      .0%, |          |
243, SOE       , WAIT, cell smart table scan                                     ,      1.58s,   316.49ms,    31.6%, |@@@@      |
243, SOE       , WAIT, events in waitclass Other                                 ,     8.54ms,     1.71ms,      .2%, |          |
--  End of Stats snap 1, end=2011-08-17 12:53:04, seconds=5

-----------------------------------------------------------------------
Active% | SQL_ID          | EVENT                     | WAIT_CLASS
-----------------------------------------------------------------------
66% | cfhpsq29gb49m   | ON CPU                    | ON CPU
34% | cfhpsq29gb49m   | cell smart table scan     | User I/O

--  End of ASH snap 1, end=2011-08-17 12:53:04, seconds=5, samples_taken=47

PL/SQL procedure successfully complet

Confusing, confusing. It was time to ask the experts: Frits Hoogland who reproduced the behaviour on his environment (11.2.0.1 and 11.2.0.2), as well as Kerry Osborne-both of which thought a smart scan should have happened even with the index. An Index FFS certainly used to be offloadable, and we think there is a regression bug visible with the cells. My version of the cell software is 11.2.2.2 with 11.2.0.1 BP 6. I first thought it might be a problem with 11.2.0.1 and the version of the cell software, but it probably isn’t: Kerry tested on 11.2.0.2 BP10 and cellsrv 11.2.2.3.2, Frits tested on 11.2.0.1 and 11.2.0.2 BP 6 and cellsrv 11.2.2.2.0_LINUX.X64_101206.2-1

Summary

What we are seeing isn’t something we should be seeing. I am trying to get this raised as an SR and see what happens. From my testing it seems that it potentially impacts anyone with indexes on their tables.

Are you getting the most out of your Exadata performance? Part 1

In almost all of the Exadata migration projects I’ve been part of, the client sees immediate speedup & performance increase when testing their workload on Exadata (of course, we’ve made sure that we do plan & execute the tasks right). However, my performance geek’s nature usually doesn’t allow to stop there and leave the client with just 2x or 3x performance increase. For data warehousing and reporting workloads, Exadata can do much better than just 2-3x performance increase! 

This is why I will write this article series about Getting the Most out of your Exadata Performance. I will write a bunch of random articles, based on my experience and lessons learned – and some day I may consolidate it all into a more formal paper.

So, here’s the first article (PDF format).

Linux takes a page from Solaris… pmap available on Linux.

Recently, there was a thread on an internal alias of old Sun guys.  The problem at hand was to track down a process that is consuming memory on Linux.  This is the type of problem that can be solved many ways (ps, top, etc…), but to my amazement someone mentioned that pmap could be used for Linux…. I guess I didn’t get the memo:)

About 6 months back I wrote a few entries that discussed Solaris tools for the Linux Guy in the posts:

The Oracle Exadata IO Resource Manager Limit Clause, part 2

In my previous post I described how the IO resource manager (IORM) in the exadata storage can be used both to guarantee a minimum amount of IO a database can get (which is what is covered in most material available), but also to set a maximum amount of IO. This is what Oracle calls an inter-database resource manager plan. This is set and configured at the cell level using the cellcli with ‘alter iormplan dbplan’.

Expert Oracle Exadata virtual conference

Today is the last day for getting the early bird’s rate!

http://blog.tanelpoder.com/seminar/expert-oracle-exadata-virtual-conference/

Also, our book will be out on Monday!

The Oracle Exadata IO Resource Manager Limit Clause

The Exadata IO Resource Manager (IORM) is a fundamental piece of the exadata database machine since the beginning. The function of the IORM is to guarantee a database and/or category (set via the database resource manager) gets the share of IO as defined in the resource plan at each storage server. The “share” here means the minimal amount of IO. This is widely known and documented in the Exadata documentation. Having a minimal amount of IO means you can guarantee a service level agreement (SLA).

But what if you want to define the maximal amount of IO a database may get? The case of a maximal amount of IO a database can get is that you can use a database on Exadata, and do not (or very limited) change the performance once more databases get onto the same database machine. So instead of having the minimal amount of IO a database must get guaranteed (which is what is documented in the exadata documentation), having a limit on the maximal amount a database can get?

Troubleshooting Oracle Exadata

The crew at My Oracle Support (MOS) [@myoraclesupport] have an excellent starting point for troubleshooting Oracle Exadata. I’d recommend to add this one to your MOS bookmarks.

Oracle Database Machine and Exadata Storage Server Information Center [ID 1306791.1]