Search

OakieTags

Who's online

There are currently 0 users and 28 guests online.

Recent comments

Affiliations

An investigation into exadata

This is an investigation into an half rack database machine (the half rack database machine at VX Company). It’s an exadata/database V2, which means SUN hardware and database and cell (storage) software version 11.2.

I build a table (called ‘CG_VAR’), which consists of:
- bytes: 50787188736 (47.30 GB)
- extents: 6194
- blocks: 6199608

The table doesn’t have a primary key, nor any other constraints, nor any indexes. (of course this is not a real life situation)

No exadata optimisation

At first I disabled the Oracle storage optimisation using the session parameter ‘CELL_OFFLOAD_PROCESSING’:
alter session set cell_offload_processing=false;

Then executed: select count(*) from cg_var where sample_id=1;
The value ’1′ in the table ‘CG_VAR’ accounts for roughly 25%.

Execution plan:

Execution Plan
----------------------------------------------------------
Plan hash value: 2301354116
-------------------------------------------------------------------------------------
| Id  | Operation                  | Name   | Rows  | Bytes | Cost (%CPU)| Time    |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |        |     1 |     3 |  1692K  (1)| 05:38:34 |
|   1 |  SORT AGGREGATE            |        |     1 |     3 |            |    |
|*  2 |   TABLE ACCESS STORAGE FULL| CG_VAR |   395M|  1131M|  1692K  (1)| 05:38:34 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("SAMPLE_ID"=1)

Statistics:

592,STAT opened cursors cumulative                                                     1
592,STAT user calls                                                                    3
592,STAT session logical reads                                                   6178215
592,STAT CPU used when call started                                                15398
592,STAT CPU used by this session                                                  15398
592,STAT DB time                                                                   30667
592,STAT user I/O wait time                                                        15397
592,STAT non-idle wait time                                                        15424
592,STAT non-idle wait count                                                       96193
592,STAT session uga memory                                                      2105504
592,STAT session uga memory max                                                   982320
592,STAT session pga memory                                                      1572864
592,STAT session pga memory max                                                  1572864
592,STAT enqueue waits                                                                 3
592,STAT enqueue requests                                                              2
592,STAT enqueue conversions                                                           3
592,STAT enqueue releases                                                              2
592,STAT global enqueue gets sync                                                      5
592,STAT global enqueue releases                                                       2
592,STAT physical read total IO requests                                           48448
592,STAT physical read total multi block requests                                  48446
592,STAT physical read requests optimized                                              2
592,STAT physical read total bytes                                           50610774016
592,STAT cell physical IO interconnect bytes                                 50610774016
592,STAT ges messages sent                                                             4
592,STAT consistent gets                                                         6178215
592,STAT consistent gets from cache                                                  142
592,STAT consistent gets from cache (fastpath)                                       142
592,STAT consistent gets direct                                                  6178073
592,STAT physical reads                                                          6178073
592,STAT physical reads direct                                                   6178073
592,STAT physical read IO requests                                                 48448
592,STAT physical read bytes                                                 50610774016
592,STAT calls to kcmgcs                                                             142
592,STAT calls to get snapshot scn: kcmgss                                             1
592,STAT file io wait time                                                         17477
592,STAT Number of read IOs issued                                                 48448
592,STAT no work - consistent read gets                                          6178073
592,STAT table scans (long tables)                                                     1
592,STAT table scans (direct read)                                                     1
592,STAT table scan rows gotten                                               1596587000
592,STAT table scan blocks gotten                                                6178073
592,STAT session cursor cache hits                                                     1
592,STAT session cursor cache count                                                    1
592,STAT parse count (total)                                                           1
592,STAT execute count                                                                 1
592,STAT bytes sent via SQL*Net to client                                            528
592,STAT bytes received via SQL*Net from client                                       11
592,STAT SQL*Net roundtrips to/from client                                             2
592,STAT cell flash cache read hits                                                    2
592,WAIT enq: KO - fast object checkpoint                                            621
592,WAIT direct path read                                                      153970472
592,WAIT SQL*Net message to client                                                     3
592,WAIT SQL*Net message from client                                          1782253728
592,WAIT kfk: async disk IO                                                       264299
592,WAIT events in waitclass Other                                                  2236
592,TIME parse time elapsed                                                           20
592,TIME DB CPU                                                                153981591
592,TIME sql execute elapsed time                                              306726075
592,TIME DB time                                                               306726219

The executionplan is different with the operation ‘TABLE ACCESS STORAGE FULL’. This means the database is aware the segment is on cell/exadata storage.

We see here the SQL took 306,726,095 microseconds (306.726 seconds; time: parse time+sql execute elapsed time) for scanning 47.30 GB (!!), of which roughly half the time (153.970 seconds) was taken by the wait ‘direct path read’. The I/O wait-time is also in the statistic ‘user I/O wait time’: 15397.

I’ve executed the SQL before; there is no hard parse in the ‘time section’, there’s also no ‘parse count (hard)’ in the statistics section.

I’ve done 48448 IO requests (‘physical read total IO requests’), of which almost all IO’s (48446) where multiblock requests (‘physical read total multi block requests’). This means the average IO wait time is: 153,970,472/48,448 = 3,178 which is 3ms!

Using exadata optimisation, without storage indexes

In order to enable exadata optimisation, I reverted the session parameter ‘CELL_OFFLOAD_PROCESSING’ to ‘TRUE’.

One of the optimisation techniques of exadata is called ‘storage index’. A storage index is a list kept in the memory of the cellservers which keeps track of minimum and maximum values of fields per chunk of 1 MB (called ‘storage region’). Storage indexes are build automatically during usage of the ‘storage region’. A way to flush the storage index is to restart the cell servers. (Warning! Do not do this in a production environment!)

Execution plan:

Execution Plan
----------------------------------------------------------
Plan hash value: 2301354116
-------------------------------------------------------------------------------------
| Id  | Operation                  | Name   | Rows  | Bytes | Cost (%CPU)| Time    |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |        |     1 |     3 |  1692K  (1)| 05:38:34 |
|   1 |  SORT AGGREGATE            |        |     1 |     3 |            |    |
|*  2 |   TABLE ACCESS STORAGE FULL| CG_VAR |   395M|  1131M|  1692K  (1)| 05:38:34 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - storage("SAMPLE_ID"=1)
       filter("SAMPLE_ID"=1)

Statistics:

592,STAT opened cursors cumulative                                                     1
592,STAT user calls                                                                    3
592,STAT session logical reads                                                   6178215
592,STAT CPU used when call started                                                 3707
592,STAT CPU used by this session                                                   3707
592,STAT DB time                                                                    3856
592,STAT user I/O wait time                                                          156
592,STAT non-idle wait time                                                          156
592,STAT non-idle wait count                                                        2437
592,STAT session uga memory                                                     24054408
592,STAT session uga memory max                                                 21948904
592,STAT session pga memory                                                     25755648
592,STAT session pga memory max                                                 23527424
592,STAT enqueue waits                                                                 3
592,STAT enqueue requests                                                              2
592,STAT enqueue conversions                                                           3
592,STAT enqueue releases                                                              2
592,STAT global enqueue gets sync                                                      5
592,STAT global enqueue releases                                                       2
592,STAT physical read total IO requests                                           53714
592,STAT physical read total multi block requests                                  52661
592,STAT physical read total bytes                                           50610774016
592,STAT cell physical IO interconnect bytes                                  5575531904
592,STAT ges messages sent                                                             4
592,STAT consistent gets                                                         6178215
592,STAT consistent gets from cache                                                  142
592,STAT consistent gets from cache (fastpath)                                       142
592,STAT consistent gets direct                                                  6178073
592,STAT physical reads                                                          6178073
592,STAT physical reads direct                                                   6178073
592,STAT physical read IO requests                                                 53714
592,STAT physical read bytes                                                 50610774016
592,STAT calls to kcmgcs                                                             142
592,STAT calls to get snapshot scn: kcmgss                                             1
592,STAT cell physical IO bytes eligible for predicate offload               50610774016
592,STAT cell physical IO interconnect bytes returned by smart scan           5575531904
592,STAT cell session smart scan efficiency                                            9
592,STAT table scans (long tables)                                                     1
592,STAT table scans (direct read)                                                     1
592,STAT table scan rows gotten                                                387130468
592,STAT table scan blocks gotten                                                1503621
592,STAT cell scans                                                                    1
592,STAT cell blocks processed by cache layer                                    6549118
592,STAT cell blocks processed by txn layer                                      6549118
592,STAT cell blocks processed by data layer                                     6183339
592,STAT cell blocks helped by minscn optimization                               6549118
592,STAT cell simulated session smart scan efficiency                        50653913088
592,STAT cell IO uncompressed bytes                                          50653913088
592,STAT session cursor cache count                                                    1
592,STAT parse count (total)                                                           1
592,STAT execute count                                                                 1
592,STAT bytes sent via SQL*Net to client                                            528
592,STAT bytes received via SQL*Net from client                                      327
592,STAT SQL*Net roundtrips to/from client                                             2
592,WAIT enq: KO - fast object checkpoint                                            492
592,WAIT cell smart table scan                                                   1560351
592,WAIT SQL*Net message to client                                                     3
592,WAIT SQL*Net message from client                                             5605106
592,WAIT events in waitclass Other                                                  1590
592,TIME parse time elapsed                                                           46
592,TIME DB CPU                                                                 37064365
592,TIME sql execute elapsed time                                               38571803
592,TIME DB time                                                                38571968

The execution plan shows the ‘TABLE ACCESS STORAGE FULL’ again, to indicate cell storage, but now it was able to offload, which is visible in the predicate information section with the predicate ‘storage(“SAMPLE_ID”=1)’. Let’s see the differences:

NR ,TYP Statistic                                                                  1st           2nd          Diff
592,STAT opened cursors cumulative                                                     1              1              0
592,STAT user calls                                                                    3              3              0
592,STAT session logical reads                                                   6178215        6178215              0
592,STAT CPU used when call started                                                15398           3707         -11691
592,STAT CPU used by this session                                                  15398           3707         -11691
592,STAT DB time                                                                   30667           3856         -26811
592,STAT user I/O wait time                                                        15397            156         -15241
592,STAT non-idle wait time                                                        15424            156         -15268
592,STAT non-idle wait count                                                       96193           2437         -93756
592,STAT session uga memory                                                      2105504       24054408       21948904
592,STAT session uga memory max                                                   982320       21948904       20966584
592,STAT session pga memory                                                      1572864       25755648       24182784
592,STAT session pga memory max                                                  1572864       23527424       21954560
592,STAT enqueue waits                                                                 3              3              0
592,STAT enqueue requests                                                              2              2              0
592,STAT enqueue conversions                                                           3              3              0
592,STAT enqueue releases                                                              2              2              0
592,STAT global enqueue gets sync                                                      5              5              0
592,STAT global enqueue releases                                                       2              2              0
592,STAT physical read total IO requests                                           48448          53714           5266
592,STAT physical read total multi block requests                                  48446          52661           4215
592,STAT physical read requests optimized                                              2              0             -2
592,STAT physical read total bytes                                           50610774016    50610774016              0
592,STAT cell physical IO interconnect bytes                                 50610774016     5575531904   -45035242112
592,STAT ges messages sent                                                             4              4              0
592,STAT consistent gets                                                         6178215        6178215              0
592,STAT consistent gets from cache                                                  142            142              0
592,STAT consistent gets from cache (fastpath)                                       142            142              0
592,STAT consistent gets direct                                                  6178073        6178073              0
592,STAT physical reads                                                          6178073        6178073              0
592,STAT physical reads direct                                                   6178073        6178073              0
592,STAT physical read IO requests                                                 48448          53714           5266
592,STAT physical read bytes                                                 50610774016    50610774016              0
592,STAT calls to kcmgcs                                                             142            142              0
592,STAT calls to get snapshot scn: kcmgss                                             1              1              0
592,STAT file io wait time                                                         17477              0         -17477
592,STAT Number of read IOs issued                                                 48448              0         -48448
592,STAT cell physical IO bytes eligible for predicate offload                         0    50610774016    50610774016
592,STAT cell physical IO interconnect bytes returned by smart scan                    0     5575531904     5575531904
592,STAT cell session smart scan efficiency                                            0              9              9
592,STAT no work - consistent read gets                                          6178073              0       -6178073
592,STAT table scans (long tables)                                                     1              1              0
592,STAT table scans (direct read)                                                     1              1              0
592,STAT table scan rows gotten                                               1596587000      387130468    -1209456532
592,STAT table scan blocks gotten                                                6178073        1503621       -4674452
592,STAT cell scans                                                                    0              1              1
592,STAT cell blocks processed by cache layer                                          0        6549118        6549118
592,STAT cell blocks processed by txn layer                                            0        6549118        6549118
592,STAT cell blocks processed by data layer                                           0        6183339        6183339
592,STAT cell blocks helped by minscn optimization                                     0        6549118        6549118
592,STAT cell simulated session smart scan efficiency                                  0    50653913088    50653913088
592,STAT cell IO uncompressed bytes                                                    0    50653913088    50653913088
592,STAT session cursor cache hits                                                     1              0             -1
592,STAT session cursor cache count                                                    1              1              0
592,STAT parse count (total)                                                           1              1              0
592,STAT execute count                                                                 1              1              0
592,STAT bytes sent via SQL*Net to client                                            528            528              0
592,STAT bytes received via SQL*Net from client                                       11            327            316
592,STAT SQL*Net roundtrips to/from client                                             2              2              0
592,STAT cell flash cache read hits                                                    2              0             -2
592,WAIT enq: KO - fast object checkpoint                                            621            492           -129
592,WAIT direct path read                                                      153970472              0     -153970472
592,WAIT cell smart table scan                                                         0        1560351        1560351
592,WAIT SQL*Net message to client                                                     3              3              0
592,WAIT SQL*Net message from client                                          1782253728        5605106    -1776648622
592,WAIT kfk: async disk IO                                                       264299              0        -264299
592,WAIT events in waitclass Other                                                  2236           1590           -646
592,TIME parse time elapsed                                                           20             46             26
592,TIME DB CPU                                                                153981591       37064365     -116917226
592,TIME sql execute elapsed time                                              306726075       38571803     -268154272
592,TIME DB time                                                               306726219       38571968     -268154251

Let start at the time spend: the first run, this was: 306726075+20= 307 seconds, the second run took: 38571803+46= 39 seconds (!!). That is a reduction of 87%. That _is_ spectacular!

The waits have changed: the ‘direct path read’ is swapped for ‘cell smart table scan’.

In the (session) statistics also have some things to notice:

  • physical read total bytes: Is the same(!) It’s not strange: the size of the segment to read is known by the database, and must be read entirely from the database’s perspective.
  • cell physical IO bytes eligible for predicate offload: this statistic shows the amount of data which the cell server is able to process on behalf of the database, instead of the database processing and the cell server just delivering blocks. In this case, all bytes are processed on the cellserver (cell physical IO bytes eligible for predicate offload=physical read total bytes)
  • cell physical IO interconnect bytes: Here we see that the amount of data traffic between the database and the cell servers. Because of the offload, the amount is reduced immensely (50,610,774,016 versus 5,575,531,904) which is a reduction of 88.9% (!!)
  • cell physical IO interconnect bytes returned by smart scan: This is the amount of data which is caused by the cell smart scan (the statistic name is quite self explaining). If we compare this statistic with ‘cell physical IO interconnect bytes’, we see all data on the interconnect was caused by the smart scan.
  • cell scans: the number of scans offloaded to cellservers. one in this case.
Using exadata optimisation, with storage indexes

Upon the next execution of this statement, the execution plan stays the same, but the cellservers built the storage index.

Let’s see a comparison of the statistics of the previous run (without a storage index in place) with a second run, which will use the storage index, which is built up during the first run:

NR ,TYP Statistic                                                                  1st           2nd          Diff
592,STAT opened cursors cumulative                                                     1              1              0
592,STAT user calls                                                                    3              3              0
592,STAT session logical reads                                                   6178215        6178215              0
592,STAT CPU used when call started                                                 3707           3713              6
592,STAT CPU used by this session                                                   3707           3713              6
592,STAT DB time                                                                    3856           3745           -111
592,STAT user I/O wait time                                                          156             37           -119
592,STAT non-idle wait time                                                          156             37           -119
592,STAT non-idle wait count                                                        2437           1825           -612
592,STAT session uga memory                                                     24054408              0      -24054408
592,STAT session uga memory max                                                 21948904              0      -21948904
592,STAT session pga memory                                                     25755648              0      -25755648
592,STAT session pga memory max                                                 23527424              0      -23527424
592,STAT enqueue waits                                                                 3              3              0
592,STAT enqueue requests                                                              2              2              0
592,STAT enqueue conversions                                                           3              3              0
592,STAT enqueue releases                                                              2              2              0
592,STAT global enqueue gets sync                                                      5              5              0
592,STAT global enqueue releases                                                       2              2              0
592,STAT physical read total IO requests                                           53714          53765             51
592,STAT physical read total multi block requests                                  52661          52688             27
592,STAT physical read requests optimized                                              0          35423          35423
592,STAT physical read total bytes                                           50610774016    50610774016              0
592,STAT cell physical IO interconnect bytes                                  5575531904     5570438336       -5093568
592,STAT ges messages sent                                                             4              4              0
592,STAT consistent gets                                                         6178215        6178215              0
592,STAT consistent gets from cache                                                  142            142              0
592,STAT consistent gets from cache (fastpath)                                       142            142              0
592,STAT consistent gets direct                                                  6178073        6178073              0
592,STAT physical reads                                                          6178073        6178073              0
592,STAT physical reads direct                                                   6178073        6178073              0
592,STAT physical read IO requests                                                 53714          53765             51
592,STAT physical read bytes                                                 50610774016    50610774016              0
592,STAT calls to kcmgcs                                                             142            142              0
592,STAT calls to get snapshot scn: kcmgss                                             1              1              0
592,STAT cell physical IO bytes eligible for predicate offload               50610774016    50610774016              0
592,STAT cell physical IO bytes saved by storage index                                 0    37006434304    37006434304
592,STAT cell physical IO interconnect bytes returned by smart scan           5575531904     5570438336       -5093568
592,STAT cell session smart scan efficiency                                            9              0             -9
592,STAT table scans (long tables)                                                     1              1              0
592,STAT table scans (direct read)                                                     1              1              0
592,STAT table scan rows gotten                                                387130468      387130468              0
592,STAT table scan blocks gotten                                                1503621        1503621              0
592,STAT cell scans                                                                    1              1              0
592,STAT cell blocks processed by cache layer                                    6549118        2025039       -4524079
592,STAT cell blocks processed by txn layer                                      6549118        2025039       -4524079
592,STAT cell blocks processed by data layer                                     6183339        1666003       -4517336
592,STAT cell blocks helped by minscn optimization                               6549118        2025039       -4524079
592,STAT cell simulated session smart scan efficiency                        50653913088    13647896576   -37006016512
592,STAT cell IO uncompressed bytes                                          50653913088    13647896576   -37006016512
592,STAT session cursor cache hits                                                     0              1              1
592,STAT session cursor cache count                                                    1             1            0
592,STAT parse count (total)                                                           1              1              0
592,STAT execute count                                                                 1              1              0
592,STAT bytes sent via SQL*Net to client                                            528            528              0
592,STAT bytes received via SQL*Net from client                                      327            327              0
592,STAT SQL*Net roundtrips to/from client                                             2              2              0
592,WAIT enq: KO - fast object checkpoint                                            492            790            298
592,WAIT cell smart table scan                                                   1560351         366466       -1193885
592,WAIT SQL*Net message to client                                                     3              2             -1
592,WAIT SQL*Net message from client                                             5605106        8163815        2558709
592,WAIT events in waitclass Other                                                  1590           1723            133
592,TIME parse time elapsed                                                           46             23            -23
592,TIME DB CPU                                                                 37064365       37131355          66990
592,TIME sql execute elapsed time                                               38571803       37456254       -1115549
592,TIME DB time                                                                38571968       37457846       -1114122

We see here the storage index reduced an enormous amount of I/O on the cells (cell physical IO bytes saved by storage index: 37006434304), but didn’t result in a enormous reduction of execution time (38.5 seconds versus 37.4 seconds).

This isn’t too unexpected:
1/without exadata optimisation: total time: 307 s, IO wait (direct path read): 154 s.
2/with exadata optimisation: total time: 39 s, IO wait (cell smart table scan): 2 s.

I think you get the point now: after IO waits are brought down to 2 seconds, any further optimisation on IO (which is what the storage indexes are) only will reduce the 2 seconds further. And it did! The optimisation of the storage indexes is again spectacular, only has little impact on overall response time, because it influences a too little part of it (in fact, that is Amdahl’s law):

3/with exadata optimisation and storage indexes: total time: 37 s, IO wait (cell smart table scan): 0.3 s.

Conclusion

This is only an investigation in some very specific things of Exadata. There is much more to explore.

Using the hardware in the database machine, a single process can read 47.30 GB in 154 seconds. That is a very decent number, and sustainable with multiple processes, alias Oracle Parallel Query. That on it self is something of which many shops can benefit from. But, it gets really interesting when using the exadata optimisations. In this case, the total response time dropped from 307 to 37 seconds, with barely a wait on I/O (0.3 seconds for 47.30 GB!!)

Tagged: oracle exadata database machine performance