Search

OakieTags

Who's online

There are currently 0 users and 40 guests online.

Recent comments

Affiliations

When is Exadata’s storage indexes used?

When is Exadata’s storage indexes used?

Exadata’s Storage indexes provides data pruning at the storage layer.  The storage indexes eliminating disk IO, by storing a summary of the data distribution on the disks.    MOS Note ID 1094934.1 provides the following summary:

Each disk in the Exadata storage cell is divided into equal sized pieces called storage regions (default 1MB). There is an index entry for every storage regions (1MB of data stored on disk). Each entry contains the minimum and maximum value for columns seen in ‘where’ clause predicates. Information for up to 8 columns can be stored. The index is then used to eliminate disk IO by identifying which storage regions don’t match the ‘where’ clause of a query.

The question I’ll try help answer in this blog post is, when Oracle can take advantage of the storage indexes, by sharing some of the test results and findings I came across when testing storage indexes.  The data used for these tests are simple versions of a dba_tab_columns including an id column.   The tests were performed on a quarter rack ExaData database machine (2 db nodes – with 16 cores each and 3 storage servers)

Create sorted and unsorted Test data for storage index testing
 SQL> create table pba_order AS SELECT rownum pba_id, column_name, owner, table_name
            from (SELECT column_name, owner, table_name
                          from dba_tab_columns where rownum <= 10000),
                      (select rownum from dual CONNECT BY LEVEL <=20000)
          order by pba_id;
Table created.

SQL> select count(*) from pba_order;
          COUNT(*)  --  200,000,000
 
SQL> select count(distinct(pba_id)) from pba_order;
           COUNT(DISTINCT(PBA_ID))   --      200,000,000
 
SQL> create table pba_rnd as  select * from pba_order order by DBMS_RANDOM.RANDOM;
Table created.
 
Create a few null value entries for id column
SQL> update pba_rnd set PBA_ID = null where pba_id between 100 and 120;
21 rows updated.
 
SQL> update PBA_ORDER set PBA_ID = null where pba_id between 100 and 120;
21 rows updated.
SQL> commit;
 
Collect CBO statistics for tables
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('JA2','PBA_RND');
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('JA2','PBA_ORDER');
PL/SQL procedure successfully completed.

 SQL> select TABLE_NAME, NUM_ROWS, SAMPLE_SIZE, LAST_ANALYZED from user_tables where table_name like '%PBA%'; 

TABLE_NAME     NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
PBA_RND 200000000 200000000 13-SEP-10
PBA_ORDER  200000000 200000000 13-SEP-10


Check the storage session statistics before testing

Name Value
cell physical IO bytes saved by storage index 0
 
Storage indexes works on the columns in your where clause, so let’s check out a few different predicates, please notice that I have not created any indexes on either of the tables.
SQL> select count(*) from PBA_RND where pba_id is null;     -- IS NULL on random sorted data
  COUNT(*)  --    21
 Elapsed: 00:00:00.04
 
Name Value
cell physical IO bytes saved by storage index  10303389184
 
SQL>  select count(*) from PBA_ORDER where pba_id is null;    -- IS NULL on sorted data
    COUNT(*)   --     21
Elapsed: 00:00:00.04 
 
Name Value
cell physical IO bytes saved by storage index  20866007040
 So the entry for the storage index column can contain a NULL value unlike regular b-tree indexes, which is also what Kerry Osborne blog post on Exadata offload confirms http://kerryosborne.oracle-guy.com/2010/06/exadata-offload-the-secret-sauce/#more-2531

Check the storage session statistics before testing

Name Value
cell physical IO bytes saved by storage index  72634597376


SQL> select avg(pba_id) from pba_order where pba_id between 1000000 and 1000999;   - Range scan on sorted data.

    AVG(PBA_ID)  --    1000499.5
1 row selected.
Elapsed: 00:00:00.05 
 
Name Value
cell physical IO bytes saved by storage index  82996625408
 
 SQL> select avg(pba_id) from pba_rnd where pba_id between 1000000 and 1000999;  -- Range scan on random sorted data.
AVG(PBA_ID)  --    1000499.5
 Elapsed: 00:00:05.68 
 
Name Value
cell physical IO bytes saved by storage index  82996625408

I tried a few more test scenarios, with forcing direct reads, using parallel query but it did not make any difference for this SQL.  For the between test, I could only get the storage index to kick in for the ordered data.

Check the storage session statistics before testing

Name Value
cell physical IO bytes saved by storage index  93359243264


SQL> select avg(pba_id) from pba_order where pba_id <100;      -- LESS than on sorted data

AVG(PBA_ID)  --    50
Elapsed: 00:00:00.04
 
Name Value
cell physical IO bytes saved by storage index  103721861120
 
 SQL> select avg(pba_id) from pba_rnd where pba_id <100;       -- LESS than on unsorted data
    AVG(PBA_ID)  --        50
 Elapsed: 00:00:00.13
 
Name Value
cell physical IO bytes saved by storage index 113980342272
 
 SQL> select avg(pba_id) from pba_order where pba_id > 199999000;   -- greater than on sorted data
    AVG(PBA_ID)  --     199999501
Elapsed: 00:00:00.06
 
Name Value
cell physical IO bytes saved by storage index 144962977792
 
 SQL> select avg(pba_id) from pba_rnd where pba_id > 199999000;  -- greater than on unsorted data.
AVG(PBA_ID)  --    199999501
 Elapsed: 00:00:00.62
 
Name Value
cell physical IO bytes saved by storage index 154317127680

So we can confirm that Oracle can use storage indexing for both less and greater than predicates.

Check the storage session statistics before testing

Name Value
cell physical IO bytes saved by storage index 174033272832
 SQL> select pba_id from pba_order where pba_id = 100000000;   -- equal sign on sorted data
    PBA_ID  --   100000000
 Elapsed: 00:00:00.05 
 
Name Value
cell physical IO bytes saved by storage index 184395268096
 
SQL> select pba_id from pba_rnd where pba_id = 100000000;    -- equal sign on unsorted data
    PBA_ID  -- 100000000
 Elapsed: 00:00:05.66
 
Name Value
cell physical IO bytes saved by storage index 184395268096

Again I tried a few more test scenarios, with forcing direct reads, using parallel query but it did not make any difference in this case.  For the equal sign predicate test, I could only get the storage index to kick in for the ordered data.

Before finishing up, let’s check if storage indexes are kicking in on tables compressed with EHCC.   

Segment Name Size MB
PBA_ORDER 9,937
PBA_HCC 856

 

Table Name Last Analyzed Par Flash_C Cell_fl Compressed Compressed_for
PBA_HCC 14-SEP-10 No Default Default Enabled Query High Default
PBA_ORDER 14-SEP-10 No Default Default Disabled Default

Check the storage session statistics before testing 

Name Value
cell physical IO bytes saved by storage index 0

SQL>  select avg(pba_id) from pba_order where pba_id <100;

     AVG(PBA_ID) -- 50
 Elapsed: 00:00:00.13 
 
Name Value
cell physical IO bytes saved by storage index 10362617856
 

SQL> select avg(pba_id) from pba_hcc  where pba_id <100;
      AVG(PBA_ID)  --  50
Elapsed: 00:00:00.04  
 
Name Value
cell physical IO bytes saved by storage index 11097702400
 

I you like to read more on EHCC; I have blog post on EHCC with a few test-case and conclusions on http://www.oaktable.net/blog/what-can-ehcc-do-you

Conclusion

In Exadata, the Storage Index can evaluate predicates like <, <=, =, !=, >=, >, is NULL, is NOT NULL, as you saw above I was not able to get the storage indexes to kick in for the between and equal predicate for the random sorted data.    It may work in other circumstances or test-cases; currently it is very hard to evaluate, as there is no public tracing tool available. 

In general we saw marginal better IO savings on the test-case with sorted data and that storage indexes also worked on EHCC compressed tables.    I have also performed the same tests on a global temp table and the storage indexes did work with global temp tables as well.

Kerry Osborne has a good blog post, confirming that Storage indexes also works with bind variables: http://kerryosborne.oracle-guy.com/2010/09/do-storage-indexes-work-with-bind-variables/

The performance gained from the storage indexes really rocks my world and it made me review our design and index strategy for our warehouse.

Additional comments

Haven't had much time to update my entries on this site, but i just wanted to mentioned a few notes that i had recorded on this topic, from earlier:
1) For uncompressed blocks and OLTP compressed blocks, storage index is maintained during writes
2) For HCC compressed blocks, writes invalidate region index (not the whole storage index)
3) SI is effective for tables encrypted using tablespace encryption
4) For tablespace encryption as well, writes invalidate region index
5) Queries on tables encrypted using column level encryption are not helped by SI

It has been a year or so since i collected this information so this might be outdated by now or better documented elsewhere.
Thanks & Regards, - Peter

Hi Peter I'd like to know how

Hi Peter
I'd like to know how to Check the storage session statistics ? That is,how do I know whether my sql statement use storage index or not ? Thanks.

Linda

Reply to Linda

Hey Linda,

Sorry for the late reply, just came across you question, which i suspect that you have already found an answer for by now.  From the database you can use the wait event system to check for %storage% or %cell%, but apart from the wait event system i have not come across anything else yet - this blog post is from 2010 and i'm sure a lot have changed since, and somebody else might have found something by now.  I know that Tanel Poder (http://blog.tanelpoder.com/) is working on a utility (ExaSnapper), which i hope will include more visability into the storage cell in general.  As you may have found out regarding the storage indexes, you don't really have the full control of when they are kicking in, but when they do they often work wonders.

Thanks & Regards,
-Peter

Reply to Linda

Hey Linda,

Sorry for the late reply, just came across you question, which i suspect that you have already found an answer for by now. From the database you can use the wait event system to check for %storage% or %cell%, but apart from the wait event system i have not come across anything else yet - this blog post is from 2010 and i'm sure a lot have changed since, and somebody else might have found something by now. I know that Tanel Poder (http://blog.tanelpoder.com/) is working on a utility (ExaSnapper), which i hope will include more viability into the storage cell in general. As you may have found out regarding the storage indexes, you don't really have the full control of when they are kicking in, but when they do they often work wonders.

Thanks & Regards,
-Peter

One additional comment. The

One additional comment. The storage index is for 8 columns, *per 1MB storage region*. So it is possible for each region to have a different set of 8 columns!

Direct IO

Hi peter,

You wrote:

> I tried a few more test scenarios, with forcing direct reads, using parallel query but it did not make any difference for this SQL. For the between test, I could only get the storage index to kick in for the ordered data.

A direct read is fundamental for being able to use cell depended optimisations, like smart scans or storage indexes. This means a "normal" session will do direct reads if it's doing a scan which would do multiblock IO like TABLE ACCESS FULL or FAST FULL INDEX SCAN, which is the default starting from Oracle version 11.

Reply to comment Direct IO

Hi Frits,
Thanks for detailed explaination. Given it is default bahavior the extra tests may have been pointless, i just tempt to get extra careful when dealing with Storage indexes as it is an area which have not been fully explored yet.   
-Peter

thanks for the clarification

Hi Kerry, thanks for the clarificaton, it is a really good point and easy to overlook.  I look forward to read and learn more from your posts.
-Peter

Hi Peter, I meant to follow

Hi Peter,

I meant to follow up with you. The biggest thing I noticed in my recent little spat of testing was that implicit conversions were the biggest impediment to Storage Indexes getting used. I mentioned that in the post about SI's and bind variables, but it wasn't very prominent in the post. The implicit conversion issue is a little counter intuitive since we're not used to worrying about that when we're already doing a full scan. In particular, dates were an issue because of the funky format they apparently use ('syyyy-mm-dd hh24:mi:ss'). It makes sense if they are storing it as a string (which they probably are). But they seem to have limited conversion capability at this point since '10-aug-2010' gets converted to the proper format to use the SI, but '10-aug-10' does not. Anyway, it's fun trying to figure it all out. Thanks for the post and the plug.

Kerry 

Post new comment

The content of this field is kept private and will not be shown publicly.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>
  • Lines and paragraphs break automatically.
  • You can enable syntax highlighting of source code with the following tags: <code>, <blockcode>, <c>, <cpp>, <drupal5>, <drupal6>, <java>, <javascript>, <sql>, <php>, <python>, <ruby>, <vb>, <xml>. The supported tag styles are: <foo>, [foo].
  • You may insert videos with [video:URL]

More information about formatting options

By submitting this form, you accept the Mollom privacy policy.