For those who have read the previous posting of how I engineered an Exadata disaster and want to reproduce it, here’s the script I used to generate the data.
create table t1 as with generator as ( select --+ materialize rownum id from dual connect by level <= 1e4 ) select rownum id, lpad(rownum,10,'0') v1, mod(rownum,10000) n1, trunc(rownum/100) n2, trunc(dbms_random.value(0,262144)) n_256K, trunc(dbms_random.value(0,131072)) n_128K, trunc(dbms_random.value(0,8192)) n_8k, rpad(dbms_random.string('U',3),60) padding from generator v1, generator v2 where rownum <= 1048576 ; insert into t1 select * from t1; commit; insert into t1 select * from t1; commit; insert into t1 select * from t1; commit; insert into t1 select * from t1; commit; insert into t1 select * from t1; commit; create table t1_qh compress for query high as select * from t1 ; create table t1_ah compress for archive high as select * from t1 ; begin dbms_stats.gather_table_stats( ownname => user, tabname =>'T1', method_opt => 'for all columns size 1' ); dbms_stats.gather_table_stats( ownname => user, tabname =>'T1_QH', method_opt => 'for all columns size 1' ); dbms_stats.gather_table_stats( ownname => user, tabname =>'T1_AH', method_opt => 'for all columns size 1' ); end; / create bitmap index t1_bi on t1(n_128K); create bitmap index t1_qh_bi on t1_qh(n_128K); create bitmap index t1_ah_bi on t1_ah(n_128K); select max(padding) from t1_ah where n_128k between 1000 and 1999 ;
You’ll need about 5 GB of space to create all these objects – but that shouldn’t be too difficult on the typical Exadata system. As you can see, I’ve created the data to cover a variety of tests and experiments, including comparing the side effects of different levels of compression (or not).
When you run the query, you may find that the optimizer arithmetic picks the tablescan option automatically – the choice will depend basically on your setting for the db_file_multiblock_read_count and/or the settings for the various system stats (sreadtim, mreadtim, MBRC and CPUspeed, or the equivalent no-workload settings); so, to see the performance impact, you may have to hint the query. (It’s slightly worrying that when you set up a (datawarehouse) system on Exadata it might be a good idea – at least in the short term – to set the db_file_multiblock_read_count to 128 because you really do want Oracle to think that tablescans are, in general, a pretty good fast option – we’ve only just got ver persuading people that you shouldn’t set this parameter in any version of 10g or above !
After running the query twice on each table (to get all the data cached somewhere, if possible) the most recent timings I got were as follows for the indexed access path (the 10% difference between this test and the time I reported in my last test could possibly be explained by the fact that this machine had seen some serious volume testing while the other machine was so new that I had probably been using the outer edge of every disc):
The number of different ways you can test against just this set of data is quite surprising – so don’t feel you have to stop with just the one demonstration.
Footnote – prompted by Kerry Osborne’s note, I’ve realised that my comment about disks and outer edges etc. is completely irrelevant. The entire time spent was CPU time, and the data was all cached when I ran these tests. I should have checked the machine types more closely – the slower machine was a V2, the faster was an X2.