I’ve commented in the past about the strange stories you can find on the internet about how Oracle works and how sometimes, no matter how daft those stories seem, there might be something behind them. Here’s one such remark I came across a little while ago – published in two or three places this year:
“An index that enforces referential integrity cannot be rebuilt online.”
For those not familiar with Richard Foote’s extensive blog about indexes (and if you’re not you should be) – the title of this note is a blatant hi-jacking of his preferred naming mechanism.
It’s just a short note to remind myself (and my readers) that anything you know about Oracle, and anything published on the Internet – even by Oracle Corp. and its employees – is subject to change without notice (and sometimes without being noticed). I came across one such change today while reading the Expert Oracle Exadata book by Kerry Osborne, Randy Johnson and Tanel Poder. It was just a little throwaway comment to the effect that:
Here’s one of those quick answers I give sometimes on forums or newsgroups. I forget where I wrote this, and when, and what the specific question was – but it was something to do with rebuilding an index on a small table where data was constantly being deleted and inserted.
Another problem with high insert/delete rates appears with very small indexes.
If you have a table that is small but constantly recycles its space you may also find you have an index where the number of leaf blocks puts you close to the borderline between having blevel = 1 and blevel = 2. If the size crosses that border occasionally and the statistics are updated to reflect the change – which is quite likely for a table subject to lots of updates and deletes if you have automatic stats collection enabled – then execution plans could change, resulting in dramatic changes in performance.
The workaround is fairly obvious – don’t let Oracle collect stats automatically on that table, instead create a stats-collection strategy for eliminating the change in blevel. For example, keep the stats locked except when you run your own code to deal with the stats, making sure that you overwrite the index blevel with 1 even if it has just crossed the boundary to 2.
Footnote: the reason why a change from 1 to 2 is dramatic is that Oracle ignores the blevel in the optimizer arithmetic when it is set to 1; so the change from 1 to 2 actually has the impact of a change from zero to 2. The cost of a nested loop access is “cost of single access multiplied by number of times you do it” – so the sudden appearance of a 2 in the formula gives an increment in cost of “2 * number of times you visit the table” if your small table is the second table in a nested loop – and suddenly a nested loop becomes much more expensive without a real change in the data size.
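The arithmetic in that footnote can be sketched as a toy cost model. To be clear: the function names and the simplified costing below are mine, not the optimizer’s real formula – the only point being modelled is that a blevel of 1 is treated as zero, so crossing to blevel 2 adds 2 to every probe.

```python
# Hypothetical sketch of the optimizer arithmetic described above.
# Assumed (simplified) model: cost of one indexed access = effective blevel
# + leaf blocks scanned + table blocks visited, where a blevel of 1 is
# ignored (treated as zero) - as described in the footnote.

def index_access_cost(blevel, leaf_blocks_per_key=1, table_blocks_per_key=1):
    """Simplified cost of one indexed table access (illustrative model)."""
    effective_blevel = 0 if blevel == 1 else blevel
    return effective_blevel + leaf_blocks_per_key + table_blocks_per_key

def nested_loop_cost(outer_rows, blevel):
    """Probe the inner (small) table once per outer row."""
    return outer_rows * index_access_cost(blevel)

low = nested_loop_cost(outer_rows=10_000, blevel=1)   # blevel treated as 0
high = nested_loop_cost(outer_rows=10_000, blevel=2)

# The increment is exactly 2 * number of probes of the inner table.
print(low, high, high - low)
```

With 10,000 probes the cost jumps by 20,000 even though no “real” data volume has changed – which is why the plan can flip away from the nested loop.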
Footnote 2: it should be obvious that you don’t need to rebuild the index once you know what the problem is; but since we’re talking about a small index with a blevel that is usually 1 it probably won’t take more than a fraction of a second to rebuild the index and there’s a fair chance you can find a safe moment to do it. In terms of complexity the solution is just as simple as the stats solution – so you might as well consider it. The only thing you need to be careful about is that you don’t happen to rebuild the index at a time when the blevel is likely to be 2.
Footnote 3: For an example of the type of code that will adjust the blevel of an index see this URL. (Note, the example talks about copying stats from one place to another – but the principle is the same.)
Yesterday I introduced a little framework I use to avoid the traps inherent in writing PL/SQL loops when modelling a session that does lots of simple calls to the database. I decided to publish the framework because I had recently come across an example where a series of SQL statements gives a very different result from a single PL/SQL block.
The model starts with a simple data set – which in this case is created in a tablespace using ASSM (automatic segment space management), an 8KB block size and 1MB uniform extents (in a locally managed tablespace).
create table t1
tablespace test_8k_assm
as
select
        trunc((rownum-1)/100)   n1,
        lpad('x',40)            v1,
        rpad('x',100)           padding
from
        dual
connect by
        rownum <= 20000
;

create index t1_i1 on t1(n1, v1)
tablespace test_8k_assm
;

validate index t1_i1;
execute print_table('select * from index_stats');
You can see that the n1 column is defined to have 100 rows for each of 200 different values, and that each set of one hundred rows is stored (at least initially) in a very small cluster of blocks.
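As a quick cross-check of the data layout, here’s what the trunc((rownum-1)/100) expression produces over 20,000 rows – sketched in Python rather than SQL:

```python
# Cross-check of the create-table arithmetic above:
# trunc((rownum-1)/100) over rownum = 1..20000 gives 200 distinct
# values of n1 with exactly 100 rows each.
from collections import Counter

n1 = [(rownum - 1) // 100 for rownum in range(1, 20_001)]
counts = Counter(n1)

print(len(counts))            # number of distinct n1 values
print(set(counts.values()))   # rows per value
```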
With the data set in place I am now going to pick one of the 200 values at random, delete its set of one hundred rows, re-insert them, and commit; and I’m going to repeat that process 1,000 times.
declare
        rand    number(3);
begin
        for i in 1..1000 loop
                rand := trunc(dbms_random.value(0,200));

                delete from t1
                where n1 = rand;

                insert into t1
                select
                        rand,
                        lpad('x',40),
                        rpad('x',100)
                from dual
                connect by rownum <= 100;

                commit;
        end loop;
end;
/

validate index t1_i1;
execute print_table('select * from index_stats');
You might think that this piece of code is a little strange – but it is a model of some processing that I’ve recently seen on a client site, and it has crossed my mind that it might appear in a number of systems hidden underneath the covers of dbms_job. So what does it do to the index ?
Given the delay that usually appears between the time an index entry is marked as deleted and the time that the space can be reused, and given the way I’ve engineered my data so that the space needed for the 100 rows for each key value is close to a full block (an important feature of this case), I wouldn’t be too surprised if the index had stabilised at nearly twice its original size. But that’s not what happened to my example running under ASSM. Here are the “before” and “after” results from my test:
                          Before          After
LF_ROWS                   20,000         70,327
LF_BLKS                      156            811
LF_ROWS_LEN            1,109,800      3,877,785
BR_ROWS                      155            810
BR_BLKS                        3             10
BR_ROWS_LEN                8,903         45,732
DEL_LF_ROWS                    0         50,327
DEL_LF_ROWS_LEN                0      2,767,985
DISTINCT_KEYS                200            190
MOST_REPEATED_KEY            100          1,685
BTREE_SPACE            1,272,096      6,568,320
USED_SPACE             1,118,703      3,923,517
PCT_USED                      88             60
ROWS_PER_KEY                 100            370
BLKS_GETS_PER_ACCESS          54            189
It’s a small disaster – our index has grown in size by a factor of about five, and we have more deleted rows than “real” rows. (Note, by the way, that the awfulness of the index is NOT really indicated by the PCT_USED figure – one which is often suggested as an indicator of the state of an index).
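A few lines of arithmetic on the quoted figures (a sketch using only the numbers in the table above) make the point about PCT_USED concrete: USED_SPACE still counts the space occupied by deleted entries, so the ratio looks almost healthy while more than 70% of the entries are junk.

```python
# Derived metrics from the "before" and "after" index_stats figures
# quoted above - nothing here is new data, just arithmetic on it.

before = {"lf_blks": 156, "btree_space": 1_272_096,
          "used_space": 1_118_703, "lf_rows": 20_000, "del_lf_rows": 0}
after  = {"lf_blks": 811, "btree_space": 6_568_320,
          "used_space": 3_923_517, "lf_rows": 70_327, "del_lf_rows": 50_327}

growth = after["lf_blks"] / before["lf_blks"]                  # leaf block growth
deleted_fraction = after["del_lf_rows"] / after["lf_rows"]     # junk entries
pct_used = round(100 * after["used_space"] / after["btree_space"])

print(f"leaf block growth:  x{growth:.1f}")       # about five-fold
print(f"deleted entries:    {deleted_fraction:.0%}")
print(f"pct_used:           {pct_used}")          # deceptively respectable
```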
Unfortunately this is the type of problem that doesn’t surprise me when using ASSM; it’s supposed to help with highly concurrent OLTP activity (typified by a large number of very small transactions) but runs into problems updating free space bitmaps whenever you get into “batch-like” activity.
However, there is a special consideration in play here – I’ve run the entire operation as a single pl/sql loop. Would the same problem appear if I ran each delete/insert cycle as a completely independent SQL script using the “start_1000.sql” script from my previous note ?
To test the effect of running 1,000 separate tasks, rather than executing a single pl/sql loop, I wrote the following code into the start_1.sql script that I described in the article before running start_1000.sql:
declare
        rand    number(3);
begin
        rand := trunc(dbms_random.value(0,200));

        delete from t1
        where n1 = rand;

        insert into t1
        select
                rand,
                lpad('x',40),
                rpad('x',100)
        from dual
        connect by rownum <= 100;

        commit;
end;
/
The impact was dramatically different. (Still very wasteful, but quite a lot closer to the scale of the results that you might expect from freelist management).
                          Before          After
                       ---------      ---------
LF_ROWS                   20,000         39,571
LF_BLKS                      156            479
LF_ROWS_LEN            1,109,800      2,196,047
BR_ROWS                      155            478
BR_BLKS                        3              6
BR_ROWS_LEN                8,903         26,654
DEL_LF_ROWS                    0         19,571
DEL_LF_ROWS_LEN                0      1,086,247
DISTINCT_KEYS                200            199
MOST_REPEATED_KEY            100            422
BTREE_SPACE            1,272,096      3,880,192
USED_SPACE             1,118,703      2,222,701
PCT_USED                      88             58
ROWS_PER_KEY                 100            199
BLKS_GETS_PER_ACCESS          54            102
I haven’t yet investigated why the pl/sql loop should have produced such a damaging effect – although I suspect that it might be a side effect of the pinning of bitmap blocks (amongst others, of course) that takes place within a single database call. It’s possible that the repeated database calls from SQL*Plus keep “rediscovering” bitmap blocks that show free space while the pinning effects stop the pl/sql from “going back” to bitmap blocks that have recently acquired free space.
Interestingly the impact of using ASSM was dramatically reduced if one object used freelists and the other used ASSM – and with my specific example the combination of a freelist table with an ASSM index even did better than the expected 50% usage from the “traditional” option of using freelists for both the table and index.
Note – the purpose of this note is NOT to suggest that you should avoid using ASSM in general; but if you can identify code in your system that is doing something similar to the model then it’s worth checking the related indexes (see my index efficiency note) to see if any of them are displaying the same problem as this test case. If they are you may want to do one of two things: think about a schedule for coalescing or even rebuilding problem indexes on a regular basis, or see if you can move the table, index, or both, into a tablespace using freelist management.
I wrote a short note last week that linked to a thread on the Russian Oracle forum about indexing, and if you’ve followed the thread you will have seen a demonstration that seemed to be proving the point that there were cases where an index rebuild would be beneficial.
Of course it’s not difficult to come up with cases where index rebuilds should make a difference – but it’s harder to come up with demonstrations that look realistic, so I thought I’d review the example to explain why it doesn’t really work as a good example of why you might need to think about rebuilding some production index.
The code is simple – create a table with an index, insert a batch of rows, check the index usage, repeat a few times, then rebuild the index and show that the space usage drops dramatically. Here’s the starting code (with a couple of minor changes):
create table testin (s varchar2(1000));

create index testin on testin (s)
tablespace test_16k
;

execute dbms_random.seed(0)

insert into testin
select dbms_random.string('P', 1000) s
from dual
connect by level <= 1000
;

commit;

validate index testin;

select height, blocks, lf_blks, lf_rows, lf_rows_len, pct_used
from index_stats
;
You’ll note that the test index is using a blocksize of 16KB. Since the key value is 1,000 characters it does make sense to use one of the larger block sizes for the index – although, to pick up the more common variation on that particular theme, it’s index organized tables that are usually the better candidates for larger block sizes, because a single index entry is often quite large and this can have an unpleasant effect on leaf block splits.
The call to dbms_random.string(‘P’,1000) generates a string of 1,000 characters selected from the full character set – which maximises the possible variation in values.
After the first validation, the code did the following:
insert into testin select ...
commit;
validate index testin;
select ... from index_stats;

insert into testin select ...
commit;
validate index testin;
select ... from index_stats;

alter index testin rebuild;
validate index testin;
select ... from index_stats;
This is what the four sets of results from index_stats looked like when I ran the test:
    HEIGHT     BLOCKS    LF_BLKS    LF_ROWS LF_ROWS_LEN   PCT_USED
---------- ---------- ---------- ---------- ----------- ----------
         2        128         99       1000     1013000         63
         2        256        217       2000     2026000         58
         2        384        320       3000     3039000         59
         2        256        215       3000     3039000         87     -- rebuilt
Quite clearly the rebuild has had a significant effect on the index – the size has dropped by nearly 30 percent. But apart from the fact that it tells us that a rebuild can reduce the size of an index, does it tell us anything about what’s likely to happen in a real system ?
The answer is yes. It has given us a good indication that indexes with very long keys are likely to operate with a larger amount of free space per leaf block than indexes with short keys. The pct_used of around 60% is significantly lower than the value of around 70% that you might expect for the more general case of a B-tree index with random data arrival. (In fact when I ran the test using an 8KB block size the usage was closer to 50% – but then you can only get seven of these index entries into an 8KB block, giving a guaranteed wastage of about 1KB per block.)
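The 8KB arithmetic in that aside can be checked in a couple of lines. The entry length comes from the figures above (LF_ROWS_LEN / LF_ROWS = 1,013,000 / 1,000); the usable-space figure is an assumed round number for illustration, not an exact Oracle constant – the real per-block overhead varies with version and ITL settings.

```python
# Rough check of the "seven entries per 8KB block" claim above.
entry_len = 1_013_000 // 1_000   # bytes per index entry, from index_stats
usable_8k = 8_000                # ASSUMED usable bytes per 8KB leaf block

entries_per_block = usable_8k // entry_len
wasted = usable_8k - entries_per_block * entry_len

print(entries_per_block)   # entries that fit in one leaf block
print(wasted)              # guaranteed dead space per block, in bytes
```

The dead space works out at roughly 900 bytes per block under these assumptions – close to the “1KB” quoted, and an irreducible floor on wastage however the index is maintained.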
Has it told us, though, that this type of index merits a rebuild ? Possibly not until you answer a few questions, of which the most important is probably: “What’s going to happen next on your production system ?”
We’ve started with 1,000 rows inserted at a time – is this going to go on for the lifetime of the index and, if so, how long is that lifetime ? Is our test unrepresentative because in “real life” we would be inserting 1,000 rows into a table holding 10,000,000 rows ?
Perhaps, following an initial bit of batch loading, we’re going to be dribbling rows into the table in ones and twos – which, again, would make the test unrepresentative of the full lifetime of the index. The impact of an occasional big batch operation (often a delete rather than an insert) on an index is often an indicator that special measures should be taken – to the extent that sometimes we drop an index before a batch job and rebuild it afterwards.
Maybe we’re going to be deleting 1,000 rows at random from now on just before, or just after, we do the inserts – possibly with, or without, a commit in between. How would that affect any ideas we have about rebuilds ?
The fact that a rebuild of this index at this point in time gives a significant space saving doesn’t necessarily mean much in terms of the continued use and maintenance of the index. And just to show how potentially misleading this demonstration could be, let’s just repeat the load one more time, and see what the index looks like.
Here are two sets of results – one showing what happens on the next insert if you have rebuilt the index, the other showing what happens if you simply extend the test by one more cycle without rebuilding the index:
    HEIGHT     BLOCKS    LF_BLKS    LF_ROWS LF_ROWS_LEN   PCT_USED
---------- ---------- ---------- ---------- ----------- ----------
         2        256        215       3000     3039000         87     -- rebuilt
         2        512        432       4000     4052000         58     -- after next insert

    HEIGHT     BLOCKS    LF_BLKS    LF_ROWS LF_ROWS_LEN   PCT_USED
---------- ---------- ---------- ---------- ----------- ----------
         2        384        320       3000     3039000         59     -- not rebuilt
         2        512        413       4000     4052000         61     -- after next insert
Not only is the “final” index larger because you rebuilt it part way through the test – to get to that larger state you have done 217 leaf block splits as you inserted the data, compared to 93 leaf block splits in the case where you didn’t rebuild the index. As a rough guideline, that’s about 4.5MB of extra redo log generated (trivial in absolute terms, but a significant fraction of the total redo generated).
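As a back-of-envelope sketch of where that redo estimate comes from: the split counts fall out of the LF_BLKS figures (215 to 432 after the rebuild, 320 to 413 without it), and the redo-per-split figure below is an assumption chosen for illustration – a 50/50 split of a 16KB leaf block logs very roughly the data moved plus the associated undo, and the real figure will vary.

```python
# Back-of-envelope redo comparison based on the LF_BLKS figures above.
splits_rebuilt   = 432 - 215   # leaf block splits in the final insert, rebuilt index
splits_untouched = 413 - 320   # leaf block splits, index left alone
extra_splits = splits_rebuilt - splits_untouched

redo_per_split = 36 * 1024     # ASSUMED bytes of redo per split (illustrative)
extra_redo_mb = extra_splits * redo_per_split / (1024 * 1024)

print(extra_splits)            # extra splits caused by the rebuild
print(round(extra_redo_mb, 1)) # rough extra redo, in MB
```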
To me, the importance of the original demonstration is not that it shows us an index that gets smaller when we rebuild it; its importance lies in the fact that it reminds us that we have to think carefully about how well our tests represent real scenarios.
A couple of days ago I found several referrals coming in from a question about indexing on the Russian Oracle Forum. Reading the thread I found a pointer to a comment I’d written for the Oracle-L list server a couple of years ago about Advanced Queueing and why you might find that it was necessary to rebuild the IOTs (index organized tables) that support AQ.
The queue tables are, of course, a perfect example of what I call the “FIFO” index so it’s not a surprise that they might need special consideration. Rather than rewrite the whole note I’ll just link to it from here. (One of the notes in the rest of the Oracle-L thread also points to MOS document 271855.1 which describes the whys and hows of rebuilding AQ tables.)
There are many suggestions floating around the internet about identifying which Oracle indexes to rebuild. One of these involves running the validate (or analyze index validate) command on an index and checking the resulting figures in view index_stats to determine whether or not del_lf_rows exceeds some percentage (usually 20%) of lf_rows and rebuilding the index if so. Since this suggestion came up, yet again, in a recent OTN thread about Oracle index rebuilding I thought I’d write up a quick note highlighting one of the obvious flaws in the argument.
I’m not going to bother pointing out the threat inherent in the table-locking when you use the “validate index” command (or the “analyze index validate” equivalent) but I think it’s worth making some comments about the misunderstanding built into this policy. So let’s start by building some data, creating an index on it, then deleting 90% of the data:
create table t1
as
with generator as (
        select  --+ materialize
                rownum  id
        from    dual
        connect by rownum <= 10000
)
select
        1000000 + rownum        id,
        lpad(rownum,10,'0')     small_vc
from
        generator       v1,
        generator       v2
where
        rownum <= 40000
;

create index t1_i1 on t1(id);

validate index t1_i1;

select 100 * del_lf_rows/lf_rows deletion_percent
from index_stats
;

delete from t1 where mod(id,100) >= 10;
commit;

validate index t1_i1;

select 100 * del_lf_rows/lf_rows deletion_percent
from index_stats
;
I haven’t bothered to collect statistics in this code as I’m not interested in execution plans, only in the amount of data deleted and what this does to the physical structure of the index. Here’s the output of my script starting from the moment just after I’ve created the index:
Index analyzed.

DELETION_PERCENT
----------------
               0

1 row selected.

36000 rows deleted.

Commit complete.

Index analyzed.

DELETION_PERCENT
----------------
              90

1 row selected.
According to the general advice, this index is now in need of a rebuild since del_lf_rows is far more than 20% of lf_rows – but before we rebuild the index let’s delete a little more data.
delete from t1 where mod(id,100) = 0;
commit;

validate index t1_i1;

select 100 * del_lf_rows/lf_rows deletion_percent
from index_stats
;
My first delete statement got rid of 90% of the data leaving the 4,000 rows where mod(id,100) was between zero and nine. So my second delete has eliminated 10% of the remaining 4,000 rows. Let’s see what we get when we validate the index:
400 rows deleted.

Commit complete.

Index analyzed.

DELETION_PERCENT
----------------
              10

1 row selected.
How wonderful – by deleting a few more rows we’ve got to a state where we don’t need to rebuild the index after all !
This note is not intended to tell you when you should, or should not, rebuild an index. It is simply trying to highlight the fact that anyone who thinks a figure exceeding 20% for del_lf_rows / lf_rows means an index should be rebuilt does not have a proper understanding of how indexes work, and is basing their advice on an over-simplistic model. In this case the error in the model is that it allows you to miss indexes which might actually benefit from some action.
The problem is based on a fundamental misunderstanding, which is this: if you believe that an index entry reserves a spot in an index and leaves a little hole that can never be reused when it is deleted (unless you re-insert exactly the same value) then inevitably you will think that the del_lf_rows figure is some sort of measure of actual space that could be reclaimed.
But, as the Oracle myth busters like Richard Foote have been saying for years, that’s not how Oracle’s B-tree indexes work. When you delete an index entry, Oracle marks it as deleted but leaves it in place. When you commit your transaction Oracle does nothing to the index entry – but other processes now know that the entry can be wiped from the block allowing the space to be re-used.
This is what del_lf_rows is about – it’s the number of rows that are marked as deleted by transactions that have committed; and since the validate command can only run if it has locked the table in exclusive mode, any index entries marked for deletion will inevitably be committed deletes. So after I had deleted (and committed) 36,000 rows there were 36,000 entries in the index marked as deleted and committed; when I deleted a few more entries my second transaction wiped the deleted entries from any leaf blocks it visited, tidying the blocks (with a “heap block compress”) before marking a few more rows as deleted.
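That mechanism – deleted entries left in place until a later transaction touching the block wipes the committed ones – can be modelled in a few lines. This is a toy model, not Oracle’s actual block management: the block capacity is an assumed figure, chosen so that every leaf block covers at least one id that is a multiple of 100 (which is why the second delete visits, and tidies, every block, just as it did in the real test).

```python
# Toy model of the "heap block compress" behaviour described above.
# Each leaf block holds (id, deleted_flag) entries in key order.
BLOCK_CAPACITY = 200   # ASSUMED entries per leaf block (spans 100+ ids)

ids = list(range(1_000_001, 1_040_001))        # matches the test data set
blocks = [[(i, False) for i in ids[b:b + BLOCK_CAPACITY]]
          for b in range(0, len(ids), BLOCK_CAPACITY)]

def deletion_percent(blocks):
    lf_rows = sum(len(b) for b in blocks)
    del_lf_rows = sum(1 for b in blocks for (_, d) in b if d)
    return 100 * del_lf_rows / lf_rows

def delete_where(blocks, predicate):
    """Visit each block holding a target row: first wipe entries already
    marked deleted (they are committed), then mark the targets deleted."""
    for b in blocks:
        if any(predicate(i) for (i, _) in b):
            b[:] = [(i, d) for (i, d) in b if not d]          # compress
            b[:] = [(i, d or predicate(i)) for (i, d) in b]   # mark
    # (commit happens here in the real test)

delete_where(blocks, lambda i: i % 100 >= 10)
print(round(deletion_percent(blocks)))   # after first delete

delete_where(blocks, lambda i: i % 100 == 0)
print(round(deletion_percent(blocks)))   # after second delete
```

Run the two deletes in sequence and the ratio goes from 90 down to 10, reproducing the paradox: the second, smaller delete wipes the 36,000 committed deletions as a side effect of visiting every block.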
The upshot of this is that many systems (especially OLTP systems) will see del_lf_rows as a fairly small fraction of lf_rows because most systems tend to do little updates scattered randomly across lots of leaf blocks – constantly wiping out the rows marked as deleted by earlier transactions.
It’s really only in the case where a single large delete has taken place in the recent past that you’re likely to see a lot of deleted rows still in the index when you validate it. And, as most people are now aware, a process that is supposed to do a very large delete is a process that should be considered in the design phase as a candidate for special treatment – such as dropping (or disabling) some indexes before the delete and rebuilding them afterwards. It won’t be a process where you have to validate the index to decide roughly how much data you’ve deleted and where in the index that data was; it’s a process where you’ll know what’s going to happen before it happens.