Here’s a follow-up to a post I did some time ago about estimating the size of an index before you create it. The note describes dbms_stats.create_index_cost() procedure, and how it depends on the results of a call to explain plan. A recent question on the OTN database forum highlighted a bug in explain plan, however, which I can demonstrate very easily. I’ll start with a small amount of data to demonstrate the basic content that is used to calculate the index cost.
create table t1
as
with generator as (
select --+ materialize
rownum id
from dual
connect by
level <= 10000
)
select
rownum n1,
case when mod(rownum,2) = 0 then rownum end n2,
lpad(rownum,10,'0') v1,
case when mod(rownum,2) = 0 then rpad('x',10) end v2,
rpad('x',100) padding
from
generator v1,
generator v2
where
rownum <= 100000
;
begin
dbms_stats.gather_table_stats(
ownname => user,
tabname =>'T1',
method_opt => 'for all columns size 1'
);
end;
/
explain plan for
create index t1_v1 on t1(v1);
select * from table(dbms_xplan.display);
You’re likely to get some variation on the results below, particularly in the note about the estimated index size, but here’s the result I got last time I ran the test:
Explained. PLAN_TABLE_OUTPUT -------------------------------------------------------------------- Plan hash value: 2170349088 ---------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ---------------------------------------------------------------- | 0 | CREATE INDEX STATEMENT | | 100K| 1074K| 286 | | 1 | INDEX BUILD NON UNIQUE| T1_V1 | | | | | 2 | SORT CREATE INDEX | | 100K| 1074K| | | 3 | TABLE ACCESS FULL | T1 | 100K| 1074K| 278 | ---------------------------------------------------------------- Note ----- - cpu costing is off (consider enabling it) - estimated index size: 3145K bytes 15 rows selected.
As you can see, I’ve got an accurate estimate of the rows that will go into the index, with a reasonably accurate estimate of the total volume of data in the rows (column lengths only) that will go into the index. The estimated index size then allows for things like rowids, row overhead, pctfree, space management blocks, and extent sizes, resulting in a larger figure (significantly larger in this case given the relatively small length of the column used in the index).
Let’s repeat the experiment a few times with increasing numbers of rows in the table, though. Rather than give you the full output for each test, I’ll just produced the Rows and Bytes figure and the estimated size – the last three sets of figures are a little clue about what’s gone wrong:
Rows Bytes Est. Size
------ ------ ---------
1M 10MB 24MB
10M 104MB 243MB
100M 1049MB 2415MB
1000M 10GB 2483MB
10000M 102GB 2818MB
------ ------ ---------
176M 1846MB 4227MB
177M 1856MB 0B
180M 1888MB 67MB
------ ------ ---------
It looks like the estimated size is captured as a 32 bit number, so it rolls over to zero at roughly 4.3 billion. The bug is still there in 11.2.0.2, I haven’t yet checked 10.2.0.5 or 11.2.0.3
Footnote: Just in case you’re wondering, I didn’t actually create a table with 10 billion rows in it, I just used dbms_stats.set_table_stats() to tell Oracle that the table has 10 billion rows.
Recent comments
5 days 22 hours ago
1 week 16 hours ago
1 week 19 hours ago
1 week 21 hours ago
6 weeks 1 day ago
6 weeks 1 day ago
7 weeks 2 days ago
11 weeks 2 days ago
17 weeks 13 hours ago
19 weeks 5 days ago