I came across a nice bit of code on OTN recently that someone had written several years ago (in 2007, in fact) to demonstrate the different ways in which the optimizer would handle “select count({something}) from table;”. If you want to copy and repeat the test code, you may need to adjust it slightly – it references a type vc2s, which I changed to dbms_stats.chararray, and it references a plan table called toad_plan_table, which I replaced with references to the standard plan_table (getting rid of the truncate as I did so).
The code simply executes a call to explain plan for different statements, then extracts the projection information for the “sort aggregate” line of the execution plan (examining the operation and options columns to do so). The thing that particularly surprised me was that there was a difference between the following two queries:
select count( 1) from t1; select count(-1) from t1;
According to the test, the first query is converted to count(*), but the second query really does count a load ’-1′s (whatever that means).
This didn’t seem entirely reasonable, but even though the code was over 5 years old it had been tested against 10gR2. I decided to repeat the test, and found that there was a slight flaw in the test because Oracle wasn’t quite telling the truth. (The implication of the test is still correct, by the way, because the two queries do, surprisingly, act differently). To highlight the anomaly, I’ve created a data set, and generated the full execution plan with projection for the two queries – and there’s an important detail in the data set that makes it easy to see the difference:
create table t1
as
select *
from all_objects
where rownum <= 20000 ; create bitmap index t1_b1 on t1(owner); begin dbms_stats.gather_table_stats( ownname => user,
tabname =>'T1',
estimate_percent => 100,
method_opt => 'for all columns size 1'
);
end;
/
Note, particularly that I’ve created a bitmap index on the owner column; with stats in place the fastest way of counting the rows in this table will be through a fast full scan on the (tiny) bitmap index. Here are the two plans – count(1) first, then count(-1):
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | BITMAP CONVERSION COUNT | | 20000 | 1 (0)| 00:00:01 | | 3 | BITMAP INDEX FAST FULL SCAN| T1_B1 | | | | ------------------------------------------------------------------------------- Column Projection Information (identified by operation id): ----------------------------------------------------------- 1 - (#keys=0) COUNT(*)[22] 2 - COUNT(*)[22] 3 - STRDEF[BM VAR, 10], STRDEF[BM VAR, 10], STRDEF[BM VAR, 7920] ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | BITMAP CONVERSION TO ROWIDS | | 20000 | 1 (0)| 00:00:01 | | 3 | BITMAP INDEX FAST FULL SCAN| T1_B1 | | | | ------------------------------------------------------------------------------- Column Projection Information (identified by operation id): ----------------------------------------------------------- 1 - (#keys=0) COUNT((-1))[22] 2 - "T1".ROWID[ROWID,10] 3 - STRDEF[BM VAR, 10], STRDEF[BM VAR, 10], STRDEF[BM VAR, 7920]
There are two things to notice here. For count(1) the optimizer has a special operation on bitmap indexes which is the bitmap conversion count – it’s basically counting set bits. For count(-1) the optimizer doesn’t recognise that the same strategy could be used, instead it does a load of arithmetic to convert bits to rowids (that’s why the projection in line 2 is t1.rowid[rowid,10]), so at this point the “-1″ simply doesn’t exist which means the sort aggregate in line 1 is actually counting rowids, even though it seems to be claiming that what it had counted was a load of ‘-1′s. The little lie at the end doesn’t really matter, of course – but I have occasionally seen variations on the bit in the middle (the unnecessary conversion to rowids) producing significant performance side effects in production code.
Recent comments
21 weeks 2 days ago
31 weeks 14 hours ago
32 weeks 5 days ago
35 weeks 6 days ago
38 weeks 1 day ago
47 weeks 5 days ago
49 weeks 2 days ago
50 weeks 2 days ago
50 weeks 3 days ago
1 year 1 week ago