Search

Top 60 Oracle Blogs

Recent comments

_cursor_obsolete_threshold

At the recent Trivadis Performance Days in Zurich, Chris Antognini answered a question that had been bugging me for some time. Why would Oracle want to set the default value of _cursor_obsolete_threshold to a value like 8192 in 12.2 ?

In 11.2.0.3 the parameter was introduced with the default value 100; then in 11.2.0.4, continuing into 12.1, the default value increased to 1,024 – what possible reason could anyone have for thinking that 8192 was a good idea ?

The answer is PDBs – specifically the much larger number of PDBs a single CBD can (theoretically) support in 12.2.

In fact a few comments, and the following specific explanation, are available on MoS in Doc ID 2431353.1 “High Version Counts For SQL Statements (>1024) Post Upgrade To 12.2 and Above Causing Database Slow Performance”:

The default value of _cursor_obsolete_threshold is increased heavily (8192 from 1024) from 12.2 onwards in order to support 4096 PDBs which was only 252 PDBs till 12.1. This parameter value is the maximum limit for obsoleting the parent cursors in an multitenant environment and cannot be increased beyond 8192.

Having said, this is NOT applicable for non-CDB environment and hence for those databases, this parameter should be set to 12.1 default value manually i.e. 1024. The default value of 1024 holds good for non-CDB environment and the same parameter can be adjusted case-to-case basis should there be a problem.

It’s all about PDBs – more precisely, it’s all about CDBs running a huge number of PDBs, which is not necessarily the way that many companies are likely to use PDBs. So if you’re a fairly typical companyy running a handful of PDBs in a single CDB then it’s probably a good idea to set the parameter down to the 12.1 value of 1024 (and for bad applications I’d consider going even lower) – and this MOS note actually makes that an official recommendation.

Impact analysis

What’s the worst that could happen if you actually have many PDBs all executing the same application and that application has a few very popular and frequently executed statements? Chris Antognini described a model he’d constructed and some tests he’d done to show the effects. The following code is a variation onhis work. It addresses the following question:

If you have an application that repeatedly issues (explicitly or implicitly) parse calls but doesn’t take advantage of the session cursor cache it has to search the library cache by hash_value / sql_id for the parent cursor, then has to walk the chain of child cursors looking for the right child. What’s the difference in the work done if this “soft parse” has to walk the list to child number 8,191 instead of finding the right cursor at child number 0.

Here’s the complete code for the test:


create table t1
select 1 id from dual
/

alter table t1 add constraint t1_pk primary key (id)
/

spool cursor_obsolete.lst

alter system flush shared_pool;
alter system flush shared_pool;

set serveroutput off
select /*+ index(t1) */ id from t1 where id > 0;
select * from table(dbms_xplan.display_cursor);

execute snap_my_stats.start_snap
execute snap_my_stats.start_snap

declare
        m_id number;
begin
        for i in 100+1..100+8192 loop
                execute immediate 'alter session set optimizer_index_cost_adj = ' || i ;
                select /*+ index(t1) */ id into m_id from t1 where id > 0;
        end loop;
end;
/

set serveroutput on
execute snap_my_stats.end_snap

column sql_text format a60
select sql_id, child_number, loaded_versions, executions, sql_text from v$sql where sql_text like 'SELECT%T1%' order by child_number;

prompt  ===============
prompt  Low child reuse
prompt  ===============

set serveroutput off
execute snap_my_stats.start_snap

declare
        m_id number;
begin
        for i in 100+1..100+1024 loop
                execute immediate 'alter session set optimizer_index_cost_adj = ' || i ;
                select /*+ index(t1) */ id into m_id from t1 where id > 0;
        end loop;
end;
/

set serveroutput on
execute snap_my_stats.end_snap

prompt  ================
prompt  High child reuse
prompt  ================

set serveroutput off
execute snap_my_stats.start_snap

declare
        m_id number;
begin
        for i in 7168+1..7168+1024 loop
                execute immediate 'alter session set optimizer_index_cost_adj = ' || i ;
                select /*+ index(t1) */ id into m_id from t1 where id > 0;
        end loop;
end;
/

set serveroutput on
execute snap_my_stats.end_snap

spool off

I’ve created a table with just one row and given it a primary key. My testing query is going to be very short and simple. A query hinted to return that one row by primary key index range scan.

I’ve flushed the shared pool (twice) to minimise fringe contention from pre-existing information, then executed the statement to populate the dictionary cache and some library cache information and to check the execution plan.

The call to the package snap_my_stats is my standard method for reporting changes in v$mystat across the test. I’ve called the start_snap procedure twice in a row to make sure that its first load doesn’t add some noise to the statistics that we’re trying to capture.

The test runs in three parts.

  • First I loop 8192 times executing the same statement, but with a different value for the optimizer_index_cost_adj for each execution – this gives me the limit of 8192 child cursors, each reporting “Optimizer Mismatch” as the reason for not sharing. I’ve run a query against v$sql after this to check that I have 8192 child cursors – you’ll need to make sure your shared pool is a few hundred megabytes if you want to be sure of keeping them all in memory.
  • The second part of the test simply repeats the loop, but only for the first 1,024 child cursors. At this point the child cursors exist, so the optimizer should be doing “soft” parses rather than hard parses.
  • The final part of the test repeats the loop again, but only for the last 1,024 child cursors. Again they should exist and be usable, so the optimizer should again be doing “soft” parses rather than hard parses.

What I’m looking for is the extra work it takes for Oracle to find the right child cursor when there’s a very long chain of child cursors. From my memory of dumping the library cache in older versions of Oracle, the parent will point to a “segmented array” of pointers to child cursors, and each segment of the array will consist of 16 pointers, plus a pointer to the next segment. So if you have to find child cursor 8191 you will have to following 512 segment pointers, and 16 pointers per segment (totalling 8708 pointers) before you find the child you want – and you’re probably holding a mutex (or latch) while doing so.

One preipheral question to ask, of course, is whether Oracle keeps appending to the segmented array, or whether it uses a “pushdown” approach when allocating a new segment so that newer child cursors are near the start of the array. (i.e. will searching for child cursor 0 be the cheapest one or the most expensive one).

And the results, limited to just the second and third parts, with just a couple of small edits are as follows:


host sdiff -w 120 -s temp1.txt temp2.txt >temp.txt

===============                                            |    ================
Low child reuse                                            |    High child reuse
===============                                            |    ================

Interval:-  0 seconds                                      |    Interval:-  6 seconds

opened cursors cumulative                      2,084       |    opened cursors cumulative                      2,054
recursive calls                                6,263       |    recursive calls                                6,151
recursive cpu usage                               33       |    recursive cpu usage                              570
session logical reads                          1,069       |    session logical reads                          1,027
CPU used when call started                        33       |    CPU used when call started                       579
CPU used by this session                          37       |    CPU used by this session                         579
DB time                                           34       |    DB time                                          580
non-idle wait count                               16       |    non-idle wait count                                5
process last non-idle time                         1       |    process last non-idle time                         6
session pga memory                           524,288       |    session pga memory                            65,536
enqueue requests                                  10       |    enqueue requests                                   3
enqueue releases                                  10       |    enqueue releases                                   3
consistent gets                                1,069       |    consistent gets                                1,027
consistent gets from cache                     1,069       |    consistent gets from cache                     1,027
consistent gets pin                            1,039       |    consistent gets pin                            1,024
consistent gets pin (fastpath)                 1,039       |    consistent gets pin (fastpath)                 1,024
consistent gets examination                       30       |    consistent gets examination                        3
consistent gets examination (fastpath)            30       |    consistent gets examination (fastpath)             3
logical read bytes from cache              8,757,248       |    logical read bytes from cache              8,413,184
calls to kcmgcs                                    5       |    calls to kcmgcs                                    3
calls to get snapshot scn: kcmgss              1,056       |    calls to get snapshot scn: kcmgss              1,026
table fetch by rowid                              13       |    table fetch by rowid                               1
rows fetched via callback                          6       |    rows fetched via callback                          1
index fetch by key                                 9       |    index fetch by key                                 1
index scans kdiixs1                            1,032       |    index scans kdiixs1                            1,024
session cursor cache hits                         14       |    session cursor cache hits                          0
cursor authentications                         1,030       |    cursor authentications                         1,025
buffer is not pinned count                     1,066       |    buffer is not pinned count                     1,026
parse time cpu                                    23       |    parse time cpu                                   558
parse time elapsed                                29       |    parse time elapsed                               556
parse count (total)                            2,076       |    parse count (total)                            2,052
parse count (hard)                                11       |    parse count (hard)                                 3
execute count                                  1,050       |    execute count                                  1,028
bytes received via SQL*Net from client         1,484       |    bytes received via SQL*Net from client         1,486

Two important points to note:

  • the CPU utilisation goes up from 0.33 seconds to 5.7 seconds.
  • the number of hard parses is zero, this is all about searching for the

You might question are the 2,048-ish parse count(total) – don’t forget that we do an “execute immediate” to change the optimizer_index_cost_adj on each pass through the loop. That’s probably why we double the parse count, although the “alter session” doesn’t then report as an “execute count”.

The third call to a statement is often an important one – it’s often the first one that doesn’t need “cursor authentication”, so I ran a similar test executing the last two loops a second time – there was no significant change in the CPU or parse activity between the 2nd and 3rd executions of each cursor. For completeness I also ran a test with the loop for the last 1,024 child cursors ran before the loop for the first child cursors. Again this made no significant difference to the results – the low number child cursors take less CPU to find than the high number child cursors.

Bottom line

The longer the chain of child cursors the more time (elapsed and CPU) you spend searching for the correct child; and when a parent is allowed 8,192 child cursors the extra time can become significant. I would claim that the ca. 5 seconds difference in CPU time appearing in this test corresponds purely to an extra 5 milliseconds walking an extra 7,000 steps down the chain.

If you have a well-behaved application that uses the session cursor cache effectively, or uses “held cursors”, then you may not be worried by very long chains of child cursors. But I have seen many applications where cursor caching is not used and every statement execution from the client turns into a parse call (usually implicit) followed by a hunt through the library cache and walk along the child chain. These applications will not scale well if they are cloned to multiple PDBs sharing the same CDB.

Footnote 1

The odd thing about this “cursor obselete” feature is that I have a distinct memory that when  PDBs were introduced at an ACE Director’s meeting a few years ago the first thought that crossed my mind was about the potential for someone running multiple copies of the same application as separate PDBs seeing a lot of library cache latch contention or cursor mutex contention because any popular statement would now be hitting the same parent cursor from multiple PDBs. I think the casual (i.e. neither formal, nor official) response I got when I raised the point was that the calculation of the sql_id in future releases would take the con_id into consideration. It seems that that idea fell by the wayside.

Footnote 2

If you do see a large number of child cursors for a single parent then you will probably end up looking at v$sql_shared_cursor for the sql_id to see if that gives you some good ideas about why a particular statement has generated so many child cursors. For a list of explainations of the different reasons captured in this view MOS Doc Id  296377.1“Troubleshooting: High Version Count Issues” is a useful reference.