Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

Thanks, Koert (and Marco and James ...)!

The members of the Oak Table Network had long bemoaned the slightly tired look of the old oaktable.net website. James Morle put a lot of spare-time effort into running this for years now but it really needed some attention in terms of the content and presentation.

But, as all discussions about the website seemed to take place over whisky, beer or wine ... nothing much seemed to happen!

Actually, what happened was that a small team of volunteers got together and got to work, each with their own responsibilities. As far as I know, that team consisted of Kurt the Dude guy, Marco Gralike of Amis and James. Apologies if I've forgotten someone or over-emphasise Koert's role in doing the majority of the Drupal work, but the new site is a joy to behold, relative to the old one, and as it includes Twitter and blog feeds, I think it'll be a more useful daily resource.

Thanks, guys - great job! (Although I'm not suprised ....)

If anyone has any feedback, we're all ears.

OTN

That’s the Oak Table Network, not the other OTN. We have a new website which, amongst other things,  includes a convenient blog aggregator that will include the blogs of all Oak Table members. I’ll probably be taking all the Oak table members off my blog roll some time soon since I’ve found it very convenient to [...]

Statistics on Partitioned Tables - Part 6b - COPY_TABLE_STATS - Mistakes

Sigh ... these posts have become a bit of a mess.

There are so many different bits and pieces I want to illustrate and I've been trying to squeeze them in around normal work. Worse still, because I keep leaving them then coming back to them and re-running tests it's easy to lose track of where I was, despite using more or less the same test scripts each time (any new scripts tend to be sections of the main test script). I suspect my decision to only pull out the more interesting parts of the output has contributed to the difficulties too, but with around 18.5 thousand lines of output, I decided that was more or less essential.

It has got so bad that I noticed the other day that there were a couple of significant errors in the last post which are easy to miss when you're looking at detailed output and must be even less obvious if you're looking at it for the first time.

The fact no-one said much about these errors reinforces my argument with several bloggers that less people read and truly absorb the more technical stuff than they think. They just pick up the messages they need and take more on trust than you might imagine!

So what were the errors? Possibly more important, why did they appear? The mistakes are often as instructive as the successes.

Error 1

This is the tail-end of the subpartition stats at the end of part 5

SQL> select  table_name, subpartition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_subpartitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     SUBPARTITION_NAME              GLO LAST_ANALYZED          NUM_ROWS
------------------------------ ------------------------------ --- -------------------- ----------
<>                                              
 
TEST_TAB1                      P_20100209_GROT                NO  28-FEB-2010 21:41:47          3
TEST_TAB1                      P_20100209_HALO                NO  28-FEB-2010 21:41:49          3
TEST_TAB1                      P_20100209_JUNE                NO  28-FEB-2010 21:41:49          3
TEST_TAB1                      P_20100209_OTHERS              NO  28-FEB-2010 21:41:50          3

Compared to the supposedly same section produced at the start of part 6a :-

TABLE_NAME                     SUBPARTITION_NAME              GLO LAST_ANALYZED          NUM_ROWS           
------------------------------ ------------------------------ --- -------------------- ----------           
<>                                              

TEST_TAB1                      P_20100209_GROT                YES 28-MAR-2010 15:38:32          3           
TEST_TAB1                      P_20100209_HALO                YES 28-MAR-2010 15:38:32          3           
TEST_TAB1                      P_20100209_JUNE                YES 28-MAR-2010 15:38:32          3           
TEST_TAB1                      P_20100209_OTHERS              YES 28-MAR-2010 15:38:33          3           

Spot the difference? Forget the timestamps for now, although I hope it's clear that :-

- The stats were gathered at different times
- I really need to get my blogging act together ;-)

Instead, notice that GLOBAL_STATS is set to NO in part 5 and YES in part 6a. How could that happen?

The first thing to note is that it's probably not as significant as it first appears because what does it mean for Subpartition stats to be Global when there are no underlying sub-components of a Subpartition? In fact I'd argue that all Subpartition stats are Global by implication but I may be missing something important. (Comments welcome ...)

Instead I'll focus on how you can manage to get the two different results. The output from part 5 was the result of gathering statistics on a load table (LOAD_TAB1) and then exchanging it with the relevant subpartition of TEST_TAB1 (as shown in part 5). When you do that, the GLOBAL_STATS flag will be set to NO.

If I take the alternate approach of exchanging LOAD_TAB1 with the subpartition of TEST_TAB1 and only then gathering statistics on the subpartition of TEST_TAB1 then GLOBAL_STATS will be YES for that subpartition. That's the most obvious reason I can think of for the discrepancy but I can't be certain because the log files that I took the output from are history now.

At some point when ripping parts of a master script to run in isolation for each blog post I've changed the stats gathering approach from gather-then-exchange to exchange-then-gather. The output shown in part 5 was correct so I've updated part 6a to reflect that and added a note.

Error 2

I think this one is worse, because it's down to me mixing up some pastes because the original results looked wrong when they were, in fact right. It's extremely rare for me to edit results and I regret doing it here. Whenever you start tampering with the evidence, you're asking for trouble!

When I'd been pasting in the final example output, showing that subpartition stats had been copied for the new P_20100210_GROT subpartition, I saw another example when the subpartition stats hadn't been copied, so I decided I was mistaken and fixed the post. But the original was correct so I've put it back the way it should be and added a further note.

If you weren't confused already, you have my permission to be utterly confused now ;-)

Summary

On a more serious note, let's recap what I'm trying to do here and what does and doesn't work.

I've added a new partition for the 20100210 reporting date and I tried to copy the partition and subpartition stats from the previous partition (P_20100209) to the new partition. I attempted a Partition-level copy

SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', 
                                      srcpartname => 'P_20100209', dstpartname => 'P_20100210');

PL/SQL procedure successfully completed.

in the expectation that DBMS_STATS would copy the partition stats as well as all of the subpartition stats from the P_20100209 partition. I'll repeat how the stats looked here ...

SQL> select  table_name, global_stats, last_analyzed, num_rows
  2  from dba_tables
  3  where table_name='TEST_TAB1'
  4  and owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     GLO LAST_ANALYZED          NUM_ROWS                                       
------------------------------ --- -------------------- ----------                                       
TEST_TAB1                      NO                                                                        

SQL>
SQL> select  table_name, partition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_partitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     PARTITION_NAME                 GLO LAST_ANALYZED          NUM_ROWS        
------------------------------ ------------------------------ --- -------------------- ----------        
TEST_TAB1                      P_20100131                     NO                                         
TEST_TAB1                      P_20100201                     NO                                         
TEST_TAB1                      P_20100202                     NO                                         
TEST_TAB1                      P_20100203                     NO                                         
TEST_TAB1                      P_20100204                     NO                                         
TEST_TAB1                      P_20100205                     NO                                         
TEST_TAB1                      P_20100206                     NO                                         
TEST_TAB1                      P_20100207                     NO                                         
TEST_TAB1                      P_20100209                     NO  28-MAR-2010 15:38:38         12        
TEST_TAB1                      P_20100210                     NO                                         

10 rows selected.

SQL> select  table_name, subpartition_name, global_stats, last_analyzed, num_rows
  2  from dba_tab_subpartitions
  3  where table_name='TEST_TAB1'
  4  and table_owner='TESTUSER'
  5  order by 1, 2, 4 desc nulls last;

TABLE_NAME                     SUBPARTITION_NAME              GLO LAST_ANALYZED          NUM_ROWS        
------------------------------ ------------------------------ --- -------------------- ----------        
TEST_TAB1                      P_20100131_GROT                NO                                         
TEST_TAB1                      P_20100131_HALO                NO                                         

<>

TEST_TAB1                      P_20100209_GROT                NO  28-MAR-2010 15:38:32          3        
TEST_TAB1                      P_20100209_HALO                NO  28-MAR-2010 15:38:32          3        
TEST_TAB1                      P_20100209_JUNE                NO  28-MAR-2010 15:38:32          3        
TEST_TAB1                      P_20100209_OTHERS              NO  28-MAR-2010 15:38:33          3        
TEST_TAB1                      P_20100210_GROT                NO  28-MAR-2010 15:38:33          3    
TEST_TAB1                      P_20100210_HALO                NO                                         
TEST_TAB1                      P_20100210_JUNE                NO                                         
TEST_TAB1                      P_20100210_OTHERS              NO                                         

40 rows selected.

So here's where we are

- No sign of partition statistics for P_20100210, despite the P_20100209 'source' partition having valid stats.
- The subpartition stats have been copied from P_20100209_GROT to P_20100210_GROT.
- The subpartition stats have not been copied for the other three P_20100210 partitions.

Weird, right? I've checked this over and over and I'm pretty sure I'm right, but decided to upload the entire script and output here in case others can spot some mistake I'm making.

Getting back down to earth, though, this isn't the end of the world. It just means that for our particular stats collection strategy of loading data into load tables, exchanging them with subpartitions and then copying earlier stats, we just need to make sure we're working at the subpartition level all the time and that's what I'll look at next.

Finally, I'll re-emphasise that this is not the only strategy and it's fair to say it's flushing out some unusual effects that you might never see if you work primarily with Table and Partition-level stats!

cursor: pin S waits, sporadic CPU spikes and systematic troubleshooting

I recently consulted one big telecom and helped to solve their sporadic performance problem which had troubled them for some months. It was an interesting case as it happened in the Oracle / OS touchpoint and it was a product of multiple “root causes”, not just one, an early Oracle mutex design bug and a Unix scheduling issue – that’s why it had been hard to resolve earlier despite multiple SRs opened etc.

Martin Meyer, their lead DBA, posted some info about the problem and technical details, so before going on, you should read his blog entry and read my comments below after this:

Problem:

So, the problem was, that occasionally the critical application transactions which should have taken very short time in the database (<1s), took 10-15 seconds or even longer and timed out.

Symptoms:

  1. When the problem happened, the CPU usage also jumped up to 100% for the problem duration (from few tens of seconds up to few minutes).
  2. In AWR snapshots (taken every 20 minutes), the “cursor: pin S” popped into top TOP5 waits (around 5-10% of total instance wait time) and sometimes also “cursor: pin S wait on X” which is a different thing, also “latch: library cache” and interestingly “log file sync”. These waits had then much higher average wait times per wait occurrence than normal (tens or hundreds of milliseconds per wait, on average).
  3. The V$EVENT_HISTOGRAM view showed lots of cursor: pin S waits taking very long time (over a second, some even 30+ seconds) and this certainly isn’t normal (Martin has these numbers in his blog entry)

AWR and OS CPU usage measurement tools are system-wide tools (as opposed to session-wide tools).

Troubleshooting:

I can’t give you exact numbers or AWR data here, but will explain the flow of troubleshooting and reasoning.

  • As the symptoms involved CPU usage spikes, I first checked whether there were perhaps logon storms going on due a bad application server configuration, where the app server suddenly decides to fire up hundreds of more connections at the same time (that happens quite often, so it’s a usual suspect when troubleshooting such issues). A logon storm can consume lots of CPU as all these new processes need to be started up in OS, they attach to SGA (syscalls, memory pagetable set-up operations) and eventually they need to find & allocate memory from shared pool and initialize session structures. This all takes CPU.However the logons cumulative statistic in AWR didn’t go up almost at all during the 20 minute snapshot, so that ruled out a logon storm. As the number of sessions in the end of AWR snapshot (compared to the beginning of it) did not go down, this ruled out a logoff storm too (which also consumes CPU as now the exiting processes need to release their resources etc).
  • It’s worth mentioning that log file sync waits also went up by over an order of magnitude (IIRC from 1-2ms to 20-60 ms per wait) during the CPU spikes. However as log file parallel write times didn’t go up so radically, this indicated that the log file sync wait time was wasted somewhere else too – which is very likely going to be CPU scheduling latency (waiting on the CPU runqueue) when CPUs are busy.
  • As one of the waits which popped up during the problem was cursor: pin S, then I chcecked V$MUTEX_SLEEP_HISTORY and it did not show any specific cursor as a significant contention point (all contention recorded in that sleep history buffer was evenly spread across many different cursors), so that indicated to me that the problem was likely not related to a single cursor related issue (a bug or just too heavy usage of that cursor). Note that this view was not queried during the worst problem time, so there was a chance that some symptoms were not in there anymore (V$MUTEX_SLEEP_HISTORY is a circular buffer of few hundred last mutex sleeps).
  • So, we had CPU starvation and very long cursor: pin S waits popping up at the same time. cursor: pin S operation should happen really fast as it’s a very simple operation (few tens of instructions only) of marking the cursor’s mutex in-flux so its reference count could be bumped up for a shared mutex get.
  • Whenever you see CPU starvation (CPUs 100% busy and runqueues are long) and latch or mutex contention, then the CPU starvation should be resolved first, as the contention may just be a symptom of the CPU starvation. The problem is that if you get unlucky and a latch or mutex holder process is preempted and taken off CPU by the scheduler, the latch/mutex holder can’t release the latch before it gets back onto CPU to complete its operation! But OS doesn’t have a clue about this, as latches/mutexes are just Oracle’s memory structures in SGA. So the latch/mutex holder is off CPU and everyone else who gets onto CPU may want to take the same latch/mutex. They can’t get it and spin shortly in hope that the holder releases it in next few microseconds, which isn’t gonna happen in this case, as the latch/mutex holder is still off CPU!
  • And now comes a big difference between latches and mutexes in Oracle 10.2: When a latch getter can’t get the latch after spinning, it will go to sleep to release the CPU. Even if there are many latch getters in the CPU runqueue before the latch holder, they all spin quickly and end up sleeping again. But when a mutex getter doesn’t get the mutex after spinning, it will not go to sleep!!! It will yield() the CPU instead, which means that it will go to the end of runqueue and try to get back onto CPU as soon as possible. So, mutex getters in 10.2 are much less graceful, they can burn a lot of CPU when the mutex they want is held by someone else for long time.
  • But so what, if a mutex holder is preempted and taken off CPU by OS scheduler – it should get back onto CPU pretty fast, once it works its way through the CPU runqueue?
  • Well, yes IF all the processes in the system have the same priority.
  • This is where a second problem comes into play – Unix process priority decay. When a process eats a lot of CPU (and does little IO / voluntary sleeping) then the OS lowers that processes CPU scheduling priority so that other, less CPU hungry processes would still get their fair share of CPU (especially when coming back from an IO wait for example etc).
  • When a mutex holder has a lower priority than most other processes and is now taken off CPU, a thing called priority inversion happens. Even though other processes do have higher priority, they can not proceed, as the critical lock or resource they need, is already held by the other process with a lower priority who can’t complete its work as the “high priority” processes keep the CPUs busy.
  • In case of latches, the problem is not that bad as the latch getters go to sleep until they are posted when the latch is released by the holder process (I’ve written about it here). But the priority inversion takes a crazy turn in case of mutexes – as their getters don’t sleep (not even for a short time) by default, but yield the CPU and try to get back to it immediately and so on until they get the mutex. That can lead to huge CPU runqueue spikes, unresponsive systems and even hangs.
  • This is why starting from Oracle 11g the mutex getters do sleep instead of just yielding the CPU and also Oracle has backported the fix into Oracle 10.2.0.4, where a patch must be applied and where the _first_spare_parameter will specify the sleep duration in centiseconds.
  • So, knowing how mutexes worked in 10.2, all the symptoms led me to suspect this priority inversion problem, greatly amplified by how the mutex getters do never sleep by default. And we checked the effective priorities of all Oracle processes in the server, and we hit the jackpot – there was a number of processes with significantly lower priorities than all other processes had. And it takes only one process with low priority to cause all this trouble, just wait until it starts modifying a mutex and is preempted while doing this.
  • So, in order to fix both of the problems which amplified each other, we had to enable HPUX_SCHED_NOAGE Oracle parameter, to prevent priority decay of the processes and set the _first_spare_parameter to 10, which meant that default mutex sleep time will be 10 centiseconds (which is pretty long time in mutex/latching world, but better than crazily retrying without any sleeping at all). That way no process (the mutex holder) is pushed back and kept away from CPU for long periods of time.

This was not a trivial problem, as it happened in Oracle / OS touchpoint and happened not because a single reason, but as a product of multiple separate reasons, amplifying each other.

There are few interesting, non-technical points here:

  1. When troubleshooting, don’t let performance tools like AWR (or any other tool!) tell you what your problem is! Your business, your users should tell you what the problem is and the tools should only be used for symptom drilldown (This is what Cary Millsap has been constantly telling us). Note how I mentioned the problem and symptoms separately in the beginning of my post – and the problem was that some business transactions (systemwide) timed out because the database response time was 5-15 seconds!
  2. The detail and scope of your performance data must have at least as good detail and scope of your performance problem!
    In other words, if your problem is measured in few seconds, then your performance data should also be sampled at least every few seconds in order to be fully systematic.

    The classic issue in this case was that the 20 minute AWR reports still showed IO wait times as main DB time consumers, but that was averaged over 20 minutes. But our problem happened severely and shortly within few seconds in that 20 minutes, so the averaging and aggregation over long period of time did hide the extreme performance issue that happened in a very short time.

Next time when it seems to be impossible to diagnose a problem and if the troubleshooting effort ends up going in circles, then you should ask, “what’s the real problem and who and how is experiencing it” and see if your performance data’s detail and scope matches that problem!

Oh, this is a good point to mention that in addition to my Advanced Oracle Troubleshooting/SQL Tuning seminars I also actually perform advanced Oracle troubleshooting consulting too! I eat mutexes for breakfast ;-)

Share/Bookmark

KGH: NO ACCESS – Buffer cache inside streams pool too!

Some time ago I wrote that since Oracle 10.2, some of the buffer cache can physically reside within shared pool granules.

I just noticed this in an 11.2 instance:

SQL> select * from v$sgastat where name like ‘KGH%’;
POOL         NAME                            BYTES
------------ -------------------------- ----------
streams pool KGH: NO ACCESS                4186144
So, it looks that also streams pool can surrender parts of its memory granules to buffer cache, if it’s unable to flush everything out from the granule for complete granule handover.
Let’s see whether that’s the case:

SQL> select last_oper_type, last_oper_mode from v$sga_dynamic_components where component = 'streams pool';
LAST_OPER_TYP LAST_OPER
------------- ---------
SHRINK        DEFERRED
Yep, the last streams pool shrink operation was left in DEFERRED status, which means the granule wasn’t handed over – streams pool kept the granule for itself, marked everything it could flush out as KGH: NO ACCESS in its heap header and handed these chunks over to buffer cache manager.
Lets check whether these chunks are actually used by buffer cache buffers:
(NB! Think twice before running this query in production as it may hold your shared pool latches for very long time):
SQL> select ksmchidx,ksmchdur,ksmchcom,ksmchptr,ksmchsiz,ksmchcls from x$ksmsst where ksmchcom = 'KGH: NO ACCESS';
  KSMCHIDX   KSMCHDUR KSMCHCOM         KSMCHPTR           KSMCHSIZ KSMCHCLS
---------- ---------- ---------------- ---------------- ---------- --------
         1          4 KGH: NO ACCESS   00000003A2401FE0    4186144 no acce
So, there’s only one chunk flushed & handed over to buffer cache in the streams pool heap. The KSMCHPTR column shows the starting address of that chunk in SGA and the KSMCHSIZ is the size of that chunk.
So, let’s see if there are any buffers within that address range. First I’ll calculate the end address of that chunk (start address + size -1 = end address)

SQL> @calc 0x00000003A2401FE0 + 4186143
                     DEC                  HEX
------------------------ --------------------
         15611199487.000            3A27FFFFF
And now lets query X$BH using that address range to see if / how many buffer cache buffers have been placed in there:

SQL> select count(*) from x$bh where rawtohex(ba) between '00000003A2401FE0' and '00000003A27FFFFF';

  COUNT(*)
----------
       483
We have just proven, that there are 483 buffer cache buffers (~3.8MB, 8kB buffers) which reside physically in streams pool heap!

Share/Bookmark

DB Optimizer 2.5 Beta Opportunity

I now have the opportunity to offer a 6 week beta trial of DB Optimizer 2.5 which shows the performance of the databases, identifying load, bottlenecks, top SQL and top Users as well as offering SQL tuning advice, index analysis and visual SQL tuning is now available for beta testing.
More information on database performance in DB Optimizer at
More information on visual SQL tuning and analysis at
To participate in this beta, you will need to complete the application form, as well as DBOptimizer Beta Agreement.

ORA-01719 is partially relaxed

You most likely have seen this error before: ORA-01719: outer join operator (+) not allowed in operand of OR or IN Cause: An outer join appears in an or clause. Action: If A and B are predicates, to get the effect of (A(+) or B), try (select where (A(+) and not B)) union all (select [...]

Something I recently unlearned...

This is how many of the things I learn everyday come into being. They are actually things I have to "unlearn" because what used to be true has changed over time.

Once upon a time ago - I remember the day I learned this, it was during a benchmark in 1993, I learned that UNINDEXED foreign keys had some locking implications. Specifically if you:

  • update the parent primary key (which does happen, some 'frameworks' update every column even if the value did not change)
  • delete from parent

Then you should probably index the foreign key in the child table - else there will be a full table lock placed on the child table - for the duration of the transaction.

Then Oracle 9i was released and I had to relearn the rule. The rule in 9i was as above still - just modified as to the duration of the lock (many people think the restriction actually went away - but it did not, it was changed). In 9i and above, if you update the parent or delete from the parent with an unindexed foreign key - the child table is still locked - just for the duration of the update or delete! The lock is released after the statement processed - not when you commit. This was "better", but the lock still exists.

Sometime during 9i - I learned yet another modification to the rule above. The rule in 9i now has to include:

  • if you merge into the parent table

in addition to update and delete. As I was getting ready to add that to the 2nd Edition of Expert Oracle Database Architecture - I learned something new, the rule has changed again. The MERGE doesn't always lock the table anymore in 11g Release 1 and above - so we are back to just update and delete (sort of!).

Here is the small test case you can use to verify - the set up is:

ops$tkyte%ORA9IR2> create table p ( x int primary key, y int );
Table created.

ops$tkyte%ORA9IR2> insert into p values ( 1, null );
1 row created.

ops$tkyte%ORA9IR2> insert into p values ( 2, null );
1 row created.

ops$tkyte%ORA9IR2> create table c ( x references p );
Table created.

ops$tkyte%ORA9IR2> create or replace
procedure modify_p( p_what in varchar2 )
2 as
3 pragma autonomous_transaction;
4 deadlock exception;
5 pragma exception_init( deadlock, -60 );
6 begin
7 if ( p_what = 'DELETE' ) then delete from p where x = 2;
8 elsif ( p_what = 'UPDATE' ) then update p set x = 2 where x = 2;
9 elsif ( p_what = 'MERGE' ) then
10 merge into p using (select 2 x, 42 y from dual) d
11 on (p.x=d.x)
12 when matched then update set y = d.y
13 when not matched then insert(x,y) values (d.x,d.y);
14 end if;
15 rollback;
16 dbms_output.put_line( p_what || ': successful...' );
17 exception
18 when deadlock then
19 dbms_output.put_line( p_what ||
': we deadlocked, we needed full table lock');
20 rollback;
21 end;
22 /
Procedure created.

So, a parent table with two rows - 1 and 2. An empty child table with an unindexed foreign key. A stored procedure that runs as an autonomous transaction - so it cannot share the locks of the parent transaction, if the parent transaction has anything locked - the autonomous_transaction will NOT be able to also lock it. The autonomous transaction attempts to either

  • delete row X=2 from parent
  • update row X=2 in parent
  • merge into row x=2 in parent using when matched then update, when not matched then insert

and if it deadlocks - prints out a message telling us that and rolls back. If successful, prints out a message telling us that and likewise rolls back.

To test, we just insert into the child table a record that points to row x=1 in parent (we'll never touch that row in the parent table) and then try the three DML opertions:

ops$tkyte%ORA9IR2> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA9IR2> exec modify_p( 'DELETE' );
DELETE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> rollback;
Rollback complete.

ops$tkyte%ORA9IR2>
ops$tkyte%ORA9IR2> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA9IR2> exec modify_p( 'UPDATE' );
UPDATE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> rollback;
Rollback complete.

ops$tkyte%ORA9IR2> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA9IR2> exec modify_p( 'MERGE' );
MERGE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA9IR2> rollback;
Rollback complete.

There you go, you can see it deadlocked on all three - they all needed to lock the child table before doing their work.

If you run that in 10gr1 and 10gr2 - you'll see the same results - all three lock. However, starting in 11g Release 1 - you'll see this:

ops$tkyte%ORA11GR1> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA11GR1> exec modify_p( 'DELETE' );
DELETE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> rollback;
Rollback complete.

ops$tkyte%ORA11GR1> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA11GR1> exec modify_p( 'UPDATE' );
UPDATE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> rollback;
Rollback complete.

ops$tkyte%ORA11GR1> insert into c values ( 1 );
1 row created.

ops$tkyte%ORA11GR1> exec modify_p( 'MERGE' );
MERGE: successful...
PL/SQL procedure successfully completed.

ops$tkyte%ORA11GR1> rollback;
Rollback complete.

That merge no longer locks the child table. Curious as to this change - I wanted to see if it was official or not and found bug 5970280 - from which I learned that it was officially changed and that some of you running 10.2 might see a different result for this test case (the fix was backported and is available for 10.2).

The fix is more complex than appears (aren't they always?) It is not just "turn off lock for MERGE", it is "analyze the merge and

  • if the merge just inserts - treat as an insert
  • if merge does an update or update and insert (and we are NOT updating the primary key!) treat as an update to non-primary key columns
  • if merge does an update or update and insert (and we are updating the primary key) lock child table
  • if merge includes a delete - treat as a delete and lock child table


So, if your merge was:


10 merge into p using (select 2 x, 42 y from dual) d
11 on (p.y=d.x)
12 when matched then update set x = d.x
13 when not matched then insert(x,y) values (d.x,d.y);

then you would see:


ops$tkyte%ORA11GR1> exec modify_p( 'MERGE' );
MERGE: we deadlocked, we needed full table lock
PL/SQL procedure successfully completed.

or if your merge included a possible delete branch, you would see the same.

So, I guess the rule in 11gR1 and above is, if you

  • update parent primary key
  • delete from parent
  • use a merge that does either of the above

and you have an unindexed foreign key - you should expect a full table lock on the child table. If you index that foreign key - no untoward locking will take place.

Things change over time :)

This Is A VERY Boring Blog!

I’ve been stranded in Europe for 4 days and the situation persists!  Needless to say I haven’t been thinking that much about blogging I do have a post nearly ready to go about booting 4s48c Opteron 6100 systems with _enable_NUMA_support set to TRUE. There are some caveats, and some very significant benefits as well. I’ll [...]

5th Planboard DBA Symposium: Registration now open

On June 8 Planboard will run her 5th Dutch DBA Symposium and the registration is now open. This “for Dutch DBA’s, by Dutch DBA’s” symposium has become the place to be for the serious DBA who wants to share his or her knowledge with other DBA’s in an open environment with plenty of networking time […]