First of all, the RAC Attack deep dive at Collaborate went great – thanks to everyone who participated! The room was full (20 participants) and I got evaluations from about half of them. Here’s a summary of the eval results:
There were several positive comments such as this: “I would recommend this class to others. This setup is perfect to pick up new skills and expose what ifs w/out worrying about pressing the wrong button.”
I am playing around with the Grid Infrastructure 11.2.0.2 PSU 2 and found an interesting note on My Oracle Support regarding the Patch Set Update. This reminds me that it’s always a good idea to search for a patch number on Metalink before applying a PSU. It also seems to be a good idea to wait for a few days before trying a PSU (or maybe CPU) on your DEV environment for the first time (and don’t even think about applying a PSU on production without thorough testing!)
OK, back to the story: there is a known issue with the patchset which has to do with the change in the Mutex behaviour which the PSU was intended to fix. To quote MOS note “Oracle Database Patch Set Update 11.2.0.2.2 Known Issues (Doc ID 1291879.1)”, Patch 12431716 Is a Recommended Patch for 11.2.0.2.2. In fact, Oracle strongly recommends you to apply the patch to fix Bug 12431716 – Unexpected change in mutex wait behavior in 11.2.0.2.2 PSU (higher CPU possible).
In a nutshell, not applying the patch can cause your system to suffer from excessive CPU usage and more than expected mutex contention. More information can be found in the description of Bug 12431716 Mutex waits may cause higher CPU usage in 11.2.0.2.2 PSU / GI PSU which is worth reading.
Besides this, the PSU was applied without any problems to my four node cluster, I just wish there was a way to roll out a new version of opatch to all cluster node’s $GRID_HOME and $ORACLE_HOME in one command. The overall process for the PSU is the same as already described in my previous post about Bundle Patch 3:
Then wait, and after a little while you spend trailing the logfile in $GRID_HOME/cfgtoollogs/ and having a coffee the process eventually finishes. Repeat on each node and you’re done. I’m really happy there aren’t these long readme files anymore with 8 steps to be performed, partially as root, partially as CRS owner/RDBMS owner. It reduces tge tune ut takes to apply a PSU significantly.
Happy patching!
Dave Abercrombie mailed me earlier that the May 2011 NoCOUG Journal is already out and can be viewed online at http://www.nocoug.org/Journal/NoCOUG_Journal_201105.pdf.
I met Dave at Hotsos 2011 when he attended my presentation.. apparently we have the same interest about Statistics, Linear Regression, AWR, AAS and he’s got awesome blogs about it at http://aberdave.blogspot.com. Also being the Journal Editor at Northern California Oracle Users’s Group, he invited me to write a short article about the question:
6台のクラスターストレージでの検証
InfiniBandインターフェースカード3枚ざしで3枚のInfiniBandインターフェースを差せば10GB/sを突破できる?と書いてしまった。しかし、6GB/s程度しかでなかった。
そこで、ASUS CROSSHAIR IV FORMULAを購入し再度挑戦!
結果は6.4GBps程度でした。そして、InfiniBandインターフェース別の転送量に偏りがある。これではDBのストレージとしては使えない。もしかしたら、高価なIntel Z68チップセットにすれば変わるのか?再度挑戦:
I don’t have an answer to the following TCP performance difference between two machines, but thought I’d post about what I’ve seen as I think it’s interesting, and maybe someone else has ideas.
I’m running netio (http://freshmeat.net/projects/netio/) on one machine (opensolaris) and contacting two different Linux machines (both on 2.6.18-128.el5 ), machine A and machine B.
Machine A has a network throughput of 10MB/sec with netio and machine B 100MB/sec with netio. Netio is set to send 32K chunks:
Linux machine: netio -s -b 32k -t -p 1234
Opensolaris: netio -b 32k -t -p 1234 linuxmachine
On the opensolaris machine I used dtrace to trace the connections. From dtrace, all the interactions TCP settings look the same – same windows sizes on the receive and send, same ssthresh, same congestion window sizes, but the slow machine is sending an ACK for every 2 or 3 receives whereas the fast machine is sending an ACK every 12 receives.
All three machines are on the same switch.
Here is the Dtrace output:
Fast Machine:
delta send recd (us) bytes bytes swnd snd_ws rwnd rcv_ws cwnd ssthresh 122 1448 \ 195200 7 131768 2 128872 1073725440 37 1448 \ 195200 7 131768 2 128872 1073725440 20 1448 \ 195200 7 131768 2 128872 1073725440 18 1448 \ 195200 7 131768 2 128872 1073725440 18 1448 \ 195200 7 131768 2 128872 1073725440 18 1448 \ 195200 7 131768 2 128872 1073725440 18 1448 \ 195200 7 131768 2 128872 1073725440 19 1448 \ 195200 7 131768 2 128872 1073725440 18 1448 \ 195200 7 131768 2 128872 1073725440 18 1448 \ 195200 7 131768 2 128872 1073725440 57 1448 \ 195200 7 131768 2 128872 1073725440 171 1448 \ 195200 7 131768 2 128872 1073725440 29 912 \ 195200 7 131768 2 128872 1073725440 30 / 0 195200 7 131768 2 128872 1073725440
slow machine:
delta send recd (us) bytes bytes swnd snd_ws rwnd rcv_ws cwnd ssthresh 161 / 0 195200 7 131768 2 127424 1073725440 52 1448 \ 195200 7 131768 2 128872 1073725440 33 1448 \ 195200 7 131768 2 128872 1073725440 11 1448 \ 195200 7 131768 2 128872 1073725440 143 / 0 195200 7 131768 2 128872 1073725440 46 1448 \ 195200 7 131768 2 130320 1073725440 31 1448 \ 195200 7 131768 2 130320 1073725440 11 1448 \ 195200 7 131768 2 130320 1073725440 157 / 0 195200 7 131768 2 130320 1073725440 46 1448 \ 195200 7 131768 2 131768 1073725440 18 1448 \ 195200 7 131768 2 131768 1073725440
Dtrace code
dtrace: 130717 drops on CPU 0 #!/usr/sbin/dtrace -s #pragma D option quiet #pragma D option defaultargs inline int TICKS=$1; inline string ADDR=$$2; dtrace:::BEGIN { TIMER = ( TICKS != NULL ) ? TICKS : 1 ; ticks = TIMER; TITLE = 10; title = 0; walltime=timestamp; printf("starting up ...\n"); } tcp:::send / ( args[2]->ip_daddr == ADDR || ADDR == NULL ) / { nfs[args[1]->cs_cid]=1; /* this is an NFS thread */ delta= timestamp-walltime; walltime=timestamp; printf("%6d %8d \ %8s %8d %8d %8d %8d %8d %12d %12d %12d %8d %8d %d \n", delta/1000, args[2]->ip_plength - args[4]->tcp_offset, "", args[3]->tcps_swnd, args[3]->tcps_snd_ws, args[3]->tcps_rwnd, args[3]->tcps_rcv_ws, args[3]->tcps_cwnd, args[3]->tcps_cwnd_ssthresh, args[3]->tcps_sack_fack, args[3]->tcps_sack_snxt, args[3]->tcps_rto, args[3]->tcps_mss, args[3]->tcps_retransmit ); flag=0; title--; } tcp:::receive / ( args[2]->ip_saddr == ADDR || ADDR == NULL ) && nfs[args[1]->cs_cid] / { delta=timestamp-walltime; walltime=timestamp; printf("%6d %8s / %8d %8d %8d %8d %8d %8d %12d %12d %12d %8d %8d %d \n", delta/1000, "", args[2]->ip_plength - args[4]->tcp_offset, args[3]->tcps_swnd, args[3]->tcps_snd_ws, args[3]->tcps_rwnd, args[3]->tcps_rcv_ws, args[3]->tcps_cwnd, args[3]->tcps_cwnd_ssthresh, args[3]->tcps_sack_fack, args[3]->tcps_sack_snxt, args[3]->tcps_rto, args[3]->tcps_mss, args[3]->tcps_retransmit ); flag=0; title--; }
Followup , since I did the above, I have since changed the dtrace code to include the number of unacknowledged bytes and it turns out the slow code does run up it’s unacknowleged bytes until it hits the congestion window, where as the fast machine never hits it’s congestion window:
unack unack delta bytes bytes send receive cong ssthresh bytes byte us sent recieved window window window sent recieved 139760 0 31 1448 \ 195200 131768 144800 1073725440 139760 0 33 1448 \ 195200 131768 144800 1073725440 144104 0 29 1448 \ 195200 131768 146248 1073725440 145552 0 31 / 0 195200 131768 144800 1073725440 145552 0 41 1448 \ 195200 131768 147696 1073725440 147000 0 30 / 0 195200 131768 144800 1073725440 147000 0 22 1448 \ 195200 131768 76744 72400 147000 0 28 / 0 195200 131768 76744 72400 147000 0 18 1448 \ 195200 131768 76744 72400 147000 0 26 / 0 195200 131768 76744 72400 147000 0 17 1448 \ 195200 131768 76744 72400 147000 0 27 / 0 195200 131768 76744 72400 147000 0 18 1448 \ 195200 131768 76744 72400 147000 0 56 / 0 195200 131768 76744 72400 147000 0 22 1448 \ 195200 131768 76744 72400
dtrace code:
#!/usr/sbin/dtrace -s #pragma D option quiet #pragma D option defaultargs inline int TICKS=$1; inline string ADDR=$$2; tcp:::send, tcp:::receive / ( args[2]->ip_daddr == ADDR || ADDR == NULL ) / { nfs[args[1]->cs_cid]=1; /* this is an NFS thread */ delta= timestamp-walltime; walltime=timestamp; printf("%6d %6d %6d %8d \ %8s %8d %8d %8d %8d %8d %12d %12d %12d %8d %8d %d \n", args[3]->tcps_snxt - args[3]->tcps_suna , args[3]->tcps_rnxt - args[3]->tcps_rack, delta/1000, args[2]->ip_plength - args[4]->tcp_offset, "", args[3]->tcps_swnd, args[3]->tcps_snd_ws, args[3]->tcps_rwnd, args[3]->tcps_rcv_ws, args[3]->tcps_cwnd, args[3]->tcps_cwnd_ssthresh, args[3]->tcps_sack_fack, args[3]->tcps_sack_snxt, args[3]->tcps_rto, args[3]->tcps_mss, args[3]->tcps_retransmit ); } tcp:::receive / ( args[2]->ip_saddr == ADDR || ADDR == NULL ) && nfs[args[1]->cs_cid] / { delta=timestamp-walltime; walltime=timestamp; printf("%6d %6d %6d %8s / %-8d %8d %8d %8d %8d %8d %12d %12d %12d %8d %8d %d \n", args[3]->tcps_snxt - args[3]->tcps_suna , args[3]->tcps_rnxt - args[3]->tcps_rack, delta/1000, "", args[2]->ip_plength - args[4]->tcp_offset, args[3]->tcps_swnd, args[3]->tcps_snd_ws, args[3]->tcps_rwnd, args[3]->tcps_rcv_ws, args[3]->tcps_cwnd, args[3]->tcps_cwnd_ssthresh, args[3]->tcps_sack_fack, args[3]->tcps_sack_snxt, args[3]->tcps_rto, args[3]->tcps_mss, args[3]->tcps_retransmit ); }
So the fact that it looked like the slow machine was acknowledging every second or third send was due to the fact that the receiver was already behind on acknowledging previous packets.
Now the question remains is why did the receiver get so far behind on one machine and not the other.
I’ve check the rmem values on both machines and set the to the same:
This was meant to be published shortly after my latest quiz night post as an explanatory follow up, but unfortunately I only managed to complete this note by now.
There is a more or less famous bug in ASSM (see bug 6918210 in MOS as well as Greg Rahn's and Jonathan Lewis' post) in versions below 11.2 that so far has been classified as only showing up in case of a combination of larger block sizes (greater the current default of 8K) and excessive row migrations. With such a combination it was reproducible that an UPDATE of the same data pattern residing in an ASSM tablespace caused significantly more work than doing the same in a MSSM tablespace, because apparently ASSM had problems finding suitable blocks to store the migrated rows.

Since I’ve been on a blogging hiatus for the past few months (and hopefully that will change shortly) I thought I’d mention that the inverview I did with the famous Gwen (Chen) Shapira has now been published in the May 2011 NoCOUG Journal. Hopefully you find it interesting and insightful. Feel free to leave me a comment on your thoughts.
name="googleone_share_1" style="position:relative;z-index:5;float: right; margin-left: 10px;">Thanks to The Human Fly via Twitter @sjaffarhussain I see that Oracle Corporation have a published note on How to Perform a Database Health Check. (Note 122669.1). I read this with some interest as this is something that I do quite frequently as part of my day job. (If you’d like to get me to [...]
In my earlier post I me wrote about #222222; line-height: 16px;">#2288bb; text-decoration: none;">Tanel Poder's Virtual Class #2. #222222; line-height: 16px;">Tanel is a world famous expert in Oracle internals. I was fortunate to be able to attend the class on all days except Friday. It was solid 4 hours of learning every day. Let me summarize some of the key points I appreciated as an attendee. I hope it will be useful for you making a decision to attend one later.
I'd bet regular readers might have guessed I'd never get back to the stats series, particularly given my extremely limited output this year Well, here goes ...
The theme of this post is already covered in the paper and the presentation, so if you've read either of those, then you might want to skip this. While working on the paper I realised that I'd made yet-another-incorrect-assumption about a copy_table_stats bug. I was right about the bug, which was described in an earlier post in the section titled "ORA-03113 / 07445 while copying list partition statistics". To recap, when copying statistics on multiple list subpartitions, Oracle disconnects the session and core dumps whilst copying the stats for the final OTHERS subpartition which is a DEFAULT list subpartition. I incorrectly assumed that this was because the OTHERS subpartition was the last subpartition that I was copying stats for, which seemed to make sense. But, as I was working on the paper, I thought I would have a dig around on My Oracle Support to see if there had been any reports of this bug.
Sure enough, there had been ....
Bug 10268597: DBMS_STATS.COPY_TABLE_STATS DISCONNECTS & GENERATES ORA-7445 [QOSPMINMAXPARTCOL]
Which is still a bug in 10.2.0.5 and 11.2.0.2. The real issue occurs when you copy the statistics for a DEFAULT list subpartition and I was able to confirm this by changing the order in which I copied the subpartition statistics.
Here is the example from the original post.
SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20100209_GROT', dstpartname => 'P_20100211_GROT'); PL/SQL procedure successfully completed. SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20100209_JUNE', dstpartname => 'P_20100211_JUNE'); PL/SQL procedure successfully completed. SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20100209_HALO', dstpartname => 'P_20100211_HALO'); PL/SQL procedure successfully completed. SQL> exec dbms_stats.copy_table_stats(ownname => 'TESTUSER', tabname => 'TEST_TAB1', srcpartname => 'P_20100209_OTHERS', dstpartname => 'P_20100211_OTHERS'); SQL> show parameter user_dump_dest ERROR: ORA-03114: not connected to ORACLE
If I change the order in which the subpartition stats are copied, it becomes clearer that the ORA-07445 occurs whenever I try to copy the OTHERS subpartition stats
SQL> exec dbms_stats.copy_table_stats(ownname => 'PERF_SUPPORT', tabname => 'TEST_TAB1',
srcpartname => 'P_20110201_OTHERS', dstpartname => 'P_20110212_OTHERS');
BEGIN dbms_stats.copy_table_stats(ownname => 'PERF_SUPPORT', tabname => 'TEST_TAB1',
srcpartname => 'P_20110201_OTHERS', dstpartname => 'P_20110212_OTHERS'); END;
*
ERROR at line 1:
ORA-03113: end-of-file on communication channel
SQL> exec dbms_stats.copy_table_stats(ownname => 'PERF_SUPPORT', tabname => 'TEST_TAB1',
srcpartname => 'P_20110201_MOSCOW', dstpartname => 'P_20110212_MOSCOW');
ERROR:
ORA-03114: not connected to ORACLE
SQL> exec dbms_stats.copy_table_stats(ownname => 'PERF_SUPPORT', tabname => 'TEST_TAB1',
srcpartname => 'P_20110201_LONDON', dstpartname => 'P_20110212_LONDON');
ERROR:
ORA-03114: not connected to ORACLE
SQL> exec dbms_stats.copy_table_stats(ownname => 'PERF_SUPPORT', tabname => 'TEST_TAB1',
srcpartname => 'P_20110201_SYDNEY', dstpartname => 'P_20110212_SYDNEY');
ERROR:
ORA-03114: not connected to ORACLE
Recent comments
21 weeks 1 day ago
31 weeks 6 hours ago
32 weeks 5 days ago
35 weeks 6 days ago
38 weeks 1 day ago
47 weeks 5 days ago
49 weeks 2 days ago
50 weeks 2 days ago
50 weeks 3 days ago
1 year 1 week ago