If you are attending Collaborate 2012, you might be interested in my content-rich sessions below :
Session Number: 326
Session Title: SCAN, VIP, HAIP, and other RAC acronyms
Session Date/Time/Room: Tue, Apr 24, 2012 (10:45 AM – 11:45 AM) : Surf C
Session Number: 327
Session Title: Internals and Performance Boot Camp: Truss, pstack, pmap, and more
Session Date/Time/Room: Wed, Apr 25, 2012 (03:00 PM – 04:00 PM) : Palm A
Hope to see you there!
Update: I am uploading presentation files. Presentations are much more recent than the document
Last week (March 2012), I was conducting Advanced RAC Training online. During the class, I was recreating a ‘gc buffer busy’ waits to explain the concepts and methods to troubleshoot the issue.
Let’s define these events first. Event ‘gc buffer busy’ event means that a session is trying to access a buffer,but there is an open request for Global cache lock for that block already, and so, the session must wait for the GC lock request to complete before proceeding. This wait is instrumented as ‘gc buffer busy’ event.
From 11g onwards, this wait event is split in to ‘gc buffer busy acquire’ and ‘gc buffer busy release’. An attendee asked me to show the differentiation between these two wait events. Fortunately, we had a problem with LGWR writes and we were able to inspect the waits with much clarity during the class.
Temporary tablespaces are shared objects and they are associated to an user or whole database (using default temporary tablespace). So, in RAC, temporary tablespaces are shared between the instances. Many temporary tablespaces can be created in a database, but all of those temporary tablespaces are shared between the instances. Hence, temporary tablespaces must be allocated in shared storage or ASM. We will explore the space allocation in temporary tablespace in RAC, in this blog entry.
In contrast, UNDO tablespaces are owned by an instance and all transactions from that instance is exclusively allocated in that UNDO tablespace. Remember that other instances can read blocks from remote undo tablespace, and so, undo tablespaces also must be allocated from shared storage or ASM.
Space allocation in TEMP tablespace
There was a question about the wait event ‘rdbms ipc message’ in Oracle-l list. Short answer is that ‘rdbms ipc message’ event means that a process is waiting for an IPC message to arrive. Usually, this wait event can be ignored, but there are few rare scenarios this wait event can’t be completely ignored.
What is ‘rdbms ipc message’ wait means?
It is typical of Oracle Database background processes to wait for more work. For example, LGWR will wait for more work until another (foreground or background ) process request LGWR to do a log flush. In UNIX platforms, wait mechanism is implemented as a sleep on a specific semaphore associated with that process. This wait time is accounted towards database wait events ‘rdbms ipc message’.
Also note that, semaphore based waits are used in other wait scenarios too, not just ‘rdbms ipc message’ waits.
Time to Trace
It is probably easy to calculate hourly redo rate or daily redo rate using AWR data. For example, my script awr_redo_size.sql can be used to calculate daily redo rate, and awr_redo_size_history.sql can be used to calculate hourly redo rate. Hourly redo rate is especially useful since you can export to an excel spreadsheet, graph it to see redo rate trend.
Update: I added another script to calculate redo rate if you don’t have AWR license. redo_size_archived_log.sql.
Introduction to Direct Mode Writes
This video was created circa July 2011. Click the Read More link to review the video. Version Oracle Database 126.96.36.199
Synopsis: Essentially, we probe the importance of LMS processes using DTrace. Explain why LMS should run in elevated priority. How to review deep statistics about LMS processes and much more.
In this blog entry, we will explore the wonderful world of SCNs and how Oracle database uses SCN internally. We will also explore few new bugs and clarify few misconceptions about SCN itself.
What is SCN?
SCN (System Change Number) is a primary mechanism to maintain data consistency in Oracle database. SCN is used primarily in the following areas, of course, this is not a complete list:
You might encounter RAC wait event ‘gc cr disk read’ in 11.2 while tuning your applications in RAC environment. Let’s probe this wait event to understand why a session would wait for this wait event.
Understanding the wait event
Let’s say that a foreground process running in node 1, is trying to access a block using a SELECT statement and that block is not in the local cache. To maintain the read consistency, foreground process will require the block consistent with the query SCN. Then the sequence of operation is(simplified):
On February 14-16, I’ll be at the Colorado Convention Center in Denver, Colorado for RMOUG’s Training Days Conference. This is the largest regional Oracle User Conference in North America and attracts presenters from all around the country and the globe. I’ll be presenting:
Presentation Name: Troubleshooting RAC Background Process
Abstract: RAC background process performance is critical to keep the application performance. This session will demo techniques to review the performance of RAC background processes such as LMS, LMD, LMON, etc. using various statistics and UNIX tools. The presentation will also discuss why certain background processes must run in higher priority to maintain the application performance in RAC.
Presentation Name: A Kind and Gentle Introduction to RAC
Waits for ‘DFS lock handle’ can cause massive performance issues in a busy RAC cluster. In this blog entry, we will explore the DFS lock handle wait event, and understand how to troubleshoot the root cause of these waits. I am also going to use locks and resources interchangeably in this blog, but internally, they are two different types of structures.
A little background
DFS (stands for Distributed File System) is an ancient name, associated with cluster file system operations, in a Lock manager supplied by vendors in Oracle Parallel Server Environment (prior name for RAC). But, this wait event has morphed and is now associated with waits irrelevant to database files also. Hence, it is imperative to understand the underlying details to debug the ‘DFS lock handle’ waits.
How does it work?