You might encounter RAC wait event ‘gc cr disk read’ in 11.2 while tuning your applications in RAC environment. Let’s probe this wait event to understand why a session would wait for this wait event.
Understanding the wait event
Let’s say that a foreground process running in node 1, is trying to access a block using a SELECT statement and that block is not in the local cache. To maintain the read consistency, foreground process will require the block consistent with the query SCN. Then the sequence of operation is(simplified):
LMS is a light weight process
Global cache operations must complete quickly, in the order of milli-seconds, to maintain the overall performance of RAC database. LMS is a critical process and does not do heavy lifting tasks such as disk I/O etc. If LMS process has to initiate I/O, instead of initiating I/O, LMS will downgrade the block mode and send the block to the requesting foreground process (this is known as Light Works rule). Foreground process will apply undo records to the block to construct CR version of the block.
Now, the Foreground process might not find the undo blocks in the local cache as the transactions happened in the remote cache. A request is sent to remote LMS process to access undo block. If the undo block is not in the remote cache either, remote LMS process will send a grant to the foreground process to read the undo block from the disk. Foreground process accounts this wait time for the undo segment block grants to the ‘gc cr disk read’ wait event.
There are other reasons as to why FG process might have to read undo block. One of them is that Fairness downconvert triggered by LMS process. Essentially, if a block is requested too many times leading to many CR block fabrication, then instead of LMS doing more work, LMS process will simply down convert the block, send the block and grant to the requester an access to the block. FG process will apply undo to construct CR block itself.
gv$cr_block_server can be used to review the number of down converts, Light works etc. But, it is probably not possible to identify the reason for a block down convert after the event.
Why do we need this new event?
There is a very good reason why this event was introduced. Prior to 11g, waits for single block CR grants are accounted to wait event such as ‘gc cr block 2-way’, ‘gc cr block 3-way’ etc. Waits for grants on remote-instance-undo-blocks for CR fabrication is special, in the sense that, this is an additional unnecessary work from the application point of view. We need to be able to differentiate the amount of time spent waiting for undo block grants for CR fabrication vs other types of grants (such as data blocks etc). So, it looks like, Oracle has introduced this new event and I do think that this will be very useful for debugging performance issues.
Prior to 11g, you could still differentiate waits for single block grants for undo using ASH data or trace files. But, you will have to use the obj# field for this differentiation and obj# is set to 0 or -1 in the case of undo blocks/undo header blocks.
Test case
Of course, a test case would be nice, Just any regular table will do and my table structure have just two columns number, varchar2(255) with 1000 rows or so.
create table rs.t_1000 (n1 number, v1 varchar2(255));
insert into rs.t_1000 select n1, lpad(n1, 255, 'DEADBEAF') from (select level n1 from dual connect by level <=1000;
commit;
At step 3 in our test case, I flushed the buffer cache in node 2. When I reread the block again in node 1, here is an approximate sequence of operations that occurred:
(I formatted trace lines to improve readability and my comments are inline)
.
PARSING IN CURSOR #18446741324873694136 len=162 dep=0 uid=0 oct=3 lid=0 tim=1103269602 hv=361365006 ad='73fba918' sqlid='7cmy6msasmzhf'
select dbms_rowid.rowid_relative_fno (rowid) fno,
dbms_rowid.rowid_block_number(rowid) block,
dbms_rowid.rowid_object(rowid) obj, v1 from rs.t_1000 where n1=100
END OF STMT
c=0,e=17012,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=479790187,tim=1103269601
c=0,e=12393,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=479790187,tim=1103282129
nam='SQL*Net message to client' ela= 4 driver id=1650815232 #bytes=1 p3=0 obj#=603 tim=1103288000
nam='Disk file operations I/O' ela= 7 FileOperation=2 fileno=4 filetype=2 obj#=603 tim=1103288120
nam='db file sequential read' ela= 786 file#=4 block#=2891 blocks=1 obj#=85844 tim=1103289701
nam='db file sequential read' ela= 560 file#=4 block#=2892 blocks=1 obj#=85844 tim=1103290450
nam='Disk file operations I/O' ela= 4 FileOperation=2 fileno=7 filetype=2 obj#=85844 tim=1103290546
nam='db file sequential read' ela= 542 file#=7 block#=2758 blocks=1 obj#=75154 tim=1103291149
-- RS: Following is for undo header block to find the transaction.
nam='gc cr disk read' ela= 633 p1=6 p2=176 p3=43 obj#=0 tim=1103292048
nam='db file sequential read' ela= 662 file#=6 block#=176 blocks=1 obj#=0 tim=1103292843
-- RS: Following read is for undo block itself to rollback the transaction changes.
nam='gc cr disk read' ela= 483 p1=6 p2=955 p3=44 obj#=0 tim=1103293699
nam='db file sequential read' ela= 569 file#=6 block#=955 blocks=1 obj#=0 tim=1103294355
nam='library cache pin' ela= 1045 handle address=2043988208 pin address=1969440144 100*mode+namespace=48820893384706 obj#=0 tim=1103295827
FETCH :c=0,e=8033,p=5,cr=6,cu=0,mis=0,r=1,dep=0,og=1,plh=479790187,tim=1103296065
WAIT : nam='SQL*Net message from client' ela= 323 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1103296522
FETCH :c=0,e=24,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=1,plh=479790187,tim=1103296586
STAT id=1 cnt=1 pid=0 pos=1 obj=75154 op='TABLE ACCESS BY INDEX ROWID T_1000 (cr=7 pr=5 pw=0 time=6505 us cost=2 size=261 card=1)'
STAT id=2 cnt=1 pid=1 pos=1 obj=85844 op='INDEX RANGE SCAN T_1000_N1 (cr=3 pr=2 pw=0 time=2494 us cost=1 size=0 card=1)'
WAIT : nam='SQL*Net message to client' ela= 1 driver id=1650815232 #bytes=1 p3=0 obj#=0 tim=1103296728
Let’s review few lines from the trace file. Block 2758 holds the row physically. After reading that block, a ‘gc cr disk read’ wait event is encountered. Essentially, the FG identified that there is a pending transaction, so sent a request to LMS accounting the wait time to ‘gc cr disk read’ event. For this ‘gc cr disk read’ event, p1 is file_id, p2 is block_id, p3 seems to be a counter increasing by 1 for each of these waits. For example, for the next gc cr disk read, p3 is set to 44. obj#=0 indicates that this is an undo block or undo header block.
nam='db file sequential read' ela= 542 file#=7 block#=2758 blocks=1 obj#=75154 tim=1103291149
nam='gc cr disk read' ela= 633 p1=6 p2=176 p3=43 obj#=0 tim=1103292048
nam='db file sequential read' ela= 662 file#=6 block#=176 blocks=1 obj#=0 tim=1103292843
Commit cleanouts is another reason why you would encounter this wait event. When a session commits, that session will revisit the modified blocks to clean out the ITL entries in the modified blocks. But, this cleanout does not happen in all scenarios. For example, if the number of block changes exceeds list of blocks that session maintains in SGA then the session will mark the transaction table as committed, without cleaning out ITL entries in the actual blocks. Commit cleanout doesn’t happen if the modified block is not in buffer cache anymore (We will use this idea to trigger commit cleanout in the other node).
Next session reading that block will close out the ITL entry in the block with an upper bound SCN. Let’s tweak our test case little bit to simulate commit cleanouts.
One notable difference in the test case below is that after flushing the buffer cache, we switch back to the session updating the block,commit the transaction and flush the buffer cache again. With these two flush we guarantee that session will not commit cleanout and undo block is flushed from the cache. SELECT statement is executed after the commit. When we read the same block from a different node then the reader session will cleanout the ITL entry. Cleanout operation require to identify the transaction state and max query SCN. To identify whether the transaction is committed or not, session in node2 must read the undo header block. Waits to receive the grants for the undo|undo header block are accounted towards gc cr disk read wait event.
Trace file
My comments are inline
...
-- block with the row is read below
WAIT : nam='db file sequential read' ela= 417 file#=7 block#=2758 blocks=1 obj#=75154 tim=783492616
WAIT : nam='Disk file operations I/O' ela= 2 FileOperation=2 fileno=6 filetype=2 obj#=75154 tim=783492751
-- Undo header block is read with time accounted towards gc cr disk read. Note that, there are no other reads after this.
-- In the prior test case that we saw earlier, there was an additional read to read undo block;
-- In this test case, no additional read for undo block as only commits need to be cleaned out.
WAIT : nam='gc cr disk read' ela= 12779 p1=6 p2=160 p3=41 obj#=0 tim=783507346
WAIT : nam='db file sequential read' ela= 778 file#=6 block#=160 blocks=1 obj#=0 tim=783508447
So, what can we do?
Summary
In summary, this is an useful event to differentiate CR fabrication performance issues and other performance issues. Using few techniques mentioned here, you can reduce the impact of this event.
Full name
Riyaj Shamsudeen
My company
http://www.orainternals.com
Recent comments
17 weeks 2 days ago
27 weeks 18 hours ago
28 weeks 5 days ago
32 weeks 4 hours ago
34 weeks 2 days ago
43 weeks 5 days ago
45 weeks 2 days ago
46 weeks 2 days ago
46 weeks 3 days ago
49 weeks 1 day ago