That’s Global Temporary Table, of course.
A request appeared on the OTN database forum recently asking if it was possible to allocate a global temporary table to a particular temporary tablespace because the programmer didn’t want typical users to be at risk of running out of temporary space when a particular global temporary table suddenly got loaded with millions of rows.
If you’re running 11.1 or above, go straight to the footnote, otherwise read on.
At first sight the answer seems to be no – but an important point in the question was the comment: “I have a procedure which …”. This may make it possible to isolate this table in the required way.
When someone inserts data into a global temporary table the tablespace used to instantiate their version of the table is the temporary tablespace of the schema that is inserting the data. But remember, if you call a procedure to insert the data it’s (usually) the schema that owns the procedure that is doing the insert. (I say “usually” because you could, of course, have a procedure defined as “authid current_user”)
Consider the following (running on 11.1.0.6):
select
username, temporary_tablespace
from
dba_users
where
username in ('U1','TEST_USER')
;
USERNAME TEMPORARY_TABLESPACE
--------------- --------------------
U1 TEMP
TEST_USER TEMP2
I have two users with different temporary tablespaces. In schema test_user we do the following:
create global temporary table t as select * from all_objects where rownum = 0; grant all on t to u1; create or replace procedure pt as begin insert into t select * from all_objects where rownum <= 1000; end; / grant execute on pt to u1;
Now we connect as user u1 and run the following:
SQL> insert into test_user.t select * from all_objects where rownum < = 10; 10 rows created. SQL> commit; Commit complete. SQL> execute test_user.pt PL/SQL procedure successfully completed. SQL> commit; Commit complete.
The first commit it very important, by the way. In another session logged in through a suitably privileged account, execute the following before each commit:
select username, tablespace from v$sort_usage;
The results after the direct call to insert into the GTT will be:
USERNAME TABLESPACE -------- ---------------- U1 TEMP
The results from the call to the procedure will be:
USERNAME TABLESPACE -------- ---------------- U1 TEMP2
The direct insert has used the temporary tablespace of schema U1, but the insert from the procedure call has used the temporary tablespace of schema test_user. So there are ways to allocate a global temporary table to a particularly temporary tablespace if you always use a procedure to populate it. I did point out, though, that the first commit in the demonstration was important. If you don’t do this commit the global temporary table will have been instantiated in the “wrong” temporary tablespace at the point you call the procedure, so the data from the procedure call would be added to the existing data – i.e. where you didn’t want it to go.
Footnote: Everything changes – you’ll see in a response to the thread from Uwe Hesse that from 11g onwards you can specify a tablespace for a global temporary tablespace, and that’s where the data will go. (But I’d drafted this note before I got back to the thread – and it’s still valid for earlier versions of Oracle.)
Footnote 2: I had to create a spare temporary tablespace for this demo. It’s funny little detail that although you: “create TEMPORARY tablespace … TEMPfile…”, when it comes to dropping it’s: “drop tablespace … including … datafiles”.
Here’s an execution plan pulled from v$sql_plan in 10.2.0.3 with the “gather_plan_statistics” option enabled:
---------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------- | 1 | NESTED LOOPS | | 1 | 0 |00:00:00.01 | 608 | | 2 | TABLE ACCESS BY INDEX ROWID| PARENT | 1 | 200 |00:00:00.01 | 6 | | 3 | INDEX RANGE SCAN | PAR_I1 | 1 | 200 |00:00:00.01 | 2 | | 4 | TABLE ACCESS BY INDEX ROWID| CHILD | 200 | 0 |00:00:00.01 | 602 | | 5 | INDEX UNIQUE SCAN | CHI_I1 | 200 | 200 |00:00:00.01 | 402 | ----------------------------------------------------------------------------------------
The code defining the two tables is the same:
create table &m_table_name
as
select
rownum id,
trunc(dbms_random.value(1,&m_rowcount)) n1,
lpad(rownum,10,'0') small_vc,
rpad('x',100) padding
from
big_row_source
where
rownum <= &m_rowcount
;
The indexes par_i1 and chi_i1 are on the id column and, as you can see for chi_i1, have been created as unique indexes. Unlike the table, though, the code defining them isn’t necessarily identical.
Please state, with justification, what you think the blevel is for index chi_i1 ?
(Note – to avoid any risk of confusion, remember that height = blevel + 1; and just in case anyone thinks it makes any difference, the tables and indexes were all created in the same tablespace which is my usual setup of 1MB uniform extents, locally managed, 8K block size and freelist management.)
The quiz isn’t intended to be cunning, devious and tricksy, by the way. It’s all about observing and understanding a fairly ordinary situation. On the other hand I’m sure there are interesting variations with strange side effects that could be introduced if you really want to get clever.
Answer 23rd Jan:
Rather than writing my answer into a new post I’ve added it below – but to give new visitors a chance to work out the answer before seeing mine I’ve left a big gap before the answer.
Reading the comments I think all the interesting points have been covered. We have comments covering special treatment of root blocks in nested loop joins, the “buffer is pinned count” statistic, possible changes in 11g, the hypothesis that I had set a large pctfree for the child index,and the link between “A-rows” in one line of the plan and “Starts” in another line.
For reference, my query looked like this:
select par.small_vc, chi.small_vc from parent par, child chi where par.id between 301 and 500 and chi.id = par.n1 and chi.n1 = 0 ;
The mechanics of the plan are:
Line 3 – index range scan of par_i1 (which finds ultimately 200 matching rowids).
Line 2 – for each rowid from line 3 check the matching table row (which ultimately returns 200 rows)
Then two hundred calls to:
Line 5 – index unique scan of chi_i1, which always finds it single row – hence returning 200 rowids in total
Line 4- check the matching row in the table – which always fails – hence returning zero rows in total
The critical counts that allow you to answer my question are the 402 and 602 buffer gets in lines 5 and 4. Ignoring, temporarily, the odd 2 these numbers are clearly 200 times something – which is rather nice given that we know that we are doing something 200 times. The simplest solution, of course is that the (602 – 402 =) 200 gets due to line 4 represents the visits to the 200 table blocks, leaving us with 400 (plus an odd bit) to account for.
Now consider the possibilities:
So, at first sight we might decide the only way to get 400 buffer visits from the index is to have Blevel = 1, visit the root and the leaf. But that’s not the way it works (apart from a couple of versions which had a bug).
When running a nested loop join, Oracle “pins” the root block of the inner (second table) index, which means it attaches a structure to the “users’ list” in the buffer header that links the session’s state object directly to the buffer header. So, in my nested loop join, Oracle gets the root block once and keeps it pinnned, then gets branch and leaf blocks 200 times. The blevel on the child index is 2. (The blevel of the parent index is only 1 – I had set the pctfree on the child index to 90 to make it much larger than it needed to be.)
This “root block pin” isn’t the only pinning optimisation in the query, though. In detail, the steps are as follows – and this is something you would be table to see in the trace file if you enabled event 10200 (one of the ‘consistent reads’ traces).
There a couple of deviations from this cycle, of course. The rows I needed from parent were spread across 4 consecutive blocks in the table so, roughly every 50 rows from parent, line (9) above would become “release current pinned parent block, get new parent block and pin it”. The other little oddity that I can’t explain is that Oracle does “get” the child root block on the second visit to the index as well as the first visit – and then pins it from that moment onwards. So the counts are:
Inevitably, things change – Oracle keeps getting smarter about things like ‘consistent gets – examination’, ‘buffer is pinned count’ and, in 11g, “fastpath” access to buffered blocks. This is a clue to the difference in gets that Charles Hooper recorded in 11.2.0.2 – and explains why I chose to use 10.2.0.3 with a unique index for my example. If you want to investigate other variations all it takes is snapshots of v$mystat, calls to “alter system flush buffer_cache”, and event 10200; the treedump can also be very helpful for identifying block addresses.
While checking my backlog of drafts (currently 75 articles in note form) I came across this one from August 2009 and was a little upset that I hadn’t finished it sooner – it’s a nice example of geek stuff that has the benefit of being useful.
From the Oracle newsgroup comp.databases.oracle.server, here’s an example of how to recreate a performance problem due to maintenance on ASSM bitmaps in 10.2.0.4.
Create a table in a tablespace with an 8KB block size, locally managed tablespace with uniform 1MB extents, and automatic segment space management (ASSM). Check the newsgroup thread if you want complete details on reproducing the test:
Session 1: Insert 100,000 rows of about 490 bytes into a table using a pl/sql loop and commit at end.
Session 1: Insert 1,000 rows into the table with single SQL inserts and no commits
Session 1: delete all data from the table with a single statement – but do not commit
Session 2: Insert 1,000 rows into the table with single SQL inserts and no commits – it’s very slow.
As one person on the thread pointed out – it looks as if Oracle is doing a full tablescan of the table, one block at a time showing “db file sequential read” waits all the way through the table. (If your db_cache_size is large enough you might not see this symptom).
I simplified the test – inserting just 100,000 of the rows (with the commit), then deleting them all (without the commit), then inserting one row from another session. Taking a snapshot of x$kcbsw and x$kcbwh, I got the following figures for the activity that took place inserting that one extra row (this was on Oracle 10.2.0.3):
---------------------------------
Buffer Cache - 23-Dec 11:50:36
Interval:- 0 seconds
---------------------------------
Why0 Why1 Why2 Other Wait
---- ---- ---- ----------
1,457 0 0 0 ktspfwh10: ktspscan_bmb
8 0 0 0 ktspswh12: ktspffc
1 0 0 0 ktsphwh39: ktspisc
7,061 0 0 0 ktspbwh1: ktspfsrch
1 0 0 0 ktuwh01: ktugus
7,060 0 0 0 ktuwh05: ktugct
1 0 0 0 ktuwh09: ktugfb
2 0 0 0 kdswh02: kdsgrp
2 0 0 0 kdiwh06: kdifbk
2 0 0 0 kdiwh07: kdifbk
---- ---- ---- ----------
15,595 0 0 0 Total: 10 rows
The figures tell us how much work Oracle has to do to find a table block that could accept a new row. The idea is simple – Oracle checks the first “level 3″ bitmap block (which is actually part of the segment header block) to find a pointer to the current “level 2″ bitmap block; it checks this level 2 bitmap block to find a pointer to the current “level 1″ bitmap block; and finally it checks the “level 1″ bitmap block to find a pointer to a data block that shows some free space.
Unfortunately every block in our table is apparently empty – but that’s only going to be true once session 1 commits. In this version of Oracle the blocks are all visible as “x% free” in the level 1 bitmaps – but when Oracle visits each block (“ktspbwh1: ktspfsrch”) it checks the ITL entry, which points it to the transaction table slot in the related undo segment header block to Get the Commit Time for the transaction (“ktuwh05: ktugct”) and finds that the transaction is not committed so the space is not really free. So Oracle has to visit the next block shown as free in the bitmap.
In our “bulk delete / no commit” case, we end up visitng every (or nearly every) block in the entire table before we find a block we can actually use – and, given the nature of the ASSM bitmap implementation, the order of the block visits is the “natural” table order, so we see something that looks like a full tablescan operating through single blocks reads (apart, perhaps, from a few blocks that are still cached).
I can’t explain why we do 1,457 visits to bitmap blocks (“ktspfwh10: ktspscan_bmb”) in this version of Oracle, but perhaps it’s simply an indication that Oracle picks (say) five “free block” entries from the bitmap block each time it visits it and therefore has to visit each bitmap block about 12 times if it doesn’t find a data block with space that really is free in it search.
Note – these results will be hugely dependent on version of Oracle – in an earlier version of Oracle the bitmap blocks were not updated by the delete until some time after the commit – and this variation of delayed block cleanout produced other unpleasant anomalies; and, just to make like difficult in later versions of Oracle, the x$kcbsw / x$kcbwh objects are not populated properly in 11g.
Footnote: In case you hadn’t realised, ASSM is a mechanism aimed at OLTP systems with a reasonable degree of concurrency – so it’s not too surprising that you can find problems and weak spots if you hit it with processing which is biased towards the DW and batch processing end of the spectrum.
I’ve lost count of the number of times I’ve said something like: “Ideally you really need to do your testing on the production system”. Kevin Closson has recently written some interesting notes (with more to come) about clonedb – an 11.2 feature that may make it possible to get pretty close to this ideal.
Footnote: for purposes of removing any confusion, please note that actually testing on your production system is not a good idea
A couple of days ago I published a link to some comments I had made on OTN about differences in redo generation between 10g and earlier versions of Oracle. This raised a few questions that suggested a follow-up (or perhaps “prequel”) note might be a little helpful. So I’ve created a simple SQL script to help demonstrate the differences and show how some of the pieces hang together. The whole posting is a little lengthy, but a large fraction of it is just the continuous output from a couple of trace files from which I’ll be highlighting just a few lines.
drop table t1; create table t1 ( n1 number, n2 number, n3 number, n4 number ); create index i1 on t1(n1); create index i2 on t1(n2); insert into t1 values (1,1,1,1); commit; alter system switch logfile; insert into t1 values (2,2,2,2); insert into t1 values (3,3,3,3); commit; execute dump_log
All I’ve done is create a simple table with two indexes. I insert a row and commit because the first insert of a row into a table results in all sorts of funny bits of recursive activity (mostly related space management) and I want to get that out of the way so that it doesn’t obscure the more typical activity. Then I “switch logfile”, do a little work, and use a little procedure I’ve written to dump the current log file. The log file switch is just a convenient way to keep the dump small – but there are strategies for being selective about the parts of the redo log file you dump.
The work I’ve done is just two inserts into the table – this means I expect to find something like the following bits of activity listed in the redo log file:
What I’m going to show you next is the redo log file dump from Oracle 9.2.0.8 and then the corresponding redo log file dump from 10.2.0.3.
In the 9.2.0.8 dump we’ll see the “traditional” sequence of redo generation, which follows very closely to the steps listed above and also shows that Oracle pairs a “redo change vector” for a table or index block with the matching “redo change vector” for an undo record that describes how to reverse the table (or index) change – an undo record, of course, is just a piece of information that gets stored in an undo block. In general, each pair of redo change vectors will be gathered into a single “redo record”.
In the 10.2.0.3 dump we’ll see that the same redo change vectors still exist but they’ve all been lumped into a single redo record, and their ordering in that record is completely different from the 9.2.0.8 ordering, demonstrating (or at least supporting) the point that the “private redo/in-memory undo” mechanism uses two separate buffers, one for redo change vectors related to table/index blocks (the private redo bit) and the other for redo change vectors related to undo records (the in-memory undo bit).
The key expressions to watch out for in the dump are: “REDO RECORD” and “CHANGE #” (I’ve inserted a few blank lines into the trace file to give a little white space between records and vectors):
REDO RECORD - Thread:1 RBA: 0x000360.00000002.0010 LEN: 0x0170 VLD: 0x01 SCN: 0x0000.03ec99f5 SUBSCN: 1 12/30/2010 14:36:06 CHANGE #1 TYP:0 CLS:23 AFN:2 DBA:0x00800039 SCN:0x0000.03ec99bf SEQ: 1 OP:5.2 ktudh redo: slt: 0x0016 sqn: 0x00001a69 flg: 0x0012 siz: 80 fbi: 0 uba: 0x00800040.075d.26 pxid: 0x0000.000.00000000 CHANGE #2 TYP:0 CLS:24 AFN:2 DBA:0x00800040 SCN:0x0000.03ec99be SEQ: 1 OP:5.1 ktudb redo: siz: 80 spc: 2708 flg: 0x0012 seq: 0x075d rec: 0x26 xid: 0x0004.016.00001a69 ktubl redo: slt: 22 rci: 0 opc: 11.1 objn: 45648 objd: 45648 tsn: 12 Undo type: Regular undo Begin trans Last buffer split: No Temp Object: No Tablespace Undo: No 0x00000000 prev ctl uba: 0x00800040.075d.25 prev ctl max cmt scn: 0x0000.03ebf084 prev tx cmt scn: 0x0000.03ebf106 KDO undo record: KTB Redo op: 0x03 ver: 0x01 op: Z KDO Op code: DRP row dependencies Disabled xtype: XA bdba: 0x02c0000a hdba: 0x02c00009 itli: 2 ispac: 0 maxfr: 4863 tabn: 0 slot: 1(0x1) CHANGE #3 TYP:2 CLS: 1 AFN:11 DBA:0x02c0000a SCN:0x0000.03ec99f2 SEQ: 1 OP:11.2 KTB Redo op: 0x01 ver: 0x01 op: F xid: 0x0004.016.00001a69 uba: 0x00800040.075d.26 KDO Op code: IRP row dependencies Disabled xtype: XA bdba: 0x02c0000a hdba: 0x02c00009 itli: 2 ispac: 0 maxfr: 4863 tabn: 0 slot: 1(0x1) size/delt: 15 fb: --H-FL-- lb: 0x2 cc: 4 null: ---- col 0: [ 2] c1 03 col 1: [ 2] c1 03 col 2: [ 2] c1 03 col 3: [ 2] c1 03 CHANGE #4 MEDIA RECOVERY MARKER SCN:0x0000.00000000 SEQ: 0 OP:5.20 session number = 9 serial number = 6 transaction name = -------------- REDO RECORD - Thread:1 RBA: 0x000360.00000002.0180 LEN: 0x0108 VLD: 0x01 SCN: 0x0000.03ec99f5 SUBSCN: 1 12/30/2010 14:36:06 CHANGE #1 TYP:0 CLS:24 AFN:2 DBA:0x00800040 SCN:0x0000.03ec99f5 SEQ: 1 OP:5.1 ktudb redo: siz: 96 spc: 2626 flg: 0x0022 seq: 0x075d rec: 0x27 xid: 0x0004.016.00001a69 ktubu redo: slt: 22 rci: 38 opc: 10.22 objn: 45649 objd: 45649 tsn: 12 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 index undo for leaf key operations KTB Redo op: 0x04 ver: 0x01 op: L itl: xid: 0x000a.012.00002552 uba: 0x0080009c.09cb.0c flg: C--- lkc: 0 scn: 0x0000.03ec99f2 Dump kdilk : itl=2, kdxlkflg=0x1 sdc=96880916 indexid=0x2c00409 block=0x02c0040a purge leaf row key: (10): 02 c1 03 06 02 c0 00 0a 00 01 CHANGE #2 TYP:2 CLS: 1 AFN:11 DBA:0x02c0040a SCN:0x0000.03ec99f2 SEQ: 1 OP:10.2 index redo (kdxlin): insert leaf row KTB Redo op: 0x11 ver: 0x01 op: F xid: 0x0004.016.00001a69 uba: 0x00800040.075d.27 Block cleanout record, scn: 0x0000.03ec99f5 ver: 0x01 opt: 0x02, entries follow... itli: 2 flg: 2 scn: 0x0000.03ec99f2 REDO: SINGLE / -- / -- itl: 2, sno: 1, row size 14 insert key: (10): 02 c1 03 06 02 c0 00 0a 00 01 -------------- REDO RECORD - Thread:1 RBA: 0x000360.00000003.0098 LEN: 0x0108 VLD: 0x01 SCN: 0x0000.03ec99f5 SUBSCN: 1 12/30/2010 14:36:06 CHANGE #1 TYP:0 CLS:24 AFN:2 DBA:0x00800040 SCN:0x0000.03ec99f5 SEQ: 2 OP:5.1 ktudb redo: siz: 96 spc: 2528 flg: 0x0022 seq: 0x075d rec: 0x28 xid: 0x0004.016.00001a69 ktubu redo: slt: 22 rci: 39 opc: 10.22 objn: 45650 objd: 45650 tsn: 12 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 index undo for leaf key operations KTB Redo op: 0x04 ver: 0x01 op: L itl: xid: 0x000a.012.00002552 uba: 0x0080009c.09cb.0d flg: C--- lkc: 0 scn: 0x0000.03ec99f2 Dump kdilk : itl=2, kdxlkflg=0x1 sdc=96880916 indexid=0x2c00489 block=0x02c0048a purge leaf row key: (10): 02 c1 03 06 02 c0 00 0a 00 01 CHANGE #2 TYP:2 CLS: 1 AFN:11 DBA:0x02c0048a SCN:0x0000.03ec99f2 SEQ: 1 OP:10.2 index redo (kdxlin): insert leaf row KTB Redo op: 0x11 ver: 0x01 op: F xid: 0x0004.016.00001a69 uba: 0x00800040.075d.28 Block cleanout record, scn: 0x0000.03ec99f5 ver: 0x01 opt: 0x02, entries follow... itli: 2 flg: 2 scn: 0x0000.03ec99f2 REDO: SINGLE / -- / -- itl: 2, sno: 1, row size 14 insert key: (10): 02 c1 03 06 02 c0 00 0a 00 01 -------------- REDO RECORD - Thread:1 RBA: 0x000360.00000003.01a0 LEN: 0x00f4 VLD: 0x01 SCN: 0x0000.03ec99f5 SUBSCN: 1 12/30/2010 14:36:06 CHANGE #1 TYP:0 CLS:24 AFN:2 DBA:0x00800040 SCN:0x0000.03ec99f5 SEQ: 3 OP:5.1 ktudb redo: siz: 64 spc: 2430 flg: 0x0022 seq: 0x075d rec: 0x29 xid: 0x0004.016.00001a69 ktubu redo: slt: 22 rci: 40 opc: 11.1 objn: 45648 objd: 45648 tsn: 12 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 KDO undo record: KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x00800040.075d.26 KDO Op code: DRP row dependencies Disabled xtype: XA bdba: 0x02c0000a hdba: 0x02c00009 itli: 2 ispac: 0 maxfr: 4863 tabn: 0 slot: 2(0x2) CHANGE #2 TYP:0 CLS: 1 AFN:11 DBA:0x02c0000a SCN:0x0000.03ec99f5 SEQ: 1 OP:11.2 KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x00800040.075d.29 KDO Op code: IRP row dependencies Disabled xtype: XA bdba: 0x02c0000a hdba: 0x02c00009 itli: 2 ispac: 0 maxfr: 4863 tabn: 0 slot: 2(0x2) size/delt: 15 fb: --H-FL-- lb: 0x2 cc: 4 null: ---- col 0: [ 2] c1 04 col 1: [ 2] c1 04 col 2: [ 2] c1 04 col 3: [ 2] c1 04 -------------- REDO RECORD - Thread:1 RBA: 0x000360.00000004.00a4 LEN: 0x00c8 VLD: 0x01 SCN: 0x0000.03ec99f5 SUBSCN: 1 12/30/2010 14:36:06 CHANGE #1 TYP:0 CLS:24 AFN:2 DBA:0x00800040 SCN:0x0000.03ec99f5 SEQ: 4 OP:5.1 ktudb redo: siz: 80 spc: 2364 flg: 0x0022 seq: 0x075d rec: 0x2a xid: 0x0004.016.00001a69 ktubu redo: slt: 22 rci: 41 opc: 10.22 objn: 45649 objd: 45649 tsn: 12 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 index undo for leaf key operations KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x00800040.075d.27 Dump kdilk : itl=2, kdxlkflg=0x1 sdc=96880916 indexid=0x2c00409 block=0x02c0040a purge leaf row key: (10): 02 c1 04 06 02 c0 00 0a 00 02 CHANGE #2 TYP:0 CLS: 1 AFN:11 DBA:0x02c0040a SCN:0x0000.03ec99f5 SEQ: 1 OP:10.2 index redo (kdxlin): insert leaf row KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x00800040.075d.2a REDO: SINGLE / -- / -- itl: 2, sno: 2, row size 14 insert key: (10): 02 c1 04 06 02 c0 00 0a 00 02 -------------- REDO RECORD - Thread:1 RBA: 0x000360.00000004.016c LEN: 0x00c8 VLD: 0x01 SCN: 0x0000.03ec99f5 SUBSCN: 1 12/30/2010 14:36:06 CHANGE #1 TYP:0 CLS:24 AFN:2 DBA:0x00800040 SCN:0x0000.03ec99f5 SEQ: 5 OP:5.1 ktudb redo: siz: 80 spc: 2282 flg: 0x0022 seq: 0x075d rec: 0x2b xid: 0x0004.016.00001a69 ktubu redo: slt: 22 rci: 42 opc: 10.22 objn: 45650 objd: 45650 tsn: 12 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 index undo for leaf key operations KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x00800040.075d.28 Dump kdilk : itl=2, kdxlkflg=0x1 sdc=96880916 indexid=0x2c00489 block=0x02c0048a purge leaf row key: (10): 02 c1 04 06 02 c0 00 0a 00 02 CHANGE #2 TYP:0 CLS: 1 AFN:11 DBA:0x02c0048a SCN:0x0000.03ec99f5 SEQ: 1 OP:10.2 index redo (kdxlin): insert leaf row KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x00800040.075d.2b REDO: SINGLE / -- / -- itl: 2, sno: 2, row size 14 insert key: (10): 02 c1 04 06 02 c0 00 0a 00 02 -------------- REDO RECORD - Thread:1 RBA: 0x000360.00000005.0044 LEN: 0x0070 VLD: 0x01 SCN: 0x0000.03ec99f6 SUBSCN: 1 12/30/2010 14:36:06 CHANGE #1 TYP:0 CLS:23 AFN:2 DBA:0x00800039 SCN:0x0000.03ec99f5 SEQ: 1 OP:5.4 ktucm redo: slt: 0x0016 sqn: 0x00001a69 srt: 0 sta: 9 flg: 0x12 ktucf redo: uba: 0x00800040.075d.2b ext: 0 spc: 2200 fbi: 0
The dump from 9.2.0.8 consists of seven redo records – and most of those records consist of two redo change vectors. We’ll start by looking at the fourth redo record (starting at line 112) which is a typical “old-style” redo record. It consists of two redo change vectors.
In the chosen record, change #1 (line 115) describes how to change an undo block at address (DBA) 0×00800040 which is operating at sequence number (seq: ) 0x075d. The change is to insert a new undo record, with record number (rec: ) 0×29. If you look at lines 126 – 129 you see that this undo record is an instruction to Delete a Row Piece (DRP) from data block (bdba) 0x02c0000a; the row piece is identified as the pointed to by slot 2 of table 0 in the block’s row directory.
Change # 2 (line 131) describes the change needed to create that row piece (note the IRP – Insert Row Piece at line 135). Again we see references to the block, table number and slot number.
The next two redo records are very similar: they describe how to insert and delete a key entry for each of the two indexes that have to be maintained as the row is inserted into the table – again the redo change vector for the “undo” appears before the redo change vector for the “apply”.
The first three redo records form a similar group – the first one is about inserting/removing a table row, the second and third are about the insert/remove pairs for the two indexes – but the first record of the three is more complex, consisting of four redo change vectors rather than just two.
The first redo record happens to be the first record of the transaction – so it contains (for 9i and below) two extra redo change vectors. The standard pair for the table row appear as change #2 and change #3, but in change #1 we see the redo change vector describing a change to the undo segment header block (line 5: ktudh) as we acquire a “transaction table slot” (see my glossary, and “cleanout”) specifically slot 0×0016, which we stamp with sequence number 0x00001a69. The final change vector in this record (change #4 at line 41) is marked as a “Media Recovery Marker” – it’s actually a type of audit label – and the example here is the short form; there is a much longer version used for the first transaction of a session’s lifetime, which is also used after each log file switch that takes place in a session’s lifetime (and in 8i there was a bug that meant the full vector appeared for every transaction start – with could result in a nasty volume of excess redo).
The last redo record is the so-called “commit record”. It consists of just one change vector (line 209). This change vector describes the update to the undo segment header as we commit (ktucm) the transaction; the status (sta: ) of transaction table slot 0×0016 is set to 9 to show that the slot is now free, and the “free pool” list (ktucf) is updated to show that the last undo block we were using has some space available in it that could be used by the next transaction that acquires a slot in this undo segment.
So that’s the redo generated by 9i. The basic pattern is that a typical redo record consists of a pair of redo change vectors, the first describing an undo record and the second describing how to make the required data change.
An important scalability detail (or threat, rather) that goes with this implementation is that the session has to allocate memory in the redo log buffer and copy each redo record in turn so, for our table with two indexes, every row we insert requires us to acquire the redo allocation latch and redo copy latch three times.
Now we move on to see what shows up in the log file when we run our code on 10.2.0.3
REDO RECORD - Thread:1 RBA: 0x00034b.00000002.0010 LEN: 0x0638 VLD: 0x0d SCN: 0x0000.00d60a85 SUBSCN: 1 12/30/2010 14:24:01 CHANGE #1 TYP:2 CLS: 1 AFN:6 DBA:0x01804d8a OBJ:70577 SCN:0x0000.00d60a81 SEQ: 5 OP:11.2 KTB Redo op: 0x01 ver: 0x01 op: F xid: 0x0008.025.00003632 uba: 0x008031d3.07f1.27 KDO Op code: IRP row dependencies Disabled xtype: XA flags: 0x00000000 bdba: 0x01804d8a hdba: 0x01804d89 itli: 2 ispac: 0 maxfr: 4863 tabn: 0 slot: 1(0x1) size/delt: 15 fb: --H-FL-- lb: 0x2 cc: 4 null: ---- col 0: [ 2] c1 03 col 1: [ 2] c1 03 col 2: [ 2] c1 03 col 3: [ 2] c1 03 CHANGE #2 TYP:0 CLS:31 AFN:2 DBA:0x00800079 OBJ:4294967295 SCN:0x0000.00d60a5f SEQ: 1 OP:5.2 ktudh redo: slt: 0x0025 sqn: 0x00003632 flg: 0x0012 siz: 104 fbi: 0 uba: 0x008031d3.07f1.27 pxid: 0x0000.000.00000000 CHANGE #3 TYP:2 CLS: 1 AFN:6 DBA:0x01804e0a OBJ:70578 SCN:0x0000.00d60a81 SEQ: 2 OP:10.2 index redo (kdxlin): insert leaf row KTB Redo op: 0x11 ver: 0x01 op: F xid: 0x0008.025.00003632 uba: 0x008031d3.07f1.28 Block cleanout record, scn: 0x0000.00d60a84 ver: 0x01 opt: 0x02, entries follow... itli: 2 flg: 2 scn: 0x0000.00d60a81 REDO: SINGLE / -- / -- itl: 2, sno: 1, row size 14 insert key: (10): 02 c1 03 06 01 80 4d 8a 00 01 CHANGE #4 TYP:2 CLS: 1 AFN:6 DBA:0x01804e8a OBJ:70579 SCN:0x0000.00d60a81 SEQ: 2 OP:10.2 index redo (kdxlin): insert leaf row KTB Redo op: 0x11 ver: 0x01 op: F xid: 0x0008.025.00003632 uba: 0x008031d3.07f1.29 Block cleanout record, scn: 0x0000.00d60a84 ver: 0x01 opt: 0x02, entries follow... itli: 2 flg: 2 scn: 0x0000.00d60a81 REDO: SINGLE / -- / -- itl: 2, sno: 1, row size 14 insert key: (10): 02 c1 03 06 01 80 4d 8a 00 01 CHANGE #5 TYP:0 CLS: 1 AFN:6 DBA:0x01804d8a OBJ:70577 SCN:0x0000.00d60a85 SEQ: 1 OP:11.2 KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x008031d3.07f1.2a KDO Op code: IRP row dependencies Disabled xtype: XA flags: 0x00000000 bdba: 0x01804d8a hdba: 0x01804d89 itli: 2 ispac: 0 maxfr: 4863 tabn: 0 slot: 2(0x2) size/delt: 15 fb: --H-FL-- lb: 0x2 cc: 4 null: ---- col 0: [ 2] c1 04 col 1: [ 2] c1 04 col 2: [ 2] c1 04 col 3: [ 2] c1 04 CHANGE #6 TYP:0 CLS: 1 AFN:6 DBA:0x01804e0a OBJ:70578 SCN:0x0000.00d60a85 SEQ: 1 OP:10.2 index redo (kdxlin): insert leaf row KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x008031d3.07f1.2b REDO: SINGLE / -- / -- itl: 2, sno: 2, row size 14 insert key: (10): 02 c1 04 06 01 80 4d 8a 00 02 CHANGE #7 TYP:0 CLS: 1 AFN:6 DBA:0x01804e8a OBJ:70579 SCN:0x0000.00d60a85 SEQ: 1 OP:10.2 index redo (kdxlin): insert leaf row KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x008031d3.07f1.2c REDO: SINGLE / -- / -- itl: 2, sno: 2, row size 14 insert key: (10): 02 c1 04 06 01 80 4d 8a 00 02 CHANGE #8 TYP:0 CLS:31 AFN:2 DBA:0x00800079 OBJ:4294967295 SCN:0x0000.00d60a85 SEQ: 1 OP:5.4 ktucm redo: slt: 0x0025 sqn: 0x00003632 srt: 0 sta: 9 flg: 0x2 ktucf redo: uba: 0x008031d3.07f1.2c ext: 38 spc: 1718 fbi: 0 CHANGE #9 TYP:0 CLS:32 AFN:2 DBA:0x008031d3 OBJ:4294967295 SCN:0x0000.00d60a5d SEQ: 1 OP:5.1 ktudb redo: siz: 104 spc: 2250 flg: 0x0012 seq: 0x07f1 rec: 0x27 xid: 0x0008.025.00003632 ktubl redo: slt: 37 rci: 0 opc: 11.1 objn: 70577 objd: 70577 tsn: 5 Undo type: Regular undo Begin trans Last buffer split: No Temp Object: No Tablespace Undo: No 0x00000000 prev ctl uba: 0x008031d3.07f1.26 prev ctl max cmt scn: 0x0000.00d5fa91 prev tx cmt scn: 0x0000.00d5fb64 txn start scn: 0x0000.00d60a81 logon user: 52 prev brb: 8401354 prev bcl: 0 KDO undo record: KTB Redo op: 0x03 ver: 0x01 op: Z KDO Op code: DRP row dependencies Disabled xtype: XA flags: 0x00000000 bdba: 0x01804d8a hdba: 0x01804d89 itli: 2 ispac: 0 maxfr: 4863 tabn: 0 slot: 1(0x1) CHANGE #10 TYP:0 CLS:32 AFN:2 DBA:0x008031d3 OBJ:4294967295 SCN:0x0000.00d60a85 SEQ: 1 OP:5.1 ktudb redo: siz: 96 spc: 2144 flg: 0x0022 seq: 0x07f1 rec: 0x28 xid: 0x0008.025.00003632 ktubu redo: slt: 37 rci: 39 opc: 10.22 objn: 70578 objd: 70578 tsn: 5 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 index undo for leaf key operations KTB Redo op: 0x04 ver: 0x01 op: L itl: xid: 0x0005.01e.000035e3 uba: 0x008034a5.083a.1d flg: C--- lkc: 0 scn: 0x0000.00d60a81 Dump kdilk : itl=2, kdxlkflg=0x1 sdc=0 indexid=0x1804e09 block=0x01804e0a (kdxlpu): purge leaf row key: (10): 02 c1 03 06 01 80 4d 8a 00 01 CHANGE #11 TYP:0 CLS:32 AFN:2 DBA:0x008031d3 OBJ:4294967295 SCN:0x0000.00d60a85 SEQ: 2 OP:5.1 ktudb redo: siz: 96 spc: 2046 flg: 0x0022 seq: 0x07f1 rec: 0x29 xid: 0x0008.025.00003632 ktubu redo: slt: 37 rci: 40 opc: 10.22 objn: 70579 objd: 70579 tsn: 5 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 index undo for leaf key operations KTB Redo op: 0x04 ver: 0x01 op: L itl: xid: 0x0005.01e.000035e3 uba: 0x008034a5.083a.1e flg: C--- lkc: 0 scn: 0x0000.00d60a81 Dump kdilk : itl=2, kdxlkflg=0x1 sdc=0 indexid=0x1804e89 block=0x01804e8a (kdxlpu): purge leaf row key: (10): 02 c1 03 06 01 80 4d 8a 00 01 CHANGE #12 TYP:0 CLS:32 AFN:2 DBA:0x008031d3 OBJ:4294967295 SCN:0x0000.00d60a85 SEQ: 3 OP:5.1 ktudb redo: siz: 64 spc: 1948 flg: 0x0022 seq: 0x07f1 rec: 0x2a xid: 0x0008.025.00003632 ktubu redo: slt: 37 rci: 41 opc: 11.1 objn: 70577 objd: 70577 tsn: 5 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 KDO undo record: KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x008031d3.07f1.27 KDO Op code: DRP row dependencies Disabled xtype: XA flags: 0x00000000 bdba: 0x01804d8a hdba: 0x01804d89 itli: 2 ispac: 0 maxfr: 4863 tabn: 0 slot: 2(0x2) CHANGE #13 TYP:0 CLS:32 AFN:2 DBA:0x008031d3 OBJ:4294967295 SCN:0x0000.00d60a85 SEQ: 4 OP:5.1 ktudb redo: siz: 80 spc: 1882 flg: 0x0022 seq: 0x07f1 rec: 0x2b xid: 0x0008.025.00003632 ktubu redo: slt: 37 rci: 42 opc: 10.22 objn: 70578 objd: 70578 tsn: 5 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 index undo for leaf key operations KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x008031d3.07f1.28 Dump kdilk : itl=2, kdxlkflg=0x1 sdc=0 indexid=0x1804e09 block=0x01804e0a (kdxlpu): purge leaf row key: (10): 02 c1 04 06 01 80 4d 8a 00 02 CHANGE #14 TYP:0 CLS:32 AFN:2 DBA:0x008031d3 OBJ:4294967295 SCN:0x0000.00d60a85 SEQ: 5 OP:5.1 ktudb redo: siz: 80 spc: 1800 flg: 0x0022 seq: 0x07f1 rec: 0x2c xid: 0x0008.025.00003632 ktubu redo: slt: 37 rci: 43 opc: 10.22 objn: 70579 objd: 70579 tsn: 5 Undo type: Regular undo Undo type: Last buffer split: No Tablespace Undo: No 0x00000000 index undo for leaf key operations KTB Redo op: 0x02 ver: 0x01 op: C uba: 0x008031d3.07f1.29 Dump kdilk : itl=2, kdxlkflg=0x1 sdc=0 indexid=0x1804e89 block=0x01804e8a (kdxlpu): purge leaf row key: (10): 02 c1 04 06 01 80 4d 8a 00 02
The entire transaction is wrapped up into a single redo record with fourteen redo change vectors. (The 9i dump showed a total of fifteen redo change vectors – we’ve lost the “audit” vector in our 10g setup.) In order, the vectors describe how to…
It is reasonable to infer from this that Oracle is generating a list of “forward” changes (i.e. the private redo) in one buffer and a list of “reverse” changes (i.e. the in-memory undo) in another buffer – and simply concatenates the two lists on the commit.
On the plus side, if we’ve only got one redo record we only have to allocate and copy into the (public) redo buffer once – which reduces contention on a single important hot spot in the Oracle database; on the other hand we have to keep a number of data blocks pinned in memory until the commit, and we have to modify all those data blocks at the same time as we copy the two private buffers into the (public) redo log buffer, and this requirement introduces a different concurrency threat.
Because of this alternative threat there is a fairly tight limit on how much change a single transaction is allowed to make using this new mechanism . For 32-bit Oracle the buffers are about 64Kb each, and for 64-bit Oracle they are about 128KB – and if a session fills either of the two buffers it has acquired then Oracle flushes the contents to the public log buffer and the session then carries on using the public log buffer in the “traditional” way (while still keeping “ownership” of the pair of private buffers that it had allocated.)
A final thought – if you issue a rollback instead of a commit, the “commit record” (change #8 in the above) disappears from the redo record, and the redo stream gets a further six redo records showing how each of the previous undo records have been applied at the user’s request; these records are followed by one last redo record which describes the release of the transaction table slot i.e. a “commit record”.
Many people in the UK like to greet the new year with some sort of party, and in this household the type of party we like includes old-fashioned party games (can you draw “Brief Encounters” in 10 seconds ?); so, for all those members of the Oracle community who haven’t prepared their party games this year, here’s a little quiz:
Which processes could report waits for “log file sequential read”, and why ?
Warning: the number of answers I have may, or may not, be the 10 suggested by the list.
Update Jan 5th: There were a couple of answers I wasn’t expecting (and would need to check), but some things are fairly easy to check. Here, for example, is a simple script that will identify sessions that regularly wait for ‘log file sequential read’.
select ses.program from v$session_event sev, V$session ses where sev.event = 'log file sequential read' and ses.sid = sev.sid ;
The obvious one, of course, is ARCH (or ARCn as they tend to be) the archiver process(es). Archiving requires something to read the online redo and copy them to some other location. Here’s a brief extract from tracing an archive process – showing reads of 1MB (2048 x 512 bytes per block) -just after a log file switch:
WAIT #0: nam='log file sequential read' ela= 38410 log#=0 block#=1 blocks=2048 obj#=-1 tim=101515708285 WAIT #0: nam='Log archive I/O' ela= 183580 count=1 intr=256 timeout=-1 obj#=-1 tim=101515892589 WAIT #0: nam='log file sequential read' ela= 84409 log#=0 block#=2049 blocks=2048 obj#=-1 tim=101515977375 WAIT #0: nam='Log archive I/O' ela= 116641 count=1 intr=256 timeout=-1 obj#=-1 tim=101516094684 WAIT #0: nam='log file sequential read' ela= 31139 log#=0 block#=4097 blocks=2048 obj#=-1 tim=101516126157 WAIT #0: nam='Log archive I/O' ela= 122492 count=1 intr=256 timeout=-1 obj#=-1 tim=101516250399 WAIT #0: nam='log file sequential read' ela= 32197 log#=0 block#=6145 blocks=2048 obj#=-1 tim=101516282965 WAIT #0: nam='Log archive I/O' ela= 122779 count=1 intr=256 timeout=-1 obj#=-1 tim=101516552698 WAIT #0: nam='Log archive I/O' ela= 4 count=0 intr=32 timeout=2147483647 obj#=-1 tim=101516552996 WAIT #0: nam='Log archive I/O' ela= 25 count=1 intr=32 timeout=2147483647 obj#=-1 tim=101516553297
Slightly less obvious is LGWR, the log writer itself. Apart from reading the log files at (normal) startup to check that they’re there, LGWR reads the start of the current and next log files on every log switch. Here’s the trace file of LGWR on a log file switch where there are two files in each group:
WAIT #0: nam='log file sequential read' ela= 32788 log#=0 block#=1 blocks=1 obj#=-1 tim=100016453547 WAIT #0: nam='log file sequential read' ela= 18309 log#=1 block#=1 blocks=1 obj#=-1 tim=100016471921 WAIT #0: nam='log file single write' ela= 518 log#=0 block#=1 blocks=1 obj#=-1 tim=100016472654 WAIT #0: nam='log file single write' ela= 516 log#=1 block#=1 blocks=1 obj#=-1 tim=100016473225 Media recovery not enabled or manual archival only 0x10000 WAIT #0: nam='log file sequential read' ela= 90 log#=0 block#=1 blocks=1 obj#=-1 tim=100016473376 WAIT #0: nam='log file sequential read' ela= 73 log#=1 block#=1 blocks=1 obj#=-1 tim=100016473474 WAIT #0: nam='log file single write' ela= 477 log#=0 block#=1 blocks=1 obj#=-1 tim=100016473981 WAIT #0: nam='log file single write' ela= 856 log#=1 block#=1 blocks=1 obj#=-1 tim=100016474885
By playing around with files per group we can work out that the order of activity is:
Next up: any process which is involved in (asynchronous) Change Data Capture (CDC) or Streams will be reading the log files, as will any process running a Log Miner session.
A couple of people suggested the SMON and PX slaves could read the log file on database startup – but as another poster pointed out, it’s the startup process that reads the log files – and if parallel recovery is invoked it passes change vectors to the PX slaves, the slaves don’t read the log files directly. Apart from that, I don’t really know which processes read the log file – I don’t want to say “process X doesn’t” read the log file, because it may do at times that I haven’t considered. (And there are lots of processes to look into in 11g.)
However, there is one (generic) process that can read the log files – and that’s ANY process that has to handle data blocks. And that’s really why I wrote this note in the first place. If a process finds at any time that the buffer it’s looking at in memory is corrupt it may do real-time block recovery if it thinks the corruption has occurred between the moment the block was read from disk and the present. There are various flags and details in the buffer header that tell it that the block is dirty (differs from what’s on disk) and the redo block address (RBA) where the earliest known change to the in-memory block can be found in the log files.
So any process may, in conditions that shouldn’t happen often (we hope), identify an in-memory corrupt, read a block from disc, then go backwards in time in the redo logs and start walking the redo logs to find the redo needed to bring the block up to date. So if you see excessive amounts of “log file sequential read” coming from your foreground session, check what’s going on – it might be an indication of a hardware fault.
Thanks to anyone who participated – and if anyone runs the little script and finds some new examples of log file sequential reads, please add them to the list.
Update Jan 2011:
Thanks to Pavol Babel in comment 8 and his reply to comment 7 for adding MRP0 and RFS to the list of processes that are expected to read the log file.
I wrote an article about system statistics / CPU Costing for Oracle magazine a few years ago – and last week I realised that I’ve never supplied a link to it in the notes and comments I’ve made about system statistics. So I’ve just run a search through the Oracle website trying to find it – and discovered that it’s no longer available. Apparently the editors have decided that any technical articles over a certain age should be withdrawn in case they are out of date and misleading. (Clearly they’ve read my blog on trust – I wish the people maintaining Metalink would do the same as the magazine editors – but they probably have a much larger volume to worry about).
However, I have discovered translations of the article in Russian, Korean and Chinese – so if you can read any of these languages, you might want to take a look at them before they disappear too.
If you want an original English version – dated April 2004, which is when I sent it in to Oracle Magazine, and before it underwent some editing – I’ve posted it as a pdf file.
In a recent article about list partitioned tables I raised some questions about the cases where the optimizer can’t (yet) do partitioning pruning even when the opportunity is clearly visible to the human eye. The most important example was the case where each partition was defined to hold rows for just one partition key value – but the optimizer could not prune out the redundant partition for a query like: “partition_key != {constant}”.
I recently came across a situation where this really made a big difference. The system had a huge table that had been list partitioned as follows (with some camouflage):
partition by list (status) (
partition p_state01 values ('STATE01'),
partition p_state02 values ('STATE02'),
....
partition p_state25 values ('STATE25'),
partition p_handled values ('Completed')
)
The table was defined to allow row movement, and every day there would be tens of thousands of rows moving through various states until they reached the “Completed” state.
There are various pros and cons to this setup. The most significant con is that when you update the status of a row Oracle actually has to update the row “in situ”, then delete it from what is now the wrong partition and insert it into the right partition. The most dramatic pro is that if the rows you’re interested in are (almost always) the ones that haven’t got to the “Completed” you’ve put all the boring old garbage out of the way where it doesn’t cause any problems. (In fact, if you’re running 11.2 you might choose to declare some of the “Completed” partitions of any local indexes as unusable and save yourself a lot of space – and by the time I’ve finished this article you might think this is a truly wonderful idea.) In the case of the client, there were about 200 million rows in the completed partition, and barely 2 million spread over the other partitions.
There was a bit of a problem, though. Some of the indexes on this table had been created as local indexes (arguably they should all have been local)and this resulted in some odd optimisation side effects. Here’s a little bit of code to build a table that demonstrates an interesting issue:
create table t1 (
id,
n1,
small_vc,
padding
)
partition by list (n1) (
partition p0 values(0),
partition p1 values(1),
partition p2 values(2),
partition p3 values(3),
partition p4 values(4),
partition p5 values(5)
)
as
with generator as (
select --+ materialize
rownum id
from dual
connect by
rownum <= 10000
)
select
rownum id,
trunc(log(10,rownum)) n1,
lpad(rownum,10,'0') small_vc,
rpad('x',100) padding
from
generator v1,
generator v2
where
rownum <= 999999
;
create index t1_n1 on t1(n1, small_vc) local nologging;
create index t1_id on t1(id, small_vc) local nologging;
begin
dbms_stats.gather_table_stats(
ownname => user,
tabname =>'T1',
estimate_percent => 100,
method_opt => 'for all columns size 1'
);
end;
/
break on index_name skip 1
select
index_name, partition_name, blevel, num_rows, leaf_blocks
from
user_ind_partitions -- but see comment #1 below from Tony Sleight
order by
index_name, partition_name
;
Thanks to the log definition of column n1, you will see a very skewed distribution of data across the partitions, and the output from the query against the index partitions shows this quite dramatically. Since the code sample uses a 100% sample on the stats, you should get the following figures for the indexes (with a little variation in leaf blocks, perhaps, depending on your version and tablespace definitions. I was using 11.1.0.6 with 8KB blocks, locally managed tablespaces, freelists, and 1MB uniform extents.)
INDEX_NAME PARTITION_NAME BLEVEL LEAF_BLOCKS NUM_ROWS
-------------------- -------------------- ---------- ----------- ----------
T1_ID P0 0 1 9
P1 0 1 90
P2 1 4 900
P3 1 33 9000
P4 1 339 90000
P5 2 3384 900000
T1_N1 P0 0 1 9
P1 0 1 90
P2 1 4 900
P3 1 32 9000
P4 1 314 90000
P5 2 3136 900000
So here’s important question number 1: What do you think the blevel will be at the global level for the two indexes ?
Important question number 2: If you have query with a where clause like “id between 100 and 400 and n1 != 5″ – which is designed very precisely to exclude the last partition – what value of blevel is Oracle going to use when considering the cost of using the index t1_id to access the data ?
My answers are in this follow-up post.
Here’s a little detail that I’ve known for years – but keep forgetting until I embarrass myself by rediscovering it (usually in front of a client). I’ll start with a demonstration of a useful little feature of mandatory columns:
drop table t1 purge; create table t1 as select * from all_objects where rownum <= 10000 ; execute dbms_stats.gather_table_stats(user,'t1') create index t1_i1 on t1(object_name); set autotrace traceonly explain select count(*) from t1; /* -------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time -------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 (0)| 00:00:0 | 1 | SORT AGGREGATE | | 1 | | | 2 | INDEX FAST FULL SCAN| T1_I1 | 10000 | 13 (0)| 00:00:0 -------------------------------------------------------------------- */
Oracle can use the index on column object_name to count the number of rows in the table because the column has been declared NOT NULL, so every row in the table also has to appear in the index. Let’s just demonstrate that by changing the column definition:
alter table t1 modify object_name null; select count(*) from t1; /* ------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 40 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | TABLE ACCESS FULL| T1 | 10000 | 40 (0)| 00:00:01 | ------------------------------------------------------------------- */
Now let’s make the column mandatory again – by adding a constraint:
Have you ever wondered if there’s a way to see when the data in a table was last modified ? Depending on what you’re really trying to achieve, how accurate you want the answer to be, what resources you’re prepared to use, and how you’ve configured your database, you may be able to get an answer that’s good enough very easily.
If all you want is a rough indication that the table hasn’t changed over the last few days, or weeks, or even months, you may be able to run a simple, but potentially brutal, query against the table to find out. Here’s an example that get’s there in three steps – using the table sys.link$ as a target. Steps 1 and 2 are for explanatory purposes only.
select ora_rowscn from link$ ; select first_change#, first_time from v$log_history order by first_change#; select min(first_time) from v$log_history where first_change# >= ( select max(ora_rowscn) from link$ ) ;
The pseudo-column ora_rowscn is an upper bound for the SCN at which the last change to a row committed. If you have enabled rowdependencies this will generally be the actual commit SCN for the row, otherwise Oracle has various algorithms for working out the largest SCN avaiable that is not less than the SCN at which the transaction committed. If all you’re interested in is an indication of when the table last changed even this is upper bound SCN may be good enough.
Step 2 is there to remind you that v$log_history records the SCN and timestamp for the first change in the file. This gives us a very crude conversion between SCNs and timestamps.
The final step gives us the answer we want. We find the approximate SCN of the most recent change to the table, and then report the timestamp of the first log file that started at a higher SCN.
Recent comments
16 weeks 5 days ago
26 weeks 3 days ago
28 weeks 1 day ago
31 weeks 2 days ago
33 weeks 4 days ago
43 weeks 1 day ago
44 weeks 5 days ago
45 weeks 5 days ago
45 weeks 6 days ago
48 weeks 4 days ago