“Thicker Than Water” is book 4 in the Felix Castor series by Mike Carey.
I’m not even going to try to summarize this bad boy. It’s just a crazy-paced page turner. We find out more about Felix’s life before he became a freelance exorcist. More about his brother, the Catholic priest. More about the Anathemata, the excommunicated militant arm of the Catholic Church. Juliet, the succubus, gets involved in the action again. It’s just bam, bam, bam. Love it.
I missed my fifth blogging anniversary by a month – my first post was dated 24th Oct 2006 on the topic of the parameter optimizer_index_cost_adj.
Since then I’ve posted about 670 articles (and deleted a few temporary ones) and drafted another 110; I’ve acquired 750 followers; and the blog has inspired a little over 6,000 comments. It’s amazing how time passes and the numbers accumulate.
To celebrate the event I’ve created a new category (see top right) of “Site History” so that I could label that first post and the various posts I’ve made about viewing figures and popular posts.
Charles Hooper’s recent post on the recycle bin (which is, BTW, documented behavior) reminded me of an issue with that functionality that I saw recently. The problem was that a single-row INSERT INTO table VALUES () statement hung for more than an hour, burning CPU.
Here is what the SQL Monitoring report for the statement looked like:
SQL Monitoring Report

SQL Text
------------------------------
INSERT INTO TAB(X1,X2,X3,X4) VALUES (:1,:2,:3,:4)

Global Information
------------------------------
 Status              :  EXECUTING
 Instance ID         :  1
 Session             :  USER1 (101:34227)
 SQL ID              :  55wauznayyk1r
 SQL Execution ID    :  16777303
 Execution Started   :  11/23/2011 13:11:41
 First Refresh Time  :  11/23/2011 13:11:49
 Last Refresh Time   :  11/23/2011 14:26:50
 Duration            :  4511s
 Module/Action       :  JDBC Thin Client/-
 Service             :  SYS$USERS
 Program             :  JDBC Thin Client

Global Stats
===========================================================================================
| Elapsed |   Cpu   |    IO    | Concurrency | PL/SQL  |  Other   | Buffer | Read | Read  |
| Time(s) | Time(s) | Waits(s) |  Waits(s)   | Time(s) | Waits(s) |  Gets  | Reqs | Bytes |
===========================================================================================
|    4528 |    4525 |     0.71 |        0.01 |    0.01 |     2.46 |   202M |   71 | 568KB |
===========================================================================================

SQL Plan Monitoring Details (Plan Hash Value=0)
===============================================================================================================================
| Id |        Operation          | Name |  Rows   | Cost |   Time    | Start  | Execs |   Rows   | Activity | Activity Detail |
|    |                           |      | (Estim) |      | Active(s) | Active |       | (Actual) |   (%)    |   (# samples)   |
===============================================================================================================================
|  0 | INSERT STATEMENT          |      |         |      |           |        |     1 |          |          |                 |
|  1 |   LOAD TABLE CONVENTIONAL |      |         |      |           |        |     1 |          |          |                 |
===============================================================================================================================
And the pstack output for the Oracle process looked like this most of the time:
ffffffff7c5da730 times (64, 4a6f0, 49e2794, 10c20aba8, 10c1c04b8, 10c1ce850) + 8
000000010633d76c opiosq0 (4a, e, ffffffff7ffdc650, 20, 0, ffffffff7ffdc6c9) + 4cc
00000001035a0444 opiall0 (ffffffff7ffdc728, 1, ffffffff7ffdce10, 0, ffffffff7ffdcd28, ffffffff7ffdccd0) + 964
00000001035cf01c opikpr (65, 1c, ffffffff74167bd8, 10c000, 10c1ca000, 1648) + 21c
00000001035b86b4 opiodr (10a755, 1c, 1035cee00, 10a400, 1400, ffffffff7bc20178) + 594
0000000105143a64 rpidrus (ffffffff7ffdda78, 10c1c9ce8, 1, 0, 10a6a2, 10a400) + c4
00000001077e6260 skgmstack (ffffffff7ffddce0, 10c1c9ce8, 1051439a0, ffffffff7ffddcb8, ffffffff7ffddce0, ffffffff7ffddcb8) + 60
0000000105143fa4 rpidru (ffffffff7ffde5c8, 10a686, 10a400, 105143f00, 105000, 10a400) + a4
0000000105142cac rpiswu2 (67252c600, 0, ffffffff7ffde438, 10a6a2, 10c1ca490, 0) + 28c
000000010514b588 kprball (64, 8, 0, 2000, 105143f00, 4000) + 588
0000000105c23814 kdpurcrb (1154, 11, 0, ffffffff7ffdf308, 10a6a2, 10c1ce) + 1d4
0000000105c27a9c kdpurts (ffffffff7ffdf580, ffffffff7ffdf550, 1, ffffffff7ffdf5fc, 1a00000000, 380016) + 1a7c
0000000105c28250 kdpursp (5, 100, ffffffff7ffdf838, 1b0, 7ffffffc, 7ffffc00) + 70
00000001023c2fd4 ktsx_purge_bin (5, 80, ffffffff7ffdff7c, ffffffff7ffdff78, 5, 10c000) + 134
0000000102441ad0 ktspnr_next (ffffffff7ffe06c0, 102440e20, 10ac50558, 10ac50500, 10ac00, 10a6a2) + cb0
0000000101a0bca0 ktrsexec (ffffffff7ffe06c0, ffffffff7ffe08b0, 380000, 380025000, 380025, b) + 240
0000000102436128 ktspbmphwm (ffffffff79af7fe8, 0, 0, 4cd9980, 76, 1527801) + 468
0000000102430204 ktspmvhwm (ffffffff7ffe0ec0, 1024313cc, e, 10c1c9, ffffffff79af7fe8, 409) + 24
0000000102431448 ktsp_bump_hwm (10a400, ffffffff7ffe3fa8, 794, 10c1ca490, 1, 582) + c8
00000001024321f4 ktspgsp_main (0, 152780100000000, 380024ed8, 0, 0, 3) + 5d4
0000000103888dbc kdisnew (ffffffff7bc78860, 50c43e2, ffffffff7ffe4678, 2, ffffffff79af7fe8, 0) + 11c
000000010388bdc8 kdisnewle (ffffffff79af7fd0, 50c43e2, 2d, ffffffff7ffe4678, 0, 0) + a8
00000001038b2334 kdisle (0, 2d, 1527803, 50c43e2, 1, 2c) + 1e34
00000001037c3c54 kdiins0 (6371cfdb0, 0, 0, ffffffff7fff59a8, 0, 0) + 2bb4
000000010381a1a0 kdiinsp (6371cfdb0, ff000000000000, 0, ffffffff7fff59a8, 2, 2) + 60
0000000100c71154 kauxsin (10a400, ffffffff7bc78860, 0, 10c000, 10a6a2000, 0) + 7d4
00000001062d2764 qesltcLoadIndexList (ffffffff79ce5530, 6371cfe30, 0, 0, 0, 636f17848) + 384
00000001061d5a00 qerltcNoKdtBufferedInsRowCBK (ffffffff79ce54e8, 636f17848, 6371d0140, 10c1ca490, 500080025, 1061d5880) + 180
00000001061d10bc qerltcLoadStateMachine (1061d4, 636f17848, 10c1ca2e0, 1061d4ee0, ffffffff79ce54e8, 1) + dc
00000001061d1a38 qerltcInsertSelectRop (ffffffff7fff7f88, ffffffff7fff7ee8, ffffffff79cc0518, 10c1ca490, 104c00, 0) + d8
0000000106068284 qeruaRowProcedure (69000001, 7fff, ffffffff7fff7c18, 800000, 1061d1960, 80000) + 124
0000000106125330 qerfiFetch (642d22848, ffffffff7bc6ce58, 106068160, ffffffff7fff7c18, 7fff, 107) + 50
000000010272148c rwsfcd (642d227d0, ffffffff7bc6ce88, 106068160, ffffffff7fff7c18, 7fff, 106068160) + 6c
0000000106068620 qeruaFetch (33000002, 33008002, 106068160, ffffffff7fff7c18, 642d227d0, ffffffff79ce51b0) + 1a0
000000010272148c rwsfcd (66375c718, ffffffff79ce5458, 1061d1960, ffffffff7fff7f88, 7fff, 1061d1960) + 6c
00000001061cdfbc qerltcFetch (800000, 636f17848, 104a70c00, ffffffff79ce54e8, 10c1ca2e0, 1061d1960) + 39c
0000000104a711dc insexe (10c1df, 1061cdc20, 66, 0, 10c000, 104800) + 35c
00000001035c305c opiexe (49, 3, ffffffff7fffa26c, 0, 0, bfffffff) + 263c
0000000103e6d560 kpoal8 (5e, 1c, ffffffff7fffd728, 10c000, 10c1ca000, 1648) + 940
00000001035b86b4 opiodr (10a755, 1c, 103e6cc20, 10a400, 1400, 10c1c9000) + 594
00000001070fd3a0 ttcpip (0, 10a686d94, 10c1ca2e0, ffffffff7fffd728, ffffffff7fffc170, 10c1e0e98) + 420
000000010358c018 opitsk (10a686d94, 10c1e62e8, 10c1e0ca4, 10c1defa8, 0, 10c1c9fa0) + 5f8
000000010359d2e8 opiino (2270, 10c1e0d20, 10c000, 380000, 9b, ffffffff7ffff080) + 3e8
00000001035b86b4 opiodr (10c1e0000, 4, 10359cf00, 10c000, 1400, 10c1c9000) + 594
0000000103580c4c opidrv (3c, 4, ffffffff7ffff080, 1ea1c8, ffffffff7c742fd0, ffffffff7ffff560) + 44c
00000001046e6f3c sou2o (ffffffff7ffff058, 3c, 4, ffffffff7ffff080, 10c000, 10b800) + 5c
0000000100604430 opimai_real (0, ffffffff7ffff328, ffffffff7f73aeb8, 10b800, 1, 2) + 130
00000001046fd140 ssthrdmain (10c000, 2, 44d000, 100604300, 10c1ef000, 10c1ef) + 140
0000000100604214 main (2, ffffffff7ffff438, 0, ffffffff7ffff338, ffffffff7ffff448, ffffffff7eb00200) + 134
00000001006040bc _start (0, 0, 0, 0, 0, 0) + 17c
Looking at this stack trace, it’s reasonable to assume that Oracle tried to bump the High Water Mark (ktsp_bump_hwm, line 19) while executing the client’s INSERT statement, then tried to free space from the recycle bin (ktsx_purge_bin, line 14), which in turn resulted in a recursive statement being executed (opiodr, line 5). To find out which statement the process was executing we used errorstack at level 3, and the culprit SQL was dumped at the beginning of the trace:
----- Error Stack Dump -----
----- Current SQL Statement for this session (sql_id=b52m6vduutr8j) -----
delete from RecycleBin$ where bo=:1
This statement was executed in a recursive session, probably in a loop. Cumulative SQL execution statistics showed that it had been executed about 1M times, for about 800M buffer gets. With recyclebin$ occupying 1024 blocks and an average row length of 50 bytes (a low estimate), it looks like 1) the statement is executed far more often than necessary, and 2) Oracle tries to clean up the whole recycle bin, which is strange. I think SQL tracing of such a situation would give more detail on how Oracle manages data that resides in the recycle bin. For me it was enough to know that 1) the issue is definitely caused by the recycle bin functionality, and 2) we have application installation code that can re-create tables many times, which most likely was the catalyst for the problem.
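As a rough sanity check on those figures, here is a minimal Python sketch that turns the approximate totals into a per-execution cost. It uses only the rounded numbers quoted above. The variable names are mine, and comparing the result with the size of recyclebin$ is my own way of reading the numbers, not something the trace confirmed.

```python
# Approximate figures quoted in the post (rounded, not exact measurements).
executions = 1_000_000          # ~1M executions of the recursive delete
buffer_gets = 800_000_000       # ~800M buffer gets in total
recyclebin_blocks = 1024        # blocks occupied by recyclebin$

# Average work per execution of "delete from RecycleBin$ where bo=:1".
gets_per_exec = buffer_gets / executions

# How that compares with the size of the table itself.
share_of_table = gets_per_exec / recyclebin_blocks

print(f"{gets_per_exec:.0f} buffer gets per execution")
print(f"{share_of_table:.0%} of the blocks in recyclebin$")
```

Roughly 800 buffer gets per execution against a 1024-block table suggests that each execution visits a large part of the table, which would fit the impression that the clean-up is doing far more work than a single-row insert should need.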
November 28, 2011 There are a few articles that can be found on the Internet that describe the cause of indexes with names similar to BIN$ld5VAtb88PngQAB/AQF8hg==$0 and BIN$PESygWW5R0WhbOaDugxqwQ==$0 appearing in execution plans. As is likely known by readers, the Oracle Database documentation describes that these object names are associated with the recycle bin that was introduced in Oracle Database [...]
I’ve just received an email telling me that Apress is having a “Happy Cyber Monday”. Translated into English this means that for today only they’re selling every eBook at $15.00 each. If you were planning to get an eBook version of Oracle Core, today might be the best day to do it. Here’s a link to the specific page on the Apress site. I’ve repeated this link in the picture of the book at the top-right of this page.
If you want the paper copy, it started shipping from Amazon in the US quite recently but isn’t yet shipping from Amazon UK.
[Update: Although Amazon UK is not shipping the book, you will find that there are a couple of alternative suppliers listed on Amazon UK and one who claims that they can deliver in 3 to 4 working days.]
Note – the book is available in ePub, Mobi, and pdf formats.
Note 2 – it’s possible that some local sales tax will be added dependent on where you are purchasing from.
I’ve uploaded the source code library to my blog as a zip file. Instructions for download are on my index page for the book.
Although the book isn’t available directly from Amazon UK, it’s been on their rankings as the number 1 Oracle book for the last 24 hours. That’s nice, of course, but the reason I wrote this update wasn’t to brag, it was to point out that one of the other categories that Amazon has assigned to the book is Home Office Databases!
If you go to the main page I’ve set up for the book you will find details of two discount codes for the book – one of them expires on 16th Dec. (You’ll also find the link to the source code that I’ve set up for my copy of the zip file.)
Before I dive into this blog post, quick heads up for anyone attending UKOUG: on Tuesday only, I’ll be hanging out with some very smart people from the IOUG RAC Special Interest Group in the “gallery” above the exhibition hall. We’re ready to help anyone run a RAC cluster in a virtual environment on their own laptop. And if your laptop doesn’t meet the minimum requirements then you can try with one of our demo workstations. Come find us!!
I’ve heard Kyle Hailey speak on a few different occasions, and more than once he’s talked about the power of visualizing data. (In fact Kyle was a key person behind Grid Control’s performance screens.)
I couldn’t agree more. I regularly visualize data when I’m working. Two reasons come immediately to mind:
Last week, I made a few quick charts as illustrations for a performance report. The process really isn’t that complicated, but I thought I’d put the steps into a blog post… for myself to reference in the future and for anyone else who might find this helpful. :)
This demonstration will use data from the AWR to build graphs. Note that if you run these queries, Oracle legally requires you to purchase the extra-cost “diagnostic pack” license. But similar queries could be written from free statspack or S-ASH tables.
You need multiple data points to make a graph. For this demo, the AWR was configured to take snapshots every 30 minutes and I’m looking at a query which ran for about 10 hours. Also, it was the only query running in the instance for most of that time – so I will also look at some instance-wide statistics.
The first step is to get any needed parameters for pulling performance data. In the case of the AWR, I will need the INSTANCE_NUMBER, the SQL_ID and the first/last SNAP_ID.
It’s pretty easy to get this information from Grid Control or from Database Console. But if you don’t have access to the web console then you can still get the info from SQLPlus.
Here’s a useful query to get an overview of the SNAP_IDs:
SQL> select to_char(BEGIN_INTERVAL_TIME,'MON YYYY') month, min(snap_id) min_snap, max(snap_id) max_snap
     from dba_hist_snapshot where instance_number=4
     group by to_char(BEGIN_INTERVAL_TIME,'MON YYYY');

MONTH        MIN_SNAP     MAX_SNAP
-------- ------------ ------------
APR 2009        10239        10240
DEC 2009        28752        31924
FEB 2010        38115        40939
MAY 2010        47498        48783
AUG 2010        54975        55013
NOV 2010        60979        61986
DEC 2010        61987        64218
JAN 2011        64219        66448
FEB 2011        66449        67803
MAR 2011        67804        69291
APR 2011        69292        70731
MAY 2011        70732        72219
JUN 2011        72220        73655
JUL 2011        73656        75139
AUG 2011        75140        76608
SEP 2011        76609        78048
OCT 2011        78049        79536
NOV 2011        79537        80338

18 rows selected.
Something similar to this might also be useful:
SQL> select snap_id,instance_number,begin_interval_time,snap_level
     from dba_hist_snapshot
     where begin_interval_time between '11-nov-11 17:30' and '11-nov-11 19:00'
     and instance_number=4
     order by snap_id, instance_number;
For this demo I’m going to use INSTANCE_NUMBER 4 and SQL_ID 8suhywrkmpj5c between snaps 80298 and 80318.
Now create a new spreadsheet in your office suite. I use the free OpenOffice spreadsheet application, but Excel or iWork Numbers should work pretty much the same.
In the second row of the new spreadsheet, enter the time of the first snapshot you’re going to analyze. In the third row, enter this formula:
= A2 + 1/24/60 * [minutes between snaps]
Select several rows below this formula and select Edit > Fill > Down to copy the formula to the following rows. Repeat this until you have reached the end of your analysis window.
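If you would rather generate the time column outside the spreadsheet, the same fill-down logic is easy to sketch in Python. This is only an illustration: the function name and the start time are hypothetical (the post does not give the exact time of the first snapshot), while the 30-minute interval and the 21 snapshots (snaps 80298 through 80318) come from the demo.

```python
from datetime import datetime, timedelta


def snapshot_times(first_snapshot, minutes_between_snaps, count):
    """Return `count` timestamps spaced `minutes_between_snaps` apart.

    Mirrors the spreadsheet formula A2 + 1/24/60 * [minutes between snaps]
    filled down `count - 1` times.
    """
    step = timedelta(minutes=minutes_between_snaps)
    return [first_snapshot + i * step for i in range(count)]


# Hypothetical start time; 21 rows cover snaps 80298 through 80318.
times = snapshot_times(datetime(2011, 11, 11, 17, 30), 30, 21)

# Print one timestamp per line, ready to paste into column A.
for t in times:
    print(t.strftime("%Y-%m-%d %H:%M"))
```

Twenty intervals of 30 minutes span ten hours, which matches the runtime of the query in this demo.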
Open a SQLPlus session. We will copy-and-paste directly from SQLPlus into the spreadsheet.
Use a SQL like this to retrieve data for one system statistic:
set pagesize 999
col value format 999999999999999
select value from dba_hist_sysstat
where instance_number=4 and snap_id between 80298 and 80318
and stat_name='transaction tables consistent read rollbacks'
order by snap_id;
You can copy the statistic name directly from an AWR report if there’s a certain stat you’re interested in. You can find more information about system statistics in Oracle’s docs.
Now move right to the next empty column. First, copy the name of this statistic into the first row. Then, in the second box, enter a formula to find the difference between its left peer and the left upper peer. For cell C3, the formula is B3-B2. Choose Edit > Fill > Down again, as before.
= B3 - B2
You can repeat this step to add further system statistics. You can also create another column that divides or multiplies one statistic by another.
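The reason for the extra column is that the AWR values are cumulative counters, so the number worth graphing is the change between consecutive snapshots. A minimal Python sketch of the same B3 - B2 fill-down follows; the function name and the sample values are made up for illustration.

```python
def snapshot_deltas(cumulative):
    """Return the difference between each value and the one before it.

    Equivalent to filling the spreadsheet formula "= B3 - B2" down a column:
    the result has one entry fewer than the input.
    """
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


# Hypothetical cumulative values for one statistic, one per snapshot.
sample = [1200, 1450, 1450, 2100]
print(snapshot_deltas(sample))  # [250, 0, 650]
```

A ratio column works the same way: compute the deltas for two statistics and divide them element by element.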
The previous SQL statement retrieved system statistics. Another easy query runs against historical V$SQLSTAT snapshots. (This only works for long-running queries.)
SQL> select BUFFER_GETS_TOTAL value from dba_hist_sqlstat
     where instance_number=4 and snap_id between 80298 and 80318
     and sql_id='8suhywrkmpj5c'
     order by snap_id;
Once again, you can read about the available fields and statistics in the Oracle docs. You can repeat this step to quickly get additional statistics for a particular SQL, and you can then combine some stats for better graphs.
That’s it! I know, really not that complicated. I hope it’s helpful. :)
For anyone who’s curious, the charts in this article are related to a SQL report which was recently discussed on the mailing list.
Thought of the day:
An XFS file system with 98% free space, 6 files and 61.11% fragmentation:
# df -h .
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             100G  1.1G   99G   2% /test
#
# find . -type f -print | wc -l
6
# xfs_db -r -c frag /dev/sdb1
actual 18, ideal 7, fragmentation factor 61.11%
When I asked about this oddity in a conversation with Dave Chinner (XFS Kernel owner) I was expecting a lot of complex background on what this 61.11% actually means. His response? I’ll quote:
18 – 7 / 18 = 0.6111111
[…]it’s been that way forever. Ignore it – it’s much more important to look at the files themselves […]
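For the record, the arithmetic only works with the subtraction grouped first, i.e. (18 - 7) / 18. A minimal Python sketch, assuming "actual" and "ideal" are the two counts printed by xfs_db above; the function name is mine:

```python
def fragmentation_factor(actual, ideal):
    """Return the fragmentation factor as a percentage.

    Computed as (actual - ideal) / actual, matching the figure that
    xfs_db printed for the file system above.
    """
    return (actual - ideal) / actual * 100


# Values from the xfs_db output: actual 18, ideal 7.
print(f"{fragmentation_factor(18, 7):.2f}%")  # 61.11%
```

With numbers this small a handful of extra extents moves the percentage a long way, which is one more reason to look at the files themselves rather than the factor.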
I like Dave’s candor, and I have found that individual file analysis does yield interesting information, as I showed in my post entitled Little Things Doth Crabby Make – Part XVII. I See xfs_mkfile(8) Making Fragmented Files.
As for deprecated tools, I also have no problem with that. There may have been a day when this command spat out useful information (perhaps in XFS’s previous SGI Unix life?) and folks have scripted to it. Basically, OS distributions can’t just discard such a command. It just goes that way…no problem.
Way Off Topic
Maybe the next time Dave is in the Bay Area we can repeat the curry! That would be nice.
It was good to see the answers to the last Quiz Night accumulating. The problem posed was simply this: I have two IOTs and I’ve inserted the same data into them with the same “insert as select” statement. Can you explain the cost of a particular query (it’s the same for both tables) and the extreme difference in the work actually done? Here’s the query, the critical stats on the primary key indexes, the shared plan, and the critical execution statistic for running the plan.
SQL> select max(v1) from tX;

INDEX_NAME  BLEVEL LEAF_BLOCKS    AVG_LPK    AVG_DPK    CLUF
---------- ------- ----------- ---------- ---------- -------
T3_PK            1           8          1          1    2000
T4_PK            1          12          1          1    2000

---------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Bytes | Cost  |
---------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 |     3 |  2003 |
|   1 |  SORT AGGREGATE       |       |     1 |     3 |       |
|   2 |   INDEX FAST FULL SCAN| Tx_PK |  2000 |  6000 |  2003 |
---------------------------------------------------------------

Statistics (for tX = table t3)
----------------------------------------------------------
       2012  consistent gets

Statistics (for tX = table t4)
----------------------------------------------------------
         17  consistent gets
The answer was in the definition of the two tables – which is why I didn’t supply the DDL to create the structures. In fact I had made it virtually impossible for anyone to get to a totally accurate result because two of the columns were defined as varchar2(40) but held values which were no more than 4 characters long; nevertheless several people deduced that the answer related to overflow segments, and different declarations for the overflow for the two tables.
Here are the original declarations of the two tables (the t1 table used the same select statement, selecting 2,000 rows, to create a simple heap table):
create table t3 (
	id1, id2, v1, v2, padding,
	constraint t3_pk primary key(id1, id2)
)
organization index
including id2		-- Note the included column
overflow
as
select
	mod(rownum,20)			id1,
	trunc(rownum,100)		id2,
	to_char(mod(rownum,20))		v1,
	to_char(trunc(rownum,100))	v2,
	rpad('x',500)			padding
from
	t1
where
	rownum = 0
;

create table t4 (
	id1, id2, v1, v2, padding,
	constraint t4_pk primary key(id1, id2)
)
organization index
including v2		-- Note the included column
overflow
as
select
	mod(rownum,20)			id1,
	trunc(rownum,100)		id2,
	to_char(mod(rownum,20))		v1,
	to_char(trunc(rownum,100))	v2,
	rpad('x',500)			padding
from
	t1
where
	rownum = 0
;
Note especially the included clause and the overflow keyword. A better choice of syntax would have been for the included clause to be named the “exclude everything after” clause, because that is (essentially) the effect it has. All columns after the included column are stored in a separate segment and do not appear in the index segment.
With this in mind, we can understand that the v1 column will be in the overflow segment for t3, but in the index segment for t4. The existence of an overflow segment accounts for the cost reported, and the difference in the content of the overflow accounts for the difference in the work done.
When we have an overflow segment, Oracle can produce a clustering_factor for the index segment. The optimizer then assumes that ANY column – other than primary key columns – referenced in the select list must be in the overflow, whether or not that is in accordance with the object definition. This behaviour accounts for the large cost reported for both queries: the data scatter in the overflow segment is very bad because of the way I generated the data, and the optimizer assumes that the v1 value will require a visit to the overflow in both queries.
When I run the queries, Oracle finds the v1 column in the index segment for t4 as it does the index fast full scan, so the work done is basically the buffer visits for each block in the index segment (plus a couple extra for the segment header visits); however, for each index entry visited in the t3 fast full scan Oracle finds that it has to jump to a different overflow block from the previous one it was looking at, so the number of block visits is essentially the same as the number of rows in the table. (autotrace and tkprof didn’t quite agree on the number of blocks visited – which is why the autotrace figure is a little lower than you might expect.)
The included clause requires a little more subtlety than just checking “columns after the included column”. The internal (data dictionary) column order is not necessarily the same as the visible order when you create or describe the object. One of the critical differences is that Oracle moves the primary key columns to be the first columns in the internal order; it will also move a long column (if you have one) to be the last column.
- new easier installation process
- new metrics history (IO and system)
- improved support for RAC / multi database in one repository
- Oracle scheduler is used to manage jobs
This version of OraSASH has been tested with Oracle 10g and 11g R1 and R2 as a target database and Oracle 11gR2 Express Edition as repository.