Here’s a treat for Oracle geeks, hackers and maniacs out there…
Update: As the beginning says, this article was meant as something interesting about Oracle's internals and CTRL+C / OCICancel() handling. There's a more practical way of canceling session calls if you are running Oracle Enterprise Edition and are using the Resource Manager:
You can set the consumer group for a session to CANCEL_SQL to cancel its current call:
dbms_resource_manager.switch_consumer_group_for_sess(
   session_id      IN NUMBER,
   session_serial  IN NUMBER,
   consumer_group  IN VARCHAR2);
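A minimal usage sketch (the SID and SERIAL# values below are placeholders you would look up in V$SESSION):

   BEGIN
     dbms_resource_manager.switch_consumer_group_for_sess(
       session_id     => 137,            -- SID from V$SESSION (placeholder)
       session_serial => 2914,           -- SERIAL# from V$SESSION (placeholder)
       consumer_group => 'CANCEL_SQL');  -- special group that cancels the session's current call
   END;
   /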
Thanks to commenter "null" for this info. Note that I haven't tested how (or whether) this feature works correctly, so there's some homework for you ;-)
I recently received a question about how to cancel queries running in another Oracle session, so that the session would not be killed, but would remain alive.
Well, there's no supported way that I can point you to, but thanks to how Oracle handles out-of-band breaks on Unix platforms, you can cancel database calls using an OS tool – kill.
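In a nutshell, the trick is along these lines (a hedged sketch based on my reading of "out-of-band breaks" – verify before using in anger; the PID is a placeholder for the target server process's SPID from V$PROCESS):

   kill -URG 12345    # SIGURG = out-of-band break; cancels the current call without killing the session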
Before we go on, here’s how query cancellation (pressing CTRL+C in sqlplus for example) works in Oracle:
It was very disappointing to have to cancel my trip to RMOUG Training Days: I fell ill and, as a result, was unable to catch my flight.
But I can always share my presentations here. I had two presentations planned for the training days; they can be accessed below:
Why optimizer hates my sql
RMOUG training days audience: Please accept my sincere apologies.
If you are in Singapore and have the afternoon of 24th Feb available, you can register for a free Oracle performance troubleshooting seminar I'm doing at Singapore Management University's (SMU) campus.
The seminar will be about:
The date is Wednesday, 24th Feb
The seminar time is from 15:30-19:00 (don’t be late)
Registration and more details are here:
Welcome back for the concluding part of this series of blogs.
In the last part I went through a brief primer about code execution and stack frames in preparation for this posting. I hope that wasn’t too boring for everyone – I think it’s really important to make sure all the preliminary knowledge is covered and not make too many assumptions about the reader. In this part I will finally get to the point, and talk about some alternative techniques for determining the reasons for poor performance for our example user session.
To recap very quickly, we have a user session taking 100% CPU in top, nothing is emitted in SQL trace files, Oracle’s Time Model shows it is executing SQL, and that’s about all the information we have. As Tanel stated in his posting for Part Two, this is the stage where it makes good sense to dig into session-level statistics (not system-level statistics) to see if any of the counters are incrementing in a conspicuous way. If you don’t already have a copy of Tanel’s Snapper code, I suggest you grab it and get familiar with it – it’s a nice implementation of an essential tool that many DBAs already have some kind of script for.
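If you don't have a ready-made session-statistics script, Snapper's basic invocation is roughly of this form (quoted from memory – treat the exact parameters as an assumption and check the header of snapper.sql for the authoritative syntax of your version):

   SQL> @snapper out 5 1 <sid>    -- report V$SESSTAT (and more) deltas for session <sid> over one 5-second snapshot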
Tanel found in Part Two that the SQL execution was revisiting the same small number of buffers very frequently, indicating that there might be a problem with one of the Nested Loop join methods used by the optimiser. Perhaps there is something wrong with that part of the plan (or with the whole plan)? Using counters can often reveal a potential problem this way, but they don't necessarily put a neon sign reading "Here's Where All The Time Went" on top of the offending item. This is where we might want to dig further using some of our knowledge of code execution and stack traces, in combination with a dynamic tracing infrastructure such as DTrace or SystemTap.
The power of DTrace and friends lies in a few key areas:
These are key features which, when combined, allow for tremendously powerful on-demand tracing.
Consider the following sample output from a very small DTrace script that looks into what happens when a single printf() library call is made from a C program:
CPU FUNCTION
  0  -> my_function                               USER CODE
     (57 lines of dynamic library loading omitted)
  0    -> printf                                  USER CODE
  0      -> _setorientation                       USER CODE
  0      <- _setorientation                       USER CODE
  0      -> _ndoprnt                              USER CODE
  0        -> _findbuf                            USER CODE
  0          -> isatty                            USER CODE
  0            -> ___errno                        USER CODE
  0            <- ___errno                        USER CODE
  0          -> ioctl                             USER CODE
  0            -> syscall_mstate                  KERN
  0              -> gethrtime_unscaled            KERN
  0                -> tsc_gethrtimeunscaled_delta KERN
  0                  -> tsc_gethrtimeunscaled     KERN
  0                  <- tsc_gethrtimeunscaled     KERN
  0                <- tsc_gethrtimeunscaled_delta KERN
     (180 more lines of kernel functions for the ioctl call omitted)
  0          <= ioctl      SYSCALL: ioctl 1743296 ns
     (34 lines of kernel functions (probably DTrace instrumentation!) omitted)
  0          <- ioctl                             USER CODE
  0        <= fstat64      SYSCALL: fstat64 352699 ns
     (~300 lines of kernel and dynamic library loading omitted)
  0      -> _write                                USER CODE
     (~200 lines of kernel and dynamic library loading omitted)
  0      <= write          SYSCALL: write 2154427 ns
     (123 lines of kernel code omitted)
  0    <- printf                                  USER CODE
In the above output, there are around 1,000 function calls across the user and kernel domains. The output gives you some idea of the level of detail possible with DTrace; we can see all of the following attributes:
Note: The printf() itself is very much a user-mode call, part of the standard C library (libc). It actually makes a couple of system calls under the bonnet: initially an ioctl() and eventually the write(2) system call.
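For reference, a flow trace like the one above can be produced by a very small D script of roughly this shape (a sketch, not necessarily the exact script used here; the syscall timing lines would need a few extra syscall::: probes):

   #!/usr/sbin/dtrace -s
   #pragma D option flowindent

   /* user-mode function entries/returns in the target process */
   pid$target:::entry,
   pid$target:::return
   {
           trace("USER CODE");
   }

   /* kernel function flow while the target process is on CPU */
   fbt:::entry,
   fbt:::return
   /pid == $target/
   {
           trace("KERN");
   }

You would run it with something like dtrace -s flow.d -c ./myprog. Be warned that enabling every pid and fbt probe is very intrusive – fine for a demo, not for production.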
Let's look at something else – time spent in the kernel's virtual memory subsystem servicing page-ins (page faults). Again, this is a very simple script, and it produces output similar to the following:
WAIT: PAGE IN: 0 ms
WAIT: PAGE IN: 0 ms
WAIT: PAGE IN: 0 ms
WAIT: PAGE IN: 0 ms
WAIT: PAGE IN: 0 ms
WAIT: PAGE IN: 90 ms
WAIT: PAGE IN: 836 ms
WAIT: PAGE IN: 1098 ms
WAIT: PAGE IN: 853 ms
WAIT: PAGE IN: 816 ms
FINAL WAIT: PAGE IN: 263 ms
So now we can quantify how much time our particular session spends servicing page faults – in this case reported approximately per second. Don't assume that soft page faults are cheap or fast, either: with increasing virtualisation, soft page faults as observed by the guest operating system could actually be hard page faults in the underlying hypervisor.
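For the curious, here is one way such numbers could be collected (my own sketch and assumption, not necessarily the script used above): time the Solaris kernel's address-space fault handler, as_fault(), for a single process via the fbt provider:

   #!/usr/sbin/dtrace -qs
   /* run as: ./pf.d -p <pid of target process> */

   /* time each address-space (page) fault serviced for the target process */
   fbt::as_fault:entry
   /pid == $target/
   {
           self->ts = timestamp;
   }

   fbt::as_fault:return
   /self->ts/
   {
           @fault = sum(timestamp - self->ts);
           self->ts = 0;
   }

   /* print the accumulated fault service time roughly once per second, in ms */
   tick-1s
   {
           normalize(@fault, 1000000);
           printa("WAIT: PAGE IN: %@d ms\n", @fault);
           clear(@fault);
   }

Note that this times all address-space faults for the process, not just page-ins from disk, so treat it as an approximation.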
These are just a few of the things that can be monitored with DTrace. Now let's combine this with the Oracle stack tracing that we introduced in part 3a. In that posting I concluded that we could dump the stack trace once in a while, observe the name of the current function, and determine where we are spending the most time. With DTrace we can make this much more concrete: we can quantitatively report the time spent in each function, and fold in other contributing factors from the operating system! I have put together a DTrace script to demonstrate this concept. I won't post the script here (yet) because it is still very much in development and still has a number of deficiencies. What I'm trying to get at here is a really quick way to see exactly what an Oracle session is doing by looking only at the running process with DTrace. The script currently has the following features:
The rowsource operators are taken from tracing the function calls to the various rowsource operators defined in Metalink note 175982.1. Specifically, the script currently traces calls to the following rowsource operators:
Running the same misbehaving SQL as Tanel in his last posting, we get the following output (snippet only):
LAST1S (ms): wt=0 pgtm=0 latch=2 rsrc=(0:GS 464:JO 0:UA 0:VW 0:HJ 0:TB 21:JOT 0:SO)
LAST1S (ms): wt=0 pgtm=0 latch=1 rsrc=(0:GS 430:JO 0:UA 0:VW 0:HJ 0:TB 9:JOT 0:SO)
LAST1S (ms): wt=0 pgtm=0 latch=2 rsrc=(0:GS 471:JO 0:UA 0:VW 0:HJ 0:TB 20:JOT 0:SO)
LAST1S (ms): wt=0 pgtm=0 latch=2 rsrc=(0:GS 467:JO 0:UA 0:VW 0:HJ 0:TB 20:JOT 0:SO)
LAST1S (ms): wt=0 pgtm=0 latch=2 rsrc=(0:GS 466:JO 0:UA 0:VW 0:HJ 0:TB 20:JOT 0:SO)
LAST1S (ms): wt=0 pgtm=0 latch=2 rsrc=(0:GS 469:JO 0:UA 0:VW 0:HJ 0:TB 20:JOT 0:SO)
SYSCALL: times 7104 ns
SYSCALL: times 1749 ns
LAST1S (ms): wt=0 pgtm=0 latch=2 rsrc=(0:GS 468:JO 0:UA 0:VW 0:HJ 0:TB 20:JOT 0:SO)
LAST1S (ms): wt=0 pgtm=0 latch=2 rsrc=(0:GS 466:JO 0:UA 0:VW 0:HJ 0:TB 20:JOT 0:SO)
LAST1S (ms): wt=0 pgtm=0 latch=2 rsrc=(0:GS 468:JO 0:UA 0:VW 0:HJ 0:TB 20:JOT 0:SO)
LAST1S (ms): wt=0 pgtm=0 latch=2 rsrc=(0:GS 469:JO 0:UA 0:VW 0:HJ 0:TB 22:JOT 0:SO)
Remember that this SQL example precisely fits our profile: the session is using 100% user-mode CPU and emitting no wait information, so we cannot see any OWI waits detailed here – just summaries of where the time was spent every second, plus a couple of system calls. We also have no page faults taking place in this test, so no time is accounted for there. The big user of CPU (~468 ms per second) is the nested loop join rowsource operator, which agrees with Tanel's findings using session statistics. We are also spending a little time acquiring latches, but it's not very significant. There are clearly some rowsource functions missing, given that we can only account for about half the CPU time, so I did a bit more function tracing to find the names of the missing functions and added them to the script. The output now looks as follows (scroll right to see the new functions):
LAST1S (ms): wt=0 pgtm=0 latch=0 rsrc=(0:GS 461:JO 0:UA 0:VW 0:HJ 0:TB 0:JOT 0:SO 163:IXT 759:PF)
LAST1S (ms): wt=0 pgtm=0 latch=0 rsrc=(0:GS 691:JO 0:UA 0:VW 0:HJ 0:TB 455:JOT 0:SO 304:IXT 843:PF)
LAST1S (ms): wt=0 pgtm=0 latch=0 rsrc=(0:GS 460:JO 0:UA 0:VW 0:HJ 0:TB 0:JOT 0:SO 162:IXT 759:PF)
SYSCALL: times 7810 ns
SYSCALL: times 1750 ns
LAST1S (ms): wt=0 pgtm=0 latch=0 rsrc=(0:GS 686:JO 0:UA 0:VW 0:HJ 0:TB 444:JOT 0:SO 301:IXT 842:PF)
LAST1S (ms): wt=0 pgtm=0 latch=0 rsrc=(0:GS 436:JO 0:UA 0:VW 0:HJ 0:TB 0:JOT 0:SO 177:IXT 721:PF)
LAST1S (ms): wt=0 pgtm=0 latch=0 rsrc=(0:GS 708:JO 0:UA 0:VW 0:HJ 0:TB 444:JOT 0:SO 286:IXT 880:PF)
LAST1S (ms): wt=0 pgtm=0 latch=0 rsrc=(0:GS 357:JO 0:UA 0:VW 0:HJ 0:TB 0:JOT 0:SO 228:IXT 588:PF)
LAST1S (ms): wt=0 pgtm=0 latch=0 rsrc=(0:GS 788:JO 0:UA 0:VW 0:HJ 0:TB 445:JOT 0:SO 235:IXT 1013:PF)
LAST1S (ms): wt=0 pgtm=0 latch=0 rsrc=(0:GS 277:JO 0:UA 0:VW 0:HJ 0:TB 0:JOT 0:SO 276:IXT 458:PF)
Looks like we've found the missing time, and then some (the timers add up to more than 1000 ms each second). The reason for this exaggeration is that there is a good deal of double-counting going on in the current version of the script – child function times are also accounted to the parent. This is a known bug; hopefully I'll get a chance to fix it soon.
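If you want to experiment with the idea in the meantime, even a stripped-down probe on the rowsource fetch functions can be informative. A minimal sketch (assuming the qer*Fetch naming convention I believe the Metalink note above describes; run it against the target server process with -p):

   /* count fetch calls per rowsource function in the target oracle process */
   pid$target:oracle:qer*Fetch:entry
   {
           @fetches[probefunc] = count();
   }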
Well, that's about it for now. I hope this journey has been an interesting one; I know it's been a lot of reading to take in all the parts. We started with a session that was emitting no wait information, and we ended with a simple DTrace script that is easy to run with no impact on other sessions, and easy to stop once the underlying cause is determined. The query does not need to be stopped and restarted, and no additional overhead is imposed by enabling extra Oracle diagnostics such as setting statistics_level=all. I intend to expand on this approach over time, with a particular focus on:
I've slipped this post in as there have been a number of discussions recently about how execution plans have changed while nothing else appeared to change in the database. How can an execution plan suddenly change when no one has made any changes to the database? By no changes, I mean that there [...]
The LatchProf and LatchProfX scripts allow you to be more systematic about latch contention troubleshooting and tuning. No more guesswork is needed, as these scripts give you the exact session IDs and, in this version, also the SQL IDs of the troublemaking applications.
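For reference, the invocation is roughly of this form (from memory – treat the parameter order as an assumption and check the script header for the authoritative syntax):

   SQL> @latchprofx sid,name,sqlid % % 100000   -- sample latch holders 100000 times, all sessions, all latches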
You can download the new versions here:
Example output (with SQLID info) is below:
I have written the first article for the troubleshooting section of my new website tech.E2SN.com:
It's about a very valuable Oracle troubleshooting tool – the ERRORSTACK trace.
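For context, a common way to take such a trace from a running process is via ORADEBUG in a SYSDBA session, roughly like this (a sketch; the OS PID is a placeholder for the target server process's SPID from V$PROCESS):

   SQL> oradebug setospid 12345        -- attach to the target server process (placeholder SPID)
   SQL> oradebug dump errorstack 3     -- write a level-3 errorstack dump to that process's trace file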
I cover 4 frequently asked questions there:
You can read it here:
By the way, if you like my new website, feel free to link to it!!! ;-)
Just to let you know where I’ll be in the near future :)
I have rearranged and adjusted the material so that it flows more smoothly, includes even more practical tools and scripts, and digs even deeper into some internals ;-)
These dates are close and this is the last chance to register ;-) Note that after these I won't be doing an Advanced Oracle Troubleshooting class in the US for a while…
I will deliver 2 presentations at the Hotsos Symposium 2010
Also, I will be the Training Day speaker after the conference (I feel quite honored about this btw ;)
Note that I will show some very cool (and of course, useful) stuff at the Training Day! Some things are gonna be so cool that I almost want to sit in the audience myself! One of the things is called MOATS – the Mother Of All Tuning Scripts for example ;-)
I just published a new script and article about calculating the real Oracle process memory usage on Solaris.
The problem with the V$PROCESS* views (and V$SESSTAT) is that they tell you what Oracle thinks it's using, not the real amount of memory used. There will be discrepancies due to how memory is actually allocated by the OS, libraries outside Oracle's control, the static memory areas inside the Oracle binary and, of course, bugs.
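To see the self-reported figures in question, you can query the PGA accounting columns of V$PROCESS (these are real columns; the SPID value is a placeholder):

   -- what Oracle itself believes a given server process is using (PGA accounting only)
   SELECT spid, pga_used_mem, pga_alloc_mem, pga_freeable_mem, pga_max_mem
     FROM v$process
    WHERE spid = '12345';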
I was working on one such problem and decided to make my life easier by writing a script. It's not so much about calculating exact figures (they will never be 100% correct), but more about presenting the memory usage data in a better and more convenient fashion.
The script is called procmm, which stands for Process Memory Matrix, as it shows the memory usage in a matrix grid.
Here’s an example output to show what I’m talking about: