This blog entry is about modern Linuxes. In other words, RHEL6 equivalents with 2.6.3x kernels, not the ancient RHEL5 with its 2.6.18 kernel (wtf?!), which unfortunately is still the most common in enterprises. And no, I’m not going to use kernel debuggers or SystemTap scripts here, just plain old “cat /proc/PID/xyz” commands against some useful /proc filesystem entries.
Here’s one systematic troubleshooting example I reproduced on my laptop. A DBA was wondering why their find command had been running “much slower”, without returning any results for a while. Knowing the environment, we had a hunch, but I was asked what the systematic approach would be for troubleshooting this – already ongoing – problem right now.
Luckily the system was running OEL6, so it had a pretty new kernel: the 2.6.39 UEK2, actually.
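The “cat /proc/PID/xyz” approach mentioned above can be sketched in a few lines of Python. This is only an illustration, not the commands from the post: the entry names (`status`, `wchan`, `syscall`, `stack`) are my own choice of standard Linux /proc entries, and the helper names and the `proc_root` parameter are hypothetical, added so the functions can be pointed at a test directory.

```python
import os

# Illustrative selection of per-process /proc entries (an assumption,
# not a list taken from the original post).
DEFAULT_ENTRIES = ("status", "wchan", "syscall", "stack")


def read_proc_entry(pid, name, proc_root="/proc"):
    """Return the text of <proc_root>/<pid>/<name>, or None if unreadable."""
    path = os.path.join(proc_root, str(pid), name)
    try:
        with open(path, "r", errors="replace") as f:
            return f.read()
    except OSError:
        # Entry missing, process already gone, or insufficient privileges
        # (some entries, such as "stack", usually require root).
        return None


def process_state(pid, proc_root="/proc"):
    """Extract the value of the 'State:' line from the status entry."""
    status = read_proc_entry(pid, "status", proc_root)
    if status is None:
        return None
    for line in status.splitlines():
        if line.startswith("State:"):
            return line.split(":", 1)[1].strip()
    return None


def snapshot(pid, entries=DEFAULT_ENTRIES, proc_root="/proc"):
    """Read several entries for one process into a dict keyed by entry name."""
    return {name: read_proc_entry(pid, name, proc_root) for name in entries}


if __name__ == "__main__":
    pid = os.getpid()  # inspect this script's own process as a demo
    print("state:", process_state(pid))
    for name, text in snapshot(pid).items():
        print(f"--- {name} ---")
        print(text.strip() if text is not None else "(not readable)")
```

Running the snapshot a few times in a row against a slow process shows whether its state and current wait location change between samples, which is the point of sampling /proc rather than attaching a debugger.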
A customer of mine ran out of memory because of too many server processes serving dedicated-connection client sessions. As an alternative, I tried to explain the choice between the DEDICATED and SHARED SERVER concepts as an initial attempt at a workaround for the client software problems. Looking around on the internet for pictures and/or newbie documentation, I found …
If you have already downloaded Snapper v4, you’d better download it again, as v4.03 also runs in SQL Developer!
Snapper used to require access to DBMS_LOCK, so it could sleep for X seconds between the “before” and “after” performance data snapshots. Now it is possible to get away without using DBMS_LOCK. Instead you run Snapper twice: once to take the “before” snapshot, then you run your workload, and then you run Snapper again to take the “after” snapshot and print the output.
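The before/after idea described above (two separate snapshots with the workload in between, instead of one call that sleeps) can be illustrated with a generic Python sketch. This is not Snapper’s code: the function names, the snapshot layout, and the in-memory counter source are all hypothetical stand-ins for per-session statistics.

```python
import time


def take_snapshot(read_counters):
    """Capture current counter values plus a timestamp (copying the values)."""
    return {"taken_at": time.monotonic(), "counters": dict(read_counters())}


def diff_snapshots(before, after):
    """Return (elapsed_seconds, {counter_name: delta}) for changed counters."""
    elapsed = after["taken_at"] - before["taken_at"]
    deltas = {}
    for name, end_value in after["counters"].items():
        delta = end_value - before["counters"].get(name, 0)
        if delta != 0:
            deltas[name] = delta
    return elapsed, deltas


if __name__ == "__main__":
    # Hypothetical counter source standing in for a session's statistics.
    counters = {"physical reads": 100, "parse count": 7}

    begin = take_snapshot(lambda: counters)   # first run: "before" snapshot
    counters["physical reads"] += 42          # the workload runs in between
    end = take_snapshot(lambda: counters)     # second run: "after" snapshot

    elapsed, deltas = diff_snapshots(begin, end)
    print(f"elapsed: {elapsed:.3f}s, deltas: {deltas}")
```

Because the “before” values are stored rather than held across a sleep, nothing has to wait inside the measuring tool, which is why no sleep facility is needed in this style of measurement.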
So, the usual way of running snapper is this:
@snapper4 all 5 1 152
This would take one 5-second performance snapshot of SID 152’s V$ views.
With Snapper4 you can use the old way or just add BEGIN or END keywords to the 1st parameter, like this:
I have fixed most of the bugs that showed up during the Snapper launch party session and uploaded the new version (v4.02) of Snapper here:
I have also uploaded the launch party hacking session video to enkitec.tv:
I have not updated the snapper documentation yet, but here are the main improvements:
Troubleshooting Runaway Processes
Everyone reads Tanel Poder’s material—for good reason.
I took particular interest in his recent post about investigating where an apparently runaway, CPU-bound Oracle foreground process is spending its time. That article can be found here.
I’ve been meaning to do some blogging about analyzing Oracle execution with perf(1). I think Tanel’s post is a good segue for me to do so. You might ask, however, why I would bother attempting to add value in a space Tanel has already blogged. Well, to that I would simply say that modern systems professionals need as many tools as they can get their hands on.
Tanel, please ping me if you object, for any reason, to my direct links to your scripts.
Here’s a little follow-on from Friday’s posting. I’ll start it off as a quiz, and follow up tomorrow with an explanation of the results (though someone will probably have given the correct solution by then anyway).
I have a simple heap table t1(id number(6,0), n1 number, v1 varchar2(10), padding varchar2(100)). The primary key is the id column, and the table holds 3,000 rows where id takes the values from 1 to 3,000. There are no other indexes. (I’d show you the code, but I don’t want to make it too easy to run; I want you to try to work it out in your heads.)
I run the following pl/sql block.
Followers of the blog will know I dig virtualization. I first ran Oracle in virtualized environments over a decade ago.
In my current company there is a strong virtualization presence in the Windows space. Pretty much all Windows servers, including those running MS SQL Server, are actually VMs running on a VMware farm. The UNIX/Linux side is a little different. Most stuff is still done on physical boxes, and what little virtualization is done uses CentOS and KVM as freebie open source solutions.
There are a lot of architectural changes going on at the moment and I’ve been pushing *very hard* for a switch to the virtual infrastructure (VI) for all our middle tier servers and a few of our databases. It is looking very likely (but not guaranteed) that this will happen.
I’ve recently realized that I never posted the second version of my presentation, Under The Hood of Oracle Clusterware 2.0: Grid Infrastructure, codenamed UTHOC2. I think it would be very useful, as I still see lots of questions being asked and UTHOC1 covers Oracle RAC 10g and 11gR1 only. 11g Release 2 brought many changes to the clusterware, and the slides needed a good refresh.
Update 7-May-2013: Almost 100 people filled in the survey and here are the results: