Search

Top 60 Oracle Blogs

Recent comments

debug

Smoke and mirrors: monitoring function calls that do not exist anymore

During investigating I ran once again into statistics in the Oracle database that still provide a useful details, but the actual naming of the statistic is describing a situation that in reality does not exist anymore. The statistics I am talking about are ‘calls to kcmgcs’, ‘calls to kcmgrs’, ‘calls to kcmgas’ and ‘calls to get snapshot scn: kcmgss’.

Disclaimer: this is research. Any of these techniques potentially can crash your instance or leave your database in a corrupted state. Test the techniques used in this article severely before applying it in an actual situation. Use at your own risk.

Back to the ‘calls to’ statistics. To see what I mean here, you can look up the functions in symbol table in the Oracle executable. There are several ways to do that, one way is using gdb:

Watching is in the eye of the beholder

Recently I was investigating the inner working of Oracle. One of the things that is fundamental to the Oracle database is the SCN (system change number). SCNs are used to synchronise changes in the database. There is one source for SCNs in every instance (kcbgscn; the global or current SCN in the fixed SGA), and there are multiple tasks for which Oracle keeps track of synchronisation using SCNs. A few of these tasks for which Oracle stores and uses SCNs to keep track of progress are on disk SCN and lwn SCN.

This blogpost is about some oddities I found when using gdb (the GNU debugger) to watch memory locations of a running Oracle database. This should not be done on a production instance, and is purely for research purposes. Only use the methods mentioned in this article if you are absolutely sure what you are doing, and/or if you are using an Oracle instance that can be crashed and can be restored.

The Oracle wait interface granularity of measurement

The intention of this blogpost is to show the Oracle wait time granularity and the Oracle database time measurement granularity. One of the reasons for doing this, is the Oracle database switched from using the function gettimeofday() up to version 11.2 to clock_gettime() to measure time.

This switch is understandable, as gettimeofday() is a best guess of the kernel of the wall clock time, while clock_gettime(CLOCK_MONOTONIC,…) is an monotonic increasing timer, which means it is more precise and does not have the option to drift backward, which gettimeofday() can do in certain circumstances, like time adjustments via NTP.

The first thing I wanted to proof, is the switch of the gettimeofday() call to the clock_gettime() call. This turned out not to be as simple as I thought.

How Exadata smartscans work

I guess everybody who is working with Oracle databases and has been involved with Oracle Exadata in any way knows about smartscans. It is the smartscan who makes the magic happen of full segment scans with sometimes enormously reduced scan times. The Oracle database does smartscans which something that is referred to as ‘offloading’. This is all general known information.

But how does that work? I assume more people are like me, and are anxious to understand how that exactly works. But the information on smartscans is extremely scarce. Of course there is the Oracle public material, which looks technical, but is little/nothing more than marketing. On My Oracle Support, I can’t find anything on the inner working. Even in the ‘Expert Oracle Exadata’ book (which I still regard as the best source of Exadata related information) there is no material on the mechanics of smartscans.

Systemtap revisited

Some time back, I investigated the options to do profiling of processes in Linux. One of the things I investigated was systemtap. After careful investigation I came to the conclusion that systemtap was not really useful for my investigations, because it only worked in kernelspace, only very limited in userspace. The limitation of working in userspace was that you had to define your own markers in the source code of the program you wanted to profile with systemtap and compile that. Since my investigations are mostly around Oracle products, which are closed source, this doesn’t help me at all.

When the Oracle wait interface isn’t enough, part 2: understanding measurements.

In my blogpost When the oracle wait interface isn’t enough I showed how a simple asynchronous direct path scan of a table was spending more than 99% of it’s time on CPU, and that perf showed me that 68% (of the total elapsed time) was spent on a spinlock unlock in the linux kernel which was called by io_submit().

This led to some very helpful comments from Tanel Poder. This blogpost is a materialisation of his comments, and tests to show the difference.

First take a look at what I gathered from ‘perf’ in the first article: