The Hotsos Symposium 2011 schedule has been published. On the one hand, it's one of my easiest to plan because there are only two streams which takes away the ridiculous choice of something like Openworld, on the other it can be extremely difficult to pick because the quality is uniformly high and the presentations all tend to be about performance. There are always some clashes I simply can't solve. Anyway, here's my best guess at the moment.
Mon
08:30 AM Welcome and Keynote by Kerry Osborne
09:45 AM Kerry Osborne: Tuning Exadata
11:00 AM Cary Millsap: Thinking Clearly about Performance
01:00 PM Toon Koppelaars: Thinking Clearly in SQL or Gerwin Hendriksen: GAPP Improvements
02:15 PM Henry Poras: Determining Resource Utilization and Saturation Limits in a Multi-User, Mixed Workload Environment
03:30 PM Andrey Nikolaev: Contemporary Latch Internals
04:45 PM Shan Nawaz: Oracle's Forgotten Children or Stephan Haisley: Tentative: Streams and Golden Gate
Tues
08:30 AM Margaret Norman: Moving to 11g Statistics
09:45 AM Doug Burns: Statistics on Partitioned Objects
11:00 AM Riyaj Shamsudeen: Advanced RAC Troubleshooting, Part I
01:00 PM Riyaj Shamsudeen: Advanced RAC Troubleshooting, Part II
02:15 PM Miladin Modrakovic: Oracle Diagnostic Events in 11g or Dan Fink: Variance Analysis, Profiling, and Trending using ASH and AWR
03:30 PM Neil Gunther: Brooks, Cooks, and Response Time Scalability
04:45 PM Alex Gorbachev: Database I/O Performance or Tim Gorman: Forensic Analysis using AWR and ASH
Wed
08:30 AM Margaret Norman: How to Set Up Stable Parallel Execution or Karl Arao: Mining the AWR Repository
09:45 AM Toon Koppelaars: "Triggers Considered Harmful," Considered Harmful or Gary Propeck: The CBO's Look at Clusters, IOTs, Partitions, and Other Objects
11:00 AM Maria Colgan: Implement Best Practices for Extreme Performance with Oracle Data Warehousing
01:00 PM Tanel Põder: Troubleshooting the Most Complex Oracle Performance Problem I've Ever Seen
02:15 PM Tom Kyte: Just in Time
I think you'll agree that it looks terrific and it isn't false modesty to say that I do pinch myself sometimes that I make it on to the agenda for a conference like this.
Despite my attempts to not travel (much) anymore, I’m still going to speak at two events in first half of 2011.
The first event is at the Suncoast Oracle User Group meeting in Tampa, FL on Thursday, 27th January. I’ll speak about Advanced Oracle Troubleshooting in 60 minutes.
I will deliver the second presentation at Hotsos Symposium 2011 near Dallas, TX on Wednesday 9th March. I will talk about Troubleshooting the Most Complex Oracle Performance Problem I’ve ever seen. If you think that this performance problem was complex, then be prepared to see an even worse one there!
Here’s an execution plan pulled from v$sql_plan in 10.2.0.3 with the “gather_plan_statistics” option enabled:
---------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------- | 1 | NESTED LOOPS | | 1 | 0 |00:00:00.01 | 608 | | 2 | TABLE ACCESS BY INDEX ROWID| PARENT | 1 | 200 |00:00:00.01 | 6 | | 3 | INDEX RANGE SCAN | PAR_I1 | 1 | 200 |00:00:00.01 | 2 | | 4 | TABLE ACCESS BY INDEX ROWID| CHILD | 200 | 0 |00:00:00.01 | 602 | | 5 | INDEX UNIQUE SCAN | CHI_I1 | 200 | 200 |00:00:00.01 | 402 | ----------------------------------------------------------------------------------------
The code defining the two tables is the same:
create table &m_table_name
as
select
rownum id,
trunc(dbms_random.value(1,&m_rowcount)) n1,
lpad(rownum,10,'0') small_vc,
rpad('x',100) padding
from
big_row_source
where
rownum <= &m_rowcount
;
The indexes par_i1 and chi_i1 are on the id column and, as you can see for chi_i1, have been created as unique indexes. Unlike the table, though, the code defining them isn’t necessarily identical.
Please state, with justification, what you think the blevel is for index chi_i1 ?
(Note – to avoid any risk of confusion, remember that height = blevel + 1; and just in case anyone thinks it makes any difference, the tables and indexes were all created in the same tablespace which is my usual setup of 1MB uniform extents, locally managed, 8K block size and freelist management.)
The quiz isn’t intended to be cunning, devious and tricksy, by the way. It’s all about observing and understanding a fairly ordinary situation. On the other hand I’m sure there are interesting variations with strange side effects that could be introduced if you really want to get clever.
Answer 23rd Jan:
Rather than writing my answer into a new post I’ve added it below – but to give new visitors a chance to work out the answer before seeing mine I’ve left a big gap before the answer.
Reading the comments I think all the interesting points have been covered. We have comments covering special treatment of root blocks in nested loop joins, the “buffer is pinned count” statistic, possible changes in 11g, the hypothesis that I had set a large pctfree for the child index,and the link between “A-rows” in one line of the plan and “Starts” in another line.
For reference, my query looked like this:
select par.small_vc, chi.small_vc from parent par, child chi where par.id between 301 and 500 and chi.id = par.n1 and chi.n1 = 0 ;
The mechanics of the plan are:
Line 3 – index range scan of par_i1 (which finds ultimately 200 matching rowids).
Line 2 – for each rowid from line 3 check the matching table row (which ultimately returns 200 rows)
Then two hundred calls to:
Line 5 – index unique scan of chi_i1, which always finds it single row – hence returning 200 rowids in total
Line 4- check the matching row in the table – which always fails – hence returning zero rows in total
The critical counts that allow you to answer my question are the 402 and 602 buffer gets in lines 5 and 4. Ignoring, temporarily, the odd 2 these numbers are clearly 200 times something – which is rather nice given that we know that we are doing something 200 times. The simplest solution, of course is that the (602 – 402 =) 200 gets due to line 4 represents the visits to the 200 table blocks, leaving us with 400 (plus an odd bit) to account for.
Now consider the possibilities:
So, at first sight we might decide the only way to get 400 buffer visits from the index is to have Blevel = 1, visit the root and the leaf. But that’s not the way it works (apart from a couple of versions which had a bug).
When running a nested loop join, Oracle “pins” the root block of the inner (second table) index, which means it attaches a structure to the “users’ list” in the buffer header that links the session’s state object directly to the buffer header. So, in my nested loop join, Oracle gets the root block once and keeps it pinnned, then gets branch and leaf blocks 200 times. The blevel on the child index is 2. (The blevel of the parent index is only 1 – I had set the pctfree on the child index to 90 to make it much larger than it needed to be.)
This “root block pin” isn’t the only pinning optimisation in the query, though. In detail, the steps are as follows – and this is something you would be table to see in the trace file if you enabled event 10200 (one of the ‘consistent reads’ traces).
There a couple of deviations from this cycle, of course. The rows I needed from parent were spread across 4 consecutive blocks in the table so, roughly every 50 rows from parent, line (9) above would become “release current pinned parent block, get new parent block and pin it”. The other little oddity that I can’t explain is that Oracle does “get” the child root block on the second visit to the index as well as the first visit – and then pins it from that moment onwards. So the counts are:
Inevitably, things change – Oracle keeps getting smarter about things like ‘consistent gets – examination’, ‘buffer is pinned count’ and, in 11g, “fastpath” access to buffered blocks. This is a clue to the difference in gets that Charles Hooper recorded in 11.2.0.2 – and explains why I chose to use 10.2.0.3 with a unique index for my example. If you want to investigate other variations all it takes is snapshots of v$mystat, calls to “alter system flush buffer_cache”, and event 10200; the treedump can also be very helpful for identifying block addresses.
January 21, 2011 I found an interesting SQL statement on the OTN forums today. When executing the SQL statement Oracle Database returns the following error: ORA-00604: error occurred at recursive SQL level 1 ORA-06502: PL/SQL: numeric or value error: character string buffer too small ORA-06512: at line 12 ORA-00920: invalid relational operator 00604. 00000 - "error [...]![]()
Luis Daziano from Argentina asked me about the location of the series I wrote for Oracle Technology Network on Advanced Linux commands. Fortunately, OTN still has all of them, although it has become a little difficult to find. Here are the five articles, if you are interested.
http://www.oracle.com/technetwork/articles/linux/part1-091089.html
http://www.oracle.com/technetwork/articles/linux/part3-090589.html
http://www.oracle.com/technetwork/articles/linux/part2-085179.html
http://www.oracle.com/technetwork/articles/linux/part4-097181.html
http://www.oracle.com/technetwork/articles/linux/part5-096399.html
Enjoy.
Recently, I received into my Spam folder an ad claiming that a product could “...improve performance 1000%.” Claims in that format have bugged me for a long time, at least as far back as the 1990s, when some of the most popular Oracle “tips & techniques” books of the era used that format a lot to state claims.
Beware of claims worded like that.
Whenever I see “...improve performance 1000%,” I have to do extra work to decode what the author has encoded in his tidy numerical package with a percent-sign bow. The two performance improvement formulas that make sense to me are these:
As you probably know that my first co-authored book Expert Oracle Practices was released in 2009. I have co-authored one more book Pro Oracle SQL with my esteemed colleagues. This books covers many aspects of better SQL development. Have fun reading
I also will be presenting in few conferences in the upcoming months: RMOUG Training days 2011 , Hotsos symposium ’11, and IOUG Collob . Hopefully, I will see you in one of the conference (or all of the conferences)
I will be talking about advanced RAC troubleshooting in Hotsos symposium ’11. Hotsos Symposium, conducted every March in Dallas, TX, is an intensive seminar series probing the deep waters of Performance related to Oracle Database.
There are many great speakers in this conference. It gives me a great pleasure to meet many folks that I have known for years, exchange ideas, and learn from industry leaders.
Hope to see you there!
Yesterday I have run a benchmark on a 2 node RAC cluster (ProLiant BL685c G6 with 4 Six-Core AMD Opteron 8431) and 32G RAM each. It’s running Oracle Grid Infrastructure 11.2.0.2 as well as an Oracle 11.2.0.2 database on Oracle Enterprise Linux 5.5 64bit and device-mapper-multipath.
I was testing how the system would react under load but also wanted to see if the Runtime Load Balancing was working. The easiest way to check this is to view the AQ events that are generated for a service if AQ HA notifications is set to true. They can either be dequeued from the database as described in chapter 11 of Pro Oracle Database 11g RAC on Linux or alternatively queried from the database. The latter is the quicker method and this article will focus on it.
Before you can make use of Runtime Load Balancing you need to set at least 2 properties in your service:
.Net applications require AQ HA notifications to be set to true as these can’t directly make use of Fast Application Notification (FAN) events as said in the introduction. My JDBC application is fully capable of using the FAN events, however as you will see later I am using the AQ notifications anyway to view the events.
Connected as the owner of the Oracle binaries, I created a new service to make use of both instances:
$ srvctl add service -d TEST -s TESTSRV -r TEST1,TEST2 -P BASIC \ > -l PRIMARY -y MANUAL -q true -x false -j short -B SERVICE_TIME \ > -e SESSION -m BASIC -z 0 -w 0
The service TESTSRV for database TEST has TEST1 and TEST2 as preferred instances, and the service should be started (manually) when the database is in the primary role. AQ Notifications are enabled, and I chose the connection load balancing goal to be “short” (usually ok with web applications and connection pooling) and a runtime load balancing goal of service time (should also be appropriate for many short transactions typical for a web environment). The remaining paramters define Transparent Application Failover. Please refer to the output of “srvctl add service -h” for more information about the command line parameters.
The result of this endavour can be viewed with srvctl config service:
$ srvctl config service -d TEST -s TESTSRV Service name: TESTSRV Service is enabled Server pool: TEST_TESTSRV Cardinality: 2 Disconnect: false Service role: PRIMARY Management policy: MANUAL DTP transaction: false AQ HA notifications: true Failover type: SESSION Failover method: BASIC TAF failover retries: 0 TAF failover delay: 0 Connection Load Balancing Goal: SHORT Runtime Load Balancing Goal: SERVICE_TIME TAF policy specification: BASIC Edition: Preferred instances: TEST1,TEST2 Available instances:
So to begin with I created the order entry schema (SOE) in preparation of a swingbench run. (I know that Swingbench’s Order Entry is probably not the best benchmark out there but my client knows and likes it). Once about 10G of data were generated I started a swingbench run with 300 users, and reasonably low think time (min transaction time 20ms and max of 60ms). The connect string was //scan1.example.com:1800/TESTSRV
A query against gv$session showed an even balance of sessions, which was good:
select count(inst_id), inst_id from gv$session where username = 'SOE' group by inst_id
However, whatever I did I couldn’t get the Runtime Load Balancing in sys.sys$service_metrics_tab to chanage. They always looked like this (column user_data):
{instance=TEST1 percent=50 flag=UNKNOWN aff=TRUE}{instance=TEST2 percent=50 flag=UNKNOWN aff=TRUE} }
That sort of made sense as none of the nodes broke into a sweat-the system was > 50% idle with a load average of about 12. So that wouldn’t cut it. Instead of trying to experiment with the Swingbench parameters, I decided to revert back to the silly CPU burner: a while loop which generates random numbers. I wasn’t interested in I/O at this stage, and created this minimal script:
$ cat dothis.sql declare n number; begin while (true) loop n:= dbms_random.random(); end loop; end; /
A simple for loop can be used to start the load test:
$ for i in $(seq 30); do > sqlplus soe/soe@scan1.example.com:1800/TESTSRV @dothis & done
This created an even load on both nodes. I then started another 20 sessions on node1 against TEST1 to trigger the change in behaviour. And fair enough, the top few lines of “top” revealed the difference. The output for node 1 was as follows:
top - 10:59:30 up 1 day, 21:16, 6 users, load average: 42.44, 20.23, 10.07 Tasks: 593 total, 48 running, 545 sleeping, 0 stopped, 0 zombie Cpu(s): 99.9%us, 0.1%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 32960688k total, 11978912k used, 20981776k free, 441260k buffers Swap: 16777208k total, 0k used, 16777208k free, 8078336k cached
Whereas node 2 was relatively idle.
top - 10:59:22 up 5 days, 17:45, 4 users, load average: 15.80, 10.53, 5.74 Tasks: 631 total, 16 running, 605 sleeping, 10 stopped, 0 zombie Cpu(s): 58.8%us, 0.6%sy, 0.0%ni, 40.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 32960688k total, 11770080k used, 21190608k free, 376672k buffers Swap: 16777208k total, 0k used, 16777208k free, 7599496k cached
Would that imbalance finally make a difference? It did, as the user_data column (truncated here for better readability) reveals:
SQL> select user_data
2 from sys.sys$service_metrics_tab
3 order by enq_time desc;
{instance=TEST1 percent=4 flag=GOOD aff=TRUE}{instance=TEST2 percent=96 flag=GOOD aff=TRUE} } timestamp=2011-01-20 11:01:16')
{instance=TEST1 percent=6 flag=GOOD aff=TRUE}{instance=TEST2 percent=94 flag=GOOD aff=TRUE} } timestamp=2011-01-20 11:00:46')
{instance=TEST1 percent=10 flag=GOOD aff=TRUE}{instance=TEST2 percent=90 flag=GOOD aff=TRUE} } timestamp=2011-01-20 11:00:16')
{instance=TEST1 percent=18 flag=GOOD aff=TRUE}{instance=TEST2 percent=82 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:59:46')
{instance=TEST1 percent=28 flag=GOOD aff=TRUE}{instance=TEST2 percent=72 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:59:16')
{instance=TEST1 percent=35 flag=GOOD aff=TRUE}{instance=TEST2 percent=65 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:58:46')
{instance=TEST1 percent=40 flag=GOOD aff=TRUE}{instance=TEST2 percent=60 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:58:16')
{instance=TEST1 percent=43 flag=GOOD aff=TRUE}{instance=TEST2 percent=57 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:57:46')
{instance=TEST1 percent=44 flag=GOOD aff=TRUE}{instance=TEST2 percent=56 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:57:16')
{instance=TEST1 percent=48 flag=GOOD aff=TRUE}{instance=TEST2 percent=52 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:56:46')
{instance=TEST1 percent=49 flag=GOOD aff=TRUE}{instance=TEST2 percent=51 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:56:16')
{instance=TEST1 percent=50 flag=GOOD aff=TRUE}{instance=TEST2 percent=50 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:55:46')
{instance=TEST1 percent=50 flag=GOOD aff=TRUE}{instance=TEST2 percent=50 flag=GOOD aff=TRUE} } timestamp=2011-01-20 10:55:16')
{instance=TEST1 percent=50 flag=UNKNOWN aff=TRUE}{instance=TEST2 percent=50 flag=UNKNOWN aff=TRUE} } timestamp=2011-01-20 10:54:46')
{instance=TEST1 percent=50 flag=UNKNOWN aff=TRUE}{instance=TEST2 percent=50 flag=UNKNOWN aff=TRUE} } timestamp=2011-01-20 10:54:16')
Where it was initially even at 50-50 it soon became imbalanced, and TEST2 would be preferred after a few minutes in the test. So everything was working as expected, I just didn’t manage to put enough load on the system intially.
Recent comments
21 weeks 1 day ago
30 weeks 6 days ago
32 weeks 4 days ago
35 weeks 5 days ago
38 weeks 13 hours ago
47 weeks 4 days ago
49 weeks 1 day ago
50 weeks 1 day ago
50 weeks 2 days ago
1 year 6 days ago