I posted a fair amount of stuff on how Oracle is generating IOs, and especially large IOs, meaning more than one Oracle block, so > 8KB. This is typically what is happening when the Oracle database is executing a row source which does a full segment scan. Let’s start off with a quiz: what you think Oracle is the maximum IO size the Oracle engine is capable of requesting of the Operating System (so the IO size as can be seen at the SCI (system call interface) layer? If you made up your answer, remember it, and read on!
The real intention of this blogpost is to describe what is going on in the Oracle database kernel, but also what is being done in the Linux kernel. Being a performance specialised Oracle DBA means you have to understand what the operating system does. I often see that it’s of the utmost importance to understand how an IO ends up as a request at the NAS or SAN head, so you understand what a storage admin is talking about.
One of the presentations I went to at the DOAG conference earlier on this month was called “PL/SQL Tuning, finding the perf. bottleneck with hierarchical profiler” by Radu Parvu from Finland. If you do a lot of PL/SQL programming and haven’t noticed the dbms_hprof package yet make sure you take a good look at it.
A peripheral question that came up at the end of the session asked about problems with line numbers in pl/sql procedures; why, when you get a run-time error, does the reported line number sometimes look wrong, and how do you find the right line. I can answer (or give at least one reason for) the first part, but not the second part; Julian Dontcheff had an answer for the second bit, but unfortunately I failed to take a note of it.
I’m going to take a break from the OMS/OMR Performance pages and jump over to the Agents page in release 220.127.116.11. You can access this page from the Setup menu drop down in EMCC:
There was a question on OTN a few days ago asking the following question:
Here’s a query that ran okay on 11g, but crashed with Oracle error “ORA-01843: not a valid month” after upgrade to 12c; why ?
The generically correct answer, of course, is that the OP had been lucky (or unlucky, depending on your point of view) on 11g – and I’ll explain that answer in another blog posting.
That isn’t the point of this posting, though. This posting is a test of observation and deduction. One of the respondants in the thread had conveniently supplied a little bit of SQL that I copied and fiddled about with to demonstrate a point regarding CPU costing, but as I did so I thought I’d show you the following and ask a simple question.’
Those of you that have been around Oracle for some time may already be aware of the various OTN tours. These tours bring well known speakers to some of the smaller (relatively speaking) regions that often don’t get to see some of the big name speakers, simply because of audience size. Over the past couple of years, I’ve been involved in the OTN APAC tour, and recently returned from the New Zealand leg of the tour for this year. I presented two workshops – one on Database Lifecycle Management and one on DBaaS – as well as standing in for a sick presenter and covering Snap Clone in Enterprise Manager 12c. For me, the best value for the conference was (as it so often is) the time spent interacting both with customers and other speakers / exhibitors. It was great to catch up with so many people I haven’t seen for a long time, both from within Oracle and outside of it.
I’m not very keen on bending the rules on production systems, I’d prefer to do things that look as if they could have happened in a completely legal fashion, but sometimes it’s necessary to abuse the system and here’s an example to demonstrate the point. I’ve got a simple SQL statement consisting of nothing more than an eight table join where the optimizer (on the various versions I’ve tested, including 12c) examines 5,040 join orders (even though _optimizer_max_permutations is set to the default of 2,000 – and that might come as a little surprise if you thought you knew what that parameter was supposed to do):
While creating a POC of a SQL rewrite recently I received a little surprise as I switched my query from serial execution to parallel execution and saw the optimizer’s estimated cost increase dramatically. I’ll explain why in a moment, but it made me think it might be worth setting up a very simple demonstration of the anomaly. I created a table t1 by copying view all_source – which happened to give me a table with about 100,000 rows and 1117 blocks – and then ran the query ‘select max(line) from t1;’ repeating the query with a /*+ parallel(t1 2) */ hint. From 18.104.22.168 here are the two execution plans I got:
That’s data that isn’t there until you look for it, sort of, from the optimizer’s perspective.
Here’s some code to create a sample data set:
create table t1 as with generator as ( select --+ materialize rownum id from dual connect by level <= 1e4 ) select rownum id, mod(rownum-1,200) mod_200, mod(rownum-1,10000) mod_10000, lpad(rownum,50) padding from generator v1, generator v2 where rownum <= 1e6 ; begin dbms_stats.gather_table_stats( ownname => user, tabname =>'T1', method_opt => 'for all columns size 1' ); end; /
Now derive the execution plans for a couple of queries noting, particularly, that we are using queries that are NOT CONSISTENT with the current state of the data (or more importantly the statistics about the data) – we’re querying outside the known range.
“You can’t compare apples with oranges.”
Oh, yes you can! The answer is 72,731,533,037,581,000,000,000,000,000,000,000.
I was in Munich a few weeks ago running a course on Designing Optimal SQL and Troubleshooting and Tuning, but just before I flew in to Munich one of the attendees emailed me with an example of a statement that behaved a little strangely and asked me if we could look at it during the course. It displays an odd little feature, and I thought it might be interesting to write up what I did to find out what was going on. We’ll start with the problem query and execution plan: