Oakies Blog Aggregator

H. G. Wells and SQL: Traveling in the Second Dimension

SQL is traditionally one-dimensional in that you can access values in a
row, in the horizontal dimension. SQL Server 2012's support of LAG and LEAD
and other so-called window functions opens up the...



Read the full post at www.gennick.com/database.
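
As a minimal illustration of that second, vertical dimension (the table and column names below are invented for the example, not taken from the full post), LAG and LEAD let each row look at its neighbours within an ordered window:

-- Compare each day's total with the totals of the previous and the next day.
select  order_date,
        daily_total,
        lag(daily_total)  over (order by order_date) as previous_day_total,
        lead(daily_total) over (order by order_date) as next_day_total
from    daily_sales
order by order_date;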

Detail in the Aggregate

SQL Server 2012 – the "Denali" release – brings improved support for window
functions as defined in the SQL standard. One of the things you can now do
is to mix summary and detail results in the...



Read the full post at www.gennick.com/database.
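
A minimal sketch of the idea (against an invented orders table, not taken from the full post): an aggregate used as a window function with an empty OVER clause lets every detail row carry a summary value alongside it:

-- Each detail row is shown together with the grand total and its share of it.
select  order_id,
        amount,
        sum(amount) over ()          as grand_total,
        amount / sum(amount) over () as share_of_total
from    orders;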

DOAG 2011 Unconference Wrap-Up

My sessions at DOAG all went well, and I particularly liked the Unconference ones. As promised I held two of them, and they were rather different.

The first one only had a couple of attendees (including OakTable fellow Christian Antognini from Switzerland), so we could gather around my laptop and do a real "Optimizer Hacking Session". Actually we had to do that because the projector promised by DOAG wasn't there. I talked mainly about some common traps when performing SQL statement troubleshooting and what to do in order to avoid them, but also showed some cool stuff that I'll shortly blog about separately.

The second session two hours later was attended by many more people than I expected, so there was no chance at all to gather around my laptop. Originally I had planned to talk about some common pitfalls that can make the optimizer's estimates go wrong (even with 100% computed statistics) and what to do about it, but I then realized that I should at least give a short summary of what I had shown in the first session for those who hadn't attended it (which was most of the audience anyway). This started off an interesting discussion with many questions - a surprising number of them revolved around the use of user-defined PL/SQL functions.

Since even the WiFi connection didn't work properly, I could only briefly mention some important articles that are worth reading if you want a good understanding of how the optimizer can (actually should!) be helped when dealing with PL/SQL functions.

So for reference I repeat here a summary of relevant articles:

1. Expert Oracle Practices (the book I co-authored), Chapter 7, "PL/SQL and the CBO" by OakTable fellow Joze Senegacnik

2. OakTable Member Adrian Billington on pl/sql functions and cbo costing, which is inspired by Joze's work above

3. Adrian Billington on "setting cardinality for pipelined and table functions"

4. Adrian Billington's generic "Collection Cardinality Utility" - for background information see the article above; the source code includes a description of when and how to use it

5. The problem of the CBO ignoring the projection for costing, described on my blog
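
As a tiny taste of the kind of help those articles describe, here is a minimal sketch (the function name and all figures are invented purely for illustration) of using ASSOCIATE STATISTICS to give the optimizer a default cost and selectivity for a user-defined function:

-- A hypothetical user-defined function; pretend each call does real work.
create or replace function f_lookup (p_id number) return number is
begin
  return mod(p_id, 10);
end;
/

-- Tell the CBO how expensive a single call is (CPU, I/O, network) and how
-- selective a predicate using it is (a percentage). Figures are illustrative.
associate statistics with functions f_lookup
  default cost (10000, 5, 0)
  default selectivity 1;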

There were some other questions in the same context - for example, what a Real Time SQL Monitoring report looks like with user-defined PL/SQL functions. Here is an example you can view / download (remember that the file can be shared, since it is self-contained), taken from a query quite similar to the one described in my blog post. By the way, it's a query that tries out an approach discussed during the session: is there a chance of getting the "projection" cost right by playing clever tricks that combine projection and selection? I didn't get it to work as desired; at least the "selection" cost showed up, but that doesn't address the actual problem of the "projection" cost in my particular test case. So Oracle still attempts to merge the views and then has to run the user-defined function many more times than necessary - in fact the query represents the worst case, because the user-defined function is called for both selection and projection.

Sample Real Time SQL Monitoring Report

You can see from the report that at least some PL/SQL time shows up, but it also shows a deficiency of the report - it doesn't show detailed logical I/O values, only a summary of Buffer Gets.

And interestingly, although nothing about this is visible from the end-user report, the report file itself (in the XML data section) contains at least a couple of optimizer parameter settings - to address another question raised during the session.

All in all I really liked these sessions and I hope the attendees enjoyed them as much as I did.

Last, but not least: it's not definite yet, but very likely I'll be giving a few free "Optimizer Hacking Sessions" on the internet in the future, so stay tuned!

I Can Help You Trace It

The first product I ever created after leaving Oracle Corporation in 1999 was a 3-day course about optimizing Oracle performance. The experiences of teaching this course from 2000 through 2003 (heavily revising the material each time I taught it) added up to the knowledge that Jeff Holt and I needed to write Optimizing Oracle Performance (2003).

Between 2000 and 2006, I spent many weeks on the road teaching this 3-day course. I stopped teaching it in 2006. An opportunity to take or teach a course ought to be a joyous experience, and this one had become too much of a grind. I didn’t figure out how to fix it until this year. How I fixed it is the story I’d like to tell you.

The Problem

The problem was simply inefficiency. The inefficiency began with the structure of the course, the 3-day lecture marathon. Realize, 6 × 3 = 18 hours of sitting in a chair, listening attentively to a single voice (my voice) is the equivalent of a 6-week university term of a 3-credit-hour course, taught straight through in three days. No hour-plus homework assignment after each hour of lecture to reinforce the lessons; but a full semester’s worth of listening to one voice, straight through, for three days. What retention rate would you expect from a university course compressed into just 3 days?

So, I optimized. I have created a new course that lasts one day (not even an exhausting full day at that). But how can a student possibly learn as much in 1 day as we used to teach in 3 days? Isn’t a 1-day event bound to be a significantly reduced-value experience?

On the contrary, I believe our students benefit even more now than they used to. Here are the big differences, so you can see why.

The Time Savings

In the 3-day course, I would spend half a day explaining why people should abandon their old system-wide-ratio-based ways of managing system performance. In the new 1-day course, I spend less than an hour explaining the Method R approach to thinking about performance. The point of the new course is not to convince people to abandon anything they’re already doing; it’s to show students the tremendous additional opportunities that are available to them if they’ll just look at what Oracle trace files have to offer. Time savings: 2 hours.

In the 3-day course, I would spend a full day explaining how to interpret trace data. By hand. There were a few little lab exercises, about an hour's worth: students would enter dozens of numbers from trace files into laptops or pocket calculators and write results on worksheets. In the new 1-day course, the software tools that a student needs to interpret files of any size—or even directories full of files—are included in the price of the course. Time savings: 5 hours.

In the 3-day course, I would spend half a day explaining how to collect trace data. In the new 1-day course, the software tools that a student needs to get started collecting trace files are included in the price of the course. For software architectures that require more work than our software can do for you, there’s detailed instruction in the course book. Time savings: 3 hours.
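
For readers who have never collected extended SQL trace data before, one generic way to switch it on for another session is the DBMS_MONITOR package. This is only an illustrative sketch with made-up SID and SERIAL# values, not the course material or the Method R tooling mentioned above:

-- Identify the session of interest (SID and SERIAL# from V$SESSION), then:
begin
  dbms_monitor.session_trace_enable(
    session_id => 123,      -- illustrative SID
    serial_num => 4567,     -- illustrative SERIAL#
    waits      => true,
    binds      => false
  );
end;
/

-- ...let the interesting workload run, then switch tracing off again:
begin
  dbms_monitor.session_trace_disable(session_id => 123, serial_num => 4567);
end;
/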

In the 3-day course, I would spend half a day working through about five example cases using a software tool to which students would have access for 30 days after they had gone home. In the new 1-day course, I spend one hour working through about eight example cases using software tools that every student will take home and keep forever. I can spend less time per case yet teach more because the cases are thoroughly documented in the course book. So, in class, we focus on the high-level decision making instead of the gnarly technical details you’ll want to look up later anyway. Time savings: 3 hours.

...That’s 13 classroom hours we’ve eliminated from the old 3-day experience. I believe that in these 13 hours, I was teaching material that students weren’t retaining to begin with.

The Book

The next big difference: the book.

In the old 3-day course, I distributed two books: (1) the “Course Notebook,” which was a black and white listing of the course PowerPoint slides, and (2) a copy of Optimizing Oracle Performance (O’Reilly 2003). The O’Reilly book was great, because it contained a lot of detail that you would want to look up after the course. But of course it doesn’t contain any new knowledge we’ve learned since 2003. The Course Notebook, in my opinion, was never worth much to begin with. (In my opinion, no PowerPoint slide printout is worth much to begin with.)

The Mastering Oracle Trace Data (MOTD) book we give each student in my new 1-day course is a full-color, perfect-bound book that explains the course material and far more in deep detail. It is full-color for an important reason. It’s not gratuitous or decorative; it’s because I’ve been studying Edward Tufte. I use color throughout the book to communicate detailed, high-resolution information faster to your brain.

Color in the book helps to reduce student workload and deliver value long after a student has left the classroom. In this class, there is no collection of slide printouts like you’ve archived after every Oracle class you’ve been to since the 1980s. The MOTD book is way better than any other material I’ve ever distributed in my career. I’ve heard students tell their friends that you have to see it to believe it.

“A paper record tells your audience that you are serious, responsible, exact, credible. For deep analysis of evidence and reasoning about complex matters, permanent high-resolution displays [that is, paper] are an excellent start.” —Edward Tufte

The Software

So, where does a student recoup all the time we used to spend going through trace files, and studying how to collect trace data on half a dozen different software architectures? In the thousands of man-hours we’ve invested into the software that we give you when you come to the course. Instead of explaining every little detail about quirks in Oracle trace data that change between Oracle versions 10.1 and 10.2 and 11.2 or 11.2.0.2 and 11.2.0.4, the software does the work for you. Instead of having to explain all the detail work, we have time to explain how to use the results of our software to make decisions about your data.

What’s the catch? Of course, we hope you’ll love our software and want to buy it. The software we give you is completely full-featured and yours to keep forever, but the license limits you to using it only with one login id, and it doesn’t include patches and upgrades, which we release a few times each year. We hope you’ll love our software so much that you’ll want to buy a license that lets you use it on any of your systems and that includes the right to upgrade as we fix bugs and add features. We hope you’ll love it so much that you encourage your colleagues to buy it.

But there’s really no catch. You get software and a course (and a book and a shirt) for less than the daily rate that we used to charge for just a course.

A Shirt?

MOTD London 2011-09-08: “I can help you trace it.”

Yes, a shirt. Each student receives a Method R T-shirt that says, “I can help you trace it.” We don’t give these things away to anyone except for students in my MOTD course. So if you see one, the person wearing it can, in actual fact, Help You Trace It.

The Net Result

The net result of all this optimization is benefits on several fronts:

  • The course costs a lot less than it used to. The fee is presently only about 25% of the 3-day course's price, and the whole experience requires less than ⅓ of the time away from work that the original course did.
  • In the new course, our students don’t have to work so hard to make productive use of the course material. The book and the software take so much of the pressure off. We do talk about what the fields in raw trace data mean—I think it’s necessary to know that in order to use the data properly, and have productive debates with your sys/SAN/net/etc. administration colleagues. But we don’t spend your time doing exercises to untangle nested (recursive) calls by hand. The software you take home does that for you. That’s why it is so much easier for a student to put this course to work right away.
  • Since the course duration is only one day, I can visit far more cities and meet far more students each year. That’s good for students who want to participate, and it’s great for me, because I get to meet more people.

Plans

The only thing missing from our Mastering Oracle Trace Data course right now is you. I have taught the event now in Southlake, Texas (our home town), in Copenhagen, and in London. It’s field-tested and ready to roll. We have several cities on my schedule right now. I’ll be teaching the course in Birmingham UK on the day after UKOUG wraps up, December 8. I’ll be doing Orlando and Tampa in mid-December. I’ll teach two courses this coming January in Manhattan and Long Island. There’s Billund (Legoland) DK in April. We have more plans in the works for Seattle, Portland, Dallas, and Cleveland, and we’re looking for more opportunities.

Share the word by linking the official
MOTD sticker to http://method-r.com/.

My wish is for you to help me book more cities in North America and Europe (I hope to expand beyond that soon). If you are part of a company or a user group with colleagues who would be interested in attending the course, I would love to hear from you. Registering en masse saves you money. The magic number for discounting is 10 students on a single registration from one company or user group.

I can help you trace it.

Planet Earth

I know it’s another post that’s not about Oracle, but someone sent me this video link a couple of days ago and it’s too wonderful not to share. (I’ve just got back from Iceland, so the Aurora Borealis at 1:05 is particularly relevant)

Coming Soon

Just a quick reminder if you’re in Northern Europe: I’ll be in Denmark on 13th December presenting a new tutorial event Beating the Oracle Optimizer.

If you’ve been on my “Designing Optimal SQL” course, this is a follow-up that takes you into the realm of designing “strange” SQL for those occasions when the Optimizer isn’t going to do well enough with “normal” SQL.

In other news:

Don’t forget the UKOUG annual conference in Birmingham from 5th – 7th Dec.

Anyone attending the full event can register for a free seat at the Oak Table Day on Sunday 4th Dec. where I’ll be presenting a geeky session about Redo.

If your abstract wasn’t selected this year, then I see there’s a series of “unconference” slots set aside in the exhibition gallery on the Monday to Wednesday.


Hinting

As I’ve often pointed out, this blog isn’t AskTom, or the OTN forum, so I don’t expect to have people asking me to solve their problems; neither do I answer email questions about specific problems. Occasionally, though, questions do appear that are worth a little public airing, and one of these came in by email a couple of weeks ago. The question is longer than the answer I sent; my contribution to the exchange doesn’t start until the heading “My Reply”.

Last week I accidentally found a very interesting thing about the use_hash hint. That is, when you join two tables using a unique column from one table and you have an equality predicate on that column, you cannot use a hint to make them use a hash join. I know that it does not make sense to use a hash join in this case, because a nested loop is the best way to do it. The point is why Oracle ignores the hint here.

Here is the test case.

--Table creation

create table t1 as
select  rownum id,
        object_name name
from    all_objects
where   rownum <= 1000
;

create table t2 as
select
        mod(rownum,20)+1 id
from    dual
connect by
        rownum <= 1000;

-- Index creation (for table T1, we can create a primary key index or unique index)

alter table t1 add constraint t1_pk primary key(id);
create index ind_t2 on t2(id);

-- Gather table statistics here
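-- (The original test case only notes this step; a minimal sketch using the
--  standard DBMS_STATS calls might look like the following.)
exec dbms_stats.gather_table_stats(user, 'T1', cascade => true)
exec dbms_stats.gather_table_stats(user, 'T2', cascade => true)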

select    /*+ ordered use_hash(t2) */
        *
from    t1, t2
where   t1.id = t2.id
and     t1.id = 1
;

Here, the use_hash hint will be ignored. Without rewriting the query, Oracle only uses a nested loop (which is the best thing; the other join methods make no sense at all).

In your article, you said:

Why do people think that Oracle “ignores” hints ? There are two main reasons.

    1.  The available hints are not properly documented
    2.  the hints are rarely used properly – because they are not documented properly.
    3.  (there are a few bugs that make things go really wrong anyway) – and yes, I know that’s the third reason of two.

In the above test case, reasons 1 and 2 do not apply. So is it an Oracle feature or a bug?

We can argue that this is a feature, since in this case Oracle really knows that a nested loop is the best thing. But then consider this query on table T1:

select  /*+ full(t1) */
        *
from    t1
where   t1.id = 1

By that argument, the optimizer should also ignore the FULL hint in the query above, but it does not.

Now, if we change the query to:

select  /*+ ordered use_hash(t2) */
        *
from    t1,t2
where   t1.id = t2.id
and     t1.id in (1,2)
;

The hint works and Oracle uses a hash join here.

Since Oracle can use transitive closure when generating a query plan, we now rewrite the first query to an equivalent one:

select  /*+ ordered use_hash(t2) */
        *
from    t1,t2
where   t1.id = t2.id
and     t2.id = 1
;

Oracle will use the hint here even though it knows we want t1.id = 1.
So it looks like this is more of a bug than a feature. What do you think?

My reply

If you read chapter 6 of Cost Based Oracle – Fundamentals, somewhere around page 142, you will see what’s going on. There is an inconsistency in this part of the optimizer code which could be addressed but might be hard to change. (So the answer to your question is that this is more like a bug than a feature – but it’s a side effect of a more significant defect in the code, rather than a very local bug.)

In the first case (predicate on t1), Oracle uses transitive closure to generate a predicate on t2 - and as it does so it drops the join predicate. This makes the hash join impossible and puts the hint out of context.

In the second case, Oracle keeps the join predicate (and that’s the inconsistency in behaviour), so the hash join is still legal and therefore the hint has to be obeyed.
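
To make that concrete, here is roughly what the two statements look like after transitive closure - an illustration of the argument above, not a dump of any real optimizer trace:

-- Case 1 - predicate on t1: the generated predicate replaces the join
-- predicate, so there is no join predicate left for a hash join to work on.
select  /*+ ordered use_hash(t2) */ *
from    t1, t2
where   t1.id = 1
and     t2.id = 1;

-- Case 2 - predicate on t2: the generated predicate is added but the join
-- predicate survives, so the hash join is still legal and the hint applies.
select  /*+ ordered use_hash(t2) */ *
from    t1, t2
where   t1.id = t2.id
and     t2.id = 1
and     t1.id = 1;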

I would guess that the logic works like this:

Predicate on t1: we generate a transitive predicate, but the source predicate is equality on a unique key, so the join predicate is redundant and can be dropped.

Predicate on t2: we generate a transitive predicate, but the source predicate is not guaranteed to be a single-row predicate, so the join predicate has to be kept and checked. (The “not unique” requirement is also why the join with the in-list behaves the way it does.)

Arguably Oracle could introduce a second pass in the optimizer code which could note that the generated predicate has resulted in a predicate with equality on a unique key, and with this change in place the optimizer could decide to drop the join predicate and we would be back to consistent behaviour. In fact, to my mind, the code should never drop predicates – but it needs to change so that it recognises “redundant” predicates properly and doesn’t double count them in the calculation of join selectivity.

Friday Philosophy – OK, so I am on Twitter Now

Not a very exciting Friday Philosophy this week I’m afraid, just a self-publicising announcement that I am now on Twitter. I’ve put the WordPress widget on the blog for a while (days or weeks, I don’t know); my Twitter name is MDWidlake. {I was a little surprised mwidlake had gone already, but that says more about how rare I consider my surname to be than about naivety, I hope}. It seems you can click on a part of the widget to follow me, which is a pretty safe thing to do as I am not very verbal as yet.

As I said, I’m not very active at the moment, I’m more following just a few friends and seeing what people use Twitter for. So far it mostly seems to be about:

  • Random stuff posted when bored
  • Complaining about work or, more specifically, tasks that are proving trickier than hoped
  • Drinking
  • Random stuff posted when bored
  • Articles that have caught someone’s eye
  • …or more often, about tweets about articles that have caught someone’s eye
  • Chatty stuff that only makes sense between social peers (and isn’t that one of the main points of something like Twitter?)
  • Random stuff posted when bored
  • Cuddly toys. I think that is a result of low sample size and that Doug Burns is away at a conference. I worry about his sanity sometimes.

Niall Litchfield, Neil Chandler and Doug Burns were right {thanks again for your advice, gents}: there is some nice stuff on there and I’ve already seen some articles and web pages I found interesting via it – but I have also failed to get on with proper work-like stuff I should have been doing as a result.

I also like the chatty extension to real social engagement that Twitter gives, but I’m holding off on my final decision as to whether this makes up for the negative impact it seems to have on real, meeting-in-person socialising.

The interface to Twitter seems a bit, well, rubbish to me. I know, I’ve been on there for all of a week and I am probably missing the Bleedin’ Obvious, but it seems the stuff I see in Timeline, the default view, is just a subset of what the people I follow say. I suspect it’s got something to do with whether the person a tweet is replying to is on my follow list. To understand half the social stuff you have to go clicking around in people’s full tweet history and follow the thread back. Surely there is an easier way than this - maybe some connect-by tree-walk SQL could be invoked…
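
Just for fun, here is what such a tree-walk might look like against an imaginary TWEETS table (the columns ID, IN_REPLY_TO_ID, USERNAME and MESSAGE are entirely made up):

-- Walk each conversation from the original tweet down through its replies,
-- indenting by depth so the thread reads top to bottom.
select  lpad(' ', 2 * (level - 1)) || username || ': ' || message as thread
from    tweets
start with in_reply_to_id is null
connect by prior id = in_reply_to_id
order siblings by id;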

I’ve already dropped one person off my “following” list. I only followed one celebrity and I decided I could live without the random musings of Simon Pegg. I can imagine people get addicted to following several dozen b to z level celebs, maybe it’s like constantly living in some sort of poor quality reality tv show {Personally I tend to avoid all reality TV, I prefer reality. Except that I am forced to watch that dancing thing on BBC by my wife. And like most men who make that sort of defence, I can’t quite explain away why I still watch it if she is away…}.

So, don’t expect too much in the way of interesting, witty, insightful or even existing tweets from me as yet, but if you want to follow, heck you can always drop me like a sack of manure any time you like :-) .