Oakies Blog Aggregator

WordPress 3.4 Released…

WordPress 3.4 has been released. You can download it here, or use the automatic update for existing blogs.

Cheers

Tim…



Consultancy

I have a permanent job in NetCracker's System Performance group. Recently I was offered a one-day job outside, on-site at another company which, coincidentally, has an office close to NetCracker's Moscow office. It was an opportunity to apply my skills in a completely different situation, which I couldn't miss; plus I'd never done public presentations before and this was a good occasion to practice. Here I'd like to write down some notes on how the event went.

Two or three weeks before the event I got the list of topics to work on. It included several Java and Oracle subjects, and I offered a couple of topics myself. The preparation went well, although it took a lot of time to make some of the slides (which I don't really like doing, so they look clumsy, just like my blog :)) and to put together meaningful examples. In fact I didn't prepare everything in the volume I had initially planned because of the time, and as it turned out that was the right decision. I didn't rehearse the presentations, even though I tried; each time I tried I took a different way of describing the same things, so I decided to go unrehearsed and see how it would go. Since I have a clear understanding of what I'm talking about, and had a plan in the slides, it wasn't so hard. What was a bit of a problem was that I didn't know what to expect from the audience, or how to keep the balance between interesting and boring slides. Neither problem materialized; both were somehow avoided :) It must be because of simplicity, I think. I tried hard to keep things simple and keep what's important in my opinion.

So I took a day off and went to present. It started badly. My notebook is a tiny Aspire S3 with only an HDMI output. When I tried to connect it to the projector, it just didn't work. I spent half an hour trying different things until I did something wrong and the screen went black, with nothing on the projector screen either. After a reboot the laptop didn't boot up properly, and I had no choice but to use safe mode and apply System Restore, which added another 10 minutes. Luckily the restore finished well. It must have been the Friday 13th effect :) but no. It turned out the plug I had been trying was not for the projector, but for the speakers!! The right plug was hidden behind the screen wall. Fortunately there was another laptop available with a DVI port, which I used to show the slides. My notebook stayed behind the wall connected via HDMI, so each time I needed to show a demo I had to go up, switch the projector's input and hide myself from the audience (the only thing visible was my back), appearing from time to time. It was funny.

The first part was about JVM performance. I talked about general HotSpot JVM features such as the just-in-time compiler and garbage collection. The latter was the main topic of this part: I covered the JVM memory layout, GC types, their high-level algorithms and the options to tune them, based on my experience. At the end of the section there was a demo of the out-of-the-box monitoring/debugging/analysis tools and of third-party tools such as Memory Analyzer. With only 16 slides it took almost two hours to talk and answer questions. I didn't anticipate it taking that long, and expected to fit into an hour or so. Apparently I can talk for a long time :)

The next topic was Oracle JDBC. I started with an overview of the end-to-end workflow of an Oracle client, i.e. basic Oracle instance architecture, connection methods and the cursor life cycle. Then I showed demos of several important JDBC features: fetch size; update batching; working with Oracle collections; the client- and server-side cursor caches. The examples focused on the performance of the work with the database and its consequences for the client. I finished this section with an explanation of how important it is to instrument applications (which is a must for all Oracle-based apps) and a short list of common JDBC driver exceptions. Most of the things I explained are pretty well described in the FAQ, documentation and white papers.

Then I moved on to database topics, beginning with the general topic of "Performance". Although I've been working in this area for about five years, it was hard to find the right words to begin, and I have to admit it sucked a little bit. The two opening slides and their explanation were not of the same quality as the rest; I know that because I didn't sound convincing to myself. The good thing is it was only the introduction and it didn't take too much time to get through. After the introduction I reviewed the most common Oracle tools for performance diagnostics: out-of-the-box ones, such as SQL Trace and SQL Monitor, and also some third-party tools, such as MOATS and Snapper. I also showed which scripts and queries I use most of the time in my work. They are very simple, BTW; I plan to write about them in the blog some day.
As part of this section I thought I should talk about reading execution plans, but after some thinking I decided to skip this very important and necessary topic and just point to Christian Antognini's excellent paper. Instead I showed, with an example, how to get run-time execution statistics for a plan and where to look to understand why a query runs slowly. Initially I thought I would create just one complex example based on a query I'd seen recently, and then use it throughout the event to also demonstrate several CBO issues such as blind guesses, arithmetic bugs, transformations, and something "unknown" which I'd never seen before. Unfortunately I wasn't able to build the test case as I wanted, so it served only as an example of a badly performing query.
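As a side note (this is an illustration, not the exact demo from the event): run-time row source statistics for a plan can be obtained with the GATHER_PLAN_STATISTICS hint and DBMS_XPLAN.DISPLAY_CURSOR. The table and column names below are made up.

-- Run the statement with row source statistics collection enabled
select /*+ gather_plan_statistics */ count(*)
from   some_table
where  some_column = 42;

-- Show estimated vs. actual rows, buffer gets and elapsed time per plan step
select *
from   table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));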
Then I moved on to interpreting Statspack reports. I had sample reports from their production system, which had experienced some issues. Based on these reports I showed how to approach analyzing the information, what (not) to expect from the reports, where (not) to look, and how to draw balanced conclusions about the amount of work done by the database. I also explained some questionable configuration parameters and where to look next for more data, and tried to explain what a potential cause of the problem could be. It is in fact an impossible cause (which I picked up on the OakTable mailing list), but it fits perfectly well and explains all the issues at once. It's unfortunate that the system's DBA missed the event, so I couldn't talk through all of the "why"s. Maybe next time :) I also thought about showing some more reports, but once again it took me almost an hour to go over just a single report, even without looking at the SQL execution statistics sections. I think I did this part very well, and it was enough report analysis for one day.

And the final topic of the day was statistics. This part was a basic explanation of how Oracle arrives at execution plans, what kinds of features influence its decisions and how they impact building an execution plan, plus a fundamental understanding of what statistics are and what to consider when deciding whether they should (not) be gathered. I didn't try to explain the term "cost" in this section; I only talked about cardinality estimates instead, which is enough for a beginning, I think. For demos I showed two very common cases where CBO estimates may go off: linear decay for out-of-range values (which can be "fixed" by manually adjusting the low/high column values) and correlation between columns (which is nicely covered by 11g's extended statistics).
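For reference, here is a minimal sketch of the 11g extended statistics (column group) feature for correlated columns; the table and column names are hypothetical, not from the event.

-- Create a column group so the CBO gets a combined number of distinct values
-- for the two correlated columns
select dbms_stats.create_extended_stats(user, 'ORDERS', '(COUNTRY, REGION)') as extension_name
from   dual;

-- Re-gather table statistics so the new column group gets populated
exec dbms_stats.gather_table_stats(user, 'ORDERS', method_opt => 'for all columns size auto')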
I also showed my script stat.sql in action, which I use all the time to get a basic understanding of table and index statistics.
I have to admit that I only touched on this complicated topic in this section. It's relatively easy to talk about it for a whole day with tons of examples and gotchas. Nevertheless, I'm sure I covered the basics.
And the last part was supposed to be a Q&A session (with the questions given to me before the event), but I didn't have time to try and write some code, so it went without examples.

All in all it went smoothly; not everything went very well, but some parts were pretty darn good and the rest were fine. It sucked a little bit for 10 minutes or so, which is acceptable. The presentations didn't look too sexy (plain text with minimal drawings), but then it wasn't about the presentations after all, and I know there's plenty of room for improvement in this area. The feedback from the audience was quite positive, which is excellent. For me it was a very good experience and I'm happy I did it.

Lessons learned:

  • HDMI may be a problem
  • Don't try to cover too much in one day
  • Preparation takes longer than expected due to various reasons
  • Preparation brings something new
  • It’s better to know who will attend
  • It would be better not only to talk, but also to "do something" useful with a real system
  • A VM is a better choice for demos
  • Google Docs and Dropbox offer very good services

PS. If you think you may need something similar, preferably in the Moscow region, drop me an e-mail. It's doable.

Filed under: CBO, Java, JDBC, Oracle, Performance

New SQL Puzzle

Sorry for the drought — to keep everyone’s mind fresh, how about a little puzzle?

Use SQL to create a symmetric Pascal matrix, with the output being (i,j,v), so that a Pascal matrix of size 4 would end up as:

1,1,1
2,1,1
3,1,1
4,1,1
1,2,1
2,2,2
3,2,3
4,2,4
1,3,1
2,3,3
3,3,6
4,3,10
1,4,1
2,4,4
3,4,10
4,4,20

Try to stay with ANSI SQL 2008 if possible — bailing out to other languages and functions is discouraged :-)

Efficiency is an interesting idea here, as it’s pretty easy to do this by brute force calculations of factorials for every entry — but where’s the fun in that?
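For illustration only, here's a hedged sketch of that kind of per-cell brute force using a recursive WITH (it builds each cell's value C(i+j-2, i-1) as a running product rather than literal factorials; Oracle's recursive subquery factoring spells this without the RECURSIVE keyword):

with recursive
  idx (i) as (
    select 1 union all
    select i + 1 from idx where i < 4              -- matrix size hard-coded to 4
  ),
  cells (i, j, k, v) as (
    select a.i, b.i, 0, 1                          -- every cell starts at 1
    from idx a cross join idx b
    union all
    select i, j, k + 1, v * (j + k) / (k + 1)      -- running product towards C(i+j-2, i-1)
    from cells
    where k < i - 1
  )
select i, j, v
from cells
where k = i - 1
order by j, i;

The more elegant (and more efficient) solutions are, of course, left as the puzzle.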

Extra credit for solutions which can take in the matrix size as some sort of parameter.

Kernel UEK 2 on Oracle Linux 6.2 fixed lab server memory loss

A few days ago I wrote about my new lab server and the misfortune with kernel UEK (aka 2.6.32 + backports). It simply wouldn’t recognise the memory in the server:

# free -m
             total       used       free     shared    buffers     cached
Mem:          3385        426       2958          0          9        233
-/+ buffers/cache:        184       3200
Swap:          511          0        511

Ouch. Today I gave it another go, especially since my new M4 SSD has arrived. My first idea was to upgrade to UEK2. And indeed, following the instructions on Wim Coekaerts's blog (see references), it worked:

[root@ol62 ~]# uname -a
Linux ol62.localdomain 2.6.39-100.7.1.el6uek.x86_64 #1 SMP Wed May 16 04:04:37 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@ol62 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         32221        495      31725          0          5         34
-/+ buffers/cache:        456      31764
Swap:          511          0        511

Note the 2.6.39-100.7.1! The mainline kernel is actually past that and into version 3.x, but to preserve compatibility with the large amount of software that parses the kernel revision number as a 3-tuple, Oracle decided to stick with 2.6.39. Then again, the big distributions don't really follow the mainstream kernel numbers anyway.

Now, can anyone tell me whether UEK2 is out of beta? I know it's not supported for the database yet, but it's a cool kernel release and I can finally play around with the "perf" utility Kevin Closson and Frits Hoogland have talked about so much recently.

This here is a lot more like it:

top - 13:57:56 up  1:14,  5 users,  load average: 0.22, 0.55, 0.96
Tasks: 240 total,   2 running, 238 sleeping,   0 stopped,   0 zombie
Cpu0  : 31.8%us,  5.3%sy,  0.0%ni, 62.3%id,  0.3%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu1  :  1.8%us,  0.6%sy,  0.0%ni, 97.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  6.7%us,  4.8%sy,  0.0%ni, 88.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.6%us,  0.0%sy,  0.0%ni, 99.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  2.8%us,  2.0%sy,  0.0%ni, 95.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.3%us,  0.3%sy,  0.0%ni, 99.0%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.4%us,  0.0%sy,  0.0%ni, 99.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  3.3%us,  6.0%sy,  0.0%ni, 90.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.3%us,  0.7%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.5%us,  0.5%sy,  0.0%ni, 99.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 : 14.7%us,  1.5%sy,  0.0%ni, 83.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  3.9%us,  5.7%sy,  0.0%ni, 90.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu16 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu17 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu18 :  2.7%us,  1.6%sy,  0.0%ni, 95.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 :  1.7%us,  0.7%sy,  0.0%ni, 97.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu20 :  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu21 :  6.3%us,  4.4%sy,  0.0%ni, 89.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu22 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu23 :  3.9%us,  0.4%sy,  0.0%ni, 95.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32994384k total, 17078956k used, 15915428k free,   115172k buffers
Swap:   524284k total,        0k used,   524284k free, 14839688k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
14908 oracle    20   0 10.2g 113m 106m S 29.6  0.4   0:00.89 oracleorcl (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
...

And whilst I'm at it, here is the output from collectl during the database creation for a SLOB run on my SSD (sda). It was creating the 4GB online redo logs at the time.

### RECORD   35 >>> ol62 <<< (1339593441.001)  ###

# SINGLE CPU[HYPER] STATISTICS
#   Cpu  User Nice  Sys Wait IRQ  Soft Steal Idle
      0     0    0    0    0    0    0     0  100
      1     0    0    0    0    0    0     0  100
      2     0    0    0    0    0    0     0  100
      3     0    0    0    0    0    0     0    0
      4     0    0    0    0    0    0     0    0
      5     0    0    0    0    0    0     0    0
      6     0    0    0    0    0    0     0   98
      7     0    0    0    0    0    0     0    0
      8     0    0    0    0    0    0     0  100
      9     0    0    0    0    0    0     0    0
     10     0    0    0    0    0    0     0    0
     11     0    0    0    0    0    0     0    0
     12     0    0    1    0    0    0     0   99
     13     3    0    3    0    0    0     0   92
     14     0    0    0    0    0    0     0  100
     15     0    0    0    0    0    0     0  100
     16     0    0    0    0    0    0     0  100
     17     0    0    0    0    0    0     0    0
     18     0    0    0    0    0    0     0    0
     19     0    0    0    0    0    0     0    0
     20     0    0    0    0    0    0     0    0
     21     0    0    0    0    0    0     0  100
     22     0    0    0    0    0    0     0    0
     23     0    0    0    0    0    0     0    0

# DISK STATISTICS (/sec)
#           <---------reads---------><---------writes---------><--------averages---------> Pct
#Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
sda              0      0    0    0  189575      0  554  342     342     2     4      1   95
sdb              0      0    0    0       0      0    0    0       0     0     0      0    0
sdd              0      0    0    0       0      0    0    0       0     0     0      0    0
sde              0      0    0    0       0      0    0    0       0     0     0      0    0
sdc              0      0    0    0       0      0    0    0       0     0     0      0    0
sdf              0      0    0    0       0      0    0    0       0     0     0      0    0

References

Introducing A LinkedIn Group For SLOB Users

This is just a very short blog entry to inform folks that there is an open discussion group over at LinkedIn for SLOB topics of interest.

The group can be accessed through the following link:  SLOB LinkedIn Group.

Filed under: oracle

Ad: Mastering Oracle Trace Data

This is just a short post to point out that the company I work for, Trivadis, is organizing 3 classes with Cary Millsap. The topic, as the title suggests, is Cary’s 1-day class entitled “Mastering Oracle Trace Data”.

The following dates and locations are planned:

  • September 11, 2012 – Munich (DE)
  • September 13, 2012 – Zürich (CH)
  • September 18, 2012 – Vienna (AT)

For detailed information have a look at the flyer. Note that the early bird registration period, which entitles you to a 15% discount, ends on August 1.

Simple SLOB Init.ora Parameter File For Read IOPS Testing

This is just a quick blog entry to show the very simple init.ora parameter file I use for simple read IOPS testing with SLOB. On 2s16c32t E5-2600 servers attached to very fast storage this init.ora file delivers on the order of 275,000 physical IOPS with 64 SLOB sessions.

I’ll post an init.ora that I use for the REDO model and DBWR testing as soon as possible.

Thanks to Yury for the recommended hidden init.ora parameters that boost the ratio of db file sequential reads.

Additional information can be found here: README file.

Here is the init.ora:

db_create_file_dest = '/mnt/dsk/slob'
control_files=('/mnt/dsk/slob/cntlSLOB.dbf')
db_name = SLOB
compatible = 11.2.0.2
UNDO_MANAGEMENT=AUTO
db_block_size = 8192
db_files = 20000
processes = 500
shared_pool_size = 5000M
db_cache_size=10M
filesystemio_options=setall
parallel_max_servers=0
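# Hidden parameters (per Yury's recommendation above) intended to suppress block
# prefetching so that buffered reads stay single-block db file sequential reads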
_db_block_prefetch_limit=0
_db_block_prefetch_quota=0
_db_file_noncontig_mblock_read_count=0
log_buffer=134217728
cpu_count=1
pga_aggregate_target=8G

Filed under: oracle

Report Generators And Query Transformations

Usually the Cost-Based Optimizer arrives at a reasonable execution plan if it gets the estimates regarding cardinality and data scattering / clustering right (if you want to learn more about that why not watch my Webinar available at "AllThingsOracle.com"?).

Here is an example I've recently come across where this wasn't the case - the optimizer obviously preferred plans with a significantly higher cost.

The setup to reproduce the issue is simple:

create table t1
as
select
        rownum as id
      , mod(rownum, 100) + 1 as attr1
      , 'DESC' || to_char(mod(rownum, 100) + 1, 'TM') as attr2
      , mod(rownum, 10) + 1 as attr3
      , rpad('x', 100) as filler
from
        dual
connect by
        level <= 1000000
;

exec dbms_stats.gather_table_stats(null, 't1')

create table t2
as
select
        rownum as id
      , mod(rownum, 2) + 1 as attr1
      , 'DESC' || to_char(mod(rownum, 2) + 1, 'TM') as attr2
      , mod(rownum, 10) + 1 as attr3
      , rpad('x', 100) as filler
from
        dual
connect by
        level <= 10000
;

exec dbms_stats.gather_table_stats(null, 't2', no_invalidate => false)

So we have two tables, and the smaller one basically serves as "security" information used to restrict the data from the larger table according to some user entitlement (apologies for the generic column and table names that don't make this more obvious).

The query that I'm looking at is the following:

with a as (
        select * from t1
        where attr3 in (select distinct attr3 from t2 where attr1 = 1)
),
b as (
        select
                attr1 as "attr1"
              , min(attr2) as "min_attr2"
        from
                a
        group by
                attr1
)
select
        "attr1" as "memberuniquename"
      , min("min_attr2") as "member"
from
        b
where
        1 = 1
group by
        "attr1"
;

At first sight you might ask yourself why anyone would write a query like that, in particular with the duplicated aggregation step. The answer is, of course, that a human being very likely wouldn't write such a query, but this is what you get from some report generators based on generic meta data.

In principle the query is looking for a simple list of values from a table (in this case dimension tables that are browsed interactively), but restricted according to the "profile" of a particular user, here represented by the unique list of values of T2.ATTR3 restricted on T2.ATTR1 = 1.

There are a couple of design issues with the reproduced approach, but the query doesn't look too complex after all.

One specific property of the query is that the MIN() aggregate gives Oracle a certain kind of flexibility about when to apply the aggregate function: it can be applied to duplicated data and still generate the correct answer, something that is not necessarily possible with other aggregate functions like COUNT(), for example.
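A quick ad-hoc illustration of that point (not part of the original test case): MIN() over a duplicated set of values gives the same answer as over the distinct set, whereas COUNT() does not.

-- MIN is insensitive to the duplicate: returns 1, the same as for the distinct set {1, 2}
select min(val) from (
  select 1 as val from dual union all
  select 1 from dual union all
  select 2 from dual
);

-- COUNT is not: returns 3 instead of 2
select count(val) from (
  select 1 as val from dual union all
  select 1 from dual union all
  select 2 from dual
);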

One potential threat therefore comes from the fact that a transformation of the IN clause into a kind of join can produce a huge intermediate result set, since ATTR3 in both tables has only ten distinct values. Usually this is not a problem, since the optimizer, provided it gets the estimates right, should recognize this pattern and calculate a correspondingly increased cost for such plans, and therefore prefer other execution plans that avoid this in some way, for example by using a DISTINCT / GROUP BY operation early, before the join, or simply by using a semi join operation.

In fact, the query could be simplified to the following:

select
        attr1 as "memberuniquename"
      , min(attr2) as "member"
from
        t1
where
        t1.attr3 in (select t2.attr3 from t2 where attr1 = 1)
group by
        attr1
;

The database version in question was 11.2.0.1, and if you run the simplified query, then you'll get the following execution plan:

Plan hash value: 1591942764

----------------------------------------------------------------------------------
| Id  | Operation             | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |          |   100 |  4500 |  4858   (3)| 00:00:59 |
|   1 |  HASH GROUP BY        |          |   100 |  4500 |  4858   (3)| 00:00:59 |
|*  2 |   HASH JOIN           |          |   708 | 31860 |  4857   (3)| 00:00:59 |
|   3 |    VIEW               | VW_GBF_6 |    10 |    30 |    50   (4)| 00:00:01 |
|   4 |     HASH GROUP BY     |          |    10 |    60 |    50   (4)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL| T2       |  5000 | 30000 |    48   (0)| 00:00:01 |
|   6 |    VIEW               | VW_GBC_5 |   708 | 29736 |  4807   (3)| 00:00:58 |
|   7 |     HASH GROUP BY     |          |   708 |  9204 |  4807   (3)| 00:00:58 |
|   8 |      TABLE ACCESS FULL| T1       |  1000K|    12M|  4707   (1)| 00:00:57 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ITEM_1"="ITEM_1")
   5 - filter("ATTR1"=1)

The execution plan is reasonable and performs reasonably - the optimizer used the so-called "Group By Placement" transformation, which took one of the predicted approaches of applying the DISTINCT (transformed into a GROUP BY) as early as possible, and arrived at the following query after transformation and costing, taken from the optimizer trace file and slightly re-formatted for better readability:

SELECT
        "VW_GBC_5"."ITEM_3" "ATTR1"
      , MIN("VW_GBC_5"."ITEM_2") "MIN(ATTR2)"
FROM
        (
          SELECT
                  "T2"."ATTR3" "ITEM_1"
          FROM
                  "T2" "T2"
          WHERE
                  "T2"."ATTR1"=1
          GROUP BY
                  "T2"."ATTR3"
        ) "VW_GBF_6"
      , (
          SELECT
                  "T1"."ATTR3" "ITEM_1"
                , MIN("T1"."ATTR2") "ITEM_2"
                , "T1"."ATTR1" "ITEM_3"
          FROM
                  "T1" "T1"
          GROUP BY
                  "T1"."ATTR3"
                , "T1"."ATTR1"
        ) "VW_GBC_5"
WHERE
        "VW_GBC_5"."ITEM_1"="VW_GBF_6"."ITEM_1"
GROUP BY
        "VW_GBC_5"."ITEM_3"
;

which, funnily enough, is a bit similar to the original query mentioned at the beginning.

If, however, you check the execution plan of the original query, you'll get the following:

Plan hash value: 3221309153

-------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |      |   100 |  3900 | 81386  (95)| 00:16:17 |
|   1 |  HASH GROUP BY        |      |   100 |  3900 | 81386  (95)| 00:16:17 |
|   2 |   VIEW                |      |   100 |  3900 | 81386  (95)| 00:16:17 |
|   3 |    HASH GROUP BY      |      |   100 |  1900 | 81386  (95)| 00:16:17 |
|*  4 |     HASH JOIN         |      |   500M|  9059M| 10222  (54)| 00:02:03 |
|*  5 |      TABLE ACCESS FULL| T2   |  5000 | 30000 |    48   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL| T1   |  1000K|    12M|  4707   (1)| 00:00:57 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("ATTR3"="ATTR3")
   5 - filter("ATTR1"=1)

This execution plan looks quite different, and it shows exactly the problem I've described: the optimizer has the option to postpone the aggregation until after the join due to the specifics of the query (a MIN() aggregate can be applied late), but the problem can be seen from the cardinality estimate of the join - a whopping 500 million rows that are crunched afterwards into 100 rows. This doesn't look like an efficient approach, and indeed at actual execution time a lot of CPU time (several minutes, depending on the CPU speed) is spent handling that huge amount of intermediate data. It is probably a bug that it doesn't use at least a simple semi join to avoid the duplicates.

When I first saw this problem I was pretty sure that it was a typical example of the impact of the WITH clause on query transformations in versions prior to 11.2.0.3, where that problem was finally addressed (to a very large extent, although not completely eliminated).

And indeed, when rewriting the original query to the corresponding one using inline views instead of Subquery Factoring / Common Table Expressions:

select
        "attr1" as "memberuniquename"
      , min("min_attr2") as "member"
from
        (
          select
                  attr1 as "attr1"
                , min(attr2) as "min_attr2"
          from
                  (
                    select * from t1
                    where attr3 in (
                            select distinct attr3 from t2 where attr1 = 1
                          )
                  ) a
          group by
                  attr1
        ) b
where
        1 = 1
group by
        "attr1"
;

the following execution plan will be generated:

Plan hash value: 1699732834

------------------------------------------------------------------------------------
| Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |          |   100 |  3900 |  4858   (3)| 00:00:59 |
|   1 |  HASH GROUP BY          |          |   100 |  3900 |  4858   (3)| 00:00:59 |
|   2 |   VIEW                  |          |   100 |  3900 |  4858   (3)| 00:00:59 |
|   3 |    HASH GROUP BY        |          |   100 |  4500 |  4858   (3)| 00:00:59 |
|*  4 |     HASH JOIN           |          |   708 | 31860 |  4857   (3)| 00:00:59 |
|   5 |      VIEW               | VW_GBF_8 |    10 |    30 |    50   (4)| 00:00:01 |
|   6 |       HASH GROUP BY     |          |    10 |    60 |    50   (4)| 00:00:01 |
|*  7 |        TABLE ACCESS FULL| T2       |  5000 | 30000 |    48   (0)| 00:00:01 |
|   8 |      VIEW               | VW_GBC_7 |   708 | 29736 |  4807   (3)| 00:00:58 |
|   9 |       HASH GROUP BY     |          |   708 |  9204 |  4807   (3)| 00:00:58 |
|  10 |        TABLE ACCESS FULL| T1       |  1000K|    12M|  4707   (1)| 00:00:57 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("ITEM_1"="ITEM_1")
   7 - filter("ATTR1"=1)

which is almost identical to the execution plan of the simplified query. Notice also the huge difference in cost compared to the execution plan for the original query. A test on 11.2.0.3 produced the same execution plan without the rewrite to inline views, so the changes introduced in 11.2.0.3 address this issue.

I therefore advised disabling the usage of the WITH clause in the report generator, which was a matter of a simple click in the tool.

More Query Variations

Up to that point I thought this could be filed as just another example of the already known Subquery Factoring / Query Transformation issues, but it's not over yet - the report generator managed to produce more variations of the query. I particularly liked this one, which used the new setting disabling the WITH clause:

select
        "attr1" as "memberuniquename"
      , min("min_attr2") as "member"
from
        (
          select
                  attr1 as "attr1"
                , min(attr2) as "min_attr2"
          from
                  (
                    select * from t1
                    where attr3 in (
                            select distinct attr3 from t2 where attr1 = 1
                          )
                  ) a
          group by
                  attr1
          having
                  1 = 1
        ) b
group by
        "attr1"
;

Whereas in the previous example an innocent "WHERE 1 = 1" predicate was added and simply ignored by the optimizer, this time a seemingly innocent "HAVING 1 = 1" was added, and this identified another "weak spot" in the optimizer code, because now the execution plan generated was again the HASH JOIN with the huge intermediate result.

In fact, when changing the simplified query by adding the redundant HAVING clause like this:

select
        attr1 as "memberuniquename"
      , min(attr2) as "member"
from
        t1
where
        t1.attr3 in (select t2.attr3 from t2 where attr1 = 1)
group by
        attr1
having
        1 = 1
;

the bad execution plan can be reproduced, too. Even 11.2.0.3 doesn't fix this, so in this case one possible tactical workaround was to use the NO_UNNEST hint for the subquery: we knew the data, and the number of distinct keys used as input/output of the subquery was always small, so we could benefit a lot from filter subquery caching. Notice that changing the MIN() aggregate in the above query to a COUNT(), for example, changes the join to a semi join, avoiding the problem of the huge set of intermediate duplicated data.
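For completeness, the tactical workaround mentioned above would look something like the following sketch (the hint goes inside the subquery; this is not taken verbatim from the original case):

select
        attr1 as "memberuniquename"
      , min(attr2) as "member"
from
        t1
where
        t1.attr3 in (
          -- NO_UNNEST keeps the subquery as a filter; with only a handful of
          -- distinct ATTR3 values the filter subquery cache does most of the work
          select /*+ no_unnest */ t2.attr3 from t2 where attr1 = 1
        )
group by
        attr1
having
        1 = 1
;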

The strategic solution in this case was, of course, to use a correct data model that prevents the duplicates in the "security" table in the first place. This prevents the huge intermediate result set even when the "good" query transformations are not applied.

Footnote

Watch out if you upgrade to 11.2.0.3. Potentially a lot of execution plans using the WITH clause might change, and as you know, not always for the better.

Repairman Jack : Bloodline

Bloodline is the eleventh book in the Repairman Jack series by F. Paul Wilson.

After taking a break from doing fix-its, Jack takes on a simple case for Christy Pickering. She is looking for dirt on her daughter's boyfriend, who is twice her daughter's age. It's not Jack's normal line of work, but it sounds simple enough, until the first guy Christy hired for the job turns up dead. As usual, nothing is ever simple for Jack…

There is some very dodgy stuff going on in this book. I’m not going to go into it because it will be a spoiler and I don’t want the subject matter to appear on this blog. It’s not the sort of thing you want appearing when someone does a search on you or your blog. :)

At this point in the story it is less than a year away from the armageddon that keeps being foretold…

Cheers

Tim…






Discovering More Work-Related Windows Apps…

I know it’s a little sad, but I’m kinda enjoying discovering new little apps for Windows to do all those things that come out-of-the-box on Linux. Don’t get me wrong, I’m not loving Windows, but enjoying the discovery process itself.

X Server

On a Linux desktop you don't need to worry about this, as your desktop is already running one. To get GUIs on a remote server to display on your local desktop, you just need to connect to the remote server using "ssh -X oracle@myserver.example.com".

On Windows you need some additional software. There are plenty of paid-for options, like Exceed, but there are also some free ones. Many moons ago I used to use Cygwin. It's still around and still does the job just fine. Remember to install the Cygwin/X libraries. Once it's installed, do the following:

  • Start a Cygwin terminal.
  • Type “startx” and return.
  • In the resulting X windows interface, connect to your server of interest using “ssh -Y oracle@myserver.example.com”
  • In some cases I still had to manually run "export DISPLAY=mypc.example.com:0.0" on the remote server.

It works OK, but the process feels a little clumsy.

Whilst Googling for alternatives I came across MobaXterm, which is very neat. It's a self-contained executable, like Putty.exe, and comes with a whole host of connectivity options, including automatically starting a local X server for all SSH connections. No configuration required. No messing. Just start a GUI on the remote machine via SSH and it appears on your PC screen. Cool. I've now replaced Putty.exe (and PuttyCM), Cygwin and the occasional WinSCP with MobaXterm. Give it a try. I think you'll like it.

A nice side benefit of Cygwin and MobaXterm is they give you access to lots of UNIX/Linux commands on Windows. :)

SCP/SFTP

On Linux it’s real easy. You just call it from the command line. See “man scp”.

On Windows, I started to use WinSCP, which is OK, but it feels very clumsy. Since trying MobaXterm I don’t bother with WinSCP anymore, because connections to a machine using SSH automatically start an SFTP panel. Very handy.

Editors

There's no getting away from vi on Linux, and I'm happy to use it to edit the odd file, but I'm not enough of a fan to do anything large with it. Followers of the blog know I'm an UltraEdit fan. As far as free Windows editors go, Notepad++ is a pretty neat editor. Having said that, MobaXterm includes quite a good editor, which is especially useful for editing large files on remote systems.

PortableApps

I guess a special mention should go out to PortableApps, which helped me through the first few days at work while I was struggling to get stuff installed on my PC. There’s some really cool stuff there.

Over the last couple of weeks I've bounced between a number of different Windows apps and things are finally starting to settle. I'd still prefer to be using Linux, but at least I'm functional now. :) Any suggestions for cool work-related apps are welcome.

Cheers

Tim…

