I have a permanent job in NetCracker's System Performance group. Recently I was offered a one-day job outside, on-site at another company, which coincidentally has an office close to NetCracker's Moscow office. It was an opportunity to apply my skills in a completely different situation, which I couldn't miss; plus I'd never done public presentations before, and this was a good occasion to practice. Here I'd like to write down some notes on how the event went.
Two or three weeks before the event I got the list of topics to work on. It included several Java and Oracle subjects, and I offered a couple of topics myself too. The preparation went well, although it took a lot of time to make some of the slides (which I don't really like doing, hence the slides look clumsy, just like my blog :)) and to put together meaningful examples. In fact I didn't prepare everything in the volume I had initially in mind because of time, and as it turned out that was the right decision. I didn't rehearse the presentations, even though I tried: each time I tried, I described the same things in different ways, so I decided to go unrehearsed and see how it would go. Since I have a clear understanding of what I talk about and had a plan in the slides, it wasn't so hard. What was a bit of a problem was that I didn't know what to expect from the audience, or how to keep the balance between interesting and boring slides. Neither problem materialized; both were somehow avoided :) It must be because of simplicity, I think: I tried hard to keep things simple and to keep what's important in my opinion.
So I took a day off and went to present. It started badly. My notebook is a tiny Aspire S3 with only an HDMI output. When I tried to connect it to the projector, it just didn't work. I spent half an hour trying different things until I did something wrong and the screen went black, with nothing on the projector screen either. After a reboot the laptop didn't boot up properly, and I had no choice but to use safe mode and apply System Restore, which added another 10 minutes. Luckily the restore finished well. It must have been the Friday the 13th effect :) but no. It turned out the plug I had tried was not for the projector, but for the speakers!! The right plug was hidden behind the screen wall. Fortunately there was another laptop available with a DVI port, which I used to show the slides. My notebook stayed behind the wall connected via HDMI, so each time I needed to show a demo I had to go up, switch the projector's input and hide behind the wall (the only thing visible was my back), appearing from time to time. It was funny.
The first part was about JVM performance. I talked about general HotSpot JVM features such as the just-in-time compiler and garbage collection; the latter was the main topic of this part. I covered the JVM memory layout, GC types, their high-level algorithms and options to tune them based on my experience. At the end of the section there was a demo of out-of-the-box monitoring/debugging/analysis tools and third-party tools such as Memory Analyzer. With only 16 slides it took almost 2 hours to talk and answer questions. I didn't anticipate it taking that long, and expected to fit into an hour or so. Apparently I can talk for a long time :)
The next topic was Oracle JDBC. I started with an overview of the end-to-end workflow of an Oracle client, i.e. basic Oracle instance architecture, connection methods and the cursor life cycle. Then I showed demos of several important JDBC features: fetch size; update batching; working with Oracle collections; client- and server-side cursor caches. The examples were focused on the performance of the interaction with the database and its consequences for the client. I finished this section with an explanation of how important it is to instrument applications (which is a must for all Oracle-based apps) and a short list of common JDBC driver exceptions. Most of the things I explained are pretty well described in the FAQ, documentation and white papers.
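The fetch size and batching ideas are not Oracle-specific, by the way. As a rough sketch of the concept (not one of the actual demos from the event; Python's sqlite3 stands in for the database here, so no real network round trips are saved, but the API shape is analogous):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
rows = [(i, f"row {i}") for i in range(1000)]

# Row-at-a-time inserts: one statement execution per row (with a real
# JDBC driver, potentially one network round trip per row).
for r in rows[:500]:
    con.execute("INSERT INTO t (id, val) VALUES (?, ?)", r)

# Batched inserts: the whole batch is handed over in one call; this is
# the same idea as Oracle JDBC update batching.
con.executemany("INSERT INTO t (id, val) VALUES (?, ?)", rows[500:])

# Fetch size analogue: pull rows in chunks instead of one by one.
cur = con.execute("SELECT id, val FROM t ORDER BY id")
cur.arraysize = 100           # hint for fetchmany(), like setFetchSize() in JDBC
first_chunk = cur.fetchmany() # returns up to arraysize rows
print(len(first_chunk))       # 100
```

The trade-off is the same in both worlds: bigger batches and fetch sizes mean fewer round trips, at the cost of more memory per call.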
Then I moved on to database topics, beginning with the general topic of "Performance". Although I've been working in this area for about five years, it was hard to find the right words to begin with, and I have to admit it sucked a little bit: the two opening slides and their explanation were not of the same quality as the rest. I know that because I didn't sound convincing to myself. The good thing is it was only the introduction, and it didn't take too much time to get through. After the introduction I reviewed the most common Oracle tools for performance diagnostics: out-of-the-box ones, such as SQL Trace and SQL Monitor, and also some third-party tools, such as MOATS and Snapper. I also showed the scripts and queries I use most of the time in my work. They are very simple, BTW; I plan to write about them some day on the blog.
As part of this section I thought I should talk about reading execution plans, but after some thinking I decided to skip this very important and necessary topic and just point to Christian Antognini's excellent paper. Instead I showed, with an example, how to get run-time execution statistics for a plan and where to look to understand why a query runs slow. Initially I thought I'd create just one complex example based on a query I'd seen recently, and then use it throughout the event to also demonstrate several CBO issues such as blind guesses, arithmetic bugs, transformations and something "unknown" which I'd never seen before. Unfortunately I wasn't able to build the test case as I wanted, and it served only as an example of a badly performing query.
Then I moved on to interpreting Statspack reports. I had sample reports from their production system, which had experienced some issues. Based on these reports I showed how to approach the analysis of the information, what (not) to expect from the reports, where (not) to look, and how to draw balanced conclusions about the amount of work done by the database. I also explained some questionable configuration parameters and where to look next for more data, and tried to explain what a potential reason for the problem could be. It is in fact an impossible reason (which I picked up on the OakTable mailing list), but it fits perfectly well and explains all the issues at once. It is unfortunate that the system's DBA missed the event, so I couldn't talk about all of the "why"s. Maybe next time :) I also thought about showing some more reports, but once again it took me almost an hour to review just a single report, even without looking at the SQL execution statistics sections. I think I did this part very well, and it was enough report analysis for one day.
And the final topic of the day was statistics. This part was a basic explanation of how Oracle arrives at execution plans, what kinds of inputs influence its decisions and how they impact building an execution plan, plus a fundamental understanding of what statistics are and what to consider when they should (not) be gathered. I didn't try to explain the term "cost" in this section and only talked about cardinality estimates instead; that is enough for a beginning, I think. For demos I showed two very common cases where CBO estimates may go off: linear decay for out-of-range values (which can be "fixed" by manual adjustment of the low/high column values) and correlation between columns (which is nicely covered by 11g's extended statistics).
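The out-of-range behaviour is easy to model. As I understand it, once the predicate value moves past the column's known high (or low) value, the optimizer scales the estimate down roughly linearly over the width of the low/high range. A back-of-the-envelope sketch (a deliberate simplification for illustration, not the exact formula any particular Oracle version uses):

```python
def out_of_range_card(value, low, high, base_card):
    """Rough model of CBO linear decay for an equality predicate on a
    value beyond the [low, high] range recorded in column statistics.
    Simplified for illustration; not the exact Oracle formula."""
    if low <= value <= high:
        return float(base_card)               # in range: plain density-based estimate
    distance = value - high if value > high else low - value
    factor = max(0.0, 1.0 - distance / (high - low))
    return max(1.0, base_card * factor)       # estimates never drop below 1 row

# Stats say values span 1..1000 and an in-range value returns ~50 rows:
print(out_of_range_card(500, 1, 1000, 50))    # in range: 50.0
print(out_of_range_card(1500, 1, 1000, 50))   # half a range-width past high: ~25
print(out_of_range_card(2500, 1, 1000, 50))   # far past high: floored at 1.0
```

This also shows why stale low/high values hurt so much on monotonically growing columns (dates, sequences): new data is always "out of range" until statistics are refreshed or the high value is adjusted manually.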
I also showed in action my script stat.sql, which I use all the time to get a basic understanding of table and index statistics.
I have to admit that in this section I only touched on this complicated topic. It's relatively easy to talk about it for a whole day with tons of examples and gotchas. Nevertheless I'm sure I covered the basics.
And the last part was supposed to be a Q&A session (with questions given to me before the event), but I didn't have time to try and write some code, so it went without examples.
All in all it went smoothly; not everything went very well, but some parts were pretty darn good and the rest were OK. It sucked a little bit for 10 minutes or so, which is acceptable. The presentations didn't look too sexy: plain text with minimal drawings. Well, it wasn't about the presentations after all; I know there's plenty of room for improvement in this area. The feedback from the audience was quite positive, which is excellent. For me it was a very good experience and I'm happy I did it.
PS. If you think you may need something similar in (preferably) the Moscow region, drop me an e-mail. It's doable.
Sorry for the drought — to keep everyone’s mind fresh, how about a little puzzle?
Use SQL to create a symmetric Pascal matrix, with the output being (i,j,v) rows, so that a Pascal matrix of size 4 would end up as:

1  1  1  1
1  2  3  4
1  3  6 10
1  4 10 20
Try to stay with ANSI SQL 2008 if possible — bailing out to other languages and functions is discouraged
Efficiency is an interesting idea here, as it’s pretty easy to do this by brute force calculations of factorials for every entry — but where’s the fun in that?
Extra credit for solutions which can take in the matrix size as some sort of parameter.
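Spoiler alert for anyone still working on it: here is a sketch of one possible shape of a solution, run through SQLite from Python for convenience (SQLite's recursive WITH is close to the SQL:2008 flavour, though not identical). It avoids factorials entirely by walking along each row with the binomial recurrence C(i+j-1, i-1) = C(i+j-2, i-1) * (i+j-1) / j, and it takes the matrix size as a bind parameter for the extra credit:

```python
import sqlite3

def pascal(n):
    """Symmetric Pascal matrix of size n as (i, j, v) rows, no factorials."""
    sql = """
    WITH RECURSIVE
      r(i) AS (                       -- row numbers 1..n
        SELECT 1
        UNION ALL
        SELECT i + 1 FROM r WHERE i < :n
      ),
      m(i, j, v) AS (                 -- seed column j = 1 with all-ones,
        SELECT i, 1, 1 FROM r         -- then extend each row to the right
        UNION ALL
        SELECT i, j + 1, v * (i + j - 1) / j FROM m WHERE j < :n
      )
    SELECT i, j, v FROM m ORDER BY i, j
    """
    return sqlite3.connect(":memory:").execute(sql, {"n": n}).fetchall()

print(pascal(4))   # sixteen (i, j, v) rows
```

The integer division is exact at every step (the intermediate product v * (i + j - 1) is always divisible by j, since the result is itself a binomial coefficient), so no floating point creeps in.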
A few days ago I wrote about my new lab server and the misfortune with the UEK kernel (aka 2.6.32 + backports). It simply wouldn't recognise all the memory in the server:
# free -m
             total       used       free     shared    buffers     cached
Mem:          3385        426       2958          0          9        233
-/+ buffers/cache:        184       3200
Swap:          511          0        511
Ouch. Today I gave it another go, especially since my new M4 SSD had arrived. My first idea was to upgrade to UEK2, and indeed, following the instructions on Wim Coekaerts's blog (see references), it worked:
[root@ol62 ~]# uname -a
Linux ol62.localdomain 2.6.39-100.7.1.el6uek.x86_64 #1 SMP Wed May 16 04:04:37 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@ol62 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         32221        495      31725          0          5         34
-/+ buffers/cache:        456      31764
Swap:          511          0        511
Note the 2.6.39-100.7.1! The mainline kernel is actually past that and at version 3.x, but to preserve compatibility with a lot of software that parses the kernel revision number as 3 tuples, Oracle decided to stick with 2.6.39. But then the big distributions don't really follow the mainstream kernel numbers anyway.
Now, could anyone tell me whether UEK2 is out of beta? I know it's not supported for the database yet, but it's a cool kernel release and I can finally play around with the "perf" utility that Kevin Closson and Frits Hoogland have mentioned so much recently.
This here is a lot more like it:
top - 13:57:56 up  1:14,  5 users,  load average: 0.22, 0.55, 0.96
Tasks: 240 total,   2 running, 238 sleeping,   0 stopped,   0 zombie
Cpu0  : 31.8%us,  5.3%sy,  0.0%ni, 62.3%id,  0.3%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu1  :  1.8%us,  0.6%sy,  0.0%ni, 97.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  6.7%us,  4.8%sy,  0.0%ni, 88.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.6%us,  0.0%sy,  0.0%ni, 99.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  2.8%us,  2.0%sy,  0.0%ni, 95.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.3%us,  0.3%sy,  0.0%ni, 99.0%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu8  :  0.4%us,  0.0%sy,  0.0%ni, 99.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu9  :  3.3%us,  6.0%sy,  0.0%ni, 90.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu10 :  0.3%us,  0.7%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu11 :  0.5%us,  0.5%sy,  0.0%ni, 99.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu12 : 14.7%us,  1.5%sy,  0.0%ni, 83.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu13 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu14 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu15 :  3.9%us,  5.7%sy,  0.0%ni, 90.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu16 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu17 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu18 :  2.7%us,  1.6%sy,  0.0%ni, 95.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu19 :  1.7%us,  0.7%sy,  0.0%ni, 97.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu20 :  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu21 :  6.3%us,  4.4%sy,  0.0%ni, 89.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu22 :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu23 :  3.9%us,  0.4%sy,  0.0%ni, 95.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32994384k total, 17078956k used, 15915428k free,   115172k buffers
Swap:   524284k total,        0k used,   524284k free, 14839688k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
14908 oracle    20   0 10.2g 113m 106m S 29.6  0.4   0:00.89 oracleorcl (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
...
And whilst I'm on it, here is the output from collectl during the database creation for a SLOB run on my SSD (sda). It was creating the 4GB online redo logs at the time.
### RECORD   35 >>> ol62 <<< (1339593441.001) ###

# SINGLE CPU[HYPER] STATISTICS
#   Cpu  User Nice  Sys Wait IRQ  Soft Steal Idle
      0     0    0    0    0    0     0     0  100
      1     0    0    0    0    0     0     0  100
      2     0    0    0    0    0     0     0  100
      3     0    0    0    0    0     0     0    0
      4     0    0    0    0    0     0     0    0
      5     0    0    0    0    0     0     0    0
      6     0    0    0    0    0     0     0   98
      7     0    0    0    0    0     0     0    0
      8     0    0    0    0    0     0     0  100
      9     0    0    0    0    0     0     0    0
     10     0    0    0    0    0     0     0    0
     11     0    0    0    0    0     0     0    0
     12     0    0    1    0    0     0     0   99
     13     3    0    3    0    0     0     0   92
     14     0    0    0    0    0     0     0  100
     15     0    0    0    0    0     0     0  100
     16     0    0    0    0    0     0     0  100
     17     0    0    0    0    0     0     0    0
     18     0    0    0    0    0     0     0    0
     19     0    0    0    0    0     0     0    0
     20     0    0    0    0    0     0     0    0
     21     0    0    0    0    0     0     0  100
     22     0    0    0    0    0     0     0    0
     23     0    0    0    0    0     0     0    0

# DISK STATISTICS (/sec)
#                                                                                      Pct
#Name       KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize  QLen  Wait SvcTim Util
sda              0      0    0    0  189575      0  554  342     342     2     4      1   95
sdb              0      0    0    0       0      0    0    0       0     0     0      0    0
sdd              0      0    0    0       0      0    0    0       0     0     0      0    0
sde              0      0    0    0       0      0    0    0       0     0     0      0    0
sdc              0      0    0    0       0      0    0    0       0     0     0      0    0
sdf              0      0    0    0       0      0    0    0       0     0     0      0    0
This is just a short post to point out that the company I work for, Trivadis, is organizing 3 classes with Cary Millsap. The topic, as the title suggests, is Cary’s 1-day class entitled “Mastering Oracle Trace Data”.
The following dates and locations are planned:
For detailed information have a look at the flyer. Note that the early bird registration period, which entitles you to a 15% discount, ends on August 1.
This is just a quick blog entry to show the very simple init.ora parameter file I use for simple read IOPS testing with SLOB. On 2s16c32t E5-2600 servers attached to very fast storage, this init.ora delivers on the order of 275,000 physical IOPS with 64 SLOB sessions.
I’ll post an init.ora that I use for the REDO model and DBWR testing as soon as possible.
Thanks to Yury for the recommended hidden init.ora parameters to boost the ratio of db file sequential reads.
Additional information can be found here: README file.
Here is the init.ora:
db_create_file_dest = '/mnt/dsk/slob'
control_files=('/mnt/dsk/slob/cntlSLOB.dbf')
db_name = SLOB
compatible = 11.2.0.0
UNDO_MANAGEMENT=AUTO
db_block_size = 8192
db_files = 20000
processes = 500
shared_pool_size = 5000M
db_cache_size=10M
filesystemio_options=setall
parallel_max_servers=0
_db_block_prefetch_limit=0
_db_block_prefetch_quota=0
_db_file_noncontig_mblock_read_count=0
log_buffer=134217728
cpu_count=1
pga_aggregate_target=8G
Usually the Cost-Based Optimizer arrives at a reasonable execution plan if it gets the estimates regarding cardinality and data scattering / clustering right (if you want to learn more about that why not watch my Webinar available at "AllThingsOracle.com"?).
Here is an example I've recently come across where this wasn't the case: the optimizer clearly preferred plans with a significantly higher cost.
The setup to reproduce the issue is simple:
So we have two tables, and the smaller one basically serves as "security" information restricting the data from the larger table according to some user entitlement (apologies for the generic column and table names, which don't make this more obvious).
The query that I'm looking at is the following:
At first sight you might ask yourself why anyone would write a query like that, in particular with the duplicated aggregation step. The answer is of course that a human being very likely wouldn't write such a query, but this is what you get from some report generators based on generic metadata.
In principle the query is looking for a simple list of values from a table (in this case dimension tables that are browsed interactively), but restricted according to the "profile" of a particular user, here represented by the unique list of values of T2.ATTR3 restricted on T2.ATTR1 = 1.
There are a couple of design issues with the reproduced approach, but the query doesn't look too complex after all.
One specific issue of the query is that the MIN() aggregate allows Oracle a certain kind of flexibility in when to apply the aggregate function: it can be applied to duplicated data and still produce the correct answer, something that is not necessarily possible with other aggregate functions, like COUNT() for example.
One potential threat therefore comes from the fact that a query transformation of the IN clause into a kind of join can produce a huge intermediate result set, since ATTR3 in both tables has only ten distinct values. Usually this is not a problem, since the optimizer, provided it gets the estimates right, should recognize this pattern and calculate a correspondingly increased cost for such plans, and therefore prefer other execution plans that avoid this in some way, for example by using a DISTINCT / GROUP BY operation early, before a join, or simply by using a semi join operation.
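A toy illustration of why MIN() is forgiving here while COUNT() is not (made-up miniature tables run through SQLite from Python; the real tables from the post are not reproduced): the naive join duplicates each T1 row once per matching T2 row, and MIN() survives that duplication while a plain COUNT() does not:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t1 (id INTEGER, attr3 INTEGER);
    CREATE TABLE t2 (attr1 INTEGER, attr3 INTEGER);
    INSERT INTO t1 VALUES (10, 1), (20, 1), (30, 2);
    -- t2 holds duplicates: attr3 = 1 appears three times for attr1 = 1
    INSERT INTO t2 VALUES (1, 1), (1, 1), (1, 1), (1, 2);
""")

# IN-clause form: each t1 row is considered once
in_form = con.execute("""
    SELECT MIN(id), COUNT(*) FROM t1
    WHERE attr3 IN (SELECT attr3 FROM t2 WHERE attr1 = 1)
""").fetchone()

# Naive join form: t1 rows are duplicated per matching t2 row;
# MIN() still comes out right, COUNT(*) is inflated
join_form = con.execute("""
    SELECT MIN(t1.id), COUNT(*) FROM t1
    JOIN t2 ON t2.attr3 = t1.attr3 AND t2.attr1 = 1
""").fetchone()

print(in_form)    # (10, 3)
print(join_form)  # (10, 7)
```

With ten distinct values on each side of a many-to-many join, that row multiplication is exactly what blows the intermediate result up to hundreds of millions of rows in the real case.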
In fact, the query could be simplified to the following:
The database version in question was 11.2.0.1, and if you run the simplified query, you'll get the following execution plan:
The execution plan is reasonable and performs reasonably: the optimizer used the so-called "Group By Placement" transformation, which took one of the predicted approaches, applying the DISTINCT (transformed to a GROUP BY) as early as possible, and arrived at the following query after transformation and costing, taken from the optimizer trace file and slightly re-formatted for better readability:
which is, funnily enough, a bit similar to the original query mentioned initially.
If you however check the execution plan of the original query, you'll get the following:
This execution plan looks quite different, and it shows exactly the problem I've described: the optimizer has the option to postpone the aggregation until after the join due to the specifics of the query, since a MIN() aggregate can be applied late. The problem can be seen in the cardinality estimate of the join: a whopping 500 million rows, which are crunched down afterwards to 100 rows. This doesn't look like an efficient approach, and indeed at actual execution time a lot of CPU time (several minutes, depending on the CPU speed) is spent handling that huge amount of intermediate data. It is probably a bug that it doesn't at least use a simple semi join to avoid the duplicates.
When I first saw this problem I was pretty sure that this was a typical case of the impact of the WITH clause on query transformations in versions prior to 11.2.0.3, where that problem was finally addressed (to a very large extent, although not completely eliminated).
And indeed, when rewriting the original query to the corresponding one using inline views instead of Subquery Factoring / Common Table Expression:
the following execution plan will be generated:
which is almost identical to the execution plan of the simplified query. Notice also the huge difference in cost compared to the execution plan for the original query. A test on 11.2.0.3 showed the same execution plan without the rewrite to inline views, so the changes introduced in 11.2.0.3 address this issue.
I've advised therefore to disable the usage of the WITH clause in the report generator, which was a matter of a simple click in the tool.
More Query Variations
Up to that point I thought this could be filed as just another example of the already known Subquery Factoring / Query Transformation issues, but it wasn't over yet: the report generator managed to produce more variations of the query. I liked this one in particular, produced with the new setting disabling the WITH clause:
Whereas in the previous example an innocent "WHERE 1 = 1" predicate was added and simply ignored by the optimizer, this time a seemingly innocent "HAVING 1 = 1" was added, and this identified another "weak spot" in the optimizer code, because now the execution plan generated was again the HASH JOIN with the huge intermediate result.
In fact, when changing the simplified query by adding the redundant HAVING clause like this:
the bad execution plan can be reproduced, too. Even 11.2.0.3 doesn't fix this, so in this case one possible tactical workaround was to use the NO_UNNEST hint for the subquery: we knew the data, and the number of distinct keys used as input/output of the subquery was always small, so we could benefit a lot from filter subquery caching. Notice that changing the MIN() aggregate in the above query to a COUNT(), for example, will change the join to a semi join, avoiding the problem of the huge set of intermediate duplicated data.
The strategic solution in this case was of course to use a correct data model that prevented the duplicates in the "security" table in the first place. This avoids the huge intermediate result set even when the "good" query transformations are not applied.
Watch out if you upgrade to 11.2.0.3: potentially a lot of execution plans using the WITH clause might change, and as you know, not always for the better.
After taking a break from doing fix-its, Jack takes on a simple case for Christy Pickering. She is looking for dirt on her daughter's boyfriend, who is twice her daughter's age. Not Jack's normal work, but it sounds simple enough, until the first guy Christy hired for the job turns up dead. As usual, nothing is ever simple for Jack…
There is some very dodgy stuff going on in this book. I’m not going to go into it because it will be a spoiler and I don’t want the subject matter to appear on this blog. It’s not the sort of thing you want appearing when someone does a search on you or your blog.
At this point in the story it is less than a year away from the armageddon that keeps being foretold…
I know it’s a little sad, but I’m kinda enjoying discovering new little apps for Windows to do all those things that come out-of-the-box on Linux. Don’t get me wrong, I’m not loving Windows, but enjoying the discovery process itself.
On a Linux desktop you don't need to worry about this, as your desktop is already running an X server. To get GUIs on the remote server to display on the local desktop, you just need to connect to the remote server using "ssh -X username@hostname".
On Windows you need some additional software. There are plenty of paid-for options, like Exceed, but there are also some free ones. Many moons ago I used Cygwin. It's still around and still does the job just fine; remember to install the Cygwin/X libraries. Once it's installed, do the following:
It works OK, but the process feels a little clumsy.
Whilst Googling for alternatives I came across MobaXterm, which is very neat. It’s a self contained executable, like Putty.exe, and comes with a whole host of connectivity options, including automatically starting a local X Server for all SSH connections. No configuration required. No messing. Just start a GUI in the remote machine via SSH and it appears on your PC screen. Cool. I’ve now replaced Putty.exe (and PuttyCM), Cygwin and the occasional WinSCP with MobaXterm. Give it a try. I think you’ll like it.
A nice side benefit of Cygwin and MobaXterm is they give you access to lots of UNIX/Linux commands on Windows.
On Linux it's really easy: you just call it from the command line. See "man scp".
On Windows, I started out using WinSCP, which is OK, but it feels rather clumsy. Since trying MobaXterm I don't bother with WinSCP anymore, because connections to a machine using SSH automatically open an SFTP panel. Very handy.
There's no getting away from vi on Linux, and I'm happy to use it to edit the odd file, but I'm not enough of a fan to do anything large with it. Followers of the blog know I'm an UltraEdit fan. As far as free Windows editors go, Notepad++ is a pretty neat editor. Having said that, MobaXterm includes quite a good editor too, which is especially useful for editing large files on remote systems.
I guess a special mention should go out to PortableApps, which helped me through the first few days at work while I was struggling to get stuff installed on my PC. There’s some really cool stuff there.
Over the last couple of weeks I’ve bounced between a number of different Windows apps and things are finally starting to settle. I’d still prefer to be using Linux, but at least I’m functional now. Any suggestions for cool work-related apps welcome.