I've built up quite a blogging backlog that I hope to clear a little over the next week, but I saw something today that's more important than any of the posts to come, so I wanted to blog about it quickly first.
John Scott, a pillar of the Oracle Apex community, came up with a great idea. There'll be no wild emotion round here because I didn't know any of the people involved well, but I can wholeheartedly recommend that people buy that book. Not only can I be fairly certain that it'll be a high-quality book written by people who are knowledgeable and passionate about Apex, but it's going to help out the families of a couple of good friends of the Apex community. Even my miserable and cloudy personality is struggling to see a downside there.
I tip my hat to the authors and hope as many people as possible place an order.
Nice idea John!
I am pleased to announce that we, the program committee members, have just finalized the program for the 6th Planboard DBA Symposium, to be held on May 17, 2011 in Amstelveen. There are 10 presentations scheduled in 2 parallel tracks by Frits Hoogland, Johan van Veen, Iloon Ellen-Wolff, Paul der Kinderen, Wouter Wethmar, Alex Nuijten, Gert […]
I had Virgin Media installed on 9th March and I am completely happy with the cable broadband service, but they were unable to take on my phone line because of capacity issues. That resulted in a complete mess over the billing of my service because of the way Virgin Media process discounts on bundles. I contacted customer services and, after a lot of messing about, the billing situation was resolved to my satisfaction.
A few weeks later I was notified of my first bill and it was completely wrong. I got in touch with customer services and, after a very long time, the issue seemed to be resolved AGAIN, but I was told I would be contacted by midday the same day for confirmation by a manager. It is now 6 days later and I've not received a call. In that time I've been phoning repeatedly, trying to get it resolved.
I've just got off the phone and I've been assured it is sorted now, but I know it will all be a steaming pile of poo next month when the next bill arrives.
Each person I've talked to has been fine, but the whole process and the systems involved are terrible. It stinks. In no way can this be called good customer service. It is a complete nightmare.
Sometimes bad customer service is because the people answering the phones are morons. Sometimes it is because the procedures and systems they are working with are idiotic. It comes to something when the person on the other end of the phone is unable to tell you how much you will be charged because the systems they are using are so convoluted with a variety of possible discounts that may or may not be applied…
So in conclusion, Virgin Media customer services is a festering pile of donkey crap…
I always take notice when Oracle introduces a new “FAST” option, so it was with some excitement that I first noticed in Oracle 11g Release 1 a new FAST option when running the ANALYZE TABLE CASCADE VALIDATE STRUCTURE command. This was described in the manuals as introducing a hashing scheme that was significantly [...]
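For reference, this is roughly what the two variants look like; the table name t1 is my own placeholder, not one from the excerpt:

```sql
-- Traditional full validation of a table and its indexes
-- (row-by-row cross-check, can be very slow on large tables):
analyze table t1 validate structure cascade;

-- 11g Release 1 onwards: the FAST keyword requests the quicker,
-- hash-based cross-check described in the manuals:
analyze table t1 validate structure cascade fast;
```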
I'm very excited to announce this somewhat old news — we have a brand new Oracle ACE at Pythian — Yury Velikanov. He is a fantastic addition to Pythian's team of ACEs and ACE Directors — Fahd Mirza, Gwen Shapira, Jared Still, Christo Kutrovsky and myself. If you want to know more about the Oracle ACE Program, the latest issue of Oracle Magazine has an article written by fellow Oracle ACE Mike Riley — Everything Is Coming Up ACEs!.
I've known Yury for a while, as I met him online on Oracle-L many years ago. The world is small, and after living on different continents (North America and Europe), providence put us together on a third continent — Australia. As I was building our Australian business in Sydney, it took me a year to convince Yury to join Pythian. But I tell you, it was worth the effort!
Yury is one of our top Senior Oracle Apps DBAs, and he is also extremely capable when it comes to Oracle core database technology and middleware, which you would expect from a very good Apps DBA. A decade ago, Yury was already fusing Oracle E-Business Suite with Oracle RAC when most people couldn't even afford to think about it. In 2003, Yury became one of the first five Oracle Certified Masters in Europe. In other words, Yury is a very talented individual with top-notch Oracle EBS and Oracle database skills.
As you probably know, excellent technical skills are not the only quality required of an Oracle ACE. Oracle ACEs are distinguished for their contributions to the community, and Yury has always been an enthusiastic Oracle expert willing to both share his knowledge and learn from others. As a result, he is an active contributor to the Oracle community, regionally and globally.
Yury founded an Oracle E-Business Suite related mailing list several years ago to help Oracle Apps DBAs around the world exchange their experiences and help each other. In Australia, Yury has become the organizer of the Sydney Oracle Meetup – an informal group of Oracle professionals meeting regularly face to face.
Yury also has a long list of conferences he has presented at, including the Hotsos Symposium, UKOUG, AUSOUG and others. Recently, Yury has also started blogging on the Pythian Blog and is planning to do more and more.
I would like to note that Yury is a person who never refuses to help — he will lose sleep and work 42 hours per day if needed, but he won't step away if asked for help. I've been there myself, and I know that he is a very approachable individual who will do whatever it takes.
After all this, how could I not nominate him as an Oracle ACE? Right. There was no way! Now, a few months later, Yury has happily joined our Oracle ACE team at Pythian. It actually happened a few weeks ago; it just took me this long to write about it!
Ok, it’s official – the first and only Oracle Troubleshooting TV show is live now!
The first show is almost 2 hours long, covering ORA-4031 errors and shared pool hacking. It's a recording of the US/EMEA-timezone online hacking session I did a few days ago.
There are a couple of things to note:
View the embedded video below or go to my official Oracle Troubleshooting TV show channel:
Yesterday I introduced a little framework I use to avoid the traps inherent in writing PL/SQL loops when modelling a session that does lots of simple calls to the database. I decided to publish the framework because I had recently come across an example where a series of SQL statements gives a very different result from a single PL/SQL block.
The model starts with a simple data set – which in this case is created in a tablespace using ASSM (automatic segment space management), an 8KB block size and 1MB uniform extents (in a locally managed tablespace).
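If you want to reproduce the environment, a tablespace along these lines would match the description; the datafile name and size are my assumptions, and the blocksize clause is only needed if your db_block_size is not already 8KB (which also requires a matching db_8k_cache_size):

```sql
-- Locally managed, 1MB uniform extents, ASSM, 8KB blocks
-- (file name and size are illustrative assumptions):
create tablespace test_8k_assm
datafile '/u01/oradata/test_8k_assm_01.dbf' size 200m
blocksize 8k
extent management local uniform size 1m
segment space management auto
;
```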
create table t1
tablespace test_8k_assm
as
select
        trunc((rownum-1)/100)   n1,
        lpad('x',40)            v1,
        rpad('x',100)           padding
from
        dual
connect by
        rownum <= 20000
;

create index t1_i1 on t1(n1, v1)
tablespace test_8k_assm
;

validate index t1_i1;
execute print_table('select * from index_stats');
You can see that the n1 column is defined to have 100 rows for each of 200 different values, and that each set of one hundred rows is stored (at least initially) in a very small cluster of blocks.
With the data set in place I am now going to pick one of the 200 key values at random, delete its set of one hundred rows, re-insert them, and commit; and I'm going to repeat that process 1,000 times.
declare
        rand    number(3);
begin
        for i in 1..1000 loop

                rand := trunc(dbms_random.value(0,200));

                delete from t1
                where n1 = rand
                ;

                insert into t1
                select
                        rand, lpad('x',40), rpad('x',100)
                from
                        dual
                connect by
                        rownum <= 100
                ;

                commit;

        end loop;
end;
/

validate index t1_i1;
execute print_table('select * from index_stats');
You might think that this piece of code is a little strange – but it is a model of some processing that I've recently seen on a client site, and it has crossed my mind that it might appear in a number of systems, hidden under the covers of dbms_job. So what does it do to the index?
Given the delay that usually appears between the time an index entry is marked as deleted and the time that the space can be reused, and given the way I've engineered my data so that the space needed for the 100 rows for each key value is little more than a block (an important feature of this case), I wouldn't be too surprised if the index had stabilised at nearly twice its original size. But that's not what happened to my example running under ASSM. Here are the “before” and “after” results from my test:
                          Before      After
                       ---------  ---------
LF_ROWS                   20,000     70,327
LF_BLKS                      156        811
LF_ROWS_LEN            1,109,800  3,877,785
BR_ROWS                      155        810
BR_BLKS                        3         10
BR_ROWS_LEN                8,903     45,732
DEL_LF_ROWS                    0     50,327
DEL_LF_ROWS_LEN                0  2,767,985
DISTINCT_KEYS                200        190
MOST_REPEATED_KEY            100      1,685
BTREE_SPACE            1,272,096  6,568,320
USED_SPACE             1,118,703  3,923,517
PCT_USED                      88         60
ROWS_PER_KEY                 100        370
BLKS_GETS_PER_ACCESS          54        189
It's a small disaster – our index has grown in size by a factor of about five, and we have more deleted rows than “real” rows. (Note, by the way, that the awfulness of the index is NOT really indicated by the PCT_USED figure – which is often suggested as an indicator of the state of an index.)
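A more revealing indicator in this case is the fraction of index entries that are flagged as deleted; a sketch of how you might read that straight from index_stats (remember that validate index populates the view only for your own session, and the figures are from my test, not guaranteed for yours):

```sql
-- Populate index_stats for the current session, then compare
-- deleted entries with the total:
validate index t1_i1;

select
        lf_rows,
        del_lf_rows,
        round(100 * del_lf_rows / lf_rows) pct_deleted
from
        index_stats
;
```

With the figures above this would report roughly 72% of the entries as deleted, even though PCT_USED still shows a respectable-looking 60.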
Unfortunately this is the type of problem that doesn’t surprise me when using ASSM; it’s supposed to help with highly concurrent OLTP activity (typified by a large number of very small transactions) but runs into problems updating free space bitmaps whenever you get into “batch-like” activity.
However, there is a special consideration in play here – I've run the entire operation as a single pl/sql loop. Would the same problem appear if I ran each delete/insert cycle as a completely independent SQL script, using the “start_1000.sql” script from my previous note?
To test the effect of running 1,000 separate tasks, rather than executing a single pl/sql loop, I wrote the following code into the start_1.sql script that I described in the article before running start_1000.sql:
declare
        rand    number(3);
begin
        rand := trunc(dbms_random.value(0,200));

        delete from t1
        where n1 = rand
        ;

        insert into t1
        select
                rand, lpad('x',40), rpad('x',100)
        from
                dual
        connect by
                rownum <= 100
        ;

        commit;
end;
/
The impact was dramatically different. (Still very wasteful, but quite a lot closer to the scale of the results that you might expect from freelist management).
                          Before      After
                       ---------  ---------
LF_ROWS                   20,000     39,571
LF_BLKS                      156        479
LF_ROWS_LEN            1,109,800  2,196,047
BR_ROWS                      155        478
BR_BLKS                        3          6
BR_ROWS_LEN                8,903     26,654
DEL_LF_ROWS                    0     19,571
DEL_LF_ROWS_LEN                0  1,086,247
DISTINCT_KEYS                200        199
MOST_REPEATED_KEY            100        422
BTREE_SPACE            1,272,096  3,880,192
USED_SPACE             1,118,703  2,222,701
PCT_USED                      88         58
ROWS_PER_KEY                 100        199
BLKS_GETS_PER_ACCESS          54        102
I haven’t yet investigated why the pl/sql loop should have produced such a damaging effect – although I suspect that it might be a side effect of the pinning of bitmap blocks (amongst others, of course) that takes place within a single database call. It’s possible that the repeated database calls from SQL*Plus keep “rediscovering” bitmap blocks that show free space while the pinning effects stop the pl/sql from “going back” to bitmap blocks that have recently acquired free space.
Interestingly the impact of using ASSM was dramatically reduced if one object used freelists and the other used ASSM – and with my specific example the combination of a freelist table with an ASSM index even did better than the expected 50% usage from the “traditional” option of using freelists for both the table and index.
Note – the purpose of this note is NOT to suggest that you should avoid using ASSM in general; but if you can identify code in your system that is doing something similar to the model then it’s worth checking the related indexes (see my index efficiency note) to see if any of them are displaying the same problem as this test case. If they are you may want to do one of two things: think about a schedule for coalescing or even rebuilding problem indexes on a regular basis, or see if you can move the table, index, or both, into a tablespace using freelist management.
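For anyone unfamiliar with the two remedies mentioned, these are the basic commands involved; the freelist tablespace name is a hypothetical placeholder:

```sql
-- Option 1: reclaim space from deleted entries in place,
-- without rebuilding the whole index:
alter index t1_i1 coalesce;

-- Option 2: rebuild the index, optionally relocating it to a
-- tablespace using freelist management (name is my assumption):
alter index t1_i1 rebuild tablespace test_8k_fl;
```

Coalescing is cheaper and doesn't need extra space for a second copy of the index, but a rebuild gives you the chance to move the segment out of the ASSM tablespace entirely.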
I know I have mentioned it before, but I am a big fan of the OEM performance screens that are derived from the ASH/AWR information. One of the things I really like about them is the immediate indication they give you, in one glance, that things are “not normal”. Once you notice that things are not normal you can then, within a few seconds, get a feel for what is still probably OK and where you have something that has changed.
As an example of the immediate information, I recently came back to my desk and glanced at my OEM performance screen. It was showing the below:
“A data load has just run,” I said to my comrade-in-arms. “Which one?” he asked. “The Delta – it ran the quick plan. But it started a bit late, 12:15. Oh, and it looks like the transaction view code has swapped back to the full table scan plan, and the summary code is not playing up at the moment.”
“you’re turning into Neo you are – can you see a lady in a red dress???” he asked.
That was of course a reference to the “Matrix” films, where at times you see the virtual world displayed on a screen as a stream of characters running down it – but once you get used to it you can apparently “see” what is going on.
The screen shot above is not actually a very good example of what the performance screens can show you. One of my minor complaints about them is that the y-axis scales to the larger of the highest peak of activity or the number of sessions matching the number of CPUs (real or fake) available to you. So if you have more CPU available than you need, you can't see much detail in the graph. And if you have had a nasty peak of activity, again, all detail is squeezed out. In my case, the box is sized to cope with the load expected in 12 months' time and the system is new, so activity is scuttling along the bottom of the graph.
However, “poor” though the example is, it told me what was going on across my system at a glance, something about the major tasks we are running, that one problem is currently occurring and that several of the other issues I need to keep an eye out for are not occurring.
That is why I love these screens – I recognise “my” activity patterns from the graph, and I now recognise the SQL IDs for my key statements. If I see a pattern in the graph I don't recognise, I need to check things out immediately. Three or four times over the last 2 weeks I have spotted an issue, started investigating and found the cause before the Operations desk had even noticed a problem.
Oh, and what is SQL type 189? It is a merge statement. Our implementation of OEM is a little old and does not correctly interpret that SQL command type. It might be a little old, but it is still very useful.
March 30, 2011 If you take a look at the Oracle Database Performance Tuning Guide, you will find the following SQL statement to help identify a session waiting on an enqueue and the session that is currently preventing it from continuing: SELECT DECODE(request,0,'Holder: ','Waiter: ') || sid sess, id1, id2, lmode, request, type [...]