Oracle Open World 2001 in Berlin: The Truth (Finally)

It’s time that we admit it. We did horrible things at OOW in Berlin. We’ve not told anyone for all these years, but the pressure is building inside. So I’ve decided to come clean.

We had just started Miracle, so we were only about eight folks or so in total. So we all decided to go to the conference in Berlin. We rented two or three apartments and also invited our friends (customers) to stay with us.

We drove down there in a few cars and found out upon arrival that the apartments were empty except for the mattresses on the floor. Oh well, easier to find your way around.

I’m still not sure why Peter Gram or someone else decided to bring along our big office printer/scanner/copier, but the guys quickly set up the network, the printer and the laptops, and then we just sat around, worked on the laptops, drank beers and talked about all sorts of Oracle internals.

I went down to registration and got a badge, so that was good. Then someone (forget who) came up with the idea that we should simply copy my badge so the rest of the guys could get in for free.

It wasn’t because we didn’t have the money or anything. Oh no. It was just because it sounded stupid and a little risky. So that’s why you’ll find pictures here and there (including in my office) of the guys copying and modifying badges.

The biggest challenge was that the badges had an “Oracle-red” stripe at the bottom.

But Oracle Magazine had a special conference edition out which had a lot of “Oracle-red” on the front cover, so it was just a matter of using the scissors in the Swiss army knife.

It worked perfectly for the whole conference and we were very proud, of course.

It was also the conference where I was introduced to James Morle by Anjo Kolk, our old-time friend from Oracle. I had placed myself strategically in a café/bar between the two main halls in the conference center, which meant that everybody came walking by sooner or later. So I met lots of old friends that way. And a new friend named James Morle, who was in need of an assignment – and we had a customer in Germany who badly needed his skills, so he ended up working for Mobilcom for half a year or more.

So the next bad thing we did was to crash the Danish country dinner. Oracle Denmark might not have been too fond of us back then, because they thought too many of us had left in one go. Nevertheless, we thought it was not exactly stylish of them not to invite us to the Danish country dinner – as the only Danish participants.

Our friend (and future customer) Ivan Bajon from Simcorp stayed with us in the apartments and he was invited to the country dinner. So we found out where it was, snooped around a little, and then simply climbed a rather high fence and gate-crashed the dinner.

That was fun. The Oracle folks there were visibly nervous when we suddenly stormed in, but what could they do in front of all the customers, who very well knew who we were? So we sat down at the tables and had a good evening with all the other Danes there.

We had lots of fun during those few days in Berlin, had many political debates and beers, and went home smiling but tired.

To my knowledge we’ve not faked badges or gate-crashed country dinners since.

There have been a few suggestions since then that the badges we copied were actually free to begin with, but that can't possibly be. I strongly object to that idea.

RMOUG Training Days

RMOUG Training Days was once again a fantastic experience. It is an amazing value for the cost, and Denver is such a beautiful place to visit. The dehydration thing seriously affects me though ... you'd think I'd remember that and start chugging water before I'm 24 hours into it. Maybe next year I'll do better :) Highlights of the conference for me included: Tim Gorman's AWR session: I've used

Diving in Iceland, June 2009

It seems to everyone that I travel a lot. I guess I do compared to most people, but I enjoy traveling, seeing new places, new people, and old friends about as much as I enjoy anything. It’s usually part of my job anyway. So, with a once-in-a-lifetime chance to visit a place I’ve never been and may not have much reason or opportunity to visit again plus do some scuba diving, I couldn’t pass it up.

That’s right, in June 2009, I will visit Iceland and willfully plunge into the +2 C water that is the clearest body of water in the world. The reasons it is so clear have something to do with the fact that the water is the runoff from melting glaciers, filtered by volcanic rocks, and is very, very cold. It supports no wildlife (another reason it’s so clear/clean). Rumor has it that visibility is over 300 feet–that is something I really do have to see to believe.

The trip is being arranged by my friend Mogens Nørgaard who may very well be completely crazy. If you ever get a chance to meet and engage in conversation with him (a.k.a. “Moans Nogood”), do it. You won’t regret it, guaranteed.

The trip is highlighted on Iceland’s (probably only) dive shop website. Oh, I forgot to mention that the lake bottom is where two tectonic plates (the North American and Eurasian plates, to be precise) meet up (!), so you’re essentially diving on or in one of the continental divides.

Of course, I’m very excited about this trip and hope that Iceland can continue to function, as their economic issues seem to be a little worse than everyone else’s. In the small world department, I have made contact with an Iceland native that I worked with back at Tandem (acquired by Compaq -> HP) in the late 90s. Hopefully, I can meet up with Leifur while I’m in the country. There are only about 300,000 people in the whole country, so he shouldn’t be *that* hard to find. On the other hand, it is possible that “Leifur” is as common a name there as “John” is in the US. We’ll see.

forcedirectio: Another victim

I see it all the time: people using Best Practices and ending up in a big mess afterwards. This time it is the mount option forcedirectio. According to a NetApp best practice for Oracle, one should always use forcedirectio for the File Systems that store the Oracle Files. So people migrating to these systems read the white papers and best practices and then run into performance problems. A quick diagnosis shows that it is all related to IO. Of course the NAS is blamed and NetApp gets a bad reputation. It is not only NetApp; it is true for all vendors that advise you to use forcedirectio.

What does forcedirectio do?

It basically bypasses the File System buffer cache and because of that it uses a shorter and faster code path in the OS to get the I/O done. That is of course what we want; however, you are now no longer using the File System Buffer Cache. Depending on your OS and defaults, a large portion of your internal memory could be used as the FS Buffer Cache. Most DBAs don’t dare to set Oracle Buffer Caches bigger than 2-3 GB and don’t dare to use Raw Devices, so the FS cache is used very heavily. It is not uncommon to see that an Oracle Database uses 2 to 10 times more caching in the FS than in the Oracle Buffer Cache. I have seen a system that used 20 to 40 times more caching in the FS than in the Oracle Buffer Cache.

So just imagine what happens if one then bypasses the FS Buffer Cache.

IO performance: Oracle 8/6130 SAN/Veritas Volume Manager.

I had a look at a system here in the Netherlands for a company that was having severe I/O performance problems. They tried to switch to DirectIO (filesystemio_options=setall), but they discovered that the IO performance got worse. So they quickly turned it back to ASYNC. The System Admins told the DBAs several times that there were no OS or File System issues. However, they did recently migrate from one SAN to another with Volume Manager mirroring. And the local SAN was remotely mirrored, supposedly in asynchronous mode.

So then I had a look at the system. The system has 24 CPUs, 57GB of internal memory and 14 Oracle databases running on it. According to Statspack, the two most important databases did around 1700 I/Os per second. So I did a quick scan of the Physical Reads per SQL statement and found that some of the statements were missing indexes (based on Tapio's Rules of Engagement). I also noticed that the buffer caches were really small.

After adding some indexes and increasing the buffer cache(s) it was discovered that the writes were still a bit (understatement) slow. To get the focus off the database and more onto the system and OS, I decided to use the test program for writes from Jonathan Lewis. The tests performed were 1K, 10000 blocks sequential writes; 8K, 10000 blocks random writes; and 16K, 10000 blocks sequential writes. The tests were run on different mount points and some interesting things were observed. The tests also performed slowly on the database mount points. So the database was no longer the root of the problem; something else was. Now the system administrators had to pay attention.
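To give an idea of what such a write test looks like, here is a minimal sketch in Python. It is not Jonathan Lewis's program (that one is not reproduced here); the function names and the use of a temporary file per test are my own assumptions. It times the same three patterns (1K sequential, 8K random, 16K sequential) against whatever directory you point it at. Note that it writes through the normal buffered path and only calls fsync at the end, so the result also depends on how the file system is mounted.

```python
import os
import random
import tempfile
import time


def timed_writes(path, block_size, blocks, random_order=False, seed=0):
    """Write `blocks` blocks of `block_size` bytes to `path`; return seconds taken."""
    offsets = [i * block_size for i in range(blocks)]
    if random_order:
        # Same blocks, but visited in a shuffled (repeatable) order.
        random.Random(seed).shuffle(offsets)
    payload = b"x" * block_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        for offset in offsets:
            f.seek(offset)
            f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to storage
    return time.perf_counter() - start


def run_suite(directory, blocks=10000):
    """Run the three write patterns from the text inside `directory`."""
    tests = [
        ("1K sequential", 1024, False),
        ("8K random", 8192, True),
        ("16K sequential", 16384, False),
    ]
    results = {}
    for name, block_size, random_order in tests:
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        try:
            results[name] = timed_writes(path, block_size, blocks, random_order)
        finally:
            os.remove(path)  # do not leave test files on the mount point
    return results
```

Running run_suite once per mount point and comparing the timings side by side is enough to show whether a slow result follows the database or follows the storage.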

Google Sync for Windows Mobile for Contacts and Calendar

Just started to use this new Google feature and installed it for my Google Apps domain, and it works great. I can now sync my Contacts and Google Calendar automatically. It uses the Active Sync utility from Windows Mobile and you have to enable this feature for your Google Apps Domain.

Now the only thing left to do is the Tasks list from the mail view and I am all set.

Missing. The. Point....

I probably buy 80 to 90% of my non-grocery items online.  Furniture, pictures, gifts, TV's, books, kitchen stuff - whatever I can - all online.  I hate the "store" experience.  Before I go into an actual physical store I usually know exactly what I want - buy it and leave.  It took me about 5 minutes to buy shoes this weekend :)

I buy online for the convenience - and the experience is fairly similar regardless where you shop.  You typically have to create "that account" (even if you never intend to shop there again..) and you get that form to opt in or out of mailing.  They almost always default to "opt in" and I invariably set it to "opt out"

I just bought some shelves while sitting here in King of Prussia, PA (I live in VA, another benefit of shopping online, just do it when/where-ever you want)... I received two emails.  Email 1 - my receipt (great).  Email 2, well, it was in response to me opting out:

While registering as a shopper with, you chose not to receive our promotional Email. This is being sent to confirm that will not receive Email from

The decision to receive Email is personal and can be influenced for a variety of reasons. In an attempt to better understand and respond to our customers, we would appreciate it if you would answer a short survey on this topic.

That just strikes me as "missing the point" :)

Can you imagine what my survey comment field might have contained.... The survey did contain

We value your feedback and encourage you to give us candid answers. Are there any comments you would like to make to xxxx? (Note: Response is limited to 250 characters)

250 characters.  I shall have to choose my words carefully...  I should have it written in Kanji to see if they support multi-byte and truly support 250 characters.  Or if it is really 250 bytes.
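A quick way to see the difference between the two, sketched in Python. It assumes the text ends up stored as UTF-8, which I obviously don't know about their survey system; the function name is just for illustration.

```python
def fits(text, limit=250):
    """Return (fits_as_characters, fits_as_utf8_bytes) for the given limit."""
    chars = len(text)                   # number of Unicode code points
    nbytes = len(text.encode("utf-8"))  # number of bytes once encoded as UTF-8
    return chars <= limit, nbytes <= limit


comment = "漢" * 250  # 250 kanji; each one takes 3 bytes in UTF-8
print(len(comment), len(comment.encode("utf-8")))  # 250 750
print(fits(comment))                               # (True, False)
```

So a field that is "250 characters" in the form but 250 bytes in the table would cut such a comment off after the 83rd kanji or so.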

Using Amazon Cloudfront

Last year, a customer was running into trouble with static product images on their website. The images were stored in an OCFS2 Filesystem and every so often OCFS2 would hang and the site slowly became unresponsive. We had talked a number of times about using a Content Delivery Network (CDN) to also improve the download streams to the client. The number of concurrent downloads from a domain is limited and differs per browser. So increasing the number of domains for your site will help to improve the concurrent download. While we were discussing this (and after another problem with OCFS2) Amazon AWS sent out an email about the availability of Cloudfront. Here you can store static content and it will be cached on different servers around the world that are the closest to the browsers that request content from them. So we decided to implement this.
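The "more domains" idea can be sketched as follows. This is only an illustration of spreading static files over several host names, not what was built for this customer; the host names are made up and the function names are hypothetical. The important detail is that a given image always maps to the same host, otherwise browsers and caches would fetch the same file several times under different URLs.

```python
import zlib


def shard_host(path, hosts):
    """Pick a host for `path` deterministically (same path, same host)."""
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return hosts[index]


def image_url(path, hosts):
    """Build the full URL for a static file on its assigned host."""
    return "http://" + shard_host(path, hosts) + "/" + path.lstrip("/")


# Hypothetical host names, for illustration only.
HOSTS = ["img1.example.com", "img2.example.com", "img3.example.com"]

print(image_url("/products/123.jpg", HOSTS))
```

With a CDN in front, each of these host names could simply point at the same distribution; the browser only cares that the names differ.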

Step 1 was to sign up for the Amazon AWS service. Then we had to create an S3 bucket and upload the static content. There you run into one of the performance issues with the Cloud: uploading large amounts of data to the S3 (European) bucket is limited by your internet UPLOAD speed. Most people use an ADSL connection with high DOWNLOAD speeds but lower UPLOAD speeds. So uploading 30 GB of data will take some time (depending on your upload speed). An S3 bucket is basically a raw datastore. You have to tell it which objects in that datastore are directories or files. Uploading the data was done with the JetS3t program. This has a command-line interface (CLI) and is written in Java so it can run on different platforms. After 3 days (a weekend) all the data was online and available from Cloudfront. We implemented a monitor service with ZABBIX to see the performance and availability of the Cloudfront service. We noticed a couple of performance degradations, but the service has not been down: it has been available 100%.
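To get a feeling for how long such an upload takes, a back-of-the-envelope calculation helps. The sketch below ignores protocol overhead and assumes the uplink is fully available; the 1 Mbit/s figure is an assumed example, not the speed of the line we actually used.

```python
def upload_hours(size_gb, upload_mbit_per_s):
    """Hours needed to upload size_gb (decimal GB) over an uplink of the given speed."""
    bits = size_gb * 1e9 * 8                     # total amount of data in bits
    seconds = bits / (upload_mbit_per_s * 1e6)   # uplink speed in bits per second
    return seconds / 3600


print(round(upload_hours(30, 1), 1))   # 66.7 hours at 1 Mbit/s
print(round(upload_hours(30, 10), 1))  # 6.7 hours at 10 Mbit/s
```

At 1 Mbit/s, 30 GB comes to almost three days of continuous uploading, so a weekend is about right for a typical ADSL uplink.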

So was it worth all the effort that was put into it? The reason we did it was that OCFS2 seemed to have a performance and stability problem. Well, 2 months later we discovered the real reason for the problem. There was a firewall between the Read-Only nodes and the one Read-Write node of the OCFS2 cluster. This firewall was doing some housekeeping and lost control of that. So certain connections and messages between the OCFS2 nodes got lost. That caused hangs and performance degradations. So the reason for switching has been fixed, but we haven’t switched back to the old implementation.

This Amazon Cloudfront service turned out to be a nice solution to serve static content. 

More to follow later.

Shared Pool Latch Contention

Recently I was looking at a system that had some shared pool instability. Once a week during the day it would start flushing and loading the objects back in when needed. This resulted in large library cache pin and library cache load lock waits. That problem was attacked with some simple changes. One of the problems was that this customer had changed the reserved size minimal alloc (hidden parameter) from the default 5120 bytes to 51200 bytes. As a result the (large) reserved shared pool wasn’t used.

Another strange problem was that every hour there was a spike in shared pool latch waits. It turned out that a DBA had built a script to check the shared pool (queries against x$ksmsp) and that caused some problems. While querying this view, Oracle needs to hold the shared pool latch. So if there are many small pieces that need to be checked, the query can hold this latch for a long time. When we killed the script, the spikes also disappeared.

Stability is your friend

Oracle and others, like Microsoft, are putting more and more automatic and self-everything features into their databases. There are of course many reasons why that makes sense (for Oracle and Microsoft), but does it make sense for all Oracle Systems and their DBAs? I don’t think so. Consider this:

All these automatic and self-tuning features will manage resources and make decisions that can and will change the behavior of your system. Now consider that you are the DBA of a mission critical Oracle system. Do you want a system that runs well enough and is stable, or do you want a system that sometimes runs perfectly and sometimes runs badly? Let me know.