One of my recent customers was a company with a somewhat large warehouse (around 60TB) on Oracle 10gR2. The system was using RAC, though it was a fairly simple setup: two nodes, very large AIX LPARs, workload manually partitioned between them and somewhat evenly balanced. The most important demand of their business was a large number of reports that had to be generated every day from the warehouse. These reports were beginning to take most of the day and consume a large amount of resources… and the current forecast is for dramatic data growth later this year. So our project goal was to improve performance.
Our project was very successful – the key changes had dramatic effects and we are now running the same set of reports much faster. (Some reports went from many hours to less than 30 minutes.) My final project summary included a small section where I passed along a few suggestions specifically for these developers – but looking over the suggestions now, I think they might actually be good tips for a wider audience. At any rate it might be interesting to put them out for comments!
Of course these tips aren’t my own original work; in fact I think they’re pretty widely accepted in the Oracle professional community. We’ve been hearing these ideas for years now at conferences and user groups. I just picked a few that could make the biggest impact for this team right now – and I put the ideas in my own words. These were my top three:
I’m sure that these tips aren’t perfect – so please share your thoughts! How could I better word them? Do they apply to your development team?
I’m definitely no guru of Oracle performance yet, but I have been involved in a number of performance-oriented projects. In my experience, it’s pretty common for certain assumptions to be built into a project through scoping and goal statements before I arrive. Often, project sponsors have already decided that one of the most important items in a performance project is a review, by someone with experience in storage and system configuration for Oracle (usually me), of settings and statistics for things like database parameters, AWR reports, operating system kernel settings, networking, I/O, etc. The contractual “statement of work” might say something like “review and assess configuration and architecture” with a list of specific areas of configuration.
I’m not opposed to periodic configuration audits or to having a second set of eyes look over a setup. But this customer was a perfect example of why the assumption that performance tuning starts with a system review is not the best way to work. Because of the many parties involved in this contract and their various expectations, it was important for me to review the system configuration, so I did spend time on that review. However, I was lucky that the team and managers I worked with directly were sharp and open-minded. After some spirited discussion we slightly altered our approach: starting at the top of the stack and working down rather than the other way around.
Our first two weeks were focused on the most important business reports, and only during the last two weeks did I take an in-depth look at storage and system configuration. Can you guess what happened? Looking at the top business problems led us straight to the most important bottlenecks impacting reports across the board. It also showed me which areas of system configuration could have the biggest impact on the business. By the time I started the system review, I already knew which areas to focus on. Furthermore, I could already estimate the best-case impact of system changes on overall report run-times. Not surprisingly, no system change was going to give us the improvement we wanted, but tuning a few small pieces of widely used code gave us even more improvement than we had thought possible.
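As an aside, one concrete way to start that kind of top-down analysis is from the workload history itself rather than from parameter files. This is just a minimal sketch, not the exact query we used: it assumes AWR snapshots are available (which requires Diagnostics Pack licensing) and simply ranks recent SQL by total elapsed time, using the standard 10g history views:

```sql
-- Hedged sketch: find the SQL statements that consumed the most
-- elapsed time over roughly the last day of AWR snapshots.
-- The one-day window and top-10 cutoff are arbitrary illustrations.
SELECT *
  FROM (SELECT s.sql_id,
               SUM(s.elapsed_time_delta) / 1e6 AS elapsed_secs,
               SUM(s.executions_delta)         AS execs
          FROM dba_hist_sqlstat  s
          JOIN dba_hist_snapshot n
            ON n.snap_id         = s.snap_id
           AND n.dbid            = s.dbid
           AND n.instance_number = s.instance_number
         WHERE n.begin_interval_time > SYSDATE - 1
         GROUP BY s.sql_id
         ORDER BY SUM(s.elapsed_time_delta) DESC)
 WHERE ROWNUM <= 10;
```

A handful of sql_ids surfacing repeatedly across many reports is exactly the “small pieces of code used widely” situation described above.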
Again, this isn’t new… it’s stuff we’ve been hearing at conferences and user groups for years now. But it seems to me that it’s still worth repeating.