hardware

Will the Single Box System make a Comeback?

For about 12 months now I’ve been saying to people(*) that I think the single box server is going to make a comeback and nearly all businesses won’t need the awful complexity that comes with the current clustered/exadata/RAC/SAN solutions.

Now, this blog post is more a line-in-the-sand and not a well researched or even thought out white paper – so forgive me the obvious mistakes that everyone makes when they make a first draft of their argument and before they check their basic facts, it’s the principle that I want to lay down.

Friday Philosophy – Oracle Performance Silver Bullet


Silver Cartridge and Bullet

For as long as I have been working with Oracle technology {which is now getting towards 2 decades and isn’t that pause for thought} there has been a constant search for Performance Silver Bullets – some trick or change or special init.ora parameter {alter system set go_faster_flag='Y'} you can set to give you a guaranteed boost in performance. For all that time there has been only one.

There are a few performance Bronze Bullets…maybe Copper Bullets. The problem is, though, that the Oracle database is a complex piece of software and what is good for one situation is terrible for another. Often this is not even a case of “good 90% of the time, indifferent 9% of the time and tragic 1% of the time”. Usually it is more like 50%:30%:20%.


Cartridge with copper bullet & spent round

I’ve just been unfair to Oracle software actually, a lot of the problem is not with the complexity of Oracle, it is with the complexity of what you are doing with Oracle. There are the two extremes of OnLine Transaction Processing (lots of short running, concurrent, simple transactions you want to run very quickly by many users) and Data Warehouse where you want to process a vast amount of data by only a small number of users. You may well want to set certain initialisation parameters to favour quick response time (OLTP) or fastest processing time to completion (DW). Favouring one usually means a negative impact on the other. Many systems have both requirements in one… In between that there are the dozens and dozens of special cases and extremes that I have seen and I am just one guy. People get their database applications to do some weird stuff.

Partitioning is a bronze bullet. For many systems, partitioning the biggest tables makes them easier to manage, allows some queries to run faster and aids parallel activity. But sometimes (more often than you might think) partitioning can reduce rather than increase query or DML performance. In earlier versions of Oracle setting optimizer_index_caching and optimizer_index_cost_adj was often beneficial and in Oracle 9/8/7 setting db_file_multiblock_read_count “higher” was good for DWs… Go back to Oracle 7 and doing stuff to increase the buffer cache hit ratio towards 98% was generally good {and I will not respond to any comments citing Connor’s magnificent “choose your BCHR and I’ll achieve it” script}.
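
{For anyone who has not met the buffer cache hit ratio, here is a minimal sketch of the usual textbook arithmetic, in Python rather than SQL so it runs anywhere. The function name and the sample figures are made up for illustration. As I understand it, the catch is that piling on pointless logical reads pushes the ratio up without the system doing any less physical work, which is why a high BCHR on its own proves little.}

```python
def buffer_cache_hit_ratio(logical_reads, physical_reads):
    """Classic BCHR: the fraction of block requests served without a disk read.

    logical_reads  - total block requests (db block gets + consistent gets)
    physical_reads - block requests that had to go to disk
    """
    if logical_reads <= 0:
        raise ValueError("logical_reads must be positive")
    return 1.0 - physical_reads / logical_reads


# Hypothetical workload: 1,000,000 logical reads, 100,000 of them physical.
before = buffer_cache_hit_ratio(1_000_000, 100_000)

# Same physical reads, plus 4,000,000 extra (useless) cached block visits.
# The disks are exactly as busy as before, yet the ratio looks much better.
after = buffer_cache_hit_ratio(5_000_000, 100_000)

print(f"before: {before:.2%}, after: {after:.2%}")
```
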
You know what? There was an old trick in Oracle 7 you could maybe still look at as a bronze bullet. Put your online redo logs and key index tablespaces on the fastest storage you have and split your indexes/tables/partitions across the faster/slower storage as you see fit. Is all your storage the same speed? Go buy some SSD and now it isn’t….


Cartridge with Wooden Bullet

Then there are bronze bullets that you can use that very often improve performance but the impact can be catastrophic {Let’s call them wooden bullets :-) }. Like running your database in noarchivelog mode. That can speed up a lot of things, but if you find yourself in the situation of needing to do a recovery and your last cold backup is not recent enough – catastrophe. A less serious but more common version of this is doing things nologging. “Oh, we can just re-do that after a recovery”. Have you done a test recovery that involved that “oh, we can just do it” step? And will you remember it when you have a real recovery situation and the pressure is on? Once you have one of these steps, you often end up with many of them. Will you remember them all?

How many of you have looked at ALTER SYSTEM SET COMMIT_WRITE='BATCH,NOWAIT'? It could speed up response times and general performance on your busy OLTP system. And it could lose you data on crash recovery. Don’t even think about using this one unless you have read up on the feature, tested it, tested it again and then sat and worried about what could possibly go wrong for a good while.

That last point is maybe at the core of all these Performance Bronze Bullets. Each of these things may or may not work but you have to understand why and you have to understand what the payback is. What could now take longer or what functionality have I now lost? {hint, it is often recovery or scalability}.

So, what was that one Silver Bullet I tantalizingly left hanging out for all you people to wait for? You are not going to like this…

Look at what your application is doing and look at the very best that your hardware can do. Do you want 10,000 IOPS and your storage consists of fewer than 56 spindles? Forget it, your hardware cannot do it. No matter what you tune or tweak or fiddle with. The one and only Performance Silver Bullet is to look at your system and your hardware configuration and work out what is being asked and what can possibly be delivered. Now you can look at:

  • What is being asked of it. Do you need to do all of that (and that might involve turning some functionality off, if it is a massive drain and does very little to support your business)?
  • Are you doing stuff that really is not needed, like management reports that no one has looked at in the last 12 months?
  • Is your system doing a heck of a lot to achieve a remarkably small amount? Like several hundred buffer gets for a single indexed row? That could be a failure to do partition exclusion.
  • Could you do something with physical data positioning to speed things up, like my current blogging obsession with IOTs?
  • You can also look at what part of your hardware is slowing things down. Usually it is spindle count/RAID level, i.e. something dropping your IOPS. Ignore all sales blurb from vendors and do some real-world tests that match what your app does or wants to do.
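
The “can the hardware possibly do it?” check is simple arithmetic, so here is a rough sketch of it in Python. Everything numeric in it is an assumption for illustration: roughly 180 IOPS per spindle (which is about what 10,000 IOPS over 56 spindles works out to), and the usual rule-of-thumb RAID write penalties. The function names are hypothetical, and real arrays with caches and SSD will behave differently, so treat this as a first sanity check and not a substitute for real-world tests.

```python
import math

# Rule-of-thumb back-end I/Os per host write. Assumed figures, not measurements.
RAID_WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}


def backend_cost_per_host_io(raid_level, write_fraction):
    """Average back-end I/Os generated by one host I/O for a read/write mix."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    penalty = RAID_WRITE_PENALTY[raid_level]
    # A host read costs one back-end I/O; a host write costs `penalty` of them.
    return (1.0 - write_fraction) + write_fraction * penalty


def max_host_iops(spindles, iops_per_spindle, raid_level, write_fraction):
    """Upper bound on the IOPS the hosts can see from this many spindles."""
    raw = spindles * iops_per_spindle
    return raw / backend_cost_per_host_io(raid_level, write_fraction)


def spindles_needed(target_iops, iops_per_spindle, raid_level, write_fraction):
    """Smallest spindle count that could, at best, deliver target_iops."""
    backend = target_iops * backend_cost_per_host_io(raid_level, write_fraction)
    return math.ceil(backend / iops_per_spindle)


# Pure reads, no RAID overhead: the best case for 10,000 IOPS.
best_case = spindles_needed(10_000, 180, "raid0", 0.0)

# The same target with 30% writes on RAID 5 needs far more spindles.
raid5_case = spindles_needed(10_000, 180, "raid5", 0.3)

print(best_case, raid5_case)
```

The point of the second figure is that the RAID level and the write mix move the answer a long way, so the spindle count on its own tells you only the ceiling.
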

It’s hard work but it is possibly the only Silver Bullet out there. Time to roll up our sleeves and get cracking…

{Many Thanks to Kevin Closson for providing all the pictures – except the Silver Bullet, which he only went and identified in his comment!}

Fastest £1,000 Server – back from supplier

At the risk of turning my Blog into some sort of half-way-house tweet update thing (correct, I’ve never logged into twitter), as a couple of people asked about the outcome with the broken £1,000 server, I’m happy to report it came back this week. The motherboard had died. I’d convinced myself it was the PSU when I trawled the net as it seems to be one of those things that is most likely to die having fired up in the first place, but no, the motherboard. I guess some solder “dried” or the pc pixies just don’t like me. One month turnaround is not very impressive…

They had another motherboard exactly the same in stock so I got a like-for-like swap. I was kind of hoping for a different one with more SATA3 and USB3 headers :-)

Now I’m trying to download the latest Oracle 11 for 64 bit Windows. I live out in the wilds of North Essex (for non-UK people, this is all of 62 kilometres North-Northeast of London as the crow flies, so not exactly in an obscure and remote part of the UK! For those who DO know the UK, it is nothing like “the only way is Essex” out here. We have trees, fields, wildlife and a lack of youth culture.) As such, my broadband connection is sloooow. The connection keeps breaking and I lose the download. *tsch*. I’m sure I had a download manager somewhere which got around these issues…


Fastest £1,000 server – what happened?

A couple of people have asked me recently what happened to that “fastest Oracle server for a grand” idea I had last year, after all I did announce I had bought the machine.

{Update – it came back.}
Well, a couple of things happened. Firstly, what was a small job for a client turned into a much more demanding one – not so much mentally harder as more time-consuming, and very time-consuming it was. So the playing had to go on hold, the client comes first. The server sat in the corner of the study, nagging me to play with it, but it remained powered down.
Secondly, when the work life quietened down last month and I decided to spend a weekend getting that server set up I hit an issue. I turned on the server and it turned itself straight off. It then rested for 5 seconds and turned itself back on for half a second – and then straight off. It would cycle like that for as long as I was willing to let it.

OK, duff power switch, motherboard fault, something not plugged in right, PSU not reaching stable voltage… I opened the case and checked everything was plugged in OK and found the manufacturer had covered everything with that soft resin to hold things in place. I pressed on all the cards etc in hope but no, it was probably going to have to go back. It is still in warranty, the manufacturer can fix it.

So I rang the manufacturer and had the conversation. They were not willing to try and diagnose over the phone so I had to agree to ship it back to them to be fixed {I did not go for on-site support as the only time I did, with Evesham Micros, they utterly refused to come out to fix the problem. Mind you, it turns out they were counting down the last week or two before going bust and, I suspect, knew this}. I shipped it back and the waiting began. Emails ignored, hard to get in touch over the phone. Over three weeks on and they only started looking at the machine last Friday (they claim).

On the positive side, this delay means that solid state storage is becoming very affordable and I might be able to do some more interesting things within my budget.
On the bad side the technology has moved on and I could get a better server for the same money now, but that is always the case. Mine does not have the latest Sandy Bridge Intel processor for example. Also, I have time now to work on it; I hope not to have time next month as I’d like to find some clients to employ me for a bit!

I better go chase the manufacturer. If it is not fixed and on its way back very, very soon then they will be off my list of suppliers and I’ll be letting everyone know how good their support isn’t.

Sane SAN


SANE SAN
The Random Acronym Seminar (or RAS for short...)
James Morle, Scale Abilities, Ltd.

Introduction
This paper talks about storage within SANs. It is easier than ever to implement a badly laid out SAN, with the levels of abstraction and data sharing made possible through this technology. We will look at ways this can be simplified through careful planning, and discover why it's a good idea to lie to your boss. Before that, though, it's worthwhile taking a journey into the past to find out why all this stuff exists, and why there are so many acronyms in the storage industry.