MySQL

Thirteen signs of DBA fudging

If you are a director, manager or project manager who works with DBAs, you probably have had the nagging suspicion at one time or another that a DBA’s assertions regarding his or her practices lack an empirical or scientific basis, or are simply deflections intended to pass the buck.

Manager: Mr. DBA, the application is really slow. Do you have any idea what’s wrong?

DBA: Oracle is very complex. It could be any of 100 different possible causes. I will begin checking each. Anyhow, what makes you think it is the database?

Oracle OpenWorld 2011 — Bloggers Meetup

Isn’t it that time of the year again? Yes, it is: it’s time for our annual Oracle Bloggers Meetup, and of course Oracle is piggybacking OpenWorld with the meetup again! ;) What: Oracle Bloggers Meetup 2011. When: Wed, 5-Oct-2011, 5:00pm. Where: Main Dining Room, Jillian’s Billiards @ Metreon, 101 Fourth Street, San Francisco, CA [...]

Handling Human Errors

An interesting question on human mistakes was posted on the DBA Managers Forum discussions today.

As human beings, we sometimes make mistakes. How do you make sure that your employees won’t make mistakes and cause downtime, data loss, etc. on your critical production systems?

I don’t think we can avoid this technically; working procedures are probably the solution.
I’d like to hear your thoughts.

I typed up my thoughts and, as I was finishing, I thought it made sense to post them on the blog too, so here we go…

The keys to preventing mistakes are low stress levels, clear communication and established processes. This is not a complete list, but I think these are the top things that reduce the number of mistakes we make managing data infrastructure or, for that matter, working in any critical environment, be it IT administration, aviation engineering or surgery. It’s also a matter of personality fit: depending on the balance you need between tolerance for mistakes and agility, you will favor hiring one individual or another.

Regardless of how hard you try, there are still going to be human errors, and you have to account for them in the infrastructure design and processes. The real disasters happen when many things align, such as several failures combined with a few human mistakes. The challenge is to find the right balance between effort invested in making no mistakes and effort invested in making your environment error-proof, to the point where the risk of human mistake is acceptable to the business.

Those are the general ideas.

Just a few examples of practical solutions to prevent mistakes when it comes to Oracle DBA work:

  • test production actions on a test system before applying them in production
  • have a policy that every production change is reviewed by another senior member of the team
  • a “watch over my shoulder” policy for work on production environments, i.e. a second pair of eyes at all times
  • employee training, database recovery bootcamp
  • discipline of performing routine work under non-privileged accounts
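
As a rough illustration only, the first two policies in the list above could be enforced by a small pre-flight check before any production change is run. Everything here is hypothetical: the `ChangeRequest` record, its fields and the `blocking_reasons` helper are names invented for this sketch, not part of any real tool.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    """A hypothetical record of a planned production change."""
    author: str
    description: str
    tested_on_test_system: bool = False
    reviewers: set = field(default_factory=set)


def blocking_reasons(change):
    """Return the reasons a change may not go to production yet (empty list = OK)."""
    reasons = []
    if not change.tested_on_test_system:
        reasons.append("not yet rehearsed on a test system")
    # A review only counts if it comes from someone other than the author.
    independent_reviewers = change.reviewers - {change.author}
    if not independent_reviewers:
        reasons.append("no review by a second team member")
    return reasons


change = ChangeRequest(author="alice", description="add index on orders(customer_id)")
change.reviewers.add("alice")        # self-review does not count
print(blocking_reasons(change))      # two reasons are reported

change.tested_on_test_system = True
change.reviewers.add("bob")
print(blocking_reasons(change))      # empty list: both checks pass
```

The point of such a gate is not the code itself but that the policy is checked mechanically rather than remembered under stress.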

Some items that limit the impact of mistakes:

  • multiple controlfiles for an Oracle database (in case a DBA manually does something bad to one of them; I have seen this happen)
  • standby database with delayed recovery or flashback database (for Oracle)
  • no SPOF architecture
  • Oracle RAC, MySQL high availability setup (like sharding or replication), SQL Server cluster: architecture examples that limit the impact of human mistakes affecting a single hardware component
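
To make the delayed-recovery item concrete: a standby that applies changes only after a fixed delay protects you for exactly as long as that delay. The sketch below is just that arithmetic; the function name, the timestamps and the four-hour delay are assumptions made up for illustration, not recommended settings.

```python
from datetime import datetime, timedelta


def standby_still_clean(mistake_at, noticed_at, apply_delay):
    """True if the mistake was noticed before the delayed standby applied it."""
    return noticed_at < mistake_at + apply_delay


mistake_at = datetime(2011, 9, 1, 14, 0)   # hypothetical: a table is dropped at 14:00
apply_delay = timedelta(hours=4)           # hypothetical delay configured on the standby

# Noticed after 90 minutes: the standby has not replayed the change yet.
print(standby_still_clean(mistake_at, mistake_at + timedelta(minutes=90), apply_delay))  # True

# Noticed after 6 hours: the standby has already replayed the change.
print(standby_still_clean(mistake_at, mistake_at + timedelta(hours=6), apply_delay))     # False
```

In other words, the delay only helps if your monitoring and your people notice the mistake faster than the delay runs out.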

Both lists could go on much longer. An old article by Paul Vallee is very relevant to this topic: The Seven Deadly Habits of a DBA…and how to cure them.

Feel free to post your thoughts and examples. How do you approach human mistakes in managing production data infrastructure?

Product Support vs Operational Support

Sometimes I get questions as to whether Pythian is one of the competitors battling with Oracle for MySQL support. The answer lies in the distinction between product support and operational support.

At Pythian, we are laser-focused on supporting applications and data infrastructure using Oracle, MySQL and Microsoft SQL Server products. The vast majority of our Oracle customers (there are a few customers who have very old 7.x and 8.x products running without vendor support) have Oracle maintenance subscriptions that include product updates and product support. Product support entitles the customer to open support requests when the product doesn’t perform according to the specifications (bug reports) as well as to file enhancement requests. It also covers deployment blueprints and deployment guidelines in the official vendor documentation and support database.

What you can’t expect from product support are answers to questions like these:

  • How do I architect my infrastructure?
  • How much CPU do I need to run this database?
  • How do I set up my backups?
  • How do I tune that SQL statement?
  • What do I need to monitor in my environment to keep it healthy and avoid service outages?

Of course, you cannot expect product support to log in to your systems and help monitor them, recover a corrupted database, resolve performance issues, etc.

Oracle customers usually have a clear understanding of the differences between product support and the operational support and consulting that Pythian provides. Even then, every now and again we hear statements like “I’m not renewing our Oracle product support because we now have you, Pythian, supporting our databases.” Hearing that, we catch our breath for a few seconds and then patiently explain that this is inadvisable and that product support is totally different from what Pythian does.

Because of its open-source nature, MySQL customers have somewhat less incentive to sign up for product support, relying instead on public community releases and the ability to patch the product themselves, but even then there is a clear distinction between product support and operational support.

All that was a long prelude to answering the question: “Is Pythian competing with Oracle and other vendors for MySQL product support?” The answer is NO. Pythian provides plan, deploy and manage services: we analyze, design, implement and maintain the infrastructure. We work with the vendor providing product support (or as part of the community at large when it comes to the open-source community MySQL releases).

YPDNGG: You Probably Don’t Need Golden Gate

Before launching into this, I must give due deference to Mogens Nørgaard’s landmark article, You Probably Don’t Need RAC (YPDNR), available here, but originally published Q3 2003 in IOUG Select Journal.  Mogens showed that you can be a friend of Oracle without always agreeing with everything they do.

Report from Oracle Openworld

Openworld 2010, despite the supposedly lagging economy, had record attendance again this year.  No doubt this was the result of Oracle acquiring something like fourteen companies since last year, including Sun in 2009.  The crowds were thick, divided about evenly between geeks in badly-fitting vendor t-shirts and slick sales-side hustlers with dress pants and shiny shoes.  I landed somewhere in the middle of the two (badly-fitting dress shirt, comfortable jeans and loafers), proudly sporting a long dangling codpiece of ribbons from my attendee badge:

My OOW2010 Codpiece

Notes on Learning MySQL (as an Oracle DBA)

This post originally appeared over at Pythian. There are also some very smart comments over there that you shouldn’t miss, go take a look!

I spent some time last month getting up to speed on MySQL. One of the nice perks of working at Pythian is the ability to study during the workday. They could have easily said “You are an Oracle DBA, you don’t need to know MySQL. We have enough REAL MySQL experts”, but they didn’t, and I appreciate it.

BAAG, Best Practices and Multiple Choice Exams

(This post originally appeared at the Pythian blog)

I’ve been following the discussion in various MySQL blogs regarding the sort_buffer_size parameters. As an Oracle DBA, I don’t have an opinion on the subject, but the discussion did remind me of many discussions I’ve been involved in. What’s the best size for SDU? What is the right value for OPEN_CURSORS? How big should the shared pool be?

All are good questions. Many DBAs ask them hoping for a clear-cut answer: do this, don’t do that! Some experts recognize the need for a clear-cut answer, and if they are responsible experts, they will give the answer that does the least harm.

The Most Common Performance Problem I See

At the Percona Performance Conference in Santa Clara this week, the first question an audience member asked our panel was, "What is the most common performance problem you see in the field?"

I figured, being an Oracle guy at a MySQL conference, this might be my only chance to answer something, so I went for the mic. Here is my answer.

The most common performance problem I see is people who think there's a most-common performance problem that they should be looking for, instead of measuring to find out what their actual performance problem is.

It's a meta answer, but it's a meta problem. The biggest performance problems I see, and the ones I see most often, are not problems with machines or software. They're problems with people who don't have a reliable process of identifying the right thing to work on in the first place.

That's why the definition of Method R doesn't mention Oracle, or databases, or even computers. It's why Optimizing Oracle Performance spends the first 69 pages talking about red rocks and informed consent and Eli Goldratt instead of Oracle, or databases, or even computers.
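
To give a feel for what "measuring to find out" can look like in practice, here is a minimal sketch that turns timed events for one slow user action into a ranked response-time profile. It is not the definition of Method R, only an illustration of ranking by measured contribution; the function name, the component labels and the timings are all made up for this example.

```python
from collections import defaultdict


def response_time_profile(events):
    """Sum measured durations per component and rank them, largest first.

    `events` is an iterable of (component, seconds) pairs.
    Returns a list of (component, total_seconds, share_of_total) tuples.
    """
    totals = defaultdict(float)
    for component, seconds in events:
        totals[component] += seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(name, secs, secs / grand_total) for name, secs in ranked]


# Hypothetical timings (in seconds) for one slow user action.
events = [
    ("network round trips", 6.0),
    ("disk reads", 1.5),
    ("CPU", 0.5),
    ("network round trips", 2.0),
]
for name, secs, share in response_time_profile(events):
    print(f"{name:<20} {secs:6.2f}s {share:6.1%}")
```

With these made-up numbers, network round trips account for most of the elapsed time, so that is where the work should start, whatever the "usual suspect" might have been.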