In my blogpost When the oracle wait interface isn’t enough I showed how a simple asynchronous direct path scan of a table was spending more than 99% of its time on CPU, and that perf showed me that 68% of the total elapsed time was spent on a spinlock unlock in the Linux kernel, called by io_submit().
This led to some very helpful comments from Tanel Poder. This blogpost puts his comments into practice, with tests to show the difference.
First, take a look at what I gathered from ‘perf’ in the first article:
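For readers who want to reproduce this kind of profile, a session against a single Oracle server process looks something like the sketch below. This is my own illustration, not the exact invocation from the original article; the PID and sample frequency are placeholders.

```shell
# Sketch only: sample one server process while the scan runs.
# 12345 is a placeholder PID; -F 99 is an arbitrary sample frequency.
perf record -g -F 99 -p 12345 -- sleep 30

# Summarise where the samples landed, across user and kernel symbols;
# this is where kernel functions reached via io_submit() show up.
perf report --stdio --sort symbol
```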
Production data is clearly critical for the core of businesses, such as
My friend Kyle wrote a blog piece a while back
Now, not to criticise Delphix (in fact, the opposite – it’s a very cool product, and you should read some of Kyle’s great blog content on it), but if you haven’t got it, or can’t get it, then as long as you have some imagecopy backups and an NFS server hanging around, you can get to the "next best thing" using Oracle’s Direct NFS feature.
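As a sketch of the moving parts (the server name and paths below are made up for illustration, not taken from the post): Direct NFS is switched on by relinking the Oracle binary, and the imagecopy backups just need to sit on an NFS export the database server can reach.

```shell
# Enable Direct NFS by relinking the oracle binary (11g-style syntax).
cd $ORACLE_HOME/rdbms/lib
make -f ins_rdbms.mk dnfs_on

# Example mount of the NFS share holding the imagecopy backups;
# hostname and paths are hypothetical.
mount -t nfs nfsserver:/export/backups /mnt/backups
```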
So… how close can we get to 1 minute 22 seconds? Let’s try on a laptop.
A very common request I get from my customers is to parameterize the query executed by a Hive action in their Oozie workflow.
For example, the dates used in the query depend on a result of a previous action. Or maybe they depend on something completely external to the system – the operator just decides to run the workflow on specific dates.
There are many ways to do this, including using EL expressions or capturing output from a shell action or a Java action.
Here’s an example of how to pass the parameters through the command line. This assumes that whoever triggers the workflow (a human or an external system) has the correct value and just needs to pass it to the workflow so it will be used by the query.
Here’s what the query looks like:
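The original query isn’t reproduced here, but as a hedged sketch of the mechanism: the caller sets the value on the command line (or in the properties file), the workflow’s Hive action forwards it with a `<param>` element, and the query reads it as a Hive variable. All names below (refresh_date, the table name, the Oozie URL) are made-up examples.

```shell
# In workflow.xml, the hive action forwards the parameter:
#   <param>refresh_date=${refresh_date}</param>
#
# and the query refers to it as a variable:
#   SELECT * FROM page_views WHERE dt = '${refresh_date}';

# The operator submits the workflow with the value set via -D:
oozie job -oozie http://localhost:11000/oozie \
          -config job.properties \
          -Drefresh_date=2013-11-20 \
          -run
```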
Please feel free to share your own experiences with these tools or others in the comments!
There are a number of tools out there to do I/O benchmark testing such as
My choice for best of breed is fio
(thanks to Eric Grancher for suggesting fio).
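As a minimal sketch of a fio run (the job name, file size, and block size are arbitrary choices of mine, not recommendations from the post), a single-job random-read test looks like:

```shell
# 8 KB random reads against a 1 GB test file, with direct I/O
# so the OS page cache doesn't flatter the numbers.
fio --name=randread \
    --rw=randread \
    --bs=8k \
    --size=1g \
    --direct=1 \
    --numjobs=1 \
    --ioengine=libaio \
    --runtime=60 --time_based
```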
It’s not uncommon for people to expound on the functionality, performance and features of Oracle on the one hand, whilst on the other lamenting the potential high cost of the product.
I’m not pontificating here – I’m commonly one of these people. So much good stuff in Oracle… yet so much to pay to get that good stuff.
So in the interests of fairness, I thought I’d share a little story where an Oracle solution was implemented with a total expenditure of: ZERO.
Oracle has done a great job with the wait interface. It has given us the opportunity to profile the time spent in Oracle processes, by keeping track of CPU time and waits (which is time spent not running on CPU). With every new version Oracle has enhanced the wait interface, making the waits more detailed. Tuning typically means trying to get rid of waits as much as possible.
It’s coming up to the time when I have to think about which presentations to go to at UKOUG Tech13 – it’s always difficult to decide whether to see topics I’m familiar with, to find out how much I didn’t know, or topics I don’t know, to get some sort of intelligent briefing. Here are my starting thoughts:
12:30 Me, on compression (index, basic and OLTP – not HCC)
13:40 Tony Hasler: “Why does the optimizer sometimes get the plan wrong”
15:00 Kyle Hailey: “Oracle transaction locks and analysis”
16:00 Neil Chandler: “10046 trace – powerful, or pointless in the real world”
The world is obsessed with I/O nowadays…
This is understandable – we’re in the middle of a pioneering period for I/O – flash, SSD, MLC, SLC, with ever more sophisticated transport mechanisms – infiniband, and the like.
But don’t forget that once you get those blocks back to Oracle, you need to “consume” them, i.e. extract those rows and get at that data…
And that’s not free!
For example, let’s look at two tables, both 500 megabytes, so the I/O cost to consume them is roughly the same.
The first one has ~50-byte rows.
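To make the consumption cost concrete, a quick back-of-envelope row count: at ~50 bytes per row, a 500 MB segment holds roughly ten million rows to chew through. (The 500-byte figure for a hypothetical wider-row table is my own illustration, not a number from the post.)

```shell
# Bytes in a 500 MB table segment.
TABLE_BYTES=$((500 * 1024 * 1024))

# ~50-byte rows, as in the first table:
echo "narrow rows: $((TABLE_BYTES / 50))"    # prints 10485760

# A hypothetical wide-row table, ~500 bytes per row:
echo "wide rows:   $((TABLE_BYTES / 500))"   # prints 1048576
```

Same I/O to read either segment, but an order of magnitude more rows for the CPU to process in the narrow-row case.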