Oakies Blog Aggregator

Repairman Jack: Conspiracies…

Conspiracies is the third book in the Repairman Jack series by F. Paul Wilson.

Jack is hired to look for a missing woman who happens to be part of a very exclusive conspiracy theory group. What’s more she went missing just before presenting her Grand Unification Theory, that supposedly explains the true root of all conspiracies through all the ages. Coincidence?

I really liked this story. I’m not into conspiracy theories myself, but I can see why they are fascinating to people. That constant spiral of the lack of evidence because the evidence is being withheld. It kinda draws you in.

I find people’s reaction to conspiracy theories rather intriguing. Doubters will turn their nose up at the thought of aliens because of the lack of proof, then go to church to worship a God that requires no proof, just faith. Interesting. :)

I have a couple of acquaintances who are into the whole conspiracy thing. The thing that is rather disturbing about that is once you go down that path, everything you ever see, do or experience seems to somehow relate to it. Everything is a cover-up by [ the government | the Illuminati | Aliens | Satan | some other dark forces ].

My brain is a bit too basic for all of that. If I see it, it’s real. If I don’t, it’s mumbo jumbo. Life’s a lot easier to cope with that way. :)

Cheers

Tim…




Book offer

Hotsos 2012:
One of the session’s I attended yesterday (6th March) was Karen Morton speaking about grouping in SQL. Karen is a member of the Oak Table network, knows what she’s talking about, and speaks well. So you’ll be pleased to hear that one of the first slide she showed was about an offer from Apress for the book Pro Oracle SQL that she co-authored with several other members of the Oak Table.

Until April 8th, if you supply the code OR4SQ8 when you order this book from Apress you get a 40% discount on the price.

I haven’t been able to find an official statement of this fact online, or a confirmation of the code; and Karen wasn’t 100% sure whether the deal applied to the hard copy, electronic copy, or both; but when Apress did something similar for Oracle Core the code did apply to the hard copy.

 

Yes, File Systems Still Need To Support Concurrent Writes! Yet Another Look At XFS Versus EXT4.

My post entitled File Systems For A Database? Choose One That Couples Direct I/O and Concurrent I/O. What’s This Have To Do With NFS? Harken Back 5.2 Years To Find Out has not been an incredibly popular post by way of page views (averages about 10 per day for the last six months), but it has generated some email from readers asking about EXT4.

I’ve been putting off the topic but it is fresh on my mind.

Today I put out a quick tweet about concurrent writes on Ext4 (https://twitter.com/#!/kevinclosson/status/177111985790525440) that started a small tweet-thread by others looking for clarification.  This blog entry aims to clarify my point about concurrent writes on EXT4 compared to XFS. As an aside, if you have not read the above referenced blog post, and you are interested in concurrent writes and how the topic pertains to several file systems including NFS, I recommend you give it a read.

The topic at hand, how EXT4 and XFS handle concurrent writes, is a brief one, so this will be a brief blog post. Allow me to explain. The following really sums it up:

EXT4 does not support concurrent writes; XFS does.

So, in spite of the fact that the topic is brief, I’d like to expound upon the matter and offer some proof.

In the following you will see two proof cases—one EXT4 and the other XFS. The proof case is as follows:

  1. The previous file system is unmounted
  2. An XFS file system is created on my md(4) software RAID LUN
  3. The XFS file system is mounted on /mnt/dsk
  4. A script called simple.sh is executed to prove the volume supports high-performance sequential writes by first initializing a test file through the direct I/O code path
  5. The simple.sh script then measures 196,608 64KB sequential writes to the test file. The file is opened without truncate so this is an operation that merely over-writes the file. The writes are performed with direct I/O.
  6. The simple.sh script then performs concurrent writes of the same file—again the writes are through the direct I/O code path and the file is not truncated. There are two dd(1) processes—one over-writes the first half of the file the other over-writes the second half of the file.

I’ll paste the silly little simple.sh script at the bottom of this post.

The measure of goodness is, of course, whether or not the two-process case is able to push more I/O in aggregate than the single-writer case. You’ll see that with very large writes the LUN can sustain 3.7 GB/s with a single writer through the direct I/O code path on both XFS and EXT4 files. The concurrent versus single write test cases were conducted with 64KB writes. Again, with both file systems (XFS and EXT4) the single writer was able to push 1.4 GB/s. As the following shows, the XFS two-writer case scaled at 1.7x.

Now it’s time to move on to EXT4. Here you’ll see the same baseline of 3.7 GB/s when initializing the file and the familiar 1.4 GB/s for the single 64KB serial writer. That, however, is the extent of the similarities. The two-writer case on EXT4 sadly de-scales. The 2.4 GB/s seen in the XFS case falls to an aggregate of 1048 MB/s with two writers on EXT4.
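A little arithmetic makes the contrast stark (throughput figures taken from the numbers quoted above):

```python
# Throughput figures from the test runs described above (MB/s).
single_writer = 1400     # one dd process, 64KB direct I/O over-writes (same on XFS and EXT4)
xfs_two_writers = 2400   # aggregate of two concurrent dd processes on XFS
ext4_two_writers = 1048  # aggregate of two concurrent dd processes on EXT4

# XFS scales up with a second writer; EXT4 de-scales to below the single-writer rate.
xfs_scaling = xfs_two_writers / single_writer    # ~1.71x
ext4_scaling = ext4_two_writers / single_writer  # ~0.75x

print("XFS scaling:  %.2fx" % xfs_scaling)
print("EXT4 scaling: %.2fx" % ext4_scaling)
```

In other words, adding a second writer on EXT4 doesn’t just fail to scale, it costs you roughly a quarter of the throughput a single writer achieves.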

The following is the simple.sh script:

#!/bin/bash

myfile=$1

echo "Creating test file $myfile using direct I/O"
dd if=/dev/zero of="$myfile" bs=1024M count=12 oflag=direct

# Flush dirty pages and drop the page cache so the timed runs start cold
sync; sync; sync; echo 3 > /proc/sys/vm/drop_caches

echo "Single Direct I/O writer"
( dd if=/dev/zero of="$myfile" bs=64K count=196608 conv=notrunc oflag=direct > thread1.out 2>&1 ) &

wait
cat thread1.out

echo "Two Direct I/O writers"
# One writer over-writes the first half of the file, the other the second half
( dd if=/dev/zero of="$myfile" bs=64K count=98304 conv=notrunc oflag=direct > thread1.out 2>&1 ) &
( dd if=/dev/zero of="$myfile" bs=64K count=98304 seek=98304 conv=notrunc oflag=direct > thread2.out 2>&1 ) &

wait
cat thread1.out thread2.out

Filed under: oracle

UK Oracle User Group Council Elections

If you are a member of the UK Oracle User Group, you are entitled to vote in the elections to the council. The polls close this week on Thursday 8th March.

You can vote on-line at http://www.ukoug.org/other/councilvote/.

Regular readers of this blog will have an interest in the PeopleSoft community. Steve Smith has been a deputy chair of the PeopleSoft Technical SIG for several years, and is standing as the representative of the PeopleSoft community in the UK, and I urge you to support him.

If you are the primary contact for your organisation you will have been contacted directly by the UKOUG with on-line voting instructions. If you have any questions about the process please contact the UKOUG office.


Create a database link with the new host:port/service syntax

I just noticed that (finally) in 11.2 this syntax is supported:

SQL> CREATE DATABASE LINK demo_x2 
  2  CONNECT TO tanel IDENTIFIED BY password 
  3  USING 'exadb03:1521/DEMO';
Database link created.

This just makes life a bit easier as there’s no need to use the long TNS format entry (or a tnsnames.ora/LDAP alias). It might work in 11.1 too (haven’t tested) but it didn’t work on 10.2.0.4 …

Update: This feature works for dblinks in 10.2 onwards – when I tested it on my 10.2, I got an error initially, but it was because the hostname I used didn’t resolve to an IP. Thanks to Randolf Geist for pointing this out.
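For comparison, the long-form tnsnames.ora entry that the easy connect string replaces would look something like this (a sketch built from the host and service names in the example above, not copied from a real tnsnames.ora):

DEMO =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = exadb03)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = DEMO))
  )

The 'host:port/service' string carries exactly the same three pieces of information, just without the alias indirection.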

In case you didn’t know, sqlplus has supported this easy connect method since 10g:

tanel@mac02:~$ sqlplus tanel/password@exadb03/DEMO
SQL*Plus: Release 10.2.0.4.0 - Production on Mon Mar 5 09:51:27 2012
Copyright (c) 1982, 2007, Oracle. All Rights Reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL>

If you omit the port from the syntax, it defaults to 1521.

Dates on articles…

Over the weekend, someone wrote to me asking about the lack of dates on my website articles. I tried to reply to them directly, but their mailbox bounced everything back, so I thought I would write it here on the off chance they read my blog.

As the original poster correctly pointed out, there are no dates on my website articles. When I started the website many years ago it just never crossed my mind to put dates on the articles and I’ve seen no reason to include them since. Why not?

The articles on my site are divided into sections based on the database version (8i, 9i, 10g, 11g etc.). Within that grouping, there are subdivisions, based on the specific release or type of feature being discussed. If an article is not version-specific, it is put in the miscellaneous section. So when you are reading an article, you know the relevance of that article based on the version you are working on. If you are working on 11gR2 and following an 8i article, you should be questioning the relevance of that article. That’s not to say it has no relevance, but you have to question it.

As far as my own site goes, I often write new features articles and link to those from previous articles. For example, my Scheduler article was written against 10gR1, but it links to articles describing updates in 10gR2, 11gR1 and 11gR2 etc. I feel this is pretty clear.

The date of an article doesn’t have nearly as much significance as the version of the software it is written against. When I Google for an answer to something, I check the DB/OS/Software version being discussed. I can’t remember the last time I even looked at a date on a DB article. In the case of generic articles (not version-specific) on my site, they often contain a timeline internally, stating when a feature first became available and how it changed over DB versions, so once again, how relevant is the date here?

Now I’m not against dates on articles. If I were starting a site today I would probably use an open source CMS, which would no doubt put a date on every article, which would be fine. My point is that 11 years down the line, I don’t think adding (faking*) dates would add any value since the site is already divided by version.

If you are the person who emailed me and you are reading this, I hope that explains my position on this. :)

Cheers

Tim…

* I do actually store creation dates in my CMS, but only the articles written in the last 6-ish years were written using the CMS, so any articles prior to this point will have a fake date assigned to them, which is the date they were loaded into the CMS, not the date they were written.

I also track last updated dates, but articles get updated for typos etc all the time, so a last updated date is no indication of actual core content update, if you know what I mean. I can tell you now, almost every article on the site was updated between December 2011 and January 2012, so even very old articles that have no core content changes would have a very recent updated date.




Count(*)

A recent posting on the comp.databases.oracle.server newsgroup pointed me to a rather elderly Ask Tom question (originally posed in July 2004, last updated June 2011) where Tom produced an extraordinary observation. The response times for the following two queries are completely different (on Oracle 9.2 on his data set):

set autotrace traceonly explain

select count(*)
from   ( select null from big_table
         union all
         select null from big_table
       )
/

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=8 Card=1)
   1    0   SORT (AGGREGATE)
   2    1     VIEW (Cost=8 Card=8000000)
   3    2       UNION-ALL
   4    3         INDEX (FAST FULL SCAN) OF 'FOO' (NON-UNIQUE) (Cost=4 Card=4000000)
   5    3         INDEX (FAST FULL SCAN) OF 'FOO' (NON-UNIQUE) (Cost=4 Card=4000000)

select sum(cnt )
from   ( select count(*) cnt from big_table
         union all
         select count(*) cnt from big_table
       )
/

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=8 Card=1 Bytes=13)
   1    0   SORT (AGGREGATE)
   2    1     VIEW (Cost=8 Card=2 Bytes=26)
   3    2       UNION-ALL
   4    3         SORT (AGGREGATE)
   5    4           INDEX (FAST FULL SCAN) OF 'FOO' (NON-UNIQUE) (Cost=4 Card=4000000)
   6    3         SORT (AGGREGATE)
   7    6           INDEX (FAST FULL SCAN) OF 'FOO' (NON-UNIQUE) (Cost=4 Card=4000000)

The second query took 14 seconds compared to 52 seconds for the first query.
At first my response was stark disbelief, but Tom had, of course, provided the evidence, and after a couple of minutes’ thought I realised why the difference had appeared. Before explaining what’s happened, I’ll just reproduce the (slightly stripped) tkprof outputs that Tom showed us.

select count(*)
  from ( select null from big_table
         union all
         select null from big_table )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     46.32      52.72      44494      44630          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     46.32      52.72      44494      44630          0           1

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=44630 r=44494 w=0 time=52726468 us)
8000000   VIEW  (cr=44630 r=44494 w=0 time=46390678 us)
8000000    UNION-ALL  (cr=44630 r=44494 w=0 time=33556917 us)
4000000     INDEX FAST FULL SCAN FOO (cr=22315 r=22247 w=0 time=4464467 us)(object id 128577)
4000000     INDEX FAST FULL SCAN FOO (cr=22315 r=22247 w=0 time=3447898 us)(object id 128577)
********************************************************************************

select sum(cnt )
  from ( select count(*) cnt from big_table
         union all
         select count(*) cnt from big_table )

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     12.31      14.17      44494      44630          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     12.31      14.17      44494      44630          0           1

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=44630 r=44494 w=0 time=14174206 us)
      2   VIEW  (cr=44630 r=44494 w=0 time=14174178 us)
      2    UNION-ALL  (cr=44630 r=44494 w=0 time=14174171 us)
      1     SORT AGGREGATE (cr=22315 r=22247 w=0 time=7594295 us)
4000000      INDEX FAST FULL SCAN OBJ#(128577) (cr=22315 r=22247 w=0 time=3969944 us)(object id 128577)
      1     SORT AGGREGATE (cr=22315 r=22247 w=0 time=6579839 us)
4000000      INDEX FAST FULL SCAN OBJ#(128577) (cr=22315 r=22247 w=0 time=3545406 us)(object id 128577)

Update – 6th March

The comments from Tanel cover most of the interesting information in this example, but I’m going to write up an answer anyway since there are a couple of related points worth mentioning. First, the sort in the sort aggregate lines is not actually doing any sorting; Oracle is capable of keeping a simple running count in this particular case. In fact, even in cases where there is a genuine aggregation (multi-row result) the volume of data that goes into the sort area is related to the number of different values aggregated rather than the number of rows input, so the amount of memory used for a sort aggregate operation can be very small compared to the volume of data processed. (See this note for an example of how Oracle can handle one type of special case involving sorting.)

As far as timing is concerned, the sort aggregate operator has to count the input from its child operation, and in the first case the input for the count is the output of the view. So the view has to be accumulating 8 million rows of “something” from its child operation, which is the union all, and the union all is getting 4 million rows of “something” from each of its two children. The code spends a lot of time sending millions of rows through the “pipelines” that connect operators. Doing a quick count: 8 million rows go into the union all pipeline, then 8 million go into the view pipeline, then 8 million go into the sort aggregate pipeline, for a total of 24 million rows of “something” being passed.

In the second case the sort aggregate operations at lines 4 and 6 have to accumulate “something” from their respective child operations at lines 5 and 7, but that’s the limit of the data being passed through pipelines, i.e. a total of 8 million rows going into a pipeline. So it’s not too surprising that the amount of CPU used is roughly one third of the CPU used by the first case, since we’re passing one third of the rows. (Apart from this simple difference in volume there may even be some special-case code that Oracle uses in the second case when it knows that the operation is simply a variation on “count the number of rows in a table”.)
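The quick count above can be tallied explicitly (row counts taken from the tkprof output; this just adds up how many rows cross an operator boundary in each plan):

```python
# Rows flowing between row-source operators, per the tkprof output above.
scan_rows = 4_000_000  # rows returned by each INDEX FAST FULL SCAN of FOO

# Plan 1 (count(*) over the union all): every scanned row is passed up through
# three pipelines: scans -> UNION-ALL, UNION-ALL -> VIEW, VIEW -> SORT AGGREGATE.
plan1_rows = (2 * scan_rows) * 3  # 24,000,000 rows of "something" moved

# Plan 2 (sum of counts): each scan feeds its own SORT AGGREGATE (8M rows total),
# after which only two single-row results travel up the rest of the plan.
plan2_rows = 2 * scan_rows + 2 + 2 + 2  # 8,000,006 rows moved

print("Plan 1 moves %.1fx the rows of plan 2" % (plan1_rows / plan2_rows))
```

That 3x ratio in rows moved lines up neatly with the roughly 3x difference in CPU time (46.32s vs 12.31s) in the tkprof output.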

Footnote: I ran into Tom Kyte yesterday at Hotsos 2012, and as a consequence I would like to point out that it is the question that is rather elderly, not Tom.

 

Hotsos 2012

Here I am again – in the BA lounge at Heathrow waiting for a plane. Take-off is in 45 minutes, and I land in Dallas at 14:40 local time after nine or ten hours of flying time. The movie selection looks good this month, so I don’t think I’ll need to worry about the entertainment.

If you see me around the Omni over the next few days, feel free to come up and say hello. Despite any rumours to the contrary I am an ordinary human being and you won’t get struck by lightning if you get too close. (I also tend to head for bed early in the US, so don’t assume that I’ll be around the bars or restaurants later in the evening, but I might be in the gym around 5:00 am.)

Looking forward to seeing you.

 

 

Book Review: Oracle Database 11gR2 Performance Tuning Cookbook (Part 1)

March 3, 2012

Oracle Database Performance Tuning Test Cases without Many “Why”, “When”, and “How Much” Filler Details

http://www.amazon.com/Oracle-Database-Performance-Tuning-Cookbook/dp/184...

I ordered the “Oracle Database 11gR2 Performance Cookbook” book shortly after it became available for purchase. I was very curious to see how the book compared with the similarly titled “Oracle Database 11g Performance Tuning Recipes” book, [...]