Oakies Blog Aggregator

Asynch descriptor resize wait event in Oracle

A lot of people have started seeing the “asynch descriptor resize” wait event in Oracle 11gR2. Here’s my understanding of what it is. Note that I didn’t spend too much time researching it, so some details may not be completely accurate, but my explanation will at least give you an idea of why the heck you suddenly see this event in your database.
FYI, there’s a short but incomplete explanation of this wait event documented in MOS Note 1081977.
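If you want to see how much of this event your own instance has accumulated, a quick look at the standard dynamic performance views is enough. A minimal sketch (system-wide totals since instance startup):

SQL> -- system-wide waits on the event since instance startup
SQL> SELECT event, total_waits, time_waited
  2  FROM   v$system_event
  3  WHERE  event = 'asynch descriptor resize';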

NFS and IO testing : Bonnie++

Been spending time lately testing out and trying to tune NFS mounts.
For testing IO, I've been using:
  • dd
  • iozone
  • bonnie++
  • orion
The first option, dd, is fine for file-creation tests and sequential reads, but it is less flexible for random reads or random writes; a typical invocation is sketched below.
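For reference, a simple dd sequential test looks something like this (a sketch only – the path is hypothetical, and the bs/count syntax shown is GNU dd, so adjust for your platform's dd):

# sequential write (file creation) to the NFS mount
dd if=/dev/zero of=/nfs/testfile bs=1M count=1024
# sequential read back (flush or bypass caches between runs for honest numbers)
dd if=/nfs/testfile of=/dev/null bs=1M count=1024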
To test random reads I thought I'd use iozone, but I don't see a purely random-read test.
Then I thought I'd use the Oracle-supplied tool, orion, to do a random-read test on an NFS-mounted file system, but this doesn't work, at least on AIX 6.1 with my mount settings.
The first orion test gave this error:
$ orion -run simple -testname orion
ORION: ORacle IO Numbers -- Version 11.1.0.7.0
orion_20101123_1503
rwbase_read_luncfg: SlfFopen error on orion.lun
orion_parse_args: rwbase_read_luncfg failed

OK, I have to create "orion.lun" with either my LUN locations or my file locations. I put in file locations:

$ cat orion.lun
/tmp/system01.dbf
Now I get another error:
$ orion -run simple -testname orion
ORION: ORacle IO Numbers -- Version 11.1.0.7.0
orion_20101123_1508
Test will take approximately 9 minutes
Larger caches may take longer
orion_spawn: skgpspawn failed: Error category: 27155, Detail: 2
orion_main: orion_spawn failed
Non test error occurred
Orion exiting
Illegal instruction(coredump)
It looks like the "orion" executable wasn't being found, at least not by execve:
$ truss -f orion -run simple -testname orion
...
700502: execve("orion", 0x0FFFFFFFFFFBF2D0, 0x0FFFFFFFFFFFFB30) Err#2 ENOENT
So I ran it from my bin directory, where the orion executable could be found. Now I get another error:
$ orion -run simple -testname orion
ORION: ORacle IO Numbers -- Version 11.1.0.7.0
orion_20101123_1510
Test will take approximately 9 minutes
Larger caches may take longer
storax_skgfr_openfiles: File identification failed on /kyle/system01.dbf
OER 27054: please look up error in Oracle documentation
Additional information: 6
rwbase_lio_init_luns: lun_openvols failed
rwbase_rwluns: rwbase_lio_init_luns failed
orion_thread_main: rw_luns failed
Non test error occurred
Orion exiting
If the datafile was on "/tmp" it worked fine, but if it was on my NFS mount it failed with the above error. Hmm - doesn't it work over NFS?
Finally I decided to try bonnie++. I downloaded, unzipped, and tar-extracted bonnie++, ran "./configure" and then make, but got an error:
$ make
make: 1254-002 Cannot find a rule to create target bon_io.o from dependencies.
Stop.
For some reason the rule for the ".o" doesn't seem to work:
%.o: %.cpp %.h bonnie.h port.h
	$(CXX) -c $<
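In hindsight, my guess is that AIX's native make simply doesn't understand GNU-style pattern rules (the %.o: %.cpp form). Two possible workarounds, neither of which I've verified here: run the build with GNU make (gmake) if it's installed, or rewrite the rule as a POSIX suffix rule that any make should accept:

# POSIX suffix rule equivalent of the failing pattern rule (a sketch)
.SUFFIXES: .cpp .o
.cpp.o:
	$(CXX) -c $<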
Not feeling like fighting with make at the time, I just did it by hand:
$ for i in *.cpp; do
name=`echo $i | sed -e 's/\.cpp$//'`
echo "g++ -c $name.cpp $name.h bonnie.h port.h"
done > makeo
$ sh makeo
$ make

That worked. Test results coming.

Workload System Statistics Bug in 11.2

Since the introduction of 11.2 I have received, on a regular basis, questions related to “strange” workload system statistics. The last email on that topic was sent to me yesterday. So, instead of sending yet another private reply, I decided to write this short post.

What’s wrong with 11.2 and workload system statistics?

Let’s have a look at the output of the following query:

SQL> SELECT pname, pval1
  2  FROM sys.aux_stats$
  3  WHERE sname = 'SYSSTATS_MAIN';

PNAME                  PVAL1
--------------- ------------
CPUSPEEDNW            1596.0
IOSEEKTIM                4.0
IOTFRSPEED            4096.0
SREADTIM             10900.3
MREADTIM              4525.8
CPUSPEED              1603.0
MBRC                     7.0
MAXTHR            17391616.0
SLAVETHR            413696.0

As you can see, the SREADTIM and MREADTIM times are very high – in this case about three orders of magnitude higher than what you would expect from a regular system.

I’m not aware of the exact cause of this problem, but to me it seems that the statistics externalized in x$kcfio are broken. Anyway, in MOS there are several bugs related to it (9842771 and 9701256). Hence, it’s not a feature (e.g. a change in the unit of measure); it’s a bug. On my Linux test system I’m able to reproduce it on both 11.2.0.1 and 11.2.0.2. According to the bugs mentioned before, the problem is not limited to Linux.

Since with the DBMS_STATS package we are not able to gather correct statistics, the only advice I can give on this topic is to set them manually to sensible values.
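For example, something along these lines (a sketch only – the values below are placeholders to show the mechanism, not recommendations; use values that reflect your own hardware):

BEGIN
  -- start from a clean slate, then set workload statistics explicitly
  dbms_stats.delete_system_stats;
  dbms_stats.set_system_stats('CPUSPEED', 1600);
  dbms_stats.set_system_stats('SREADTIM', 8);
  dbms_stats.set_system_stats('MREADTIM', 16);
  dbms_stats.set_system_stats('MBRC', 8);
END;
/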

Update 2011-03-23

To fix the problem you can install the patch 9842771. It is available for 11.2.0.1 and 11.2.0.2. By the way, since the patch only provides a new version of the dbms_stats_internal package, the statistics externalized in x$kcfio are not broken… they just have another unit of measure.
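After applying the patch, regathering workload statistics should produce sensible values again. A minimal sketch of an interval-based gathering over a representative 30-minute window:

SQL> exec dbms_stats.gather_system_stats(gathering_mode => 'INTERVAL', interval => 30)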

Different Performance from Standard Edition and Enterprise Edition? 3

November 22, 2010 (Back to the Previous Post in the Series) (Forward to the Next Post in the Series) In the previous articles of this series we looked at different performance behavior exhibited by Standard Edition and Enterprise Editions for release versions 10.2.0.5 and 11.2.0.1, when presented with a very simple SQL statement that forced an index access path.  [...]

YPDNGG: You Probably Don’t Need Golden Gate

Before launching into this, I must give due deference to Mogens Nørgaard’s landmark article, You Probably Don’t Need RAC (YPDNR), available here, but originally published Q3 2003 in IOUG Select Journal.  Mogens showed that you can be a friend of Oracle without always agreeing with everything they do.

At Blue Gecko, many of our remote DBA customers have been asking us about Golden Gate.  In July 2009, Oracle bought Golden Gate Software, just one of several companies that have developed log-based replication mechanisms for Oracle and other databases.  This was one of many major acquisitions by Oracle in 2009, including Sun and Relsys. But unlike most of Oracle’s acquisitions, Golden Gate provides very little new functionality not already available in Oracle Streams. Nevertheless, at OpenWorld 2009, Oracle made a shocking announcement.  They declared that Golden Gate would be the primary replication channel for Oracle, and that development would cease on Streams and related components.

Usually, Oracle watches these little third-party products for good ideas, then implements them independently (and better) on their own in the Oracle kernel, then watches as the little third-party companies fizzle out.  A case in point is direct memory access performance sampling.  Precise Software and several other companies in the early 2000s developed  low-impact performance sampling and visualization products for Oracle based on sampling the SGA periodically from an external program.  In version 10g, Oracle answered them with Active Session History (ASH), which did the same thing but better.  Although ASH required customers to purchase the Diagnostic Pack, it still more or less spelled the downfall of competing products.

But in the case of Golden Gate, Oracle already has a log-based replication technology (Streams) built into the kernel, available for a very reasonable price (free with Enterprise Edition).  The only major capability that Streams lacks compared to Golden Gate is the ability to replicate across database platforms (Oracle to MSSQL, MySQL, etc. and vice versa).  Even that capability was clearly around the corner: in 11g, Logical Standby (Data Guard), a technology that uses essentially the same stack of components as Streams, gained cross-platform capabilities.

By 11g, Streams has become a mature and stable product, and is far more scalable and configurable than Golden Gate in many ways.  Streams can mine logs on the source or the target, or even a third system.  Depending on the load profile, you can use a wide variety of configuration choices, including parallelism at almost any point.  Streams also allows customers to choose to enforce transaction order or not.

In contrast, Golden Gate’s parallelism is restricted to the apply side, and in parallel mode it does not have the option of guaranteeing transaction order (it is non-ACID). Golden Gate’s parallel apply splits work up by schema, relying on the assumption that interdependent data at the business-process level is confined to a single schema at a time.  In other words, if all the tables reside in one schema, then parallel apply doesn’t work, and if they reside in many schemas, the changes in one schema may be applied out of order vis-à-vis the changes to the other schemas.

Streams is only one of Oracle’s preexisting features that can compete successfully in specific use cases with Golden Gate.  Even more ancient and time-tested solutions such as advanced replication and remote materialized views remain supported and highly effective, depending on the requirement.

Looking at many of the use cases where our customers have deployed Golden Gate, I find that the simplest and most scalable engineering solution would have been remote fast-refresh materialized views.  Our customers often replicate core look-up data, like exchange rates, inventory levels, and other slowly-changing data between Oracle databases within an enterprise.  For this, Golden Gate is completely unjustified, due to cost and complexity compared to remote materialized views.  If it were a question of heterogeneous (inter-database-product) replication, I would completely understand.  But in the majority of situations where we see Golden Gate in use, it is Oracle to Oracle. Given that, I wonder how it could come to pass that responsible people would recommend and implement a solution for such a requirement involving Golden Gate.  Why would Oracle essentially abandon ten years of development and stabilization on a platform like Streams for a less mature, rudimentary product like Golden Gate? Oracle can’t possibly be asking customers to pay additional license fees for a worse version of a product they already own.
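For illustration, here is roughly what that alternative looks like – a sketch with made-up names (the exchange_rates table and the master_db database link are hypothetical):

-- on the master site: a materialized view log enables fast refresh
CREATE MATERIALIZED VIEW LOG ON exchange_rates;

-- on the replica site: a fast-refresh materialized view over a database link,
-- refreshed every 15 minutes
CREATE MATERIALIZED VIEW exchange_rates_mv
  REFRESH FAST
  START WITH SYSDATE NEXT SYSDATE + 15/1440
  AS SELECT * FROM exchange_rates@master_db;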

So let’s review…

Streams: Mature, complex, requires engineering, highly configurable, scalable, Oracle-only, free.

Golden Gate: Simple, easy to deploy, few configuration options, less scalable, expensive, heterogeneous (inter-RDBMS), might break your data.

For me, the corporate direction with regard to Golden Gate is perplexing and smacks of sales-driven (as opposed to requirements and cost-driven) engineering.  I can only imagine what it must be like for the team at Oracle that built Log Miner and AQ into an impressive suite of options including Streams.

UPDATE

Since I posted this, a colleague inside Oracle reassured me that although the product will bear the name ‘Golden Gate,’ it will incorporate features and capabilities of Streams and Golden Gate into a single suite of functions as versions are released over time.  This means that the declarations from inside Oracle do not represent the abandonment of ten years of development.


Index Join

One of the less well known access paths available to the optimizer is the “index join” also known as the “index hash join” path. It’s an access path that can be used when the optimizer decides that it doesn’t need to visit a table to supply the select list because there are indexes on the table that, between them, hold all the required columns. A simple example might look something like the following:


create table indjoin
as
select
	rownum	id,
	rownum	val1,
	rownum	val2,
	rpad('x',500) padding
from
	all_objects
where
	rownum <= 3000
;

create unique index ij_v1 on indjoin(id, val1);
create unique index ij_v2 on indjoin(id, val2);

-- collect stats on the table and indexes

select
	ij.id
from
	indjoin		ij
where
	ij.val1 between 100 and 200
and	ij.val2 between 50 and 150
;

Note that the columns in the where clause appear in (some) indexes, and the column(s) in the select list exist in (at least) some indexes. Under these circumstances the optimizer can produce the following plan (the test script was one I wrote for 8i – but this plan comes from an 11.1 instance):


---------------------------------------------------------------------------
| Id  | Operation              | Name             | Rows  | Bytes | Cost  |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |                  |     3 |    36 |    24 |
|   1 |  VIEW                  | index$_join$_001 |     3 |    36 |    24 |
|*  2 |   HASH JOIN            |                  |       |       |       |
|*  3 |    INDEX FAST FULL SCAN| IJ_V1            |     3 |    36 |    11 |
|*  4 |    INDEX FAST FULL SCAN| IJ_V2            |     3 |    36 |    11 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access(ROWID=ROWID)
   3 - filter("VAL1"<=200 AND "VAL1">=100)
   4 - filter("VAL2"<=150 AND "VAL2">=50)

Column Projection Information (identified by operation id):
-----------------------------------------------------------
   1 - "ID"[NUMBER,22]
   2 - (#keys=1) "ID"[NUMBER,22]
   3 - ROWID[ROWID,10], "ID"[NUMBER,22]
   4 - ROWID[ROWID,10]

We do a fast full scan of the two indexes, extracting the rowid and id from index ij_v1 and just the rowid from index ij_v2. We can then get the result we want by doing a hash join between these two result sets on the rowid values: any time the two rowsources have a rowid in common, it’s the rowid of a row where val1 is between 100 and 200 and val2 is between 50 and 150 – and the first rowsource is carrying the id, which is the thing we need to report.

There are a couple of little observations that we can make about this example.

  • First, although I’ve only used two indexes in this example, Oracle is not limited to just two; the number of indexes that could be used is effectively unlimited.
  • Second, the index join path is strictly limited to cases where the optimizer can see that every column in the query can be found in indexes on the table.
  • Third, although my example uses index fast full scans, that’s not a necessary feature of the plan; just like any other hash join, Oracle could use an index range (or full) scan to get some of the data.
  • Finally, there are clearly a couple of bugs in the code.

Bugs:

If you check the rows/bytes columns in the plan you’ll see that the predicted number of rows selected is the same for both indexes (lines 3 and 4) – but we extract the rowid and the id from the first index (projection detail for line 3), so the total data volume expected from line 3 is slightly larger than the total data volume from line 4, where we extract only the rowid. Theoretically, therefore, the optimizer has used the indexes in the wrong order – the one supplying the smaller volume of data should have been used as the first (build) rowsource.

More significantly, though, a quick check of the code that generates the data tells you that each index will supply 101 rows to the hash join – and you can even show that for other query execution plans the optimizer will calculate this cardinality (nearly) correctly. In the case of the index join the optimizer seems to have lost the correct individual cardinalities and has decided to use the size of the final result set as the cardinality of the two driving index scans.

There’s more, of course – one of the strangest things about the index join is that if your select list includes the table’s rowid, the optimizer doesn’t consider that to be a column in the index. So even though the predicate section of the plan shows the rowids being projected in the hash join, Oracle won’t use an index join for a query returning the rowid!
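You can see this for yourself with a variant of the earlier query – just add the rowid to the select list and the index join should disappear from the plan:

select
	ij.rowid, ij.id
from
	indjoin		ij
where
	ij.val1 between 100 and 200
and	ij.val2 between 50 and 150
;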

Footnote: The reason I’ve written this brief introduction to the index join is because an interesting question came up at the first E2SN virtual conference.

“If you hint an index hash join, is there any way of telling Oracle the order in which it should use the indexes?”

The answer is no – but there are ways of creating code that will do what you want, and that will be the topic of my next blog.

[Further reading on Index Joins]

Different Performance from Standard Edition and Enterprise Edition? 2

November 21, 2010 (Back to the Previous Post in the Series) (Forward to the Next Post in the Series) In the previous article in this series I compared the Standard Edition of Oracle 10.2.0.5 with the Enterprise Edition of Oracle 10.2.0.5 when a simple SQL statement was executed, looking for examples of different performances in the [...]

New Features

… but only a reference.

Arup Nanda, who wrote a series on 10g new features and then 11g new features for Oracle Magazine, has just published a short note pointing out that Oracle has been busy doing a little house-keeping on their website(s), and this has resulted in his articles moving.

If you want to read some really good material (the how and the why, as well as the what) on the more significant features of 10g and 11g, then follow this link to Arup’s Blog.