Search

OakieTags

Who's online

There are currently 0 users and 31 guests online.

Recent comments

Affiliations

Oracle

Cache buffers chains latch contention troubleshooting using latchprofx.sql example

Laurent Demaret has written a good article about how he systematically troubleshooted cache buffers chains latch contention, starting from wait interface and drilling down into details with my latchprofx tool:

A common cause for cache buffers chains latch contention is that some blocks are visited and re-visited way too much by a query execution. This usually happens due to nested loops joins or FILTER loops retrieving many rows from their outer (driving) row sources and then visiting the inner row-source again for each row from driving row source. Once you manage to fix your execution plan (perhaps by getting a hash join instead of the loop), then the blocks will not be re-visited so much and the latches will be hammered much less too.

The moral of the story is that if you have latch contention in a modern Oracle database, you don’t need to start tweaking undocumented latching parameters, but reduce the latch usage instead. And Laurent has done a good job with systematically identifying the SQL that needs to be fixed.

Good stuff!

If you don’t know what LatchProfX is, read this:

 

audsid

Here’s an example of how the passing of time can allow a problem to creep up on you.

A recent client had a pair of logon/logoff database triggers to capture some information that wasn’t available in the normal audit trail, and they had been using these triggers successfully for many years, but one day they realised that the amount of redo generated per hour had become rather large, and had actually been quite bad and getting worse over the last few months for no apparent reason. (You’ve heard this one before … “honest, guv, nothing has changed”).

Here’s a clue to their problem. Their logoff trigger updated their own auditing table with the following code (although I’ve removed some columns) :

begin
        if sys_context('USERENV', 'SESSIONID') = 0 then
                update system.user_audit a
                set
                        logoff_time = sysdate
                where    audsid = sys_context('USERENV', 'SESSIONID')
                and     logon_time = (
                                select  max(logon_time)
                                from    system.user_audit b
                                where   b.audsid = a.audsid
                        );
        else
                update system.user_audit
                set
                        logoff_time = sysdate
                where   audsid = sys_context('USERENV', 'SESSIONID');
        end if;
end;

The table was suitably indexed to make the correlated update (“most recent of” subquery) operate very efficiently, so that wasn’t the problem.

You might question the validity of updating just the rows with the most recent date when the sys_context(‘userenv’,'sessioned’) is zero, and why the SQL to do the update doesn’t then use literal zeros rather than calling the sys_context() function and using a correlation column – but those are minor details.

You might wonder why zero is a special case, of course, but then you may recall that when a user connects as SYS the audsid is zero – so this code is attempting to limit the update to just the row created by the most recent logon by SYS, which may introduce a few errors but really SYS shouldn’t be connecting to a production system very often.

At this point you might ask which version of Oracle the client was running. They had started using Oracle a long time ago, but this system had gone through Oracle 9i, and was now running 10g; and if you’ve kept an eye on things like autotrace you may have noticed that the audsid used by any SYS session changed from zero to 4294967295 in the upgrade to 10g. So this code is no longer treating SYS as a special case.

By the time I was on site, the client’s audit table held about 87,000 rows for the audsid 4294967295, and every time SYS logged off the session would update every single one of them. (It took a few seconds and about 45 MB of redo for SYS to log off – and SYS sessions connected rather more frequently than expected).

Warning:

If you still have to get through the upgrade from 8i or 9i to 10g, then watch out for code that does things with the deprecated userenv(‘sessionid’) or sys_context(‘userenv’,'sessionid’).

Weird Hint – Open Book Quiz

I ran into a slightly strange hint last week:

insert /*+ append, nologging, parallel(orders_tab, 4) */ …

Anyone know how this hint will behave right off the top of their head?

When I was in school we used to occasionally have quizzes where we were allowed to use our books. So feel free to use any documentation you can find on the topic.

Note: Be sure and see the comments on this one …

UPDATE 01-JUN-11: – Here’s a test case using two easy to verify hints

 
SYS@SANDBOX1> select /*+ monitor, GATHER_PLAN_STATISTICS */ * from dual
  2  ;
 
D
-
X
 
Elapsed: 00:00:00.00
SYS@SANDBOX1> @fsx
Enter value for sql_text: select /*+ monitor, GATHER_PLAN_STATISTICS */ * from dual%
Enter value for sql_id: 
 
SQL_ID         CHILD  PLAN_HASH  EXECS  AVG_ETIME AVG_PX OFFLOAD IO_SAVED_% SQL_TEXT
------------- ------ ---------- ------ ---------- ------ ------- ---------- ----------------------------------------------------------------------
d8yaqqyc0yb9k      0  272002086      1        .00      0 No             .00 select /*+ monitor, GATHER_PLAN_STATISTICS */ * from dual
 
Elapsed: 00:00:00.04
SYS@SANDBOX1> select * from table(dbms_xplan.display_cursor('&sql_id','&child_no','allstats'));
Enter value for sql_id: d8yaqqyc0yb9k
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  d8yaqqyc0yb9k, child number 0
-------------------------------------
select /*+ monitor, GATHER_PLAN_STATISTICS */ * from dual
 
Plan hash value: 272002086
 
---------------------------------------------------
| Id  | Operation                 | Name | E-Rows |
---------------------------------------------------
|   0 | SELECT STATEMENT          |      |        |
|   1 |  TABLE ACCESS STORAGE FULL| DUAL |      1 |
---------------------------------------------------
 
Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level
 
 
19 rows selected.
 
Elapsed: 00:00:00.04

As you can see in the Note section, the GATHER_PLAN_STATISTICS hint was not obeyed (actually it wasn’t even evaluated). This is due to the fact that the parser quits evaluating the hint text when it hits the comma. Here’s the same statement with the comma removed.

 
SYS@SANDBOX1> select /*+ monitor GATHER_PLAN_STATISTICS */ * from dual;
 
D
-
X
 
Elapsed: 00:00:00.01
SYS@SANDBOX1> @fsx
Enter value for sql_text: select /*+ monitor GATHER_PLAN_STATISTICS */ * from dual
Enter value for sql_id: 
 
SQL_ID         CHILD  PLAN_HASH  EXECS  AVG_ETIME AVG_PX OFFLOAD IO_SAVED_% SQL_TEXT
------------- ------ ---------- ------ ---------- ------ ------- ---------- ----------------------------------------------------------------------
1scryy04ggv60      0  272002086      1        .00      0 No             .00 select /*+ monitor GATHER_PLAN_STATISTICS */ * from dual
 
Elapsed: 00:00:00.19
SYS@SANDBOX1> select * from table(dbms_xplan.display_cursor('&sql_id','&child_no','allstats'));
Enter value for sql_id: 1scryy04ggv60
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1scryy04ggv60, child number 0
-------------------------------------
select /*+ monitor GATHER_PLAN_STATISTICS */ * from dual
 
Plan hash value: 272002086
 
-----------------------------------------------------------------------------------------------------
| Id  | Operation                 | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |      |      1 |        |      1 |00:00:00.01 |       2 |      2 |
|   1 |  TABLE ACCESS STORAGE FULL| DUAL |      1 |      1 |      1 |00:00:00.01 |       2 |      2 |
-----------------------------------------------------------------------------------------------------
 
 
13 rows selected.
 
Elapsed: 00:00:00.08

Notice that this time the GATHER_PLAN_STATISTICS hint was obeyed and so the xplan output has the A-Rows column and there is no Note section complaining about missing plan statistics. The documentation states that comment text can be interspersed with valid hints though. So why isn’t the comma treated as comment text. Well apparently it is a reserved word. Oddly enough, the word COMMENT is a reserved word as well. Watch this:

 
SYS@SANDBOX1> select /*+ monitor COMMENT GATHER_PLAN_STATISTICS */ * from dual;
 
D
-
X
 
Elapsed: 00:00:00.00
SYS@SANDBOX1> @fsx
Enter value for sql_text: select /*+ monitor COMMENT GATHER_PLAN_STATISTICS */ * from dual
Enter value for sql_id: 
 
SQL_ID         CHILD  PLAN_HASH  EXECS  AVG_ETIME AVG_PX OFFLOAD IO_SAVED_% SQL_TEXT
------------- ------ ---------- ------ ---------- ------ ------- ---------- ----------------------------------------------------------------------
8dxup58bgaxsy      0  272002086      2        .02      0 No             .00 select /*+ monitor COMMENT GATHER_PLAN_STATISTICS */ * from dual
 
Elapsed: 00:00:00.03
SYS@SANDBOX1> @dplan_allstats
Enter value for sql_id: 8dxup58bgaxsy
Enter value for child_no: 
 
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  8dxup58bgaxsy, child number 0
-------------------------------------
select /*+ monitor COMMENT GATHER_PLAN_STATISTICS */ * from dual
 
Plan hash value: 272002086
 
---------------------------------------------------
| Id  | Operation                 | Name | E-Rows |
---------------------------------------------------
|   0 | SELECT STATEMENT          |      |        |
|   1 |  TABLE ACCESS STORAGE FULL| DUAL |      1 |
---------------------------------------------------
 
Note
-----
   - Warning: basic plan statistics not available. These are only collected when:
       * hint 'gather_plan_statistics' is used for the statement or
       * parameter 'statistics_level' is set to 'ALL', at session or system level
 
 
19 rows selected.
 
Elapsed: 00:00:00.03

So the moral is don’t put comment text in your hints. If you must have a comment in your SQL, put it in a separate comment structure like so:

select /*+ monitor  GATHER_PLAN_STATISTICS */ /* This is a comment */ * from dual;

And if you really feel you need commas between your hints try something like this:

select /*+ monitor"," GATHER_PLAN_STATISTICS */ * from dual;

Audit Ouch!

A few days ago I was rehearsing a presentation about how to investigate how Oracle works, and came across something surprising. Here’s a simple bit of code:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id
	from dual
	connect by
		rownum <= 10000
)
select
	rownum				id,
	lpad(rownum,10,'0')		small_vc,
	case mod(rownum,1000)
		when 0 then 'Y' else 'N'
	end 				flag
from
	generator	v1,
	generator	v2
where	rownum <= 1000000
;

-- gather table stats here

execute snap_redo.start_snap
execute snap_my_stats.start_snap

update
	/*+ full(t1) */
	t1
set
	flag = 'N'
where
	flag = 'Y'
;

execute snap_my_stats.end_snap
execute snap_redo.end_snap

The two “snap” packages take a starting snapshot of v$mystat/v$statname and v$sysstat (the latter restricted to redo statistics only) and an ending snapshot, and report the change between snapshots. Here are the results for just the redo statistics from the update when I ran the code on 10.2.0.3:

Name                                                                     Value
----                                                                     -----
redo entries                                                             1,907
redo size                                                              439,260
redo ordering marks                                                         17
redo subscn max counts                                                      18

You may at this point be wondering why the first two figures are roughly twice the size they ought to be; you might, of course, be wondering what you’re supposed to be seeing here (in which case you need to see my presentation “The Beginners’ Guide to becoming an Expert.”

My code updates one row in every thousand and each row is in a separate block. That means I should see roughly one redo entry per row at roughly 250 bytes per redo entry. Since I’m using Oracle 10g (single instance) the total number of entries ought to be slightly less than 1,000 because the first 40 or 50 should be accumulated in the private redo and in-memory undo areas. So we have about 950 redo entries too many; where did they come from ?

The test takes about 60 seconds to run, so I repeated it taking snapshots of a couple more dynamic performance views – which didn’t actually help very much – and then ran it one more time after switching log files so that I could dump the log file and find the relevant redo entries easily after the update. I won’t bore you with an extract from the log file dump, I’ll just point out that it showed me two redo records per row – op code 11.4 followed by op code 11.5, which is: “lock row, update row” – when it should simply have been showing nothing but “update row” records.

It took me a little while to figure out what was going on, but the title of this piece gives it away. A few days earlier I had enabled audit (audit_trail = db_extended) on my laptop to demonstrate a point to a client – the redundant “lock row” records appeared as a side effect (which I hadn’t noticed at the client site because we were focussing on a completely different point – which I might write about next week).

Here’s the really important point – I had NOT enabled any audit options on this table, the only thing I had done was make it possible for Oracle to record audit records. The spare records were NOT redo records relating to the aud$ table; instead, the simple act of enabling audit introduced a completely spurious “lock row” record on every update I did.

So I changed audit_trail to ‘none’, bounced the database and repeated the test, and got back to the result that I had been expecting:

Name                                                                     Value
----                                                                     -----
redo entries                                                               961
redo size                                                              295,996
redo ordering marks                                                         13
redo subscn max counts                                                      14

As you can see, we’re back to 1,000 redo records (minus the odd 40 or 50) at about 250 bytes per entry (allowing for the size of the first “private” redo record that combined lots of changes).

Note: Having observed this problem on 10.2.0.3 with audit_trail = db_extended I repeated the tests with audit_trail=db, and then tested a couple of older and newer versions of Oracle. The phenomenon appeared only on 10.2.0.3, and it didn’t matter whether I chose db or db_extended.

Footnote: the day after I made this discovery I gave what must have been my worst ever presentation: not only had I lost three hours sleep working out the details of this problem, I’d made a couple of “quick changes” to some of my demo scripts to do some extra checks and I had failed to clean up properly afterwards so I kept getting the “wrong” results and confusing myself in my live demonstrations.

Amazon RDS for Oracle: First Impressions

On Tuesday, Amazon announced availability of an Oracle version of their Relational Database Service (RDS). RDS is one of Amazon’s cloud services. You can think of it as ”database as a service.” Amazon provides a running database, storage, horsepower and a variety management tasks. And all you have to do is store you data in it. RDS has been available with a MySQL engine for some time, but the Oracle version of this service has been long anticipated.

As with Amazon’s other cloud services, you control and manage RDS services using a web application API.  You can either write your own software to do this, or use Amazon’s command line API tools or Amazon’s web-based console.

Fedora 15: First Impressions…

It’s been nearly six months since I made the switch from CentOS to Fedora as my main desktop OS.

The Fedora 15 final release dropped a couple of days ago and I slapped it on my main desktop PC straight away. As usual, the first impression is all about the visuals. GNOME 3 looks great. I’m sure lots of people think KDE looks great too, but I tend to just stick with the default window manager, so it’s GNOME for me.

If you’ve read any of the press you will know that the menu bar and task bar have been removed. This is kinda weird at first. How does it affect me?

  • Task Bar: I was surprised how reliant I was on the task bar for switching between open apps. I would say about 50% of the time I was using the task bar and 50% of the time I was using alt+tab. Now I’m using alt+tab 100% of the time. I think this change has worked really well for me. I feel like I’m navigating quicker.
  • Menu Bar: I think this one will annoy a few people. In essence, the menu bar is still there, but one layer down. If you click on “Activies” in the top left (or hit the Windows key) you switch to the activities screen. There is a favorites doc on the left and if you click “Applications”, you get a menu (more like a filter) on the right of the screen. That’s all nice, but the thing I really like is if you click the Windows key and type in a few letters it returns all apps and items that contain those letters in the name. Similar to the Windows menu, but prettier. So if you insist on using the menu like an old-style menu, I think this change will annoy you as it requires an extra click and some animations. If you use the favorites doc and the search feature I think it’s quite cool.

The fancy visuals worked straight out of the box for my main desktop machine, but one of my other machines (with a better graphics card) couldn’t handle GNOME 3 and ran using fallback mode. Fallback mode is pretty much like previous GNOME releases with a menu bar and task bar. I’m sure some people will prefer fallback mode, but I think the new stuff is certainly worth a try.

If you really can’t handle the new interface you can manually switch to fallback mode. Start up the System Info dialog (Activities > Applications > System Settings > System Info), click “Graphics” , flick the “Forced Fallback Mode” switch and relog.

As for the OS itself, I’ve had no dramas so far, but it is early days. Time will tell…

By the way, I did the usual Oracle on Fedora thing.

Cheers

Tim…




audit

Here’s one of those funny little details that can cause confusion:

SQL> select * from user_audit_object;

no rows selected

SQL> audit select on indjoin by session whenever successful;

Audit succeeded.

SQL> select
  2     count(*)
  3  from
  4     indjoin         ij
  5  where
  6     id between 100 and 200
  7  and        val between 50 and 150
  8  ;

  COUNT(*)
----------
        51

1 row selected.

SQL> select * from user_audit_object where obj_name = 'INDJOIN';

no rows selected

So we had nothing in our “object audit trail”, then we enabled auditing on one particular table for select statements (and audit_trail = db has been set in the parameter file) but our select statement hasn’t been audited. What went wrong ?

Let’s try a different query against the audit view which, until a moment ago, held no data:


SQL> select obj_name from user_audit_object;

OBJ_NAME
-----------------------------------------------
index$_join$_001

So we have audited something – but what ? Here’s the execution plan for our query:

----------------------------------------------------------------------------------------
| Id  | Operation           | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |                  |     1 |     8 |     5  (20)| 00:00:01 |
|   1 |  SORT AGGREGATE     |                  |     1 |     8 |            |          |
|*  2 |   VIEW              | index$_join$_001 |     3 |    24 |     5  (20)| 00:00:01 |
|*  3 |    HASH JOIN        |                  |       |       |            |          |
|*  4 |     INDEX RANGE SCAN| IJ_PK            |     3 |    24 |     3  (34)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN| IJ_VAL           |     3 |    24 |     3  (34)| 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("VAL"<=150 AND "ID"<=200 AND "ID">=100 AND "VAL">=50)
   3 - access(ROWID=ROWID)
   4 - access("ID">=100 AND "ID"<=200)
   5 - access("VAL">=50 AND "VAL"<=150)

See that index$_join$_001 in line 2 ? We have a plan that uses the “index hash join” mechanism to query that table – so Oracle audits the query, but unfortunately uses the name of the internal view in the audit record.

Be very careful how you audit objects; you may think that an object has not been accessed when it has (sort of). If you’ve started doing this type of auditing to check whether or not an object is ever accessed you could be fooled.

You could claim, of course, that the object hasn’t really been accessed – but compare the index join above with the following (cut-n-paste) example, which I ran after deleting everything from the aud$ table:

SQL> connect test_user/test
Connected.

SQL> select obj_name from user_audit_object;

no rows selected

SQL> set autotrace on explain
SQL> select count(*) from indjoin where id = 23;

  COUNT(*)
----------
         1

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 689603510

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |     1 |     4 |     1   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE    |       |     1 |     4 |            |          |
|*  2 |   INDEX UNIQUE SCAN| IJ_PK |     1 |     4 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID"=23)

SQL> set autotrace off
SQL> select obj_name from user_audit_object;

OBJ_NAME
----------------
INDJOIN

1 row selected.

We haven’t actually visited the indjoin table, but the audit code recognises that that was the intent, even though the optimizer made physical access unnecessary. The behaviour ought to be the same for both cases.

(This example orginally came from a client site, by the way – with the question “how come the audit trail says we’re not visiting this table?”. The client version, and the version I tested on, was 10.2.0.3. There were a couple of other little surprises I found at the same time – I may find time to write about them some other day.)

Offloading production backups to a different storage array

For quite a while Oracle DBAs have performed split mirror backups using special devices called “Business Continuance Volumes” or BCVs for short. A BCV is a special mirror copy of a LUN on the same storage array as the primary copy.

In a BCV backup scenario, the storage administrator (usually) “splits” the mirror after putting the database into hot backup mode. After the mirror is split, the database is taken out of hot backup mode and resumes normal operation. A new Oracle instance on a different host can be mounted using the split mirror copy of the database for backups. The use of this technology for refreshing a test environment is out of scope of this article. The below figure demonstrates the idea:

The advantage of such an approach is that the backup operation, initiated from the mount host, should not impact the performance of the production database. Once the backup is complete, the BCV for the ARCH diskgroup should be re-synchronised with the source LUN, whereas the DATA disk group should not. This allows us to quickly recover from problems with the primary LUNs-more on that on a later post.

One Step Further

On my current site this process has been refined. One of the requirements was that SRDF should be used as a means for disaster recovery. I should say that SRDF, or Symmetrix Remote Data Facility is my customer’s preferred method for DR, and I do by no means want to advertise for EMC here; it just so happened that I was working on EMC storage for this project.

I should also note that Data Guard cannot be used due to an application constraint (an ACFS file system is integral part of the application and must be backed up together with the database).

All storage is presented by ASM, which in many ways makes life easier for me. The ASM LUNs or “disks” will have all the required information in the “disk” header. So after the cloned LUNs have been presented to the mount host, all I need to do is make them available to the OS, and optionally run an “oracleasm scandisks” as root to detect them. From then on I should be able to simply mount the disk group (either via sql*plus or srvctl in 11.2). The actual backup requires a few more steps, these are shown below.

Before going further into detail let’s have a look at the architecture first:

Two data centres are in use: the local one is used for production in normal operations, including backups. One of the design requirements was that backups can be taken on either data centre, in respect to the DR situation.

Split mirror backups as shown in the above figure are taken on the local data centre in normal operations. In case of DR, the remote data centre will be configured to take backups. For this to happen, it is necessary to clone the replicated LUNs (that would be activated in the DR event) much as it’s done for the local data centre’s split mirror backups. As an added advantage, the clones can be used to create pre- and post batch “backups” that would be activated in case of a horrible failure of the batch/end-of-year processing.

Taking the Backups

To be able to take a backup of the split mirror, a few things are necessary. Most of these are documented in MOS note “RMAN and Split Mirror Disk Backups [ID 302615.1]”. You certainly require a recovery catalogue database in the first place. As a first step you register the database in the catalog. You perform this step connect to the production database and recovery catalog, as shown in the below example:

$ rman target / catalog=rman/rmanPwd@rccat

Recovery Manager: Release 11.2.0.2.0 - Production on Fri May 20 10:39:00 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PROD (DBID=1796248120)
connected to recovery catalog database

RMAN> register database;

database registered in recovery catalog
starting full resync of recovery catalog
full resync complete

RMAN> exit

If you like, you can configure defaults at this stage, I opted for the following (still connected to the primary database and recovery catalog):

RMAN> configure retention policy to redundancy 3;

new RMAN configuration parameters:
CONFIGURE RETENTION POLICY TO REDUNDANCY 3;
new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE DEFAULT DEVICE TYPE TO SBT_TAPE;

new RMAN configuration parameters:
CONFIGURE DEFAULT DEVICE TYPE TO 'SBT_TAPE';
new RMAN configuration parameters are successfully stored

RMAN> configure controlfile autobackup off;

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP OFF;
new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE sbt_tape to '%F';

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE 'SBT_TAPE' TO '%F';
new RMAN configuration parameters are successfully stored

RMAN> configure device type sbt_tape parallelism 4 backup type to backupset;

new RMAN configuration parameters:
CONFIGURE DEVICE TYPE 'SBT_TAPE' PARALLELISM 4 BACKUP TYPE TO BACKUPSET;
new RMAN configuration parameters are successfully stored

RMAN> exit

Now switch over to the mount host for some real work.

The high level steps for taking a backup of the “split mirror” are:

  1. Ensure that the cloned LUNs are presented to the mount host’s operating system, including appropriate zoning on the fabric
  2. Ensure that the multi-pathing solution of choice is correctly configured for all paths to the cloned LUNs
  3. Ensure that the ASM disks are known to the local ASM instance. That may including running an /etc/init.d/oracleasm scandisks as root, or putting the relevant rules into /etc/udev/rules.d/

These steps have been quite generic and will depend on your OS and storage stack. I won’t go into detail, but you might find some relevant bits and pieces on my blog if you are interested.

The next step is to make the ASM disks known to ASM. Initially that has to be done via the command line, from 11.1 onwards you have to connect as SYSASM:

SQL> alter  diskgroup DGName mount;

For Oracle 11.2 this automatically creates a resource for diskgroup DGName in the OCR which is very convenient (especially in RAC environments). Next time, all you need to do is to execute “srvctl start diskgroup -g DGName” as the grid software owner.

For pre 11.2 environments you might want to consider updating the “asm_diskstring” initialisation parameter accordingly.

Once the ASM disk groups your database requires are mounted in ASM, it’s time to mount the database. If not done so already, register the database in the OCR and ensure that you add the spfile option as well, as in this example (Oracle RAC users would add instances additionally):

$ srvctl add database -d PROD -o $ORACLE_HOME -c SINGLE -p ‘+data/PROD/spfilePROD.ora’ \
> -s mount -a DATA,ARCH

As per said MOS document, it is required to start the database using a BACKUP controlfile on the mount host. Otherwise you’d end up with these RMAN errors after the first backup:

RMAN-3014: Implicit resync of recovery catalog failed
RMAN-6038: Recovery catalog package detected an error
RMAN-20035: Invalid high RECID error

To do so, create a backup controlfile on the primary database before splitting the mirror. The process is very much the same you would for a physical standby database:

SQL> alter database backup controlfile to ‘/tmp/backup.ctl’;

Database altered.

This controlfile now needs to be made available to the mount host-an elegant way would be to use DBMS_FILETRANSFER to perform this task, or asmcmd’s copy command.

Once it’s on the mount host, say in /tmp//backup.ctl, it needs to be made available to Oracle. The easiest way is to use RMAN for this:

Connect to the mirror instance as SYSDBA; do not connect to the recovery catalog.

$ rman target /

Recovery Manager: Release 11.2.0.2.0 - Production on Fri May 20 10:53:40 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: PROD (DBID=1796248120, not open)

RMAN> shutdown immediate

using target database control file instead of recovery catalog
database dismounted
Oracle instance shut down

RMAN> startup nomount
connected to target database (not started)
Oracle instance started

Total System Global Area    9219969024 bytes

Fixed Size                     2234056 bytes
Variable Size               4630513976 bytes
Database Buffers            4563402752 bytes
Redo Buffers                  23818240 bytes

RMAN> restore controlfile from '/tmp/PROD/ backup.ctl';

Starting restore at 20-MAY-11
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=200 instance=PROD device type=DISK

channel ORA_DISK_1: copied control file copy
output file name=+ARCH/prod/controlfile/current.256.737725307
output file name=+DATA/prod/controlfile/current.256.737725307
Finished restore at 20-MAY-11

If you were really desperate to back the CURRENT controlfile (which is part of the clone) up, you could have done this prior to the restore of the backup controlfile. You must not connect to the recovery catalog in this case-see MOS note 1111369.1 for more information.

Querying V$DATABASE.CONTROLFILE_TYPE should now return “BACKUP”. With the setup completed, you are ready to back the database up from the cloned LUNs. Connect to RMAN again, using the backup instance and the recovery catalog and initiate the backup. For example:

run {
allocate channel t1 type sbt_tape parms 'ENV=(TDPO_OPTFILE=/u01/app/oracle/product/admin/PROD/tdp/tdpo_PROD.opt)';
allocate channel t2 type sbt_tape parms 'ENV=(TDPO_OPTFILE=/u01/app/oracle/product/admin/PROD/tdp/tdpo_PROD.opt)';
backup database;
}

This completes the backup of the database. One caveat exists: the control file is not backed up as part of this process-RMAN doesn’t back a “backup” controlfile up:

$ rman target / catalog rman/rmanPwd@rccat

Recovery Manager: Release 11.2.0.2.0 - Production on Tue May 24 12:51:22 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

connected to target database: GB1 (DBID=1796248120, not open)
connected to recovery catalog database

RMAN> CONFIGURE CONTROLFILE AUTOBACKUP ON;

new RMAN configuration parameters:
CONFIGURE CONTROLFILE AUTOBACKUP ON;
new RMAN configuration parameters are successfully stored

RMAN> run {
2> allocate channel t1 type sbt_tape parms
3> 'ENV=(TDPO_OPTFILE=/u01/app/oracle/product/admin/PROD/tdp/tdpo_PROD.opt)';
3> backup tablespace USERS;
4> }

allocated channel: t1
channel t1: SID=199 instance=PROD device type=SBT_TAPE
channel t1: Data Protection for Oracle: version 5.5.1.0

Starting backup at 24-MAY-11
channel t1: starting full datafile backup set
channel t1: specifying datafile(s) in backup set
input datafile file number=00006 name=+DATA/prod/datafile/users.264.737727697
channel t1: starting piece 1 at 24-MAY-11
channel t1: finished piece 1 at 24-MAY-11
piece handle=1cmd4ott_1_1 tag=TAG20110524T125316
comment=API Version 2.0,MMS Version 5.5.1.0
channel t1: backup set complete, elapsed time: 00:00:01
Finished backup at 24-MAY-11

RMAN-06497: WARNING: control file is not current, control file AUTOBACKUP skipped
released channel: t1

RMAN>  exit

It’s not difficult at all to get around this problem. As part of the regular archive log backups you are performing on the production database anyway, you add a “backup current controlfile” and a command to resync the catalog (“resync catalog”).

At this stage I should note that the backup process wouldn’t be different in principle, if additional backups were to be taken on the DR site (licensing questions aside). Instead of having to clone the primary LUNs on the local data centre, the storage admins would clone the replicated LUNs (the “R2s” in EMC-talk) and bring these clones up on the DR mount host.

Summary

The process described above approach was interesting from my personal point of view as I haven’t used this concept before. I’m an Oracle DBA, and I feel most comfortable when I have things in hand, relying on another team for doing BAU database tasks is a new experience.

The process description deliberately left product names out unless they were part of the command output. The concept is quite universal and is by no means tied down to a specific vendor.

One of the requirements is that your tape solution can talk to the primary host(s) and the mount host in the local and remote data centres. Although I have seen sites (including rather large ones ) where the DR site was not configured to take backups. Although you could argue that you shouldn’t operate from your DR site for long, that’s by no means an excuse for not having backups. Let’s suggest you run into a block corruption, how would you recover from that without a backup? But then the storage team usually argue that block corruption doesn’t happen on their high end arrays.

References

  • RMAN and Split Mirror Disk Backups [ID 302615.1]
  • RMAN Backup of Controlfile fails from Split Mirror / BCV Copy. Is there any way to take controlfile backup from Split Mirror / BCV so that the Production DB is not used for Controlfile backup? [ID 1111369.1]
  • Supported Backup, Restore and Recovery Operations using Third Party Snapshot Technologies [ID 604683.1]

oradebug peek

An odd little discovery today. The peek command of oradebug is fairly well documented on the internet, and I’ve used it from time to time to find the contents of memory. Earlier on today, though, I was running 11.1.0.6 and it didn’t do what I was expecting. I was trying to dump a log buffer block from memory so after logging on as sys and finding the location for the block I duly typed:


oradebug setmypid
peek 0x09090000 512

The response I got looked, as expected, like this:


[9090000, 9090200) = 00002201 0000E3D7 00000974 E1808044 00340035 00350035 00340035
0034002F 0022002F 0309002C 050C0AC2 010860C4 02C1023E 056F7807 08290714 ...

As you can see, the dump range is given as 512 bytes (the variation in bracketing for the "[9090000,9090200)" means include the low address but exclude the high address) but the dump itself is only 15 words. My memory told me that this is what Oracle does on screen and that all I had to do was read the trace file for the full 512 bytes. Unfortunately my memory was wrong - the trace file was limited to the same 15 words.

It wasn't until I added an extra number to the command that I got the full 512 bytes I wanted in the trace file:

=== On screen ===
SQL> oradebug peek 0x09090000 512 1
[9090000, 9090200) = 00002201 0000E3D7 00000974 E1808044 00340035 00350035 00340035
0034002F 0022002F 0309002C 050C0AC2 010860C4 02C1023E 056F7807 08290714 ...

=== from the trace file ===
Processing Oradebug command 'peek 0x09090000 512 1'
[9090000, 9090200) = 00002201 0000E3D7 00000974 E1808044 00340035 00350035 ...
Dump of memory from 0x9090018 to 0x9090200
9090010                   00340035 0034002F          [5.4./.4.]
9090020 0022002F 0309002C 050C0AC2 010860C4  [/.".,........`..]
9090030 02C1023E 056F7807 08290714 056F7807  [>....xo...)..xo.]
9090040 082A0714 023DC203 0303C102 091215C2  [..*...=.........]
     ...
90901E0 03083207 02083DC2 C20303C1 C2031215  [.2...=..........]
90901F0 00012D09 00000009 00000000 0016010B  [.-..............]

*** 2011-05-20 16:35:11.156
Oradebug command 'peek 0x09090000 512 1' console output:
[9090000, 9090200) = 00002201 0000E3D7 00000974 E1808044 00340035 00350035 00340035 0034002F 0022002F 0309002C 050C0AC2 010860C4 02C1023E 056F7807 08290714 ...

Apart from the repeating the on-screen dump, the trace file now reports the full dump, with both a raw list and a character-based interpretation of each 16 bytes that starts, unfortunately, by repeating the first 24 (0×18) bytes in the same format as the onscreen format.

Maybe I’m wrong about how I used to peek data – maybe it’s just a variation in this version of Oracle.

ASSM

Here’s an interesting little detail (obvious AFTER the event) about space management with ASSM (automatic segment space management). It starts with this question on OTN:

When I alter table deallocate unused and keep 1K the object ends up with 24 blocks, even after I’ve truncated the table. Why?
This is in a tablespace using ASSM, with locally managed extents set to use automatic (system) allocation.

Ultimately the answer is – the first extent in this table started life at 8MB, and an extent that large needs to have 16 level 1 bitmap (space management) blocks, one level 2 bitmap block, and the segment header block before you get to data blocks. When you truncate and deallocate Oracle doesn’t recreate the map, so the extent has to start with 18 blocks – round that up to the multiple of 8 blocks (the 64KB that Oracle normally uses for starting extents for small objects) and you get the 24 blocks from the question.

It took us a bit of time to get to the right answer on the thread – and that’s why I’m giving you the quick answer.