Search

OakieTags

Who's online

There are currently 0 users and 58 guests online.

Recent comments

Affiliations

Recover from unusable dictionary indexes in 10g using 11g

Some time ago was contacted by a company where someone had generated a script to move tables to another tablespace. Unfortunately, they had made an error and included sys and system tables.
Luckely, Oracle does not allow you to move base dictionary tables, throwing an ORA-00701 error.
ERROR at line 1:
ORA-00701: object necessary for warmstarting database cannot be altered
Anyway, some tables were moved out of the system tablespace and this left a trail of unusable dictionary indexes.
This resulted in a database throwing various errors and when the database was restarted, it just gave up.
SQL*Plus: Release 10.2.0.5.0 - Production on Tue Feb 12 17:24:03 2013

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 1015021568 bytes
Fixed Size                  2101168 bytes
Variable Size             243269712 bytes
Database Buffers          734003200 bytes
Redo Buffers               35647488 bytes
Database mounted.
ORA-01092: ORACLE instance terminated. Disconnection forced
Alert.log snippet :
Tue Feb 12 17:24:12 GMT+01:00 2013ALTER DATABASE OPEN
Tue Feb 12 17:24:12 GMT+01:00 2013Thread 1 opened at log sequence 18
  Current log# 3 seq# 18 mem# 0: /redoctl1/TEST/redo03.rdo
Successful open of redo thread 1
Tue Feb 12 17:24:12 GMT+01:00 2013MTTR advisory is disabled because FAST_START_MTTR_TARGET is not set
Tue Feb 12 17:24:12 GMT+01:00 2013SMON: enabling cache recovery
Tue Feb 12 17:24:12 GMT+01:00 2013Errors in file /home/app/oracle/admin/TEST/udump/test_ora_48496640.trc:
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01502: index 'SYS.I_DEPENDENCY1' or partition of such index is in unusable state
Tue Feb 12 17:24:12 GMT+01:00 2013Error 704 happened during db open, shutting down database
USER: terminating instance due to error 704
Instance terminated by USER, pid = 48496640
ORA-1092 signalled during: ALTER DATABASE OPEN...
Trace file snippet :
*** ACTION NAME:() 2013-02-12 17:24:12.414
*** MODULE NAME:(sqlplus@ab00s123 (TNS V1-V3)) 2013-02-12 17:24:12.414
*** SERVICE NAME:(SYS$USERS) 2013-02-12 17:24:12.414
*** SESSION ID:(152.3) 2013-02-12 17:24:12.414
ORA-00704: bootstrap process failure
ORA-00604: error occurred at recursive SQL level 1
ORA-01502: index 'SYS.I_DEPENDENCY1' or partition of such index is in unusable state
The customer was on 10.1.0.3. I'm reproducing the case on a 10.2.0.5 database for the purpose of this blogpost.
The database was already down for 4days and a SR1 was already raised with Oracle.
Their first solution was to restore the backup - that was a bit tricky as apparently... as usual in these situations, there was no backup (they thought there was but there wasn't)
At first I focused on getting the data out with DUDE which took about 4hours for about 40GB, producings lots of DMP's and DDL scripts. 
Then a buddy of mine would take over and do the dirty work of rebuilding a new database and reload all data based on DUDE's output.
During the unload I had glanced at the SR1 call and noticed that the Oracle support engineer had done an effort to revive the crashed database by opening the 10G database using a 11G instance in upgrade mode. 
Which at first sounded strange to me as that would introduce an extra complexity  because of mis-matching dictionaries.
Once I had unloaded all data with DUDE (and most pressure was gone), I tried to open the database using an 11.2.x instance, making sure the COMPATIBLE parameter was set to 10.1.0.3 (otherwise you can't start your db with 10.1.0.3 anymore)
At least there was a cold backup of the database from just after the table moves, so I could play around a bit.
sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Tue Feb 12 17:36:33 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup upgrade pfile=/tmp/init.ora
ORA-32006: BACKGROUND_DUMP_DEST initialization parameter has been deprecated
ORA-32006: USER_DUMP_DEST initialization parameter has been deprecated
ORACLE instance started.

Total System Global Area 1010479104 bytes
Fixed Size                  2227776 bytes
Variable Size             260047296 bytes
Database Buffers          713031680 bytes
Redo Buffers               35172352 bytes
Database mounted.
Database opened.
Interesting enough the instance opened using the 11g instance in upgrade mode!
He then suggested to update ind$ and set the flag column to 1024 (0x400) for SYS.I_DEPENDENCY1.
Looking at dcore.bsq or sql.bsq this means disabling an index the quick and dirty way (/* index is disabled : 0x400 */).
This made the instance crash after bouncing it - game over. So that was clearly not the way the go.
I thought it could have something to do with the many differences between the 10g and 11g dictionary, so in my next attempt I tried the folowing steps:
- open system datafile in a hexeditor
- lookup SYS.I_DEPENDENCY1 in obj$ (based on dataobjectid and info produced by dude)
- mark SYS.I_DEPENDENCY1 as deleted in obj$ 
- see what happens if I fire up the db using 10g instance
Unfortunately I had forgotten to edit the blockheader (mainly the nrow value and the row directory) resulting in yet again a burning instance.
I did not take another attempt as it took to long to restore the db and my job was done.
But after thinking about it later that evening I reckoned, why not try to rebuild the indexes in 11g ?
So I tried one more time :
After starting the 10g db with 11g software and issueing a rebuild on SYS.I_DEPENDENCY1, this was the error returned :
SQL> alter index I_DEPENDENCY1 rebuild ;
alter index I_DEPENDENCY1 rebuild
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00904: "DEFMAXSIZE": invalid identifier
Question is - where is DEFMAXSIZE coming from ?
So I issued a trace of my session and here's the recursive sql that's issued during a rebuild :
=====================
PARSE ERROR #4577143416:len=453 dep=1 uid=0 oct=3 lid=0 tim=5543224986472 err=904
select parttype, partcnt, partkeycols, flags, defts#, defpctfree, defpctused, definitrans, defmaxtrans, deftiniexts, defextsize, defminexts, defmaxexts, defextpct, deflists, defgroup
s, deflogging, spare1, mod(spare2, 256) subparttype, mod(trunc(spare2/256), 256) subpartkeycols, mod(trunc(spare2/65536), 65536) defsubpartcnt, mod(trunc(spare2/4294967296), 256) def
hscflags, mod(spare3, 256) interval_dty, rowid, defmaxsize from partobj$ where obj# = :1
=====================
Now remember - this is an 11g instance running a 10g database with it's original dictionary - when comparing dictionaries (sql.bsq versus dpart.bsq), the column DEFMAXSIZE did not exist in 10g.
So instead of introducing the complexity of running the migrate scripts, I decided to just add the column and see what happens :
SQL>  alter table partobj$ add defmaxsize number ;

Table altered.

SQL> alter index I_DEPENDENCY1 rebuild ;

Index altered.
Bonus !
During the actual case the above would also fail - it would complain about index I_WRI$_OPTSTAT_IND_OBJ#_ST being unusable.
This index seems related to the set of tables responsible for storing object statistics (index statistics in this case) - it seems logical that if you rebuild an index, oracle adds some stats in these tables.
If the related index is then not usable, it's only logical it complains.
So in fact I had to rebuild that index first :
SQL> alter index I_WRI$_OPTSTAT_IND_OBJ#_ST rebuild ;

Index altered.
And now I was able to rebuild all indexes without issues.
We could now start the db back in oracle 10g :
SQL*Plus: Release 10.2.0.5.0 - Production on Tue Feb 12 17:48:00 2013

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 1015021568 bytes
Fixed Size                  2101168 bytes
Variable Size             243269712 bytes
Database Buffers          734003200 bytes
Redo Buffers               35647488 bytes
Database mounted.
Database opened.
Double BONUS !
Now all that was left was to recompile the invalid objects :
SQL> select count(*) from dba_objects where status='INVALID' and owner='SYS' ;

       496


SQL> @?/rdbms/admin/utlrp

SQL> select count(*) from dba_objects where status='INVALID' and owner='SYS' ;

  COUNT(*)
----------
         0
And voila - the db is back among the living !
Again, when I reproduced this on my test kit, I had no invalid objects left after utlrp ran the first time, but on the actual database I was left with invalid KU$ views which are related to metadata generation needed for exp/expdp utilities.
I noticed this when I tried to do a full export :
About to export the entire database ...
. exporting tablespace definitions
. exporting profiles
. exporting user definitions
. exporting roles
. exporting resource costs
. exporting rollback segment definitions
. exporting database links
. exporting sequence numbers
. exporting directory aliases
. exporting context namespaces
. exporting foreign function library names
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions
. exporting system procedural objects and actions
. exporting pre-schema procedural objects and actions
. exporting cluster definitions
EXP-00056: ORACLE error 24324 encountered
ORA-24324: service handle not initialized
EXP-00056: ORACLE error 24324 encountered
ORA-24324: service handle not initialized
EXP-00056: ORACLE error 24324 encountered
ORA-24324: service handle not initialized
EXP-00000: Export terminated unsuccessfully
Datapump as well as exp was tumbling down half way through ddl generation.
So the last step to do was regenerate these views :
-- re-install DataPump types and views
@?\rdbms\admin\catdph.sql
-- re-install tde_library packages
@?\rdbms\admin\prvtdtde.plb
-- re-install DataPump packages
@?\rdbms\admin\catdpb.sql
-- re-install DBMS DataPump objects
@?\rdbms\admin\dbmspump.sql
-- recompile invalid objects
@?\rdbms\admin\utlrp.sql
And finally we can do a full export and regenerate the database !
Final note - I also tested this starting on an 11gR2 (moving sys tables in 11gR2) - and it seems 11gR2, allthough bitching and moaning, does not crash after restart and allows you to rebuild the indexes.
ps - if you were thinking about using SKIP_UNUSABLE_INDEXES - that didn't work either on the dictionary indexes ;-)
 
Final note - it's not possible to disable normal btree indexes (alter index disable) - only function based indexes.