In my previous posts about the first RAC Grid Infrastructure Patchset I document a few issues I encountered that were worth noting. But where things work as advertised I am more than happy to document it too. In a way, the January 2014 GI PSU works as you’d hope it would (at least in my lab for my 2 node cluster). Well-almost: if you have a non 12.1 database in your environment you might encounter this.
UPDATE: You might want review some additional information with regards to datapatch.
Admittedly it’s taken from an Oracle Restart (i.e. non cluster) environment but I can’t see this not happening in RAC:
[root@server1 stage]# opatchauto apply /u01/stage/188.8.131.52.2/17735306 -ocmrf /u01/stage/ocm.rsp OPatch Automation Tool Copyright (c) 2013, Oracle Corporation. All rights reserved. OPatchauto version : 184.108.40.206.2 OUI version : 220.127.116.11.0 Running from : /u01/app/grid/product/18.104.22.168/grid opatchauto log file: /u01/app/grid/product/22.214.171.124/grid/cfgtoollogs/opatchauto/17735306/ - opatch_gi_2014-02-17_20-04-54_deploy.log Parameter Validation: Successful System Configuration Collection failed: oracle.osysmodel.driver.crs.productdriver.ProductDriverException: PRCD-1027 : Failed to retrieve database ora11 PRCD-1229 : An attempt to access configuration of database ora11 was rejected because its version 126.96.36.199.0 differs from the program version 188.8.131.52.0. Instead run the program from /u01/app/oracle/product/184.108.40.206. opatchauto failed with error code 2. [root@server1 stage]#
I have a 2 node RAC cluster that hasn’t previously been patched. It uses GNS for name resolution and is probably one of the more complex setups. So I was keen to test the new GI Patch Set Update. Note that a PSU will roll back previous patches, it’s not a delta as you might know from RPM.
The cluster installation using GNS was slightly more challenging than the usual RAC system so I decided not to try separation of duties. In other words, the GRID and RDBMS owner are identical: oracle.
As always you have to update OPatch before you can apply the latest patch. I did this in all 4 homes: 2 RDBMS and 2 Grid homes. Download and stage the patch to a location you like by unzipping it.
Next you need the ocm response file. I simply copied it from the other cluster.
I started the installation of the patch on node 1 of my cluster in a screen session as always. Node 1 was responsible for the GNS resource at the time. Following the instructions and some previous experience I decided to let opatchauto do all of the work. And this time Oracle even included the “apply” keyword in the docs! And it didn’t bail out immediately with obscure error messages either…
[root@rac12node1 temp]# opatchauto apply /u01/temp/17735306 -ocmrf /u01/temp/ocm.rsp OPatch Automation Tool Copyright (c) 2013, Oracle Corporation. All rights reserved. OPatchauto version : 220.127.116.11.2 OUI version : 18.104.22.168.0 Running from : /u01/app/22.214.171.124/grid opatchauto log file: /u01/app/126.96.36.199/grid/cfgtoollogs/opatchauto/17735306/opatch_gi_2014-01-15_08-54-25_deploy.log Parameter Validation: Successful Grid Infrastructure home: /u01/app/188.8.131.52/grid RAC home(s): /u01/app/oracle/product/184.108.40.206/dbhome_1 Configuration Validation: Successful Patch Location: /u01/temp/17735306 Grid Infrastructure Patch(es): 17077442 17303297 17552800 RAC Patch(es): 17077442 17552800 Patch Validation: Successful Stopping RAC (/u01/app/oracle/product/220.127.116.11/dbhome_1) ... Successful Following database(s) were stopped and will be restarted later during the session: rac12c Applying patch(es) to "/u01/app/oracle/product/18.104.22.168/dbhome_1" ... Patch "/u01/temp/17735306/17077442" successfully applied to "/u01/app/oracle/product/22.214.171.124/dbhome_1". Patch "/u01/temp/17735306/17552800" successfully applied to "/u01/app/oracle/product/126.96.36.199/dbhome_1". Stopping CRS ... Successful Applying patch(es) to "/u01/app/188.8.131.52/grid" ... Patch "/u01/temp/17735306/17077442" successfully applied to "/u01/app/184.108.40.206/grid". Patch "/u01/temp/17735306/17303297" successfully applied to "/u01/app/220.127.116.11/grid". Patch "/u01/temp/17735306/17552800" successfully applied to "/u01/app/18.104.22.168/grid". Starting CRS ... Successful Starting RAC (/u01/app/oracle/product/22.214.171.124/dbhome_1) ... Successful SQL changes, if any, are applied successfully on the following database(s): RAC12C Apply Summary: Following patch(es) are successfully installed: GI Home: /u01/app/126.96.36.199/grid: 17077442, 17303297, 17552800 RAC Home: /u01/app/oracle/product/188.8.131.52/dbhome_1: 17077442, 17552800 opatchauto succeeded.
Wow, I could hardly believe my eyes. I “tail”d the logfile in a different session to see what it was doing but the output seems less verbose these days than initial when “opatch auto” came out. See further down in the article about useful log locations.
A few remarks. It took quite a while for the patch to advance past “Starting CRS…” I remember staring at the screen anxiously in previous patch exercises and witnessed the CRS start sequence timing out. In this case there is a valid reason for the slowness, and it’s documented in $GRID_HOME/cfgtoollogs/crsconfig/crspatch*.log: it validates ACFS and applies patches to the -MGMTDB database. In fact you should open a third session to tail the crspatch log file to see what is happening as it provides a wealth of information about stopping and starting Clusterware resources.
With GI PSU 184.108.40.206.1 I ran into problems with CRS which got confused about who is mastering the OCR. With 220.127.116.11.2 this didn’t happen. CRS started successfully on the last node, and even started the database instance.
Information about what’s going on can be found in many locations…
More log information can be found in:
Note these logs are not written to in that sequence. Start with the opatchauto logs, then refer to the more specific log locations for individual troubleshooting. The complete log file names are referenced at the end of the opatchauto log.
The post installation steps require you to load SQL scripts into the database-it seems to have done that, as shown in the opatchauto log file:
2014-01-15_09-32-13 : Ignoring the dbconsole command. 2014-01-15_09-32-17 : Executing command: /bin/bash -c 'ORACLE_HOME=/u01/app/oracle/product/18.104.22.168/dbhome_1 \ ORACLE_SID=$(/u01/app/oracle/product/22.214.171.124/dbhome_1/bin/srvctl status instance -d RAC12C -n rac12node1 | cut -d " " -f 2) \ /u01/app/oracle/product/126.96.36.199/dbhome_1/OPatch/datapatch' 2014-01-15_09-33-45 : Successfully executed the above command. SQL changes, if any, are applied successfully on the following database(s): RAC12C
Except that I could not find a log file in $ORACLE_HOME/sqlpatch/17552800 directory in the RDBMS home on the first node. The -MGMTDB has been patched, the log is in the $GRID_HOME. The registry$history view didn’t show any sign of a patch either. Only after applying the PSU on the last node did Oracle patch the database.
It appears that PSU 1 and 2 have issues when running datapatch. I found this in the output of my patch log:
Executing command: /bin/bash -c 'ORACLE_HOME=/u01/app/oracle/product/188.8.131.52/dbhome_2 ORACLE_SID=$(/u01/app/oracle/product/184.108.40.206/dbhome_2/bin/srvctl sta tus instance -d RCDB1 -n rac12node3 | cut -d " " -f 2) /u01/app/oracle/product/220.127.116.11/dbhome_2/OPatch/datapatch' 2014-03-21_11-25-36 : COMMAND EXECUTION FAILURE : SQL Patching tool version 18.104.22.168.0 on Fri Mar 21 11:23:26 2014 Copyright (c) 2013, Oracle. All rights reserved. Connecting to database...OK Determining current state...done For the following PDBs: CDB$ROOT Nothing to roll back The following patches will be applied: 17552800 For the following PDBs: PDB$SEED Nothing to roll back The following patches will be applied: 17552800 For the following PDBs: ACDEMOPDB Nothing to roll back The following patches will be applied: 17552800 For the following PDBs: DUBLIN Nothing to roll back The following patches will be applied: 17552800 Adding patches to installation queue...done Installing patches...done Validating logfiles... ERROR: mkdir: cannot create directory `/u01/app/grid/cfgtoollogs': Permission denied mkdir: cannot create directory `/u01/app/grid/cfgtoollogs': Permission denied mkdir: cannot create directory `/u01/app/grid/cfgtoollogs': Permission denied mkdir: cannot create directory `/u01/app/grid/cfgtoollogs': Permission denied DBD::Oracle::st execute failed: ORA-03113: end-of-file on communication channel Process ID: 20551 Session ID: 21 Serial number: 9 (DBD ERROR: OCIStmtExecute) [for Statement "ALTER PLUGGABLE DATABASE pdb$seed CLOSE IMMEDIATE INSTANCES=ALL"] at /u01/app/oracle/product/22.214.171.124/dbhome_2/sqlpatch/sqlpatch.pm line 448,
The corresponding bug number is 17421502, and a patch exists. I downloaded the patch and applied it on all of my nodes (rolling!). After everything has come up, I re-ran datapatch and this time it seemed to have worked without an “end-of-file on communication channel”. See MOS Datapatch may fail while patching a RAC+PDB environment (Doc ID 1632818.1) for more information.
Note that you have to set your ORACLE_SID to the database to be patched for the above command to work.
Applying the January PSU to my cluster worked flawlessly, definitely a huge improvement over the last PSU which was a bit of a disaster.