Well, November came around quickly - when I started blogging I promised myself that I would at least try to share my Exadata findings once a month. So it is time to share my experience and findings from applying bundle patch 6 (BP6). If you are new to Exadata patching, you should hopefully find some important points below. The patching took place on a ¼ Rack (2 compute nodes). The high-level approach for applying BP6 is:
1. Apply patch 10114705 - Database Bundle Patch 6 for Exadata first
2. Then patch 10160220 - Overlay N-apply Patch for Database Bundle Patch 6
3. Finally, apply patch 10110978 (EXADATA CRS MLR4 ON TOP OF 11.2.0.1.2 FOR BUGS 9799693 9820320 9907089 10040109) on top
I used the non-rolling patching approach as described in the readme files.
Patch 10114705 – 11.2.0.1 DB Machine Bundle Patch 6 MLR1
The patch was applied successfully to both the RDBMS and Grid Infrastructure (GI) homes. As the readme instructions will tell you, we need to unlock the cluster before applying the patch in the GI home, and as the last step, after successful installation of the patch, we need to lock Oracle Clusterware again on each node. However, as you can see below, I was not able to lock the cluster successfully:
As root #> $GI_HOME/crs/install/rootcrs.pl -patch
2010-11-02 15:43:53: Parsing the host name
2010-11-02 15:43:53: Checking for super user privileges
2010-11-02 15:43:53: User has super user privileges
Using configuration parameter file: ./crsconfig_params
CRS-4123: Oracle High Availability Services has been started.
Timed out waiting for the CRS stack to start.
After looking into the log files under $GI_HOME/log/hostname, it was evident that diskmon.bin had died and most of the Clusterware processes/binaries were offline. We were not able to lock or start the cluster, and it was left in a bad state. As I did not apply the patch in local mode, the patch binaries were propagated and compiled on node2 as well.
After I created an SR, it was picked up by support in Melbourne, and it was a pleasure working with someone who had in-depth expertise in Oracle Clusterware and Exadata.
We managed to fix the issue after identifying that the diskmon.bin binary shipped with BP6 was the culprit. We removed the diskmon.bin file from the GI home and replaced it with the previous binary from bundle patch 5, which I still had in the staging area.
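The swap described above can be sketched as a small shell helper that keeps a timestamped backup of the binary before overwriting it. This is only an illustration: swap_binary is a hypothetical function name, and because the exact location of diskmon.bin in the GI home and in the BP5 staging area is not given here, both paths are passed in as arguments. It also assumes the Clusterware stack is already down and that you run it as the owner of the file.

```shell
# Hypothetical helper (not part of any Oracle tooling): replace a binary with
# a known-good copy and keep a timestamped backup of the one being replaced.
swap_binary() {
    local target="$1"        # binary to replace, e.g. the BP6 diskmon.bin
    local replacement="$2"   # known-good copy, e.g. the BP5 diskmon.bin
    local backup="${target}.bak.$(date +%Y%m%d%H%M%S)"

    [ -f "$target" ] || { echo "target not found: $target" >&2; return 1; }
    [ -f "$replacement" ] || { echo "replacement not found: $replacement" >&2; return 1; }

    # -p preserves mode and timestamps of the copied files
    cp -p "$target" "$backup" || return 1
    cp -p "$replacement" "$target" || return 1

    # Print the backup path so the caller can restore it later if needed
    echo "$backup"
}
```

You would call it as swap_binary <path to diskmon.bin in the GI home> <path to the BP5 copy>, and keep the printed backup path in case support asks for the original binary.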
For reasons unknown to me, Oracle is shipping diskmon.bin with bundle patch 6, and we have not been able to find out why. According to the BP6 information/reference, it contains no diskmon fixes.
After fixing the diskmon.bin issue described above, we were able to complete the patching of both the RDBMS and GI homes for 10114705, 10160220 and 10110978.
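Once the patching is complete, it is worth confirming that all three patches really are registered in each home. A minimal sketch, assuming you pipe the output of opatch lsinventory into it: missing_patches is a hypothetical helper that simply looks for each patch number as a whole word in the text it reads, so it makes no assumption about the exact inventory layout.

```shell
# Hypothetical helper: read inventory text on stdin and print every patch
# number given as an argument that does not appear in it.
# Returns 0 if all patches were found, 1 if at least one is missing.
missing_patches() {
    local inventory patch rc=0
    inventory="$(cat)"
    for patch in "$@"; do
        # -w matches the number only as a whole word, -q suppresses output
        if ! printf '%s\n' "$inventory" | grep -qw -- "$patch"; then
            echo "$patch"
            rc=1
        fi
    done
    return "$rc"
}
```

Run from each Oracle home in turn, for example: opatch lsinventory | missing_patches 10114705 10160220 10110978. No output means all three numbers were found.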
It is worth mentioning that the Clusterware patch 10110978 (EXADATA CRS MLR4) can be applied in an automated manner using the opatch auto option. The auto option saves a lot of the patching work, as it stops all the needed resources and services, applies the patch to both the RDBMS and GI homes, propagates the patches to the other compute nodes and then starts everything up again.
Be aware that when using the auto option, you have to unzip the patch file into a directory where you have not unzipped other patches; otherwise opatch auto will try to apply those patches as well. I’m not sure whether this is a known issue yet. To give an example: if you have a /u01/app/oracle/stage directory where you normally uncompress patches, you will end up with the patch number as a subdirectory under the stage directory. To avoid the issue described above, do the following instead:
Oracle$> cd /u01/app/oracle/stage
Oracle$> mkdir auto_patch
Oracle$> cd auto_patch
Oracle$> unzip /path_to_zipped_10110978_patch/p10110978_112010_Linux-x86-64.zip
Oracle$> opatch auto
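If you want a guard against accidentally running opatch auto from a directory that holds more than one unzipped patch, something along these lines could be run first. It is a sketch under one assumption taken from the behaviour described above: each unzipped patch shows up as a subdirectory whose name is the patch number, so purely numeric directory names are counted. Both function names are hypothetical.

```shell
# Hypothetical pre-check: count subdirectories whose names consist only of
# digits, which is how unzipped patches appear in the staging directory.
count_patch_dirs() {
    local dir="$1"
    local n=0 entry
    for entry in "$dir"/*/; do
        [ -d "$entry" ] || continue          # glob matched nothing
        case "$(basename "$entry")" in
            *[!0-9]*) ;;                     # contains a non-digit: ignore
            *) n=$((n + 1)) ;;               # all digits: looks like a patch
        esac
    done
    echo "$n"
}

# Succeeds only when exactly one patch directory is present.
safe_for_opatch_auto() {
    [ "$(count_patch_dirs "$1")" -eq 1 ]
}
```

For example, safe_for_opatch_auto /u01/app/oracle/stage/auto_patch && opatch auto would only start the patching when auto_patch contains a single patch directory.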
I have read a few blog posts whose authors mention that they applied bundle patch 6 without issues, so it is not certain that you will run into the diskmon binary issue. I just wanted to share my experience and findings with BP6, and I hope it will help you find a quick resolution if you happen to run into the same issue.
If you don’t have an Exadata test or standby database on which to validate the patching process first, you can use the opatch -local option for the first patch (10114705), so that it does not propagate to the other compute nodes. If you run into any issues, you will obviously not apply the post-upgrade steps, and it will hopefully give you the option of keeping the other compute node(s) available while you work on fixing the issue.
You may wonder why I have not advised you to roll back BP6 if you run into patching issues. That is because if you choose to roll back BP6 and, like me, have previously applied BP4 and then BP5 on top of the install base, the rollback will take you all the way back to the product image your Exadata database machine was delivered with, and I would not advise you to go into production on that version. Bundle patch 6 is a superset of your previous bundle patches, so once you commit to applying the bundle patches on Exadata, you really want to go all the way.