As you may know, Oracle released the first patchset on top of 11g Release 2. At the time of this writing, the patchset is out for 32bit and 64bit Linux, and for 32bit and 64bit Solaris on SPARC and Intel. What an interesting combination of platforms… I thought there was no Solaris 32bit on Intel anymore.
Upgrade
Oracle has come up with a fundamentally different approach to patching with this patchset. The long version can be found in MOS document 1189783.1 “Important Changes to Oracle Database Patch Sets Starting With 11.2.0.2”. The short version is that new patch sets will be supplied as full releases. This is really cool, and some people have asked why that wasn’t always the case. In 10g Release 2, to get to the latest version with all the patches, you had to
Applying the PSUs for Clusterware in particular was very labour intensive. In fact, for a fresh install it was usually easier to install and patch everything on only one node and then extend the patched software homes to the other nodes of the cluster.
Now in 11.2.0.2 things are different. You no longer have to apply any of the interim releases: the patch set contains everything you need, already on the correct version. The above process is shortened to:
Optionally, apply PSUs or other patches when they become available. Currently, MOS note 756671.1 doesn’t list any patch as recommended on top of 11.2.0.2.
Interestingly, upgrading from 11.2.0.1 to 11.2.0.2 is more painful than upgrading from Oracle 10g, at least on the Linux platform. Before you can run rootupgrade.sh, the script tests whether you have applied Grid Infrastructure PSU 11.2.0.1.2. OUI did not perform this test when it checked the prerequisites, which caught me off guard. The casual observer may now ask: why do I have to apply a PSU when the bug fixes should be rolled up into the patchset anyway? I honestly don’t have an answer, other than that if you are not on Linux you should be fine.
Grid Infrastructure will be an out-of-place upgrade, which means you have to manage your local disk space very carefully from now on. I would not use anything less than 50-75G on my Grid Infrastructure mount point. This takes the new cluster health monitor facility (see below) into account, as well as the fact that Oracle performs log rotation for most logs in $GRID_HOME/log.
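To make the sizing advice a bit more concrete, here is a minimal Python sketch that checks whether a mount point meets a minimum size. The function names and the 50 GiB default are my own illustration of the rule of thumb above, not anything Oracle ships, and the path in the example is a placeholder you need to replace with your own Grid Infrastructure mount point.

```python
import shutil

GIB = 1024 ** 3  # bytes per GiB


def meets_minimum(total_bytes, min_gib=50):
    """Return True if a filesystem of total_bytes is at least min_gib GiB."""
    return total_bytes >= min_gib * GIB


def check_mount_point(path, min_gib=50):
    """Return (ok, total_gib, free_gib) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return meets_minimum(usage.total, min_gib), usage.total / GIB, usage.free / GIB


if __name__ == "__main__":
    # "/" is only a placeholder; point this at the Grid Infrastructure mount point.
    ok, total_gib, free_gib = check_mount_point("/")
    print(f"large enough: {ok}, total {total_gib:.1f} GiB, free {free_gib:.1f} GiB")
```

Because the upgrade is out-of-place, the old and the new Grid home exist side by side for a while, so it is worth looking at the free space figure as well as the total.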
The RDBMS binaries can be patched either in-place or out-of-place. I’d say that the out-of-place upgrade for RDBMS binaries is wholeheartedly recommended as it makes backing out a change so much easier. As I said, you don’t have a choice for Grid Infrastructure which is always out-of-place.
And then there is the multicast issue Julian Dyke (http://juliandyke.wordpress.com/) has written about. I couldn’t reproduce the test case, and my lab and real-life clusters run with 11.2.0.2 happily.
Changes to Grid Infrastructure
After the successful upgrade you’d be surprised to find new resources in Grid Infrastructure. Have a look at these:
[grid@node1] $ crsctl stat res -t -init
-----------------------------------------------------------------
NAME           TARGET  STATE        SERVER        STATE_DETAILS
-----------------------------------------------------------------
Cluster Resources
-----------------------------------------------------------------
ora.asm
      1        ONLINE  ONLINE       node1         Started
ora.cluster_interconnect.haip
      1        ONLINE  ONLINE       node1
ora.crf
      1        ONLINE  ONLINE       node1
ora.crsd
      1        ONLINE  ONLINE       node1
ora.cssd
      1        ONLINE  ONLINE       node1
ora.cssdmonitor
      1        ONLINE  ONLINE       node1
ora.ctssd
      1        ONLINE  ONLINE       node1         OBSERVER
ora.diskmon
      1        ONLINE  ONLINE       node1
ora.drivers.acfs
      1        ONLINE  ONLINE       node1
ora.evmd
      1        ONLINE  ONLINE       node1
ora.gipcd
      1        ONLINE  ONLINE       node1
ora.gpnpd
      1        ONLINE  ONLINE       node1
ora.mdnsd
      1        ONLINE  ONLINE       node1
The cluster_interconnect.haip is yet another step towards the self contained system. The Grid Infrastructure installation guide for Linux states:
“With Redundant Interconnect Usage, you can identify multiple interfaces to use for the cluster private network, without the need of using bonding or other technologies. This functionality is available starting with Oracle Database 11g Release 2 (11.2.0.2).”
So, good news for anyone who is relying on third party software such as HP ServiceGuard for network bonding. Linux has always done this for you, even in the times of the 2.4 kernel, and Linux network bonding is actually quite simple to set up as well. But anyway, I’ll run a few tests in the lab with this new feature enabled when I have time, deliberately taking down NICs to see if it does what it says on the tin.
The documentation states that you don’t need to bond your NICs for the private interconnect. Simply leave the ethx devices (or whatever name your NICs have on your OS) as they are, and mark the ones you would like to use for the private interconnect as private during the installation. If you decide to add a NIC to the cluster for use with the private interconnect later, use oifcfg as root to add the new interface (or watch this space for a later blog post on this). Oracle states that if one of the private interconnects fails, it will transparently use another one. In addition to the high availability benefit, Oracle apparently also performs load balancing across the configured interconnects.
To learn more about the redundant interconnect feature I had a glance at its profile. As with any resource in the lower stack (or HA stack), you need to append the “-init” argument to crsctl.
[oracle@node1] $ crsctl stat res ora.cluster_interconnect.haip -p -init
NAME=ora.cluster_interconnect.haip
TYPE=ora.haip.type
ACL=owner:root:rw-,pgrp:oinstall:rw-,other::r--,user:grid:r-x
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=0
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
AUTO_START=always
CARDINALITY=1
CHECK_INTERVAL=30
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION="Resource type for a Highly Available network IP"
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
LOAD=1
LOGGING_LEVEL=1
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=balanced
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
SERVER_POOLS=
START_DEPENDENCIES=hard(ora.gpnpd,ora.cssd)pullup(ora.cssd)
START_TIMEOUT=60
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(ora.cssd)
STOP_TIMEOUT=0
UPTIME_THRESHOLD=1m
USR_ORA_AUTO=
USR_ORA_IF=
USR_ORA_IF_GROUP=cluster_interconnect
USR_ORA_IF_THRESHOLD=20
USR_ORA_NETMASK=
USR_ORA_SUBNET=
With this information at hand, we see that the resource is controlled through ORAROOTAGENT, and judging from the start sequence position and the fact that we queried crsctl with the “-init” flag, it must be OHASD’s ORAROOTAGENT.
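If you want to pull individual attributes out of a resource profile rather than read the whole dump, a few lines of Python are enough. This is only a sketch: it assumes the output of “crsctl stat res … -p” has been saved as text with one KEY=VALUE attribute per line, and parse_profile is a helper name I made up. The sample below is an abbreviated copy of the HAIP profile shown above.

```python
def parse_profile(text):
    """Turn KEY=VALUE lines of a resource profile into a dict."""
    profile = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not an attribute
        # Split on the first "=" only, values may contain "=" themselves.
        key, _, value = line.partition("=")
        profile[key] = value.strip('"')
    return profile


# Abbreviated sample taken from the profile output in this post.
SAMPLE = """\
NAME=ora.cluster_interconnect.haip
TYPE=ora.haip.type
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
AUTO_START=always
DESCRIPTION="Resource type for a Highly Available network IP"
START_DEPENDENCIES=hard(ora.gpnpd,ora.cssd)pullup(ora.cssd)
STOP_DEPENDENCIES=hard(ora.cssd)
USR_ORA_IF_GROUP=cluster_interconnect
"""

profile = parse_profile(SAMPLE)
# Reduce the agent path to the bare agent name.
agent = profile["AGENT_FILENAME"].rsplit("/", 1)[-1].split("%")[0]
print(agent)
print(profile["START_DEPENDENCIES"])
```

The same helper works for the other profiles in this post, since they all share the KEY=VALUE layout.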
Indeed, there are references to it in the $GRID_HOME/log/`hostname -s`/agent/ohasd/orarootagent_root/ directory. Further reference to the resource was found in cssd.log which makes perfect sense: it will use it for many things, last but not least fencing.
[ USRTHRD][1122056512] {0:0:2} HAIP: configured to use 1 interfaces
...
[ USRTHRD][1122056512] {0:0:2} HAIP: Updating member info HAIP1;192.168.52.0#0
[ USRTHRD][1122056512] {0:0:2} InitializeHaIps[ 0] infList 'inf bond1, ip 192.168.52.155, sub 192.168.52.0'
[ USRTHRD][1122056512] {0:0:2} HAIP: starting inf 'bond1', suggestedIp '169.254.79.209', assignedIp ''
[ USRTHRD][1122056512] {0:0:2} Thread:[NetHAWork]start {
[ USRTHRD][1122056512] {0:0:2} Thread:[NetHAWork]start }
[ USRTHRD][1089194304] {0:0:2} [NetHAWork] thread started
[ USRTHRD][1089194304] {0:0:2} Arp::sCreateSocket {
[ USRTHRD][1089194304] {0:0:2} Arp::sCreateSocket }
[ USRTHRD][1089194304] {0:0:2} Starting Probe for ip 169.254.79.209
[ USRTHRD][1089194304] {0:0:2} Transitioning to Probe State
[ USRTHRD][1089194304] {0:0:2} Arp::sProbe {
[ USRTHRD][1089194304] {0:0:2} Arp::sSend: sending type 1
[ USRTHRD][1089194304] {0:0:2} Arp::sProbe }
...
[ USRTHRD][1122056512] {0:0:2} Completed 1 HAIP assignment, start complete
[ USRTHRD][1122056512] {0:0:2} USING HAIP[ 0 ]: bond1 - 169.254.79.209
[ora.cluster_interconnect.haip][1117854016] {0:0:2} [start] clsn_agent::start }
[ AGFW][1117854016] {0:0:2} Command: start for resource: ora.cluster_interconnect.haip 1 1 completed with status: SUCCESS
[ AGFW][1119955264] {0:0:2} Agent sending reply for: RESOURCE_START[ora.cluster_interconnect.haip 1 1] ID 4098:343
[ AGFW][1119955264] {0:0:2} ora.cluster_interconnect.haip 1 1 state changed from: STARTING to: ONLINE
[ AGFW][1119955264] {0:0:2} Started implicit monitor for:ora.cluster_interconnect.haip 1 1
[ AGFW][1119955264] {0:0:2} Agent sending last reply for: RESOURCE_START[ora.cluster_interconnect.haip 1 1] ID 4098:343
OK, I now understand this a bit better. But the log information mentioned something else as well: an IP address that I haven’t assigned to the cluster. It turns out that this IP address is another virtual IP on the private interconnect, called bond1:1.
[grid]grid@node1 $ /sbin/ifconfig
bond1     Link encap:Ethernet  HWaddr 00:23:7D:3d:1E:77
          inet addr:192.168.52.155  Bcast:192.168.52.255  Mask:255.255.255.0
          inet6 addr: fe80::223:7dff:fe3c:1e74/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:33155040 errors:0 dropped:0 overruns:0 frame:0
          TX packets:20677269 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:21234994775 (19.7 GiB)  TX bytes:10988689751 (10.2 GiB)

bond1:1   Link encap:Ethernet  HWaddr 00:23:7D:3d:1E:77
          inet addr:169.254.79.209  Bcast:169.254.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
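The address Oracle picked comes from the 169.254.0.0/16 link-local range, which explains why it was never part of my own network plan. As a quick sanity check, the following Python sketch pulls the HAIP assignment out of an agent log line and asks the standard ipaddress module whether the address is link-local. The regular expression is my own guess based on the single “USING HAIP” line shown above; the log format is not documented, so treat it as an assumption.

```python
import ipaddress
import re

# Pattern derived from the one "USING HAIP" log line in this post (an assumption).
HAIP_LINE = re.compile(r"USING HAIP\[\s*(\d+)\s*\]:\s*(\S+)\s+-\s+(\S+)")


def haip_assignments(log_text):
    """Return (index, interface, address, is_link_local) for each HAIP line."""
    result = []
    for match in HAIP_LINE.finditer(log_text):
        index, interface, address = match.groups()
        ip = ipaddress.ip_address(address)
        result.append((int(index), interface, address, ip.is_link_local))
    return result


LOG = "[ USRTHRD][1122056512] {0:0:2} USING HAIP[ 0 ]: bond1 - 169.254.79.209"
print(haip_assignments(LOG))
# prints [(0, 'bond1', '169.254.79.209', True)]
```

Feeding the whole orarootagent log into haip_assignments should list one entry per configured private interface, provided the log lines really follow the pattern above.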
Ah, something running multicast. I tried to sniff that traffic but couldn’t make any sense of it. There is UDP (not TCP) multicast traffic on that interface. This can be checked with tcpdump:
[root@node1 ~]# tcpdump src 169.254.79.209 -i bond1:1 -c 10 -s 1514
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond1:1, link-type EN10MB (Ethernet), capture size 1514 bytes
14:30:18.704688 IP 169.254.79.209.55310 > 169.254.228.144.31112: UDP, length 252
14:30:18.704943 IP 169.254.79.209.55310 > 169.254.169.62.20057: UDP, length 252
14:30:18.705155 IP 169.254.79.209.55310 > 169.254.45.135.30040: UDP, length 252
14:30:18.895764 IP 169.254.79.209.51227 > 169.254.228.144.57323: UDP, length 192
14:30:18.895976 IP 169.254.79.209.51227 > 169.254.228.144.21319: UDP, length 296
14:30:18.897109 IP 169.254.79.209.48094 > 169.254.45.135.40464: UDP, length 192
14:30:18.897633 IP 169.254.79.209.48094 > 169.254.45.135.40464: UDP, length 192
14:30:18.897998 IP 169.254.79.209.48094 > 169.254.169.62.48215: UDP, length 192
14:30:18.902325 IP 169.254.79.209.51227 > 169.254.228.144.57323: UDP, length 192
14:30:18.902422 IP 169.254.79.209.51227 > 169.254.228.144.21319: UDP, length 296
10 packets captured
14 packets received by filter
0 packets dropped by kernel
If you are interested in the actual messages, use this command instead to capture a single packet:
[root@node1 ~]# tcpdump src 169.254.79.209 -i bond1:1 -c 1 -X -s 1514
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond1:1, link-type EN10MB (Ethernet), capture size 1514 bytes
14:31:43.396614 IP 169.254.79.209.58803 > 169.254.169.62.16178: UDP, length 192
0x0000: 4500 00dc 0000 4000 4011 ed04 a9fe 4fd1 E.....@.@.....O.
0x0010: a9fe a93e e5b3 3f32 00c8 4de6 0403 0201 ...>..?2..M.....
0x0020: e403 0000 0000 0000 4d52 4f4e 0003 0000 ........MRON....
0x0030: 0000 0000 4d4a 9c63 0000 0000 0000 0000 ....MJ.c........
0x0040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x0050: a9fe 4fd1 4d39 0000 0000 0000 0000 0000 ..O.M9..........
0x0060: e403 0000 0000 0000 0100 0000 0000 0000 ................
0x0070: 5800 0000 ff7f 0000 d0ff b42e 0f2b 0000 X............+..
0x0080: a01e 770d 0403 0201 0b00 0000 67f2 434c ..w.........g.CL
0x0090: 0000 0000 b1aa 0500 0000 0000 cf0f 3813 ..............8.
0x00a0: 0000 0000 0400 0000 0000 0000 a1aa 0500 ................
0x00b0: 0000 0000 0000 ae2a 644d 6026 0000 0000 .......*dM`&....
0x00c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0x00d0: 0000 0000 0000 0000 0000 0000 ............
1 packets captured
10 packets received by filter
0 packets dropped by kernel
Substitute the correct values for interface and source address, of course.
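The payload itself remains a mystery to me, but the first 28 bytes of the hex dump are plain IPv4 and UDP headers and can be decoded by hand. The Python sketch below does exactly that with the struct module, using the bytes from the capture above. It assumes an IPv4 header without options (20 bytes), which matches the 0x45 in the first byte; the dictionary keys are names I chose for illustration.

```python
import ipaddress
import struct

# First 28 bytes of the captured packet: 20 bytes IPv4 header + 8 bytes UDP header.
HEX = "4500 00dc 0000 4000 4011 ed04 a9fe 4fd1 a9fe a93e e5b3 3f32 00c8 4de6"
raw = bytes.fromhex(HEX.replace(" ", ""))


def decode_headers(packet):
    """Decode an option-less IPv4 header followed by a UDP header."""
    (ver_ihl, _tos, total_len, _ident, _frag, ttl, proto, _csum,
     src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    sport, dport, udp_len, _udp_csum = struct.unpack("!HHHH", packet[20:28])
    return {
        "version": ver_ihl >> 4,
        "header_bytes": (ver_ihl & 0x0F) * 4,
        "total_length": total_len,
        "ttl": ttl,
        "protocol": proto,  # 17 means UDP
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
        "sport": sport,
        "dport": dport,
        "udp_length": udp_len,  # UDP header (8 bytes) plus payload
    }


header = decode_headers(raw)
for key, value in header.items():
    print(f"{key:13} {value}")
```

The decoded ports and addresses agree with the summary line tcpdump printed (169.254.79.209.58803 > 169.254.169.62.16178), and the UDP length of 200 is the 192 byte payload plus the 8 byte UDP header.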
Oracle CRF resources
Another interesting new feature is the CRF resource, which seems to be an implementation of the IPD/OS Cluster Health Monitor on the servers. I need to dig a little deeper into this feature; currently I can’t get any configuration data from the cluster:
[grid@node1] $ oclumon showobjects
Following nodes are attached to the loggerd
[grid@node1] $
You will see some additional background processes now, namely ologgerd and osysmond.bin, which are started through the CRF resource. The resource profile (shown below) suggests that this resource is started through OHASD’s ORAROOTAGENT and can take custom logging levels.
[grid]grid@node1 $ crsctl stat res ora.crf -p -init
NAME=ora.crf
TYPE=ora.crf.type
ACL=owner:root:rw-,pgrp:oinstall:rw-,other::r--,user:grid:r-x
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
ACTIVE_PLACEMENT=0
AGENT_FILENAME=%CRS_HOME%/bin/orarootagent%CRS_EXE_SUFFIX%
AUTO_START=always
CARDINALITY=1
CHECK_ARGS=
CHECK_COMMAND=
CHECK_INTERVAL=30
CLEAN_ARGS=
CLEAN_COMMAND=
DAEMON_LOGGING_LEVELS=CRFMOND=0,CRFLDREP=0,...,CRFM=0
DAEMON_TRACING_LEVELS=CRFMOND=0,CRFLDREP=0,...,CRFM=0
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION="Resource type for Crf Agents"
DETACHED=true
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=3
FAILURE_THRESHOLD=5
HOSTING_MEMBERS=
LOAD=1
LOGGING_LEVEL=1
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
ORA_VERSION=11.2.0.2.0
PID_FILE=
PLACEMENT=balanced
PROCESS_TO_MONITOR=
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
SERVER_POOLS=
START_ARGS=
START_COMMAND=
START_DEPENDENCIES=hard(ora.gpnpd)
START_TIMEOUT=120
STATE_CHANGE_TEMPLATE=
STOP_ARGS=
STOP_COMMAND=
STOP_DEPENDENCIES=hard(shutdown:ora.gipcd)
STOP_TIMEOUT=120
UPTIME_THRESHOLD=1m
USR_ORA_ENV=
An investigation of orarootagent_root.log revealed that the rootagent indeed starts the CRF resource. This resource will start the ologgerd and osysmond processes, which then write their log files into $GRID_HOME/log/`hostname -s`/crf{logd,mond}.
Configuration of the daemons can be found in $GRID_HOME/ologgerd/init and $GRID_HOME/osysmond/init. Except for the PID file for the daemons there didn’t seem to be anything of value in the directory.
The command line of the ologgerd process shows its configuration options:
root 13984 1 0 Oct15 ? 00:04:00 /u01/crs/11.2.0.2/bin/ologgerd -M -d /u01/crs/11.2.0.2/crf/db/node1
The “-d” flag points to the directory where the process stores its logging information. The files are in Berkeley DB (BDB) format, which is now an Oracle product too. The oclumon tool should be able to read these files, but until I can persuade it to connect to the host there is no output.
CVU
Unlike the previous resources, the cvu resource is actually cluster aware. It’s the Cluster Verification Utility we all know from installing RAC. Going by the profile (shown below), I conclude that the utility is run through the grid software owner’s scriptagent and has exactly 1 incarnation on the cluster. It is only executed every 6 hours and restarted if it fails. If you would like to run a manual check, simply execute the action script with the command line argument “check”.
[root@node1 tmp]# crsctl stat res ora.cvu -p
NAME=ora.cvu
TYPE=ora.cvu.type
ACL=owner:grid:rwx,pgrp:oinstall:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=%CRS_HOME%/bin/cvures%CRS_SCRIPT_SUFFIX%
ACTIVE_PLACEMENT=1
AGENT_FILENAME=%CRS_HOME%/bin/scriptagent
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=21600
CHECK_RESULTS=
CHECK_TIMEOUT=600
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=Oracle CVU resource
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=balanced
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=600
SERVER_POOLS=*
START_DEPENDENCIES=hard(ora.net1.network)
START_TIMEOUT=0
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(ora.net1.network)
STOP_TIMEOUT=0
TYPE_VERSION=1.1
UPTIME_THRESHOLD=1h
USR_ORA_ENV=
VERSION=11.2.0.2.0
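The timing attributes in the profile are plain seconds, which is where the “every 6 hours” comes from. A tiny Python sketch makes the conversion explicit; it assumes that CHECK_INTERVAL, CHECK_TIMEOUT and SCRIPT_TIMEOUT are all expressed in seconds, and describe_seconds is just a helper name I picked.

```python
from datetime import timedelta


def describe_seconds(value):
    """Render a profile value given in seconds as h:mm:ss."""
    return str(timedelta(seconds=int(value)))


# Values copied from the ora.cvu profile above.
cvu_timing = {"CHECK_INTERVAL": "21600", "CHECK_TIMEOUT": "600", "SCRIPT_TIMEOUT": "600"}

for name, seconds in cvu_timing.items():
    print(f"{name:15} {describe_seconds(seconds)}")
```

So the check runs every 6:00:00, and each run is given 0:10:00 before it times out.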
The action script $GRID_HOME/bin/cvures implements the usual callbacks required by scriptagent: start(), stop(), check(), clean(), abort(). All log information goes into $GRID_HOME/log/`hostname -s`/cvu.
The actual check performed is this one: $GRID_HOME/bin/cluvfy comp health -_format & > /dev/null 2>&1
Summary
Enough for now; this has become a far longer post than I initially anticipated. There are so many more new things around that need exploring, like Quality of Service, which makes it very difficult to keep up.