One of the promises from Oracle for OEM 12c was improved support for Oracle RAC One Node. I have spent quite a bit of time researching RON, and wrote a little article in 2 parts about it which you can find here:
One of my complaints with it was the limited support in OEM 11.1. At the time I was on a major consolidation project, which would have used OEM for management of the database.
Unfortunately OEM 11.1 didn’t have support for RAC One Node. Why? RON is a cluster database running on just one node. The interesting bit is that the ORACLE_SID is your normal ORACLE_SID with an underscore and a number appended. Under normal circumstances that suffix is _1, giving RON_1 in my case. But as soon as you relocate the database using srvctl relocate database, a second instance, RON_2, is started and runs alongside the first until all sessions have failed over.
OEM obviously doesn’t know about RON_2: it was never discovered. Furthermore, the strict mapping of instance name to host is no longer true (the same applies for policy managed databases by the way!). A few weeks and a few switchover operations later you could be running RON_2 on racnode1.
As a consequence, the poor on-call DBA is paged about a database that has supposedly gone down when it hasn’t: it’s up and running. As a DBA, I wouldn’t want that. After discussions, Oracle promised to fix that problem, but the fix didn’t make it into 11.1, hence this blog post about 12.1.
OEM 12.1 – the test
I wanted to see if that has been corrected with OEM 12.1. In order to test, I created an 18.104.22.168.0 database RON on a cluster consisting of 3 nodes, acfsprdnode1-3.
[oracle@acfsprdnode1 ~]$ srvctl config database -d RON
Database unique name: RON
Database name: RON
Oracle home: /u01/app/oracle/product/22.214.171.124
Oracle user: oracle
Spfile: +DATA/RON/spfileRON.ora
Domain:
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: RON
Database instances:
Disk Groups: DATA,FRA
Mount point paths:
Services: RON_SRV
Type: RACOneNode
Online relocation timeout: 30
Instance name prefix: RON
Candidate servers: acfsprdnode1,acfsprdnode2
Database is administrator managed
Currently the database is on node1:
[oracle@acfsprdnode1 ~]$ srvctl status database -d RON
Instance RON_1 is running on node acfsprdnode1
Online relocation: INACTIVE
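Since the instance name no longer maps to a fixed host, any home-grown check has to read the current placement from Clusterware rather than assume it. Here is a minimal sketch of that, assuming the status line keeps the "Instance ... is running on node ..." wording shown above. The function name parse_ron_status is my own invention and not part of any Oracle tooling.

```shell
#!/usr/bin/env bash
# parse_ron_status (hypothetical helper): reads `srvctl status database`
# output on stdin and prints "<instance> <node>" for each running instance.
# Assumption: lines look like "Instance RON_1 is running on node acfsprdnode1".
parse_ron_status() {
  sed -n 's/^Instance \([^ ]*\) is running on node \([^ ]*\).*$/\1 \2/p'
}

# Demonstration with captured output instead of a live cluster.
sample='Instance RON_1 is running on node acfsprdnode1
Online relocation: INACTIVE'
printf '%s\n' "$sample" | parse_ron_status   # prints: RON_1 acfsprdnode1
```

On a live cluster the same function would be fed directly, for example srvctl status database -d RON | parse_ron_status.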
This is also reflected in OEM, as shown in this screenshot:
Now what happens if I relocate?
[oracle@acfsprdnode1 ~]$ srvctl relocate database -d RON -n acfsprdnode2 -w 1 -v
Configuration updated to two instances
Instance RON_2 started
Services relocated
Waiting for 1 minutes for instance RON_1 to stop.....
Instance RON_1 stopped
Configuration updated to one instance
[oracle@acfsprdnode1 ~]$ srvctl status database -d RON
Instance RON_2 is running on node acfsprdnode2
Online relocation: INACTIVE
[oracle@acfsprdnode1 ~]$
The relocate completes successfully, and there is only 1 instance active: RON_2 on acfsprdnode2.
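This is exactly the kind of change a monitoring tool has to notice: same database, different instance name, different node. As a rough sketch of the idea (not how OEM does it internally, which I don't know), the following compares the current status with a previously recorded instance/node pair. The function name check_mapping and the idea of keeping the last known pair somewhere are assumptions of mine.

```shell
#!/usr/bin/env bash
# check_mapping LAST_INSTANCE LAST_NODE (hypothetical helper)
# Reads `srvctl status database` output on stdin and compares the running
# instance/node with the pair recorded earlier by the DBA.
# Assumption: status lines look like "Instance X is running on node Y".
check_mapping() {
  local last_instance=$1 last_node=$2 current
  current=$(sed -n 's/^Instance \([^ ]*\) is running on node \([^ ]*\).*$/\1 \2/p' | head -n 1)
  if [ -z "$current" ]; then
    echo "DOWN: no running instance reported"
    return 2
  elif [ "$current" = "$last_instance $last_node" ]; then
    echo "OK: $current"
  else
    echo "MOVED: now '$current', recorded '$last_instance $last_node'"
    return 1
  fi
}

# Demonstration: RON_1 on acfsprdnode1 was recorded before the relocation.
printf 'Instance RON_2 is running on node acfsprdnode2\nOnline relocation: INACTIVE\n' \
  | check_mapping RON_1 acfsprdnode1 || echo "exit status: $?"
```

The non-zero return codes let a caller tell "moved" (1) apart from "really down" (2), which is the distinction the on-call DBA cares about.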
What about OEM? Well, to start with, the database page hadn’t been updated. However, after navigating back to the home screen and then to the RON database again, I saw this:
This looks good so far. I still need to investigate whether emails are being sent out, which is for another blog post.
But what’s really nice is that on the database main page (which I set as my default) the database RON is shown as “UP”. Also, when I click on RON, I am sent directly to RON_2. I think that’s it!
It seems I have been too happy too soon! A little more testing revealed that OEM doesn’t keep up in certain situations. After a server bounce and a little jiggery pokery I have ended up with this situation for the database:
[oracle@acfsprdnode1 ~]$ srvctl status database -d RON
Instance RON_2 is running on node acfsprdnode1
Online relocation: INACTIVE
As you can see, RON_2 is now on acfsprdnode1, which hasn’t been the case before. OEM 12.1 doesn’t reflect this properly: the screenshot below shows both instances as down, although the activity section reports RON_2 to be up :)
Note the Summary in the upper left corner as well as the performance information in the top right. OK, all is not lost. Remember that we had a mapping of RON_2 to acfsprdnode2. So if I “assist” Clusterware in choosing the correct node, maybe I can get it back?
[oracle@acfsprdnode1 ~]$ srvctl start database -d RON -n acfsprdnode2
srvctl status database -d RON
[oracle@acfsprdnode1 ~]$ srvctl status database -d RON
Instance RON_2 is running on node acfsprdnode2
Online relocation: INACTIVE
[oracle@acfsprdnode1 ~]$
By manually specifying the node to start on, it seems to work: OEM shows the instance as “up”. The ORACLE_SID seems to revert to the one that was last used (RON_2 in this example). By specifying the node that instance is associated with, one can get the monitoring back in OEM. There is some more work to be done (or someone has to post a comment and tell me that I didn’t do it correctly!).
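The workaround above could be scripted along these lines. This is only a sketch under my own assumptions: the instance-to-node pairs are kept in a simple two-column list maintained by the DBA, and the helper (start_cmd_for, a made-up name) only prints the srvctl command instead of running it, so it can be reviewed first.

```shell
#!/usr/bin/env bash
# start_cmd_for DB INSTANCE (hypothetical helper)
# Reads "<instance> <node>" pairs on stdin (a mapping kept by the DBA, not
# something OEM or Clusterware provides in this form) and prints the
# srvctl command that would start DB on the node recorded for INSTANCE.
start_cmd_for() {
  local db=$1 instance=$2 node
  node=$(awk -v inst="$instance" '$1 == inst { print $2; exit }')
  if [ -z "$node" ]; then
    echo "no recorded node for $instance" >&2
    return 1
  fi
  echo "srvctl start database -d $db -n $node"
}

# Demonstration with the mapping from this post.
printf 'RON_1 acfsprdnode1\nRON_2 acfsprdnode2\n' | start_cmd_for RON RON_2
# prints: srvctl start database -d RON -n acfsprdnode2
```

Printing rather than executing is deliberate: starting a database on a specific node is not something I would want a helper to do unattended.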
OEM 12.1 is a step in the right direction, and it has picked up the new instance running on the second cluster node. It seems to fail, however, to pick up