Search

Top 60 Oracle Blogs

Recent comments

linux

Installing Oracle Enterprise Manager 12c on OL 5.7

I have been closely involved in the upgrade discussion of my current customer’s Enterprise Managers setup from an engineering point of view. The client uses OEM extensively for monitoring, alerts generated by it are automatically forwarded to an IBM product called Netcool.

Now some of the management servers are still on 10.2.0.5 in certain regions, and for a private cloud project I was involved in an 11.1 system was needed.The big question was: wait for 12.1 or upgrade to 11.1?

Adding another node for RAC 11.2.0.3 on Oracle Linux 6.1 with kernel-UEK

As I have hinted at during my last post about installing Oracle 11.2.0.3 on Oracle Linux 6.1 with Kernel UEK, I have planned another article about adding a node to a cluster.

I deliberately started the installation of my RAC system with only one node to allow my moderately spec’d hardware to deal with a second cluster node. In previous versions of Oracle there was a problem with node additions: the $GRID_HOME/oui/bin/addNode.sh script did pre-requisite checks that used to fail when you had used ASMLib. Unfortuntely, due to my setup I couldn’t test if that was solved (I didn’t use ASMLib).

Cluvfy

Oracle Database 11gR2 on OL6 / RHEL6: Certified or Not?

There seems to be a little confusion out there about the certification status of Oracle Database 11gR2, especially with the release of the 11.2.0.3 patchset which fixes all the issues associated with RAC installs on OL/RHEL 6.1.

Currently, 11gR2 is *NOT* certified on OL6 or RHEL6. How do I know? My Oracle Support says so! Check for yourself like this:

  • Log on the My Oracle Support (support.oracle.com).
  • Click the “Certifications” link.
  • Type in the product name, like “Oracle Database”
  • Select the product version number, like “11.2.0.3.0″.
  • Select the platform, like “Linux x86_64″ or a specific distro beneath this.
  • Click the “Search” button.

From the results you will see that Oracle Database 11.2.0.3 is certified on OL and RHEL 5.x. Oracle do not differentiate between different respins of the major version. You will also notice that it is not currently supported on OL6 or RHEL6.

Having said that, we can expect this certification really soon. Why? Because Red Hat has submitted all the certification information to Oracle and (based on previous certifications) expects it to happen some time in Q4 this year, which is any time between now and the end of the year.

With a bit of luck, by the time I submit this post MOS certification will get updated and I will happily be out of date… :)

Cheers

Tim…




Oracle 11gR2 RAC on Oracle Linux 6.1 using VirtualBox…

It’s all a bit last minute, but today I decided to do an Oracle 11gR2 (11.2.0.3) RAC installation on Oracle Linux 6.1 using VirtualBox. The 11.2.0.3 patch has fixed all of the installation issues related to RAC on OL6.1, so it was pretty smooth. The procedure can be seen here.

As noted in the article, the screen shots of the GI and DB installers are from an 11.2.0.1 article. I’ll update these screen shots when I get back from OpenWorld. Like I said, it was all a bit last minute. :) Normally I wouldn’t put an article like this live (and you can see it’s not on the homepage yet), but I get lots of questions about this subject, so I thought I would make it available to make my life easier.

Cheers

Tim..

PS. There is no suitable oracle-validated package available for this at the moment, so the prerequisites have to be done manually.




Ksplice in action

On July 21, 2011 Oracle announced that it has aquired Ksplice. With Ksplice users can update the Linux kernel while it is running, so without a reboot or any other disruption. As of September 15, 2011 Ksplice is available, at no additional charge, to new and existing Oracle PremierSupport customers on the Unbreakable Linux Network […]

Installing Grid Infrastructure 11.2.0.3 on Oracle Linux 6.1 with kernel UEK

Installing Grid Infrastructure 11.2.0.3 on Oracle Linux 6.1

Yesterday was the big day, or the day Oracle release 11.2.0.3 for Linux x86 and x86-64. Time to download and experiment! The following assumes you have already configured RAC 11g Release 2 before, it’s not a step by step guide how to do this. I expect those to shoot out of the grass like mushrooms in the next few days, especially since the weekend allows people to do the same I did!

The Operating System

I have prepared a xen domU for 11.2.0.3, using the latest Oracle Linux 6.1 build I could find. In summary, I am using the following settings:

  • Oracle Linux 6.1 64-bit
  • Oracle Linux Server-uek (2.6.32-100.34.1.el6uek.x86_64)
  • Initially installed to use the “database server” package group
  • 3 NICs – 2 for the HAIP resource and the private interconnect with IP addresses in the ranges of 192.168.100.0/24 and 192.168.101.0/24. The public NIC is on 192.168.99.0/24
    • Node 1 uses 192.168.(99|100|101).129 for eth0, eth1 and eth2. The VIP uses 192.168.99.130
    • Node 1 uses 192.168.(99|100|101).131 for eth0, eth1 and eth2. The VIP uses 192.168.99.132
    • The SCAN is on 192.168.99.(133|134|135)
    • All naming resolution is done via my dom0 bind9 server
  • I am using a 8GB virtual disk for the operating system, and a 20G LUN for the oracle Grid and RDBMS homes. The 20G are subdivided into 2 LVMs of 10G each mounted to /u01/app/oracle and /u01/crs/11.2.0.3. Note you now seem to need 7.5 G for GRID_HOME
  • All software is owned by Oracle
  • Shared storage is provided by the xen blktap driver
    • 3 x 1G LUNs for +OCR containing OCR and voting disks
    • 1 x 10G for +DATA
    • 1 x 10G for +RECO

Configuring Oracle Linux 6.1

Installation of the operating environment is beyond the scope of this article, and it hasn’t really changed much since 5.x. All I did was to install the database server package group. I wrote this article for fans of xen-based para-virtualisation. Although initially for 6.0, it applies equally for 6.1. Here’s the xen native domU description (you can easily convert that to xenstore format using libvirt):

# cat node1.cfg
name="rac11_2_0_3_ol61_node1"
memory=4096
maxmem=8192
vcpus=4
on_poweroff="destroy"
on_reboot="restart"
on_crash="destroy"
localtime=0
builder="linux"
bootargs=""
extra=" "
disk=[
'file:/var/lib/xen/images/rac11_2_0_3_ol61_node1/disk0,xvda,w',
'file:/var/lib/xen/images/rac11_2_0_3_ol61_node1/oracle,xvdb,w',
'file:/var/lib/xen/images/rac11_2_0_3_ol61_shared/ocr1,xvdc,w!',
'file:/var/lib/xen/images/rac11_2_0_3_ol61_shared/ocr2,xvdd,w!',
'file:/var/lib/xen/images/rac11_2_0_3_ol61_shared/ocr3,xvde,w!',
'file:/var/lib/xen/images/rac11_2_0_3_ol61_shared/data1,xvdf,w!',
'file:/var/lib/xen/images/rac11_2_0_3_ol61_shared/fra1,xvdg,w!'
]
vif=[
'mac=00:16:1e:2b:1d:ef,bridge=br1',
'mac=00:16:1e:2b:1a:e1,bridge=br2',
'mac=00:16:1e:2a:1d:1f,bridge=br3',
]
bootloader = "pygrub"

Use the “xm create node1.cfg” command to start the domU. After the OS was ready I installed the following additional software to satisfy the installation requirements:

  • compat-libcap1
  • compat-libstdc++-33
  • libstdc++-devel
  • gcc-c++
  • ksh
  • libaio-devel

This is easiest done via yum and the public YUM server Oracle provides. It also has instructions on how to set your repository up.

# yum install compat-libcap1 compat-libstdc++-33 libstdc++-devel gcc-c++ ksh libaio-devel

On the first node only I wanted a VNC-like interface for a graphical installation. The older package vnc-server I loved from 5.x isn’t available anymore, the package you need is now called tigervnc-server. It also requires a new viewer to be downloaded from sourceforge. On the first node you might want to install these, unless you are brave enough to use a silent installation:

  • xorg-x11-utils
  • xorg-x11-server-utils
  • twm
  • tigervnc-server
  • xterm

Ensure that SELinux and the IPTables packages are turned off. SELinux is still configured in /etc/sysconfig/selinux, where the setting has to be permissive at least. You can use “chkconfig iptables off” to disable the firewall service at boot. Check that there are no filter rules using “iptables -L”.

I created the oracle account using these usual steps-this hasn’t change since 11.2.0.2.

A few changes to /etc/sysctl.were needed; you can copy and paste the below example and append it to your existing settings. Ensure to up the limits where you have more resources!

kernel.shmall = 4294967296
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_max = 1048576
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.ipv4.conf.eth1.rp_filter = 0
net.ipv4.conf.eth2.rp_filter = 0

Also ensure that you change the rp_filter for your private interconnect to 0 (or 2)-my devices are eth1 and eth2. This is a new requirement for reverse path filtering introduced with 11.2.0.3.

ASM “disks” must be owned by the GRID owner. The easiest way to change the permissions of the ASM disks is to create a new set of udev rules, such as the following:

# cat 61-asm.rules
 KERNEL=="xvd[cdefg]1", OWNER="oracle", GROUP="asmdba" MODE="0660"

After a quick “start_udev” as root these were applied.

Note that as per my domU config file I actually know the device names are persistent, so it was easy to come up with this solution. In real life you would use the dm-multipath package which allows setting the owner,group and permission now in /etc/multipath.conf for every ASM LUN.

There was an interesting problem initially in that kfod seemed to trigger a change of permissions back to root:disk whenever it ran. Changing the ownership back to oracle only lasted until the next execution of kfod. The only fix I could come up with involved the udev rules.

Good news for those who suffered from the multicast problem introduced in 11.2.0.2-cluvfy now knows about it and checks during the post hwos stage (I had already installed cvuqdisk):

[oracle@rac11203node1 grid]$ ./runcluvfy.sh stage -post hwos -n rac11203node1

Performing post-checks for hardware and operating system setup

Checking node reachability...
Node reachability check passed from node "rac11203node1"

Checking user equivalence...
User equivalence check passed for user "oracle"

Checking node connectivity...

Checking hosts config file...

Verification of the hosts config file successful

Node connectivity passed for subnet "192.168.99.0" with node(s) rac11203node1
TCP connectivity check passed for subnet "192.168.99.0"

Node connectivity passed for subnet "192.168.100.0" with node(s) rac11203node1
TCP connectivity check passed for subnet "192.168.100.0"

Node connectivity passed for subnet "192.168.101.0" with node(s) rac11203node1
TCP connectivity check passed for subnet "192.168.101.0"

Interfaces found on subnet "192.168.99.0" that are likely candidates for VIP are:
rac11203node1 eth0:192.168.99.129

Interfaces found on subnet "192.168.100.0" that are likely candidates for a private interconnect are:
rac11203node1 eth1:192.168.100.129

Interfaces found on subnet "192.168.101.0" that are likely candidates for a private interconnect are:
rac11203node1 eth2:192.168.101.129

Node connectivity check passed

Checking multicast communication...

Checking subnet "192.168.99.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.99.0" for multicast communication with multicast group "230.0.1.0" passed.

Checking subnet "192.168.100.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.100.0" for multicast communication with multicast group "230.0.1.0" passed.

Checking subnet "192.168.101.0" for multicast communication with multicast group "230.0.1.0"...
Check of subnet "192.168.101.0" for multicast communication with multicast group "230.0.1.0" passed.

Check of multicast communication passed.
Check for multiple users with UID value 0 passed
Time zone consistency check passed

Checking shared storage accessibility...

Disk                                  Sharing Nodes (1 in count)
------------------------------------  ------------------------
/dev/xvda                             rac11203node1
/dev/xvdb                             rac11203node1
/dev/xvdc                             rac11203node1
/dev/xvdd                             rac11203node1
/dev/xvde                             rac11203node1
/dev/xvdf                             rac11203node1
/dev/xvdg                             rac11203node1

Shared storage check was successful on nodes "rac11203node1"

Post-check for hardware and operating system setup was successful.

As always, I tried to fix as many problems before invoking runInstaller as possible. The “-fixup” option to runcluvfy is again very useful. I strongly recommend running the fixup script prior to executing the OUI binary.

The old trick to remove /etc/ntp.conf causes the NTP check to complete ok, in which case you are getting the ctsd service for time synchronisation. You should not do this in production-consistent times in the cluster are paramount!

I encountered an issue with the check for free space later in the installation during my first attemps. OUI wants 7.5G for GRID_HOME, even though the installation “only” took around 3 in the end. I exported TMP and TEMP to point to my 10G mount point to avoid this warning:

$ export TEMP=/u01/crs/temp
$ export TMP=/u01/crs/temp
$ ./runInstaller

The installation procedure for Grid Infrastructure 11.2.0.3 is almost exactly the same as for 11.2.0.2, except for the option to change the AU size for the initial disk group you create:


Once you have completed the wizard, it’s time to hit the “install” button. The magic again happens in the root.sh file, or rootupgrade.sh if you are upgrading. I included the root.sh output so you have something to compare against:

Performing root user operation for Oracle 11g

The following environment variables are set as:
ORACLE_OWNER= oracle
ORACLE_HOME=  /u01/crs/11.2.0.3

Enter the full pathname of the local bin directory: [/usr/local/bin]: Creating y directory...
Copying dbhome to y ...
Copying oraenv to y ...
Copying coraenv to y ...

Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /u01/crs/11.2.0.3/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
OLR initialization - successful
root wallet
root wallet cert
root cert export
peer wallet
profile reader wallet
pa wallet
peer wallet keys
pa wallet keys
peer cert request
pa cert request
peer cert
pa cert
peer root cert TP
profile reader root cert TP
pa root cert TP
peer pa cert TP
pa peer cert TP
profile reader pa cert TP
profile reader peer cert TP
peer user cert
pa user cert
Adding Clusterware entries to upstart
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac11203node1'
CRS-2676: Start of 'ora.mdnsd' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac11203node1'
CRS-2676: Start of 'ora.gpnpd' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac11203node1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac11203node1'
CRS-2676: Start of 'ora.gipcd' on 'rac11203node1' succeeded
CRS-2676: Start of 'ora.cssdmonitor' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac11203node1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac11203node1'
CRS-2676: Start of 'ora.diskmon' on 'rac11203node1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac11203node1' succeeded

ASM created and started successfully.

Disk Group OCR created successfully.

clscfg: -install mode specified
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4256: Updating the profile
Successful addition of voting disk 1621f2201ab94f32bf613b17f62982b0.
Successful addition of voting disk 337a3f0b8a2d4f7ebff85594e4a8d3cd.
Successful addition of voting disk 3ae328cce2b94f3bbfe37b0948362993.
Successfully replaced voting disk group with +OCR.
CRS-4256: Updating the profile
CRS-4266: Voting file(s) successfully replaced
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
1. ONLINE   1621f2201ab94f32bf613b17f62982b0 (/dev/xvdc1) [OCR]
2. ONLINE   337a3f0b8a2d4f7ebff85594e4a8d3cd (/dev/xvdd1) [OCR]
3. ONLINE   3ae328cce2b94f3bbfe37b0948362993 (/dev/xvde1) [OCR]
Located 3 voting disk(s).
CRS-2672: Attempting to start 'ora.asm' on 'rac11203node1'
CRS-2676: Start of 'ora.asm' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.OCR.dg' on 'rac11203node1'
CRS-2676: Start of 'ora.OCR.dg' on 'rac11203node1' succeeded
CRS-2672: Attempting to start 'ora.registry.acfs' on 'rac11203node1'
CRS-2676: Start of 'ora.registry.acfs' on 'rac11203node1' succeeded
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

That’s it! After returning to the OUI screen you run the remaing assistants and finally are rewarded with the success message:

Better still, I could now log in to SQL*Plus and was rewarded with the new version:

$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Sat Sep 24 22:29:45 2011

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> select * from v$version;

BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
CORE    11.2.0.3.0      Production
TNS for Linux: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production

SQL>

Summary

You might remark that in the output there has only ever been one node referenced. That is correct-my lab box has limited resources and I’d like to test the addNode.sh script for each new release so please be patient! I’m planning an article about upgrading to 11.2.0.3 soon, as well as the addition of a node. One thing I noticed was the abnormally high CPU usage for the CSSD processes: ocssd.bin, cssdagent and cssdmonitor-something I find alarming at the moment.

top - 22:53:19 up  1:57,  5 users,  load average: 5.41, 4.03, 3.77
Tasks: 192 total,   1 running, 191 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  0.2%sy,  0.0%ni, 99.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4102536k total,  3500784k used,   601752k free,    59792k buffers
Swap:  1048568k total,     4336k used,  1044232k free,  2273908k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
27646 oracle    RT   0 1607m 119m  53m S 152.0  3.0  48:57.35 /u01/crs/11.2.0.3/bin/ocssd.bin
27634 root      RT   0  954m  93m  55m S 146.0  2.3  31:45.50 /u01/crs/11.2.0.3/bin/cssdagent
27613 root      RT   0  888m  91m  55m S 96.6  2.3  5124095h /u01/crs/11.2.0.3/bin/cssdmonitor
28110 oracle    -2   0  485m  14m  12m S  1.3  0.4   0:34.65 asm_vktm_+ASM1
28126 oracle    -2   0  499m  28m  15m S  0.3  0.7   0:04.52 asm_lms0_+ASM1
28411 root      RT   0  500m 144m  59m S  0.3  3.6  5124095h /u01/crs/11.2.0.3/bin/ologgerd -M -d /u01/crs/11.2.0.3/crf/db/rac11203node1
32394 oracle    20   0 15020 1300  932 R  0.3  0.0  5124095h top
1 root      20   0 19336 1476 1212 S  0.0  0.0   0:00.41 /sbin/init

...

11.2.0.2 certainly didn’t use that much CPU across 4 cores…

Update: I have just repeated the same installation on VirtualBox 4.1.2 with less potent hardware, and funny enough the CPU problem has disappeared. How is that possible? I need to understand more, and maybe update the XEN host to something more recent.

Oracle 11.2.0.3-can’t be long now!

Update: well it’s out, actually. See the comments below. However the certification matrix hasn’t been updated so it’s anyone’s guess if Oracle/Red Hat 6 are certified at this point in time.

Tanel Poder has already announced it a few days ago, but 11.2.0.3 must be ready for release very soon. It has even been spotted in the “lastet patchset” page on OTN, only to be removed quickly. After another tweet came out from Laurent Schneider, it was time to investigate what’s new. The easiest way is to point your browser to tahiti.oracle.com and type “11.2.0.3” into the search box. You are going to find a wealth of new information!

As a RAC person by heart I am naturally interested in RAC features first. The new features I spotted in the Grid Infrastructure installation guide for Linux are listed here:

http://download.oracle.com/docs/cd/E11882_01/install.112/e22489/whatsnew.htm#CEGJBBBB

Additional information for this article was taken from the “New Features” guide:

http://download.oracle.com/docs/cd/E11882_01/server.112/e22487/chapter1_11203.htm#NEWFTCH1-c

So the question is-what’s in for us?

ASM Cluster File System

As I expected, there was support for ACFS and ADVM for Oracle’s own kernel. This has been overdue for a while. I remember how surprised I was when I installed RAC on Oracle Linux 5.5 with the UEK kernel only to see that infamous “…not supported” output when the installer probed the kernel version. Supported kernels are Linux kernels UEK5-2.6.32-100.34.1 and subsequent updates to 2.6.32-100 kernels for Oracle Linux kernels OL5 and OL6.

A big surprise is support for ACFS for SLES-I though that was pretty much dead in the water after all that messing around from Novell. ACFS has always worked on SLES 10 up to SP3, but it did never for SLES 11. The requirement is SLES 11 SP1, and it has to be 64bit.

There are quite a few additional changes to ACFS. For example, it’s now possible to use ACFS replication and tagging on Windows.

Upgrade

If one of the nodes in the cluster being upgraded with the rootupgrade.sh script fails during the execution of said script, the operation can be completed with the new “force” flag.

Random Bits

I found the following note on MOS regarding the time zone file: Actions For DST Updates When Upgrading To Or Applying The 11.2.0.3 Patchset [ID 1358166.1] I suggest you have a look at that note, as it mentions a new pre-upgrade script you need to download and corrective actions for 11.2.0.1 and 11.2.0.2. I’m sure it’s going to be mentioned in the 11.2.0.3 patch readme as well.

There are also changes expected with the dreaded mutex problem in busy systems, MOS note WAITEVENT: “library cache: mutex X” [ID 727400.1] lists 11.2.0.3 as the release where many of the problems related to this are fixed. Time will tell if they are…

Further enhancements are focused on Warehouse Builder, and XML. SQL apply and the log miner have also been enhanced which is good news for users of Streams and logical standby databases

Summary

It is much to early to say anything else about the patchset. Quite a few important documents don’t have a “new features” section yet. That includes the Real Application Clusters Administration and Deployment guide as well which I’ll cover as soon as it’s out. From a quick glance at the still unreleased patchset it seems it’s less of a radical change than 11.2.0.2 was which is a good thing. Unfortunately the certification matrix hasn’t been updated yet, I am very keen to see support for Oracle/Red Hat Linux 6.x.

Collecting and analysing Exadata cell metrics

Recently I have been asked to write a paper about Exadata Flash Cache and its impact on performance. This was a challenge to my liking! I won’t reproduce the paper I wrote, but I’d like to demonstrate the methods I used to get more information about what’s happening in the software.

Hardware

The Exadata Flash Cache is provided by four F20 PCIe cards in each cell. Currently the PCI Express bus is the most potent way to realise the potential of the flash disk in terms of latency and bandwidth. SSDs attached to a standard storage array will be slowed by fibre channel as the transport medium.

Each of the F20 cards holds 96G of raw space, totalling in 384GB of capacity per storage cell. The usable capacity is slightly less. The F20 card is subdivided into 4 so called FMODs, or solid state flash modules visible to the operating system using the standard SCSI SD driver.

Cellcli can also be used to view the FMODs using the “LIST PHYSICALDISK” command. The output is slightly different from the spinning disks as they are reported SCSI drivee’s [host:bus:target:lun] notation.

Now please don’t be tempted to take the FMODs and transform them into celldisks!

How it works

The flash cache is a write-through cache, and therefore won’t speed up write operations to disk. If an internal, undocumented algorithm decides that the data just written to disk is suitable for caching, it will put the data into the ESFC. It is quite clever in this respect and performs optimisations “ordinary” cache cannot perform.

Smart scans are going to ignore the flash cache by default and read directly from disk, unless the segment’s storage clause specifically instructs Oracle to read from flash cache as well. Meddling with the storage clause is also referred to as object pinning, a practice which is not recommended. Pinning objects to the flash cache immediately reminded me of the bad practice in the old 8i days of pinning objects in the shared pool. It didn’t necessarily made things go faster or help (except in some special cases).

The main statistics counter to query in relation to the smart cache is called “cell flash cache read hits”. The remainder of this paper has to do with the cell metrics which cannot be queried directly from the database. They are more granular though making them well suited for analysis.

The test case

The swingbench order entry benchmark has provided the data for the following text. The table ORDER_ITEMS contains most rows in the schema, and it is partitioned to reflect a realistic environment. Indexes have been made invisible as they may have impacted the performance statistics. Additionally cellcli’s WHERE clause isn’t very flexible and querying a number of objectIDs can be time consuming.

SOE.ORDER_ITEMS is divided into partitions with object IDs 74506 to 74570. These object IDs are required for mapping the output of cellcli to the data dictionary later.

A little bit of setup has been performed to capture the cell metrics. For pre and post comparison it was necessary to capture statistics in a statspack-like way and to analyse them. Rather than using Excel I decided to store the gathered information in the database in a dedicated schema.

I gathered two different sets of information from the cells: firstly the metrics for the objectType “FLASHCACHE”, and secondly the contents of the flash cache using the fabulous “dcli” command. The distributed command line interface can be used to execute arbitrary commands on a number of hosts simultaneously and reports the results back to the caller. I am redirecting the output from STDOUT to a CSV file in my home directory and use external tables to copy this information into the so-called “RESULT” table.

The table DDL is as follows:

create table cacheContents_csv (
cellname varchar2(50),
cachedKeepSize     number,
cachedSize         number,
hitCount           number,
missCount          number,
objectNumber             number
)
organization external (
type oracle_loader
default directory testdir
access parameters (
records delimited  by newline
fields  terminated by ';'
missing field values are null
)
location ('flashcontents.csv')
)
reject limit unlimited;

CREATE TABLE cacheContents_RESULT AS
SELECT 1 AS
TESTRUN#,
CELLNAME,
CACHEDKEEPSIZE,
CACHEDSIZE,
HITCOUNT,
MISSCOUNT,
OBJECTNUMBER
FROM cacheContents_csv A;

CREATE TABLE "METRICCURRENT_CSV"
(
"CELLNAME" VARCHAR2(100),
"NAME" VARCHAR2(50),
"OBJECTTYPE" VARCHAR2(50),
"METRICVALUE" VARCHAR2(100)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY TESTDIR
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
FIELDS LDRTRIM
REJECT ROWS WITH ALL NULL FIELDS
(
"CELLNAME" (1:12) CHAR,
"NAME" (14:36) CHAR,
"OBJECTTYPE" (39:48) CHAR,
"METRICVALUE" (51:90) CHAR
)
)
location
(
'metrics.csv'
)
) REJECT LIMIT UNLIMITED;

CREATE TABLE "METRICCURRENT_RESULT"
(
"TESTRUN#"    NUMBER,
"CELLNAME"    VARCHAR2(100 BYTE),
"NAME"        VARCHAR2(50 BYTE),
"OBJECTTYPE"  VARCHAR2(50 BYTE),
"METRICVALUE" VARCHAR2(100 BYTE)
)

I additionally added a table called “metricdefinition” which allows me to match the short metric name from metriccurrent with a more human readable description.

CREATE TABLE "METRICDEFINITION"
(
"NAME"        VARCHAR2(100 BYTE),
"DESCRIPTION" VARCHAR2(500 BYTE),
"METRICTYPE"  VARCHAR2(50 BYTE),
"OBJECTTYPE"  VARCHAR2(50 BYTE),
"UNIT"        VARCHAR2(20 BYTE)
)

The definitions, too, were loaded into the database using sqlldr.

First the metrics were captured:

# dcli -l root -g /tmp/martin_group 'cellcli -e "list METRICCURRENT where objecttype = \"FLASHCACHE\""' | sed -e 's/://' > /home/oracle/martin/metrics.csv

I admit it’s probably not the most elegant way to do so but it worked for me and I didn’t have weeks to prepare (as always…)

Now what does this command do? It will execute a call to cellcli on all nodes defined in /tmp/martin_group to list the current metrics for the Flash Cache. The result is then piped into sed which strips out any colons after the cell names and then saves the result in my TESTDIR directory on the first RDBMS node.

Similarly I am using the following command to get the contents of the flash cache to see if anything has changed after executing the query.

 # dcli -l root -g /tmp/martin_group 'cellcli -e "list flashcachecontent where dbUniqueName like 'TEST' and objectNumber > 74505 and objectNumber < 74571 attributes cachedKeepSize,cachedSize,hitCount,missCount,objectNumber"' | sed -e 's/://' | awk '{print $1";"$2";"$3";"$4";"$5";"$6}' > /home/oracle/martin/flashcontents.csv

This looks a little more complex, but it isn’t. The apparent complexity is required to limit the result set to the contents of the flash cache for database TEST and the object_ids of the table partitions. There is a little more transformation involved with AWK to convert the output in a “;” separated line.

With this done, it was possible to transfer the information in the staging tables into the RESULT tables. This is a simple “insert into table select testrun#, * from …CSV”.

The analysis I did was very simple once the data has been loaded into the RESULT tables. For the flashcache contents I used the following queries:

WITH prestats AS
( SELECT * FROM metriccurrent_result WHERE testrun# = &1
),
poststats AS
( SELECT * FROM metriccurrent_result WHERE testrun# = &2
)
SELECT pre.cellname,
--pre.name,
defn.description,
pre.metricvalue pre_metricvalue,
post.metricvalue post_metricvalue,
to_number(REGEXP_SUBSTR(post.metricvalue, '(\S*)'),'999G999G999D999999', 'NLS_NUMERIC_CHARACTERS = ''.,''') -
to_number(REGEXP_SUBSTR( pre.metricvalue, '(\S*)'),'999G999G999D999999', 'NLS_NUMERIC_CHARACTERS = ''.,''') delta
FROM prestats pre,
poststats post,
metricdefinition defn
WHERE pre.cellname = post.cellname
and pre.name = post.name
and pre.objecttype = post.objecttype
and pre.name = defn.name(+)
and pre.name = 'FC_IO_BYKEEP_R'
and (to_number(REGEXP_SUBSTR(post.metricvalue, '(\S*)'),'999G999G999D999999', 'NLS_NUMERIC_CHARACTERS = ''.,''') -
to_number(REGEXP_SUBSTR( pre.metricvalue, '(\S*)'),'999G999G999D999999', 'NLS_NUMERIC_CHARACTERS = ''.,''') ) <> 0
order by 1,2;

WITH prestats AS
( SELECT * FROM cachecontents_result WHERE testrun# = &1
),
poststats AS
( SELECT * FROM cachecontents_result WHERE testrun# = &2
)
SELECT pre.cellname,
pre.objectnumber,
pre.cachedkeepsize pre_cachedkeepsize,
pre.cachedsize pre_cachedsize,
pre.hitcount pre_hitcount,
pre.misscount pre_misscount,
post.cachedkeepsize post_cachedkeepsize,
post.cachedsize post_cachedsize,
post.hitcount post_hitcount,
post.misscount post_misscount,
(post.cachedkeepsize-pre.cachedkeepsize) delta_cachedkeepsize,
(post.cachedsize-pre.cachedsize) delta_cachedsize,
(post.hitcount-pre.hitcount) delta_hitcount,
(post.misscount-pre.misscount) delta_misscount
FROM prestats pre,
poststats post
WHERE pre.cellname   = post.cellname
AND pre.objectnumber = post.objectnumber
and
ORDER BY 1,2

This revealed quite a lot about the underlying workings of the flash cache. The above queries are best executed with SQL Developer and the result is then copied and pasted into a spreadsheet application to be saved from accidental overwriting.

Summary

The above certainly isn’t bullet proof or very elegant, but a good starting point for your own experiments. Time permitting I would like to automate the generation of the CSV files via a scheduled job, and also the population of the result tables. There really should be some sort of unique key on the results tables to avoid duplicates. Maybe one day I’d like to write a statspack like version of the lot…

Compiling RLWRAP for Oracle Linux 6.1 x86-64

RLWrap is a great too to enhance the user experience with SQL*Plus by allowing it to make use of the GNU readline library. Search the Internet for RLWrap and sqlplus and you should get plenty of hits explaining how awesome that combination is.

Why am I writing this? I am currently in the process of upgrading my lab reference database server to Oracle Linux 6.1, and in the process I wanted to install the rlwrap tool to get read line support with SQLPlus. It’s actually quite simple, all I did after installing the operating system with the “database server” package is described in the few steps that follow this introduction.

First of all I was surprised that gcc wasn’t installed with the package group, and some other bits were missing too. This can easily be rectified by allowing access to the Oracle Public YUM server. I’m behind a gateway so have to use a proxy server:

[root@rhel61ref rlwrap-0.37]# export http_proxy=http://192.168.xx.10:3128
[root@rhel61ref rlwrap-0.37]# export https_proxy=https://192.168.xx.10:3128

This is a very basic SQUID setup without authentication. It’s also behind a firewall and cannot be accessed from the outside world.

Now with this set, go to /etc/yum.repos.d/ and use wget to get the repository configuration:

[root@rhel61ref rlwrap-0.37]# wget http://public-yum.oracle.com/public-yum-ol6.repo

This file needs to be edited-set “enabled” to 1 for the repository called [ol6_u1_base]. A quick “yum update” as root updates the repository information. I needed the following packages (including dependencies) to be installed as well:

[root@rhel61ref rlwrap-0.37]# yum install gcc
 [root@rhel61ref rlwrap-0.37]# yum install readline-devel

Now it’s time to get RLWrap itself, at the time of this writing it was version 0.37:

[root@rhel61ref rlwrap-0.37]# wget http://utopia.knoware.nl/~hlub/uck/rlwrap/rlwrap-0.37.tar.gz

I downloaded it to my home directory. There was a small problem related to the configure process:

[root@rhel61ref rlwrap-0.37]# ./configure
 checking build system type... Invalid configuration `x86_64-unknown-linux-': machine `x86_64-unknown-linux' not recognized
 configure: error: /bin/sh tools/config.sub x86_64-unknown-linux- failed
 [root@rhel61ref rlwrap-0.37]#

To work around this, you can use the “–build” flag:

[root@rhel61ref rlwrap-0.37]# ./configure --build=x86_64

This did the trick. After the configure command completed, all I needed to do was to “make” the application and “make install”. Now all I need to do is wait for Oracle to release 11.2.0.3 hopefully around OpenWorld and install it to my system. Until then I’ll have to work on understanding upstart…

Alternative

The alternative is to grab a RPM from a compatible source, such as EPEL (http://fedoraproject.org/wiki/EPEL), but that depends if you are willing to mix RPMs from different sources which may not be possible due to political reasons.

Pimp my collectl-advanced system monitoring using collect-utils part I

I have recently written about collectl, a truly superb troubleshooting utility, in a previous post. After comments from Mark Seeger (the author) and Kevin Closson (who has used it extensively and really loves it), I have decided to elaborate a bit more about what you can do with collectl.

Even though it’s hard to believe, collectl’s functionality can be extended by using the collectl-utilities from sourceforge, available here: http://collectl-utils.sourceforge.net/

Like collectl, you can either download a source tgz file or a noarch-RPM. Collectl-utils consist of three major tools, out of which I’d like to introduce the first one: colplot. When finding time I’ll create a post about the other part, most likely about colmux first.

colplot

I mentioned in said previous post that you can use the “-P” option to generate output in a plot format. This in turn can be fed to your favourite spreadsheet application, or alternatively into gnuplot. When chosing to use a spreadsheet application, it’s your responsibility to decide what to do with the raw data, each time you load a plotfile. Maybe, one day I’ll write a collectl-analyzer which does similar things to nmon-analyzer, but that has to wait for now. So if you are lazy like me, you need another alternative, and it comes easily accessible in the form of gnuplot.

Although I am very impressed by what gnuplot can do, I never had the time or energy to get to grips with all its options. When at University I used Mathematica 2 (yes it’s been a while) and thought the plot2d() function was complex …

Now for the good news: the complexity of gnuplot is nicely hidden by colplot, which takes the burden of generating the plot files away from the user. And to make it more comfortable, all of this happens through a web interface. All you need is a web server such as apache and a little bit of initial configuration for it to work. I should also note that colplot can be used on the command line as well, but that is out of scope of this article.

This time around I downloaded the source tarball rather than the RPM as I wanted more control over the installation process. If you chose the RPM it is good to know that it has intelligence to tell SLES apart from RHEL and updates the web server configuration accordingly. If you decide to manually install colplot, check the INSTALL script as it can help you getting started. And don’t forget to read INSTALL-colplot and consult colplot-apache.conf for a sample apache configuration. The latter can go to /etc/httpd/conf.d on RHEL and will take effect after a reloading of the apache configuration. You also need collectl installed on the host running the collect-utils.

Colplot uses a directory, usually called plotfiles, where the recorded collectl output is stored. By default, it resides in the same directory as colplot but can be changed in the GUI.

I am thinking of using NFS to export the plotfiles directory, so that each monitored host could mount the directory and store output files. The more progressive use of SSHFS is probably out of scope for most database servers, but on my lab I’m king and do what I like. I personally found it easiest to use “collectl -P -f /mnt/sshfs/plotfiles/ ” to generate the files, where mount/sshfs/plotfiles was mounted from the web server host. If you are planning on generating the colplot output file names manually, i.e. not pointing to a directory, then make sure they are unique! This makes is easy to compare systems, as we’ll see below. One thing I noticed is that detail files all get their own trace file name in the form “host-date.type”, where type is dsk for detailed disk information etc.

After all the webserver setup is complete, point your browser to the host where you installed colplot. As I said, the “plotfiles” directory is scanned for files, which are processed. You see the following screen:

Using it

In the GUI, the first step you define which time of day with matching/gathered collectl information you would like to visualise (open the above screenshot in a separate window to better follow this discussion).

  • You can limit the information to be displayed to a certain time period, i.e. if you captured a day’s worth of statistics but only need the hour from 13:00 to 14:00 that’s a simple setting in the user interface
  • Alternatively, select “last 60 minutes” for the most recent period

You can also list the contents of the plotfiles directory, or even change the location-but bear in mind that the webserver still has to be able to read files from there!

If you like, you can instruct colplot to narrow down the files to be plotted by editing “filenames containing”. If the plotfiles directory contains information to satisfy the period/names you are interested in, it will plot it after a click on “Generate plot”. I suggest a display of “SysPlot” initially, which plots the systems recorded in the colplot files side by side. This is very useful for comparison of system health, especially in clusters. You should experiment with the different plot settings, which are very useful to do all sorts of analysis and allow aggregation on days, system and plots and various combinations of these. By the way the system name is derviced from the hostname when using the colplot -P -f /path/to/plotfiles/ … command.

Once you familiarised yourself with the options, you can further narrow down which data you interested in. I would suggest the “All Plots” option to get you started, unless of course you know what you are after. Colplot, like collectl differentiates between “summary” and “detail” plots. Of course, it can only plot what you recorded! Each of these has a convenient “All Plots” option to display all the information gathered. Here’s the slightly cropped output from a 2 node cluster I gathered (click for a larger view):

A very usful function is to email the results either in PDF or PNG format. For this to work you need uuencode (package sharutils in RHEL) on the web server host, and you need to be able to send email via the command line-colplot uses the mail (1) utility to send email. Sending a PDF is probably more useful than the PNG option, as the latter will send a each graph separately in a tar archive.

Summary

I can’t say how impressed I am with colplot, it’s really great for working out what happened when during a benchmark. The great things is the comparison of systems side by side which gives clear indications of imbalances and trensing. Using colplot is also a lot easier than writing your own spreadsheet macros to visualise the data. I really like it!