Oakies Blog Aggregator

NUMBER data type…what harm can it do?

There’s a somewhat sour discussion going on attached to the video at https://www.youtube.com/watch?v=jZW-uLb52xk

Whether you agree or disagree with the video or the comments, or (sadly) the animosity in them, it does lead to an interesting bit of investigation when it comes to data types with arbitrary precision, which was stumbled upon by a friend at work.

Let’s start with a simple comparison between two dates. In this case, I’ve just used times, because it still serves to demonstrate the anomaly.

SQL> SELECT (TO_DATE('14:03:15','hh24:mi:ss')- TO_DATE('14:02:25','hh24:mi:ss')) * 24 * 60 * 60 delta
  2  FROM   dual;

     DELTA
----------
        50

Thankfully, the database has come back with the right answer of 50 seconds. Of course, we might want to remember that result – so let’s store it in a variable:

SQL> variable x number
SQL> begin
  2  SELECT (TO_DATE('14:03:15','hh24:mi:ss')- TO_DATE('14:02:25','hh24:mi:ss')) * 24 * 60 * 60
  3  into   :x
  4  FROM   dual;
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> print x

         X
----------
        50

So far so good…let’s now make the tiniest of changes. We’ll use a local PL/SQL variable

SQL> set serverout on
SQL> declare
  2    x number;
  3  begin
  4  SELECT (TO_DATE('14:03:15','hh24:mi:ss')- TO_DATE('14:02:25','hh24:mi:ss')) * 24 * 60 * 60
  5  into x
  6  FROM   dual;
  7  dbms_output.put_line(x);
  8  end;
  9  /
50.00000000000000000000000000000000000004

PL/SQL procedure successfully completed.

The use of arbitrary precision for ‘x’ shows how things can go a little bit awry. Similarly, let’s look at what happens if we convert the dates to simple ‘seconds past midnight’.

SQL> SELECT to_char(TO_DATE('14:03:15','hh24:mi:ss'),'SSSSS') d1,
  2         to_char(TO_DATE('14:02:25','hh24:mi:ss'),'SSSSS') d2
  3  FROM   dual;

D1    D2
----- -----
50595 50545

You might think that the calculation would be the same, but when we insert those numbers into the equivalent calculation, we get a slightly different answer:

SQL> exec dbms_output.put_line((50595/86400 - 50545/86400)* 24 * 60 * 60 );
49.99999999999999999999999999999999999968

Ultimately, this probably boils down to the fact that certain division operations can never give a perfect answer in floating point arithmetic. In the example above, 50595/86400 yields the never-ending result 0.58559027777777777777777…
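
One way to keep the PL/SQL result clean is to give the local variable a bounded precision and scale, so that the assignment rounds the value. The following is just a sketch along the lines of the earlier examples (the same effect can be had by wrapping the expression in ROUND):

declare
  x number(10,2);   -- bounded precision/scale instead of an unconstrained NUMBER
begin
  select (to_date('14:03:15','hh24:mi:ss') - to_date('14:02:25','hh24:mi:ss')) * 24 * 60 * 60
  into   x
  from   dual;
  dbms_output.put_line(x);   -- the assignment rounds to scale 2, so this should print 50
end;
/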

Putting PL/SQL aside, there might also be hidden costs if you start using unbounded precision in your database tables. I always remember this example from Steve Adams many years ago.

SQL> create table T ( x1 number, x2 number(6,3));

Table created.

SQL> insert into T values ( 3*(1/3), 3*(1/3) );

1 row created.

SQL> commit;

Commit complete.

SQL> select * from T;

        X1         X2
---------- ----------
         1          1

No strange decimals there…but then take a look at how you stored that data:

SQL> select vsize(x1), vsize(x2) from T;

 VSIZE(X1)  VSIZE(X2)
---------- ----------
        21          2
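
A quick follow-up sketch (not part of the original example): if you round the value before storing it, even the unconstrained column stays compact, because 3*(1/3) rounded to 3 decimal places is exactly 1.

insert into T values ( round(3*(1/3),3), round(3*(1/3),3) );

select vsize(x1), vsize(x2) from T;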

Bottom line – things can go astray when you don’t keep a handle on the appropriate precision to use for your data types.

Function-Based Indexes And CURSOR_SHARING = FORCE

In general it is known that Function-Based Indexes (FBIs) can no longer be used by the optimizer if the expression contains literals and CURSOR_SHARING = FORCE / SIMILAR (deprecated) turns those literals into bind variables. Jonathan Lewis described the issue quite a while ago here in detail.

In a recent OTN thread this issue was raised again, but to my surprise, when I played around with a test case that mimicked the OP's problem query, I found that (certain) Oracle versions have some built-in logic that enables FBI usage for certain cases where you would expect the index to be unusable.

If you test the following code on versions from 10.2.0.4 (possibly earlier) up to and including version 11.2.0.3, you'll notice some interesting details:


create table t
as
select * from all_objects;

create index t_idx on t (owner || ' ' || object_name);

exec dbms_stats.gather_table_stats(null, 't')

set echo on linesize 200 pagesize 0

alter session set cursor_sharing = force;

select /*+ full(t) */ * from t where owner || ' ' || object_name = 'BLA';

select * from table(dbms_xplan.display_cursor);

select /*+ index(t) */ * from t where owner || ' ' || object_name = 'BLA';

select * from table(dbms_xplan.display_cursor);

select /*+ index(t) */ * from t where owner || 'A' || object_name = 'BLA';

select * from table(dbms_xplan.display_cursor);

Here is the relevant output I got from 11.2.0.1 for example:


SQL> alter session set cursor_sharing = force;

Session altered.

SQL>
SQL> select /*+ full(t) */ * from t where owner || ' ' || object_name = 'BLA';

no rows selected

SQL>
SQL> select * from table(dbms_xplan.display_cursor);
SQL_ID ar3tw7r1rvawk, child number 0
-------------------------------------
select /*+ full(t) */ * from t where owner || :"SYS_B_0" || object_name
= :"SYS_B_1"

Plan hash value: 1601196873

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |   284 (100)|          |
|*  1 |  TABLE ACCESS FULL| T    |     1 |   117 |   284   (2)| 00:00:04 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("OWNER"||' '||"OBJECT_NAME"=:SYS_B_1)


19 rows selected.

SQL>
SQL> select /*+ index(t) */ * from t where owner || ' ' || object_name = 'BLA';

no rows selected

SQL>
SQL> select * from table(dbms_xplan.display_cursor);
SQL_ID 6kzz3vw5x8x3b, child number 0
-------------------------------------
select /*+ index(t) */ * from t where owner || :"SYS_B_0" ||
object_name = :"SYS_B_1"

Plan hash value: 470836197

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |       |       |     4 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |   117 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IDX |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("T"."SYS_NC00016$"=:SYS_B_1)


20 rows selected.

SQL>
SQL> select /*+ index(t) */ * from t where owner || 'A' || object_name = 'BLA';

no rows selected

SQL>
SQL> select * from table(dbms_xplan.display_cursor);
SQL_ID 6kzz3vw5x8x3b, child number 1
-------------------------------------
select /*+ index(t) */ * from t where owner || :"SYS_B_0" ||
object_name = :"SYS_B_1"

Plan hash value: 3778778741

-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |       |       | 52472 (100)|          |
|*  1 |  TABLE ACCESS BY INDEX ROWID| T     |   724 | 84708 | 52472   (1)| 00:10:30 |
|   2 |   INDEX FULL SCAN           | T_IDX | 72351 |       |   420   (1)| 00:00:06 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("OWNER"||:SYS_B_0||"OBJECT_NAME"=:SYS_B_1)


20 rows selected.

Looking at the statement text that results from CURSOR_SHARING = FORCE we can spot the expected bind variables instead of the literals, and this should result in a corresponding predicate that doesn't match the FBI expression. However, when looking at the filter expression in the predicate section (when forcing a full table scan) we can spot something interesting: it still shows the literal, which doesn't correspond to the predicate of the rewritten query text.

The next execution shows that the FBI really can be used despite the bind variable replacement taking place, and the final execution shows that cursor sharing works correctly in the sense that a new child cursor got created for the same SQL text, with a different plan and a different predicate section, when a different literal was used in the original SQL text. V$SQL_SHARED_CURSOR shows "HASH_MATCH_FAILED", which is described as "No existing child cursors have the unsafe literal bind hash values required by the current cursor". That makes sense and probably means that the corresponding bind variable is marked as "unsafe" internally.

This optimisation only shows up if there is a suitable FBI: if there's no corresponding expression, the SQL text and predicate section match. Furthermore it only supports certain expressions, and Jonathan's example shows that in general these rewrites do prevent FBI usage. And obviously it ceases to work in 11.2.0.4 and 12c. Whether this is a bug or a feature I don't know, but since it only seems to apply to certain expressions it's probably not that relevant anyway.

As Jonathan points out in his note, you can always work around the general problem by hiding the expression in a view, and since 11g a proper virtual column definition is of course the better approach, which doesn't expose this problem either. Even better would be the proper use of bind variables rather than forced cursor sharing, but there are still many installations out there that rely on that feature.
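
To illustrate the virtual column alternative, here is a sketch (not part of the original test case, and the column name is made up) of how the FBI could be replaced from 11g onwards:

drop index t_idx;

-- the concatenation literal now lives in the table definition,
-- so CURSOR_SHARING = FORCE never gets a chance to replace it
alter table t add (owner_and_name generated always as (owner || ' ' || object_name) virtual);

create index t_vc_idx on t (owner_and_name);

-- only the comparison value gets turned into a bind variable, which is harmless
select * from t where owner_and_name = 'BLA';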

Little things worth knowing: Is there a penalty in establishing a connection to Oracle using the MAA connection string?

Sorry for the long title!

I had a question about the MAA connection string during my session on “advanced RAC programming features” at the last Paris Oracle Meetup. I showed an example taken from the Application Continuity White Paper (http://www.oracle.com/technetwork/database/options/clustering/application-continuity-wp-12c-1966213.pdf). Someone from the audience asked me if I had experienced any problems with it, such as very slow connection timeouts. I haven’t, but wanted to double-check anyway. This is a simplified test using a sqlplus connection since it is easier to time than the creation of a connection pool. If you know of a way to reliably do so in Java/UCP let me know and I’ll test it.

My system is 12.1.0.2 on Oracle Linux 7.1 with UEK (3.8.13-55.1.6.el7uek.x86_64) and the following patches on the RDBMS and Grid Infrastructure side:

[oracle@rac12sby1 ~]$ opatch lspatches -oh /u01/app/12.1.0.2/grid
19872484;WLM Patch Set Update: 12.1.0.2.2 (19872484)
19769480;Database Patch Set Update : 12.1.0.2.2 (19769480)
19769479;OCW Patch Set Update : 12.1.0.2.2 (19769479)
19769473;ACFS Patch Set Update : 12.1.0.2.2 (19769473)

[oracle@rac12sby1 ~]$ opatch lspatches -oh /u01/app/oracle/product/12.1.0.2/dbhome_1
19877336;Database PSU 12.1.0.2.2, Oracle JavaVM Component (Jan2015)
19769480;Database Patch Set Update : 12.1.0.2.2 (19769480)
19769479;OCW Patch Set Update : 12.1.0.2.2 (19769479)

This is the January System Patch by the way, 20132450. At first I wanted to blog about the patch application but it was so uneventful I decided against it.

I have two clusters, rac12pri and rac12sby. Both consist of 2 nodes each. Database CDB is located on rac12pri, STDBY on rac12sby. Normally I am against naming databases by function as you might end up using STDBY in primary role after a data centre migration, and people tend to find the use of STDBY as primary database odd. In this case I hope it helps understanding the concept better, and it’s a lab environment anyway …

Creating the broker configuration

There is nothing special about creating the Broker Configuration, just remember to define the broker configuration files on shared storage and to enable automatic standby file management. I also recommend standby redo logs on both the primary and standby databases. Once you have the configuration in place, check the database(s) in verbose mode to get the broker connection string. You can copy and paste the connection string into sqlplus to ensure that every instance can be started thanks to a static listener registration (Data Guard will restart databases during switchover and failover operations, which is bound to fail unless the databases are statically registered with the listener). Here is what I mean:

DGMGRL> show instance verbose 'CDB1';

Instance 'CDB1' of database 'CDB'

  Host Name: rac12pri1
  PFILE:     
  Properties:
    StaticConnectIdentifier         = '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.108)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=CDB_DGMGRL)(INSTANCE_NAME=CDB1)(SERVER=DEDICATED)))'
    StandbyArchiveLocation          = 'USE_DB_RECOVERY_FILE_DEST'
    AlternateLocation               = ''
    LogArchiveTrace                 = '0'
    LogArchiveFormat                = '%t_%s_%r.dbf'
    TopWaitEvents                   = '(monitor)'

Instance Status:
SUCCESS

Now I can take the static connection identifier and use it (be sure to specify the “as sysdba” at the end):

[oracle@rac12pri1 ~]$ sqlplus 

SQL*Plus: Release 12.1.0.2.0 Production on Sat Apr 18 10:50:09 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Enter user-name: sys@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.108)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=CDB_DGMGRL)(INSTANCE_NAME=CDB1)(SERVER=DEDICATED))) as sysdba
Enter password: 

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>

When this has worked for all instances in the configuration, you are a big step further along in your Data Guard setup!
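
For reference, creating such a configuration boils down to only a handful of DGMGRL commands. The following is merely a sketch using the names from this post and assuming CDB starts out in the primary role; your connect identifiers and properties will differ:

DGMGRL> create configuration 'martin' as primary database is 'CDB' connect identifier is CDB;
DGMGRL> add database 'STDBY' as connect identifier is STDBY maintained as physical;
DGMGRL> edit database 'STDBY' set property StandbyFileManagement = 'AUTO';
DGMGRL> enable configuration;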

Service setup

I have created a new service, named rootsrv on both databases.

[oracle@rac12sby1 ~]$ srvctl add service -d STDBY -s rootsrv -role primary -policy automatic -preferred STDBY1,STDBY2

[oracle@rac12pri1 ~]$ srvctl add service -d CDB -s rootsrv -role primary -policy automatic -preferred CDB1,CDB2

To better follow along here is the current status of the configuration (In real life you might want to use a different protection mode):

DGMGRL> show configuration

Configuration - martin

  Protection Mode: MaxPerformance
  Members:
  STDBY - Primary database
    CDB   - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 28 seconds ago)

As per its definition, the service is started on the database currently in the primary role (STDBY) but not on the physical standby (CDB). In fact, I can’t even start it on the standby:

[oracle@rac12sby1 ~]$ srvctl status service -d STDBY -s rootsrv
Service rootsrv is running on instance(s) STDBY1,STDBY2

[oracle@rac12pri1 ~]$ srvctl status service -d CDB -s rootsrv
Service rootsrv is not running.
[oracle@rac12pri1 ~]$ srvctl start service -d CDB -s rootsrv
PRCD-1084 : Failed to start service rootsrv
PRCR-1079 : Failed to start resource ora.cdb.rootsrv.svc
CRS-2800: Cannot start resource 'ora.cdb.db' as it is already in the INTERMEDIATE state on server 'rac12pri1'
CRS-2632: There are no more servers to try to place resource 'ora.cdb.rootsrv.svc' on that would satisfy its placement policy
CRS-2800: Cannot start resource 'ora.cdb.db' as it is already in the INTERMEDIATE state on server 'rac12pri2'
[oracle@rac12pri1 ~]$

So no more need for database triggers to start and stop services (this was another question I had during my talk) depending on the database’s role.
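
For comparison, the kind of role-based startup trigger that used to be necessary looked roughly like the sketch below (the trigger name is made up, adapt the service name to your environment):

create or replace trigger trg_start_rootsrv
after startup on database
declare
  v_role v$database.database_role%type;
begin
  select database_role into v_role from v$database;
  -- only start the application service if the database runs in the primary role
  if v_role = 'PRIMARY' then
    dbms_service.start_service('rootsrv');
  end if;
end;
/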

The MAA connection string

The MAA connection string as taken from the white paper and slightly adapted is this one:

MAA_TEST1 =
  (DESCRIPTION_LIST=
     (LOAD_BALANCE=off)(FAILOVER=on)
       (DESCRIPTION=
         (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST=rac12pri-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
       (DESCRIPTION= (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST= rac12sby-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
  )

Please refer to the white paper for an explanation of the various parameters.

Notice that both the primary and standby SCAN are referenced in there, both connecting to “rootsrv” which is currently active on STDBY. To get some proper timing about the connection delay I use the following little snippet:


$ time sqlplus system/secret@maa_test1 <<EOF
> select sys_context('userenv','instance_name') from dual;
> exit
> EOF

Testing on the “primary cluster” first, the local database is operating in the standby role:

[oracle@rac12pri1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.432s
user    0m0.026s
sys     0m0.028s

[oracle@rac12pri1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.415s
user    0m0.023s
sys     0m0.025s

That was quick, not even a second. And as you can see the connection is set up against database STDBY on the other cluster.

Next on the standby cluster:

[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.613s
user    0m0.025s
sys     0m0.019s

[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m0.199s
user    0m0.024s
sys     0m0.021s

So this was quick too. And finally on a different machine altogether (my lab server)

[oracle@lab tns]$ time sqlplus system/secret@maa_test1 <<EOF
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.387s
user	0m0.020s
sys	0m0.018s

Does it change after a role reversal? I think I’ll cover that in a different post …

Appendix: setting up Data Guard for RAC

This is something I always forget so it’s here for reference. I use an identical tnsnames.ora file across all nodes, here it is:

CDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = CDB)
    )
  ) 

STDBY =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sby-scan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = STDBY)
    )
  ) 

# used for DG and RMAN duplicate
STDBY_NON_SCAN =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sby1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sby2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = STDBY)
    )
  )

CDB_NON_SCAN =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri2-vip)(PORT = 1521))
    )
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = CDB)
    )
  )

The non-scan TNS entries are a quick workaround to the problem that I registered the SIDs statically with the node listeners only, not the SCAN listeners. If there is a more elegant way of statically registering a database, please let me know! I couldn’t find any documentation on how to do this. I guess adding a SID_LIST_LISTENER_SCANx would do, too. Anyway, here is an example for the local node listener configuration (not the complete file contents). The ORACLE_SID and host names must be adapted for each node in the cluster.

LISTENER =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER))
    )
  )

# for broker
SID_LIST_LISTENER=
  (SID_LIST=
    (SID_DESC=
      (GLOBAL_DBNAME=STDBY)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1)
      (SID_NAME=STDBY1))
    (SID_DESC=
      (GLOBAL_DBNAME=STDBY_DGMGRL)
      (ORACLE_HOME=/u01/app/oracle/product/12.1.0.2/dbhome_1)
      (SID_NAME=STDBY1))
  )

After changing the listener configuration you need to reload the local node listener.
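
A reload is sufficient, there is no need to stop and start it; for example, as the Grid Infrastructure software owner on each node:

$ /u01/app/12.1.0.2/grid/bin/lsnrctl reload LISTENER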

Real Push Button Refresh with Raspberry Pi

Push Button

In this post, we’re going to perform a push button refresh of an Oracle Database, Application Express (APEX) installation, and Tomcat webserver.

“But Oracle Alchemist,” you’re probably thinking, “we know about that. You’ve told us about how Delphix can provision and refresh data.” And yes, you’d be right. But I wasn’t done yet.

We’re going to perform a refresh of an Oracle Database, APEX installation, and Tomcat running in Amazon Web Services, replicated from a local Delphix Engine, by pressing a physical button wired to a Raspberry Pi running a python app that communicates with the Delphix REST API in the cloud over wifi.

We’ll go over technical details and the Python code right after the video. Make sure you check it out so you can watch me excitedly press the button. I even did a refresh from Starbucks.

I Like Pi

I’ve wanted a Raspberry Pi for a long time now, and for whatever reason never got around to buying one. I finally did last week, and let me tell you this thing is such a beautiful little device, I nearly fried the logic board by weeping on it. The components, how tiny! The GPIO pins, how enticing! The Raspberry Pi really is the ultimate geek toy. I ended up going with the Raspberry Pi Model B. It has a 40 pin GPIO header (an I/O interface we’ll use for this article), four USB 2.0 ports, a push-button micro SD slot for the hard drive, and the standard HDMI port, ethernet port, and power via micro USB. I can power it using my iPhone charging block or even with a USB battery pack. The one I bought also came with NOOBS on an 8GB SD card, which was preloaded with Raspbian, ArchLinux, OpenELEC, Pidora, RaspBMC, and RiscOS. I opted for Raspbian.

As for extras, I also got a super sexy little case called a Zebra Case. It’s designed and built right here in the good ol’ US of A, is incredibly easy to assemble, has optional rubber feet, can be hung up on a wall, and has easy access to the GPIO pins. I also picked up an Edimax EW-7811Un 150Mbps 11n Wi-Fi USB Adapter because I don’t like being tied down.

Let’s Communicate

I have Delphix 4.2 (the latest version) set up on my laptop with a bunch of data sources: Oracle, an APEX app in Tomcat, Sybase, Postgres, MS SQL Server, and a Delphix Agile Masking repository. I also have Delphix 4.2 installed in Amazon Web Services and am replicating the Oracle Database and application stack to it via Delphix replication. In Amazon, the database, Tomcat, and APEX library replicas have all been provisioned to a target Linux system as Virtual Databases (VDBs) and Virtual Files (vFiles). Delphix has a powerful GUI that can handle refresh, rewind, etc. operations but instead we’re going to do it programmatically through the Delphix REST API in Python. Why? Because alchemy, that’s why. Raspbian has Python already loaded with the GPIO library built in. So let’s take a look at the code.

Here’s the delphix.py code that connects to my Amazon Delphix Engine and performs the operations.

import urllib2
import simplejson as json
from cookielib import CookieJar

# VDBControl accepts op = [ refresh | undo ]

def VDBControl(op):

  # Delphix Details
  url = "http://ec2-52-1-228-37.compute-1.amazonaws.com"
  username = "delphix_admin"
  password = "delphix"

  # urllib2 setup
  headers = {"Content-Type" : "application/json"}
  cj = CookieJar()
  opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

  # Get Delphix Session
  data = { "type" : "APISession", "version" : { "type" : "APIVersion", "major" : 1, "minor" : 1, "micro" : 0 } }
  opener.open(urllib2.Request(url + "/resources/json/delphix/session", data=json.dumps(data), headers=headers))

  # Login to Delphix
  data = { "type" : "LoginRequest", "username" : "delphix_admin", "password" : "delphix" }
  opener.open(urllib2.Request(url + "/resources/json/delphix/login", data=json.dumps(data), headers=headers))

  # Get Delphix Objects
  request = opener.open(urllib2.Request(url + "/resources/json/delphix/source", headers=headers))
  content = request.read()
  sourcelist = json.loads(content)

  # Loop through Delphix Objects

  for source in sourcelist["result"]:
    if source["virtual"] == True:
      # get container details including name and type
      virtualContainer = source["container"]
      virtualName = source["name"]
      request = opener.open(urllib2.Request(url + "/resources/json/delphix/database/" + virtualContainer, headers=headers))
      content = request.read()
      parentContainer = json.loads(content)["result"]["parentContainer"]
      objType = json.loads(content)["result"]["type"]
      refreshType = "OracleRefreshParameters" if objType == "OracleDatabaseContainer" else "RefreshParameters"

      if op == "refresh":
        # refresh each virtual object
        print "Refreshing " + virtualName + "..."
        data = { "type" : refreshType, "timeflowPointParameters" : { "type" : "TimeflowPointSemantic", "container" : parentContainer } }
        request = opener.open(urllib2.Request(url + "/resources/json/delphix/database/" + virtualContainer + "/refresh", data=json.dumps(data), headers=headers))
        content = request.read()
        print "Output: " + content
        print "Refresh Job Submitted."
      if op == "undo":
        print "Undoing Refresh of " + virtualName + " (if applicable)..."
        request = opener.open(urllib2.Request(url + "/resources/json/delphix/database/" + virtualContainer + "/undo", headers=headers))
        content = request.read()
        print "Output: " + content
        print "Undo Job Submitted."

In a nutshell, what this code does is:

  • Define a VDBControl function that accepts an operation, either “refresh” or “undo”
  • Declares variables for the Delphix Engine and the Python urllib2 classes
  • Get a Delphix session via the API and put a cookie in the cookie jar
  • Log into Delphix
  • Get a list of Delphix objects with a REST call to /resources/json/delphix/source on the Delphix Engine
  • Loop through the list and find the objects that are either Virtual Databases or Virtual Files
  • Grab some auxiliary data we need to do the refresh, like the source object we’re refreshing from.
  • If a refresh is being called, do the refresh by passing the proper data to Delphix using the /resources/json/delphix/database/VDBNAME/refresh REST method.
  • If an undo is being called, do the undo by passing the proper data using the undo method.

Now that we’ve seen the code that interfaces with Delphix, let’s take a look at the code that interfaces with the GPIO.

import RPi.GPIO as GPIO
import time
from delphix import VDBControl

GPIO.setmode(GPIO.BCM)
GPIO.setup(18, GPIO.IN, pull_up_down=GPIO.PUD_UP)
GPIO.setup(24, GPIO.IN, pull_up_down=GPIO.PUD_UP)

while True:
    input_state18 = GPIO.input(18)
    if input_state18 == False:
        VDBControl("refresh")
        time.sleep(0.2)

    input_state24 = GPIO.input(24)
    if input_state24 == False:
        VDBControl("undo")
        time.sleep(0.2)

This code must be run as root on the Raspberry Pi to interface with the GPIO. I actually have it appended in my rc.local so it starts up automatically when the Raspberry Pi comes online. The code is really quite simple as long as everything is plugged in properly. On my breadboard I have two buttons, each slotted in across the middle gap. The gap is there to divide up the breadboard so circuits can be built from both sides while only certain designated components “bridge” the gap. Two jumper cables connect each button to the Raspberry Pi GPIO. My refresh button has a positive connection to GPIO pin 18, and a ground connection. The undo button has a positive connection to GPIO pin 24, and a ground connection. Then the python code simply:

  • Sets up pins 18 and 24 for monitoring
  • Runs an endless loop
  • If GPIO pin 18 is clicked, the state changes to False, triggering a call to my delphix.VDBControl function with the “refresh” operation. It then sleeps for a brief moment to make sure the button press doesn’t rapid fire a bunch of presses.
  • If GPIO pin 24 is clicked, the same thing happens except an “undo” operation is passed.

And that’s it! A real, honest to goodness push-button refresh, courtesy of the cutest little computer I’ve ever seen and a hobby board, wires, and buttons that made me feel nervous carrying it around in public. It wouldn’t take much to make it presentable though. In fact, given the right top and bottom coverings, we could make quite the button. That was easy.®

So, what could you do with a Raspberry Pi and a couple buttons? Let me know in the comments!

Thank you to Simon Monk and the Raspberry Pi Cookbook for the button guide!

The post Real Push Button Refresh with Raspberry Pi appeared first on Oracle Alchemist.

Tips on Submitting an Abstract to Conference

<.. The tech15 committee and my role
<…. Who plans the Tech15 content

The call for Papers for UKOUG Tech15 has gone out. This is how most of the content for a large conference is sourced: by asking the community to submit abstracts for consideration. With smaller conferences and user group meetings the organisers can often get enough talks by asking (or hassling) people they know.


Firstly, I would encourage anyone who has considered talking at conference but never has, to submit an abstract. We could easily fill the whole event with known speakers but we don’t. We have a policy of having some New Blood at every conference. {If you are in the UK and want to try out presenting then a great way to do so is by presenting at a smaller user group meeting, like for example the next RAC/Database UKOUG SIG meeting on July 1st :-) – It’s a friendly, relaxed way to get into presenting. Get in touch with me if it appeals to you}.

You can click on this link to go to the submission page, but before you do…

When you submit an abstract for a conference, you are not actually at that point communicating with your potential audience. You are communicating with the handful of people who are tasked with going through all the submissions and selecting the papers. With the UKOUG conference you are also communicating with the volunteers who will judge abstracts. And we, the agenda planning committee, take those judging scores very seriously. It is a large part of how we attempt to ensure we select the talks on topics that people want to hear about, as well as the people who you want to hear talk.

So when you get to the field where you describe your proposed presentation (the abstract) I would suggest you don’t want to be “teasing the audience” at this point. The people who are judging and selecting the papers are seasoned conference attenders. A catchy title might get you noticed but if the abstract does not clearly state what your talk is about, what you intend to cover and who you expect your audience to be, it makes it less likely that your abstract will get selected.
Also, if you have looked at the call-for-papers page and associated notes and have seen that we are particularly interested in some area (eg “what you need to know about ….” for the database stream) and your paper is addressing it, it is worth making that a little bit obvious. The agenda planning day is hectic, we get tired and tetchy and our brains start to leak out of our ears. If your abstract is clear about what you are talking about, you are increasing your chances of selection.

In years gone by we gave people the option to provide two versions of their abstract – one for judging and one for promoting the talk (the latter being the one that gets put in the conference notes and that your potential audience will read to decide if your talk is worth their attention). However, many people felt this was asking for the same information twice, so we have reverted back to a single abstract this year. You can, however, change your abstract text after your talk has been accepted {but note, we are wise to people trying to change the actual content of the talk later on – we LOOK at the changes!}. So sell your talk to the committee and judges now and worry about the catchy reference to your favorite movie afterwards.

I used to make my submission abstract humorous (well, in my eyes) but I don’t anymore, or at least I tone it down. If anything, I make the abstract factual and simple. As an example:


How Oracle Works in 50 Minutes
—————————————–
This is a high level but technical talk about the key processes that underlie the Oracle database. I describe what an instance is and how the server process is the gateway to the instance. I cover how the redo mechanism works and why it is critical, how data is moved into and out of memory, delayed block cleanout, what a commit IS, the importance of the undo tablespace and how a read consistent image is maintained for the user. The intended audience is new DBAs or developers who have never been taught how the database works and at the end of the talk they will understand how these key processes work.

OK, the description is a bit boring but you know exactly what my talk is going to be about and if it will fit in your conference.

So what happens when you click on the above link to submit an abstract? You will see the below front screen:

Submission Screen

I would suggest you not only read this screen but also check out the menu on the left of the screen. Look at the “Hints & Tips” and also the stream you are intending to submit to (eg “Systems” if you want to present on Exadata). If you are unsure which area your talk fits in, check them all out.

The big red “Submit an Abstract” button will actually take you to the same place that the left menu item “Speaker Application” takes you to. The first step of submitting an abstract is actually saying who you are and registering on the system. If you are willing to judge abstracts (ie you ticked that box) you will then get to indicate which topics in which streams you are willing to judge. THEN you will be put into the “Speaker Lounge” and you can enter your abstract by clicking the “Submit” button.

When you come back to the system, you can go straight to the Speaker Lounge, where the system will show you your details again so you can correct anything. You will see what abstract(s) you have submitted and can click on them to check or change anything, or click on “Submit” to add another abstract.

Think carefully before you submit 15 abstracts. As a general rule, more than 3 and you start to reduce your chances of having a paper selected. People judging your papers will score you down if you submit too many; it’s as if you dilute your judging scores over all the abstracts.

Enjoy.

Little things worth knowing: what does opatchauto actually do?

I recently applied system patch 20132450 to my 12.1.0.2.0 installation on a 2 node RAC system on Oracle Linux 7.1. While ensuring that OPatch is the latest version available I came across an interesting command line option in opatchauto. It is called “-generateSteps”.

[oracle@rac12sby1 ~]$ $ORACLE_HOME/OPatch/opatchauto apply -help
OPatch Automation Tool
Copyright (c) 2015, Oracle Corporation.  All rights reserved.


DESCRIPTION
    Apply a System Patch to Oracle Home. User specified the patch
    location or the current directory will be taken as the patch location.
    opatchauto must run from the GI Home as root user.

SYNTAX
/OPatch/opatchauto apply
                            [-analyze]
                            [-database  ]
                            [-generateSteps]
                            [-invPtrLoc  ]
                            [-jre  ] 
                            [-norestart ]
                            [-nonrolling ]
                            [-ocmrf  ]
                            [-oh  ]
                            [  ]

OPTIONS


       -analyze
              This option runs all the required prerequisite checks to confirm
              the patchability of the system without actually patching or 
              affecting the system in any way.

       -database
              Used to specify the RDBMS home(s) to be patched. Option value 
              is oracle database name separated by comma.

       -generateSteps
              Generate the manual steps of apply session. These steps are what
              'opatchauto apply' does actually.

       -invPtrLoc
              Used to locate the oraInst.loc file when the installation used 
              the -invPtrLoc. This should be the path to the oraInst.loc file.

       -jre
              This option uses JRE (java) from the
              specified location instead of the default location
              under Oracle Home.

       -norestart
              This option tells opatchauto not to restart the Grid Infrastructure
              stack and database home resources after patching.

       -nonrolling
              This option makes the patching session run in 'nonrolling' mode.
              It is required that the stack on the local node is running while
              it must be stopped on all the remaining nodes before the patching
              session starts.

       -ocmrf 
              This option specifies the absolute path to the OCM response file. 
              It is required if the target Oracle Home doesn't have OCM 
              installed and configured.

       -oh
              This option specifies Oracle Home(s) to be patched. This can be a
              RAC home, a Grid Home or comma separated list of multiple homes.


PARAMETERS
      Patch Location
                    Path to the location for the patch. If the patch 
                    location is not specified, then the current
                    directory is taken as the patch location.


Example:
      To patch GI home and all RAC homes: 
      '/OPatch/opatchauto apply'

      To patch multiple homes:
      '/OPatch/opatchauto apply -oh ,,'

      To patch databases running from RAC homes only:
      '/OPatch/opatchauto apply -database db1,db2...dbn'

      To patch software-only installation:
      '/OPatch/opatchauto apply -oh ' OR
      '/OPatch/opatchauto apply -oh '


opatchauto succeeded.
[oracle@rac12sby1 ~]$ 

This looked quite interesting, so I wanted to see what it does with the system patch. I don’t have a database installed in my cluster yet; it’s going to be the standby site for another cluster (rac12pri). Unfortunately opatchauto still skips RDBMS homes without a database in them in my tests, so I end up using the “-oh” flag in this case. Here’s the result, formatted for better readability. In reality each of the generated commands is on a single line:

[root@rac12sby1 ~]# opatchauto apply /u01/patches/20132450 \
-oh /u01/app/12.1.0.2/grid -generateSteps -ocmrf /u01/patches/ocm.rsp
OPatch Automation Tool
Copyright (c) 2015, Oracle Corporation.  All rights reserved.

OPatchauto version : 12.1.0.1.5
OUI version        : 12.1.0.2.0
Running from       : /u01/app/12.1.0.2/grid

Invoking opatchauto utility "generateapplysteps"

To apply the patch, Please do the following manual actions:

Step 1 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 1.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch version -oh /u01/app/12.1.0.2/grid -invPtrLoc \
/u01/app/12.1.0.2/grid/oraInst.loc -v2c 12.1.0.1.5

Step 2 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 2.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory -invPtrLoc \
/u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid

Step 3 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 3.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19769473 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 3.2 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19769479 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 3.3 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19769480 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 3.4 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckComponents \
-ph /u01/patches/20132450/19872484 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

Step 4 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 4.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19769473 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 4.2 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19769479 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 4.3 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19769480 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

  Action 4.4 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch prereq CheckConflictAgainstOH \
-ph /u01/patches/20132450/19872484 -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc \
-oh /u01/app/12.1.0.2/grid

Step 5 As the "root" user on the host "rac12sby1", please run the following commands:
  Action 5.1 [root@rac12sby1]#
  /u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib \
-I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl -prepatch

Step 6 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 6.1 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19769473 > /tmp/OraGI12Home1_patchList

  Action 6.2 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19769479 >> /tmp/OraGI12Home1_patchList

  Action 6.3 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19769480 >> /tmp/OraGI12Home1_patchList

  Action 6.4 [oracle@rac12sby1]$
  echo /u01/patches/20132450/19872484 >> /tmp/OraGI12Home1_patchList

  Action 6.5 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch napply -phBaseFile /tmp/OraGI12Home1_patchList \
-local  -invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid -silent

Step 7 As the "root" user on the host "rac12sby1", please run the following commands:
  Action 7.1 [root@rac12sby1]#
  /u01/app/12.1.0.2/grid/rdbms/install/rootadd_rdbms.sh

Step 8 As the "root" user on the host "rac12sby1", please run the following commands:
  Action 8.1 [root@rac12sby1]#
  /u01/app/12.1.0.2/grid/perl/bin/perl -I/u01/app/12.1.0.2/grid/perl/lib \
-I/u01/app/12.1.0.2/grid/crs/install /u01/app/12.1.0.2/grid/crs/install/rootcrs.pl -postpatch


Step 9 As the "oracle" user on the host "rac12sby1", please run the following commands:
  Action 9.1 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19769473

  Action 9.2 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19769479

  Action 9.3 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19769480

  Action 9.4 [oracle@rac12sby1]$
  /u01/app/12.1.0.2/grid/OPatch/opatch lsinventory \
-invPtrLoc /u01/app/12.1.0.2/grid/oraInst.loc -oh /u01/app/12.1.0.2/grid | grep 19872484

You could of course use oplan, which is bundled in $ORACLE_HOME/OPatch/oplan/oplan, to generate a much more detailed profile of the patch application. For a first glance at the activity, opatchauto -generateSteps seems quite useful.

The steps all rang a bell from my 11.2.0.1 days when this was the procedure to patch Grid Infrastructure (thank god that’s over). What I didn’t recognise was rootadd_rdbms.sh. Looking at the file I can see it changes ownership and permissions for oradism (important!) and some other executables in $ORACLE_HOME, and fixes potential problems with “fs.file-max*” in /etc/sysctl.conf.
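
If you are curious you can check the outcome yourself after patching; an illustrative quick look (adjust the paths to your environment):

# oradism should end up owned by root with the setuid bit set
ls -l /u01/app/12.1.0.2/grid/bin/oradism

# the kernel parameter rootadd_rdbms.sh may have adjusted
grep fs.file-max /etc/sysctl.conf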

Cartesian join

Some time ago I pulled off the apocryphal “from 2 hours to 10 seconds” trick for a client using a technique that is conceptually very simple but, like my example from last week, falls outside the pattern of generic SQL. The problem (with some camouflage) is as follows: we have a data set with 8 “type” attributes which are all mandatory columns. We have a “types” table with the same 8 columns together with two more columns that are used to translate a combination of attributes into a specific category and “level of relevance”. The “type” columns in the types table are, however, allowed to be null although each row must have at least one column that is not null – i.e. there is no row where every “type” column is null.

The task is to match each row in the big data set with all “sufficiently similar” rows in the types table and then pick the most appropriate of the matches – i.e. the match with the largest “level of relevance”. The data table had 500,000 rows in it, the types table has 900 rows. Here’s a very small data set representing the problem client data (cut down from 8 type columns to just 4 type columns):


create table big_table(
	id		number(10,0)	primary key,
	v1		varchar2(30),
	att1		number(6,0),
	att2		number(6,0),
	att3		number(6,0),
	att4		number(6,0),
	padding		varchar2(4000)
);

create table types(
	att1		number(6,0),
	att2		number(6,0),
	att3		number(6,0),
	att4		number(6,0),
	category	varchar2(12)	not null,
	relevance	number(4,0)	not null
);

insert into big_table values(1, 'asdfllkj', 1, 1, 2, 1, rpad('x',4000));
insert into big_table values(2, 'rirweute', 1, 3, 1, 4, rpad('x',4000));

insert into types values(   1, null, null, null, 'XX',  10);
insert into types values(   1, null, null,    1, 'YY',  20);
insert into types values(   1, null,    1, null, 'ZZ',  20);

commit;

A row from the types table is similar to a source row if it matches on all the non-null columns. So if we look at the first row in big_table, it matches the first row in types because att1 = 1 and all the other attN columns are null; it matches the second row because att1 = 1 and att4 = 1 and the other attN columns are null, but it doesn’t match the third row because types.att3 = 1 and big_table.att3 = 2.

Similarly, if we look at the second row in big_table, it matches the first row in types, doesn’t match the second row because types.att4 = 1 and big_table.att4 = 4, but does match the third row. Here’s how we can express the matching requirement in SQL:


select
	bt.id, bt.v1,
	ty.category,
	ty.relevance
from
	big_table	bt,
	types		ty
where
	nvl(ty.att1(+), bt.att1) = bt.att1
and	nvl(ty.att2(+), bt.att2) = bt.att2
and	nvl(ty.att3(+), bt.att3) = bt.att3
and	nvl(ty.att4(+), bt.att4) = bt.att4
;

You’ll realise, of course, that essentially we have to do a Cartesian merge join between the two tables. Since there’s no guaranteed matching column that we could use to join the two tables we have to look at every row in types for every row in big_table … and we have 500,000 rows in big_table and 900 in types, leading to an intermediate workload of 450,000,000 rows (with, in the client case, 8 checks for each of those rows). Runtime for the client was about 2 hours, at 100% CPU.

When you have to do a Cartesian merge join there doesn’t seem to be much scope for reducing the workload, however I didn’t actually know what the data really looked like so I ran a couple of queries to analyse it. The first was a simple “select count (distinct)” query to see how many different combinations of the 8 attributes existed in the client’s data set. It turned out to be slightly less than 400.

Problem solved – get a list of the distinct combinations, join that to the types table to translate to categories, then join the intermediate result set back to the original table. This, of course, is just applying two principles that I’ve discussed before: (a) be selective about using a table twice to reduce the workload, (b) aggregate early if you can reduce the scale of the problem.

Here’s my solution:


with main_data as (
	select
		/*+ materialize */
		id, v1, att1, att2, att3, att4
	from
		big_table
),
distinct_data as (
	select
		/*+ materialize */
		distinct att1, att2, att3, att4
	from	main_data
)
select
	md.id, md.v1, ty.category, ty.relevance
from
	distinct_data	dd,
	types		ty,
	main_data	md
where
	nvl(ty.att1(+), dd.att1) = dd.att1
and	nvl(ty.att2(+), dd.att2) = dd.att2
and	nvl(ty.att3(+), dd.att3) = dd.att3
and	nvl(ty.att4(+), dd.att4) = dd.att4
and	md.att1 = dd.att1
and	md.att2 = dd.att2
and	md.att3 = dd.att3
and	md.att4 = dd.att4
;

And here’s the execution plan.


---------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                            |    12 |  2484 |    11  (10)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION |                            |       |       |            |          |
|   2 |   LOAD AS SELECT           | SYS_TEMP_0FD9D6619_8FE93F1 |       |       |            |          |
|   3 |    TABLE ACCESS FULL       | BIG_TABLE                  |     2 |   164 |     2   (0)| 00:00:01 |
|   4 |   LOAD AS SELECT           | SYS_TEMP_0FD9D661A_8FE93F1 |       |       |            |          |
|   5 |    HASH UNIQUE             |                            |     2 |   104 |     3  (34)| 00:00:01 |
|   6 |     VIEW                   |                            |     2 |   104 |     2   (0)| 00:00:01 |
|   7 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D6619_8FE93F1 |     2 |   164 |     2   (0)| 00:00:01 |
|*  8 |   HASH JOIN                |                            |    12 |  2484 |     6   (0)| 00:00:01 |
|   9 |    NESTED LOOPS OUTER      |                            |     6 |   750 |     4   (0)| 00:00:01 |
|  10 |     VIEW                   |                            |     2 |   104 |     2   (0)| 00:00:01 |
|  11 |      TABLE ACCESS FULL     | SYS_TEMP_0FD9D661A_8FE93F1 |     2 |   104 |     2   (0)| 00:00:01 |
|* 12 |     TABLE ACCESS FULL      | TYPES                      |     3 |   219 |     1   (0)| 00:00:01 |
|  13 |    VIEW                    |                            |     2 |   164 |     2   (0)| 00:00:01 |
|  14 |     TABLE ACCESS FULL      | SYS_TEMP_0FD9D6619_8FE93F1 |     2 |   164 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   8 - access("MD"."ATT1"="DD"."ATT1" AND "MD"."ATT2"="DD"."ATT2" AND
              "MD"."ATT3"="DD"."ATT3" AND "MD"."ATT4"="DD"."ATT4")
  12 - filter("DD"."ATT1"=NVL("TY"."ATT1"(+),"DD"."ATT1") AND
              "DD"."ATT2"=NVL("TY"."ATT2"(+),"DD"."ATT2") AND
              "DD"."ATT3"=NVL("TY"."ATT3"(+),"DD"."ATT3") AND
              "DD"."ATT4"=NVL("TY"."ATT4"(+),"DD"."ATT4"))

Critically I’ve taken a Cartesian join that had a source of 500,000 and a target of 900 possible matches, and reduced it to a join between the 400 distinct combinations and the 900 possible matches. Clearly we can expect this to take something like one twelve-hundredth (400/500,000) of the work of the original join – bringing 7,200 seconds down to roughly 6 seconds. Once this step is complete we have an intermediate result set which is the 4 non-null type columns combined with the matching category and relevance columns – and can use this in a simple and efficient hash join with the original data set.

Logic dictated that the old and new results would be the same – but we did run the two hour query to check that the results matched.

Footnote: I was a little surprised that the optimizer produced a nested loops outer join rather than a Cartesian merge join in the plan above – but that’s probably an artefact of the very small data sizes in my test. There’s presumably little point in transferring the data into the PGA when the volume is so small.

Footnote 2: I haven’t included the extra steps in the SQL to reduce the intermediate result to just “the most relevant” match – but that’s just an inline view with an analytic function, as sketched below. (The original code actually selected the data with an order by clause and used a client-side filter to eliminate the excess!)
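
For completeness, here is a sketch of the kind of inline view I mean, reusing the factored subqueries (main_data, distinct_data) from the query above and keeping just the most relevant match per row:

select	id, v1, category, relevance
from	(
	select
		md.id, md.v1, ty.category, ty.relevance,
		row_number() over (
			partition by md.id
			order by ty.relevance desc
		)	rn
	from
		distinct_data	dd,
		types		ty,
		main_data	md
	where
		nvl(ty.att1(+), dd.att1) = dd.att1
	and	nvl(ty.att2(+), dd.att2) = dd.att2
	and	nvl(ty.att3(+), dd.att3) = dd.att3
	and	nvl(ty.att4(+), dd.att4) = dd.att4
	and	md.att1 = dd.att1
	and	md.att2 = dd.att2
	and	md.att3 = dd.att3
	and	md.att4 = dd.att4
	)
where	rn = 1
;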

Footnote 3: The application was a multi-company application – and one of the other companies had not yet gone live on the system because they had a data set of 5 million rows to process and this query had never managed to run to completion in the available time window.  I’ll have to get back to the client some day and see if the larger data set also collapsed to a very small number of distinct combinations and how long the rewrite took with that data set.

 

Demos do fail.

I am an ardent believer in the “show me how it works” principle and usually I have demos in my presentations. So, I was presenting “Tools for advanced debugging in Solaris and Linux”, with demos, at IOUG Collaborate 2015 in Las Vegas on April 13, and my souped-up laptop (with 32G of memory, SSD drives, a high-end video processor, etc.) was not responding when I tried to access the folder to open my presentation files.

Sometimes, demos do fail. At least, I managed to complete the demos with zero slides :-) Apologies to the audience for my R-rated rants about laptop issues.

You can download presentations files from the links below.

Session_145_advanced_debugging_using_UNIX_tools

Session_189_Riyaj_Inmemory_internals_files

Good Singapore Maths Students Would Likely Make Good Oracle DBAs (Problems)

An interesting mathematics based question from a Singapore high school exam has been doing the internet rounds in the past few days. Considering it’s aimed at high school students, it’s a tricky one and obviously designed to filter out the better students, in a country with a very good reputation for churning out mathematically gifted […]

Little things worth knowing: direct path inserts and referential integrity

This is another post to remind myself that Oracle evolves, and what I thought I knew might no longer be relevant. So double-checking instead of assuming should become a habit!

Today’s example: direct path inserts. I seemed to remember from Oracle 9i that a direct path insert ignores referential integrity. This is still confirmed in the 9i Release 2 Concepts Guide, chapter 19 “Direct Path Insert”. Quoting from there:

During direct-path INSERT operations, Oracle appends the inserted data after existing data in the table. Data is written directly into datafiles, bypassing the buffer cache. Free space in the existing data is not reused, and referential integrity constraints are ignored

That sounds a bit harsh in today’s times so it’s worth a test. On Oracle 12.1.0.2 I created a parent/child relationship, admittedly rather crude:

SQL> create table parent (id, vc) as
  2   select distinct data_object_id, subobject_name from dba_objects
  3   where data_object_id is not null;

Table created.

SQL> alter table parent add constraint pk_parent primary key (id);

Table altered.

SQL> create table child (id number, p_id number not null, vc varchar2(100),
  2  constraint pk_child primary key (id),
  3  constraint fk_parent_child foreign key (p_id) references parent (id));

Table created.

Now when I try to insert data using a direct path insert (via a sequence s_child created for the example) it fails:

SQL> insert /*+ append */ into child select s_child.nextval, -1, 'this will not work' from parent;
insert /*+ append */ into child select s_child.nextval, -1, 'this will not work' from parent
*
ERROR at line 1:
ORA-02291: integrity constraint (MARTIN.FK_PARENT_CHILD) violated - parent key not found

Which is brilliant and what I expected since it prevents me from writing a lot of garbage into my table. If you have developed with Oracle then you probably know about deferrable constraints. An existing constraint can’t be changed to a status of “initially deferrable”, which is why I have to drop it and then try the insert again:

SQL> alter table child drop constraint FK_PARENT_CHILD;

Table altered.

Elapsed: 00:00:00.04
SQL> alter table child add constraint FK_PARENT_CHILD foreign key (p_id) references parent (id)
  2  deferrable initially deferred;

Table altered.

Elapsed: 00:00:00.02
SQL> insert /*+ append */ into child select s_child.nextval, -1, 'this will not work' from parent;

7640 rows created.

Elapsed: 00:00:00.30
SQL> commit;
commit
*
ERROR at line 1:
ORA-02091: transaction rolled back
ORA-02291: integrity constraint (MARTIN.FK_PARENT_CHILD) violated - parent key not found

So this constraint fires as well! Good news for me. If it’s all ok then the insert will of course succeed.

Trouble is that Oracle “silently” ignores the direct path load! I didn’t initially put this into the post, but thanks to Oren for adding it in the comments section (make sure to have a look at them).
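
A simple way to check whether the APPEND hint was honoured is to query the table before committing: a genuine direct path insert makes the modified segment unreadable within the same transaction, so the query raises ORA-12838, while a conventional insert quietly returns the row count. A sketch:

insert /*+ append */ into child select s_child.nextval, -1, 'deferred FK test' from parent;

-- ORA-12838 ("cannot read/modify an object after modifying it in parallel") would
-- prove a true direct path insert; a normal result means conventional path was used
select count(*) from child;

rollback;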

Back to the “hot” constraint definition and inserting into the table yields the expected result.

SQL> alter table child drop constraint FK_PARENT_CHILD;

Table altered.

SQL> alter table child add constraint FK_PARENT_CHILD foreign key (p_id) references parent (id);

Table altered.

SQL> insert /*+ append */ into child select s_child.nextval, 2, 'this should work however' from parent;

7640 rows created.

SQL> commit;

Commit complete.

Summary

Again, the insert wasn’t really a “direct path insert” (see the comments). It really pays off to verify and update “old knowledge” from time to time! I still prefer valid data over garbage, and when it comes to ILM I can “move” my table to get the required compression result, as sketched below.
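
What I mean by “move” is something along these lines (a sketch only, using 12c syntax; pick the compression level your licence allows):

alter table child move row store compress advanced;

-- indexes become unusable after a move and need a rebuild
alter index pk_child rebuild;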

Have fun!