Search

OakieTags

Who's online

There are currently 0 users and 31 guests online.

Recent comments

Oakies Blog Aggregator

Oracle Management Cloud – Application Performance Monitoring

A while ago I created a first post about the Oracle Management Cloud ( #OMC

Birmingham City University (BCU) Talk #5

bcuI had a day off work today so I could go over to Birmingham City University (BCU) and do a talk to the students.

Today’s session was more about giving them an idea of where databases fit into the big picture as far as data and data processing is concerned. I obviously come at this as a relational database guy, but the presentation also included NoSQL and Hadoop. I hope nobody is naive enough anymore to think relational databases are the correct solution for every use case, so it’s important to be objective about this stuff, rather than to push your own agenda.

Over recent weeks I’ve spent time trawling through material to get a feel for the subject matter, and it’s quite depressing to see the territorial pissing that goes on between all these camps. Each trying to convince you their solution is suitable for use cases where it clearly is not. To be fair, most of this is not coming from the experts themselves, but from the hangers on who can’t cope with the fact their favourite tech might not be perfect. We’ve all been there! </p />
</p></div>

    	  	<div class=

DEFAULT_CACHE_SIZE mentioned in alert.log of an #Oracle database

Today, I got this message in my alert.log file:

Full DB Caching disabled: DEFAULT_CACHE_SIZE should be at least 709 MBs bigger than current size.

When I look at the datafile sizes and compare them with the buffer cache size, it shows:

 

SYS@cloudcdb > select name,bytes/1024/1024 as mb from v$sgainfo;

NAME                                                       MB
-------------------------------------------------- ----------
Fixed SGA Size                                     2,80265045
Redo Buffers                                       13,1953125
Buffer Cache Size                                        3296
In-Memory Area Size                                      2048
Shared Pool Size                                          736
Large Pool Size                                            32
Java Pool Size                                             16
Streams Pool Size                                           0
Shared IO Pool Size                                       208
Data Transfer Cache Size                                    0
Granule Size                                               16
Maximum SGA Size                                         6144
Startup overhead in Shared Pool                    181,258133
Free SGA Memory Available                                   0

14 rows selected.

SYS@cloudcdb > select sum(bytes)/1024/1024 as mb from v$datafile;

        MB
----------
      3675

It is true, the database doesn’t fit completely into the buffer cache, missing roughly that amount of space mentioned. There is no such parameter as DEFAULT_CACHE_SIZE, though.
What we have instead is DB_CACHE_SIZE. In order to fix that issue, I was using this initialization parameter file to create a new spfile from:

[oracle@uhesse-service2 dbs]$ cat initCLOUDCDB.ora
*.audit_file_dest='/u02/app/oracle/admin/CLOUDCDB/adump'
*.audit_trail='db'
*.compatible='12.1.0.2.0'
*.control_files='/u02/app/oracle/oradata/CLOUDCDB/control01.ctl','/u03/app/oracle/fra/CLOUDCDB/control02.ctl'
*.db_block_size=8192
*.db_domain=''
*.db_name='CLOUDCDB'
*.db_recovery_file_dest='/u03/app/oracle/fra'
*.db_recovery_file_dest_size=10737418240
*.diagnostic_dest='/u02/app/oracle'
*.dispatchers='(PROTOCOL=TCP) (SERVICE=CLOUDCDBXDB)'
*.enable_pluggable_database=true
*.open_cursors=300
*.processes=300
*.remote_login_passwordfile='EXCLUSIVE'
*.undo_tablespace='UNDOTBS1'
*.sga_target=6g
*.pga_aggregate_target=2g
*.inmemory_size=1g
*.db_cache_size=4g

That reduced the size of the In-Memory Column Store to make room for the buffer cache. Now the database fits nicely into the buffer cache again:

SYS@cloudcdb > select name,bytes/1024/1024 as mb from v$sgainfo;

NAME                                                       MB
-------------------------------------------------- ----------
Fixed SGA Size                                     2,80265045
Redo Buffers                                       13,1953125
Buffer Cache Size                                        4256
In-Memory Area Size                                      1024
Shared Pool Size                                          800
Large Pool Size                                            32
Java Pool Size                                             16
Streams Pool Size                                           0
Shared IO Pool Size                                         0
Data Transfer Cache Size                                    0
Granule Size                                               16
Maximum SGA Size                                         6144
Startup overhead in Shared Pool                    181,290176
Free SGA Memory Available                                   0

14 rows selected.

Accordingly the message in the alert.log now reads
Buffer Cache Full DB Caching mode changing from FULL CACHING DISABLED to FULL CACHING ENABLED

Don’t get me wrong: I’m not arguing here against the In-Memory Option or in favor of Full Database Caching. Or whether it makes sense to use any of them or both. This post is just about clarifying the strange message in the alert.log that may confuse people.

And by the way, my demo database is running in the Oracle Cloud:-)

Tagged: 12c New Features

EM13c Cloning- Part II, Demands from Test Master and Kellyn

On my previous post, I submitted a job to create a test master database in my test environment.  Now my test environment is a sweet offering of containers that simulate a multi-host scenario, but in reality, it’s not.

mehttp://i2.wp.com/dbakevlar.com/wp-content/uploads/2016/03/me.png?resize=... 300w" sizes="(max-width: 320px) 100vw, 320px" data-recalc-dims="1" />

I noted that after the full copy started, my EMCC partially came down, as did my OMR, requiring both to be logged into and restarted.

Upon inspection of TOP on my “host” for my containers, we can see that there is some serious CPU usage from process 80:

 PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 80 root 20 0 0 0 0 R 100.0 0.0 134:16.02 kswapd0
24473 oracle 20 0 0 0 0 R 100.0 0.0 40:36.39 kworker/u3+

and that this is Linux process is managing the swap, (good job, Kellyn! :))

$ ps -ef | grep 80
root 80 2 0 Feb16 ? 02:14:36 [kswapd0]

The host is very slow to respond and is working hard.  Now what jobs are killing it so?  Is it all the test master creation?

test_m1http://i1.wp.com/dbakevlar.com/wp-content/uploads/2016/03/test_m1.png?re... 300w, http://i1.wp.com/dbakevlar.com/wp-content/uploads/2016/03/test_m1.png?re... 768w" sizes="(max-width: 593px) 100vw, 593px" data-recalc-dims="1" />

Actually, no, remember, this is Kellyn’s test environment, so I have four databases that are loading to a local AWR Warehouse and these are all container environments sharing resources.

I have two failed AWR extract jobs due to me overwhelming the environment and can no longer get to my database home page to even remove them. I had to wait for a bit for the main processing to complete before I could even get to this.

As it got closer to completing the work of the clone,I finally did log into the AWR Warehouse, removed two databases and then shut them down to free up resources and space.  We can then see the new processes for the test master, owned by 500 instead of showing as the oracle OS user as they’re running on a different container than the one I’m running the top command from:

24473 root 20 0 0 0 0 R 100.0 0.0 52:23.17 kworker/u3+
15682 500 20 0 2456596 58804 51536 R 98.0 0.2 50:07.48 oracle_248+
 5034 500 20 0 2729304 190600 181552 R 91.1 0.8 8:59.36 ora_dbw0_e+
 2946 500 20 0 2802784 686440 626148 R 86.8 2.8 3:15.62 ora_j019_e+
 5041 500 20 0 2721952 19644 17612 R 68.6 0.1 6:36.20 ora_lg00_e+

It looks a little better thou as it starts to recover from me multi-tasking too much at once.  After a certain amount of time, the test master was finished and up:

test_m2http://i0.wp.com/dbakevlar.com/wp-content/uploads/2016/03/test_m2.png?re... 300w, http://i0.wp.com/dbakevlar.com/wp-content/uploads/2016/03/test_m2.png?re... 768w" sizes="(max-width: 600px) 100vw, 600px" data-recalc-dims="1" />

I did get it to the point during the clone where there was no swap left:

KiB Mem : 24423596 total, 178476 free, 10595080 used, 13650040 buff/cache
KiB Swap: 4210684 total, 0 free, 4210684 used. 1148572 avail Mem

They really should just cut me off some days… </p />
</p></div></div>

    	  	<div class=

Why You Should Consider Moving Your Enterprise Applications to the Oracle Cloud

class="l-submain">
class="l-submain-h g-html i-cf">

 

style="font-weight: 400;">If you’ve decided to migrate your Oracle enterprise applications to the public cloud, it’s a good idea to consider Oracle Cloud alongside alternatives such as Amazon Web Services (AWS) and Microsoft Azure.

style="font-weight: 400;">Oracle has made big strides in the cloud lately with platform-as-a-service (PaaS) offerings for its middleware and database software, culminating in the release of its first infrastructure-as-a-service (IaaS) offering in late 2015.

style="font-weight: 400;">Oracle has a clear advantage over the competition when it comes to running its own applications in the cloud: it has full control over product licensing and can optimize its cloud platform for lift-and-shift migrations. This gives you a low-risk strategy for modernizing your IT portfolio.

 

What to expect from Oracle Cloud IaaS

style="font-weight: 400;">Because Oracle’s IaaS offering is quite new, it has yet to match the flexibility and feature set of Azure and AWS. For example, enterprise VPN connectivity between cloud and on-premises infrastructure is still very much a work in progress. Unlike AWS, however, Oracle provides a free software appliance for accessing cloud storage on-premises. In addition to offering an hourly metered service, Oracle also provides unmetered compute capacity with a monthly subscription. Some customers prefer this option because it allows them to more easily control their spending through a predictable monthly fee rather than a pure pay-as-you-go model.

style="font-weight: 400;">At the same time, Oracle Cloud IaaS has a limited selection of instance shapes, there is no SSD storage yet or guaranteed input/output performance levels, and transferring data is more challenging for large-volume migrations.

 

What to expect from Oracle Cloud PaaS

style="font-weight: 400;">Oracle’s PaaS offerings are quickly becoming among the most comprehensive cloud-based services for Oracle Database. They include:

 

style="font-weight: 400;">Oracle Database Schema Service

style="font-weight: 400;">This is the entry-level unmetered offering, available starting at $175 a month for a 5GB database schema limit. Tenants share databases but are isolated in their own schemas. This means you have no control over database parameters, only the schema objects created. This service is currently available only with Oracle Database 11g Release 2 (i.e., it is not yet included in the latest release of Oracle Database 12c).

 

style="font-weight: 400;">Oracle Exadata Cloud Service

style="font-weight: 400;">This is a hosted service with monthly subscriptions starting at $70,000 for a quarter rack with 28 OCPUs enabled and 42TB of usable storage provisioned. You have full root OS access and SYSDBA database access, so you have total flexibility in managing your environment. However, this means Oracle manages only the bare minimum—the external networking and physical hardware—so you may end up expending the same effort as you would managing Exadata on-premises.

 

style="font-weight: 400;">Oracle Database Virtual Image Service

style="font-weight: 400;">This is a Linux VM with pre-installed Oracle Database software. The license is included in the rate. It’s available metered (priced per OCPU per hour of runtime) and unmetered (priced per OCPU allocated per month). As you’ll need to manage everything up from the VM level, including OS management and full DBA responsibilities, the metered service is a particularly good option for running production environments that require full control over the database deployment.

 

style="font-weight: 400;">Oracle Database-as-a-Service (DBaaS)

style="font-weight: 400;">This is an extension of Virtual Image Service and includes additional automation for database provisioning during service creation, backup, recovery, and patching. While you are still responsible for the complete management of the environment, the embedded automation and tooling can simplify some DBA tasks.

style="font-weight: 400;">I should point out that, with the exception of Oracle Database Schema Service, these are not “true” PaaS offerings; they function more like IaaS-style services but with database software licenses included. But this is on the way, as style="font-weight: 400;">Oracle recently announced plans for a fully managed DBaaS offering style="font-weight: 400;">  similar to the one available through AWS.

 

style="font-weight: 400;">While Oracle’s cloud options are still quite new and require additional features for broad enterprise adoption, if this option sparks your interest, now is the time to take the first steps. If you want to learn more about the migration path to Oracle Cloud, check out our white paper, style="font-weight: 400;">Migrating Oracle Databases to Cloud style="font-weight: 400;">.

migratingtocloudhttps://www.pythian.com/blog/wp-content/uploads/migratingtocloud-300x167... 300w, https://www.pythian.com/blog/wp-content/uploads/migratingtocloud-360x200... 360w, https://www.pythian.com/blog/wp-content/uploads/migratingtocloud-465x258... 465w, https://www.pythian.com/blog/wp-content/uploads/migratingtocloud-350x194... 350w" sizes="(max-width: 470px) 100vw, 470px" />

How to Find Information in the OMR, (Enterprise Manager Repository)

I get a lot of questions starting with, “Where do I find…” and end with “in the Oracle Management Repository, (OMR)?”

confused_lost

The answer to this is one that most DBAs are going to use, as it’s no different than locating objects in most databases, just a few tricks to remember when working with the OMR.

  1.  SYSMAN is the main schema owner you’ll be querying in an OMR, (although there are others, like SYSMAN_RO and others.)
  2. Most views you will be engaging with when querying the OMR start with MGMT or MGMT$.
  3. DBA_TAB_COLUMNS is your friend.
  4. Know the power of _GUID and _ID columns in joins.

Using this information, you can answer a lot of questions when trying to figure out a command you’re seen but don’t have your specific syntax and need to know where to get it from.

Getting Info

As a working example, someone asked me today how they would locate what platform # is used for their version of Linux?  The documentation referred to a command that listed one, but they couldn’t be sure if it was the same one that they were deploying.

So how would we find this?

./emcli 
 -platform=?

 

select table_name from dba_tab_columns
where owner='SYSMAN'
and table_name like 'MGMT%'
and column_name='PLATFORM_NAME';

This is going to return 5 rows and trust me, pretty much all of them are going to have the PLATFORM_ID  along with that PLATFORM_NAME  one way or another in it.  There are a few that stand out that with a little logic, make sense:

TABLE_NAME
--------------------------------------------------------------------------------
MGMT_ARU_PLATFORMS_E
MGMT$ARU_PLATFORMS
MGMT$EM_LMS_ACT_DATA_GUARD_VDB
MGMT_ARU_PLATFORMS
MGMT_CCR_HOST_INFO
SQL> select distinct(platform_name), platform_id from sysman.mgmt$aru_platforms
 2 order by platform_id;
PLATFORM_NAME PLATFORM_ID
---------------------------------------- -----------
HP OpenVMS Alpha 89
Oracle Solaris on x86 (32-bit) 173
HP-UX Itanium 197
Microsoft Windows Itanium (64-bit) 208
IBM: Linux on System z 209
IBM S/390 Based Linux (31-bit) 211
IBM AIX on POWER Systems (64-bit) 212
Linux Itanium 214
Linux x86-64 226
IBM: Linux on POWER Systems 227
FreeBSD - x86 228

The person who posted the question was looking for the Platform_ID for Linux x86-64, which happens to be 226.

Summary

I’d always recommend checking views, as they may be in reserve for plugins or management packs that haven’t been deployed or used before counting on data, but there’s a lot that you can find out even if it isn’t in the GUI.

We’re DBAs, we love data and there’s plenty of that in the OMR for EM13c.

 

 

 

 

 



Tags:  ,


Del.icio.us



Facebook

TweetThis

Digg

StumbleUpon




Copyright © DBA Kevlar [How to Find Information in the OMR, (Enterprise Manager Repository)], All Right Reserved. 2016.

What Are Your Options For Migrating Enterprise Applications to the Cloud?

class="l-submain">
class="l-submain-h g-html i-cf">

style="font-weight: 400;">Migrating your enterprise applications from on-premises infrastructure to the public cloud is attractive for a number of reasons. It eliminates the costs and complexities of provisioning hardware and managing servers, storage devices, and network infrastructure; it gives you more compute capacity per dollar without upfront capital investment; and you gain opportunities for innovation through easier access to new technologies, such as advanced analytical capabilities.

style="font-weight: 400;">So how do you get there?

style="font-weight: 400;">You have a few options. At one end of the spectrum, you could simply wait and rationalize, making continuous incremental changes to gain efficiencies. This is obviously a “slow burn” approach. In the middle is a “lift-and-shift” from your current environment into the public cloud. And at the far extreme, you could plunge right in and re-architect your applications—a costly and probably highly complex task.

 

style="font-weight: 400;">In fact, a true migration “strategy” will involve elements of each of these. For example, you could perform short-term optimizations and migrations on a subset of applications that are ready for the cloud, while transforming the rest of your application stack over the longer term.

 

What to expect from the major public cloud platforms

style="font-weight: 400;">There are three leading public cloud platforms: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). As Google doesn’t seem to be driving customers to lift-and-shift their applications to GCP, I’m going to focus on AWS and Azure as potential cloud destinations and, for specificity, take Oracle enterprise databases as the use case.

 

style="font-weight: 400;">Amazon Web Services

style="font-weight: 400;">You have two options for migrating Oracle databases to the AWS cloud: infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS).

 

style="font-weight: 400;">Deploying Oracle applications in AWS IaaS is much like deploying them on your in-house infrastructure. You don’t get flexible licensing options, but you do have the ability to easily allocate more or less capacity as needed for CPU, memory, and storage. However, because AWS IaaS is style="font-weight: 400;">virtualized style="font-weight: 400;"> infrastructure, you may experience slower performance due to suboptimal CPU core allocation or processor caches. You’ll also have less flexibility with instance sizes, network topology, storage performance tiers, and the like.

 

style="font-weight: 400;">AWS Relational Database Service (RDS) for Oracle is a managed PaaS offering where, in addition to giving you the benefits of IaaS, Amazon takes on major DBA and system administrator tasks including provisioning, upgrades, backups, and multi-availability zone replication. This significantly simplifies your operations—but also results in less control over areas such as configuration, patching, and maintenance windows. AWS RDS for Oracle can also be used with a pay-as-you-go licensing model included in the hourly rate.

 

style="font-weight: 400;">Microsoft Azure

style="font-weight: 400;">Azure does not have a managed offering for Oracle databases, so the only way to run Oracle Database on Azure is through its IaaS platform. The benefits are very similar to AWS IaaS, but Azure offers additional licensing options (with Windows-based license-included images) and its instances are billed by the minute rather than by the hour. What’s important to keep in mind is that Azure is not as broadly adopted as AWS and offers less flexibility for storage performance tiers and instance sizes. Oracle Database software running on Windows is also not as common as running on Linux.

 

style="font-weight: 400;">For more in-depth technical details on these options, I encourage you to read our white paper, style="font-weight: 400;">Migrating Oracle Databases to Cloud style="font-weight: 400;">. My next blog in this series will look at one other option not discussed here: migrating to Oracle Cloud.

style="font-weight: 400;"> migratingtocloudhttps://www.pythian.com/blog/wp-content/uploads/migratingtocloud-300x167... 300w, https://www.pythian.com/blog/wp-content/uploads/migratingtocloud-360x200... 360w, https://www.pythian.com/blog/wp-content/uploads/migratingtocloud-465x258... 465w, https://www.pythian.com/blog/wp-content/uploads/migratingtocloud-350x194... 350w" sizes="(max-width: 482px) 100vw, 482px" />

Oracle Security And Delphix Paper and Video Available

I did a webinar with Delphix on 30th March 2016 on USA time. This was a very good session with some great questions at the end from the attendees. I did a talk on Oracle Security in general, securing non-production....[Read More]

Posted by Pete On 01/04/16 At 03:43 PM

Set Operations

A recent post on the OTN database forum highlights a couple of important points ideas for optimising SQL. There are: (a) is there a logically equivalent way of stating the SQL and (b) is there a different “natural language” way of posing the problem.

The posting starts with a query, part of an execution plan, and a request to “get rid of the tablescan”. I guessed originally that the query came from an 11g instance, and the OP gave us some code to create the tables and indexes, so I’ve modelled the tables to get the indicated plan (then filled in the original numbers). This is the query, and my cosmetically adjusted version of the plan output that the OP probably got:


SELECT a.hotel_code
  FROM lf_hotel_temp a
WHERE a.service_id = : p_service_id
       AND (NOT EXISTS (SELECT *
          FROM lf_ts_roomtype_properties b
         WHERE a.hotel_code = b.hotel_code)
        or NOT EXISTS (SELECT *
          FROM lf_gta_roomtype_properties b
         WHERE a.hotel_code = b.hotel_code) 
       or  NOT EXISTS (SELECT *
          FROM lf_hb_roomtype_properties b
         WHERE a.hotel_code = b.hotel_code))

-------------------------------------------------------------------------------
| Id  | Operation          | Name                     | Rows  |  Bytes | Cost |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |                          | 12613 | 113517 |  135 |
|*  1 |  FILTER            |                          |       |        |      |
|*  2 |   TABLE ACCESS FULL| LF_HOTEL_TEMP            | 88433 | 795897 |  135 |
|*  3 |   INDEX RANGE SCAN | LF_TS_ROOMTYPE_PROP_IDX  |     1 |      7 |    1 |
|*  4 |   INDEX RANGE SCAN | LF_GTA_ROOMTYPE_PROP_IDX |     1 |      9 |    1 |
|*  5 |   INDEX RANGE SCAN | LF_HB_ROOMTYPE_PROP_IDX  |     2 |     14 |    3 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter( NOT EXISTS (SELECT 0 FROM "LF_TS_ROOMTYPE_PROPERTIES" "B" WHERE
              "B"."HOTEL_CODE"=:B1) OR  NOT EXISTS (SELECT 0 FROM "LF_GTA_ROOMTYPE_PROPERTIES" "B"
              WHERE "B"."HOTEL_CODE"=:B2) OR  NOT EXISTS (SELECT 0 FROM "LF_HB_ROOMTYPE_PROPERTIES"
              "B" WHERE "B"."HOTEL_CODE"=:B3))
   2 - filter("A"."SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
   3 - access("B"."HOTEL_CODE"=:B1)
   4 - access("B"."HOTEL_CODE"=:B1)
   5 - access("B"."HOTEL_CODE"=:B1)

We were told in the original posting that there’s a primary key on lf_hotel_temp declared on (hotel_code, service_id), and we were given the definitions, sizes, and index declarations of all the table in a follow-up posting. It turns out that lf_hotel_temp consists of just those two columns and holds 278,000 rows: the optimizer’s estimate for the number of rows identified by a single service_id is over 88,000, and the nature of the query tells us that the optimizer would have to examine every one of those rows to check if it satisfied any of the three subqueries.

So how might Oracle access the rows ?  Given that the only columns used will all be in the primary key index (which implies not null constraints) there are four basic options: tablescan, index fast full scan, index full scan, and index skip scan. Given the most likely data content (i.e. lots of different hotel_codes), we can assume the skip scan would be a very bad idea. We can be sure that an index fast full scan will be lower cost than an index full scan – for anything except tiny indexes. Ultimately the question is really “why a tablescan instead of an index fast full scan?”. As I pointed out, though, the table consists of just those two columns – which means it’s perfectly reasonable for the index to be larger than the table as each entry of the index will consist of the two columns AND a rowid.

The first interesting bit

The question of why the access to lf_hotel_temp was by tablescan rather than some indexed method isn’t really interesting. The interesting bit is how (in principle) we might make the plan more efficient (if it really needs it); and this leads to two key, and general purpose, observations. As Andrew Sayer pointed out on the thread, we have a compound predicate:

    (not exists A OR not exists B OR not exists C)

and this is logically equivalent to

   not (exists A AND exists B AND exists C)

If we rewrite the query to reflect this equivalence could the optimizer find a different, better way of executing it:


select  /*+ dynamic_sampling(0) */
        a.hotel_code
from    lf_hotel_temp a
where
        a.service_id = :p_service_id
and     not(
                exists (
                        select  null
                        from    lf_ts_roomtype_properties ts
                        where   ts.hotel_code = a.hotel_code
                )
            and exists (
                        select  null
                        from    lf_gta_roomtype_properties gta
                        where   gta.hotel_code = a.hotel_code
                )
            and exists (
                        select  null
                        from    lf_hb_roomtype_properties hb
                        where   hb.hotel_code = a.hotel_code
                )
        )
;

Of course, I didn’t have the original data; so I copied the DDL supplied in the OTN thread and added a little DML to insert a few rows in the tables. The data I used looked like this:


insert into lf_hotel_temp (hotel_code, service_id) values ('A',1);
insert into lf_hotel_temp (hotel_code, service_id) values ('B',1);
insert into lf_hotel_temp (hotel_code, service_id) values ('C',1);
insert into lf_hotel_temp (hotel_code, service_id) values ('D',1);

-- insert into lf_ts_roomtype_properties values ( 'A','x','x',0,1,'x');
insert into lf_ts_roomtype_properties values ( 'B','x','x',0,1,'x');
insert into lf_ts_roomtype_properties values ( 'C','x','x',0,1,'x');
insert into lf_ts_roomtype_properties values ( 'D','x','x',0,1,'x');

-- insert into lf_gta_roomtype_properties values ( 'A','x','x',0,1,'x');
-- insert into lf_gta_roomtype_properties values ( 'B','x','x',0,1,'x');
insert into lf_gta_roomtype_properties values ( 'C','x','x',0,1,'x');
insert into lf_gta_roomtype_properties values ( 'D','x','x',0,1,'x');

-- insert into lf_hb_roomtype_properties values ( 'A','x','x',0,1,'x');
-- insert into lf_hb_roomtype_properties values ( 'B','x','x',0,1,'x');
-- insert into lf_hb_roomtype_properties values ( 'C','x','x',0,1,'x');
insert into lf_hb_roomtype_properties values ( 'D','x','x',0,1,'x');
commit;

It’s possible that with different data volumes you’d get different execution plans, but in 11g the optimizer transformed my query back into the original form – in other words it recognised the equivalence of “not (A and B and C)” and rewrote it as “(not A or not B or not C)” !

However, I also have 12c available, and I had created a script to build a model, so I ran the test on 12c. Both versions of the query produced the following plan:


----------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                            |     1 |  2027 |     8  (13)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT ANTI |                            |     1 |  2027 |     8  (13)| 00:00:01 |
|   2 |   VIEW                | VW_SQ_1                    |    82 |   984 |     6   (0)| 00:00:01 |
|*  3 |    HASH JOIN SEMI     |                            |    82 |  2952 |     6   (0)| 00:00:01 |
|*  4 |     HASH JOIN         |                            |    82 |  1968 |     4   (0)| 00:00:01 |
|   5 |      TABLE ACCESS FULL| LF_GTA_ROOMTYPE_PROPERTIES |    82 |   984 |     2   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL| LF_HB_ROOMTYPE_PROPERTIES  |    82 |   984 |     2   (0)| 00:00:01 |
|   7 |     TABLE ACCESS FULL | LF_TS_ROOMTYPE_PROPERTIES  |    82 |   984 |     2   (0)| 00:00:01 |
|*  8 |   INDEX FULL SCAN     | LF_HOTEL_TEMP_PK           |   101 |   198K|     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("VW_COL_1"="A"."HOTEL_CODE")
   3 - access(SYS_OP_MAP_NONNULL("HB"."HOTEL_CODE")=SYS_OP_MAP_NONNULL("TS"."HOTEL_CODE"))
   4 - access(SYS_OP_MAP_NONNULL("HB"."HOTEL_CODE")=SYS_OP_MAP_NONNULL("GTA"."HOTEL_CODE"))
   8 - access("A"."SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
       filter("A"."SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))

Ignore the numbers (I hadn’t collected stats, which is why I added the /*+ dynamic_sampling(0) */ hint – with stats in place 12c produced the FILTER plan that 11g had produced) the key feature is that Oracle has managed to transform my three filter subqueries into a single join subquery and then transformed the resulting subquery into an anti-join. It’s a pretty amazing transformation – the optimizer did it automatically in 12c, but if you are aware of the logical equivalence then you may find cases where you can turn “OR’s” into “AND’s” and help the optimizer to find transformations that it can’t find automatically.

The second interesting bit

If you think about the meaning behind the query (prompted, perhaps, by the logical equivalence described above) you might rephrase the question as “find me the hotel codes that fail to appear in all three related tables” – in English this is ambigious and open to catastrophic mis-interpretation so you might have another go and say “find me the hotel codes that appear in every one of the three related tables – those are the hotel codes I don’t want”. This latter expression, of course, is exactly what Oracle is doing by joining the three tables and then doing the “not exists”/anti-join against the result. Obviously you could translate the new English form into SQL by hand, with a three table join in a “not exists” subquery.

I actually took a different approach (which might, or might not, be efficient – depending on the actual data and indexes).  I translated the new English statement into the following:


select  /*+ dynamic_sampling(0) */
        hotel_code
from    lf_hotel_temp
where   service_id = :p_service_id
minus   (
        select  hotel_code
        from    lf_ts_roomtype_properties
        where   hotel_code is not null
        intersect
        select  hotel_code
        from    lf_gta_roomtype_properties
        where   hotel_code is not null
        intersect
        select  hotel_code
        from    lf_hb_roomtype_properties
        where   hotel_code is not null
        )
;

The three way intersection gets me the list of hotels that appear in all three tables; the minus operator takes the list of hotel with the correct service_id and eliminates from it the hotels that appear in the intersection – giving me the result I want.

For my tiny data set, this is the plan I got:

--------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                          |     1 |  2159 |     8  (50)| 00:00:01 |
|   1 |  MINUS                |                          |       |       |            |          |
|   2 |   SORT UNIQUE NOSORT  |                          |     1 |  2015 |     2  (50)| 00:00:01 |
|*  3 |    INDEX FULL SCAN    | LF_HOTEL_TEMP_PK         |     1 |  2015 |     1   (0)| 00:00:01 |
|   4 |   INTERSECTION        |                          |       |       |            |          |
|   5 |    INTERSECTION       |                          |       |       |            |          |
|   6 |     SORT UNIQUE NOSORT|                          |     4 |    48 |     2  (50)| 00:00:01 |
|*  7 |      INDEX FULL SCAN  | LF_TS_ROOMTYPE_PROP_IDX  |     4 |    48 |     1   (0)| 00:00:01 |
|   8 |     SORT UNIQUE NOSORT|                          |     4 |    48 |     2  (50)| 00:00:01 |
|*  9 |      INDEX FULL SCAN  | LF_GTA_ROOMTYPE_PROP_IDX |     4 |    48 |     1   (0)| 00:00:01 |
|  10 |    SORT UNIQUE NOSORT |                          |     4 |    48 |     2  (50)| 00:00:01 |
|* 11 |     INDEX FULL SCAN   | LF_HB_ROOMTYPE_PROP_IDX  |     4 |    48 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
       filter("SERVICE_ID"=TO_NUMBER(:P_SERVICE_ID))
   7 - filter("HOTEL_CODE" IS NOT NULL)
   9 - filter("HOTEL_CODE" IS NOT NULL)
  11 - filter("HOTEL_CODE" IS NOT NULL)

Important note: I am not claiming that this use of set operators will be more efficient than a filter subquery or anti-join/semi-join approach, performance ultimately depends on the volume and patterns in the data combined with the available indexing. In this case you can almost see the classic performance compromise that we often see in Oracle – even in the trade-off between something as simple as choosing between a hash join and a nested loop join – should we operate this query as a tiny number of “bulk” operations, or as a (potentially) large number of tiny, high-precision operations.

If the original query was spending all it’s time on CPU running lots of subqueries, or doing lots of single block random I/Os because of the random ordering of the subqueries, then perhaps a couple of brute force “db file parallel read” index full scans would be a friendlier use of the available resources, run more quickly, and have less impact on every other user.

 

NoCOUG changes its name (Happy April Fools Day)‏

The board members of NoCOUG meets in closed session to discuss the continuing decline in memberships and conference attendance.(read more)