Oakies Blog Aggregator

Zone Maps

Zone Maps are a new feature officially introduced in 12.1.0.2.0, so I've decided to take a closer look.

From the Oracle Documentation:

For full table access, zone maps allow I/O pruning of data based on the physical location of the data on disk, acting like an anti-index.



Let's start by creating a test table:

SQL> create table t pctfree 95 clustering by linear order (n) as
2 select level n, rpad('x', 200, 'x') v
3 from dual
4 connect by level <= 5000;

Table created

I've used a high pctfree setting to make sure the table gets spread out on disk -- each row will occupy its own block:

SQL> select count(*) num_rows,
2 count(distinct dbms_rowid.rowid_block_number(rowid)) num_blocks
3 from t;

NUM_ROWS NUM_BLOCKS
---------- ----------
5000 5000

Zone Maps do not require attribute clustering; however, I did use clustering in this particular case to ensure that value ranges for column N do not overlap when physically stored on disk, which makes it easier to see what's happening when a Zone Map is created.
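
As an aside, the two features can be combined in one statement: if I read the 12.1.0.2 syntax correctly, the CLUSTERING clause accepts a WITH MATERIALIZED ZONEMAP option, so something like the following should create a clustered table together with its Zone Map in one go (a sketch I haven't run here; T2 is a hypothetical table name):

SQL> create table t2 pctfree 95 clustering by linear order (n)
2 with materialized zonemap
3 as
4 select level n, rpad('x', 200, 'x') v
5 from dual
6 connect by level <= 5000;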

Now let's create a Zone Map:

SQL> create materialized zonemap zm$t on t (n);

Done

The first thing to notice is that we now have an additional table in our schema with the same name as the Zone Map:

SQL> select * from zm$t order by zone_id$;

ZONE_ID$ MIN_1_N MAX_1_N ZONE_LEVEL$ ZONE_STATE$ ZONE_ROWS$
------------ ---------- ---------- ----------- ----------- ----------
324421046272 1 748 0 0 748
324421046273 749 1756 0 0 1008
324421046274 1757 2764 0 0 1008
324421046275 2765 3772 0 0 1008
324421046276 3773 4780 0 0 1008
324421046277 4781 5000 0 0 220

6 rows selected

As you can see, we've got six zones defined, most of them covering a range of about a thousand rows, with the exception of the first and the last. I can now map each ZONE_ID$ to its respective block range on disk:

SQL> select zone_id$, min_block_id, max_block_id, zone_rows$
2 from (
3 select zm$t.zone_id$,
4 min(dbms_rowid.rowid_block_number(t.rowid)) min_block_id,
5 max(dbms_rowid.rowid_block_number(t.rowid)) max_block_id,
6 max(zone_rows$) zone_rows$
7 from t, zm$t
8 where t.n between zm$t.min_1_n and zm$t.max_1_n
9 group by zm$t.zone_id$
10 ) order by zone_id$;

ZONE_ID$ MIN_BLOCK_ID MAX_BLOCK_ID ZONE_ROWS$
------------ ------------ ------------ ----------
324421046272 179 1023 748
324421046273 1026 2047 1008
324421046274 2050 3071 1008
324421046275 3074 4095 1008
324421046276 4098 5119 1008
324421046277 5122 5343 220

6 rows selected

Based on a couple more tests I've done, the first zone appears to run from the start of the segment up to the first block number that divides evenly by 1024, after which all subsequent blocks get mapped to 8MB regions. The last zone has 221 blocks because that's where the last table block happens to be.

Zone Maps seem to be completely independent of the actual extents on disk. In my tests I was able to get multiple zones pointing at the same extent, and in the case above we have multiple extents covered by the same zone. In all cases zones were 8MB in size (more on that a little bit later). By the way, if you're wondering why we have 1008 rows per zone and slightly odd starting block_ids (i.e. 1026 instead of 1024 and so on), remember that the first couple of blocks are used for first-level bitmap blocks (bmb) and do not store actual table data.
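
A quick way to see this independence (assuming you have access to DBA_EXTENTS) is to list the table's extents and set them against the zone block ranges above:

SQL> select extent_id, file_id, block_id, blocks
2 from dba_extents
3 where owner = user
4 and segment_name = 'T'
5 order by extent_id;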

So how is this all pieced together?

SYS_OP_ZONE_ID

The SYS_OP_ZONE_ID function computes ZONE_ID$ for a given ROWID value. The calculation appears to be straightforward math based on the ROWID value and assumes 8MB Zone Map chunks -- this explains why zones are created adjacent to 8MB boundaries, since ZONE_ID$ values in the Zone Map are computed using the same function:

SQL> select rowid, SYS_OP_ZONE_ID(rowid)
2 from t
3 where rownum=1;

ROWID SYS_OP_ZONE_ID(ROWID)
------------------ ---------------------
AAAS4kAAFAAAACzAAA 324421046272


Essentially, if we imagine the entire possible database space divided into 8MB regions, this function returns the region to which a particular ROWID value belongs.
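
Working backwards from the values above, on my 8KB-block database the math appears to be trunc((data_object_id * 2^32 + relative_fno * 2^22 + block_number) / 1024) -- note that 8MB is exactly 1024 blocks of 8KB. This is my own reconstruction from the observed values rather than anything documented, but it can be checked against the function directly:

SQL> select sys_op_zone_id(rowid) zone_id,
2 trunc((dbms_rowid.rowid_object(rowid) * power(2, 32) +
3 dbms_rowid.rowid_relative_fno(rowid) * power(2, 22) +
4 dbms_rowid.rowid_block_number(rowid)) / 1024) computed_zone_id
5 from t
6 where rownum = 1;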

SYS_ZMAP_FILTER

Let's look at the explain plan for the following query:

SQL> select * from t where n=1;


Execution Plan
----------------------------------------------------------
Plan hash value: 2931408918

--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%C
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 205 | 1380
|* 1 | TABLE ACCESS STORAGE FULL WITH ZONEMAP| T | 1 | 205 | 1380
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("N"=1)
filter(SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN (zm."MIN_1_N" > :1 OR zm."MAX_1_N"
< :2) THEN 3 ELSE 2 END END FROM "ROOT"."ZM$T" zm WHERE zm."ZONE_LEVEL$"=0 ORDER BY
zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),1,1)<3 AND "N"=1)

We can see that SYS_ZMAP_FILTER appears to be the function involved in figuring out which zones need to be accessed in order to execute the query. The condition inside the query (zm."MIN_1_N" > :1 OR zm."MAX_1_N" < :2) is used to eliminate zones which cannot contain the value we're looking for, and is constructed dynamically based on the predicate(s) in the query. From here each ZONE_ID$ can be mapped back to a ROWID range (the reverse of the SYS_OP_ZONE_ID operation) so that only the required portions of the data on disk are scanned.
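
You can emulate the pruning by running the embedded query by hand with the bind values substituted (:1 = :2 = 1 for our predicate). Going by the CASE expression above, a result of 3 should mean the zone is skipped, while anything less than 3 means it has to be scanned -- my interpretation of the codes, not a documented contract:

SQL> select zone_id$,
2 case when bitand(zone_state$, 1) = 1 then 1
3 else case when (min_1_n > 1 or max_1_n < 1) then 3 else 2 end
4 end zone_result
5 from zm$t
6 where zone_level$ = 0
7 order by zone_id$;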

This looks to be a very exciting feature, and I can't help thinking it's a little disappointing that it's limited to Exadata storage, as it could have been very useful on other systems, which lack storage index support.

In Search of Plan Stability (Part 1) - August 27 webinar

Register by clicking here.

Hope you'll join me!

Why am I seeing “Result: Failed. Insufficient resource available” messages?

This is an interesting point that came up recently, where someone had a DBaaS environment running on an Exadata system for a couple of months with no issues.  It recently started to return errors on the Database Cloud Self Service portal in the "Placement Algorithm" region:

Constraint : Number Of Databases
Current Resource Usage: 18
Total Resource Available: 10
Maximum Limit Defined: 10
Requested Value: 1
Total Resource that can be used: 10
Result: Failed. Insufficient resource available.

But if I grep for PMON processes,  the number of databases running on the machine is 8.  Why would I see such an error?

The reason here is fairly simple – there are a couple of databases that are down, and if we include those, the number of databases on the machine has reached the maximum limit defined. But the explanation needs a bit more detail.

So why do we include databases that are down in this count? Well, even in a non-cloud environment, shutting down a database doesn't release the storage used by that database (thankfully!). A mere shutdown of the database cannot be assumed to reduce the quota count, as the database may be restarted at any time – either manually by the database administrator using Enterprise Manager Cloud Control, EM Express, or SQL*Plus, or automatically using tools such as cron or init.d (rc2.d) – at which point the machine resources will again be used. Quota over-allocation is generally used to handle such requirements, where not all databases will be up and running all the time.

The next question that comes to mind is whether this algorithm is based only on the number of databases, or also on machine resources such as CPU and memory when the database is down. At a high level, the answer is that the algorithm is controlled by 3 factors:

  1. The number of RDBMS instances*: this is the number of instance targets on a specific host. This data is fetched from the Enterprise Manager repository, which is why databases that are down are counted (see the repository query sketch after this list). So if you really want to ensure a database that is shut down does not count towards the total number of databases, then remove the database target from being managed by Enterprise Manager. :)
  2. Memory – there are two flavours here:
    • Memory usage based on the host – calculated as the average memory utilization on the host over the last 7 days
    • Memory usage based on the databases – calculated as the total sum of memory allocated for the databases on the host
  3. CPU – the average CPU utilization over the last 7 days on the host
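
For the curious, a rough approximation of the instance count the placement algorithm sees can be pulled straight from the repository. Here's a sketch using the standard MGMT$TARGET view (the exact view and criteria the portal uses are an assumption on my part):

select host_name, count(*) database_instance_targets
from sysman.mgmt$target
where target_type = 'oracle_database'
group by host_name;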

Hopefully this explains why we call the overall theme of database placement 'Optimistic'. :)

*You'll notice I used the term "RDBMS instances" here, NOT "database instance". There's a reason for that. It's one of my pet peeves to see the term "database instance". In Oracle terms, a database is a set of physical files – the datafiles, redo log files and control files (but not the initialization parameter file, whichever variety – SPFile or the old init.ora text file – you use). An instance is the set of background processes and memory structures in the SGA. While there is most often a one-to-one mapping between a database and an instance (the exception of course being a Real Application Clusters environment, where multiple instances share a single database), there really is no overlap between the term "database" and the term "instance". Yes, I know our products use the term all the time, and so does our documentation, but to my pedantic way of thinking, they're WRONG! :)

Weblogic Start Failure: Empty Initial Replica

This post addresses a failed Weblogic start in EM12c.  The fix is performed on a Windows host, but it applies equally to any EM12c environment that experiences this issue.  In the instructions below, just replace the Windows environment variable %OMS_BASE% with $OMS_BASE for Linux/Unix.

Issue:  OMS Fails to Start and Points to Weblogic Issue in EMGC_OMS1.out File.

Error in Out File: 

Reason: Failure on the create domain from template step, aka "Empty Initial Replica"

How to resolve:

1.  Stop all Oracle processes/services on the host.

2.  Go to:

%OMS_BASE%\gc_inst\user_projects\domains\GCDomain\servers\EMGC_ADMINSERVER\data

Rename the ldap folder to ldapold.
3. Go to each of the following folders and remove all files and subfolders from the directories:

%OMS_BASE%\gc_inst\user_projects\domains\GCDomain\servers\EMGC_ADMINSERVER\cache
%OMS_BASE%\gc_inst\user_projects\domains\GCDomain\servers\EMGC_ADMINSERVER\tmp
%OMS_BASE%\gc_inst\user_projects\domains\GCDomain\servers\EMGC_ADMINSERVER\logs

4.  Restart all Oracle services/processes for the EM12c environment on the host.

You’ll notice the LDAP folder is recreated with the restart of the Weblogic tier.  This should also resolve the domain creation error and all services should start successfully.




Oracle Midlands : Event #5

Just a quick reminder that Oracle Midlands Event #5 is just around the corner (Tuesday 16 September).

  • Boost Performance by Clustering Data - Martin Widlake
  • Data Virtualisation and Instant Cloning - Ron Ekins (Delphix)

Martin is always good value, so I’m looking forward to that session. I’ve seen a couple of talks on the Delphix stuff and it is seriously cool! I think this session will open a few people’s eyes… :)

Big thanks to the Red Gate Software folks for sponsoring the event, allowing it to remain free.

You can get more details here. I’ve already registered. See you there!

Cheers

Tim…


MERGE and IOT’s ….. unhappy bedfellows

Anyone who has used Oracle for a while will be familiar with the Parent/Child locking "issue" when it comes to tables and indexes on foreign keys.  For many years you'd hear people crying "bug" etc., but thankfully most now know the reason and accept it as sensible behaviour.

But let's take a look at a slight variation on that theme.

Let's start with a table called "LOC", which will be our parent table in this example. Note that it is an IOT; we'll also have a child table, "LOC_CHILD", which is also an IOT.

SQL> CREATE TABLE LOC
  2  (
  3    LOC_ID              NUMBER(4)            NOT NULL,
  4    DATA                     NUMBER(6),
  5    CONSTRAINT LOC_PK
  6    PRIMARY KEY
  7    ( LOC_ID)
  8    ENABLE VALIDATE
  9  )
 10  ORGANIZATION INDEX
 11  /

Table created.

SQL> CREATE TABLE LOC_CHILD
  2  (
  3    CHILD_SEQ                  NUMBER(12)          NOT NULL,
  4    LOC_ID               NUMBER(4)           NOT NULL,
  5    CHILD_DATA  NUMBER(15,6),
  6    CONSTRAINT LOC_CHILD_PK
  7    PRIMARY KEY
  8    (CHILD_SEQ, LOC_ID)
  9    ENABLE VALIDATE
 10  )
 11  ORGANIZATION INDEX
 12  /

Table created.

SQL> insert into LOC
  2  select rownum,50
  3  from dual
  4  connect by level <= 5
  5  /

5 rows created.

Now, being a good DBA :-) we've read all the "gloom and doom" nonsense about unindexed foreign keys, so just to be careful, we'll add that index onto our child table before adding our foreign key back to LOC.

SQL> CREATE INDEX LOC_CHILD_IX ON LOC_CHILD
  2  (LOC_ID)
  3  /

Index created.

SQL> ALTER TABLE LOC_CHILD ADD (
  2    CONSTRAINT LOC_CHILD_FK
  3    FOREIGN KEY ( LOC_ID)
  4    REFERENCES LOC (LOC_ID)
  5    ENABLE VALIDATE)
  6  /

Table altered.

SQL> insert into LOC_CHILD
  2  select rownum,mod(rownum,5)+1,dbms_random.value(1000,5000)
  3  from dual
  4  connect by level <= 2000
  5  /

2000 rows created.

SQL> commit;

Commit complete.

So the scene is set… we've got our tables seeded with some data, and we're ready to go.

Let's update a row in the parent table LOC:

SQL> UPDATE loc
  2  SET    DATA = 99
  3  WHERE  LOC_ID = 2;

1 row updated.

Now we’ll pop into a new session and update the child table LOC_CHILD:

SQL> MERGE 
  2       INTO  LOC_CHILD
  3       USING (SELECT 500 CHILD_SEQ,
  4                     2 LOC_ID,
  5                     1000 CHILD_DATA
  6                FROM DUAL) M
  7          ON (    LOC_CHILD.CHILD_SEQ = M.CHILD_SEQ
  8              AND LOC_CHILD.LOC_ID = M.LOC_ID)
  9  WHEN MATCHED
 10  THEN
 11     UPDATE SET
 12        LOC_CHILD.CHILD_DATA =  NVL (LOC_CHILD.CHILD_DATA, 0) + M.CHILD_DATA
 13  WHEN NOT MATCHED
 14  THEN
 15     INSERT     (CHILD_SEQ,
 16                 LOC_ID,
 17                 CHILD_DATA)
 18         VALUES (M.CHILD_SEQ,
 19                 M.LOC_ID,
 20                 M.CHILD_DATA);

[stuck]

And splat… we're stuck.  (Without evidence to support it) I'd hypothesize it's due to the flexibility of the MERGE command.  A single MERGE can insert, update and even delete rows, so I'm guessing that to handle this flexibility (in particular, the DELETE option) the locking errs on the side of safety.
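
If you want to poke at this yourself, a third session can look at the table-level (TM) locks held and requested while the MERGE is stuck. Here's a diagnostic sketch (for TM locks, ID1 is the object ID; interpreting the lock modes is left as an exercise):

SQL> select l.sid, l.type, o.object_name, l.lmode, l.request
  2  from v$lock l, user_objects o
  3  where l.type = 'TM'
  4  and l.id1 = o.object_id
  5  and o.object_name in ('LOC','LOC_CHILD');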

Monitoring a Microsoft OS Failover Cluster

I know, I know-  none of you are using Microsoft Windows.  This is why I get so many questions on this topic and why there is so much interest in a white paper that no one thought I needed to write.  Well, while that pesky ol’ white paper is in review, I’m going to go onto a secondary topic of how to monitor a Microsoft Active/Passive cluster with Enterprise Manager 12c, release 4.

There are some disclaimers I’m going to put in this blog post-

1.  Monitoring a clustered host is not explicitly stated as unsupported with EM12c, but there is also some documentation that I'm still poring over that is a bit contradictory… :)

2.  I am only covering MS Windows 2008 R2 server with this article.  It will be quite simple to cover MS Windows 2012 server, but no guarantees on 2003, as the security model for clustering, etc. is very different in the earlier releases.

3.  This is a blog post.  It’s not a MOS note, it’s not a white paper, so remember, this is just me blogging on how I accomplished this and there is no official support documentation behind this topic.

And so here we go!

The concept of OS-level clustering can be very foreign to DBAs who've spent much of their time with application-level clustering (i.e. RAC/Real Application Clusters), but I believe it is important for us to understand different clustering models, along with their benefits and drawbacks.

Microsoft OS-level clustering in an Active/Passive configuration includes the following basic physical design:

  • Two hosts, identified by a physical name and physical IP Address
  • A Virtual Cluster name, along with Virtual Cluster address.
  • A quorum disk, (similar to a voting disk in RAC)
  • Shared storage

The overall design appears similar to the following when in active/passive mode, with the management agent installed and two databases monitored:

[Diagram: active/passive failover cluster with the management agent on shared storage and two monitored databases]

The most important thing you need to recognize is that no matter what the hosts are named or what their IP addresses are, the databases and the management agent (along with the listener, which I didn't include in my diagram!) are all configured with the VIRTUAL CLUSTER NAME. They do not recognize the host names at all; everything runs on the ACTIVE host and is monitored via the VIRTUAL cluster name.

This is achieved through the same method as for any other Windows host.  I recommend a silent installation, even with EM12c Release 4 (12.1.0.4), using a response file to enter the commands and ensure you are completing all the information as you normally would, but with the cluster information.

1.  The first requirement is your cluster MUST BE STABLE.

If a Windows Server cluster is not stable, don’t proceed.  Correct whatever DNS, Active Directory or shared storage issues exist before proceeding to install Oracle or the agent.  This will save you a lot of headaches.

2.  A failover group should exist to add the management agent to.

If an application failover group doesn’t already exist in the Microsoft Failover Cluster Administration Manager, then you will need to request a virtual host name and virtual IP Address to be used for the Central Access Point, (CAP).  This is the group service that will manage the failover for the databases, etc, (so if databases already exist, this should exist and all you will do is add the agent to it….)

Test out all cluster connectivity via Name Server (nslookup) commands against both physical host names and the virtual cluster name and IP (the bracketed values are placeholders -- substitute your own):

nslookup <physical_host_1>
nslookup <physical_host_2>
nslookup <virtual_cluster_name>
nslookup <virtual_ip_address>

Install Software

The management agent is going to be installed on the shared storage, which means it will only know about the active host in the cluster (disclaimer alert!). To perform the installation, we are going to use a new method involving PsExec with Windows Servers.

Download Agent Software

Check your software library for the correct version of the agent and download it via EM CLI commands:

emcli get_supported_platforms
emcli get_agentimage -destination=G:/swlib/ -platform="Microsoft Windows x64 (64-bit)"

Exit from EM CLI, unzip the file and prep for installation.

Setting up PsExec on the OMS

Create the folder you wish to install to: C:\agent12c

Download the agentDeployPsExec.bat file and the PsExec utility to a folder, C:\psexec, following the instructions in DOC ID 1636851.1.

Create the psexec.rsp response file with the following information (the bracketed values are placeholders for your environment) and save it to the C:\psexec directory with the other files:

HOST_NAMES=<target_host_name>
USER_NAME=<os_user_with_admin_rights>
PASSWORD=<os_user_password>
AGENT_IMAGE_PATH=C:\agent12c_ins
AGENT_BASE_DIR=C:\agent12c
OMS_HOST=<oms_host_name>
EM_UPLOAD_PORT=<upload_port>
AGENT_REGISTRATION_PASSWORD=<agent_registration_password>
PSEXEC_DIR=C:\psexec

If you are unsure of the values for your upload port, etc., run the following on the OMS:

emctl status oms -details

It requires the SYSMAN password and will return all information pertinent to your EM environment.

Once you have this all filled in, you are ready to deploy from the OMS to the 1st node of the OS clustered server.

Run the following:

agentDeployPsExec.bat AGENT_BASE_DIR=<agent_base_dir> RESPONSE_FILE=<path>\psexec.rsp
C:\psexec>agentDeployPsExec.bat PROPERTIES_FILE=C:\psexec\psexec.rsp

C:\psexec>echo off
===================================================================
Agent deployment started on host : host1.us.oracle.com

Creating installation base directory ...

Note:  You must have remote admin privileges on the target host to perform this successfully.  If the account in the response file does NOT have privileges granted to create directories, start remote services and such, it will fail.

It will take some time to deploy the agent; once complete, it will show the following:

Agent deployed successfully.

Duplicate registry and services to second host

Click on Start, Run and type in regedit.exe on the first host of the failover cluster.

Go to HKEY_LOCAL_MACHINE\SOFTWARE\oracle and right click on the folder and choose Export.  Save the registry file in a secure location.

FTP the registry file to the second host.

Log into the second host and double click on the registry file.  It will ask you if you are sure you want to copy the registry key, click OK.

Create the service

The service for the agent must now be duplicated on the second host.  This is OS-level clustering, so no application-level service creation should be performed (emctl, emcli, oradim…).  Use sc commands (Windows Service Control) to create the service.

Open up a command prompt in administrator mode and then duplicate the OracleAgent12c1 service:

The syntax for creating the service is as follows:

sc create <service_name> binPath= "<path_to_agent_executable>" start= auto

By opening up Windows services on the first host, you can go to the OracleAgent12c1 and double click on it to view the values you need for above:

[Screenshot: OracleAgent12c1 service properties in Windows Services, showing the service name and path to executable]

Run the sc command on the second host to create the support service for the installation performed on the first host.

Add the Agent to the Failover Group

In the Server Manager, go to the Failover Management and open up the Failover group:

[Screenshot: Failover Cluster Manager with the failover group opened]

Right click on Add a resource and choose Generic Service.  You can then choose the Agent service listed, (for the example above, Oracleagent12c1Agent) and follow through with the defaults to finish.

The service for the agent is now set to be cluster-aware and will fail over if the first host it was installed on becomes unavailable for some reason.

Failover Validation

The installation is now complete on the shared storage of the failover cluster for the first host, and you've copied over the registry settings and duplicated the service, so you are ready to test the failover and ensure the agent is successful.

There are a couple ways to test the failover:

1.  Reboot the first host-  this will cause a failover.

2.  Right click on the Failover group and click on More Actions, then Simulate failure of this resource.

You should now see the drives and services, including the agent, fail over and start on the second host.  Verify that all services come online and log into the EMCC (EM12c console) to verify uploading occurs to the OMS.  Verify that all the targets you added are showing correctly for the virtual hostname.

Test failover multiple times to both hosts.  If a failure occurs on the second host, check the services, comparing the second host to the first, and/or check the start-up dependencies of your failover group.

Drawbacks

The clear drawback of OS-level clustering through the agents is that only one host is being monitored at a time.  As the targets (MSSQL databases, applications, etc.) being monitored are active on only one host at a time, manual intervention would be required if dual agents were deployed.

A workaround, in the form of a monitoring script that pings the hosts at all times and alerts only if no response is received, provides a second level of host availability monitoring.

I'm also investigating the option of Oracle Fail Safe with EM12c (not currently supported) and the new Partner Agents to see if there are more opportunities for monitoring OS-level clustering.


In-memory limitation

I’ve been struggling to find time to have any interaction with the Oracle community for the last couple of months – partly due to workload, partly due to family matters and (okay, I’ll admit it) I really did have a few days’ holiday this month. So making my comeback with a bang – here’s a quick comment about the 12.1.0.2 in-memory feature, and how it didn’t quite live up to my expectation; but it’s also a comment about assumptions, tests, and inventiveness.

One of the 12.1.0.2 manuals tells us that the optimizer can combine the in-memory columnar storage mechanism with the “traditional” row store mechanisms – unfortunately it turned out that this didn’t mean quite what I had hoped; I had expected too much of the first release. Here’s a quick demo of what doesn’t happen, what I wanted to happen, and how I made it happen, starting with a simple definition (note – this is running 12.1.0.2 and the inmemory_size parameter has been set to enable the feature):


create table t1 nologging
as
select	*
from	all_objects
where	rownum <= 50000
;

alter table t1 inmemory
no inmemory (object_id, object_name)
inmemory memcompress for query low (object_type)
-- all other columns implicitly inmemory default
;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

insert into t1 select * from t1;
commit;

begin
	dbms_stats.gather_table_stats(user, 't1', method_opt=>'for all columns size 1');
end;
/

rem
rem	Needs select on v_$im_column_level granted
rem

select
	table_name,
	column_name,
	inmemory_compression
from
	v$im_column_level
where	owner = user
and	table_name = 'T1'
order by
	segment_column_id
;

explain plan for
select
	last_ddl_time, created
from
	t1
where	t1.created > trunc(sysdate)
and	t1.object_type = 'TABLE'
and	t1.subobject_name is not null
;

select * from table(dbms_xplan.display);

All I’ve done at this point is create a table with most of its columns in-memory and a couple excluded from the columnar store. This is modelling a table with a very large number of columns where most queries are targeted at a relatively small subset of the data; I don’t want to have to store EVERY column in-memory in order to get the benefit of the feature, so I’m prepared to trade lower memory usage in general against slower performance for some queries. The query against v$im_column_level shows me which columns are in-memory, and how they are stored. The call to explain plan and dbms_xplan then shows that a query involving only columns that are declared in-memory could take advantage of the feature. Here’s the resulting execution plan:

-----------------------------------------------------------------------------------
| Id  | Operation                  | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |      |     1 |    27 |    73   (9)| 00:00:01 |
|*  1 |  TABLE ACCESS INMEMORY FULL| T1   |     1 |    27 |    73   (9)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - inmemory("T1"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1"."OBJECT_TYPE"='TABLE' AND "T1"."CREATED">TRUNC(SYSDATE@!))
       filter("T1"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1"."OBJECT_TYPE"='TABLE' AND "T1"."CREATED">TRUNC(SYSDATE@!))

Note that the table access full includes the inmemory keyword; and the predicate section shows the predicates that have taken advantage of in-memory columns. The question is – what happens if I add the object_id column (which I’ve declared as no inmemory) to the select list?  Here’s the resulting plan:


--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    32 |  1818   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |    32 |  1818   (1)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("T1"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1"."OBJECT_TYPE"='TABLE' AND "T1"."CREATED">TRUNC(SYSDATE@!))

There’s simply no sign of an in-memory strategy – it’s just a normal full tablescan (and I didn’t stop with execution plans, of course, I ran other tests with tracing, snapshots of dynamic performance views etc. to check what was actually happening at run-time).
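
A quick sanity check along those lines – assuming the 12.1.0.2 “IM scan%” session statistics – is to query v$mystat immediately after running the statement and see whether any of the in-memory scan statistics have moved:

select
	sn.name, ms.value
from
	v$mystat	ms,
	v$statname	sn
where
	ms.statistic# = sn.statistic#
and	sn.name like 'IM scan%'
and	ms.value > 0
order by
	sn.name
;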

In principle there’s no reason why Oracle couldn’t use the in-memory columns that appear in the where clause to determine the rowids of the rows that I need to select and then visit the rows by rowid but (at present) the optimizer doesn’t generate a plan to do that. There’s no reason, though, why we couldn’t try to manipulate the SQL to produce exactly that effect:


explain plan for
select
        /*+ no_eliminate_join(t1b) no_eliminate_join(t1a) */
        t1b.object_id, t1b.last_ddl_time, t1b.created
from
        t1 t1a, t1 t1b
where   t1a.created > trunc(sysdate)
and     t1a.object_type = 'TABLE'
and     t1a.subobject_name is not null
and     t1b.rowid = t1a.rowid
;

select * from table(dbms_xplan.display);

------------------------------------------------------------------------------------
| Id  | Operation                   | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |      |     1 |    64 |    74   (9)| 00:00:01 |
|   1 |  NESTED LOOPS               |      |     1 |    64 |    74   (9)| 00:00:01 |
|*  2 |   TABLE ACCESS INMEMORY FULL| T1   |     1 |    31 |    73   (9)| 00:00:01 |
|   3 |   TABLE ACCESS BY USER ROWID| T1   |     1 |    33 |     1   (0)| 00:00:01 |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - inmemory("T1A"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1A"."OBJECT_TYPE"='TABLE' AND "T1A"."CREATED">TRUNC(SYSDATE@!))
       filter("T1A"."SUBOBJECT_NAME" IS NOT NULL AND
              "T1A"."OBJECT_TYPE"='TABLE' AND "T1A"."CREATED">TRUNC(SYSDATE@!))

I’ve joined the table to itself by rowid, hinting to stop the optimizer from getting too clever and eliminating the join. In the join I’ve ensured that one reference to the table can be met completely from the in-memory columns, isolating the no inmemory columns to the second reference to the table. It is significant that the in-memory tablescan is vastly lower in cost than the traditional tablescan – and there will be occasions when this difference (combined with the knowledge that the target is a relatively small number of rows) means that this is a very sensible strategy. Note – the hints I’ve used happen to be sufficient to demonstrate the method, but I’d be much more thorough in a production system (possibly using an SQL baseline to fix the execution plan).

Of course, this method is just another example of the “visit a table twice to improve the efficiency” strategy that I wrote about a long time ago; and it’s this particular variant of the strategy that allows you to think of the in-memory columnar option as an implementation of OLTP bitmap indexes.

Oracle Database 12c In-Memory Feature – Part V. You Can’t Use It If It’s Not “Enabled.” Not Being Able To Use A Feature Is An Important “Feature.”

This is part 5 in a series: Part I, Part II, Part III, Part IV, Part V.

Synopsis

This blog post is the last word on the matter.

Enabled?  It’s About Usage!

You don’t get charged for Oracle feature usage unless you use the feature. So why does Oracle inconsistently use the word enabled when we care about usage? If enabled precedes usage then enabled is a sanctified term. Please read on…

It’s All About Getting The Last Word? No, It’s About Taking Care Of Customers.

On August 6, 2014  Oracle shared their last word and official statement on the matter of bug-ridden tracking of the Oracle Database 12c  In-Memory feature usage in a quote to the press at CBR. I’ll paraphrase first and then quote the article. Here is what I hear when I read the words of Oracle’s spokesman:

Yeah, my bad, we have a bug. The defective code erroneously tracks feature usage for an Enterprise Edition additional cost option priced at $23,000 per processor core. Don’t worry. When we track this particular feature usage we’ll ignore it should you be audited. You have our spoken word that we’ll just shine this one on. Here, let me trade a few confusing words about usage without using the word enabled or disabled since those are taboo.

My paraphrase probably draws a more serene picture than the visions of tip-toeing and side-stepping conjured up by the following words I’ll quote from the CBR article. Bear in mind the fact that the bug spoken of in the quote is 19308780–a bug, by the way, that is not readable by maintenance contract holders. Now I’ll quote the article:

Recording that the In-Memory option is in use in this case is a bug and we will fix it in the first patchset update coming in October.

Yes, we knew it was a bug. I merely had to do the hard work of getting Oracle to acknowledge it. The article continued with the following quote. Please ignore the fact that Oracle’s spokesman refers to me by my first name. Focus instead on the fact that throughout parts 1 through 4 in my series I suffered erroneous feature usage reporting because of a bug (software defect). I quote:

Kevin initially claimed that feature tracking could report In-Memory usage, and therefore impact licensing, without the end-user doing anything. This was and is still not the case. Customer licensing of Oracle Database In-Memory is not impacted by the bug that Maria notes in her blog. When an end-user explicitly undertakes actions to set the INMEMORY attribute on a table but the In-Memory column store has not been allocated (by setting the inmemory_size parameter to a non zero value), the bug results in feature tracking incorrectly reporting In-Memory ‘in use’. However as no column store has been allocated, the feature is not in use and therefore there is no licensing impact.

 

Ah yes. The old “it’s not in use but it reports it’s in use” situation. That could have been conveyed in very short sentences… could have.

Since the bug spoken of in the above quote is not visible to contract holders, I’m just going to let you mull over the circular logic.  This whole situation could be a lot simpler if Oracle would either a) make the bug description visible to contract holders, so customers know what is broken and how to test whether it got fixed when the patch is eventually applied, and/or b) add this defect to MOS 1309070.1, which tracks all the other bugs in feature usage reporting. Yes, indeed, there are other bugs of this sort with other features. All software has bugs.
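
For reference, the reporting being argued about surfaces through DBA_FEATURE_USAGE_STATISTICS. Here is a quick illustrative query – the 'In-Memory%' name pattern is an assumption on my part, so adjust to taste:

select name, version, detected_usages, currently_used
from dba_feature_usage_statistics
where name like 'In-Memory%';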

Last Word On The Matter

My last word on the matter has to do with the fact that the feature cannot be unlinked. It is a very expensive–and very useful, important–feature. As I pointed out in Part II, the feature cannot be absolutely disabled at the executable level, as is the case for other high-cost options like Real Application Clusters and Partitioning.  I think Oracle is trying to tell us it is impossible computer science to make it an unlinkable feature–at least that’s how I interpret the following words in a blog post at Oracle.com:

Oracle Database In-Memory is not a bolt on technology to the Oracle Database. It has been seamlessly integrated into the core of the database as a new component of the Shared Global Area (SGA). When the Oracle Database is installed, Oracle Database In-Memory is installed. They are one and the same. You can’t unlink it or choose not to install it.

Now maybe this is not saying there is no way to code the feature as unlinkable. Maybe it’s saying the choice was made to not make it unlinkable. I don’t know. If, however, we are to believe that the mere fact the feature uses the SGA makes it some sort of atomic-level symbiotic parasite, well, that argument doesn’t hold water. Indeed, Real Application Clusters is massively integrated with the SGA. Ever heard of Cache Fusion? With Cache Fusion data blocks get shuttled from one SGA to another across hosts in a cluster! And yet Real Application Clusters is unlinkable–so the notion that such a feature can’t be made unlinkable is unthinkable!

 

What Is Unlinkable Anyway

There might be folks who don’t know what we mean when we say a feature is unlinkable. This doesn’t mean all the code for the feature is yanked out of the binary. It simply means that a single binary object–or perhaps a few–that enables the feature is linked into the Oracle executable. If unlinked, there is absolutely no way to use the feature–as is the case with, for instance, Real Application Clusters.
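
The linked state of such options is even visible from SQL – a quick illustrative check against the standard V$OPTION view:

select parameter, value
from v$option
where parameter in ('Real Application Clusters', 'Partitioning');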

And not being able to use the feature is an important feature!

So let’s ponder the insurmountable computer science that must surely be involved in implementing the In-Memory Column Store feature as unlinkable.

Oracle has told us the INMEMORY_SIZE initialization parameter is the on/off button for the feature. That means there is a single, central on/off button that is, indeed, able to be manipulated even by the user. Can you imagine how difficult it must be to implement a global variable–even a simple boolean–that gets linked in and checked when one boots the database? Not hard to grasp. What if the variable had a silly name like inmemory_deactivated? What if the feature activation module–let’s call it inmem.o–had inmemory_deactivated=TRUE but an alternate module called inmemON.o had inmemory_deactivated=FALSE? In much the same way we relink Real Application Clusters, the link scripts manipulate the file name so that the default (with the feature deactivated) gets replaced with the activated module–only if the user wants the possibility of using the feature. How would all this deep, dark, complex code come together? Well, when the database instance is booted, inmemory_deactivated is evaluated and, regardless of the user’s setting of INMEMORY_SIZE, the In-Memory feature is really, truly disabled–and most importantly not usable. No possibility for confusion. Would that be better than a game of Licensed-Feature Usage Prevention Twister(tm)?

[Image: 1966 Twister game box cover]

Intensely Deep Engineering Difficulty

Now, imagine that. We didn’t even have to use the back of a cocktail napkin to draw out a solution to the mysteries behind how utterly unlinkable the In-Memory Database feature must surely be. We simply a) drew upon our understanding of other SGA-integrated features like Real Application Clusters, b) recalled how unlinking works for other features, and c) drew upon our basic-level understanding of the C programming language vis-a-vis global variables and object linking.

Let me summarize all that: There is a single user-modifiable boot-time parameter that disables In-Memory Database as per Oracle’s blog and spokesman assertions. Um, that’s a pretty simple focal point to make the feature unlinkable.

Summary

Yes, Oracle could implement a method for making the In-Memory Column Store feature an unlinkable option just like they did for Real Application Clusters. I can only imagine why they chose not to (visions of USD $23,000 per processor core).

Filed under: oracle

Exadata Zone Maps

Just a quick post on a new Exadata feature called Zone Maps. They’re similar to storage indexes on Exadata, but with more control (you can define the columns and how the data is refreshed, for example). People have complained for years that storage indexes provided no control mechanisms, but now we have a way to exert our God-given rights as DBAs to control yet another aspect of the database. Here’s a link to the 12.1.0.2 documentation, which resides in the Data Warehousing Guide: Zone Map Documentation

Zone Maps are restricted to Exadata storage by the way (well probably they work on ZFS and Pillar too). Have a look at the Oracle error messages file:

 
>grep -i "storage type" $ORACLE_HOME/rdbms/mesg/oraus.msg | grep -i "not supported"
 
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:31969, 00000, "ZONEMAP not supported for table stored in tablespace of this storage type"
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:64307, 00000, " Exadata Hybrid Columnar Compression is not supported for tablespaces on this storage type" 
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:64309, 00000, " Hybrid Columnar Compression with row-level locking is not supported for tablespaces on this storage type."
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:65425, 00000, "CLUSTERING clause not supported for table stored in tablespace of this storage type"
/u01/app/oracle/product/12.1.0.2/dbhome_1/rdbms/mesg/oraus.msg:65451, 00000, "Advanced index compression is not supported for tablespaces on this storage type."

So according to the messages file, there are a handful of features that are restricted in this fashion (Zone Maps, HCC, Attribute Clustering and Advanced Index Compression).

As a bit of totally irrelevant history, zone maps were actually included in the 12.1.0.1 release, but the documentation on them was removed. So they worked, but they were undocumented.

Here’s an example on a 12.1.0.1 DB on a non-Exadata platform.

 
SQL*Plus: Release 12.1.0.1.0 Production on Wed Aug 13 15:41:46 2014
 
Copyright (c) 1982, 2013, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options
 
 
INSTANCE_NAME    STARTUP_TIME      CURRENT_TIME         DAYS    SECONDS
---------------- ----------------- ----------------- ------- ----------
LAB1211          13-AUG-2014 09:54 13-AUG-2014 15:41     .24      20820
 
SYS@LAB1211> create table kso.junk1 (col1 number, col2 number) clustering by linear order (col1,col2);
create table kso.junk1 (col1 number, col2 number) clustering by linear order (col1,col2)
*
ERROR at line 1:
ORA-65425: CLUSTERING clause not supported for table stored in tablespace of this storage type
 
 
SYS@LAB1211> create table kso.junk1 (col1 number, col2 number);
 
Table created.
 
SYS@LAB1211> create materialized zonemap skew_zonemap on kso.junk1(col1);
create materialized zonemap skew_zonemap on kso.junk1(col1)
                                                          *
ERROR at line 1:
ORA-31969: ZONEMAP not supported for table stored in tablespace of this storage type

Note that both zone maps and attribute clustering were disallowed with the “not supported for table stored in tablespace of this storage type” error message.

By the way, attribute clustering is another interesting new feature of 12c that allows you to declaratively instruct Oracle to store data on disk in sorted order. This physical ordering can be a big benefit for storage indexes or zone maps (or any B-tree index where clustering factor is important, for that matter). Oracle’s new In-Memory column store also has a min/max pruning feature (storage indexes), which means physical ordering on disk is important with that feature as well.

Anyway, here’s a link to the 12.1.0.2 documentation on attribute clustering which also resides in the Data Warehousing Guide: Attribute Clustering Documentation

And here’s another example using 12.1.0.2 on an Exadata.

 
SQL*Plus: Release 12.1.0.2.0 Production on Wed Aug 13 15:42:18 2014
 
Copyright (c) 1982, 2014, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Advanced Analytics
and Real Application Testing options
 
 
INSTANCE_NAME    STARTUP_TIME      CURRENT_TIME         DAYS    SECONDS
---------------- ----------------- ----------------- ------- ----------
INMEM            24-JUL-2014 18:35 13-AUG-2014 15:42   19.88    1717600
 
Elapsed: 00:00:00.00
SYS@INMEM> @test_zonemap
SYS@INMEM> create table kso.junk1 (col1 number, col2 number) clustering by linear order (col1,col2);
 
Table created.
 
Elapsed: 00:00:00.01
SYS@INMEM> create materialized zonemap skew_zonemap on kso.junk1(col1);
 
Materialized zonemap created.
 
Elapsed: 00:00:00.15
SYS@INMEM> 
SYS@INMEM> -- so as expected, we're able to create an attribute clustered table and a zone map on Exadata
SYS@INMEM> 
SYS@INMEM> -- Let's try creating a tablespace that is not on Exa storage (even though the DB is on EXA platform)
SYS@INMEM> 
SYS@INMEM> create tablespace KSO_NON_EXA datafile '/home/oracle/KSO_NON_EXA.dbf' size 100M;
 
Tablespace created.
 
Elapsed: 00:00:00.38
SYS@INMEM> @tablespaces
 
TABLESPACE_NAME STATUS    CONTENTS  LOGGING   EXTENT_MGT ALLOC_TYP SPACE_MGT BLOCK_SIZE PREDICA
--------------- --------- --------- --------- ---------- --------- --------- ---------- -------
CLASS_DATA      ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    AUTO            8192 STORAGE
EXAMPLE         ONLINE    PERMANENT NOLOGGING LOCAL      SYSTEM    AUTO            8192 STORAGE
KSO_NON_EXA     ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    AUTO            8192 HOST      <=== 
SYSAUX          ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    AUTO            8192 STORAGE
SYSTEM          ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    MANUAL          8192 STORAGE
TEMP            ONLINE    TEMPORARY NOLOGGING LOCAL      UNIFORM   MANUAL          8192 STORAGE
UNDOTBS1        ONLINE    UNDO      LOGGING   LOCAL      SYSTEM    MANUAL          8192 STORAGE
USERS           ONLINE    PERMANENT LOGGING   LOCAL      SYSTEM    AUTO            8192 STORAGE
 
8 rows selected.
 
Elapsed: 00:00:00.02
SYS@INMEM> 
SYS@INMEM> -- note that tablespace KSO_NON_EXA is on local disk, not Exadata storage servers, so PREDICATE_EVALUATION is set to HOST.
SYS@INMEM> 
SYS@INMEM> drop table kso.junk1;
 
Table dropped.
 
Elapsed: 00:00:00.01
SYS@INMEM> create table kso.junk1 (col1 number, col2 number) clustering by linear order (col1,col2) tablespace kso_non_exa;
 
Table created.
 
Elapsed: 00:00:00.01
SYS@INMEM> select owner, table_name, tablespace_name from dba_tables where table_name like 'JUNK1';
 
OWNER                TABLE_NAME                     TABLESPACE_NAME
-------------------- ------------------------------ ---------------
KSO                  JUNK1                          KSO_NON_EXA
 
Elapsed: 00:00:00.01
SYS@INMEM> 
SYS@INMEM> -- wow - that's a bit of a surprise, clustered table create worked on non-Exa storage
SYS@INMEM> -- maybe the check is done on some other level than the tablespace
SYS@INMEM> 
SYS@INMEM> create materialized zonemap skew_zonemap on kso.junk1(col1);
create materialized zonemap skew_zonemap on kso.junk1(col1)
                                                          *
ERROR at line 1:
ORA-31969: ZONEMAP not supported for table stored in tablespace of this storage type
 
 
Elapsed: 00:00:00.00

So as you can see, attempting to create the zone map on non-Exa storage failed as expected. But I was able to create a clustered table on non-Exa storage, which is a little weird. So while the error message for attribute clustering exists in the messages file, it doesn’t appear that there is a check in the code, at least at the tablespace level. I don’t have a 12.1.0.2 install on a non-Exadata platform at the moment to test it out, but if you do, please let me know.

That’s it for now. I hope to do some more detailed posts on In-Memory, Zone Maps, Attribute Clustering in the near future. As always, your comments are welcomed.