Oakies Blog Aggregator

Creating a RAC 12.1 Data Guard Physical Standby environment (3b)

Huh, what is this I hear you ask? Part 3b? Oracle 12.1? Well, there’s a bit of a story to this post. Back in December 2016 I started to write a series of blog posts (part 1 | part 2 | part 3 | part 4) about how I created a standby database on RAC 12.1. For some reason I forgot to post this part. Up until now the step where I am creating the broker configuration was missing. Thanks to a friend I discovered my mistake. I decided to post this article to complete the series. There’s a pretty big logical gap that needed filling :)

I also deliberately decided against making changes to my notes, so this is written in the same style as the articles preceding this one. Also bear in mind that this is Oracle 12.1!

NOTE: As always, this is just a demonstration using VMs in my lab, based on my notes. Your system is most likely different, so in real-life you might take a different approach. The techniques I am using here were suitable for me, and my own small-scale testing. I tried to make sure they are valid, but you may want to allocate more resources in your environment. Test, test, test on your own environment on test kit first!

Here is the original blog post

In a last-minute decision I decided to split part 3 in half: first the database duplication, then the Data Guard configuration. It really got a bit too long (even for my standards) and would have put readers off the article. I appreciate your patience though…

What’s the situation?

At the end of part 3 I had a physical standby database ready for use, including its registration in the OCR. It was looking promising, but there was more work to do.

At the moment, the standby database is out of sync with the primary, a fact that needs to be rectified by creating a Data Guard configuration. I prefer to do this on the command line. If you are so inclined you can of course push the mouse around and do it with OEM.

Step 1: Data Guard Broker pre-requisites

Before creating the configuration you should change the location of the broker configuration files to shared storage. This needs to be done on each side – primary and every standby – before you can start the broker.

SQL> alter system set dg_broker_config_file1='+DATA/NCDBA/dr1NCDBA.dat' scope=both sid='*';

System altered.

SQL> alter system set dg_broker_config_file2='+DATA/NCDBA/dr2NCDBA.dat' scope=both sid='*';

System altered.

SQL> show parameter dg_broker

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
dg_broker_config_file1               string      +DATA/NCDBA/dr1NCDBA.dat
dg_broker_config_file2               string      +DATA/NCDBA/dr2NCDBA.dat
dg_broker_start                      boolean     FALSE

SQL> alter system set dg_broker_start=true scope=both sid='*';

System altered.

SQL> 

I had to repeat the step on NCDBB, of course with adapted paths and file names.

Step 2: Creating a Data Guard Broker configuration

Once that’s done, you create the configuration using dgmgrl, the Data Guard broker command-line interface. I named my configuration RACTEST and added both databases. Here is an example of my session:

[oracle@rac12pri1 ~]$ dgmgrl
DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production

Copyright (c) 2000, 2013, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys@ncdba
Password:
Connected as SYSDBA.
DGMGRL> create configuration ractest as primary database is 'NCDBA' connect identifier is 'NCDBA';
Configuration "ractest" created with primary database "NCDBA"
DGMGRL> add database 'NCDBB' as connect identifier is 'NCDBB';
Database "NCDBB" added
DGMGRL> show configuration

Configuration - ractest

  Protection Mode: MaxPerformance
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
DISABLED

As a first step I connect as SYSDBA before creating the configuration. You can usually type help <command> to remind yourself of the exact syntax. After the configuration is created you can start adding (standby) databases. In my case there is only one – physical standby – database, but you can think of environments with more than one standby, although that is less common. The connect identifiers must be valid Oracle Net (TNS) names resolvable via the default tnsnames.ora (the one in the RDBMS home).
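For reference, this is roughly what the corresponding tnsnames.ora entries could look like. The SCAN host names below are assumptions for my lab setup; yours will differ:

```
NCDBA =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12pri-scan)(PORT = 1521))
    (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = NCDBA))
  )

NCDBB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac12sec-scan)(PORT = 1521))
    (CONNECT_DATA = (SERVER = DEDICATED)(SERVICE_NAME = NCDBB))
  )
```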

At this point it is important to set standby_file_management to auto. Otherwise managed recovery can abort, and you will have to do extra work managing data files that could easily have been avoided. The broker syntax to do so involves changing database properties (type show database verbose 'NCDBA' to list all of them). Many Data Guard broker properties are not easily mapped back to their corresponding initialisation parameters. In the case of standby_file_management it is not hard, and the documentation helps with the rest.

Eventually I enable the configuration and check the status. I have found that it can take a couple of minutes at times to get the SUCCESS message.

DGMGRL> edit database 'NCDBA' set property standbyfilemanagement=auto;
Property "standbyfilemanagement" updated
DGMGRL> edit database 'NCDBB' set property standbyfilemanagement=auto;
Property "standbyfilemanagement" updated
DGMGRL> enable configuration
Enabled.
DGMGRL> show configuration

Configuration - ractest

  Protection Mode: MaxPerformance
  Members:
  NCDBA - Primary database
    NCDBB - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 21 seconds ago)

There are other things you must consider in accordance with your company’s (data protection/availability/business continuity/…) standards and other documentation – again my example is rather basic, but deliberately so. It’s my lab playground after all.

For the sake of completeness, here is the current configuration of NCDBA and NCDBB:

DGMGRL> show database 'NCDBA'

Database - NCDBA

  Role:               PRIMARY
  Intended State:     TRANSPORT-ON
  Instance(s):
    NCDBA1
    NCDBA2

Database Status:
SUCCESS

DGMGRL> show database 'NCDBB';

Database - NCDBB

  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 1 second ago)
  Apply Lag:          0 seconds (computed 1 second ago)
  Average Apply Rate: 223.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    NCDBB1
    NCDBB2 (apply instance)

Database Status:
SUCCESS

DGMGRL> exit

Please note that the broker detected the fact that my systems are cluster databases and lists both instances for each database. It also shows that NCDBB2 is the apply instance. I was positively surprised about the broker’s ability to detect both instances automatically. I seem to remember cases in earlier releases where said instance detection occasionally required a little nudging to get it right.

This example is deliberately kept as minimalist as possible: it works. Most likely it won’t satisfy your requirements. Review your Data Guard (and any other relevant) standards and change the configuration accordingly.

Summary

… back to present time :) This post completes the series, all my notes are now transcribed to the blog. I hope you find this useful.

back to part 3 | on to part 4

How much free space can be reclaimed from a segment?

By Franck Pachot

Do you have the feeling that your table takes more blocks than it should? Here are the queries I use to quickly check the free space. The idea is to call DBMS_SPACE.SPACE_USAGE and infer the minimum free space from the percentages. For example, a block in FS3 (defined as having at least 50 to 75% free space) is guaranteed to have at least 50% free space. Of course it can have more, but you don’t know.
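To make the weighting explicit, here is the same arithmetic in a few lines of Python with made-up bucket sizes (the real numbers come from DBMS_SPACE.SPACE_USAGE; the weights mirror the formula used in the queries below):

```python
# Lower-bound free space from DBMS_SPACE.SPACE_USAGE byte counts.
# Weights mirror the formula used in this post: unformatted and FS1 bytes
# counted as-is, FS2/FS3/FS4 at their minimum free percentage (25%, 50%, 75%).
def min_free_bytes(unfb, fs1b, fs2b, fs3b, fs4b):
    return unfb + fs1b + fs2b * 0.25 + fs3b * 0.5 + fs4b * 0.75

# hypothetical bucket sizes in bytes (8 KB blocks), for illustration only
free = min_free_bytes(unfb=0, fs1b=0, fs2b=100 * 8192, fs3b=50 * 8192, fs4b=10 * 8192)
print(round(free / 1024 / 1024, 3), "MB")  # 0.449 MB at least; actual free space can be higher
```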

Here is some PL/SQL to do so:

set serveroutput on
declare
  unf number; unfb number; fs1 number; fs1b number; fs2 number; fs2b number;
  fs3 number; fs3b number; fs4 number; fs4b number; full number; fullb number;
begin
  for i in (
    select * from (
      select * from dba_segments
      where segment_subtype = 'ASSM'
        and segment_type in ('TABLE','TABLE PARTITION','TABLE SUBPARTITION',
                             'CLUSTER','LOB','LOB PARTITION','LOB SUBPARTITION')
      order by bytes desc
    ) where 10 >= rownum
  ) loop
    begin
      dbms_space.space_usage(i.owner, i.segment_name, i.segment_type,
        unf, unfb, fs1, fs1b, fs2, fs2b, fs3, fs3b, fs4, fs4b, full, fullb,
        partition_name => i.partition_name);
      dbms_output.put_line(to_char((unfb+fs1b+fs2b*0.25+fs3b*0.5+fs4b*0.75)/1024/1024/1024,'999G999D999')
        ||' GB free in '||i.segment_type||' "'||i.owner||'"."'||i.segment_name||'" partition "'||i.partition_name||'"');
    exception
      when others then
        dbms_output.put_line(i.segment_type||' "'||i.owner||'"."'||i.segment_name
          ||'" partition "'||i.partition_name||'": '||sqlerrm);
    end;
  end loop;
end;
/

The output looks like:

.001 GB free in INDEX "DEMO"."ACCOUNT_PK" partition ""
.001 GB free in TABLE "APEX_040200"."WWV_FLOW_PAGE_PLUGS" partition ""
.009 GB free in TABLE "SCOTT"."DEMO" partition ""
.000 GB free in TABLE "APEX_040200"."WWV_FLOW_STEP_ITEMS" partition ""
.003 GB free in INDEX "SYS"."WRH$_SYSMETRIC_HISTORY_INDEX" partition ""
.000 GB free in TABLE "MDSYS"."SDO_CS_SRS" partition ""
.002 GB free in INDEX "SYS"."I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST" partition ""
.006 GB free in TABLE "SYS"."WRH$_SYSMETRIC_HISTORY" partition ""
.002 GB free in TABLE "SYS"."WRH$_SQL_PLAN" partition ""

If you are in 12c, an inline function in the query might come in handy:

with function freebytes(
  segment_owner varchar2, segment_name varchar2,
  segment_type varchar2, partition_name varchar2
) return number as
  unf number; unfb number; fs1 number; fs1b number; fs2 number; fs2b number;
  fs3 number; fs3b number; fs4 number; fs4b number; full number; fullb number;
begin
  dbms_space.space_usage(segment_owner, segment_name, segment_type,
    unf, unfb, fs1, fs1b, fs2, fs2b, fs3, fs3b, fs4, fs4b, full, fullb,
    partition_name => partition_name);
  return unfb + fs1b + fs2b*0.25 + fs3b*0.5 + fs4b*0.75;
end;
select round(freebytes(owner,segment_name,segment_type,partition_name)/1024/1024/1024,3) free_GB,
       segment_type, owner, segment_name, partition_name
from dba_segments
where segment_subtype = 'ASSM'
  and segment_type in ('TABLE','TABLE PARTITION','TABLE SUBPARTITION',
                       'CLUSTER','LOB','LOB PARTITION','LOB SUBPARTITION')
order by bytes desc fetch first 10 rows only
/

The result looks like:

FREE_GB SEGMENT_TYPE OWNER       SEGMENT_NAME                   PARTITION_NAME
------- ------------ ----------- ------------------------------ --------------
      0 TABLE        DEMO        ACCOUNTS
  0.001 INDEX        DEMO        ACCOUNT_PK
  0.001 TABLE        APEX_040200 WWV_FLOW_PAGE_PLUGS
  0.009 TABLE        SCOTT       DEMO
  0.003 INDEX        SYS         WRH$_SYSMETRIC_HISTORY_INDEX
      0 TABLE        APEX_040200 WWV_FLOW_STEP_ITEMS
  0.002 INDEX        SYS         I_WRI$_OPTSTAT_H_OBJ#_ICOL#_ST
      0 TABLE        MDSYS       SDO_CS_SRS
  0.006 TABLE        SYS         WRH$_SYSMETRIC_HISTORY
  0.002 TABLE        SYS         WRH$_SQL_PLAN

Future evolution will be published on GitHub:
https://raw.githubusercontent.com/FranckPachot/scripts/master/administration/segment_free_space_plsql.sql
https://raw.githubusercontent.com/FranckPachot/scripts/master/administration/segment_free_space_sql.sql

Note that having free space does not mean that you have to shrink or reorg. Try to understand what happened to your data before, and whether this space will be reused soon.

Update 8-AUG-2018

In the initial post I added all segment types accepted by the dbms_space documentation but finally removed ‘INDEX’,’INDEX PARTITION’,’INDEX SUBPARTITION’ because the meaning of the output is completely different. See Jonathan Lewis note about it: https://jonathanlewis.wordpress.com/2013/12/17/dbms_space_usage/

 

This article How much free space can be reclaimed from a segment? first appeared on Blog dbi services.

Conferences 2018

Here are the conferences where I will be speaking at the end of this year:


Join Methods: Nested Loop, Hash, Sort, Merge, Adaptive

This is an explanation of the join methods, with live demos on very simple cases, reading execution plan statistics, looking at how it is executed. And how to improve the query performance: hints, partitioning, indexing. The goal is to understand, with modern tracing tools (dbms_xplan and SQL Monitoring), the basic core stuff that is involved in all query tuning: the join methods, the estimations that drive the choice between full scan and index access, the transformations that avoid unnecessary joins.

  • POUG 2018 September 07, 2018

POUG 2018

  • DOAG 2018, November 20–23, 2018

https://programm.doag.org/doag/2018/#/scheduledEvent/565962


Multitenant Security Features Clarify DBA Role in DevOps Cloud

The role of the DBA changes with the cloud and autonomous managed services. The multitenant architecture is the foundation for these new roles. The container database/pluggable database (CDB/PDB) paradigm enforces role segregation between the container DBA and the application. In a DevOps environment, you need the finer access privilege definition that comes in Oracle Database 12.1 (common/local users) and Oracle Database 12.2 (lockdown profiles, PDB isolation), and that was improved in Oracle Database 18c. In this session learn to administer a cloud or on-premises database and give PDB administration rights to application teams. See how to grant database access to a developer on a PDB with Oracle Database 18c. Discover the new features, see a demo, and learn best practices.

  • Oracle Open World October 22–25, 2018 San Francisco, CA

Session Catalog | Oracle OpenWorld 2018


Microservices: Get Rid of Your DBA and Send the DB into Burnout

The database is always a problem: it is a single point of failure for applications; it makes continuous integration/delivery impossible; the DBA is a drag on your agile developments; and any interaction with the database is done with unmaintainable, complex, old-style SQL queries. Those are the reasons why modern architects want to get all code out of the database and even move the data out to caches, local to CQRS microservices. But there are drawbacks to this approach, and doing it incorrectly may reveal its worst side once implemented. This session focuses on the antipatterns and shows, with facts and measurements, what can go wrong in the implementation. This knowledge may help you design your applications with scalability in mind.

  • Oracle Code One October 22–25, 2018 San Francisco, CA

Session Catalog | Oracle Code One 2018


Oracle Database on Docker: Lessons Learned

Developers need to quickly get a database up and running without wasting time waiting for the DBA or doing the whole installation themselves. Docker containers are perfect for that. But there are two big problems with Oracle Database. First, the default installation is huge, and you do not want a 20GB image to start a container. Second, the database must persist beyond the container lifecycle, which means we need to create it in an external volume. But do you accept a 15-minute database creation at ‘docker run’?

There can be different solutions to address different needs: a small database to run on a laptop, a large number of databases running on OpenShift, quick continuous integration ephemeral databases…

This session will show some ideas to run Oracle Database in a container in an efficient way. With some tuning and tweaks, it is possible to provide a small image that can run in seconds. And is that different from, or complementary to, multitenant containers?

  • UKOUG Tech18, Liverpool, 04/12/2018, 11:40–12:25

UKOUG Technology Conference & Exhibition 2018

CQRS, Event Sourcing and the Oracle Database

By Franck Pachot

This blog post relates my thoughts when reading about Command Query Responsibility Segregation and Event Sourcing, in the context of the Oracle Database (but it can probably apply to any database). We see those terms in new software architecture diagrams, but they are actually quite old:

Command-Query separation

Command-Query separation was defined by Bertrand Meyer 15 years ago, not for the database but for the Eiffel language. See page 22-44 of Eiffel: a language for software engineering.

This basically states that a function that returns a result does not change anything (“asking a question should not change the answer”). In PL/SQL you can do whatever you want with FUNCTION and PROCEDURE: a PROCEDURE can return values with OUT parameters and a FUNCTION can modify some data. However, ‘you can’ does not mean that you should. It is easy to define coding rules where a FUNCTION does not change anything and where a PROCEDURE implements what Bertrand Meyer calls a ‘command’.

In SQL this is clearly enforced. A SELECT is for queries and cannot change any data. Even if you call a function that tries to change some data you will encounter:

ORA-14551: cannot perform a DML operation inside a query
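The same coding rule can be illustrated outside PL/SQL as well; here is a minimal Python sketch of command-query separation (the class and names are invented, purely for illustration):

```python
class Account:
    """Command-query separation: queries return a value and change nothing;
    commands change state and return nothing."""

    def __init__(self, balance=0):
        self._balance = balance

    def balance(self):
        # query: "asking a question should not change the answer"
        return self._balance

    def deposit(self, amount):
        # command: modifies state, returns no result
        self._balance += amount

a = Account()
a.deposit(100)
print(a.balance())  # prints 100, and asking again gives the same answer
```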

Object Oriented data modeling

This concept came from Eiffel, an object-oriented programming language. The Object Oriented approach was designed for transient data structures: stored in memory (the object identity is its address in memory) and usually not shared (the representation of the same data is a different object in different systems). But because the modularity and encapsulation aspects of the OO approach are great, it has been extended to data, the persistent data that we store in the database. Before, the functions and the data model were analyzed separately. This produced monolithic applications that were difficult to maintain and evolve. The Object Oriented approach helps to analyze the subsystems separately.

Create Retrieve Update Delete

However, because the objects were the business objects and not the use cases or services, software architects came up with the simplified idea that all interaction with the database is CRUD: you Create (INSERT), Retrieve (SELECT), Update (UPDATE) or Delete (DELETE). You then define an object (mapped to a table) and those 4 CRUD methods (mapped to 4 SQL statements), and you can do everything. And because some ORM frameworks give the possibility to ignore how the data is stored and shared in the database, a new generation of developers was working with data sets as if they were objects in memory.

And this is completely wrong, because:

  • one object maps to only one row, and the CRUD approach exposes no method for bulk updates – those where the database can be the most efficient
  • the CRUD methods read and update all table columns – ignoring all efficient access paths to a subset of columns
  • mapping a transient object identifier (address in memory) to a shared persistent identifier (primary key) is not easy
  • and finally, all those CRUD operations belong to an object when they should belong to use-cases and transactions
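The first two points can be made concrete with a toy model (invented names and sizes, just to count round trips): updating 1,000 rows through a row-at-a-time CRUD method costs 1,000 calls, while a single set-based command costs one.

```python
class FakeDatabase:
    """Toy store that counts client/server round trips."""

    def __init__(self, rows):
        self.rows = rows          # primary key -> column value
        self.roundtrips = 0

    def update_one(self, pk, value):
        # CRUD style: one round trip per row
        self.roundtrips += 1
        self.rows[pk] = value

    def update_where(self, predicate, value):
        # set-based: one round trip for all matching rows
        self.roundtrips += 1
        for pk in [k for k, v in self.rows.items() if predicate(v)]:
            self.rows[pk] = value

crud = FakeDatabase({pk: 0 for pk in range(1000)})
for pk in list(crud.rows):
    crud.update_one(pk, 1)

bulk = FakeDatabase({pk: 0 for pk in range(1000)})
bulk.update_where(lambda v: v == 0, 1)

print(crud.roundtrips, bulk.roundtrips)  # 1000 vs 1 for the same result
```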

The last point is more obvious when comparing the queries (the R in CRUD) with the inserts (the C in CRUD). In an information system, you usually use (query, retrieve) the data in a completely different way than it was entered (inserted). Let’s take an example in an Order Entry system. When the customer starts with an order, it is probably the responsibility of the Orders object. But then it will be processed for delivery, then queried by the Customer object, and finally by Products for sales analysis. Grouping all that in the same object is easier for the Object-Oriented analysis, for the modeling of the domain objects. But keeping this grouping for the software implementation is a big source of problems. And finally, the same object will be used by completely different use cases, with different availability, consistency, performance,… requirements. That’s too much responsibility for one object.

Command Query Responsibility Segregation (CQRS)

Because of this (the inefficiency of row-by-row CRUD queries and object over-responsibility), applications started to cache the data used by queries locally, in order to avoid the row-by-row round trips and to store it in a model better suited to the queries than to the way it was inserted. Like dozens of data marts trying to keep in sync with the changes done in the database.

Hibernate, for example, can cache the whole data set in a second-level cache, using object stores like GigaSpaces or Oracle Coherence. But this doesn’t solve the main problem. The object model, which is not relational but hierarchical, cannot be used efficiently by all use cases. And I’ve seen very few Hibernate developers accept having multiple entity mappings for the different use cases. Basically, the relational model can cope with all kinds of queries (especially since the Oracle In-Memory Column Store, as you can also do analytics on it). But when querying it with an object model, you need several models.

Then came the idea to split the data model between ‘command’ (the modifications) and ‘query’. This is the CQRS described by Martin Fowler: https://martinfowler.com/bliki/CQRS.html

This is, again, not new. The relational model exposes data through views, and you can, and should, have different views on your data. The data is stored in one normalized model, modified through views or procedures (a transactional API) and queried through views. Separation of responsibility has always been there. And I totally agree with all those diagrams showing multiple layers, separation, and even micro-services. There’s only one problem with them.

Logical vs. physical separation

I don’t really know when, but at some point the architects looking at those diagrams forgot the step where the logical model should be implemented in a physical model. What were logical layers became physical tiers without any reason. Many architects use ‘layer’, ‘level’ and ‘tier’ without even knowing whether they are at the logical or the physical level. It is good to have logical layers, but processing data across multiple physical tiers will exhaust resources in all tiers just by doing nothing else than waiting on round-trip latency and expensive context switches.

The CRUD methods had to be coded in an Object Oriented language, and the idea was Java. Then the rows have to be shipped between two different processes: one running a JVM, and one running the SQL queries in the RDBMS. Different processes mean a context switch for each call. This is latency and overhead, and it does not scale. Oracle has the possibility to run a JVM inside the database, but that still involves context switches and datatype conversions. Then, to try to scale, more application servers were needed, and this data-processing part moved to the application server.
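A back-of-the-envelope calculation shows why this matters. The latency figure below is an assumption for illustration, not a measurement:

```python
# Cost model for inter-process round trips (illustrative numbers only)
rows = 1_000_000
roundtrip_seconds = 0.0005                     # assume 0.5 ms per call between processes

row_by_row = rows * roundtrip_seconds          # one call per row
batched = (rows / 1000) * roundtrip_seconds    # array interface, 1000 rows per call

print(f"row by row: {row_by_row:.0f}s, batched: {batched:.1f}s")  # 500s vs 0.5s
```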

And the bad effect is not only on performance and scalability. Because of this confusion, implementing logical layers on different servers, technologies, languages,… we lost the link between the stored data and the processing of that data. Which is exactly the opposite of an Object Oriented approach. With the data logic (you can call it business logic as well, as all data processing is there to implement business rules) in the database, you can manage dependencies: the database always keeps track of which function or procedure is using which table, and how.

Event Sourcing

https://martinfowler.com/eaaDev/EventSourcing.html

Oracle can even guarantee consistent reads without changing anything, thanks to MVCC. Of course, SELECT FOR UPDATE can write to the database (not to change data but to write lock information), but you can restrict this by granting only the READ privilege.

This means that, for example, in PL/SQL we use a PROCEDURE for what Bertrand Meyer calls a ‘command’ (changing data), and we use a FUNCTION to query data.

 

This article CQRS, Event Sourcing and the Oracle Database first appeared on Blog dbi services.

Visual Studio- All the Warm and Fuzzy

So I haven’t opened Visual Studio in….oh….let’s just say it’s been a few years…:)

I had a project that I needed to run and was surprised to find the Solution Explorer missing from SSMS 2017. It’s only fair to say there was also fair warning from Microsoft.


Due to this, I opened up Visual Studio to use its Solution Explorer and integration for SSIS and other features required for a large project.  I was both happy with the sheer amount of features and have some constructive feedback to make it more user friendly.

I love that I can navigate servers, log into SQL Server databases and manage and verify what’s occurred in my releases.  The properties pane comes in handy to offer valuable information when I’m building out connection strings or looking for data that may have not compiled correctly in a build.

Although the rough instructions were for Solution Explorer in SSMS, rusty as I was, I was able to figure out how to do everything – projects, SSIS and database SQL – in Visual Studio.

The interface is familiar to a Windows user: right-click for options, left-click to execute an option. The interface has links on the left as shortcuts to the SSMS Object Explorer, which allows me to log into my database environments, along with browsing servers that I may also be deploying application code to.

Projects make it easy to build out a full, multi-tier deployment and debug it all from one application, too.  Needless to say, I’m happy to report that even with some missing instructions, I was able to do what needed to be done and do it with some grace.


So what do I see that can be improved?

  1. When you copy and paste to update a path, don’t automatically remove the remainder of the file name, etc., that’s been left on purpose.  This can lead to extra human intervention, which then leads to more chance of human error.
  2. The hints when hovering over a button can become a nuisance instead of a help.  Have the hints auto-hide after 5 seconds.  There’s no reason to leave them up when we’re trying to guide our cursor to a small button.
  3. Make debug display all steps, the errors and then shut down automatically when complete.
  4. Make it easier to keep panes open for the SQL Server Object Explorer, Toolbox, etc. vs. auto-hiding.  The information on the pane may be needed for reference as one works on another configuration panel.
  5. The Control Flow on SSIS package execution shouldn’t be blown up so large that you can’t decipher what a package is doing.  Keep it legible.



Copyright © DBAKevlar [Visual Studio- All the Warm and Fuzzy], All Right Reserved. 2018.

Oracle 18c preinstall RPM on RedHat RHEL

By Franck Pachot

The Linux prerequisites for Oracle Database are all documented, but using the preinstall RPM makes everything easier. Before 18c, this was easy on Oracle Enterprise Linux (OEL) but not so easy on RedHat (RHEL), where the RPM had many dependencies on OEL and UEK.
Now that 18c is available for download, there’s also an 18c preinstall RPM, and the good news is that it can also be installed on RHEL without modification.

This came to my attention on Twitter.

And of course this is fully documented:
https://docs.oracle.com/en/database/oracle/oracle-database/18/cwlin/about-the-oracle-preinstallation-rpm.html#GUID-C15A642B-534D-4E4A-BDE8-6DC7772AA9C8

In order to test it, I quickly created a CentOS instance on the Oracle Cloud.

I’ve downloaded the RPM from the OEL7 repository:

[root@instance-20180803-1152 opc]# curl -o oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm https://yum.oracle.com/repo/OracleLinux/OL7/latest/x86_64/getPackage/oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 18244  100 18244    0     0  63849      0 --:--:-- --:--:-- --:--:-- 63790

then ran the installation:

[root@instance-20180803-1152 opc]# yum -y localinstall oracle-database-preinstall-18c-1.0-1.el7.x86_64.rpm

 
It automatically installs all dependencies:
Installed:
oracle-database-preinstall-18c.x86_64 0:1.0-1.el7
 
Dependency Installed:
compat-libcap1.x86_64 0:1.10-7.el7 compat-libstdc++-33.x86_64 0:3.2.3-72.el7 glibc-devel.x86_64 0:2.17-222.el7 glibc-headers.x86_64 0:2.17-222.el7
gssproxy.x86_64 0:0.7.0-17.el7 kernel-headers.x86_64 0:3.10.0-862.9.1.el7 keyutils.x86_64 0:1.5.8-3.el7 ksh.x86_64 0:20120801-137.el7
libICE.x86_64 0:1.0.9-9.el7 libSM.x86_64 0:1.2.2-2.el7 libXext.x86_64 0:1.3.3-3.el7 libXi.x86_64 0:1.7.9-1.el7
libXinerama.x86_64 0:1.1.3-2.1.el7 libXmu.x86_64 0:1.1.2-2.el7 libXrandr.x86_64 0:1.5.1-2.el7 libXrender.x86_64 0:0.9.10-1.el7
libXt.x86_64 0:1.1.5-3.el7 libXtst.x86_64 0:1.2.3-1.el7 libXv.x86_64 0:1.0.11-1.el7 libXxf86dga.x86_64 0:1.1.4-2.1.el7
libXxf86misc.x86_64 0:1.0.3-7.1.el7 libXxf86vm.x86_64 0:1.1.4-1.el7 libaio-devel.x86_64 0:0.3.109-13.el7 libbasicobjects.x86_64 0:0.1.1-29.el7
libcollection.x86_64 0:0.7.0-29.el7 libdmx.x86_64 0:1.1.3-3.el7 libevent.x86_64 0:2.0.21-4.el7 libini_config.x86_64 0:1.3.1-29.el7
libnfsidmap.x86_64 0:0.25-19.el7 libpath_utils.x86_64 0:0.2.1-29.el7 libref_array.x86_64 0:0.1.5-29.el7 libstdc++-devel.x86_64 0:4.8.5-28.el7_5.1
libverto-libevent.x86_64 0:0.2.5-4.el7 nfs-utils.x86_64 1:1.3.0-0.54.el7 psmisc.x86_64 0:22.20-15.el7 xorg-x11-utils.x86_64 0:7.5-22.el7
xorg-x11-xauth.x86_64 1:1.0.9-1.el7

Note that the limits are stored in limits.d which has priority over limits.conf:

[root@instance-20180803-1152 opc]# cat /etc/security/limits.d/oracle-database-preinstall-18c.conf
 
# oracle-database-preinstall-18c setting for nofile soft limit is 1024
oracle soft nofile 1024
 
# oracle-database-preinstall-18c setting for nofile hard limit is 65536
oracle hard nofile 65536
 
# oracle-database-preinstall-18c setting for nproc soft limit is 16384
# refer orabug15971421 for more info.
oracle soft nproc 16384
 
# oracle-database-preinstall-18c setting for nproc hard limit is 16384
oracle hard nproc 16384
 
# oracle-database-preinstall-18c setting for stack soft limit is 10240KB
oracle soft stack 10240
 
# oracle-database-preinstall-18c setting for stack hard limit is 32768KB
oracle hard stack 32768
 
# oracle-database-preinstall-18c setting for memlock hard limit is maximum of 128GB on x86_64 or 3GB on x86 OR 90 % of RAM
oracle hard memlock 134217728
 
# oracle-database-preinstall-18c setting for memlock soft limit is maximum of 128GB on x86_64 or 3GB on x86 OR 90% of RAM
oracle soft memlock 134217728

Note that memlock is set to 128GB here, but it can be higher on machines with huge RAM (up to 90% of RAM).
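That 134217728 figure is expressed in KB, so the arithmetic is easy to sanity-check (trivial, but easy to get wrong with KB versus bytes):

```python
# memlock in /etc/security/limits.d is expressed in KB
memlock_kb = 134217728
memlock_gb = memlock_kb / 1024 / 1024
print(memlock_gb)  # 128.0
```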

And for information, here is what is set in /etc/sysctl.conf:

fs.file-max = 6815744
kernel.sem = 250 32000 100 128
kernel.shmmni = 4096
kernel.shmall = 1073741824
kernel.shmmax = 4398046511104
kernel.panic_on_oops = 1
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
net.ipv4.conf.all.rp_filter = 2
net.ipv4.conf.default.rp_filter = 2
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500

Besides that, the preinstall RPM disables NUMA and transparent huge pages (as boot options in GRUB). It creates the oracle user (uid 54321, member of the groups oinstall, dba, oper, backupdba, dgdba, kmdba and racdba).

 

This article Oracle 18c preinstall RPM on RedHat RHEL first appeared on Blog dbi services.

RAC One node databases are relocated by opatchauto in 12.2 part 2

In a previous post I shared how I found out that RAC One Node databases are relocated on-line during patching and I promised a few more tests and sharing of implications. If you aren’t familiar with RAC One Node I recommend having a look at the official documentation: The Real Application Clusters Administration and Deployment Guide features it prominently in the introduction. One of the things I like to keep in mind when working with RAC One Node is this, quoting from section 1.3 in said admin and deployment guide:

Using the Single Client Access Name (SCAN) to connect to the database, clients can locate the service independently of the node on which it is running. Relocating an Oracle RAC One Node instance is therefore mostly transparent to the client, depending on the client connection. Oracle recommends to use either Application Continuity and Oracle Fast Application Notification or Transparent Application Failover to minimize the impact of a relocation on the client.

A little later, you can read this important additional piece of information in the “notes” section:

To prepare for all failure possibilities, you must add at least one Dynamic Database Service (Oracle Clusterware-managed database service) to an Oracle RAC One Node database.

If you created your RAC One database with the database creation assistant (dbca), you are already complying with that rule. In my case, my lab database is named RON with the mandatory service RON_SVC. I opted for an instance prefix of “DCB”.

What I’d like to try out for this blog post is what happens to an active workload on a RAC One database during patching.

I have successfully patched multi-node RAC systems, but that required the application to support this procedure. One of my favourite talks is named “Advanced RAC programming techniques” where I demonstrate the resilience of an application based on RAC to node failures by using Universal Connection Pool (UCP), JDBC and Fast Connection Failover (FCF). UCP is a Fast Application Notification (FAN) aware connection pool allowing my application to react to cluster events such as node up/node down. The idea is to mask instance failure from users.

If all of this sounds super-cryptic, I would like to suggest chapters 5 and 6 of the RAC administration and deployment guide for you. There you can read more about Workload Management with Dynamic Database Services and Application Continuity in RAC.

RAC One Node is different from multi-node RAC in that it is only ever active on one node in normal operations. Online relocation, as described in the previous post, works by temporarily adding a second (destination) instance to the cluster and moving sessions across; any sessions still connected to the source when the time-out (default 30 minutes) expires are terminated. Once the time-out is reached (or all sessions have made it across to the destination) the source instance is shut down and you are back to one active instance.

The online relocation does not care too much about whether the application can cope with the process. If your application is written for RAC, it will most likely migrate quickly from the source to the destination instance. If it isn't, the hard timeout might kick in and evict a few of your users. In the worst case your users need to re-connect to the database; worse still, you might have to restart the middle-tier system(s).

Sadly I haven’t seen too many applications capable of handling RAC events properly. One application that does is Swingbench, so I’ll stick with it. I configured it according to Dominic Giles’s blog post.

This post assumes that you are properly licensed to use all these features.

The environment

My setup hasn't changed from the previous post, so I'll spare you the boredom of repeating it here. Jump over to the other post for details.

Preparations

For this blog post I need to ensure that my RAC One Node database resides on the node I am about to patch. I have again followed the patch readme very carefully, and I have made sure that I have (proven, working, tested) backups of the entire stack, and all the rest of it…

My database is indeed actively running on the node I am about to patch:

[oracle@rac122sec2 ~]$ srvctl status database -db DCB -detail -verbose
Instance DCB_1 is running on node rac122sec2 with online services RON_SVC. Instance status: Open.
Instance DCB_1 is connected to ASM instance +ASM2
Online relocation: INACTIVE

Before I can benefit from Application Continuity, I need to make a few changes to my application service, RON_SVC. There are quite a few sources to choose from; I went with the JDBC Developer's Guide. Here's the modification:

[oracle@rac122sec2 ~]$ srvctl modify service -db DCB -service RON_SVC -commit_outcome true \
> -failoverretry 30 -failoverdelay 10 -failovertype transaction \
> -replay_init_time 1800 -retention 86400 -notification true
[oracle@rac122sec2 ~]

Following the instructions on Dominic Giles’s blog, I also need to grant SOE the right to execute DBMS_APP_CONT.

Finally, I need to make changes to my Swingbench configuration file. The relevant part is shown here:


    "Order Entry (PLSQL) V2"
    Version 2 of the SOE Benchmark running in the database using PL/SQL
    
        soe
        
        (DESCRIPTION=(TRANSPORT_CONNECT_TIMEOUT=5)
        (RETRY_COUNT=3)(FAILOVER=ON)
        (ADDRESS=(PROTOCOL=TCP)(HOST=rac122sec-scan)(PORT=1521))
        (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=RON_SVC)))
        Oracle jdbc Driver
        
            5
            5
            10
            50
            45
            10
            120
        
        
            50
            true
            20
            nodes=rac122sec1:6200,rac122sec2:6200
            20
            true
        
    
    
        5
        0
        0
        50
        500
        120
        -1
        0:0
        1
        1000
        false
        false
        0:0
        0:0
        0
        
...

The connection string is actually on a single line; I have formatted it for readability in the above example. The main change from the standard configuration file is the use of connection pooling and setting the required properties for Application Continuity.

Let’s patch!

Once all the preparations are completed, it's time to see how RAC One Node deals with an active workload undergoing an online relocation during patching. First of all I need to start the workload. I'd normally use charbench for this, but this time around I opted for the GUI. It shows performance graphs over a 3-minute rolling window.

A few minutes after starting the benchmark I commenced patching. Soon thereafter, opatchauto told me that it was relocating the database.

OPatchauto session is initiated at Fri Jul 27 14:52:23 2018

...

Relocating RACOne home before patching on home /u01/app/oracle/product/12.2.0.1/dbhome_1

According to the session log, this happened at 14:54. And by the way, always keep the session log; it's invaluable!

2018-07-27 14:54:07,914 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
Executing command as oracle: 
 /bin/sh -c 'ORACLE_HOME=/u01/app/oracle/product/12.2.0.1/dbhome_1 
 /u01/app/oracle/product/12.2.0.1/dbhome_1/bin/srvctl relocate database -d DCB -n rac122sec1 -w 30 -v'
2018-07-27 14:54:07,914 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  COMMAND Looks like this: 
    /bin/sh -c 'ORACLE_HOME=/u01/app/oracle/product/12.2.0.1/dbhome_1 
    /u01/app/oracle/product/12.2.0.1/dbhome_1/bin/srvctl relocate database -d DCB -n rac122sec1 -w 30 -v'
2018-07-27 14:55:31,736 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  status: 0
2018-07-27 14:55:31,737 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  Successfully executed the above command.

2018-07-27 14:55:31,737 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  Output from the command:
2018-07-27 14:55:31,737 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor -
Configuration updated to two instances
Instance DCB_2 started
Services relocated
Waiting for up to 30 minutes for instance DCB_1 to stop ...
Instance DCB_1 stopped
Configuration updated to one instance

2018-07-27 14:55:31,737 INFO  [163] com.oracle.glcm.patch.auto.db.product.executor.PatchingStepExecutor - 
  Command executed successfully.

You can see the same information by querying Clusterware, although there aren’t any timestamps associated with it:

[root@rac122sec1 ~]# srvctl status database -db DCB -detail -verbose
Instance DCB_1 is running on node rac122sec2 with online services RON_SVC. Instance status: Open.
Instance DCB_1 is connected to ASM instance +ASM2
Online relocation: ACTIVE
Source instance: DCB_1 on rac122sec2
Destination instance: DCB_2 on rac122sec1
[root@rac122sec1 ~]# 

Although the online relocation timeout was set to 30 minutes, the use of modern coding techniques and connection pooling allowed for a much faster online relocation. As you can see in the log excerpt, the entire relocation was completed by 2018-07-27 14:55:31. Clusterware now tells me that my database runs on node 1:
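As a quick cross-check, the two timestamps in the session log excerpt above bound the relocation step:

```python
from datetime import datetime

# Timestamps copied from the opatchauto session log shown above
start = datetime.strptime("2018-07-27 14:54:07", "%Y-%m-%d %H:%M:%S")
end = datetime.strptime("2018-07-27 14:55:31", "%Y-%m-%d %H:%M:%S")

elapsed = (end - start).total_seconds()
print(elapsed)  # 84.0 seconds - a fraction of the 30-minute (-w 30) ceiling
```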

[root@rac122sec1 ~]# date;  srvctl status database -db DCB -detail -verbose
Fri 27 Jul 14:55:38 2018
Instance DCB_2 is running on node rac122sec1 with online services RON_SVC. Instance status: Open.
Instance DCB_2 is connected to ASM instance +ASM1
Online relocation: INACTIVE

While this is wicked, what are the implications for the application? Have a look at this print screen, taken a minute after the online relocation completed.

[Screenshot: Swingbench performance charts taken shortly after the online relocation (https://martincarstenbach.files.wordpress.com/2018/07/rac-one-online-rel...)]

As you can see there was nothing unusual recorded (in the Events tab to the left), and I couldn't see a drop in the number of sessions connected. I noticed a slight blip in performance, though it recovered very soon thereafter.

Summary

During automatic patching of the Oracle stack, opatchauto will perform an online relocation of a RAC One Node database if it is found running on the node currently undergoing patching. If your application is developed with RAC in mind – using a FAN-aware connection pool such as UCP that supports either Fast Connection Failover or Application Continuity – there is a good chance that patching the system does not affect your service.

While my test was successful, it is by no means representative of a production workload – my RAC One database has an SGA of 4 GB and there were 5 concurrent users, hardly what I see in the real world. However, that shouldn't deter you: if you are using RAC One Node I think it's definitely worth trying to bring these techniques into your application.

Building community via the speaker community

I recently participated in the Oracle Developer Community Yatra tour throughout India. It is a hectic event with 7 cities covered in a mere 9 days, so you can imagine how frantic the pace was. A typical day would be:

  • 7am – breakfast
  • 8am – check out of hotel and leave for the venue
  • 9am – speak all day, host Q&A
  • 6pm – leave straight from venue to the airport
  • 8pm – dinner at airport, and fly to next city
  • 1am – get to next hotel, grab some sleep before doing it all again in 6 hours' time

Yet as a speaker at this event, I never felt that the logistics were out of control.  This is mainly due to the incredible work of the people in the AIOUG (All India Oracle User Group), coordinated by Sai Penumuru. The smoothness of the organization prompted me to blog about how user groups could follow the AIOUG lead in terms of running successful events.

Every time I do an event in India, not only do I come away with a stronger network and connection with the attendees, I also gain new and strong friendships with the speakers, and this is in no small part due to the organizational skills of Sai and the volunteers.

So from that reference point – namely, the smart way in which Sai and the user group foster a great feeling of community amongst the speakers, I thought I’d share the mechanisms via which user groups can organize events that make speakers feel welcome and keen to return.

Common accommodation

When I travel to India, Sai provides a recommended hotel for all speakers. Obviously, no speaker is compelled to stay there, but we all know that the recommendation Sai makes has taken into account:

  • facilities the hotel provides,
  • a hotel price that balances the needs of speakers with company funding and those funding themselves,
  • transport time to/from the conference venue.

So most of us will always use his recommendation, and it makes the decision-making process easy.  But most of all, it is a catalyst for speakers to meet in a casual environment outside the conference hours, and build relationships.

Common transport

The AIOUG organizes transport to and from the venue, and from the airport as well. And when this is not possible, they will provide a recommended means of transport (such as Uber) so that speakers never run the risk of picking an unreliable or unsafe option. For multi-city events such as Yatra, the AIOUG also recommends a common itinerary for flights, once again making the entire planning process much easier for speakers.

A communication mechanism

Before a conference, AIOUG sets up a WhatsApp group containing all of the speakers and the key representatives from the user group. In this way, all communication is sent in a consistent fashion.  We don't have to jump between email, Twitter, etc to see if any correspondence has been missed. And this group also helps build the relationship between speakers and user group. Humorous stories and pictures can be shared, but most importantly, if there is an issue or problem – everyone is aware of it immediately, making resolution fast and effective.

And perhaps most importantly, it helps accentuate the human element – messages are sent as people land or take off, and when people arrive at the hotel.  Organizers regularly send messages making sure no-one is having any difficulties.  All these things make the speakers feel more comfortable before, during and after the event.

A typical day

Perhaps the most valuable piece of information conveyed by the user group is 'local knowledge'. For example, each evening a WhatsApp message would be sent out detailing:

  • hotel pickup time,
  • expected transit time,
  • who to contact/what to do when arriving at venue,
  • what identification requirements may be needed on site, etc.

So even though it may be a first visit to a city for the speakers, there is a degree of familiarity with what is expected to happen, and hence knowledge of whether things are departing from the norm.

Onsite assistance

The most stressful 5 minutes for any speaker are the ones spent setting up for their talk. Projector problems, internet connectivity, schedule changes and the like can all sabotage a good talk. The AIOUG always has someone visit the room in those critical 5 minutes, so a speaker does not have to go hunting for technical assistance.

 

In summary, as you can see, none of these things is particularly arduous to do, but the benefits are huge for speakers.  We feel comfortable and welcome, which gives a much better chance of a successful talk, and makes us much more likely to want to return.

So if you are reading this, and are part of a user group committee, please consider some of the tips above for your local events. If your speakers have a good experience, they’ll be much more keen to offer continued support for your events.

Extended Histograms – 2

Following on from the previous posting which raised the idea of faking a frequency histogram for a column group (extended stats), this is just a brief demonstration of how you can do this. It’s really only a minor variation of something I’ve published before, but it shows how you can use a query to generate a set of values for the histogram and it pulls in a detail about how Oracle generates and stores column group values.

We’ll start with the same table as we had before – two columns which hold only the combinations (‘Y’, ‘N’) or (‘N’, ‘Y’) in a very skewed way, with a requirement to ensure that the optimizer provides an estimate of 1 if a user queries for (‘N’,’N’) … and I’m going to go the extra mile and create a histogram that does the same when the query is for the final possible combination of (‘Y’,’Y’).

Here’s the starting code that generates the data, and creates histograms on all the columns (I’ve run this against 12.1.0.2 and 12.2.0.1 so far):


rem
rem     Script:         histogram_hack_2a.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2018
rem
rem     Last tested 
rem             12.2.0.1
rem             12.1.0.2
rem             11.2.0.4
rem

create table t1
as
select 'Y' c2, 'N' c3 from all_objects where rownum <= 71482 -- > comment to deal with wordpress format issue.
union all
select 'N' c2, 'Y' c3 from all_objects where rownum <= 1994 -- > comment to deal with wordpress format issue.
;

variable v1 varchar2(128)

begin
        :v1 := dbms_stats.create_extended_stats(null,'t1','(c2,c3)');
        dbms_output.put_line(:v1);
end;
/

execute dbms_stats.gather_table_stats(null, 't1', method_opt=>'for all columns size 10');

In a variation from the previous version of the code I’ve used the “create_extended_stats()” function so that I can return the resulting virtual column name (also known as an “extension” name) into a variable that I can use later in an anonymous PL/SQL block.

Let’s now compare the values stored in the histogram for that column with the values generated by a function call that I first referenced a couple of years ago:


select
        endpoint_value
from 
        user_tab_histograms
where
        table_name = 'T1'
and     column_name = :v1
;

select 
        distinct c2, c3, 
        mod(sys_op_combined_hash(c2,c3),9999999999) endpoint_value
from t1
;

ENDPOINT_VALUE
--------------
    4794513072
    6030031083

2 rows selected.


C C ENDPOINT_VALUE
- - --------------
N Y     4794513072
Y N     6030031083

2 rows selected.

So we have a method of generating the values that Oracle should store in the histogram; now we need to generate 4 values and supply them to a call to dbms_stats.set_column_stats() in the right order with the frequencies we want to see:


declare
        l_distcnt number;
        l_density number;
        l_nullcnt number;
        l_avgclen number;

        l_srec  dbms_stats.statrec;
        n_array dbms_stats.numarray;

begin
        dbms_stats.get_column_stats (
                ownname =>null,
                tabname =>'t1',
                colname =>:v1,
                distcnt =>l_distcnt,
                density =>l_density,
                nullcnt =>l_nullcnt,
                avgclen =>l_avgclen,
                srec    =>l_srec
        );

        l_srec.novals := dbms_stats.numarray();
        l_srec.bkvals := dbms_stats.numarray();

        for r in (
                select
                        mod(sys_op_combined_hash(c2,c3),9999999999) hash_value, bucket_size
                from    (
                        select 'Y' c2, 'Y' c3, 1 bucket_size from dual
                        union all
                        select 'N' c2, 'N' c3, 1 from dual
                        union all
                        select 'Y' c2, 'N' c3, 71482 from dual
                        union all
                        select 'N' c2, 'Y' c3, 1994 from dual
                        )
                order by hash_value
        ) loop
                l_srec.novals.extend;
                l_srec.novals(l_srec.novals.count) := r.hash_value;

                l_srec.bkvals.extend;
                l_srec.bkvals(l_srec.bkvals.count) := r.bucket_size;
        end loop;

        n_array := l_srec.novals;

        l_distcnt  := 4;
        l_srec.epc := 4;

--
--      For 11g rpcnts must not be mentioned
--      For 12c is must be set to null or you
--      will (probably) raise error:
--              ORA-06533: Subscript beyond count
--

        l_srec.rpcnts := null;

        dbms_stats.prepare_column_values(l_srec, n_array);

        dbms_stats.set_column_stats(
                ownname =>null,
                tabname =>'t1',
                colname =>:v1,
                distcnt =>l_distcnt,
                density =>l_density,
                nullcnt =>l_nullcnt,
                avgclen =>l_avgclen,
                srec    =>l_srec
        );

end;
/
The outline of the code is simply: get_column_stats, set up a couple of arrays and simple variables, prepare_column_values, set_column_stats. The special detail that I've included here is that I've used a "union all" query to generate an ordered list of hash values (with the desired frequencies), then grown the arrays one element at a time to copy them in place. (That's not the only option at this point, and it's probably not the most efficient option – but it's good enough.) In the past I've used this type of approach with an analytic query against the table data to produce the equivalent of a 12c Top-frequency histogram in much older versions of Oracle.

A couple of important points – I’ve set the “end point count” (l_srec.epc) to match the size of the arrays, and I’ve also changed the number of distinct values to match. For 12c to tell the code that this is a frequency histogram (and not a hybrid) I’ve had to null out the “repeat counts” array (l_srec.rpcnts). If you run this on 11g the reference to rpcnts is illegal so has to be commented out.

After running this procedure, here’s what I get in user_tab_histograms for the column:


select
        endpoint_value                          column_value,
        endpoint_number                         endpoint_number,
        endpoint_number - nvl(prev_endpoint,0)  frequency
from    (
        select
                endpoint_number,
                lag(endpoint_number,1) over(
                        order by endpoint_number
                )                               prev_endpoint,
                endpoint_value
        from
                user_tab_histograms
        where
                table_name  = 'T1'
        and     column_name = :v1
        )
order by endpoint_number
;

COLUMN_VALUE ENDPOINT_NUMBER  FREQUENCY
------------ --------------- ----------
   167789251               1          1
  4794513072            1995       1994
  6030031083           73477      71482
  8288761534           73478          1

4 rows selected.
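The mechanics of the stored histogram are easy to mirror outside the database: order the (hash value, bucket size) pairs by hash value, then accumulate the bucket sizes to get the endpoint numbers. Here is a small Python sketch of that logic, using the values from the listings above (sys_op_combined_hash() is internal to Oracle, so the hash values are simply copied from the earlier query output):

```python
# (hash value, bucket size) pairs as supplied to dbms_stats, in arbitrary order;
# hash values copied from the sys_op_combined_hash() query output above
pairs = [
    (6030031083, 71482),    # ('Y','N')
    (4794513072, 1994),     # ('N','Y')
    (167789251, 1),         # one of the two faked combinations
    (8288761534, 1),        # the other faked combination
]

# A frequency histogram stores the values in order, with cumulative counts
pairs.sort()
endpoint_numbers = []
running_total = 0
for value, bucket_size in pairs:
    running_total += bucket_size
    endpoint_numbers.append((value, running_total))

print(endpoint_numbers)
```

The cumulative totals (1, 1995, 73477, 73478) match the endpoint_number column in the query output above.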


It’s left as an exercise to the reader to check that the estimated cardinality for the predicate “c2 = ‘N’ and c3 = ‘N'” is 1 with this histogram in place.

Extended Histograms

Today’s little puzzle comes courtesy of the Oracle-L mailing list. A table has two columns (c2 and c3), which contain only the values ‘Y’ and ‘N’, with the following distribution:


select   c2, c3, count(*)
from     t1
group by c2, c3
;

C C   COUNT(*)
- - ----------
N Y       1994
Y N      71482

2 rows selected.

The puzzle is this – how do you get the optimizer to predict a cardinality of zero (or, using its best approximation, 1) if you execute a query where the predicate is:

where   c2 = 'N' and c3 = 'N'

Here are 4 tests you might try:

  • Create simple stats (no histograms) on c2 and c3.
  • Create frequency histograms on c2 and c3
  • Create a column group (extended stats) on (c2,c3) but no histograms
  • Create a column group (extended stats) on (c2,c3) with a histogram on (c2, c3)

If you do these tests you’ll find the estimated cardinalities are (from 12.1.0.2):

  • 18,369 – derived as 73,476 / 4  … total rows over total possible combinations
  • 1,940   – derived as 73,476 * (1,994/73,476) * (71,482/73,476) … total rows * fraction where c2 = ‘N’ * fraction where c3 = ‘N’
  • 36,738 – derived as 73,476 / 2 … total rows / number of distinct combinations of (c2, c3)
  • 997      – derived as 1,994 / 2 … half the frequency of the least frequently occurring value in the histogram

The last algorithm appeared in 10.2.0.4; prior to that a “value not in frequency histogram” would have been given an estimated cardinality of 1 (which is what the person on Oracle-L wanted to see).
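The four derivations above are simple arithmetic; as a sanity check (rounded to whole rows, which is how the optimizer reports cardinalities):

```python
total = 71482 + 1994          # 73,476 rows in t1

no_histograms   = round(total / 4)                            # rows / possible combinations
freq_histograms = round(total * (1994 / total) * (71482 / total))
group_no_hist   = round(total / 2)                            # rows / distinct combinations
group_with_hist = round(1994 / 2)                             # half the least popular frequency

print(no_histograms, freq_histograms, group_no_hist, group_with_hist)
# 18369 1940 36738 997
```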

In fact the optimizer's behaviour can be reverted to the 10.2.0.3 mechanism by setting fix-control 5483301 to zero (or off), either with an "alter session" call or inside the /*+ opt_param() */ hint. There is, however, another option – if you get the column stats then immediately set them (dbms_stats.get_column_stats(), dbms_stats.set_column_stats()) the optimizer marks the stats as "user defined" and (for reasons I don't know – perhaps it's an oversight) reverts to the 10.2.0.3 behaviour. Here's some code to demonstrate the point; as the script header says, I've tested it on versions up to 18.1.


rem
rem     Script:         histogram_hack_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Jul 2018
rem
rem     Last tested 
rem             18.1.0.0        via LiveSQL (with some edits)
rem             12.2.0.1
rem             12.1.0.2
rem

create table t1
as
select 'Y' c2, 'N' c3 from all_objects where rownum <= 71482 -- > comment to avoid format issue
union all
select 'N' c2, 'Y' c3 from all_objects where rownum <= 1994 -- > comment to avoid format issue
;

execute dbms_stats.gather_table_stats(user,'t1',method_opt=>'for all columns size 10 for columns (c2,c3) size 10');

column column_name format a128 new_value m_colname

select  column_name
from    user_tab_cols
where   table_name = 'T1'
and     column_name not in ('C2','C3')
;

set autotrace traceonly explain
select /* pre-hack */ * from t1 where c2 = 'N' and c3 = 'N';
set autotrace off

declare
        l_distcnt number default null;
        l_density number default null;
        l_nullcnt number default null;
        l_srec    dbms_stats.statrec;
        l_avgclen number default null;

begin

        dbms_stats.get_column_stats (
                ownname =>user,
                tabname =>'t1',
                colname =>'&m_colname',
                distcnt =>l_distcnt,
                density =>l_density,
                nullcnt =>l_nullcnt,
                srec    =>l_srec,
                avgclen =>l_avgclen
        );

        dbms_stats.set_column_stats(
                ownname =>user,
                tabname =>'t1',
                colname =>'&m_colname',
                distcnt =>l_distcnt,
                density =>l_density,
                nullcnt =>l_nullcnt,
                srec    =>l_srec,
                avgclen =>l_avgclen
        );

end;
/

set autotrace traceonly explain
select /* post-hack */  * from t1 where c2 = 'N' and c3 = 'N';
set autotrace off

I’ve created a simple table for the data and collected stats including histograms on the two columns and on the column group. I’ve taken a simple strategy to find the name of the column group (I could have used the function dbms_stats.create_extended_stats() to set an SQL variable to the name of the column group, of course), and then run a little bit of PL/SQL that literally does nothing more than copy the column group’s stats into memory then write them back to the data dictionary.

Here are the “before” and “after” execution plans that we get from autotrace:

BEFORE
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   997 |  3988 |    23  (27)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |   997 |  3988 |    23  (27)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"='N' AND "C3"='N')


AFTER
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |     4 |    23  (27)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T1   |     1 |     4 |    23  (27)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("C2"='N' AND "C3"='N')

As required – the estimate for the ('N','N') rows drops down to (the optimizer's best approximation to) zero.

Footnote:

An alternative strategy (and, I’d say, a better strategic approach) would have been to create a “fake” frequency histogram that included the value (‘N’,’N’) giving it a frequency of 1 – a method I’ve suggested in the past  but with the little problem that you need to be able to work out the value to use in the array passed to dbms_stats.set_column_stats() to represent the value for the (‘N’,’N’) combination – and I’ve written about that topic in the past as well.

You might wonder why the optimizer is programmed to use "half the least popular" for predicates referencing values not in the histogram. Prior to 12c it's easy to make an argument for the algorithm. Frequency histograms used to be sampled with a very small sample size, so if you were unlucky a "slightly less popular" value could be missed completely in the sample; if you were requesting a value that didn't appear in the histogram then (presumably) you knew it should exist in the data, so guessing a cardinality somewhat less than the least popular must have seemed like a good idea.

In 12c, of course, you ought to be taking advantage of the "approximate NDV" implementation, which uses a 100% sample to generate frequency (and Top-N / Top-Frequency) histograms. If you've got a 12c frequency histogram then the absence of a value in the histogram means the data really wasn't there, so a cardinality estimate of 1 makes more sense. (Of course, you might have allowed Oracle to gather the histogram at the wrong time – but that's a different issue.) If you've got a Top-N histogram then the optimizer will behave as if a "missing" value is one of those nominally allowed for in the "low frequency" bucket, and use neither the 1 nor the "half the least popular".

So, for 12c and columns with frequency histograms it seems perfectly reasonable to set the fix control to zero – after getting approval from Oracle support, of course.