Oakies Blog Aggregator

Little things worth knowing: redo transport in Data Guard 12.2 part 2

In the first part of this article I looked at a number of views and some netstat output to show how redo is transported from the primary database to its standby systems. Long story short, TT02 (“async ORL multi”) was found sending redo to CDB3 asynchronously, whilst NSS2 (“sync”) transferred redo to the synchronised target, CDB2. Contrary to what v$dataguard_process wanted me to believe, it really wasn’t LGWR sending redo over the network.

In this little article I would like to show you how the standby databases CDB2 and CDB3 receive redo and how you can map this back to the primary database, closing the loop.

How does CDB2 receive redo?

First I’m looking at CDB2, which receives redo via synchronous mode. I should be able to narrow down the communication between primary and standby by referring to the LGWR and TT02 process IDs in the CLIENT_PID column on the standby. As a quick reminder, 14986 is the PID for LGWR, 15029 belongs to NSS2, and 15252 maps to TT02. Let’s try:

SQL> select db_unique_name,database_role from v$database;

DB_UNIQUE_NAME                 DATABASE_ROLE
------------------------------ ----------------
CDB2                           PHYSICAL STANDBY

SQL> select name, pid, role, action, client_pid, client_role, sequence#, block#, dest_id
  2  from v$dataguard_process where client_pid in (14986,15029,15252);

NAME  PID        ROLE                    ACTION       CLIENT_PID CLIENT_ROLE       SEQUENCE#     BLOCK#    DEST_ID
----- ---------- ----------------------- ------------ ---------- ---------------- ---------- ---------- ----------
rfs   5517       RFS sync                IDLE              14986 log writer               95     174565          0

So it would appear the process responsible for shipping redo to “SYNC” destinations is the log writer. Actually, the output of v$dataguard_process is quite interesting, which is why I’m adding it here for the sake of completeness:

SQL> select name, pid, role, action, client_pid, client_role, sequence#, block#, dest_id
  2  from v$dataguard_process order by action;

NAME  PID        ROLE                    ACTION       CLIENT_PID CLIENT_ROLE       SEQUENCE#     BLOCK#    DEST_ID
----- ---------- ----------------------- ------------ ---------- ---------------- ---------- ---------- ----------
rfs   5517       RFS sync                IDLE              14986 log writer               95     229446          0
rfs   5350       RFS archive             IDLE              15224 archive gap               0          0          1
rfs   5346       RFS ping                IDLE              15124 gap manager              95          0          0
rfs   5354       RFS archive             IDLE              15233 archive gap               0          0          1
MRP0  5348       managed recovery        IDLE                  0 none                      0          0          0
rfs   5352       RFS archive             IDLE              15240 archive gap               0          0          1
LGWR  5207       log writer              IDLE                  0 none                      0          0          0
TT01  5259       redo transport timer    IDLE                  0 none                      0          0          0
TT00  5255       gap manager             IDLE                  0 none                      0          0          0
ARC1  5263       archive redo            IDLE                  0 none                      0          0          0
ARC2  5265       archive redo            IDLE                  0 none                      0          0          0
ARC3  5267       archive redo            IDLE                  0 none                      0          0          0
TMON  5242       redo transport monitor  IDLE                  0 none                      0          0          0
ARC0  5257       archive local           IDLE                  0 none                      0          0          0

14 rows selected.

This view tells me that LGWR is attached to the RFS sync process. But by now I know better than to take this at face value; it is similar to what I saw on the primary. Looking a little closer, I can see that, strictly speaking, the RFS process is connected to NSS2:

[root@server2 ~]# netstat -tunalp | egrep 'Active|Proto|5517'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp6       0      0 192.168.100.22:1521     192.168.100.21:15515    ESTABLISHED 5517/oracleCDB2   

I am repeating the values for the primary here so you don’t have to go back to the previous article:

[root@server1 ~]# ps -ef | egrep 'lgwr|lg0|nss'
oracle   14986     1  0 09:58 ?        00:01:19 ora_lgwr_CDB1
oracle   14990     1  0 09:58 ?        00:00:00 ora_lg00_CDB1
oracle   14994     1  0 09:58 ?        00:00:00 ora_lg01_CDB1
oracle   15029     1  0 09:58 ?        00:00:43 ora_nss2_CDB1

[root@server1 ~]# netstat -vW --numeric-ports -ee -p -la --tcp | egrep '14986|14990|14994|15029'
tcp        0      0 server1.example.com:15515 server2.example.com:1521 ESTABLISHED oracle     16400768   15029/ora_nss2_CDB1

You will notice that port 15515 on server1 belongs to ora_nss2_CDB1.

Going back a little to v$dataguard_process, it seems a bit weird to see MRP0 as “idle” when the database is in managed recovery mode using real time apply. Trying something else, I queried v$managed_standby, and voilà: MRP0 is shown applying logs:

SQL> select process,pid,status,client_process,client_pid,sequence#,block# 
  2  from v$managed_standby order by status;

PROCESS   PID        STATUS       CLIENT_P CLIENT_PID                                SEQUENCE#     BLOCK#
--------- ---------- ------------ -------- ---------------------------------------- ---------- ----------
DGRD      5255       ALLOCATED    N/A      N/A                                               0          0
DGRD      5259       ALLOCATED    N/A      N/A                                               0          0
MRP0      5348       APPLYING_LOG N/A      N/A                                              95     246625
ARCH      5257       CLOSING      ARCH     5257                                             92       4096
ARCH      5263       CLOSING      ARCH     5263                                             93       2048
ARCH      5265       CLOSING      ARCH     5265                                             94     342016
ARCH      5267       CONNECTED    ARCH     5267                                              0          0
RFS       5350       IDLE         UNKNOWN  15224                                             0          0
RFS       5354       IDLE         UNKNOWN  15233                                             0          0
RFS       5352       IDLE         UNKNOWN  15240                                             0          0
RFS       5517       IDLE         LGWR     14986                                            95     246626
RFS       5346       IDLE         Archival 15124                                             0          0

12 rows selected.

I guess that’s true, as the system is in constant recovery using the standby logfiles.
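
To double-check that redo really lands in a standby redo logfile, v$standby_log is a handy cross-check: the group currently being written shows as ACTIVE, and its sequence# should match MRP0’s above. A minimal sketch (I did not capture the output):

SQL> select group#, thread#, sequence#, used, archived, status
  2  from v$standby_log;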

And what about CDB3?

On the other hand, CDB3 – to which redo is shipped asynchronously – lists TT02 as its counterpart:

SQL> select db_unique_name,database_role from v$database;

DB_UNIQUE_NAME                 DATABASE_ROLE
------------------------------ ----------------
CDB3                           PHYSICAL STANDBY

SQL> select name, pid, role, action, client_pid, client_role, sequence#, block#, dest_id 
  2   from v$dataguard_process order by action;

NAME  PID                      ROLE                    ACTION       CLIENT_PID CLIENT_ROLE       SEQUENCE#     BLOCK#    DEST_ID
----- ------------------------ ----------------------- ------------ ---------- ---------------- ---------- ---------- ----------
rfs   14803                    RFS ping                IDLE              15124 gap manager              96          0          0
rfs   14809                    RFS archive             IDLE              15233 archive gap               0          0          0
rfs   14811                    RFS async               IDLE              15252 async ORL multi          96      34674          0
MRP0  11825                    managed recovery        IDLE                  0 none                      0          0          0
ARC0  11776                    archive local           IDLE                  0 none                      0          0          0
ARC2  11786                    archive redo            IDLE                  0 none                      0          0          0
TT00  11774                    gap manager             IDLE                  0 none                      0          0          0
ARC3  11788                    archive redo            IDLE                  0 none                      0          0          0
TMON  11706                    redo transport monitor  IDLE                  0 none                      0          0          0
LGWR  11676                    log writer              IDLE                  0 none                      0          0          0
ARC1  11784                    archive redo            IDLE                  0 none                      0          0          0
TT01  11778                    redo transport timer    IDLE                  0 none                      0          0          0

12 rows selected.

Unlike the case with CDB2, the local RFS process is indeed connecting to TT02 on server1:

[root@server2 ~]# netstat -vW --numeric-ports -ee -p -la --tcp | egrep '^Active|^Proto|14811'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       User       Inode      PID/Program name    
tcp6       0      0 server2.example.com:1521 server1.example.com:12936 ESTABLISHED oracle     4658198    14811/oracleCDB3    

… and on server1:

[root@server1 ~]# netstat -vW --numeric-ports -ee -p -la --tcp | egrep '^Active|^Proto|12936'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address            State       User       Inode      PID/Program name    
tcp        0      0 server1.example.com:12936 server2.example.com:1521 ESTABLISHED oracle     15820538   15252/ora_tt02_CDB1 

This should be enough evidence, I rest my case :)

Summary

So I guess that answers my question: On my small VMs in the lab, NSSn is responsible for shipping redo to targets in “SYNC” mode. The redo transport server processes TTnn ship redo to destinations that are defined for asynchronous transport.

Early Bird Extension – UK February Dates: “Oracle Indexing Internals and Best Practices” Seminar

As a Christmas present to those in the UK looking at attending my “Oracle Indexing Internals and Best Practices” seminar in February next year, the Early Bird rates are now available until 19th January 2018. Take this opportunity to attend this highly acclaimed seminar that is packed full of information designed to significantly improve the […]

Content- My Year in Review, 2017

So where did 2017 go?!?!?  I really, really would like to know…  Needless to say, it’s time to do a year in review already and I actually have time to do it this year!

DBAKevlar Blog

I wrote over 100 blog posts this year between DBAKevlar, Delphix and partner sites, and I’ve enjoyed sharing with the different communities.  There are significant changes going on in the IT world regarding the future of the Database Administrator.

This deafening message came through when pulling together my top blog posts for 2017, with Death of the DBA, Long Live the DBA presenting itself as the top post of the year.


I get a lot of my views outside of subscriptions via search engines, but there’s still a good number of viewers who come to my site via Twitter, LinkedIn, etc.  Big shout out to Martin Berger, who’s been an awesome referrer to my blog, (whatever you’re doing, just keep doing it… :))


My blog also referred readers elsewhere, including over 450 visits to Delphix from one of my blog posts, (Yay!  I actually do my job sometimes… :))

Most of my readers are from the United States, but then again, most of my speaking events are here in the US, too.


DBAKevlar Presenting

So how much did I present this year? Between in-person events for Oracle and SQL Server, along with webinars and the demand for DevOps, I presented a lot: 69 sessions, to be exact.  With the quantity of SQL Saturdays out there, I presented at a few more SQL Server events than Oracle ones, but it just means I get to share the love of database technology more.


I was thrilled to get a chance to present in Morocco this year, along with returning to Europe to present in Germany at DOAG and the UK for UKOUG.


DBAKevlar Other Schtuff

I also published an ebook, (with a copy in print to give away at events) recorded a couple of podcasts, and we had a successful 2017 RMOUG Training Days conference, the beginning of my presidency of the Rocky Mountain Oracle User Group, (RMOUG) and my work for the board of directors of the Denver SQL Server User Group as their liaison.   We’re now hard at work on the upcoming RMOUG Training Days 2018, which has a new location and new events, and I can’t wait to see where it takes us next!

My highlights for 2017?  I was accepted to present at both Oracle Open World and Microsoft Summit, which was one of my goals for 2017.  I was also honored as one of eight Microsoft technologists to receive an Idera ACE, and delivered one Oracle webinar and two Microsoft webinars: one for 24HOP, (24 hrs. of PASS) and the Microsoft PASS Linux Edition just this last week.

Yeah, 2017 ROCKED.  Bring on 2018!!

Copyright © DBA Kevlar [Content- My Year in Review, 2017], All Right Reserved. 2018.

The post Content- My Year in Review, 2017 appeared first on DBA Kevlar.

Seasons greetings from the AskTOM team

This year we’ve answered over 2800 of your questions and followed up on 3500 reviews. It’s been a great and busy year! We’ll see you all in 2018.

Hints, Patches, Force Matching and SQL Profiles

Sometimes, when all else fails, it is necessary to add hints to a piece of SQL in order to make the optimizer use a particular execution plan. We might directly add the hint to the code. However, even if it is possible to access the code directly, that may not be a good idea. In the future, if we need to change the hint, possibly due to a change in optimizer behaviour on database upgrade, then we would again need to make a code change.
Instead, we could look at Oracle's plan stability technologies.  There are 4 plan stability features in the database. They are different, but conceptually they all involve associating a set of hints with a statement that will be picked up by the optimizer at runtime.

  • SQL Outlines can be collected by the database at runtime. They can exactly match a SQL text or force match. However, they are deprecated from 11g. 
  • SQL Baselines can be collected by the database at runtime either on request or automatically. They feed real experience of previous executions of a statement back into subsequent executions. They match by SQL_ID. 
  • SQL Patches are a mechanism for manually inserting specific hints into a specific SQL ID 
  • SQL Profiles are also a mechanism for inserting specific hints into a specific SQL statement, but can also do force matching. We are probably most used to creating profiles with Carlos Sierra's coe_xfr_sql_profile.sql script (part of Oracle Support's SQLTXPLAIN), but as this post will show, that is not the only option. However, SQL Profiles require the database Tuning Pack to be licenced, which is not the case with the other mechanisms. 

There are several posts on the Oracle optimizer team’s blog that explain how to use SQL Patches to inject a hint or disable an embedded hint.
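
For example, in 12.2 a hint can be attached to one specific statement with the documented DBMS_SQLDIAG.CREATE_SQL_PATCH function, along these lines (a sketch only; the statement and hint here are hypothetical placeholders, and on earlier releases the internal dbms_sqldiag_internal.i_create_patch call serves the same purpose):

DECLARE
  l_patch_name VARCHAR2(128);
BEGIN
  l_patch_name := dbms_sqldiag.create_sql_patch(
    sql_text  => 'SELECT COUNT(*) FROM ps_ledger', /*hypothetical statement*/
    hint_text => 'FULL(PS_LEDGER)',                /*hypothetical hint*/
    name      => 'my_sql_patch');
END;
/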

However, SQL Patches work by attaching to a specific SQL_ID. If your application uses changing literal values instead of bind variables, you will not be able to inject hints with SQL Patches.
The following example SQL was generated by nVision, the PeopleSoft GL reporting tool. The code is dynamically generated by the tool, and there is no way to add a hint directly. The generated code contains literal values; in particular, the SELECTOR_NUM will be different for every execution, so the SQL_ID will also be different for every execution. Therefore, it is not possible to use a SQL Patch to inject a hint.

SELECT L2.TREE_NODE_NUM,L3.TREE_NODE_NUM,SUM(A.POSTED_TOTAL_AMT) 
FROM PS_XX_SUM_CONSOL_VW A, PSTREESELECT05 L2, PSTREESELECT10 L3
WHERE A.LEDGER='S_USMGT'
AND A.FISCAL_YEAR=2017
AND A.ACCOUNTING_PERIOD BETWEEN 0 AND 12
AND (A.DEPTID BETWEEN 'A0000' AND 'A8999' OR A.DEPTID BETWEEN 'B0000' AND 'B9149'
OR A.DEPTID='B9156' OR A.DEPTID='B9158' OR A.DEPTID BETWEEN 'B9165' AND 'B9999'
OR A.DEPTID BETWEEN 'C0000' AND 'C9999' OR A.DEPTID BETWEEN 'D0000' AND 'D9999'
OR A.DEPTID BETWEEN 'G0000' AND 'G9999' OR A.DEPTID BETWEEN 'H0000' AND 'H9999'
OR A.DEPTID='B9150' OR A.DEPTID=' ')
AND L2.SELECTOR_NUM=10228
AND A.BUSINESS_UNIT=L2.RANGE_FROM_05
AND L3.SELECTOR_NUM=10231
AND A.ACCOUNT=L3.RANGE_FROM_10
AND A.CHARTFIELD1='0012345'
AND A.CURRENCY_CD='GBP'
GROUP BY L2.TREE_NODE_NUM,L3.TREE_NODE_NUM
/

If only the literal values differ, then the SQL statements will have the same force matching signature, and one SQL profile will match all similar statements (though statements will not force match if the number of predicates changes). Nonetheless, I can inject hints with a SQL profile. This is the plan I get initially, without a profile. It doesn't perform well, and it is not the plan I want.

Plan hash value: 808840077
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 10408 (100)| | | |
| 1 | HASH GROUP BY | | 517 | 50666 | 10408 (1)| 00:00:01 | | |
| 2 | HASH JOIN | | 517 | 50666 | 10407 (1)| 00:00:01 | | |
| 3 | PARTITION RANGE SINGLE | | 731 | 13158 | 3 (0)| 00:00:01 | 10228 | 10228 |
| 4 | INDEX FAST FULL SCAN | PSAPSTREESELECT05 | 731 | 13158 | 3 (0)| 00:00:01 | 10228 | 10228 |
| 5 | HASH JOIN | | 518 | 41440 | 10404 (1)| 00:00:01 | | |
| 6 | PARTITION RANGE SINGLE | | 249 | 5727 | 2 (0)| 00:00:01 | 10231 | 10231 |
| 7 | INDEX FAST FULL SCAN | PSAPSTREESELECT10 | 249 | 5727 | 2 (0)| 00:00:01 | 10231 | 10231 |
| 8 | PARTITION RANGE ITERATOR | | 7785 | 433K| 10402 (1)| 00:00:01 | 28 | 40 |
| 9 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| PS_X_LEDGER_ACCTS | 7785 | 433K| 10402 (1)| 00:00:01 | 28 | 40 |
| 10 | SORT CLUSTER BY ROWID BATCHED | | 5373 | | 5177 (1)| 00:00:01 | | |
| 11 | INDEX SKIP SCAN | PS_X_LEDGER_ACCTS | 5373 | | 5177 (1)| 00:00:01 | 28 | 40 |
-----------------------------------------------------------------------------------------------------------------------------------

These are the hints I want to introduce. I want to force materialize view rewrite, invoke parallelism if the statement is estimated to run for at least 2 seconds, and use a Bloom filter on the materialized view.

SELECT /*+OPT_PARAM('parallel_degree_policy','AUTO') OPT_PARAM('parallel_min_time_threshold',2) 
OPT_PARAM('parallel_degree_limit',4) REWRITE PX_JOIN_FILTER(PS_XX_SUM_GCNSL_MV)*/…

This application produces many statements that I may want to control with a profile, but with forced matching this comes down to a manageable number.  I have created a data-driven framework to create the profiles, starting with a working storage table that I will populate with details of each force matching signature for which I want to create a profile.

CREATE TABLE dmk_fms_profiles
(force_matching_signature NUMBER NOT NULL
,sql_id VARCHAR2(13)
,plan_hash_value NUMBER
,module VARCHAR2(64)
,action VARCHAR2(64)
,report_id VARCHAR2(32) /*Application Specific*/
,tree_list CLOB /*Application Specific*/
,sql_profile_name VARCHAR2(30)
,parallel_min_time_threshold NUMBER
,parallel_degree_limit NUMBER
,other_hints CLOB
,delete_profile VARCHAR2(1)
,sql_text CLOB
,CONSTRAINT dmk_fms_profiles_pk PRIMARY KEY (force_matching_signature)
,CONSTRAINT dmk_fms_profiles_u1 UNIQUE (sql_id)
,CONSTRAINT dmk_fms_profiles_u2 UNIQUE (sql_profile_name)
)
/

Using conditional parallelism with PARALLEL_MIN_TIME_THRESHOLD, but limited with PARALLEL_DEGREE_LIMIT, is a frequently used tactic with this application, so I have specified columns for those hints; any other hints are injected via a string. I identified the problematic SQL by analysis with ASH (see the sketch after the following insert), and hence also obtained the FORCE_MATCHING_SIGNATURE for each of them; the metadata is keyed by FORCE_MATCHING_SIGNATURE. Here I am specifying an additional hint to force materialized view rewrite and use a Bloom filter on the ledger table, and a meaningful name for the SQL profile.

INSERT INTO dmk_fms_profiles (force_matching_signature, parallel_min_time_threshold, parallel_degree_limit, other_hints, sql_profile_name) 
VALUES (16625752171077499412, 1, 4, 'REWRITE PX_JOIN_FILTER(PS_FT_SUM_GCNSL_MV)', 'NVS_GBGL123I_BU_CONSOL_ACCOUNT');
COMMIT;
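
For reference, this is roughly the sort of ASH query I mean, grouping sampled activity by force matching signature (a sketch; the module/action predicates are application specific, as above):

SELECT s.force_matching_signature, COUNT(*) ash_samples
FROM   v$active_session_history h
,      v$sql s
WHERE  s.sql_id = h.sql_id
AND    s.child_number = h.sql_child_number
AND    h.module = 'RPTBOOK'     /*Application Specific*/
AND    h.action LIKE 'PI=%:%:%' /*Application Specific*/
GROUP BY s.force_matching_signature
ORDER BY ash_samples DESC
/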

Profiles are created on the text of a SQL rather than the SQL_ID or FORCE_MATCHING_SIGNATURE directly. Therefore the SQL_TEXT must be extracted from the AWR, so this method also requires that the statement has been captured by an AWR snapshot.

UPDATE dmk_fms_profiles a
SET (module, action, sql_id, plan_hash_value, sql_text)
= (SELECT s.module, s.action, s.sql_id, s.plan_hash_value, t.sql_text
FROM dba_hist_sqlstat s
, dba_hist_sqltext t
WHERE t.dbid = s.dbid
AND t.sql_id = s.sql_id
AND s.force_matching_signature = a.force_matching_signature
AND s.snap_id = (
SELECT MAX(s1.snap_id)
FROM dba_hist_sqlstat s1
WHERE s1.force_matching_signature = a.force_matching_signature
AND s1.module = 'RPTBOOK' /*Application Specific*/
AND s1.action LIKE 'PI=%:%:%' /*Application Specific*/)
AND s.module = 'RPTBOOK' /*Application Specific*/
AND s.action LIKE 'PI=%:%:%' /*Application Specific*/
AND ROWNUM = 1)
WHERE sql_id IS NULL
/

MERGE INTO dmk_fms_profiles u
USING (
SELECT a.sql_id, a.force_matching_signature, p.name
FROM dmk_fms_profiles a
, dba_sql_profiles p
WHERE p.signature = a.force_matching_signature
) s
ON (s.force_matching_signature = u.force_matching_signature)
WHEN MATCHED THEN UPDATE
SET u.sql_profile_name = s.name
/

Columns REPORT_ID and TREE_LIST contain application specific information extracted from the application instrumentation and logging. They only mean something in the context of the application.

/*Application Specific - extract report ID from ACTION*/
UPDATE dmk_fms_profiles a
SET report_id = substr(regexp_substr(s.action,':([A-Za-z0-9_-])+',1,1),2)
WHERE report_id IS NULL
AND action IS NOT NULL
/

Now I can produce a simple report of the metadata in order to see what profiles should be created.

column sql_text word_wrapped on format a110
column module format a8
column report_id heading 'nVision|Report ID'
column tree_list word_wrapped on format a20
column plan_hash_value heading 'SQL Plan|Hash Value' format 9999999999
column parallel_min_time_threshold heading 'Parallel|Min Time|Threshold' format 999
column parallel_degree_limit heading 'Parallel|Degree|Limit' format 999
set long 500
SELECT * FROM dmk_fms_profiles
/

SQL Plan
FORCE_MATCHING_SIGNATURE SQL_ID Hash Value MODULE ACTION
------------------------ ------------- ----------- -------- ----------------------------------------------------------------
Parallel Parallel
nVision Min Time Degree
Report ID TREE_LIST SQL_PROFILE_NAME Threshold Limit D
-------------------------------- -------------------- ------------------------------ --------- -------- -
OTHER_HINTS
--------------------------------------------------------------------------------
SQL_TEXT
--------------------------------------------------------------------------------------------------------------
12803175998948432502 5pzxhha3392cs 988048519 RPTBOOK PI=3186222:USGL233I:10008
USGL233I BU_GAAP_CONSOL, NVS_GBGL123I_BU_CONSOL_ACCOUNT 1 4
GAAP_ACCOUNT
REWRITE PX_JOIN_FILTER(PS_XX_SUM_GCNSL_MV)
SELECT L2.TREE_NODE_NUM,A.ACCOUNT,SUM(A.POSTED_TOTAL_AMT) FROM PS_LEDGER A, PSTREESELECT05 L2, PSTREESELECT10 L3
WHERE A.LEDGER='S_GBMGT' AND A.FISCAL_YEAR=2017 AND A.ACCOUNTING_PERIOD BETWEEN 0 AND 12 AND (A.DEPTID BETWEEN
'A0000' AND 'A8999' OR A.DEPTID BETWEEN 'B0000' AND 'B9149' OR A.DEPTID='B9156' OR A.DEPTID='B9158' OR A.DEPTID
BETWEEN 'B9165' AND 'B9999' OR A.DEPTID BETWEEN 'C0000' AND 'C9999' OR A.DEPTID BETWEEN 'D0000' AND 'D9999' OR
A.DEPTID BETWEEN 'G0000' AND 'G9999' OR A.DE

Next, this PL/SQL block will create or recreate SQL profiles from the metadata. The various hints can be concatenated into a single string and passed as a parameter to SQLPROF_ATTR. The SQL text is passed as a parameter when the profile is created.

set serveroutput on
DECLARE
  l_signature NUMBER;
  h SYS.SQLPROF_ATTR;
  e_no_sql_profile EXCEPTION;
  PRAGMA EXCEPTION_INIT(e_no_sql_profile, -13833);
  l_description CLOB;
BEGIN

  FOR i IN (
    SELECT f.*, s.name
    FROM   dmk_fms_profiles f
      LEFT OUTER JOIN dba_sql_profiles s
      ON f.force_matching_signature = s.signature
  ) LOOP

    BEGIN
      IF i.name IS NOT NULL AND i.delete_profile = 'Y' THEN
        dbms_sqltune.drop_sql_profile(name => i.name);
      END IF;
    EXCEPTION WHEN e_no_sql_profile THEN NULL;
    END;

    IF i.delete_profile = 'Y' THEN
      NULL;
    ELSIF i.sql_text IS NOT NULL THEN
      h := SYS.SQLPROF_ATTR(
        q'[BEGIN_OUTLINE_DATA]',
        CASE WHEN i.parallel_min_time_threshold>=0 THEN 'OPT_PARAM(''parallel_degree_policy'',''AUTO'') ' END||
        CASE WHEN i.parallel_degree_limit >=0 THEN 'OPT_PARAM(''parallel_degree_limit'','||i.parallel_degree_limit||') ' END||
        CASE WHEN i.parallel_min_time_threshold>=0 THEN 'OPT_PARAM(''parallel_min_time_threshold'','||i.parallel_min_time_threshold||') ' END||
        i.other_hints,
        q'[END_OUTLINE_DATA]');

      l_signature   := DBMS_SQLTUNE.SQLTEXT_TO_SIGNATURE(i.sql_text);
      l_description := 'coe nVision '||i.report_id||' '||i.tree_list||' '||i.force_matching_signature||'='||l_signature;
      dbms_output.put_line(i.sql_profile_name||' '||l_description);

      DBMS_SQLTUNE.IMPORT_SQL_PROFILE (
        sql_text    => i.sql_text,
        profile     => h,
        name        => i.sql_profile_name,
        description => l_description,
        category    => 'DEFAULT',
        validate    => TRUE,
        replace     => TRUE,
        force_match => TRUE /* TRUE:FORCE (match even when different literals in SQL). FALSE:EXACT (similar to CURSOR_SHARING) */ );

    END IF;
  END LOOP;
END;
/

I can verify that the profile has been created, and the hints that it contains, thus:

SELECT x.profile_name,
       xmltype(x.comp_data) as xmlval
FROM   dmk_fms_profiles p
,      dba_sql_profiles s
,      dbmshsxp_sql_profile_attr x
WHERE  x.profile_name = p.sql_profile_name
AND    s.name = p.sql_profile_name
AND    s.status = 'ENABLED'
ORDER BY 1
/

PROFILE_NAME
------------------------------
XMLVAL
------------------------------------------------------------------------------------------------
NVS_GBGL123I_BU_CONSOL_ACCOUNT


And now when the application runs, I get the plan that I wanted.

  • The query runs in parallel.
  • The SQL is rewritten to use the materialized view.
  • There are no indexes on the materialized view, so it must be full scanned.
  • A Bloom filter is generated from PSTREESELECT10 and applied to the materialized view.
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2219 (100)| | | | | | |
| 1 | PX COORDINATOR | | | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10004 | 111 | 9879 | 2219 (6)| 00:00:01 | | | Q1,04 | P->S | QC (RAND) |
| 3 | HASH GROUP BY | | 111 | 9879 | 2219 (6)| 00:00:01 | | | Q1,04 | PCWP | |
| 4 | PX RECEIVE | | 111 | 9879 | 2219 (6)| 00:00:01 | | | Q1,04 | PCWP | |
| 5 | PX SEND HASH | :TQ10003 | 111 | 9879 | 2219 (6)| 00:00:01 | | | Q1,03 | P->P | HASH |
| 6 | HASH GROUP BY | | 111 | 9879 | 2219 (6)| 00:00:01 | | | Q1,03 | PCWP | |
| 7 | HASH JOIN | | 536 | 47704 | 2218 (6)| 00:00:01 | | | Q1,03 | PCWP | |
| 8 | PX RECEIVE | | 536 | 38056 | 2215 (6)| 00:00:01 | | | Q1,03 | PCWP | |
| 9 | PX SEND HYBRID HASH | :TQ10002 | 536 | 38056 | 2215 (6)| 00:00:01 | | | Q1,02 | P->P | HYBRID HASH|
| 10 | STATISTICS COLLECTOR | | | | | | | | Q1,02 | PCWC | |
| 11 | HASH JOIN | | 536 | 38056 | 2215 (6)| 00:00:01 | | | Q1,02 | PCWP | |
| 12 | BUFFER SORT | | | | | | | | Q1,02 | PCWC | |
| 13 | JOIN FILTER CREATE | :BF0000 | 236 | 3776 | 2 (0)| 00:00:01 | | | Q1,02 | PCWP | |
| 14 | PX RECEIVE | | 236 | 3776 | 2 (0)| 00:00:01 | | | Q1,02 | PCWP | |
| 15 | PX SEND BROADCAST | :TQ10000 | 236 | 3776 | 2 (0)| 00:00:01 | | | | S->P | BROADCAST |
| 16 | PARTITION RANGE SINGLE | | 236 | 3776 | 2 (0)| 00:00:01 | 36774 | 36774 | | | |
| 17 | INDEX FAST FULL SCAN | PSAPSTREESELECT10 | 236 | 3776 | 2 (0)| 00:00:01 | 36774 | 36774 | | | |
| 18 | JOIN FILTER USE | :BF0000 | 8859 | 475K| 2213 (6)| 00:00:01 | | | Q1,02 | PCWP | |
| 19 | PX BLOCK ITERATOR | | 8859 | 475K| 2213 (6)| 00:00:01 | 29 | 41 | Q1,02 | PCWC | |
| 20 | MAT_VIEW REWRITE ACCESS STORAGE FULL| PS_XX_SUM_GCNSL_MV | 8859 | 475K| 2213 (6)| 00:00:01 | 29 | 41 | Q1,02 | PCWP |
| 21 | BUFFER SORT | | | | | | | | Q1,03 | PCWC | |
| 22 | PX RECEIVE | | 731 | 13158 | 3 (0)| 00:00:01 | | | Q1,03 | PCWP | |
| 23 | PX SEND HYBRID HASH | :TQ10001 | 731 | 13158 | 3 (0)| 00:00:01 | | | | S->P | HYBRID HASH|
| 24 | PARTITION RANGE SINGLE | | 731 | 13158 | 3 (0)| 00:00:01 | 36773 | 36773 | | | |
| 25 | INDEX FAST FULL SCAN | PSAPSTREESELECT05 | 731 | 13158 | 3 (0)| 00:00:01 | 36773 | 36773 | | | |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
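
A quick way to confirm that the profile was picked up for a given cursor is the Note section of DBMS_XPLAN, something like this sketch (using the SQL_ID captured in the report above; any current SQL_ID of a matching statement works, and the note should read “SQL profile … used for this statement”):

SELECT * FROM table(dbms_xplan.display_cursor('5pzxhha3392cs', NULL, 'BASIC +NOTE'));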

Conclusion

SQL Profiles can be used in much the same way as SQL Patches to introduce hints into application SQL without changing the code, the difference being that SQL Profiles can force match SQL. However, SQL Profiles do require the Tuning Pack to be licenced, whereas SQL Patches do not. Some hints do not work because the profile is applied too late in the parse process. For example, if you use cursor sharing then the CURSOR_SHARING_EXACT hint will not take effect.

Purging Unused Triples Is a Never-ending Operation

This is a short post to point out a problem I recently discovered and for which I couldn’t find information online. Hence, in case you hit it, thanks to this short post I hope that you’ll save some time.

The documentation describes a procedure you can use to get rid of dangling triples stored in a semantic network. Simply put, you must execute a PL/SQL block like the following:

BEGIN
  sem_apis.purge_unused_values(flags => 'PARALLEL');
END;

Since one of my customers was/is concerned by the number of dangling triples stored in its databases, in October I experimented with the SEM_APIS.PURGE_UNUSED_VALUES procedure. Unfortunately, I quickly discovered that with a non-trivial number of triples it resulted in a never-ending operation: for one specific case I estimated ca. 300 days for a single run! It goes without saying that the issue was recognized as a bug.

Oracle Support provided a patch that I was able to test today. Note that I can’t share the patch number because, as far as I can see, it has no official number (two “references” are provided in the README: “12_1_02_SEM_NOV_30_2017” and “12.1.0.2.0_P1”). Anyway, the essential thing is that, according to my tests, the patch works. The formerly never-ending operation now takes less than 3 minutes to complete. Way better!

In case you experience a similar issue, you should open a service request. Oracle Support should be able to help you…

Graph Database in SQL Server 2017, Part I

I felt like playing with something new for the holidays  and chose to investigate the Graph Database feature in SQL Server 2017.

Graph Database is no simple feature, so this is going to take more than one post, but we’ll start here and see how long it takes for me to be distracted by another new feature…:)  As you can only have one user graph per database, that should limit my investigation to controlled chaos.

Graph databases are all about relationships-  relationships between data.  The output can provide insight into data that might not be apparent in a tabular fashion and the basic, graph data model is slightly different from how we think of a database model:

(Image: graph_db.png, the graph data model)

The first thing to know is that the feature is available in both Azure SQL Database and SQL Server 2017, as shown above. It’s comprised of node and edge tables.
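
Creating them is plain DDL with an AS NODE or AS EDGE clause. A minimal sketch, using hypothetical tables named to mirror the MATCH() example further down:

-- two node tables and one edge table (hypothetical names)
CREATE TABLE RelPerson1 (ID INTEGER PRIMARY KEY, Name VARCHAR(100)) AS NODE;
CREATE TABLE RelPerson2 (ID INTEGER PRIMARY KEY, Name VARCHAR(100)) AS NODE;
CREATE TABLE Friends AS EDGE;

-- node rows are inserted like any other row; SQL Server maintains $node_id
INSERT INTO RelPerson1 (ID, Name) VALUES (1, 'Joe');
INSERT INTO RelPerson2 (ID, Name) VALUES (2, 'Mary');

-- an edge row connects two nodes via their $node_id values
INSERT INTO Friends ($from_id, $to_id)
VALUES ((SELECT $node_id FROM RelPerson1 WHERE ID = 1),
        (SELECT $node_id FROM RelPerson2 WHERE ID = 2));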

You can quickly  locate the graph database objects in a SQL Server 2017 database  by querying sys.tables:

SELECT NAME as table_name, is_node as node_table, is_edge as edge_table
FROM sys.tables
WHERE IS_NODE=1 or IS_EDGE=1
ORDER BY  table_name;
GO

As you’ll note, the where clause is searching for node, (vertices) and edge, (relationship) values of 1.  And these are the two objects that create the graph database.  Those edge tables are going to define the relationship between each of the node tables, which then will render your graphs to provide value to the output.

Querying a graph database requires the use of a new TSQL function, called MATCH(), which will then graph out the entities to the screen, matching up each node via the appropriate edge to other nodes’ properties and attributes.  When data is complex and there is a need for a more visual, non-tabular view of it, a graph database output can provide incredible value.

SELECT RelPerson2.Name 
FROM  RelPerson1, Friends, RelPerson2
WHERE MATCH(RelPerson1-(Friends)->RelPerson2)
AND RelPerson1.Name = 'Joe';

The physical relationship between all the people in this query that are connected to Joe might appear something like the following:

(Image: graph2.png, the people connected to Joe)

For the DBA, it’s important to know that node and edge tables can have indexes just like other tables, but they most often provide benefit on those ID columns that connect node and edge tables, as that’s where the joins between the objects are made.
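
A sketch of what that might look like for an edge table (using the hypothetical Friends table from the earlier example; indexing the $from_id/$to_id pseudo-columns is the commonly recommended pattern):

CREATE INDEX IX_Friends_From_To ON Friends ($from_id, $to_id);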

Where they differ is foreign keys.  Outside processing and foreign keys aren’t required, because the relationships are already designed into a graph database via its edge tables.

I think the best examples I’ve seen of graph database use are for complex hierarchies, such as org charts and data centers.  As their use grows and people find more value in them, I have no doubt DBAs will be spending more time working with them.

Copyright © DBA Kevlar [Graph Database in SQL Server 2017, Part I], All Right Reserved. 2018.

The post Graph Database in SQL Server 2017, Part I appeared first on DBA Kevlar.

Oracle docker image from docker store

Did you notice that each time you want to download the Oracle binaries from the OTN website (for non-production usage) you have to click on “Accept License Agreement”? This is because you have to agree with the OTN license agreement, and that makes it hard to distribute an automated way to build an environment.
The only exception I have seen was the sealed envelopes provided for the RACattack:

In both cases, there must be a physical action involved to agree legally with the license terms.

Docker

There is now a new possibility where you click on ‘Agree’ only once, in the Docker Store, and then can download (aka ‘pull’) a container containing the binary distribution. You just go to: https://store.docker.com/images/oracle-database-enterprise-edition, login (you can create one in two minutes with a username, e-mail address, and password) and accept the license agreement:
(Screenshot: accepting the license agreement in the Docker Store)

Once this is done, you will be able to pull the Oracle containers from the command line, after a ‘docker login’.

It may not be easy to use Docker on your laptop, especially if you are on Windows and don’t want to enable Hyper-V. Here is how I run it on a VirtualBox VM running Oracle Enterprise Linux. You may wonder what the point is of running containers within a VM. But I think that you don’t have the choice here. The docker processes run within the host, which means that you need an OS that is supported (and Oracle Enterprise Linux is the best fitted to run Oracle Database). It also means that you need to adapt the kernel parameters (shm, limits), have the oracle user, etc. Better to do that in a VM dedicated to Oracle Database.

Then you may wonder what the point is of running in a container, given that you have to do all those installation prerequisites anyway, and installing Oracle is just two more commands (runInstaller and dbca). Well, it seems that the main reason is that it’s cool. In my opinion, any user of a database (from developer to administrator) should have installed an Oracle Database at least once with the GUI, because it is a good way to understand what a database is, a listener, a datafile, the characterset,… But let’s be cool and pull it instead of installing it.

Docker evolves quickly, I remove old releases just in case:

[root@localhost oracle]# yum -y remove docker-ce docker docker-common docker-selinux docker-engine
Loaded plugins: ulninfo
No Match for argument: docker-ce
No Match for argument: docker
No Match for argument: docker-common
No Match for argument: docker-selinux
No Match for argument: docker-engine
No Packages marked for removal

I’m on the latest OEL7:

[oracle@localhost ~]$ cat /etc/oracle-release
Oracle Linux Server release 7.4
 
[root@localhost oracle]# yum upgrade
Loaded plugins: ulninfo
No packages marked for update
 
[oracle@localhost ~]$ uname -a
Linux localhost.localdomain 4.1.12-103.10.1.el7uek.x86_64 #2 SMP Tue Dec 5 15:42:37 PST 2017 x86_64 x86_64 x86_64 GNU/Linux

I don’t want docker to fill my / filesystem, and those images with an Oracle Database are big, so I’m creating a link from /var/lib/docker to /u01

[root@localhost oracle]# mkdir /u01/docker
[root@localhost oracle]# ln -s /u01/docker /var/lib/docker
[root@localhost oracle]# ls -ld /var/lib/docker
lrwxrwxrwx. 1 root root 11 Dec 10 15:48 /var/lib/docker -> /u01/docker

Installing Docker (Community Edition):

[root@localhost oracle]# yum -y install docker-ce
Loaded plugins: ulninfo
docker-ce-stable | 2.9 kB 00:00:00
ol7_UEKR4 | 1.2 kB 00:00:00
ol7_addons | 1.2 kB 00:00:00
ol7_latest | 1.4 kB 00:00:00
docker-ce-stable/x86_64/primary_db | 10 kB 00:00:00
(1/7): ol7_UEKR4/x86_64/updateinfo | 135 kB 00:00:00
(2/7): ol7_addons/x86_64/updateinfo | 40 kB 00:00:00
(3/7): ol7_addons/x86_64/primary | 78 kB 00:00:00
(4/7): ol7_latest/x86_64/group | 681 kB 00:00:00
(5/7): ol7_latest/x86_64/updateinfo | 1.6 MB 00:00:02
(6/7): ol7_UEKR4/x86_64/primary | 25 MB 00:00:25
(7/7): ol7_latest/x86_64/primary | 31 MB 00:00:26
ol7_UEKR4 506/506
ol7_addons 251/251
ol7_latest 23517/23517
Resolving Dependencies
--> Running transaction check
---> Package docker-ce.x86_64 0:17.09.1.ce-1.el7.centos will be installed
--> Processing Dependency: container-selinux >= 2.9 for package: docker-ce-17.09.1.ce-1.el7.centos.x86_64
--> Running transaction check
---> Package container-selinux.noarch 2:2.21-1.el7 will be installed
--> Finished Dependency Resolution
 
Dependencies Resolved
 
==============================================================================================================================================================================================
Package Arch Version Repository Size
==============================================================================================================================================================================================
Installing:
docker-ce x86_64 17.09.1.ce-1.el7.centos docker-ce-stable 21 M
Installing for dependencies:
container-selinux noarch 2:2.21-1.el7 ol7_addons 28 k
 
Transaction Summary
==============================================================================================================================================================================================
Install 1 Package (+1 Dependent package)
 
Total download size: 21 M
Installed size: 76 M
Is this ok [y/d/N]: y
Downloading packages:
(1/2): container-selinux-2.21-1.el7.noarch.rpm | 28 kB 00:00:01
(2/2): docker-ce-17.09.1.ce-1.el7.centos.x86_64.rpm | 21 MB 00:00:07
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total 2.7 MB/s | 21 MB 00:00:07
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : 2:container-selinux-2.21-1.el7.noarch 1/2
Installing : docker-ce-17.09.1.ce-1.el7.centos.x86_64 2/2
Verifying : docker-ce-17.09.1.ce-1.el7.centos.x86_64 1/2
Verifying : 2:container-selinux-2.21-1.el7.noarch 2/2
 
Installed:
docker-ce.x86_64 0:17.09.1.ce-1.el7.centos
 
Dependency Installed:
container-selinux.noarch 2:2.21-1.el7
 
Complete!

Starting Docker:

[root@localhost oracle]# systemctl start docker
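
As a quick sanity check that the daemon is up and using the relocated storage location, docker info reports the storage root (a sketch, output abbreviated to the relevant line):

[root@localhost oracle]# docker info | grep -i 'root dir'
 Docker Root Dir: /var/lib/docker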

I have to login with my credentials. This is the way to connect with the agreement I accepted online:

[root@localhost oracle]# docker login
Login with your Docker ID to push and pull images from Docker Hub. If you don't have a Docker ID, head over to https://hub.docker.com to create one.
Username: franck.pachot
Password:
Login Succeeded

Then I pull the docker container provided by Oracle. The Oracle software is quite large when including all features, so I choose the ‘slim’ one:

[root@localhost oracle]# docker pull store/oracle/database-enterprise:12.2.0.1-slim
12.2.0.1-slim: Pulling from store/oracle/database-enterprise
4ce27fe12c04: Pull complete
9d3556e8e792: Pull complete
fc60a1a28025: Pull complete
0c32e4ed872e: Pull complete
be0a1f1e8dfd: Pull complete
Digest: sha256:dbd87ae4cc3425dea7ba3d3f34e062cbd0afa89aed2c3f3d47ceb5213cc0359a
Status: Downloaded newer image for store/oracle/database-enterprise:12.2.0.1-slim
[root@localhost oracle]#

Here is the image:

[root@localhost oracle]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
store/oracle/database-enterprise 12.2.0.1-slim 27c9559d36ec 3 months ago 2.08GB

To run a database, you just have to run the container. In order to connect to it, you need to forward the 1521 port:

[root@localhost oracle]# docker run -p 0.0.0.0:9001:1521 store/oracle/database-enterprise:12.2.0.1-slim
Setup Oracle Database
Oracle Database 12.2.0.1 Setup
Sun Dec 10 19:09:14 UTC 2017
 
Check parameters ......
log file is : /home/oracle/setup/log/paramChk.log
paramChk.sh is done at 0 sec
 
untar DB bits ......
log file is : /home/oracle/setup/log/untarDB.log

Ok. This takes some time. The ORACLE_HOME is unzipped, then the database created…

I’ll not describe further. Just go to the Setup Instructions in https://store.docker.com/images/oracle-database-enterprise-edition where everything is clearly explained.

In my opinion, it is good to try it and think about it. Docker was created to containerize an application with its process(es) and memory. A database is a different beast. The database is persistent, so you should store it in an external volume, because you don’t want to create a new empty database each time you start the container. There are also the configuration files which should be persistent: should they belong to the container or be external? And the logs under ORACLE_BASE_DIAG? Do you want to keep them? Purge them? Or just let the image grow, which can happen very quickly if you fill the Recovery Area. Finally, do you want to run a container in a Virtual Machine, this container running Oracle Database 12c, which is a Container Database (CDB), itself containing containers (PDBs)? Personally, I’m very skeptical about the usage of docker for an Oracle Database. But I also gave it an early try some years ago (read here). And you may see things differently in your context. The very good point is that Oracle now provides an easy way to test whether it helps you to run a database quickly or not, with a way to get the binaries without an ‘accept’ click for each download.
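
For the persistence point, the Setup Instructions mentioned above describe mounting a host directory for the database files; a sketch of what that looks like, assuming the /ORCL mount point documented there for this image:

[root@localhost oracle]# mkdir -p /u01/oradata/CDB
[root@localhost oracle]# docker run -d -p 0.0.0.0:9001:1521 -v /u01/oradata/CDB:/ORCL store/oracle/database-enterprise:12.2.0.1-slim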

Update 22-DEC-17

Here is the link to Gerald Venzl presentation on Best Practices for Oracle on Docker: https://www.slideshare.net/gvenzl/oracle-database-on-docker-best-practices

 

The post Oracle docker image from docker store appeared first on Blog dbi services.

Little things worth knowing: redo transport in Data Guard 12.2 part 1

I have researched Data Guard redo transport quite a bit, but so far haven’t really blogged about some of the details that I have worked out. This is what I am trying to do in this post.

In order to have a somewhat meaningful setup, I created a Data Guard configuration with 1 primary and 2 standby databases. They all reside on my lab (virtual) machines:

  • CDB1 currently operates in primary mode on server1
  • CDB2 and CDB3 are physical standby databases on server2
  • The VMs use Oracle Linux 7.4 with UEK4

The databases have been patched to 12.2.0.1.170814. I should probably have created a dedicated VM for the second standby database, but I hope what I found out with just 2 systems applies equally. If you know or find out otherwise, please let me know in the comments section of this article. This post is based on information from the Data Guard Concepts and Administration Guide version 12c as of 171212, as well as my own investigation shown here. There’s a good Oracle white paper out there on the impact of synchronous log shipping, with some “best practices”, which you might want to read as well.

Data Guard configuration

All my Data Guard instances are always managed via the Data Guard Broker, it’s just too convenient for me not to. Here’s the configuration detail for later reference:

DGMGRL> show configuration

Configuration - allthree

  Protection Mode: MaxAvailability
  Members:
  CDB1 - Primary database
    CDB2 - Physical standby database 
    CDB3 - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 22 seconds ago)

My protection mode is defined as MaxAvailability, which can be a good compromise between availability/data protection and performance. Your mileage will vary, picking the right protection mode for your environment is out of scope of this little article.

I am using sync redo transport to CDB2, but asynchronous redo transport to CDB3:

DGMGRL> show database 'CDB2' logxptmode
  LogXptMode = 'SYNC'
DGMGRL> show database 'CDB3' logxptmode
  LogXptMode = 'ASYNC'
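
(For reference, LogXptMode is the Broker property you would edit to change a destination’s transport mode, for example:)

DGMGRL> edit database 'CDB3' set property logxptmode = 'SYNC';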

In addition to the Broker, the primary database shows the same information:

SQL> select dest_id, destination, status, error, transmit_mode
  2  from v$archive_dest where destination is not null;

   DEST_ID DESTINATION                    STATUS    ERROR                TRANSMIT_MOD
---------- ------------------------------ --------- -------------------- ------------
         1 USE_DB_RECOVERY_FILE_DEST      VALID                          SYNCHRONOUS
         2 CDB2                           VALID                          PARALLELSYNC
         3 CDB3                           VALID                          ASYNCHRONOUS

DEST_ID 1 is of course the local archiving destination and doesn’t have anything to do with redo transport:

SQL> select dest_name, destination, type, valid_type, valid_role
  2  from v$archive_dest where dest_id = 1
  3  /

DEST_NAME            DESTINATION                    TYPE    VALID_TYPE      VALID_ROLE
-------------------- ------------------------------ ------- --------------- ------------
LOG_ARCHIVE_DEST_1   USE_DB_RECOVERY_FILE_DEST      PUBLIC  ALL_LOGFILES    ALL_ROLES

As @fritshoogland would point out, the log writer trace is a great source of information and I’ll show parts of it throughout the post. This excerpt shows how lgwr says it’s not responsible for destination 1:

Trace file /u01/app/oracle/diag/rdbms/cdb1/CDB1/trace/CDB1_lgwr_14986.trc
Oracle Database 12c Enterprise Edition Release 12.2.0.1.0 - 64bit Production
Build label:    RDBMS_12.2.0.1.0_LINUX.X64_170125
ORACLE_HOME:    /u01/app/oracle/product/12.2.0.1/dbhome_1
System name:    Linux
Node name:      server1
Release:        4.1.12-94.3.9.el7uek.x86_64
Version:        #2 SMP Fri Jul 14 20:09:40 PDT 2017
Machine:        x86_64
Instance name: CDB1
Redo thread mounted by this instance: 0 
Oracle process number: 20
Unix process pid: 14986, image: oracle@server1 (LGWR)


*** 2017-12-12T09:58:48.421846+00:00 (CDB$ROOT(1))
*** SESSION ID:(10.52290) 2017-12-12T09:58:48.421874+00:00
*** CLIENT ID:() 2017-12-12T09:58:48.421878+00:00
*** SERVICE NAME:() 2017-12-12T09:58:48.421882+00:00
*** MODULE NAME:() 2017-12-12T09:58:48.421885+00:00
*** ACTION NAME:() 2017-12-12T09:58:48.421888+00:00
*** CLIENT DRIVER:() 2017-12-12T09:58:48.421892+00:00
*** CONTAINER ID:(1) 2017-12-12T09:58:48.421895+00:00
 
Created 2 redo writer workers (2 groups of 1 each)

*** 2017-12-12T09:59:06.794537+00:00 (CDB$ROOT(1))
Destination LOG_ARCHIVE_DEST_3 is using asynchronous network I/O
Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR
...

The information I wanted to show you is found towards the end of this little extract: Destination LOG_ARCHIVE_DEST_1 is not serviced by LGWR.

Investigation into Data Guard redo transport

Like so many other areas in the database, Data Guard is well instrumented. The main views I’ll use for this post are v$managed_standby and v$dataguard_process. Let’s begin by looking at the processes on the primary with the intention to work out which process is transmitting redo.

SQL> select process,pid,status,client_process,client_pid,sequence#,block#
  2  from v$managed_standby
  3  order by status;

PROCESS   PID        STATUS       CLIENT_P CLIENT_PID  SEQUENCE#     BLOCK#
--------- ---------- ------------ -------- ---------- ---------- ----------
DGRD      15275      ALLOCATED    N/A      N/A                 0          0
DGRD      15128      ALLOCATED    N/A      N/A                 0          0
DGRD      15362      ALLOCATED    N/A      N/A                 0          0
DGRD      15124      ALLOCATED    N/A      N/A                 0          0
ARCH      15240      CLOSING      ARCH     15240              84          1
ARCH      15224      CLOSING      ARCH     15224              83          1
ARCH      15126      CLOSING      ARCH     15126              85     342016
ARCH      15233      CLOSING      ARCH     15233              82          1
LGWR      14986      WRITING      LGWR     14986              86      51577
LNS       15252      WRITING      LNS      15252              86      51577

10 rows selected.

OK, so we have a few archiver processes present, a group of “DGRD” processes, LNS, and LGWR. This is the current state of my primary database.

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     84
Next log sequence to archive   86
Current log sequence           86

I am running a low-impact Swingbench workload to generate some redo, but not too much (Average Apply Rate: 56.00 KByte/s according to the Broker).
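
That figure comes from the Broker’s per-database status; output trimmed to the relevant line (a sketch):

DGMGRL> show database 'CDB2'
...
  Average Apply Rate: 56.00 KByte/s
...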

These DGRD processes are visible as TTnn processes on the O/S-level. I used the listagg function to group their PIDs into an egrep-able format:

SQL> select process , listagg(pid, '|') within group (order by pid) as pids
  2  from v$managed_standby group by process;

PROCESS   PIDS
--------- ------------------------------
ARCH      15126|15224|15233|15240
DGRD      15124|15128|15275|15362
LGWR      14986
LNS       15252

The first group to map to Linux processes is DGRD (I have an idea what the archiver processes will look like on the O/S, so won’t go into that detail):

[oracle@server1 ~]$ ps -ef | egrep '15124|15128|15275|15362'
oracle   15124     1  0 09:59 ?        00:00:00 ora_tt00_CDB1
oracle   15128     1  0 09:59 ?        00:00:00 ora_tt01_CDB1
oracle   15275     1  0 09:59 ?        00:00:00 ora_tt03_CDB1
oracle   15362     1  0 09:59 ?        00:00:00 ora_tt04_CDB1

So there are tt00 – tt04 with the exception of tt02. The Oracle 12.2 reference refers to the TTnn processes as redo transport slaves. The LNS process from the query is our missing tt02:

[oracle@server1 ~]$ ps -ef | egrep 15252
oracle   15252     1  0 09:59 ?        00:00:04 ora_tt02_CDB1

Interestingly enough, the current (Oracle 12.2) documentation has nothing to say about the LNS process, or else it has hidden it well from me. I found a reference in the 10.2 Data Guard Concepts and Administration Guide, chapter 5 (Redo Transport), but nothing concrete.

The final process to show is LGWR:

[oracle@server1 ~]$ ps -ef | grep 14986
oracle   14986     1  0 09:58 ?        00:00:12 ora_lgwr_CDB1

The next step is to work out how processes on the primary and standby database map. v$dataguard_status is an interesting view, and it shows relevant information as seen in the alert.log. I found v$dataguard_process to be even more valuable. On the primary, I am getting the following results when querying it:

SQL> select name, pid, role, action, client_pid, client_role, sequence#, block#, dest_id
  2  from v$dataguard_process order by action;
  
NAME  PID        ROLE                    ACTION       CLIENT_PID CLIENT_ROLE       SEQUENCE#     BLOCK#    DEST_ID
----- ---------- ----------------------- ------------ ---------- ---------------- ---------- ---------- ----------
ARC0  15126      archive local           IDLE                  0 none                      0          0          0
ARC3  15240      archive redo            IDLE                  0 none                      0          0          0
TMON  15020      redo transport monitor  IDLE                  0 none                      0          0          0
NSS2  15029      sync                    IDLE                  0 none                      0          0          0
TT00  15124      gap manager             IDLE                  0 none                     86          0          2
TT01  15128      redo transport timer    IDLE                  0 none                      0          0          0
TT04  15362      controlfile update      IDLE                  0 none                      0          0          0
TT03  15275      heartbeat redo informer IDLE                  0 none                      0          0          0
ARC1  15224      archive redo            IDLE                  0 none                      0          0          0
ARC2  15233      archive redo            IDLE                  0 none                      0          0          0
TT02  15252      async ORL multi         WRITING               0 none                     86      92262          0
LGWR  14986      log writer              WRITING               0 none                     86      92271          0

12 rows selected.

So from that output it appears that LGWR and TT02 are the only processes actually writing. It would be interesting to know where they are writing to :) It’s also unclear to me why the process with PID 15252 appears as TT02 instead of LNS, as in v$managed_standby.
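
One way to find out where they write would be lsof against those PIDs, along these lines (a sketch, output not captured; -a combines the selections with a logical AND, and the onlinelog pattern matches the OMF paths shown in the alert.log excerpt below):

[root@server1 ~]# lsof -a -p 15252 -i TCP
[root@server1 ~]# lsof -p 14986 | grep onlinelog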

If you run the same query again you notice that the block# is increasing: as far as I remember that’s a sign of work! This coincides with information from the alert.log:

Destination LOG_ARCHIVE_DEST_2 is SYNCHRONIZED
LGWR: Standby redo logfile selected to archive thread 1 sequence 94
LGWR: Standby redo logfile selected for thread 1 sequence 94 for destination LOG_ARCHIVE_DEST_2
2017-12-12T13:23:07.153256+00:00
Thread 1 advanced to log sequence 94 (LGWR switch)
  Current log# 1 seq# 94 mem# 0: /u01/oradata/CDB1/onlinelog/o1_mf_1_dvh035kx_.log
  Current log# 1 seq# 94 mem# 1: /u01/fast_recovery_area/CDB1/onlinelog/o1_mf_1_dvh0363f_.log
2017-12-12T13:23:07.310438+00:00
Archived Log entry 194 added for T-1.S-93 ID 0x3784c932 LAD:1
2017-12-12T13:23:07.531896+00:00
TT02: Standby redo logfile selected for thread 1 sequence 94 for destination LOG_ARCHIVE_DEST_3

There’s also a new entry that wasn’t present in v$managed_standby, named NSS2. This process has not yet been discussed:

[oracle@server1 ~]$ ps -ef  | grep 15029
oracle   15029     1  0 09:58 ?        00:00:43 ora_nss2_CDB1

Its job is described by the database reference as transferring redo for synchronous targets. Nevertheless, I never saw it in a state other than “IDLE”. The aforementioned white paper however references it in section “Understanding how synchronous transport insures data integrity”.

In this particular section, the authors state that LGWR writes to the online redo logs and an identical network redo write is performed by NSSn. Looking at the netstat output, LGWR (and its 2 worker processes) does not have a connection to “server2” open, but NSS2 does:

[root@server1 ~]# ps -ef | egrep 'lgwr|lg0|nss'
oracle   14986     1  0 09:58 ?        00:01:19 ora_lgwr_CDB1
oracle   14990     1  0 09:58 ?        00:00:00 ora_lg00_CDB1
oracle   14994     1  0 09:58 ?        00:00:00 ora_lg01_CDB1
oracle   15029     1  0 09:58 ?        00:00:43 ora_nss2_CDB1

[root@server1 ~]# netstat -vW --numeric-ports -ee -p -la --tcp | egrep '14986|14990|14994|15029'
tcp        0      0 server1.example.com:15515 server2.example.com:1521 ESTABLISHED oracle     16400768   15029/ora_nss2_CDB1 
[root@server1 ~]# 

I cross-referenced this with output from lsof and it seems to be correct. What I haven’t been able to work out though is the relationship between a user session, LGWR and NSSn process. One clue is in the current log writer trace. Continuing from the previous trace:

*** 2017-12-12 09:59:12.450533 
Netserver NSS2 (PID:15029) for mode SYNC has been re-initialized
Performing a channel reset to ignore previous responses
Connecting as publisher to KSR Channel [id = 20]
Successfully reused NSS2 (PID:15029) for dest CDB2 mode SYNC ocis = 0x7fea5c6f6c40

There appears to be some sort of communication, although it’s not clear to me of what kind ;)

Summary

In this part of the two-part series I hoped to give you a few insights into Data Guard redo transport for synchronous and asynchronous redo shipping. In the next part the information in this article will be joined up with diagnostic information from server2 and both standby databases.