Oakies Blog Aggregator

Statistics on Object tables

Way back in Oracle 8.0 we introduced the “Object-Relational” database, which was “the next big thing” in the database community back then. Every vendor was scrambling to show just how cool their database technology was with the object-oriented programming paradigm.

Don’t get me wrong – using the Oracle database object types and features associated with them has made my programming life a lot easier over the years. But for me, it’s always been pretty much limited to that, ie, programming, not actually using the object types in a database design as such. Nevertheless, using objects as columns, or even creating tables of objects, is supported by the database. For example, I can create an object type of MY_OBJECT (which could itself be made up of objects) and then have a table, not with that object as a column, but actually a table of that object.


SQL> create type myobj as object  ( x int, y int );
  2  /

Type created.

SQL> create table t of myobj;

Table created.

To populate that table, I need to construct objects before they can be inserted, because we are inserting objects, not rows.


SQL> insert into t
  2  select myobj(rownum,rownum)
  3  from dual connect by level <= 10;

10 rows created.

Obviously, for good performance, we always need to gather optimizer statistics on database tables so that the optimizer can derive the best execution plans it can for any SQL statements that access this object table. Which leads to the question: Where can we see the optimizer statistics for an object table? Because once we get into object table territory, the first thing that normally takes DBAs and Developers by surprise is that the standard dictionary view for tables looks … sparse. Smile


SQL> exec dbms_stats.gather_table_stats('','T');

PL/SQL procedure successfully completed.

SQL>
SQL> select *
  2  from  user_tables
  3  where  table_name = 'T';

no rows selected

A quick perusal of the documentation reveals where we can get information for object tables – and that is the aptly named USER_OBJECT_TABLES view.


SQL> select *
  2  from   user_object_tables
  3  where  table_name = 'T'
  4  @pr
==============================
TABLE_NAME                    : T
TABLESPACE_NAME               : USERS
CLUSTER_NAME                  :
IOT_NAME                      :
STATUS                        : VALID
PCT_FREE                      : 10
PCT_USED                      :
INI_TRANS                     : 1
MAX_TRANS                     : 255
INITIAL_EXTENT                : 65536
NEXT_EXTENT                   : 1048576
MIN_EXTENTS                   : 1
MAX_EXTENTS                   : 2147483645
PCT_INCREASE                  :
FREELISTS                     :
FREELIST_GROUPS               :
LOGGING                       : YES
BACKED_UP                     : N
NUM_ROWS                      : 10
BLOCKS                        : 5
EMPTY_BLOCKS                  : 0
AVG_SPACE                     : 0
CHAIN_CNT                     : 0
AVG_ROW_LEN                   : 23
AVG_SPACE_FREELIST_BLOCKS     : 0
NUM_FREELIST_BLOCKS           : 0
DEGREE                        :          1
INSTANCES                     :          1
CACHE                         :     N
TABLE_LOCK                    : ENABLED
SAMPLE_SIZE                   : 10
LAST_ANALYZED                 : 18-FEB-19
PARTITIONED                   : NO
IOT_TYPE                      :
OBJECT_ID_TYPE                : SYSTEM GENERATED
TABLE_TYPE_OWNER              : MCDONAC
TABLE_TYPE                    : MYOBJ
TEMPORARY                     : N
SECONDARY                     : N
NESTED                        : NO
BUFFER_POOL                   : DEFAULT
FLASH_CACHE                   : DEFAULT
CELL_FLASH_CACHE              : DEFAULT
ROW_MOVEMENT                  : DISABLED
GLOBAL_STATS                  : YES
USER_STATS                    : NO
DURATION                      :
SKIP_CORRUPT                  : DISABLED
MONITORING                    : YES
CLUSTER_OWNER                 :
DEPENDENCIES                  : DISABLED
COMPRESSION                   : DISABLED
COMPRESS_FOR                  :
DROPPED                       : NO
SEGMENT_CREATED               : YES
INMEMORY                      : DISABLED
INMEMORY_PRIORITY             :
INMEMORY_DISTRIBUTE           :
INMEMORY_COMPRESSION          :
INMEMORY_DUPLICATE            :
EXTERNAL                      : NO
CELLMEMORY                    :
INMEMORY_SERVICE              :
INMEMORY_SERVICE_NAME         :
MEMOPTIMIZE_READ              : DISABLED
MEMOPTIMIZE_WRITE             : DISABLED
HAS_SENSITIVE_COLUMN          : NO

PL/SQL procedure successfully completed.

So far so good. But if I want to really dig down into the optimizer statistics for a table, I need to look further afield than just USER_TABLES and look at the dictionary views dedicated to optimizer statistics information, and that is when things get a little interesting for object tables.


SQL> select *
  2  from  user_tab_statistics
  3  where  table_name = 'T';

no rows selected

SQL> select *
  2  from   user_tab_col_statistics
  3  where  table_name = 'T'
  4  @pr
==============================
TABLE_NAME                    : T
COLUMN_NAME                   : SYS_NC_OID$
NUM_DISTINCT                  : 10
LOW_VALUE                     : 265DF9DB62004A2B90B6FB3D550CB9FD
HIGH_VALUE                    : E868127216F045C89F54B55A48EAD6CF
DENSITY                       : .1
NUM_NULLS                     : 0
NUM_BUCKETS                   : 1
LAST_ANALYZED                 : 18-FEB-19
SAMPLE_SIZE                   : 10
GLOBAL_STATS                  : YES
USER_STATS                    : NO
NOTES                         :
AVG_COL_LEN                   : 17
HISTOGRAM                     : NONE
SCOPE                         : SHARED
==============================
TABLE_NAME                    : T
COLUMN_NAME                   : X
NUM_DISTINCT                  : 10
LOW_VALUE                     : C102
HIGH_VALUE                    : C10B
DENSITY                       : .1
NUM_NULLS                     : 0
NUM_BUCKETS                   : 1
LAST_ANALYZED                 : 18-FEB-19
SAMPLE_SIZE                   : 10
GLOBAL_STATS                  : YES
USER_STATS                    : NO
NOTES                         :
AVG_COL_LEN                   : 3
HISTOGRAM                     : NONE
SCOPE                         : SHARED
==============================
TABLE_NAME                    : T
COLUMN_NAME                   : Y
NUM_DISTINCT                  : 10
LOW_VALUE                     : C102
HIGH_VALUE                    : C10B
DENSITY                       : .1
NUM_NULLS                     : 0
NUM_BUCKETS                   : 1
LAST_ANALYZED                 : 18-FEB-19
SAMPLE_SIZE                   : 10
GLOBAL_STATS                  : YES
USER_STATS                    : NO
NOTES                         :
AVG_COL_LEN                   : 3
HISTOGRAM                     : NONE
SCOPE                         : SHARED

PL/SQL procedure successfully completed.

I can get column level statistics, but there is no entry in USER_TAB_STATISTICS for my object table. Checking the documentation indicates that unlike the USER_TABLES/USER_OBJECT_TABLES pairing, there is no partnering object-based view to match USER_TAB_STATISTICS. Digging into the definition for the USER_TAB_STATISTICS view shows that object tables are explicitly excluded:


and bitand(t.property, 1) = 0 /* not a typed table */
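
If you want to verify that predicate yourself, the full view text can be pulled from the dictionary. A quick sketch (it assumes you have access to DBA_VIEWS; the TEXT column is a LONG, hence the SET LONG):

-- pull the view definition and look for the typed-table filter
set long 20000
select text
from   dba_views
where  view_name = 'USER_TAB_STATISTICS';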

That might not seem such a big deal, but what happens if you lock the statistics for a table?


SQL> exec dbms_stats.lock_table_stats('','T');

PL/SQL procedure successfully completed.

The STATTYPE_LOCKED column is not on USER_OBJECT_TABLES, and since we do not have an entry for the table in USER_TAB_STATISTICS, there is no direct means of seeing whether an object table has locked statistics. If you are faced with this problem, you have a couple of options at your disposal:

1) It would appear that the same flag on the internal dictionary table that indicates locked statistics is set for object tables as it would be for standard relational tables. Hence you could create a clone of the USER_TAB_STATISTICS view and remove the BITAND condition on the PROPERTY column. That of course is a very unsupported thing to do, and is just a rod for your back every time you patch or upgrade the database.

2) The other option is to assume that no-one is going to “mix and match” locking table statistics with index statistics. Every object table has an underlying index that is automatically created, so you can look at the locked status of this underlying index as a representative indicator of whether the table’s statistics are locked.


SQL> select *
  2  from  user_ind_statistics
  3  where  table_name = 'T'
  4  @pr
==============================
INDEX_NAME                    : SYS_C0028743
TABLE_OWNER                   : MCDONAC
TABLE_NAME                    : T
PARTITION_NAME                :
PARTITION_POSITION            :
SUBPARTITION_NAME             :
SUBPARTITION_POSITION         :
OBJECT_TYPE                   : INDEX
BLEVEL                        : 0
LEAF_BLOCKS                   : 1
DISTINCT_KEYS                 : 10
AVG_LEAF_BLOCKS_PER_KEY       : 1
AVG_DATA_BLOCKS_PER_KEY       : 1
CLUSTERING_FACTOR             : 1
NUM_ROWS                      : 10
AVG_CACHED_BLOCKS             :
AVG_CACHE_HIT_RATIO           :
SAMPLE_SIZE                   : 10
LAST_ANALYZED                 : 18-FEB-19
GLOBAL_STATS                  : YES
USER_STATS                    : NO
STATTYPE_LOCKED               : ALL
STALE_STATS                   : NO
SCOPE                         : SHARED

PL/SQL procedure successfully completed.

TL;DR: Querying optimizer statistics for object tables takes a little more care than with normal tables.

Oracle 19c Hint Usage reporting

One reason why we try to avoid hints in our queries is that they are very difficult to use correctly. No error is raised when the syntax is incorrect or when the hint cannot be applied semantically. In 19c, dbms_xplan has an important enhancement: it can report hint usage, at least for optimizer hints.

By default (the TYPICAL format), DBMS_XPLAN will report only invalid hints:

SQL> select /*+ FULL(DUAL) INDEX(BLABLABLA) BLABLABLA(DUAL) */ * from dual;
DUMMY
-----
X
SQL> select * from dbms_xplan.display_cursor(format=>'-cost');
PLAN_TABLE_OUTPUT
SQL_ID 3ps01tc9mxuhd, child number 0
-------------------------------------
select /*+ FULL(DUAL) INDEX(BLABLABLA) BLABLABLA(DUAL) */ * from dual
Plan hash value: 272002086
-------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Time |
-------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | TABLE ACCESS FULL| DUAL | 1 | 2 | 00:00:01 |
-------------------------------------------------------------
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (N - Unresolved (1), E - Syntax error (1))
--------------------------------------------------------------------
1 -  SEL$1
N - INDEX(BLABLABLA)
E - BLABLABLA

This tells me that for line Id=1 of the plan, the query block SEL$1 has two unused hints. One is a syntax error (E), because BLABLABLA is not a hint. The other, INDEX(), has valid syntax but mentions an alias that is not in the query, so it is unresolved (N).

We can also choose to show the hints that were correctly used:

SQL> select * from dbms_xplan.display_cursor('3ps01tc9mxuhd',format=>'+HINT_REPORT');
PLAN_TABLE_OUTPUT
SQL_ID 3ps01tc9mxuhd, child number 0
-------------------------------------
select /*+ FULL(DUAL) INDEX(BLABLABLA) BLABLABLA(DUAL) */ * from dual
Plan hash value: 272002086
--------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 2 (100)|
| 1 | TABLE ACCESS FULL| DUAL | 1 | 2 | 2 (0)|
--------------------------------------------------------------------
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 3 (N - Unresolved (1), E - Syntax error (1))
---------------------------------------------------------------------
1 -  SEL$1
N - INDEX(BLABLABLA)
E - BLABLABLA
1 -  SEL$1 / DUAL@SEL$1
- FULL(DUAL)

The FULL(DUAL) hint is correct and was used. Of course, this hint was not useful because there’s no other access path to DUAL, but that cannot be reported because, basically, the FULL() hint only tells the optimizer to ignore other access paths. So it was used even if it was not useful.

The DBMS_XPLAN formats are:

  • +HINT_REPORT_USED to show used hints
  • +HINT_REPORT_UNUSED to show unresolved and syntax errors
    this format flag is included in TYPICAL, the default format
  • +HINT_REPORT combines both of them and is the default with ALL

As an example, the following formats are the same

format=>'ALL -HINT_REPORT_UNUSED'
format=>'BASIC +HINT_REPORT_USED'

OTHER_XML

This information, displayed by all DBMS_XPLAN display functions, is available in the OTHER_XML column from the PLAN_TABLE, V$SQL_PLAN, AWR, STS, SPM,…

SQL> select extract(xmltype(other_xml),'//hint_usage') from v$sql_plan where other_xml like '%hint_usage%' and sql_id='3ps01tc9mxuhd';
EXTRACT(XMLTYPE(OTHER_XML),'//HINT_USAGE')
------------------------------------------

DBMS_XPLAN is the best way to format it, as this XML format is not documented. It seems that:

  • one element gives the query block name (the hint scope can be the statement, a query block, or an object alias)
  • the ‘@st’ attribute is PE for a parsing syntax error (‘E’ in the dbms_xplan note)
  • ‘@st’ is UR for unresolved (‘N’ in the dbms_xplan note)
  • ‘@st’ is ‘NU’ or ‘EU’ for unused (‘U’ in the dbms_xplan note)
  • another element holds the hint text
  • we might get a reason for unused hints in a dedicated element

Pre-19c

Before this feature, there was some information in the CBO trace, but very limited. Here is what I have for my statement:

Dumping Hints
=============
atom_hint=(@=0x7fcd2d8aa460 err=0 resol=1 used=1 token=83 org=1 lvl=3 txt=INDEX ("BLABLABLA") )
atom_hint=(@=0x7fcd2d8ac008 err=0 resol=1 used=1 token=448 org=1 lvl=3 txt=FULL ("DUAL") )
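
For reference, that “Dumping Hints” section comes from the optimizer trace (event 10053). A minimal sketch to generate one for the statement above; the trace file lands in the session’s usual diagnostic trace directory, and a hard parse is needed for the dump to appear:

-- enable the optimizer trace, run the hinted statement, then disable it
alter session set tracefile_identifier = 'hint_dump';
alter session set events '10053 trace name context forever, level 1';
select /*+ FULL(DUAL) INDEX(BLABLABLA) BLABLABLA(DUAL) */ * from dual;
alter session set events '10053 trace name context off';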

Examples with reason

Here is an example with statement-level hints FIRST_ROWS/ALL_ROWS:

SQL> explain plan for select /*+ first_rows(1) all_rows */ * from SCOTT.DEPT;
Explained.
SQL> select * from dbms_xplan.display(format=>'basic +hint_report');
PLAN_TABLE_OUTPUT
Plan hash value: 3383998547
----------------------------------
| Id | Operation | Name |
----------------------------------
| 0 | SELECT STATEMENT | |
| 1 | TABLE ACCESS FULL| DEPT |
----------------------------------
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (U - Unused (2))
--------------------------------------------------------------------
0 -  STATEMENT
U - all_rows / hint conflicts with another in sibling query block
U - first_rows(1) / hint conflicts with another in sibling query block

The message is clear: conflicting hints are all ignored.

SQL> explain plan for select /*+ first_rows(1) */ * from (
2 select /*+ all_rows */ * from SCOTT.DEPT
3 ) ;
Explained.
SQL> select * from dbms_xplan.display(format=>'basic +alias +hint_report');
[...]
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (U - Unused (1))
--------------------------------------------------------------------
0 -  STATEMENT
U - all_rows / hint overridden by another in parent query block
- first_rows(1)

Here the conflicting statement-level hints are in different query blocks, and the report tells me that only the parent one was used.

It is recommended to name the query blocks for complex queries. If some name conflicts (in the query or in underlying views) this will be reported:

SQL> explain plan for select /*+ qb_name(one) */ * from (
2 select /*+ qb_name(one) */ * from SCOTT.DEPT
3 ) ;
[...]
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$F5BB74E1 / DEPT@SEL$2
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (U - Unused (2))
--------------------------------------------------------------------
0 -  SEL$2
U - qb_name(one) / same QB_NAME hints for different query blocks
1 -  SEL$F5BB74E1
U - qb_name(one) / same QB_NAME hints for different query blocks

The consequence is that both were ignored.

You also get an error when the name is too long (more than 20 characters):

Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
--------------------------------------------------------------------
1 -  SEL$1
U - qb_name(X12345678901234567890) / hinted query block name is too long

With duplicate hints, one is ignored:

SQL> explain plan for select /*+ full(DEPT) full(DEPT) */ * from SCOTT.DEPT;
[...]
1 - SEL$1 / DEPT@SEL$1
U - full(DEPT) / duplicate hint
- full(DEPT)

When I exclude every possible join method, all my join hints are ignored:

SQL> explain plan for select /*+ leading(DEPT) no_use_nl(EMP) no_use_hash(EMP) no_use_merge(EMP) */ * from SCOTT.DEPT join SCOTT.EMP using(DEPTno);
[...]
1 - SEL$58A6D7F6
- leading(DEPT)
3 -  SEL$58A6D7F6 / EMP@SEL$1
U - no_use_hash(EMP) / all join methods are excluded by hints
U - no_use_merge(EMP) / all join methods are excluded by hints
U - no_use_nl(EMP) / all join methods are excluded by hints

If I mention an index name that does not exist, I see the reason:

SQL> explain plan for select /*+ index(DEPT SYS_AI_M4J1C) */ * from SCOTT.DEPT;
[...]

1 - SEL$1 / DEPT@SEL$1
U - index(DEPT SYS_AI_M4J1C) / index specified in the hint doesn't exist

This means that the table/alias DEPT exists but not the index mentioned.

Here is one where I use FULL() for an IOT:

SQL> explain plan for select /*+ full(IOT) */ count(*) from IOT where id=42;
Explained.
SQL> select * from dbms_xplan.display(format=>'basic +hint_report');
PLAN_TABLE_OUTPUT
Plan hash value: 3425135035
---------------------------------------------------
| Id | Operation | Name |
---------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | SORT AGGREGATE | |
| 2 | INDEX FAST FULL SCAN| SYS_IOT_TOP_73014 |
---------------------------------------------------
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
--------------------------------------------------------------------
2 -  SEL$1 / IOT@SEL$1
U - full(IOT) / FULL hint is same as INDEX_FFS for IOT

This is not really unused, because without a hint the optimizer would have chosen an INDEX UNIQUE SCAN. But the reason clearly explains that INDEX_FFS was substituted, as it is the IOT equivalent of FULL.

When the ignore_optim_embedded_hints hint is present, the other hints are ignored:

SQL> explain plan for select /*+ index(DEPT) ignore_optim_embedded_hints */ * from SCOTT.DEPT;
[...]
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 2 (U - Unused (1))
--------------------------------------------------------------------
0 -  STATEMENT
- ignore_optim_embedded_hints
1 -  SEL$1 / DEPT@SEL$1
U - index(DEPT) / rejected by IGNORE_OPTIM_EMBEDDED_HINTS

Exactly the same reason is displayed when it is set at session level:

SQL> alter session set optimizer_ignore_hints=true;
Session altered.
SQL>
SQL> explain plan for select /*+ index(DEPT) */ * from SCOTT.DEPT;
[...]
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
--------------------------------------------------------------------
1 -  SEL$1 / DEPT@SEL$1
U - index(DEPT) / rejected by IGNORE_OPTIM_EMBEDDED_HINTS

There is a similar parameter for parallel hints:

SQL> alter session set optimizer_ignore_parallel_hints=true;
Session altered.
SQL>
SQL> explain plan for select /*+ parallel(DEPT) */ * from SCOTT.DEPT;
[...]
Hint Report (identified by operation id / Query Block Name / Object Alias):
Total hints for statement: 1 (U - Unused (1))
--------------------------------------------------------------------
1 -  SEL$1 / DEPT@SEL$1
U - parallel(DEPT) / because of _optimizer_ignore_parallel_hints

Looking at the binaries (I don’t think it is exposed as a V$), here are some of the possible reasons:

The documentation has many other examples explained:

SQL Tuning Guide

MERGE and ORA-30926

Just a quick blog post on MERGE and the “unable to get a stable set of rows” error that often bamboozles people. This is actually just the script output from a pre-existing YouTube video (see below) that I’ve already done on this topic, but I had a few requests for the SQL example end-to-end, so here it is.

Imagine the AskTOM team had a simple table defining the two core members, Chris Saxon and myself. But in the style of my true Aussie laziness, I was very slack about checking the quality of the data I inserted.


SQL> create table oracle_team (
  2      pk number primary key,
  3      first_name varchar2(10),
  4      last_name varchar2(10)
  5  );

Table created.

SQL>
SQL> insert into oracle_team (pk, first_name, last_name) values (1, 'Connor', 'Macdonald');

1 row created.

SQL> insert into oracle_team (pk, first_name, last_name) values (3, 'kris', 'saxon');

1 row created.

SQL> commit;

Commit complete.

SQL> select * from oracle_team;

        PK FIRST_NAME LAST_NAME
---------- ---------- ----------
         1 Connor     Macdonald
         3 kris       saxon

2 rows selected.

You can see that the data is garbage. Both of our names are wrong so they need fixing. So I build a table called FIXES which lets people “queue up” fix requests to the table. I’ll add the 2 obvious fixes to that table.


SQL> create table fixes (
  2      team_pk number,
  3      first_name varchar2(10),
  4      last_name  varchar2(10),
  5      applied    date
  6  );

Table created.

SQL> insert into fixes values (1, 'Connor', 'McDonald',sysdate);

1 row created.

SQL> insert into fixes values (3, 'Chris', 'Saxon',sysdate);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from fixes;

   TEAM_PK FIRST_NAME LAST_NAME  APPLIED
---------- ---------- ---------- ---------
         1 Connor     McDonald   18-FEB-19
         3 Chris      Saxon      18-FEB-19

2 rows selected.

To apply those fixes to the table, a simple MERGE command is all I need. Notice that MERGE does not have to be a “full” merge (ie, update AND insert), you can pick and choose what elements you want. MERGE is very powerful and flexible in that regard.


SQL>
SQL>
SQL> merge into oracle_team target
  2  using (select team_pk, first_name, last_name
  3         from fixes
  4        ) source
  5  on (target.pk = source.team_pk)
  6  when matched then
  7  update set
  8      target.first_name = source.first_name,
  9      target.last_name = source.last_name
 10
SQL> pause

SQL> /

2 rows merged.

SQL>
SQL> select * from oracle_team;

        PK FIRST_NAME LAST_NAME
---------- ---------- ----------
         1 Connor     McDonald
         3 Chris      Saxon

2 rows selected.

So all looks well. Let me now show how what seems like a simple repeat of that operation can get us into trouble. I’ve dropped all the tables, so that I can recreate the demo from scratch.


SQL>
SQL> create table oracle_team (
  2      pk number primary key,
  3      first_name varchar2(10),
  4      last_name varchar2(10)
  5  );

Table created.

SQL>
SQL> insert into oracle_team (pk, first_name, last_name) values (1, 'Connor', 'Macdonald');

1 row created.

SQL> insert into oracle_team (pk, first_name, last_name) values (3, 'kris', 'saxon');

1 row created.

SQL> commit;

Commit complete.

SQL> select * from oracle_team;

        PK FIRST_NAME LAST_NAME
---------- ---------- ----------
         1 Connor     Macdonald
         3 kris       saxon

2 rows selected.

SQL> create table fixes (
  2      team_pk number,
  3      first_name varchar2(10),
  4      last_name  varchar2(10),
  5      applied    date
  6  );

Table created.

This time we’ll assume that repeated fix requests have come in for a single AskTOM team member (PK=1). My first fix request was to change the “Mac” to “Mc” in McDonald, but then I got all picky and realised that I’d like to have a capital “D” in McDonald. Fussy fussy fussy Smile


SQL> insert into fixes values (1, 'Connor', 'Mcdonald',sysdate-1);

1 row created.

SQL> insert into fixes values (1, 'Connor', 'McDonald',sysdate);

1 row created.

SQL> insert into fixes values (3, 'Chris', 'Saxon',sysdate);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from fixes;

   TEAM_PK FIRST_NAME LAST_NAME  APPLIED
---------- ---------- ---------- ---------
         1 Connor     Mcdonald   17-FEB-19
         1 Connor     McDonald   18-FEB-19
         3 Chris      Saxon      18-FEB-19

3 rows selected.

SQL>

Look what happens now when I re-attempt the MERGE.


SQL>
SQL>
SQL> merge into oracle_team target
  2  using (select team_pk, first_name, last_name
  3         from fixes
  4        ) source
  5  on (target.pk = source.team_pk)
  6  when matched then
  7  update set
  8      target.first_name = source.first_name,
  9      target.last_name = source.last_name
 10
SQL> pause

SQL> /
merge into oracle_team target
           *
ERROR at line 1:
ORA-30926: unable to get a stable set of rows in the source tables

Conceptually this example (hopefully) makes it clear why the error occurred. Depending on which FIXES row for my target row (PK=1) the database sees first, the end result of the MERGE could be different. And we really can’t allow that, because it means the results are pseudo-random.

So how can we fix these things? We need a stable set of rows, in the sense that the MERGE result should never be questionable based on the order in which we process the input set of rows. In the example above, we can probably assume that the last fix request for a given primary key value is the one that should always take precedence. Hence the source set of fixes can be altered as below.


SQL>
SQL>
SQL> select *
  2  from
  3  (select team_pk,
  4          first_name,
  5          last_name,
  6          row_number() over ( partition by team_pk order by applied desc ) as r
  7   from fixes
  8  ) where r = 1;

   TEAM_PK FIRST_NAME LAST_NAME           R
---------- ---------- ---------- ----------
         1 Connor     McDonald            1
         3 Chris      Saxon               1

2 rows selected.

which then becomes an input into the MERGE command and the error is avoided.


SQL>
SQL> merge into oracle_team target
  2  using
  3  ( select *
  4    from
  5    (select team_pk,
  6            first_name,
  7            last_name,
  8            row_number() over ( partition by team_pk order by applied desc ) as r
  9     from fixes
 10    ) where r = 1
 11  ) source
 12  on (target.pk = source.team_pk)
 13  when matched then
 14  update set
 15      target.first_name = source.first_name,
 16      target.last_name = source.last_name
 17

SQL> /

2 rows merged.

SQL>
SQL> select * from oracle_team;

        PK FIRST_NAME LAST_NAME
---------- ---------- ----------
         1 Connor     McDonald
         3 Chris      Saxon

2 rows selected.

The video walking through this demo came from an Office Hours session, and you can watch it below.

Friday Philosophy – Size is Relative

The below is a USB memory stick, a 64GB USB memory stick which I bought this week for about 10€/$. I think the first USB memory stick I bought was 8MB (1/8000 the capacity) and cost me twice as much.

This is a cheap, almost entry-level USB memory stick now – you can get 0.5TB ones for around €50. I know, I know, they keep getting larger. As does the on-board memory of computers, disc/SSD size, and network speed. (Though compute speed seems to be stalling and has dropped below Moore’s law, even taking into account the constant rise in core count.) But come on, think about what you can store on 64GB. Think back to some of the large systems you worked on 10 years ago and how vast you might have thought the data was.

What made me sit back a little is that I’ve worked with VLDBs (Very Large DataBases) for most of my career, the first one being in about 1992/3. And that was a pretty big one for its time; I think it was the largest database Oracle admitted to working on in the UK back then. You guessed it – this little USB memory stick would hold the whole database, plus some to spare. What did the VLDB hold? All the patient activity information for a large UK hospital – so all patients who had ever been to the hospital, all their inpatient and outpatient stays, waiting lists, a growing list of lab results (blood tests, x-rays)… The kit to run this took up the space of a large lorry/shipping container. And another shipping container of kit to cool it.

What makes a database a VLDB? Well, this old post here from 2009 probably still covers it, but put simply it is a database where the sheer size gives you problems to solve – how to back it up, how to migrate it, how to access the data within it in a timely manner. It is not just about raw volume, it also depends on the software you are using and the hardware. Back in the mid 2000’s we had two types of VLDB where I worked:

  • Anything above 100GB or so on MySQL was a VLDB as the database technology struggled
  • Anything above 2TB on Oracle was a VLDB as we had trouble getting enough kit to handle the IO requirements and memory for the SGA.

That latter issue was interesting. There was kit that could run a 2TB Oracle database with ease back then, but it cost millions. That was our whole IT budget, so we had to design a system using what were really beefed-up PCs and RAC. It worked. But we had to plan and design it very carefully.

So size in I.T. is not just about the absolute volume. It is also to do with what you need to do with the data and what hardware is available to you to solve your volume problems.

Size in I.T. is not absolute – It is relative to your processing requirements and the hardware available to you

That USB stick could hold a whole hospital system, possibly even now if you did not store images or DNA information. But with a single port running at a maximum of 200MB/s and, I don’t know, maybe 2,000 IOPS (read – writes can suck for USB memory sticks), could it support a whole hospital system? Maybe just, I think those figures would have been OK for a large storage array in 1993! But in reality, what with today’s chatty application design and larger user base, probably not. You would have to solve some pretty serious VLDB-type issues.

Security would also be an issue – it’s real easy to walk off with a little USB memory stick! Some sort of data encryption would be needed…

The death of UTL_FILE – part 2

I wrote a post a while back called “The Death of UTL_FILE”, and probably because of its click-bait title I got lots of feedback, so I’m back to flog that horse Smile. Seriously though, I stand behind my assertion in that post, that the majority of usages of UTL_FILE I’ve seen in my career are mimicking the spooling behaviour of a SQL*Plus script. And as that post pointed out, you can now achieve that functionality directly with the scheduler.

That is well and good for writing files from the database, and I added:

“I’ll come back to UTL_FILE to read files in a future post”.

That time has come! Many developers will be aware of the feature known as external tables but a criticism of those has been that whereas UTL_FILE only needs permissions on the appropriate directory object to read a file, an external table requires a DDL definition, which typically means a lot of red tape and processes to follow when trying to implement changes.

Some of that pain was alleviated in 12c Release 2 when it became possible to modify portions of the external table definition on the fly at query execution time. For example, if you needed to change the file name each day, then rather than having to re-craft an external table DDL, or even run an ‘alter’ command, it could be done as required by the query:


SQL> select * from ext_emp
  2    external modify ( location ('emp20161002.dat') );

     EMPNO ENAME      JOB              MGR HIREDATE         SAL
---------- ---------- --------- ---------- --------- ---------- ---
      7902 FORD       ANALYST         7566 03-DEC-81       3000
      7934 MILLER     CLERK           7782 23-JAN-82       1300        
      7566 JONES      MANAGER         7839 02-APR-81       2975

That still of course does not alleviate the need for the initial table DDL. But in 18c, you can query an external file, and the only thing you need is the appropriate privileges on the directory object. Sounds a lot like UTL_FILE, no? In 18c, we can query an external source directly within a query. Here’s a simple example using the following data file (emp.dat) as a source:


7369,SMITH,CLERK,7902,800,,20
7499,ALLEN,SALESMAN,7698,1600,300,30
7521,WARD,SALESMAN,7698,1250,500,30
7566,JONES,MANAGER,7839,2975,,20
7654,MARTIN,SALESMAN,7698,1250,1400,30
7698,BLAKE,MANAGER,7839,2850,,30
7782,CLARK,MANAGER,7839,2450,,10
7788,SCOTT,ANALYST,7566,3000,,20
7839,KING,PRESIDENT,,5000,,10
7844,TURNER,SALESMAN,7698,1500,,30
7876,ADAMS,CLERK,7788,1100,,20
7900,JAMES,CLERK,7698,950,,30
7902,FORD,ANALYST,7566,3000,,20
7934,MILLER,CLERK,7782,1300,,10

The query (in this case) just requires the EXTERNAL modifier and a definition of the columns and we’re done!


SQL> select * from
  2  external (
  3     (
  4       empno     number(4),
  5       ename     varchar2(12),
  6       job       varchar2(12),
  7       mgr       number(4),
  8       sal       number(7,2),
  9       comm      number(7,2),
 10       deptno    number(2)
 11     )
 12  type oracle_loader
 13  default directory TMP
 14  location ('emp.dat')
 15  );

     EMPNO ENAME        JOB                 MGR        SAL       COMM     DEPTNO
---------- ------------ ------------ ---------- ---------- ---------- ----------
      7369 SMITH        CLERK              7902        800                    20
      7499 ALLEN        SALESMAN           7698       1600        300         30
      7521 WARD         SALESMAN           7698       1250        500         30
      7566 JONES        MANAGER            7839       2975                    20
      7654 MARTIN       SALESMAN           7698       1250       1400         30
      7698 BLAKE        MANAGER            7839       2850                    30
      7782 CLARK        MANAGER            7839       2450                    10
      7788 SCOTT        ANALYST            7566       3000                    20
      7839 KING         PRESIDENT                     5000                    10
      7844 TURNER       SALESMAN           7698       1500                    30
      7876 ADAMS        CLERK              7788       1100                    20
      7900 JAMES        CLERK              7698        950                    30
      7902 FORD         ANALYST            7566       3000                    20
      7934 MILLER       CLERK              7782       1300                    10

14 rows selected.

Of course, UTL_FILE is more about total flexibility in the content you read. After all, you (typically) read a line from the file and then parse it however you please. But the same is possible with the 18c syntax. Here I’ll take some free format text from the Oracle documentation in a file plain.dat.


9.5 System Triggers
A system trigger is created on either a schema or the database.
Its triggering event is composed of either DDL statements (listed in "ddl_event") or database operation statements (listed in "database_event").
A system trigger fires at exactly one of these timing points:
Before the triggering statement runs
(The trigger is called a BEFORE statement trigger or statement-level BEFORE trigger.)
After the triggering statement runs
(The trigger is called a AFTER statement trigger or statement-level AFTER trigger.)
Instead of the triggering CREATE statement
(The trigger is called an INSTEAD OF CREATE trigger.)
Topics
SCHEMA Triggers
DATABASE Triggers
INSTEAD OF CREATE Triggers
9.5.1 SCHEMA Triggers
A SCHEMA trigger is created on a schema and fires whenever the user who owns it is the current user and initiates the triggering event.
Suppose that both user1 and user2 own schema triggers, and user1 invokes a DR unit owned by user2. Inside the DR unit, user2 is the current user. Therefore, if the DR unit initiates the triggering event of a schema trigger that user2 owns, then that trigger fires. However, if the DR unit initiates the triggering event of a schema trigger that user1 owns, then that trigger does not fire.
Example 9-19 creates a BEFORE statement trigger on the sample schema HR. When a user connected as HR tries to drop a database object, the database fires the trigger before dropping the object.

To mimic the line-by-line read style of UTL_FILE, I can simply define a single column holding a line from the data file.


SQL> select * from
  2  external (
  3     (
  4       text varchar2(4000)
  5     )
  6  type oracle_loader
  7  default directory TMP
  8  location ('plain.dat')
  9  );

TEXT
------------------------------------------------------------
9.5 System Triggers
A system trigger is created on either a schema or the databa
se.

Its triggering event is composed of either DDL statements (l
isted in "ddl_event") or database operation statements (list
ed in "database_event").

A system trigger fires at exactly one of these timing points
:

Before the triggering statement runs
(The trigger is called a BEFORE statement trigger or stateme
nt-level BEFORE trigger.)

After the triggering statement runs
(The trigger is called a AFTER statement trigger or statemen
t-level AFTER trigger.)

Instead of the triggering CREATE statement
(The trigger is called an INSTEAD OF CREATE trigger.)
Topics
SCHEMA Triggers
DATABASE Triggers
INSTEAD OF CREATE Triggers
9.5.1 SCHEMA Triggers
A SCHEMA trigger is created on a schema and fires whenever t
he user who owns it is the current user and initiates the tr
iggering event.

Suppose that both user1 and user2 own schema triggers
Example 9-19 creates a BEFORE statement trigger on the sampl
e schema HR. When a user connected as HR tries to drop a dat
abase object


18 rows selected.
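
For comparison, here is roughly the UTL_FILE style of code that the query above replaces. This is just a sketch, assuming the same TMP directory object and the same plain.dat file:

declare
  f    utl_file.file_type;
  line varchar2(4000);
begin
  f := utl_file.fopen('TMP','plain.dat','r');
  loop
    begin
      utl_file.get_line(f,line);     -- read one line
    exception
      when no_data_found then exit;  -- end of file
    end;
    dbms_output.put_line(line);      -- then parse it however you please
  end loop;
  utl_file.fclose(f);
end;
/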

So there is plenty of potential there to reduce the amount of source code to maintain using this nice syntax. The line between database data and external data gets blurrier, which is a great convenience for developers and ad-hoc users in being able to bring all the power of SQL to more data sources.

Footnote: When you query an external data source in 18c in the way mentioned above, behind the scenes, the database may create global temporary tables to support the operation. In the first example above, the database performed the following DDL recursively:


CREATE GLOBAL TEMPORARY TABLE "SCOTT"."SYS_TEMP_0FD9D678B_BDA818A8" SHARING=NONE  
("EMPNO" NUMBER(4),
 "ENAME" VARCHAR2(12) COLLATE "USING_NLS_COMP" ,
 "JOB" VARCHAR2(12) COLLATE "USING_NLS_COMP" ,
 "MGR" NUMBER(4),"SAL" NUMBER(7,2),
 "COMM" NUMBER(7,2),
 "DEPTNO" NUMBER(2) 
) 
IN_MEMORY_METADATA CURSOR_SPECIFIC_SEGMENT INLINE_XT  
STORAGE (OBJNO 4254951307 ) NOPARALLEL

That should not be cause for concern, but if you have hundreds of differently structured external data sources, and/or triggers that fire on DDL operations on the database, then you might see some impact. As with all things, thorough testing should always precede production implementation.

Use the features available!

Advance warning: This post is just another normal Friday morning rant. If you’re not interested in my pontificating, move along…nothing else to see here Smile

Sometimes you can’t make use of a facility that you normally would, and you have to substitute in something else. For example, if I would normally take the train to the basketball game, but today it’s not running due to track maintenance, then I’ll take the bus. I have no problem with that, because there’s a reason that I can’t take the train that day.

What does get my goat is when, on a day the train is running, you come to me and say:

“Can you tell me how to get to the basketball game? Oh, by the way, I can’t take the train; I won’t use the bus; I don’t have a car; planes bother me; I get sea sick; I’m allergic to horses, and my feet hurt when I walk.”

Because all I’m thinking then is: “Then just don’t go to the damn game!” Smile

So let’s bring that metaphor into the world of Oracle. I always chuckle when AskTOM questions come in along the lines of:

“I need to achieve ‘X’. I have read that there is a feature ‘Y’ that achieves ‘X’. But I cannot use ‘Y'”

Don’t get me wrong. If there is a reason why “Y” is not a potential solution (eg, requires Enterprise Edition or perhaps an optional feature), then fine. We can then try to work within those restrictions. But often what seems to be the case is that the statement:

“I cannot use ‘Y'”

is really just someone saying:

“I am not familiar with ‘Y’ so I’d rather not take the time and effort to become familiar with it”

and that’s…well…annoying.

No-one expects you to have an immediate grasp and knowledge of the entire suite of features available in the Oracle database. I doubt that anyone has that breadth of knowledge. That is why we have the documentation; that is why we have community forums and sites like AskTOM, so that the collective knowledge of the community can be of assistance. But if you’re not prepared to spread your wings a little, and learn those new things, then you’re missing out on some of the great functionality available to you in the Oracle database.

AskTOM Question: How can I restrict the number of sessions a user has?

Us: You can use a profile for this.

Response: I’d rather not use a profile

Us: OK then, perhaps look at using Resource Manager to limit session activity.

Response: I’d rather not use Resource Manager

Us: OK then, perhaps look at using a logon trigger to perform your own custom logic to manage sessions.

Response: I don’t want to write a trigger for this.

Us: Then just don’t go to the damn game! Smile
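
(And for the record, the profile answer really is tiny. A minimal sketch, with made-up names, of capping a user at two concurrent sessions:)

-- limit each user with this profile to 2 concurrent sessions
create profile two_sessions_only limit sessions_per_user 2;
alter user scott profile two_sessions_only;
-- the limit is only enforced when RESOURCE_LIMIT is TRUE
alter system set resource_limit = true;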

 

The 19c database has just come out, and I know it’s always tempting to focus just on the new features that come with a release, but rest assured that inside Oracle, the focus for 19c was less on new features and more on honing and improving the existing features in the Oracle database, to make them even more compelling solutions for your business requirements. But all of that won’t count for much if you’re not prepared to give them a chance.

Network troubleshooting with tcpdump and strace

Here is a little example of using tcpdump and strace to troubleshoot a network issue with an Oracle connection. It may not be the best approach for all cases; it is just an example, a copy/paste of my screen after I analyzed the issue. I just changed the server names.

At some point, the communication between two servers was hanging, with both endpoints waiting on read() — like the ‘SQL*Net message from client’ wait event on the server side. This issue occurred only in some circumstances: a case that always reproduced it was RMAN correctly connected to the catalog, but hanging when we ran a ‘show all’.

Client netstat and strace

I have RMAN started and connected to the catalog.

I identify my client process, in this case the RMAN executable. I display its TCP connection with netstat and its current system calls with strace:

[root@client tmp]# netstat -p | grep "/rman"
Proto Local Address Foreign Address State
tcp client:37051 server:1521 ESTABLISHED 192131/rman
[root@client tmp]# strace -yttp 192131
Process 192131 attached
10:41:29.168494 read(0,

I can see the connection from client to server on port 1521 and the process is waiting on user input (stdin).

Server netstat and strace

On the remote side, I see the same TCP connection (same hosts and ports):

[root@server log]# netstat -p | grep ":37051"
Proto Local Address Foreign Address State
tcp server:1521 client:37051 ESTABLISHED 36761/oraclePDBR2

and the process is waiting on socket (‘SQL*Net message from client’):

[root@server log]# strace -yttp 36761
Process 36761 attached
10:41:43.745239 read(20<socket:[3785769780]>,

Client/server dialogue

Here is where I run ‘show all;’ which will get hung. I’ll show all the network dialogue from this point.

I run tcpdump in background to show the packets sent on client :

[root@client tmp]# tcpdump -i bond0 src client and dst server &

and on server:

[root@server tmp]# tcpdump -i bond0 src server and dst client &

In both, I have strace attached to the processes identified by netstat -p (strace -y to show the file name behind each descriptor and -tt to show timestamps).

Here are the traces just after I run ‘show all;’ on the client, which queries the server.

The client sends a statement. Here is the write (from strace) and TCP message (from tcpdump):

10:43:44.309812 write(15, "\3+\0\0\6\0\0\0\0\0\21i{\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\5\0\0"..., 811) = 811
10:43:44.309853 IP client.37051 > server.1521: Flags [P.], seq 34484:35295, ack 17685, win 442, length 811

The server receives it and strace displays the end of the line with the read() return code (which is the size of the message):

"\3+\0\0\6\0\0\0\0\0\21i{\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\5\0\0"..., 8208) = 811

In the meanwhile, the client waits for the answer:

10:43:44.309892 read(15,

Then the server answers:

10:43:44.356795 write(20, "\0\355\0\0\6\0\0\0\0\0\v\1\5q\6\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 237
10:43:44.356830 IP server.1521> client.37051: Flags [P.], seq 17685:17922, ack 35296, win 442, length 237
) = 237

We see strace printing the call (write 237 bytes), then tcpdump showing the packet going out (same size: bytes 17685 to 17922 in this server->client stream), and then strace continues the line when the write() system call returns, displaying the number of bytes written.

Sending this packet is also the occasion to acknowledge what was received — up to byte 35296 of the client->server message.

The client gets it, strace showing the end of the read() line:

10:43:44.357178 IP client.37051 > server.1521: Flags [.], ack 17922, win 442, length 0
"\0\355\0\0\6\0\0\0\0\0\v\1\5q\6\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 237

Note that an empty message (length=0) was sent by the client to acknowledge that it has received up to byte 17922 in the server->client message.

After having sent its answer, the server waits for the next query (and acknowledges what was received recently with a length 0 packet):

10:43:44.356894 read(20, 
10:43:44.357493 IP server.1521> client.37051: Flags [.], ack 36267, win 442, length 0

The client sends a new query:

10:43:44.357368 write(15, "\3\313\0\0\6\0\0\0\0\0\21i}\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\4\0\0"..., 971
10:43:44.357416 IP client.37051 > server.1521: Flags [P.], seq 35295:36266, ack 17922, win 442, length 971
) = 971

and waits for the response:

10:43:44.357461 read(15, "\0\356\0\0\6\0\0\0\0\0\v\1\5\n\7\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 238

From here on, I show the full line with the return code (238 bytes), but remember that only the ‘10:43:44.357461 read(15,’ part is displayed by strace before the message is actually received. The buffer content and size are written later.

Here is the server receiving the message from the client and sending the answer:

"\3\313\0\0\6\0\0\0\0\0\21i}\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\4\0\0"..., 8208) = 971
10:43:44.365296 write(20, "\0\356\0\0\6\0\0\0\0\0\v\1\5\n\7\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 238
10:43:44.365326 IP server.1521> client.37051: Flags [P.], seq 17922:18160, ack 36267, win 442, length 238
) = 238

We have seen that the client got it (the 238 bytes). It sends a new query (365 bytes) and waits for the answer, which is 237 bytes:

10:43:44.365766 write(15, "\1m\0\0\6\0\0\0\0\0\21i\177\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\2\0\0"..., 365) = 365
10:43:44.365817 IP client.37051 > server.1521: Flags [P.], seq 36266:36631, ack 18160, win 442, length 365
10:43:44.365870 read(15, "\0\355\0\0\6\0\0\0\0\0\v\1\5\221\1\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 237

Here is the server part for this dialogue, receiving the 365 bytes query and writing the 237 bytes answer:

10:43:44.365408 read(20, "\1m\0\0\6\0\0\0\0\0\21i\177\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\2\0\0"..., 8208) = 365
10:43:44.366793 write(20, "\0\355\0\0\6\0\0\0\0\0\v\1\5\221\1\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 237
10:43:44.366829 IP server.1521> client.37051: Flags [P.], seq 18160:18397, ack 36632, win 442, length 237
) = 237

There are a few more exchanges where the client sends 381 bytes, the server answers 236 bytes, then the client sends 297 bytes and the server answers in 181 bytes. Here is the client strace and tcpdump for this:

10:43:44.367199 write(15, "\1}\0\0\6\0\0\0\0\0\21i\201\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\4\0\0"..., 381) = 381
10:43:44.367229 IP client.37051 > server.1521: Flags [P.], seq 36631:37012, ack 18397, win 442, length 381
10:43:44.367272 read(15, 
"\0\354\0\0\6\0\0\0\0\0\v\1\5L\1\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 236
10:43:44.368522 write(15, "\1)\0\0\6\0\0\0\0\0\21i\203\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\5\0\0"..., 297) = 297
10:43:44.368555 IP client.37051 > server.1521: Flags [P.], seq 37012:37309, ack 18633, win 442, length 297
10:43:44.368598 read(15, "\0\265\0\0\6\0\0\0\0\0\10\6\0\252I(\35\2\7\0\0\2\0\0\0\1\0\0\0\0\0\0"..., 8208) = 181

and the strace and tcpdump on the server side:

10:43:44.366884 read(20, "\1}\0\0\6\0\0\0\0\0\21i\201\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\4\0\0"..., 8208) = 381
10:43:44.368031 write(20, "\0\354\0\0\6\0\0\0\0\0\v\1\5L\1\0\0\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 236
10:43:44.368084 IP server.1521> client.37051: Flags [P.], seq 18397:18633, ack 37013, win 442, length 236
) = 236
10:43:44.368181 read(20, "\1)\0\0\6\0\0\0\0\0\21i\203\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\5\0\0"..., 8208) = 297
10:43:44.370926 write(20, "\0\265\0\0\6\0\0\0\0\0\10\6\0\252I(\35\2\7\0\0\2\0\0\0\1\0\0\0\0\0\0"..., 181) = 181
10:43:44.370949 IP server.1521 > client.37051: Flags [P.], seq 18633:18814, ack 37310, win 442, length 181
10:43:44.371006 read(20, 

The client which got the 181 bytes answer from the server, now sends a new query:

10:43:44.413973 write(15, "\10V\0\0\6\0\0\0\0\0\21i\205\376\377\377\377\377\377\377\377\1\0\0\0\0\0\0\0\2\0\0"..., 2134
10:43:44.414020 IP client.37051 > server.1521: Flags [P.], seq 37309:39443, ack 18814, win 442, length 2134
) = 2134

and then waits for the answer:

10:43:44.414115 read(15,

but this is where it hangs. Remember where we are:

  • the server has acknowledged up to byte 37310 of the messages from the client and is waiting to read from the socket
  • the client has acknowledged up to byte 18814 of the messages from the server and is sending more bytes (from 37309 to 39443)

But the server didn’t receive them and is still waiting:

10:43:44.371006 read(20,

Both sides are waiting… this is where tcpdump is helpful. strace shows only one write() call, because write() only puts the data into the TCP buffer. But tcpdump shows many attempts to send it to the server:

10:43:44.617869 IP client.37051 > server.1521: Flags [P.], seq 37309:39443, ack 18814, win 442, length 2134
10:43:45.025804 IP client.37051 > server.1521: Flags [P.], seq 37309:39443, ack 18814, win 442, length 2134
10:43:45.841818 IP client.37051 > server.1521: Flags [P.], seq 37309:39443, ack 18814, win 442, length 2134
10:43:47.473818 IP client.37051 > server.1521: Flags [P.], seq 37309:39443, ack 18814, win 442, length 2134
10:43:50.737837 IP client.37051 > server.1521: Flags [P.], seq 37309:39443, ack 18814, win 442, length 2134

Same packet, same size: bytes 37309 to 39443 of the client->server conversation. The last acknowledgement from the server was 37310, so it never received this packet. And because it is TCP, it retries. The write() call succeeded because it only writes to the buffer, so the application is just waiting for the answer.

A packet has disappeared in the client->server conversation. This is a job for the network team, but there is something worth noting here. Many messages went through fine, but not this one. What is special about it? Its size is larger: 2134 bytes. The maximum we had before was 971.

Simple ping…

Those numbers may already ring a bell, but let’s try to PING with some different packet sizes. Up to 1472 bytes goes through:

[root@client tmp]# ping -c 5 -s 1472 -W 5 server
PING client (10.30.13.62) 1472(1500) bytes of data.
1480 bytes from server(10.30.13.62): icmp_seq=1 ttl=58 time=0.607 ms
1480 bytes from server(10.30.13.62): icmp_seq=2 ttl=58 time=0.477 ms
1480 bytes from server(10.30.13.62): icmp_seq=3 ttl=58 time=0.494 ms
1480 bytes from server(10.30.13.62): icmp_seq=4 ttl=58 time=0.474 ms
1480 bytes from server(10.30.13.62): icmp_seq=5 ttl=58 time=0.533 ms

But larger fails:

[root@client tmp]# ping -c 5 -s 1473 -W 5 server
PING server (10.30.13.62) 1473(1501) bytes of data.
--- server ping statistics ---
5 packets transmitted, 0 received, 100% packet loss, time 9000ms

All packets were lost.

The default Ethernet MTU (maximum transmission unit) is 1500 bytes, but we use Jumbo Frames here, as defined on the interface:

[root@client tmp]# ip a | grep mtu
...
8: bond0: mtu 9000 qdisc noqueue state UP

There is probably a router in the middle which does not allow Jumbo Frames… This is now a job for the network guys. However, if I need a quick workaround, I can limit the Session Data Unit on the Oracle side, like:

(DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=myservice))(SDU=1430)(ADDRESS=(PROTOCOL=TCP)(HOST=server)(PORT=1521)))

With SDU at 1430, the length of the packets (as displayed by tcpdump) is at most 1460 bytes (the NS and NT layers add 30 bytes), which is the maximum that can go through an MTU of 1500 (as the IP and TCP headers add 20 bytes each).

If you wonder why ping was able to send packets up to 1472 bytes, that’s because the ICMP header is only 8 bytes instead of the TCP 20 bytes.

Ansible tips’n’tricks: understanding your Ansible configuration

When writing automation scripts I tend to use a local Ansible configuration file. This has certain advantages for me, such as being able to include it in a version control system (VCS). It is also a valid option for developers without access to the global configuration file installed by the package manager, and it is more convenient than setting environment variables.

WARNING: There are some very important security considerations though, which you must be aware of before using a local configuration file.

Until now I haven’t spent a lot of time thinking about configuration variables and the order of precedence, but that is exactly what I’d like to do in this post.

When putting this short article together I used Ansible 2.7.6 on Oracle Linux 7.6.

What does the documentation have to say?

The official documentation does a really good job explaining the configuration settings. You have the choice of:

  • An environment variable
  • A configuration file (ansible.cfg) in your local directory
  • A playbook-independent configuration file in ~/.ansible.cfg
  • The global settings in /etc/ansible/ansible.cfg

Let’s try this

This is an example of a very minimalist project folder in my lab, deliberately omitting a local configuration file.

[vagrant@server1 ansible]$ ls -la
total 16
drwxrwxr-x. 3 vagrant vagrant   53 Feb 14 09:09 .
drwx------. 5 vagrant vagrant 4096 Feb 14 08:40 ..
drwxrwxr-x. 7 vagrant vagrant 4096 Feb 14 09:09 .git
-rw-rw-r--. 1 vagrant vagrant   17 Feb 14 09:09 .gitignore
-rw-rw-r--. 1 vagrant vagrant   94 Feb 14 08:40 hello.yml

The example is kept super-short on purpose.

Precedence

Without a local configuration file (~/.ansible.cfg or $(pwd)/ansible.cfg) and/or defined environment variables, the settings made by your friendly system administrator in /etc/ansible/ansible.cfg govern every execution. You can use ansible-config to view the non-default values:

[vagrant@server1 ansible]$ ansible-config dump --only-changed
ANSIBLE_NOCOWS(/etc/ansible/ansible.cfg) = True
[vagrant@server1 ansible]$ 

As you can see, the global configuration file prevents the use of cowsay.

Now if I, as an automation developer, wanted to override a certain setting, I could use an environment variable. The documentation tricked me for a moment – when you see “DEFAULT_LOG_PATH” for example, exporting that variable does not have any effect. What you need to use is shown in the same section, but further down, with the Environment: label. To set the default log path in your shell, use ANSIBLE_LOG_PATH as in this example:

[vagrant@server1 ansible]$ export ANSIBLE_LOG_PATH=/home/vagrant/ansible/logs/my_ansible.log
[vagrant@server1 ansible]$ ansible-config dump --only-changed
ANSIBLE_NOCOWS(/etc/ansible/ansible.cfg) = True
DEFAULT_LOG_PATH(env: ANSIBLE_LOG_PATH) = /home/vagrant/ansible/logs/my_ansible.log

Thankfully ansible-config shows me the origin for each setting.

Using a local configuration file

Now how does a local configuration file play into this? Let’s try! I keep the global configuration file as it is, but I unset the environment variable. Here is the result:

[vagrant@server1 ansible]$ unset ANSIBLE_LOG_PATH
[vagrant@server1 ansible]$ cat ansible.cfg 
[defaults]

stdout_callback = debug
log_path = /home/vagrant/ansible/logs/ansible_blogpost.log
[vagrant@server1 ansible]$ ansible-config dump --only-changed
DEFAULT_LOG_PATH(/home/vagrant/ansible/ansible.cfg) = /home/vagrant/ansible/logs/ansible_blogpost.log
DEFAULT_STDOUT_CALLBACK(/home/vagrant/ansible/ansible.cfg) = debug
[vagrant@server1 ansible]$ 

What’s interesting is that my “nocows” setting from the global configuration file in /etc/ansible/ansible.cfg isn’t merged into the configuration. Without the use of environment variables, the only settings coming into play are those of the local configuration file. The same seems to apply if a local configuration file exists in addition to ~/.ansible.cfg. The file in the current working directory always took precedence in my testing.

This did not affect environment variables; they have always been considered by ansible-config. If, for example, I wanted to temporarily save the logfile to a different place and was too lazy to fire up vim, I could use this approach:

[vagrant@server1 ansible]$ ANSIBLE_LOG_PATH=/home/vagrant/ansible/logs/overriding.log ansible-config dump --only-changed
DEFAULT_LOG_PATH(env: ANSIBLE_LOG_PATH) = /home/vagrant/ansible/logs/overriding.log
DEFAULT_STDOUT_CALLBACK(/home/vagrant/ansible/ansible.cfg) = debug
[vagrant@server1 ansible]$ 

Happy scripting!

DBMS_JOB is an asynchronous mechanism

One of the very cool things about DBMS_JOB is that a job does not “exist” as such until the session that submitted the job commits the transaction. (This in my opinion is a critical feature that is missing from the DBMS_SCHEDULER package which, other than this omission, is superior to DBMS_JOB in every way).

Because DBMS_JOB is transactional, we can use it to make “non-transactional” things appear transactional. For example, if part of the workflow for hiring someone is to send an email to the Human Resources department, we can do the email via job submission so that an email is not sent if the employee record is not created successfully or is rolled back manually, eg:


begin
  insert into EMP (empno, ename) values (123, 'CONNOR');
  dbms_job.submit( :j,'notify_hr(empno=>123);');
end;
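
And the flip side is just as important: if the hire is rolled back, the pending job (and hence the email) disappears along with it. A minimal sketch of that behaviour (NOTIFY_HR is the same hypothetical procedure as above):

variable j number

begin
  insert into EMP (empno, ename) values (123, 'CONNOR');
  dbms_job.submit( :j, 'notify_hr(empno=>123);' );
end;
/

rollback;

-- both the row and the pending job are gone
select count(*) from user_jobs where what like 'notify_hr%';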

This post is not about that – that cool usage scenario is well known. But as a consequence, people tend to think that as soon as they commit, the job will spring into life. This is typically the case, but there are no guarantees that your job will run immediately, even if job queue processes are available.

Here are a couple of examples of that in action.


--
-- This one is almost instant
--
SQL> create table t
  2   ( submitted timestamp,
  3     started   timestamp );

Table created.

SQL>
SQL> create or replace
  2  procedure prc is
  3  begin
  4    update t set started = systimestamp;
  5    commit;
  6  end;
  7  /

Procedure created.

SQL>
SQL> declare
  2    j int;
  3  begin
  4    dbms_job.submit( j,'prc;');
  5    insert into t(submitted) values (systimestamp);
  6    commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL>
SQL> col submitted format a32
SQL> col started format a32
SQL> select * from t;

SUBMITTED                        STARTED
-------------------------------- --------------------------------
13-FEB-19 09.31.10.956989 PM     13-FEB-19 09.31.10.994153 PM

--
-- This one not so much :-)
--

SQL> declare
  2    j int;
  3  begin
  4    dbms_job.submit( j,'prc;');
  5    insert into t(submitted) values (systimestamp);
  6    commit;
  7  end;
  8  /

PL/SQL procedure successfully completed.

SQL>
SQL> col submitted format a32
SQL> col started format a32
SQL> select * from t;

SUBMITTED                        STARTED
-------------------------------- --------------------------------
14-FEB-19 01.22.45.928000 PM

SQL> /

SUBMITTED                        STARTED
-------------------------------- --------------------------------
14-FEB-19 01.22.45.928000 PM

SQL> /

SUBMITTED                        STARTED
-------------------------------- --------------------------------
14-FEB-19 01.22.45.928000 PM     14-FEB-19 01.22.59.775000 PM

In particular, I tend to notice this more on Windows platforms than others. MOS Note 2109840.1 is also a good starting point if you are seeing huge delays – you may have hit upon a bug.

So just be aware that DBMS_JOB will run your job soon after committing, but “soon” is a flexible term Smile

No Pause on the Azure Data Factory

Using only what you need in Azure is a crucial part of optimizing your environment in the cloud. As attractive as Azure already is for the masses, making sure that what you do use is optimal makes it downright irresistible.

Many customers, as they ramped up with Azure Data Factory (ADF), didn’t worry too much as they built out pipelines, because they could always pause the service at the resource level.

In recent weeks this feature has been deprecated, and customers may be at a loss as to how to proceed. The best part about technology is that there’s always another way to accomplish something; you just need to figure out how. Lucky for us, the Azure team wouldn’t have removed an option without providing another way to perform the task, and in fact they introduced an enhanced way to do this.

The switch is to pause at the pipeline layer instead of the resource layer. I promise, this can be done with one step and one script. Luckily, I’d already moved my current customers to this solution over the last few months, since it was the approach that made the most sense at the time. A secondary benefit was that when the resource-level PAUSE option was removed from the portal, my customers experienced no impact. I just want to make sure that those who weren’t lucky enough to plan this way have the answer.

If you were dependent upon the resource tier PAUSE option and would like the script to schedule your jobs, setting the activity on the ADF resource, please check out this SSIS blog post from Microsoft with the script included.

Happy Azure-ing