Oakies Blog Aggregator

Online Statistics Collection during Bulk Loads on Partitioned Tables

Introduction

One of the enhancements to statistics collection and management in Oracle 12c was the ability of the database to automatically collect statistics during either a create-table-as-select operation or the initial insert into a freshly created or freshly truncated table, provided that insert is done in direct-path mode (i.e. using the APPEND hint).
When that occurs, there is an additional operation in the execution plan: OPTIMIZER STATISTICS GATHERING.

----------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |                 |       |       |   495K(100)|          |
|   1 |  LOAD AS SELECT                  |                 |       |       |            |          |
|   2 |   OPTIMIZER STATISTICS GATHERING |                 |   70M |   11G |   495K  (2)| 00:00:20 |
|   3 |    TABLE ACCESS FULL             | XXXXXXXXXXXXXXX |   70M |   11G |   495K  (2)| 00:00:20 |
----------------------------------------------------------------------------------------------------

The motivation for this blog was encountering a bulk insert into a partitioned table where the statistics gathering operation consumed a very significant amount of time. Partitioning gives you more things to consider.

A Simple Test

I created a simple test that compares the time taken by online statistics collection on partitioned and non-partitioned tables, with the explicit collection of statistics using DBMS_STATS. I have four tables with the same structure.

  • T1: Not partitioned. Data will be copied from this table to each of the others. 
  • T2: Partitioned. Online statistics only. 
  • T3: Partitioned. Explicitly gathered statistics. 
  • T4: Partitioned. Explicitly gathered incremental statistics.
CREATE TABLE T1 (a number, b varchar2(1000), c number) NOLOGGING;
CREATE TABLE T2 (a number, b varchar2(1000), c number)
PARTITION BY RANGE (a) INTERVAL(100) (PARTITION t_part VALUES less than (101)) NOLOGGING;
CREATE TABLE T3 (a number, b varchar2(1000), c number)
PARTITION BY RANGE (a) INTERVAL(100) (PARTITION t_part VALUES less than (101)) NOLOGGING;
CREATE TABLE T4 (a number, b varchar2(1000), c number)
PARTITION BY RANGE (a) INTERVAL(100) (PARTITION t_part VALUES less than (101)) NOLOGGING;

I loaded 100 million rows into each in direct-path mode. The partitioned tables end up with 100 partitions, each with 1 million rows. I have also suppressed redo logging during the direct-path insert by creating the tables with the NOLOGGING attribute.

EXEC dbms_stats.set_table_prefs(user,'T3','INCREMENTAL','FALSE');
EXEC dbms_stats.set_table_prefs(user,'T4','INCREMENTAL','TRUE');

The following set of tests will be run for different combinations of:

  • Parallel hint on query, or not 
  • Parallel hint on insert, or not 
  • Table parallelism specified 
  • Parallel DML enabled or disabled at session level 
  • Column-specific METHOD_OPT table preference specified or not. 

I enabled SQL trace, from which I obtained the elapsed time of the various statements, and I determined the amount of time spent on online statistics gathering from the timings on the OPTIMIZER STATISTICS GATHERING operation in the execution plan in the trace file.
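For reference, this is a minimal sketch of one way to enable extended SQL trace (with wait events) for such a test session; the tracefile identifier is just an arbitrary label:

ALTER SESSION SET tracefile_identifier = 'bulk_load_test';
ALTER SESSION SET EVENTS '10046 trace name context forever, level 8';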

TRUNCATE TABLE T2;
TRUNCATE TABLE T3;
EXEC dbms_stats.delete_table_stats(user,'T2');
EXEC dbms_stats.delete_table_stats(user,'T3');
EXEC dbms_stats.delete_table_stats(user,'T4');

INSERT /*+APPEND &inshint*/ into T2 i SELECT /*+&selhint*/ * from t1 s;
INSERT /*+APPEND &inshint NO_GATHER_OPTIMIZER_STATISTICS*/ into T3 i SELECT /*+&selhint*/ * from t1 s;
INSERT /*+APPEND &inshint NO_GATHER_OPTIMIZER_STATISTICS*/ into T4 i SELECT /*+&selhint*/ * from t1 s;
commit;
EXEC dbms_stats.gather_table_stats(user,'T3');
EXEC dbms_stats.gather_table_stats(user,'T4');

Quirks

It was while building this test that I discovered a couple of quirks.

What Statistics Are Normally Collected by Online Statistics Gathering? 

After just the initial insert, I can see that I have table statistics on T1 and T2, but not on T3 and T4.

SELECT table_name, num_rows, last_analyzed from user_tables where table_name LIKE 'T_' order by 1;

TABLE_NAME   NUM_ROWS LAST_ANALYZED
---------- ---------- -----------------
T1           10000000 14:07:36 16/01/20
T2           10000000 14:07:36 16/01/20
T3
T4

I also have column statistics on T1 and T2, but no histograms.

break on table_name skip 1
SELECT table_name, column_name, num_distinct, global_stats, histogram, num_buckets, last_analyzed
FROM user_tab_columns where table_name like 'T_' order by 1,2;

TABLE_NAME COLUMN_NAME  NUM_DISTINCT GLO HISTOGRAM       NUM_BUCKETS LAST_ANALYZED
---------- ------------ ------------ --- --------------- ----------- -----------------
T1         A                   10000 YES NONE                      1 14:06:58 16/01/20
           B                   10000 YES NONE                      1 14:06:58 16/01/20
           C                     100 YES NONE                      1 14:06:58 16/01/20

T2         A                   10000 YES NONE                      1 14:07:11 16/01/20
           B                   10000 YES NONE                      1 14:07:11 16/01/20
           C                     100 YES NONE                      1 14:07:11 16/01/20

T3         A                         NO  NONE
           B                         NO  NONE
           C                         NO  NONE

T4         A                         NO  NONE
           B                         NO  NONE
           C                         NO  NONE

However, I do not have any partition statistics (I have only shown the first and last partition of each table in this report).

break on table_name skip 1
SELECT table_name, partition_position, partition_name, num_rows, last_analyzed
FROM user_tab_partitions WHERE table_name like 'T_' ORDER BY 1,2 nulls first;

TABLE_NAME PARTITION_POSITION PARTITION_NAME         NUM_ROWS LAST_ANALYZED
---------- ------------------ -------------------- ---------- -----------------
T2                          1 T_PART
                          100 SYS_P20008

T3                          1 T_PART
                          100 SYS_P20107

T4                          1 T_PART
                          100 SYS_P20206

Online optimizer statistics gathering only collects statistics at table level but not partition or sub-partition level. Histograms are not collected.

From Oracle 18c, there are two undocumented parameters that modify this behaviour. Both default to false. Interestingly, both are enabled in the Oracle Autonomous Data Warehouse.

  • If _optimizer_gather_stats_on_load_hist=TRUE, histograms will be collected on all columns during online statistics collection. 
  • If _optimizer_gather_stats_on_load_all=TRUE, statistics are collected online during every direct-path insert, not just the first one into a segment (see the example below). 
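A minimal sketch of enabling them, assuming your version allows them to be set at session level (they are hidden parameters, so test carefully and check with Oracle Support before relying on them):

ALTER SESSION SET "_optimizer_gather_stats_on_load_hist" = TRUE;
ALTER SESSION SET "_optimizer_gather_stats_on_load_all"  = TRUE;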

Do I Need Partition Statistics?

Statistics will be collected on partitions that do not have them when the automatic statistics collection job runs in the next database maintenance window. The question is whether you can manage without them until then.
"The optimizer will use global or table level statistics if one or more of your queries touches two or more partitions. The optimizer will use partition level statistics if your queries do partition elimination, such that only one partition is necessary to answer each query. If your queries touch two or more partitions the optimizer will use a combination of global and partition level statistics."
 – Oracle The Data Warehouse Insider Blog: Managing Optimizer Statistics in an Oracle Database 11g - Maria Colgan
It will depend upon the nature of the SQL in the application. If the optimizer does some partition elimination, and the data is not uniformly distributed across the partitions, then partition statistics are likely to be beneficial. If there is no partition elimination, then you might question whether partitioning (or at least the current partitioning strategy) is appropriate!
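If partition statistics are needed sooner than the maintenance window, one option (a sketch using table T2 from this test) is to gather just the partition-level statistics explicitly:

EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'T2', granularity=>'PARTITION');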

What is the Fastest Way to Collect Statistics on Partitioned Tables?

Let's look at how long it takes to insert data into, and then subsequently collect statistics on, the tables in my example. This test was run on Oracle 19c on one compute node of a virtualised Exadata X4 machine with 16 CPUs.  The table below shows the elapsed time and the total DB time, including all parallel server processes, for each operation.

The test was run for eight combinations of settings, one per column in the results below:

  1. Serial Insert & Statistics:                table NOPARALLEL; no insert hint; no select hint; parallel DML DISABLE; stats DEGREE none; METHOD_OPT none
  2. Parallel Insert & Statistics:              table NOPARALLEL; insert hint PARALLEL(i); select hint PARALLEL(s); parallel DML DISABLE; stats DEGREE DEFAULT; METHOD_OPT none
  3. Parallel SQL & Statistics:                 table NOPARALLEL; no insert hint; select hint PARALLEL; parallel DML DISABLE; stats DEGREE DEFAULT; METHOD_OPT none
  4. Parallel DML, Insert, Select & Statistics: table NOPARALLEL; insert hint PARALLEL(i); select hint PARALLEL(s); parallel DML ENABLE; stats DEGREE DEFAULT; METHOD_OPT none
  5. Parallel DML, SQL & Statistics:            table NOPARALLEL; no insert hint; select hint PARALLEL; parallel DML ENABLE; stats DEGREE DEFAULT; METHOD_OPT none
  6. Parallel Tables:                           table PARALLEL; no hints; parallel DML DISABLE; stats DEGREE none; METHOD_OPT none
  7. Parallel Tables & DML:                     table PARALLEL; no hints; parallel DML ENABLE; stats DEGREE none; METHOD_OPT none
  8. Parallel Tables, DML & Method Opt:         table PARALLEL; no hints; parallel DML ENABLE; stats DEGREE none; METHOD_OPT ... FOR COLUMNS SIZE 1 A

Elapsed Time (s)

Table Operation                                  1       2       3       4       5       6       7       8
----- ----------------------------------- ------- ------- ------- ------- ------- ------- ------- -------
T2    Insert (Online Stats Gathering)      172.46  160.86  121.61  108.29   60.31  194.47   23.57   20.57
        Optimizer Statistics Gathering      82.71   55.59   55.90       -       -       -       -       -
T3    Insert (NO_GATHER_OPTIMIZER_STATS)   125.40  156.36  124.18   20.62   29.01  199.20   20.97   21.15
      Explicit Stats                        122.80  146.25   63.23   15.99   24.88   24.58   24.99   24.62
T4    Insert (NO_GATHER_OPTIMIZER_STATS)   123.18  158.15  147.04   20.44   29.91  204.61   20.65   20.60
      Incremental Explicit Stats             80.51  104.85   46.05   23.42   23.14   23.21   22.60   23.03

DB Time (s)

Table Operation                                  1       2       3       4       5       6       7       8
----- ----------------------------------- ------- ------- ------- ------- ------- ------- ------- -------
T2    Insert (Online Stats Gathering)         174     163     169     359     337     248     366     308
T3    Insert (NO_GATHER_OPTIMIZER_STATS)      128     193     160     290     211     236     312     326
      Explicit Stats                          122     146      63     265     305     262     335
T4    Insert (NO_GATHER_OPTIMIZER_STATS)      126     194     167     295     205     233     304     295
      Incremental Explicit Stats               80     105       2     281     266     300     179     226
  • It is difficult to determine the actual duration of the OPTIMIZER STATISTICS GATHERING operation, short of measuring the effect of disabling it. The time in the above table has been taken from SQL trace files. That duration is always greater than the amount saved by disabling online statistics gathering with the NO_GATHER_OPTIMIZER_STATS hint. However, the amount of time accounted in Active Session History (ASH) for that line in the execution plan is usually less than the elapsed saving. 
    • E.g. for the serial insert, 83s was accounted for OPTIMIZER STATISTICS GATHERING in the trace, while ASH showed only 23s of database time for that line of the plan. However, perhaps the only meaningful measurement is that disabling online statistics gathering saved 47s. 
  • DML statements, including insert statements in direct-path, only actually execute in parallel if parallel DML is enabled. Specifying a degree of parallelism on the table, or a parallel hint is not enough. Parallel DML should be enabled.
    • either at session level
ALTER SESSION ENABLE PARALLEL DML;
    • or for the individual statement.
insert /*+APPEND ENABLE_PARALLEL_DML*/ into T2 SELECT * from t1;
    • Specifying parallel insert with a hint, without enabling parallel DML will not improve performance and can actually degrade it.
    • Specifying parallel query without running the insert in parallel can also degrade performance.
  • Online statistics will be collected in parallel if
    • either the table being queried has a degree of parallelism,
    • or a parallel hint applies to the table being queried, or the entire statement,
    • or parallel DML has been enabled 
  • Where statistics are collected explicitly (i.e. with a call to DBMS_STATS.GATHER_TABLE_STATS) they are collected in parallel if 
    • either the DEGREE is specified (I specified a table statistics preference),
EXEC dbms_stats.set_table_prefs(user,'T3','DEGREE','DBMS_STATS.DEFAULT_DEGREE');
    • or the table has a degree of parallelism.
ALTER TABLE T3 PARALLEL;
  • Incremental statistics are generally faster to collect because they calculate table-level statistics from partition-level statistics, saving a second pass through the data.
  • When parallel DML is enabled at session level, I found that the performance of statistics collection also improves.

Conclusion

Overall, the best performance was obtained when the tables were altered to use parallelism, and parallel DML was enabled; then the query, insert and statistics collection are performed in parallel.
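A minimal sketch of that combination, using the tables from this test:

ALTER TABLE t2 PARALLEL;
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+APPEND*/ INTO t2 SELECT * FROM t1;
COMMIT;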
However, the improved performance of parallelism comes at a cost.  It can be a brutal way of bringing more resource to bear on an activity.  A parallel operation can be expected to use more database time across all the parallel server processes than the same operation processed serially.  My best results were obtained by activating all of the CPUs on the server without regard for any other activity.  Too many concurrent parallel operations have the potential to overload a system.  Remember also that while the parallel attribute remains on the table, any subsequent query will also run in parallel.
Suppressing online statistics collection saves total database time whether working in parallel or not. The saving in elapsed time is reduced when the insert and query are running in parallel.  The time taken to explicitly collect statistics will exceed that saving because it is doing additional work to collect partition statistics not done during online statistics collection.
Using incremental statistics for partitioned tables will also reduce the total amount of work and database time required to gather statistics, but may not significantly change the elapsed time to collect statistics.
If you need table statistics but can manage without partition statistics until the next maintenance window, then online statistics collection is very effective. However, I think the general case will be to require partition statistics, so you will probably need to explicitly collect statistics instead.  If you want histograms, then you will also need to explicitly collect statistics.

Structuring Content : Think Pyramid!


This post is about making sure you get your message across by structuring your content correctly, and giving people convenient jump-off points when they’ve hit the level that is appropriate for them.

In my post about plain language, I mentioned a discussion on structuring content, and linked to a previous post called The Art of Miscommunication. I thought it was worth digging into content structure a little more.

We can think of the content structure as a pyramid. Starting at the top we keep things short and simple, then each layer down gets progressively more detailed. A person consuming the content can get to a level appropriate for them and stop, safe in the knowledge they have not missed something vital.

Level 1

What do I need to know about this content?

  • What is it about?
  • Is it really meant for me?
  • Are there any actions assigned to me?
  • Is it even worth my time reading this, or have I been included for the hell of it?

If we think about emails, blog posts and videos, it’s important we understand what the content is about early. This allows us to decide if this is the right content for us. In my opinion this is about the subject of the email, or title of blogs and videos, along with a short first paragraph.

Using the example of an email, it might be important that some of the management chain understand a situation is happening, but they may not understand the detail of the issue, or have time to dig into it further.

Here is a silly example of an email subject and first paragraph, which represents how I think the top level of the pyramid should work.

“Payroll run failed. Will be fixed by this afternoon!

This morning’s payroll run failed. Jayne is on the case, diagnosed the problem and is confident it will be fixed by this afternoon. The P1 incident number is INC123456.”

It’s short! It tells me what I need to know. It gives me some confidence I don’t need to worry about things unless I hear different. At this point I might feel safe to jump out of the email. I know it’s a Priority 1 (P1) incident, which means it will have a wash-up meeting, so I don’t need to respond asking someone to document and communicate the root cause. I feel most higher-level managers should be happy with this, and be able to duck out now.

Level 2

This is where we add a little more detail. We are still going to keep things short and simple. We will already have lost some of the readers, so the people left behind are here because they want something with a bit more depth. Maybe something like this.

“At 06:18 we got a notification to say the payroll process had died. It got escalated to Jayne, who checked the application logs. It looked like the payroll run had been hanging for a while and then died.

She asked the DBAs to check the database while she checked the OS layer on the app server. The DBAs said the database went really quiet at that time, like no new requests were coming through from the app layer, but didn’t think it was the database that was causing the problem.

Jayne noticed a Jenkins agent on the app server was grabbing 100% of the CPU, which is what killed the payroll run.

The Jenkins agent was restarted. The payroll run was restarted. Everyone is monitoring it, and they’re confident it will complete by 13:00.”

There is no evidence here, but it clearly describes what happened, and what is being done about it. If I didn’t feel confident about closing the email after the first level, I should now.

Level 3 and Beyond

In the case of an email, I don’t think anything beyond this point makes sense. Large emails and posts look daunting, and I get the impression people just “file them” to be looked at later. Maybe that’s just me.

SQLcmd on my Mac: Seg fault

Trying to connect to SQL Server from my Mac, so I installed sqlcmd:

brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
brew update
brew install mssql-tools

then got this error:

$ sqlcmd -S kylelfsqls.xxxxxx.us-west-2.amazonaws.com,1433 -U kylelf  
Sqlcmd: Error: Microsoft ODBC Driver 17 for SQL Server : SSL Provider: [OpenSSL library could not be loaded, make sure OpenSSL 1.0 or 1.1 is installed].
Sqlcmd: Error: Microsoft ODBC Driver 17 for SQL Server : Client unable to establish connection.
Segmentation fault: 11

Didn’t find much info on Google, so blogging here.
Ran

ln -s /usr/local/opt/openssl@1.1 /usr/local/opt/openssl

and it worked

WITH Subquery

Here’s another anomaly that appears when you mix and match Oracle features. In this case it’s “With” subqueries (common table expressions / CTEs) and Active Dataguard (ADG) Standby databases. The problem appeared on the Oracle-l listserver and luckily for the OP another member of the list had seen it before and could point to a relevant MOS document id which explained the issue and supplied a workaround.

The OP had their standby database opened read-only for reporting and found the following oddity in the extended SQL trace file for one of their reports:


WAIT #140196872952648: nam='db file scattered read' ela= 1588 file#=4097 block#=579715946 blocks=128 obj#=-39778567 tim=17263910670242
WAIT #140196872952648: nam='db file scattered read' ela= 1495 file#=4097 block#=579715947 blocks=128 obj#=-39778567 tim=17263910672065
WAIT #140196872952648: nam='db file scattered read' ela= 1671 file#=4097 block#=579715948 blocks=128 obj#=-39778567 tim=17263910674042
WAIT #140196872952648: nam='db file scattered read' ela= 1094 file#=4097 block#=579715949 blocks=128 obj#=-39778567 tim=17263910675443

Before pointing out the oddity (if you haven’t spotted it already) I’ll just explain a few of the numbers that are a little unusual.

  • File# = 4097: the user has parameter db_files = 4096, so this is the first Temp file.
  • Block# = 579,715,946: the database is 120TB, and the temporary tablespace is a “bigfile” tablespace so it’s okay for the file to hold more than 579M blocks.
  • Obj# < 0: Negative object numbers is a characteristic of materialized CTEs: if you look at the execution plan a materialized CTE will be reported as a table with a name like  SYS_TEMP_FDA106F9_E259E68.  If you take the first hexadecimal number and treat it as a 32-bit signed integer you get the value that would be reported as the obj# in the trace file.  (Converting to decimal and subtracting power(2,32) is one way of doing the arithmetic; see the worked example after this list.)
  • tim= nnnnnnnn:  this is the timestamp (usually in microseconds), and we can see intervals of roughly 1,400 to 2,000 microseconds between these lines.
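For example, taking the hex number FDA106F9 from the temp table name above, a quick check from SQL*Plus reproduces the obj# seen in the trace:

select to_number('FDA106F9','XXXXXXXX') - power(2,32) as obj# from dual;

      OBJ#
----------
 -39778567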

So here’s the oddity: in this set of 4 consecutive waits we’re waiting for multiblock reads of 128 blocks – but each read starts one block after the previous read. It’s as if Oracle is reading 128 blocks and forgetting everything after the first one. And the timestamps are significant because they tell us that this isn’t a case of Oracle spending so much time between reads that the other blocks fall off the end of  the buffer cache before the query reaches them.

I think I’ve seen a pattern like this once before but it would have been quite a long time ago and I can’t find any notes I might have made about it (and it turns out that my previous experience was not relevant to this case). Fortunately another member of Oracle-l had also seen the pattern and supplied the solution through a reference to a MOS document that led to: Doc ID 2251339.1 With Subquery Factorization Temp Table Does Not Cache in Standby in 12.1.0.2.

It’s not a bug – Oracle is supposed to do this if you manage to materialize a CTE in a Read-only Standby database. I don’t understand exactly why there’s a problem, but thanks to some feature of how consistent reads operate and how block SCNs are generated when you populate the blocks of the global temporary table (GTT) that is your materialized CTE, it’s possible for Oracle to produce the wrong results if it re-visits blocks that have been read into the cache from the GTT. So when you do a multiblock read during a tablescan of the GTT, Oracle can use the first block it has read (presumably because it’s immediately pinned), but can’t use the remaining 127 – and so you get the odd pattern of consecutive blocks appearing at the start of consecutive multiblock reads.

This raises a couple of interesting (and nasty) questions.

  • First – does every 128 block read get read to the middle of the buffer cache, pushing another 128 blocks out of the buffer cache, or does Oracle automatically read the blocks to the “cold” end of the LRU, minimising the impact on the rest of the cache? We hope it’s the latter.
  • Second – if I use a small fetch size while running my query, might I find that I have to re-read the same block (with its 127 neighbours) many times, because Oracle releases any pinned blocks at the end of each fetch and has to re-acquire the blocks on the next fetch?

If anyone wants to test the second question by running a query from SQL*Plus with extended trace enabled the following simple query should answer the question:

alter session set events '10046 trace name context forever, level 8';
set arraysize 2

with v1 as (select /*+ materialize */ * from all_objects)
select object_name from v1;

Workarounds

There is a workaround to the issue – you can add the hint /*+ inline */ to the query to ensure that the CTE is not materialized. There is a bit of a catch to this, though (on top of the fact that you might then need to have two slightly different versions of the code if you want to run the query on production and standby) – if Oracle places the subquery text inline the optimizer may manage to merge it into the rest of the query and come up with a bad execution plan. Again you can probably work around this threat by extending the hint to read: /*+ inline no_merge */. Even then the optimizer could decide it has better statistics about the “real” table columns that it might have lost when it materialized the subquery, so it could still produce a different execution plan from the materialized plan.

As an alternative (and somewhat more brutal) workaround you could set the hidden parameter “_with_subquery” to inline either at the session or system level, or in the startup parameter file.
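At session level that might look like the following (it is a hidden parameter, so check with Oracle Support before using it in anger):

alter session set "_with_subquery" = inline;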

 

Increasing headcount is probably not the answer!


I’m incredibly irritated by tech people using headcount as a reason for their problems. From my experience, throwing bodies at problems is rarely the correct answer.

Increasing headcount only makes sense if:

  • You understand the problem.
  • You’ve defined the corrective actions.
  • You have processes in place to make new people productive quickly.

If you don’t understand the problem, and don’t already have a plan for solving it, hiring a load of people isn’t going to help you. It can actually make things worse. At best they will sit around and do nothing. At worst, they will start working and come up with a bunch of random “solutions” to your problems, which can leave you in a worse position than you started. Supporting a bunch of random crap is no fun at all.

My first job was a great example of doing things the right way.

  • The company signed a new customer. The software was used to track drug trials. Each trial had a unique identifier. The customer wanted to be able to refer to trials using the unique identifier, or a free text alias. This meant adding a trial alias to loads of screens in the application. There was also a need to upgrade the whole application from Forms 3.0 to Forms 4.0.
  • The analysis was done. Two procedures were defined and documented. One procedure gave directions on performing the upgrade. One procedure gave directions on adding the trial alias to the forms that needed it.
  • In addition to the existing workforce, the company hired four new people. Two were computer science graduates. Two, including me, were not. None of us had previous Oracle Database or Oracle Forms experience. After some basic training, we were put to work upgrading the forms and adding in the trial alias code.
  • It worked fine, because despite us being not much more than trained monkeys, the prerequisites had been put in place to allow someone with a PhD in cabbage sex to be a productive member of the team. There were no major blockers or constraints to deal with.

I’ve also seen it done the wrong way a bunch of times, but I’m not going to go there as it’s too depressing, and the people and companies involved will recognise themselves…

There are situations where bodies can help, but randomly throwing people at something is not a great solution if you’ve not put in the necessary effort up front to make it work. You should also be asking how many of the tasks should really be automated, so humans can be allocated to something more meaningful. In these cases, even if extra bodies could work, your focus should be on solving the root cause, not papering over the cracks.

When I discuss headcount, or throwing bodies at a problem, I could be talking about hiring more permanent staff, temporary staff or outsourcing projects. There is a lot to be said for the old saying, “You can’t outsource a problem!”, but it could easily be, “You can’t headcount a problem!”

Migrating Oracle Exadata Workloads to Azure

I know, I know – there’s a number of you out there thinking…

I’m being brought in on more and more of these projects due to a recent change for some Exadata backup components that many companies kept onsite (parts that wear out more often, like PDUs and cell disks), which are no longer an option. Moving to Azure is a viable option for these workloads if you know what to identify and address before the move to the cloud.

Engineered, Not Iron

An Exadata is an engineered system: database nodes, secondary cell nodes (also referred to as storage nodes/cell disks), InfiniBand for fast network connectivity between the nodes, specialized cache, along with software features such as Real Application Clusters (RAC), hybrid columnar compression (HCC), storage indexes (indexes in memory) and offloading technology with logic built into it to move object scans and other intensive workloads from the primary database nodes to the cell nodes.  There are many other features, but understanding that Exadata is an ENGINEERED system, not just a hardware solution, is important, and it's both a blessing and a curse for the databases supported by one.  The database engineer must understand both Exadata architecture and software along with database administration.  There is an added tier of performance knowledge, monitoring and patching involved, including knowledge of the Cell CLI, the command line interface for the cell nodes.  I could go on for hours on more details, but let’s get down to what is required when I am working on a project to migrate an Exadata to Azure.

Focus on the Right Thing

I discovered that many times it was natural to focus on the engineered system resources and attempt to duplicate them in Azure.  Although there are numerous features in Azure Data Warehouse that are similar to Exadata, focusing on the engineered system is still not the correct path.

Focus on the Database Workload.

Oracle offers us this in the way of an Automatic Workload Repository (AWR) report.  This data is always being collected, so no extra workload will be placed on the production system.  Simply generate the report from the repository and review the information provided on CPU, memory and IO, along with any red flags that should be addressed before migrating to the cloud.
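If you prefer to pull the report straight from the repository rather than running $ORACLE_HOME/rdbms/admin/awrrpt.sql, a minimal sketch is below; the DBID, instance number and snapshot IDs are placeholders:

SELECT output
FROM   TABLE(dbms_workload_repository.awr_report_text(
               l_dbid     => 1234567890,   -- placeholder DBID
               l_inst_num => 1,
               l_bid      => 1000,         -- placeholder begin snapshot
               l_eid      => 1168));       -- placeholder end snapshot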

It allows us to break down workloads by database.  For most Exadata investments, there were one or two databases that were the primary reason for the Exadata and the main consumers of resources.  Isolating by database allows for better sizing, pricing and licensing costs.

With the breakdown by database, we’re also evolving a spreadsheet with the information we need to calculate the Azure resources that would be needed in a lift and shift.  This makes it simpler to consume by the customer if they’re interested, too.

[Screenshot: AWR sizing spreadsheet]

It's an Exadata Thing

The next step is to identify what Exadata-specific features are in play that won’t come over in a lift and shift:

  1. Storage indexes that will need to be physically created that currently only exist in memory for smart scans.
  2. What offloading to cell nodes happens currently and will require faster IO and added resources to support on the IaaS cloud solution? (See the query after this list.)
  3. What HCC is in place and the added IO pressure on the system once it no longer exists? What compression options are available in Oracle on IaaS for the customer to take advantage of?
  4. What objects, etc. reside in the different caches?  Flash Cache for recently used objects must be listed and tested for latency.
  5. Was the SGA shrunk to force offloading and does this mean the new databases require more memory than what is currently allocated?
  6. If using RAC (expected 98% of the time on Exadata), is there a plan to stay on RAC? If so, is it justified? If not, be sure to calculate the savings from the global cache no longer being needed, but total up the needs of each node’s current workload.
  7. Flash log–  without this temporary store for low-latency redo log writes, latency impacts must be identified and optimized through other means.
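For the offloading question (item 2 above), one hedged starting point is to compare a few of the cell-related statistics in V$SYSSTAT, e.g. how many bytes were eligible for predicate offload versus what came back from smart scans:

SELECT name, ROUND(value/1024/1024/1024,1) AS gb
FROM   v$sysstat
WHERE  name IN ('physical read total bytes',
                'cell physical IO bytes eligible for predicate offload',
                'cell physical IO interconnect bytes returned by smart scan');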

All of this data is captured in some form within the AWR reports and via queries from the Cell CLI.  I commonly request a one-week snapshot report of AWR from each database on the Exadata.  If there are monthly, quarterly or yearly reports, this can be addressed further into the project, but gaining an idea of the workloads for each of the databases is my first step.

  1. Enter in all pertinent data into the spreadsheet, including OS, db version, RAC/Non-RAC, memory, CPU/Core, etc.
  2. Trap the percentage of CPU, memory and IO being used by the workload vs. what is available.
  3. Review the AWR and perform a database health check, including any recommendations for issues that should be addressed before the database is moved to the cloud.
  4. Identify any areas that need deeper investigation- odd block sizes, non-default parameter settings, running maintenance/admin jobs that need to be addressed, etc.

I do provide a full healthcheck report of the databases.  It's not something they often receive, so to have this review is incredibly beneficial to their long-term satisfaction with their cloud migration.  Who wants to migrate over a problem with their database?  Resolving it before the cloud migration saves everyone time, money and headaches.

Second phase is to run Exadata specific collections:

  1. Query the cell cli for offloading percentages
  2. Pull the flashcache information from the AWR
  3. Query the amount of flash log that is in use and how often
  4. Query the objects that are currently using HCC, along with compression ratios, uncompressed sizes, and IO percentages for the workload (see the query after this list)
  5. Identify storage indexes used for smart scans.  This includes columns and object name they reside on.
  6. Inspect the PGA: is there a lot of multi-pass activity, or is the PGA sized correctly?  Will this be sufficient once off of Exadata, where some hash joins may currently be run on cell nodes?
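For item 4, a minimal sketch of listing the HCC candidates where compression is declared at table level (partition-level compression would need a similar query against DBA_TAB_PARTITIONS) is:

SELECT owner, table_name, compression, compress_for
FROM   dba_tables
WHERE  compress_for IN ('QUERY LOW','QUERY HIGH','ARCHIVE LOW','ARCHIVE HIGH');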

With this data collected, the next step is unique to the customer database, but most often requires:

  • Physical indexes created in the production database, but left invisible and only made visible after the migration to Azure (see the sketch after this list).
  • Implement new compression options that are available in the version of Oracle in Azure.
  • Resize the SGA and PGA to address any shrinking that was done on the Exadata.
  • Pin objects in memory and other cache features for better performance.
  • Optimize any SQL that previously relied on Exadata offloading, where optimization should have been the first route taken anyway.
  • Place redo logs on ultra disk if required
  • If there is a specific workload that was using a high degree of parallelism (DOP) to run, it may be time to discuss an Azure DW solution to process the data, using Azure Data Factory to push the ELT.  Using the right tool for the job is always preferable, and there are times where the massive parallel processing and ability to offload to a solution that can handle the workload long term without impacting the existing data store is preferential.
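A minimal sketch of the invisible index approach from the first bullet above (the table and index names here are hypothetical):

-- Create now on the Exadata source, hidden from the optimizer
CREATE INDEX sales_hist_ix ON sales_hist (sale_date) INVISIBLE;

-- After the migration to Azure, expose it to the optimizer
ALTER INDEX sales_hist_ix VISIBLE;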

Testing Without Features on Exadata

There are numerous ways you can “shut off” Exadata features.  If the space exists on the Exadata or a test database(s) exist, the following parameter changes can be used to do this or it can be done as an optimization process as part of the lift and shift.

Stop offloading to cell nodes, disable bloom filters and disable storage indexes:

alter system set cell_offload_processing=false;
alter system set "_bloom_filter_enabled"=false;
alter system set "_kcfis_storageidx_disabled"=true;

You can then verify offloading to cell nodes is no longer occurring with the following query:

select a.name, b.value/1024/1024 MB from v$sysstat a, v$sysstat b
where a.statistic#=b.statistic# and (a.name in ('physical read total bytes', 
'physical write total bytes', 'cell IO uncompressed bytes')
or a.name like 'cell phy%');

The results should no longer show cell activity once this is put into place and you can check offloading and smart scans.  SGA can be increased and other features will still be in play, but this can then be a step by step process as we migrate to the cloud.  It also gives us the ability to shut off features and test in a controlled situation.

I won’t say migrating off of Exadata is easy, but I will say it’s easily doable.  The main goal is to ensure that customers are satisfied in the long run and this is where the steps you see in this post are important.  It’s not as much a lift and shift as it is a lift and build to the cloud.  This would be a requirement in going to any non-Exadata solution and this one will save the customer money and grant them the scalability they didn’t have with an on-prem engineered solution.

 

 




UTL_FILE_DIR and 18c revisited

A couple of years back (wow…time flies!) I made a video and a post about the de-support of UTL_FILE_DIR in 18c. This was a good thing, because the number of people opting for “utl_file_dir=*” in their init.ora file, and hence opening themselves up to all sorts of risks, seemed to be large! (The post elaborates on this more with an example of erasing your database with a short UTL_FILE script.)

But a friend came to me with a valid rebuttal, namely that while most people tended to use UTL_FILE_DIR to read/write files to a small list of directories (most typically one), other customers had taken advantage of UTL_FILE to utilise complex directory structures with hundreds or even thousands of sub-folders. Converting that infrastructure (which is trivially implemented with “utl_file_dir=*” even with all its risks) is a tougher proposition when using directory objects in the database, because a single directory object points to a single directory.

To assist with moving away from utl_file_dir in such a circumstance, I’ve whipped up a small utility which aims to be a drop-in replacement for your UTL_FILE.FOPEN calls. The intent of the utility is that moving to the new directory objects method should simply be the case of adding a nested call. Thus if you had an existing file open call of:


f := utl_file.fopen('/u01/app/oracle','myfile.dat',...);

you simply nest the utility function as follows:


f := utl_file.fopen( utl_file_dir('/u01/app/oracle'),'myfile.dat',...);

and you should be good to go.

The way this is achieved is that the utility does its best to come up with a sensible but unique name for a directory object to map to the standard OS path that has been passed. The comments in the code explain the workings, and of course you’re free to modify it to suit your needs.



create or replace
function utl_file_dir(p_file_or_dir varchar2, 
                      p_create_dir boolean default true, 
                      p_add_hash boolean default true) return varchar2 is
  pragma autonomous_transaction;

  l_default_dir varchar2(128) := 'TEMP';
  l_delim       varchar2(1)   := '/';   -- change for Windows

  l_last_delim  int           := instr(p_file_or_dir,l_delim,-1);
  l_file        varchar2(255); 
  l_dir_path    varchar2(255);
  l_dir_object  varchar2(255);
  l_exists      int;
  l_clash       int;
begin
  if l_last_delim = 0 then
     --
     -- If no presence of a directory delimiter, then we assume the entire
     -- string is a file if it contains a '.' (ie, a file extension)
     -- and return a default directory object name (l_default_dir)
     -- otherwise we assume the string is a directory in its own right
     --
     if p_file_or_dir like '%.%' then
       l_dir_object := l_default_dir;
     else
       l_dir_path   := p_file_or_dir;
     end if;
  else
     --
     -- We have a delimiter. The directory is the contents up to
     -- the last delimiter, unless there is no file extension past
     -- that last delimiter. In that latter case, we assume the entire
     -- string is a directory in its own right
     --
     l_file      := substr(p_file_or_dir,l_last_delim+1);
     if l_file like '%.%' then
       l_dir_path     := substr(p_file_or_dir,1,l_last_delim-1);
     else
       l_dir_path     := p_file_or_dir;
     end if;
  end if;

  --
  -- Now we make a clean directory object name from the path. We could
  -- of course use any random string, but this is designed to make things
  -- a little more intuitive. 
  -- 
  -- For example '/u01/app/oracle' will become U01_APP_ORACLE
  --
  -- You have a choice here in terms of the risk element of collisions depending 
  -- on how loose your folder structure is.  For example, the two paths:
  --
  --   /u01/app/oracle/
  --   /u01/app/"oracle-"/
  --
  -- by default will map to the same clean name of U01_APP_ORACLE and we will   
  -- report an error in this instance.
  -- 
  -- Alternatively (and the default) is that we take our directory path and 
  -- grab the last few bytes from a MD5 hash on it, to greatly increase the likelihood
  -- of a non-clashing directory name.  In the above example, the clean directory names become
  --
  --   U01_APP_ORACLE_25B9C47A
  --   U01_APP_ORACLE_7D51D324
  -- 
  -- So what you lose in intuitive readability you gain in reduced chance of collision.
  -- This is controlled with "p_add_hash"
  --
  if l_dir_object is null then
     l_dir_object := regexp_replace(replace(replace(l_dir_path,l_delim,'_'),'-','_'),'[^[:alnum:] _]');
     l_dir_object := regexp_replace(trim('_' from upper(regexp_replace(l_dir_object,'  *','_'))),'__*','_');
     if p_add_hash then
       select substr(l_dir_object,1,119)||'_'||substr(standard_hash(l_dir_path,'MD5'),1,8)
       into   l_dir_object
       from   dual;
     else
       l_dir_object := substr(l_dir_object,1,128);
     end if;
  end if;

  -- Now we go ahead and create that directory on the database.
  -- The user running this function must have CREATE ANY DIRECTORY privilege granted
  -- explicitly, which means of course, you should protect this routine and perhaps add
  -- some sanity checking to make sure that no-one creates a directory to reference (say) 
  -- the objects in V$DATAFILE !
  
  if p_create_dir then
    select count(*),
           count(case when directory_path != l_dir_path then 1 end) 
    into   l_exists,
           l_clash
    from   all_directories
    where  directory_name = l_dir_object;

    if l_exists = 0 then
      execute immediate 'create directory "'||l_dir_object||'" as q''{'||l_dir_path||'}''';
    else
      --
      -- If (hash or not) we enter the nasty situation where the same clean name would
      -- map to 2 path names, we give up and go home.
      --
      if l_clash > 0 then
        raise_application_error(-20000,'Found matching directory object '||l_dir_object||' with different path from >'||l_dir_path||'<');
      end if;
    end if;
  end if;
  
  commit;
  return l_dir_object;
end;
/
sho err

Here are some examples of typical usage. By default we would actually create the directory object, but you can override this for simple testing like I’ve done below



SQL> -- Examples
SQL>
SQL> variable dirname varchar2(128)
SQL>
SQL> -- standard file
SQL> exec :dirname := utl_file_dir('/u01/app/oracle/test.dat',p_create_dir=>false)

PL/SQL procedure successfully completed.

SQL> print dirname

DIRNAME
-----------------------------------------------------------------------------------------------
U01_APP_ORACLE_9E2472BB

SQL>
SQL> -- quoted/spaces etc
SQL> exec :dirname :=  utl_file_dir('/u01/"asd app"/oracle/test.dat',p_create_dir=>false)

PL/SQL procedure successfully completed.

SQL> print dirname

DIRNAME
-----------------------------------------------------------------------------------------------
U01_ASD_APP_ORACLE_FFDC5BEA

SQL>
SQL> -- trailing delimiter.
SQL> exec :dirname :=  utl_file_dir('/u01/app/oracle/',p_create_dir=>false)

PL/SQL procedure successfully completed.

SQL> print dirname

DIRNAME
-----------------------------------------------------------------------------------------------
U01_APP_ORACLE_25B9C47A

SQL> exec :dirname :=  utl_file_dir('/u01/app/oracle--/',p_create_dir=>false)

PL/SQL procedure successfully completed.

SQL> print dirname

DIRNAME
-----------------------------------------------------------------------------------------------
U01_APP_ORACLE_7D51D324

SQL>
SQL> -- no file
SQL> exec :dirname :=  utl_file_dir('/u01/app/oracle',p_create_dir=>false)

PL/SQL procedure successfully completed.

SQL> print dirname

DIRNAME
-----------------------------------------------------------------------------------------------
U01_APP_ORACLE_9E2472BB

SQL>
SQL> -- no delimiter
SQL> exec :dirname :=  utl_file_dir('mydir',p_create_dir=>false)

PL/SQL procedure successfully completed.

SQL> print dirname

DIRNAME
-----------------------------------------------------------------------------------------------
MYDIR_33CD7707

SQL>
SQL> -- no delimiter but probably a file
SQL> exec :dirname :=  utl_file_dir('mydir.txt',p_create_dir=>false)

PL/SQL procedure successfully completed.

SQL> print dirname

DIRNAME
-----------------------------------------------------------------------------------------------
TEMP

TL;DR: This utility makes moving away from UTL_FILE_DIR less difficult.

Code at https://github.com/connormcd/misc-scripts

 

MySQL – manually updating table and index optimizer statistics

The goal was to create an empty table and populate it with optimizer table and index stats from a populated table, to try and get the optimizer to give the same explain plan and costs on the empty table as on the populated table. The purpose of this was to be able to modify the empty table by adding indexes quickly and cheaply and still be able to get correct optimizer costs. This worked in the case of table access but unfortunately didn’t work when bringing in an index.

Create procedure to populate test table

DROP PROCEDURE IF EXISTS populate_data;
DELIMITER //
CREATE PROCEDURE populate_data(max_val int)
BEGIN
DECLARE int_val INT DEFAULT 0;
test_loop : LOOP
IF (int_val = max_val) THEN
LEAVE test_loop;
END IF;
INSERT INTO source(id) VALUES (FLOOR(RAND()*(1000000+1)+1));
SET int_val = int_val +1;
END LOOP;
END;//
DELIMITER ;

Create the source table and populate it.
Create source_clone, which will be empty but whose optimizer statistics we want to manually fill in to mimic the source table:

CREATE TABLE source(id integer);
call populate_data(1000000);
CREATE TABLE source_clone (id integer) ;

Comparing table and index stats on source and source_clone, we can see, as expected, that they are different:

select * from mysql.innodb_table_stats where table_name like 'sourc%';
+---------------+--------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name   | last_update         | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+--------------+---------------------+--------+----------------------+--------------------------+
| kyle          | source       | 2020-01-22 18:49:48 |   9980 |                   21 |                        0 |
| kyle          | source_clone | 2020-01-22 18:49:44 |      0 |                    1 |                        0 |
+---------------+--------------+---------------------+--------+----------------------+--------------------------+
select * from mysql.innodb_index_stats where table_name like 'sourc%';
+---------------+--------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name   | index_name      | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+--------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| kyle          | source       | GEN_CLUST_INDEX | 2020-01-22 18:49:48 | n_diff_pfx01 |       9980 |          20 | DB_ROW_ID                         |
| kyle          | source       | GEN_CLUST_INDEX | 2020-01-22 18:49:48 | n_leaf_pages |         20 |        NULL | Number of leaf pages in the index |
| kyle          | source       | GEN_CLUST_INDEX | 2020-01-22 18:49:48 | size         |         21 |        NULL | Number of pages in the index      |
| kyle          | source_clone | GEN_CLUST_INDEX | 2020-01-22 18:49:44 | n_diff_pfx01 |          0 |           1 | DB_ROW_ID                         |
| kyle          | source_clone | GEN_CLUST_INDEX | 2020-01-22 18:49:44 | n_leaf_pages |          1 |        NULL | Number of leaf pages in the index |
| kyle          | source_clone | GEN_CLUST_INDEX | 2020-01-22 18:49:44 | size         |          1 |        NULL | Number of pages in the index      |
+---------------+--------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+

Running the explain plans, we get different row estimates, as expected:

--
mysql> explain select id from kyle.source where id > 2056;
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table  | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | source | ALL  | NULL          | NULL | NULL    | NULL | 9980 | Using where |
+----+-------------+--------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.05 sec)

mysql> explain select id from kyle.source_clone where id > 2056;
+----+-------------+--------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | source_clone | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.04 sec)

Now let’s copy the stats from the populated source table to the empty source_clone table:

-- TABLE  STATS
update mysql.innodb_table_stats AS `dest`,
     (SELECT  n_rows , clustered_index_size 
     FROM mysql.innodb_table_stats 
     WHERE 
         table_name = 'source'
    ) AS `src`
SET dest.n_rows  = src.n_rows , 
    dest.clustered_index_size  = src.clustered_index_size
WHERE 
         table_name = 'source_clone';
-- INDEX STATS
UPDATE mysql.innodb_index_stats AS `dest`,
(SELECT stat_value, stat_name
FROM mysql.innodb_index_stats
WHERE
table_name = 'source'
and index_name = 'GEN_CLUST_INDEX'
) AS `src`
SET dest.stat_value = src.stat_value
WHERE dest.table_name = 'source_clone'
and dest.index_name = 'GEN_CLUST_INDEX'
and dest.stat_name = src.stat_name;

The optimizer table and index statistics are now the same

mysql> select * from mysql.innodb_table_stats where table_name like 'sourc%';
+---------------+--------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name   | last_update         | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+--------------+---------------------+--------+----------------------+--------------------------+
| kyle          | source       | 2020-01-22 18:49:48 |   9980 |                   21 |                        0 |
| kyle          | source_clone | 2020-01-22 18:54:31 |   9980 |                   21 |                        0 |
+---------------+--------------+---------------------+--------+----------------------+--------------------------+
--
mysql> select * from mysql.innodb_index_stats where table_name like 'sourc%';
+---------------+--------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name   | index_name      | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+--------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| kyle          | source       | GEN_CLUST_INDEX | 2020-01-22 18:49:48 | n_diff_pfx01 |       9980 |          20 | DB_ROW_ID                         |
| kyle          | source       | GEN_CLUST_INDEX | 2020-01-22 18:49:48 | n_leaf_pages |         20 |        NULL | Number of leaf pages in the index |
| kyle          | source       | GEN_CLUST_INDEX | 2020-01-22 18:49:48 | size         |         21 |        NULL | Number of pages in the index      |
| kyle          | source_clone | GEN_CLUST_INDEX | 2020-01-22 18:54:46 | n_diff_pfx01 |       9980 |           1 | DB_ROW_ID                         |
| kyle          | source_clone | GEN_CLUST_INDEX | 2020-01-22 18:54:46 | n_leaf_pages |         20 |        NULL | Number of leaf pages in the index |
| kyle          | source_clone | GEN_CLUST_INDEX | 2020-01-22 18:54:46 | size         |         21 |        NULL | Number of pages in the index      |
+---------------+--------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+

Running explain, we still get a different row estimate on the source_clone:

--
mysql> explain select id from kyle.source_clone where id > 2056;
+----+-------------+--------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | source_clone | ALL  | NULL          | NULL | NULL    | NULL |    1 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+------+-------------+

But by doing a flush table on the source_clone, we now get the same rows returned in the explain as for the source:

 FLUSH TABLE source_clone;
--
mysql> explain select id from kyle.source_clone where id > 2056;
+----+-------------+--------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table        | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+--------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | source_clone | ALL  | NULL          | NULL | NULL    | NULL | 9980 | Using where |
+----+-------------+--------------+------+---------------+------+---------+------+------+-------------+

Now let’s create an index and try to manually update the optimizer index stats

CREATE INDEX source_id ON source (id);
CREATE INDEX source_clone_id ON source_clone (id);
--
mysql> explain select id from kyle.source where id > 2056;
+----+-------------+--------+-------+---------------+-----------+---------+------+------+--------------------------+
| id | select_type | table  | type  | possible_keys | key       | key_len | ref  | rows | Extra                    |
+----+-------------+--------+-------+---------------+-----------+---------+------+------+--------------------------+
|  1 | SIMPLE      | source | range | source_id     | source_id | 5       | NULL | 4990 | Using where; Using index |
+----+-------------+--------+-------+---------------+-----------+---------+------+------+--------------------------+
--
mysql> explain select id from kyle.source_clone where id > 2056;
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table        | type  | possible_keys   | key             | key_len | ref  | rows | Extra                    |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | source_clone | range | source_clone_id | source_clone_id | 5       | NULL |    1 | Using where; Using index |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
--
UPDATE mysql.innodb_index_stats AS `dest`,
(SELECT stat_value, stat_name
FROM mysql.innodb_index_stats
WHERE
table_name = 'source'
and index_name = 'source_id'
) AS `src`
SET dest.stat_value = src.stat_value
WHERE dest.table_name = 'source_clone'
and dest.index_name = 'source_clone_id'
and dest.stat_name = src.stat_name;
--
mysql> select * from mysql.innodb_index_stats where table_name like 'sourc%';
+---------------+--------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name   | index_name      | last_update         | stat_name    | stat_value | sample_size | stat_description                  |
+---------------+--------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
| kyle          | source       | GEN_CLUST_INDEX | 2020-01-22 18:56:46 | n_diff_pfx01 |       9980 |          20 | DB_ROW_ID                         |
| kyle          | source       | GEN_CLUST_INDEX | 2020-01-22 18:56:46 | n_leaf_pages |         20 |        NULL | Number of leaf pages in the index |
| kyle          | source       | GEN_CLUST_INDEX | 2020-01-22 18:56:46 | size         |         21 |        NULL | Number of pages in the index      |
| kyle          | source       | source_id       | 2020-01-22 18:56:46 | n_diff_pfx01 |       9950 |          11 | id                                |
| kyle          | source       | source_id       | 2020-01-22 18:56:46 | n_diff_pfx02 |      10000 |          11 | id,DB_ROW_ID                      |
| kyle          | source       | source_id       | 2020-01-22 18:56:46 | n_leaf_pages |         11 |        NULL | Number of leaf pages in the index |
| kyle          | source       | source_id       | 2020-01-22 18:56:46 | size         |         12 |        NULL | Number of pages in the index      |
| kyle          | source_clone | GEN_CLUST_INDEX | 2020-01-22 18:56:46 | n_diff_pfx01 |       9980 |           1 | DB_ROW_ID                         |
| kyle          | source_clone | GEN_CLUST_INDEX | 2020-01-22 18:56:46 | n_leaf_pages |         20 |        NULL | Number of leaf pages in the index |
| kyle          | source_clone | GEN_CLUST_INDEX | 2020-01-22 18:56:46 | size         |         21 |        NULL | Number of pages in the index      |
| kyle          | source_clone | source_clone_id | 2020-01-22 18:57:38 | n_diff_pfx01 |       9950 |           1 | id                                |
| kyle          | source_clone | source_clone_id | 2020-01-22 18:57:38 | n_diff_pfx02 |      10000 |           1 | id,DB_ROW_ID                      |
| kyle          | source_clone | source_clone_id | 2020-01-22 18:57:38 | n_leaf_pages |         11 |        NULL | Number of leaf pages in the index |
| kyle          | source_clone | source_clone_id | 2020-01-22 18:57:38 | size         |         12 |        NULL | Number of pages in the index      |
+---------------+--------------+-----------------+---------------------+--------------+------------+-------------+-----------------------------------+
--
mysql> explain select id from kyle.source_clone where id > 2056;
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table        | type  | possible_keys   | key             | key_len | ref  | rows | Extra                    |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | source_clone | range | source_clone_id | source_clone_id | 5       | NULL |    1 | Using where; Using index |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
--
 FLUSH TABLE source_clone;
--
mysql> explain select id from kyle.source_clone where id > 2056;
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table        | type  | possible_keys   | key             | key_len | ref  | rows | Extra                    |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | source_clone | range | source_clone_id | source_clone_id | 5       | NULL |    1 | Using where; Using index |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
--
Flush tables;
--
mysql> explain select id from kyle.source_clone where id > 2056;
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table        | type  | possible_keys   | key             | key_len | ref  | rows | Extra                    |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | source_clone | range | source_clone_id | source_clone_id | 5       | NULL |    1 | Using where; Using index |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
-- REBOOT database
mysql> explain select id from kyle.source_clone where id > 2056;
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
| id | select_type | table        | type  | possible_keys   | key             | key_len | ref  | rows | Extra                    |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | source_clone | range | source_clone_id | source_clone_id | 5       | NULL |    1 | Using where; Using index |
+----+-------------+--------------+-------+-----------------+-----------------+---------+------+------+--------------------------+

The optimizer index stats are the same, we have flushed the tables, and we have even rebooted the database, but the explain plan still shows a different row estimate for source_clone.
The optimizer must be basing its costs in this case on something other than mysql.innodb_index_stats and mysql.innodb_table_stats.
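
One thing that might be worth checking (this is a suggestion of mine, not part of the original test) is the cardinality each table reports via SHOW INDEX, which is taken from InnoDB's cached statistics – roughly, the numbers the optimizer works with – and whether a plain ANALYZE TABLE, which recomputes the persistent stats and reloads them, brings the clone's estimate back in line:

-- Not part of the original test, just a possible next step.
-- Compare the Cardinality column reported for each table:
mysql> SHOW INDEX FROM kyle.source;
mysql> SHOW INDEX FROM kyle.source_clone;
-- ANALYZE TABLE recomputes the persistent stats and reloads them,
-- which is the usual supported way to refresh a table's estimates:
mysql> ANALYZE TABLE kyle.source_clone;
mysql> explain select id from kyle.source_clone where id > 2056;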
 

 

Philosophy 23

It’s a long time since I wrote a note with the “philosophy” theme, but a recent complaint about Oracle prompted me to suggest the following thought:

“If you write SQL that is technically incorrect it’s not Oracle’s fault if sometimes the SQL completes without an error.”

Consider the following fragment of code:

drop table t1 purge;

create table t1 (n1 number, n2 number);
insert into t1 values (1,1);
commit;

select n1 from t1 where n2 = to_number('x');
select n1 from t1 where n2 = to_number('x') and n1 = 2;

select 
        case 
                when 
                        0 in  (select n1 from t1 where n2 = to_number('x'))
                then
                        1
                else
                        0
        end
from
        dual
;

Clearly the first query must raise an error because ‘x’ can’t be converted to a number (until Oracle supplies a format option to read it as a Roman Numeral).

Clearly the second query must raise an error because it’s just the first query with an extra predicate on the end.

Clearly the third query must raise an error because it’s going to execute a subquery that must raise an error.

Here’s the output from running the code from SQL*Plus (on 12.2.0.1).

Table dropped.


Table created.


1 row created.


Commit complete.

select n1 from t1 where n2 = to_number('x')
                                       *
ERROR at line 1:
ORA-01722: invalid number



no rows selected


CASEWHEN0IN(SELECTN1FROMT1WHEREN2=TO_NUMBER('X'))THEN1ELSE0END
--------------------------------------------------------------
                                                             0

1 row selected.

The first query fails with the expected conversion error: ORA-01722: invalid number. The second and third queries run to completion because SQL is a declarative language, not a procedural language, and the optimizer is allowed to transform your query in any way that it thinks might make it complete more quickly.

Here’s the execution plan – with predicate information – for the second query:

-------------------------------------+-----------------------------------+
| Id  | Operation          | Name    | Rows  | Bytes | Cost  | Time      |
-------------------------------------+-----------------------------------+
| 0   | SELECT STATEMENT   |         |       |       |     2 |           |
| 1   |  TABLE ACCESS FULL | T1      |     1 |    26 |     2 |  00:00:01 |
-------------------------------------+-----------------------------------+

Predicate Information:
----------------------
1 - filter(("N1"=2 AND "N2"=TO_NUMBER('x')))

The optimizer has decided that it’s more efficient to test for the constant 2 than it is to call the to_number() function, so it evaluates n1 = 2 first for each row and never has to check the second predicate because nothing gets past the first.
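
A quick way to convince yourself of that ordering (my addition, using the same one-row table): change the constant so that the row does get past the first predicate, and the conversion error should come straight back:

select n1 from t1 where n2 = to_number('x') and n1 = 1;
-- the row (1,1) passes "n1 = 1", so to_number('x') now has to be evaluated
-- for it, and you should see ORA-01722: invalid number again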

The explanation for the successful completion of the third query is slightly different, but again it revolves around transforming for efficiency. Oracle will (very often) convert an IN subquery to an EXISTS subquery. In my example the resulting SQL looks like this (taken from the CBO (10053) trace file, with some cosmetic enhancement):


SELECT 
        CASE
                WHEN 
                        EXISTS (SELECT 0 FROM T1 WHERE N1=0 AND N2=TO_NUMBER('x')) 
                THEN 1 
                ELSE 0 
        END  "CASEWHEN0IN(SELECTN1FROMT1WHEREN2=TO_NUMBER('X'))THEN1ELSE0END" 
FROM
        DUAL

Note how the “guaranteed to fail” subquery has an extra predicate added as the IN subquery is transformed into an EXISTS subquery and, just like the previous example, the extra predicate is applied before the to_number() predicate, and there’s no data to match the first predicate so the to_number() function never gets called and never gets a chance to raise an exception.

You could argue, of course, that the optimizer should spot the attempt to generate a constant and evaluate it (if possible) at parse time and raise the error before the runtime engine even sees the query – and it wouldn’t be too hard to make a case for that – but it would only take a few seconds of thought to create slightly more complex examples that demonstrate the point I’ve been making.

Bottom line(s):

1) Use the correct datatypes for your data.

2) Don’t assume that a statement that raises an exception in some circumstances will result in an exception if it is subsequently embedded in a more complex statement. The optimizer is non-procedural and may transform your statement in a way that bypasses your bad design.
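
As an aside (not from the original note): if you really are stuck comparing a numeric column with a string that might not convert, 12.2 lets you make the conversion fail safely instead of relying on whatever order the optimizer happens to evaluate the predicates in:

select n1
from   t1
where  n2 = to_number('x' default null on conversion error)
and    n1 = 2;
-- the unconvertible value becomes NULL rather than raising ORA-01722,
-- whichever predicate the optimizer chooses to evaluate first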

 

Footnote:

I probably shouldn’t say this – it will only encourage someone to do the wrong thing – but for the very simplest examples you can show that the ordered_predicates hint still works:


SQL> select /*+ ordered_predicates */  n1 from t1 where n2 = to_number('x') and n1 = 2;
select /*+ ordered_predicates */  n1 from t1 where n2 = to_number('x') and n1 = 2
                                                                  *
ERROR at line 1:
ORA-01722: invalid number

When AWR is not enough, a.k.a. Why your app is not cloud-ready. #SMARTDB

In my previous post I described a case of "enq: TX – row lock contention" that was actually a network latency problem.

More and more of those kinds of problems (it seems history keeps repeating itself) led my friend Radosław Kut and me to write a simple tool to analyze SQL performance from a network perspective.

We called this tool STADO (SQL Tracefile Analyzer Designed for Oracle). It parses tcpdump output from the application server to identify the longest-running queries from an application and network perspective.

For now it can show you:

  • SQL ID
  • Ratio – the share of the application-side elapsed time spent in the network layer (Ela Net / Ela App)
  • Ela App (ms) – time spent in application for executing the query
  • Ela Net (ms) – time spent in network layer for executing the query
  • Exec – number of executions
  • Ela Stddev App – Stddev for time spent from an app perspective
  • Ela App/Exec – AVG time spent on execution from an app perspective
  • Ela Stddev Net – Stddev for time spent from a network perspective
  • Ela Net/Exec – AVG time spent on execution from a network perspective
  • P – number of packets used for executing this cursor
  • S – how many sessions were executing this cursor
  • RC – how many times a cursor was reused, reducing the number of parse calls

Let me show you how this tool works, based on a simple example.

I have two database servers (Oracle XE 18c):

  • 10.0.0.8 – remote destination with ping around 64ms
  • 192.168.56.118 – local destination with ping below 1ms

I’ll execute the following script on both servers:

set arraysize &&1
set timing on

alter system flush shared_pool;

select /*NET_TEST*/ * from EMPLOYEES;

select sql_id, elapsed_time/1000 as ela_ms, executions
from v$sql
where sql_text like '%/*NET_TEST*/%'
order by elapsed_time;

exit

While executing the script, I’ll be dumping network traffic with tcpdump like this:

inter@piclerick stado % sudo tcpdump -i any -w sqls.pcap
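
(If the capture gets large, filtering on the listener port keeps the pcap down to just the database traffic – 1521 being the port used in this test:)

inter@piclerick stado % sudo tcpdump -i any port 1521 -w sqls.pcap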

Let’s analyze our query performance!

REMOTE

inter@piclerick instantclient_18_1 % ./sqlplus hr/hr@10.0.0.8:1521/netst @emps 2

And the result is:

107 rows selected.

Elapsed: 00:00:03.28

SQL_ID		  ELA_MS EXECUTIONS
------------- ---------- ----------
956vu0d29ah05	  13.931	  1
3bxuudrscy1jn	  41.594	  1

Let’s compare the SQL execution time from a database perspective and from an app perspective. We can see that the database spent almost 14ms executing the query, but the whole process from the app’s perspective took 3.28s!!!

Let’s check the same execution on local server:

LOCAL

107 rows selected.

Elapsed: 00:00:00.03

SQL_ID		  ELA_MS EXECUTIONS
------------- ---------- ----------
956vu0d29ah05	   7.702	  1
3bxuudrscy1jn	  26.069	  1

The database shows that the query was around 2 times faster on the local server compared to the remote one. The absolute difference is 6.229ms – but the actual "wallclock" difference is more than 3s! That is huge.

And of course, that’s nothing new – elapsed time in V$SQL is "Elapsed time (in microseconds) used by this cursor for parsing, executing, and fetching", but it won’t show you the time the application actually spent waiting for this SQL.
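
On the database side that missing time is, roughly speaking, buried in the idle wait "SQL*Net message from client", so you can get a feel for it per session with something like the query below (the :my_sid bind is just a placeholder for the session you are looking at):

select event, total_waits, round(time_waited_micro/1000) as time_waited_ms
from   v$session_event
where  sid = :my_sid
and    event like 'SQL*Net message%'
order  by time_waited_micro desc;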

So let’s say that my local server is much slower than my new server in the remote destination:
I reduced the number of CPUs and the amount of RAM, and made sure to keep the database a bit busy on my local machine.

Now my results are:

LOCAL

107 rows selected.

Elapsed: 00:00:00.10

SQL_ID		  ELA_MS EXECUTIONS
------------- ---------- ----------
956vu0d29ah05	  18.101	  1
3bxuudrscy1jn	  47.954	  1

REMOTE

107 rows selected.

Elapsed: 00:00:03.30

SQL_ID		  ELA_MS EXECUTIONS
------------- ---------- ----------
956vu0d29ah05	   8.627	  1
3bxuudrscy1jn	  52.103	  1

From a database perspective, the remote execution is now much faster! But the application shows something very different.

Great, now let’s look at the same executions from a tcpdump perspective, using STADO:

inter@piclerick stado % ./stado
STADO (SQL Tracedump Analyzer Doing Oracle) by Radoslaw Kut and Kamil Stawiarski
Pcap file analyzer for finding TOP SQLs from an APP perspective
  -C string
    	 directory path to write SQL Charts i.e. -C DevApp
  -d int
    	Debug flag
  -f string
    	path to PCAP file for analyzing
  -i string
    	IP address of database server
  -p string
    	Listener port for database server
  -s string
    	Sort by: ela (Elapsed Time), elax (Elapsed time / Executions), pckt (Packets), rtt (Network Elapsed Time), rttx (Elapsed Time Net / Exec), rat (Ela Net / Ela App) (default "ela")

LOCAL – fetchsize 2

inter@piclerick stado % ./stado -f sqls.pcap -i "192.168.56.118 or 10.0.0.8" -p 1521 -s rat
SQL ID		Ratio		Ela App (ms)	Ela Net(ms)	Exec	Ela Stddev App	Ela App/Exec	Ela Stddev Net	Ela Net/Exec	P	S	RC
------------------------------------------------------------------------------------------------------------------------------------------------------------
5jng2h1ttn71y	0.000000	259.280000	0.000000	1	0.000000	259.280000	0.000000	0.000000	2	1	0
3bxuudrscy1jn	0.000000	87.572000	0.000000	1	0.000000	87.572000	0.000000	0.000000	3	1	0
8wmvpvzmgjmyx	0.000000	2.400000	0.000000	1	0.000000	2.400000	0.000000	0.000000	3	1	0
5w5f39jcwb5nv	0.000000	4.343000	0.000000	1	0.000000	4.343000	0.000000	0.000000	2	1	0
638x13474m3x2	0.000000	1.894000	0.000000	1	0.000000	1.894000	0.000000	0.000000	2	1	0
2s9mmb6g8kbqb	0.000000	2.210000	0.000000	1	0.000000	2.210000	0.000000	0.000000	2	1	0
956vu0d29ah05	0.542286	108.854000	59.030000	1	0.000000	108.854000	0.000000	59.030000	109	1	0

Sum App Time(s): 0.46655299999999994
Sum Net Time(s): 0.05903

No. SQLs: 7

192.168.56.118 32 kb


	Time frame:  2020-01-22 12:39:40.448777 +0100 CET  <=>  2020-01-22 12:39:41.19642 +0100 CET
	Time frame duration (s):  0.747643

REMOTE – fetchsize 2

inter@piclerick stado % ./stado -f sqlsr.pcap -i "192.168.56.118 or 10.0.0.8" -p 1521 -s rat
SQL ID		Ratio		Ela App (ms)	Ela Net(ms)	Exec	Ela Stddev App	Ela App/Exec	Ela Stddev Net	Ela Net/Exec	P	S	RC
------------------------------------------------------------------------------------------------------------------------------------------------------------
8wmvpvzmgjmyx	0.000000	59.969000	0.000000	1	0.000000	59.969000	0.000000	0.000000	3	1	0
5w5f39jcwb5nv	0.000000	68.520000	0.000000	1	0.000000	68.520000	0.000000	0.000000	2	1	0
638x13474m3x2	0.000000	63.750000	0.000000	1	0.000000	63.750000	0.000000	0.000000	2	1	0
2s9mmb6g8kbqb	0.000000	59.799000	0.000000	1	0.000000	59.799000	0.000000	0.000000	2	1	0
5jng2h1ttn71y	0.000000	241.471000	0.000000	1	0.000000	241.471000	0.000000	0.000000	2	1	0
3bxuudrscy1jn	0.000000	108.820000	0.000000	1	0.000000	108.820000	0.000000	0.000000	3	1	0
956vu0d29ah05	0.972269	3200.125000	3111.382000	1	0.000000	3200.125000	0.000000	3111.382000	109	1	0

Sum App Time(s): 3.8024539999999996
Sum Net Time(s): 3.111382

No. SQLs: 7

10.0.0.8 32 kb


	Time frame:  2020-01-22 12:40:05.489516 +0100 CET  <=>  2020-01-22 12:40:10.377233 +0100 CET
	Time frame duration (s):  4.887717

Here we can clearly see which SQL caused the biggest problem, and that the time spent on the network was actually 97% of the whole execution time! (In the case of the local server the network time was only 54% – about 59ms.)

In most cases, this problem is due to a chatty application – like in this case: 107 rows were returned by 109 packets. That’s crazy.
In SQL*Plus we can control the fetch size with the arraysize parameter (the default is 15, so SQL*Plus normally fetches 15 rows at a time; the runs above used 2, and the next ones use 14 and 200).
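
For reference, the runs below were produced simply by re-running the same @emps script with a different first argument, along the lines of the earlier invocation:

inter@piclerick instantclient_18_1 % ./sqlplus hr/hr@10.0.0.8:1521/netst @emps 14
inter@piclerick instantclient_18_1 % ./sqlplus hr/hr@10.0.0.8:1521/netst @emps 200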

Let’s see how arraysize parameter influences the execution time:

LOCAL – fetchsize 14

inter@piclerick stado % ./stado -f sqls.pcap -i "192.168.56.118 or 10.0.0.8" -p 1521 -s rat
SQL ID		Ratio		Ela App (ms)	Ela Net(ms)	Exec	Ela Stddev App	Ela App/Exec	Ela Stddev Net	Ela Net/Exec	P	S	RC
------------------------------------------------------------------------------------------------------------------------------------------------------------
3bxuudrscy1jn	0.000000	72.221000	0.000000	1	0.000000	72.221000	0.000000	0.000000	3	1	0
8wmvpvzmgjmyx	0.000000	1.298000	0.000000	1	0.000000	1.298000	0.000000	0.000000	3	1	0
5w5f39jcwb5nv	0.000000	10.774000	0.000000	1	0.000000	10.774000	0.000000	0.000000	2	1	0
638x13474m3x2	0.000000	17.649000	0.000000	1	0.000000	17.649000	0.000000	0.000000	2	1	0
2s9mmb6g8kbqb	0.000000	1.427000	0.000000	1	0.000000	1.427000	0.000000	0.000000	2	1	0
5jng2h1ttn71y	0.000000	199.157000	0.000000	1	0.000000	199.157000	0.000000	0.000000	2	1	0
956vu0d29ah05	0.226816	41.937000	9.512000	1	0.000000	41.937000	0.000000	9.512000	17	1	0

Sum App Time(s): 0.344463
Sum Net Time(s): 0.009512

No. SQLs: 7

192.168.56.118 27 kb


	Time frame:  2020-01-22 14:14:23.996268 +0100 CET  <=>  2020-01-22 14:14:24.45883 +0100 CET
	Time frame duration (s):  0.462562

REMOTE – fetchsize 14

inter@piclerick stado % ./stado -f sqlsr.pcap -i "192.168.56.118 or 10.0.0.8" -p 1521 -s rat
SQL ID		Ratio		Ela App (ms)	Ela Net(ms)	Exec	Ela Stddev App	Ela App/Exec	Ela Stddev Net	Ela Net/Exec	P	S	RC
------------------------------------------------------------------------------------------------------------------------------------------------------------
3bxuudrscy1jn	0.000000	116.133000	0.000000	1	0.000000	116.133000	0.000000	0.000000	3	1	0
8wmvpvzmgjmyx	0.000000	59.618000	0.000000	1	0.000000	59.618000	0.000000	0.000000	3	1	0
5w5f39jcwb5nv	0.000000	64.575000	0.000000	1	0.000000	64.575000	0.000000	0.000000	2	1	0
638x13474m3x2	0.000000	59.540000	0.000000	1	0.000000	59.540000	0.000000	0.000000	2	1	0
2s9mmb6g8kbqb	0.000000	59.320000	0.000000	1	0.000000	59.320000	0.000000	0.000000	2	1	0
5jng2h1ttn71y	0.000000	256.340000	0.000000	1	0.000000	256.340000	0.000000	0.000000	2	1	0
956vu0d29ah05	0.836899	484.620000	405.578000	1	0.000000	484.620000	0.000000	405.578000	17	1	0

Sum App Time(s): 1.1001460000000003
Sum Net Time(s): 0.405578

No. SQLs: 7

10.0.0.8 27 kb


	Time frame:  2020-01-22 14:13:49.553632 +0100 CET  <=>  2020-01-22 14:13:51.847139 +0100 CET
	Time frame duration (s):  2.293507

LOCAL – fetchsize 200

inter@piclerick stado % ./stado -f sqls.pcap -i "192.168.56.118 or 10.0.0.8" -p 1521 -s rat
SQL ID		Ratio		Ela App (ms)	Ela Net(ms)	Exec	Ela Stddev App	Ela App/Exec	Ela Stddev Net	Ela Net/Exec	P	S	RC
------------------------------------------------------------------------------------------------------------------------------------------------------------
3bxuudrscy1jn	0.000000	87.994000	0.000000	1	0.000000	87.994000	0.000000	0.000000	3	1	0
8wmvpvzmgjmyx	0.000000	1.121000	0.000000	1	0.000000	1.121000	0.000000	0.000000	3	1	0
5w5f39jcwb5nv	0.000000	2.315000	0.000000	1	0.000000	2.315000	0.000000	0.000000	2	1	0
638x13474m3x2	0.000000	14.832000	0.000000	1	0.000000	14.832000	0.000000	0.000000	2	1	0
2s9mmb6g8kbqb	0.000000	1.259000	0.000000	1	0.000000	1.259000	0.000000	0.000000	2	1	0
5jng2h1ttn71y	0.000000	187.470000	0.000000	1	0.000000	187.470000	0.000000	0.000000	2	1	0
956vu0d29ah05	0.019606	17.546000	0.344000	1	0.000000	17.546000	0.000000	0.344000	9	1	0

Sum App Time(s): 0.31253699999999995
Sum Net Time(s): 0.00034399999999999996

No. SQLs: 7

192.168.56.118 26 kb


	Time frame:  2020-01-22 14:30:09.225337 +0100 CET  <=>  2020-01-22 14:30:09.664531 +0100 CET
	Time frame duration (s):  0.439194

REMOTE – fetchsize 200

inter@piclerick stado % ./stado -f sqlsr.pcap -i "192.168.56.118 or 10.0.0.8" -p 1521 -s rat
SQL ID		Ratio		Ela App (ms)	Ela Net(ms)	Exec	Ela Stddev App	Ela App/Exec	Ela Stddev Net	Ela Net/Exec	P	S	RC
------------------------------------------------------------------------------------------------------------------------------------------------------------
2s9mmb6g8kbqb	0.000000	63.342000	0.000000	1	0.000000	63.342000	0.000000	0.000000	2	1	0
5jng2h1ttn71y	0.000000	237.884000	0.000000	1	0.000000	237.884000	0.000000	0.000000	2	1	0
3bxuudrscy1jn	0.000000	119.722000	0.000000	1	0.000000	119.722000	0.000000	0.000000	3	1	0
8wmvpvzmgjmyx	0.000000	64.242000	0.000000	1	0.000000	64.242000	0.000000	0.000000	3	1	0
5w5f39jcwb5nv	0.000000	67.841000	0.000000	1	0.000000	67.841000	0.000000	0.000000	2	1	0
638x13474m3x2	0.000000	64.604000	0.000000	1	0.000000	64.604000	0.000000	0.000000	2	1	0
956vu0d29ah05	0.488563	127.828000	62.452000	1	0.000000	127.828000	0.000000	62.452000	11	1	0

Sum App Time(s): 0.745463
Sum Net Time(s): 0.062452

No. SQLs: 7

10.0.0.8 26 kb


	Time frame:  2020-01-22 14:33:46.121799 +0100 CET  <=>  2020-01-22 14:33:48.165371 +0100 CET
	Time frame duration (s):  2.043572

As we can see, we reduced the time of the SQL statement from more than 3000ms to around 128ms just by modifying the size of a single fetch – fewer packets means fewer round trips.

Still, the network plays a significant role in the performance of this query (about 62ms, roughly half of the elapsed time), and the bigger the latency, the slower the query will be.
Imagine that you have an application with a lot of these kinds of chatty cursors, no PL/SQL logic, and everything based on Java or Python or something like that… Well, most of you don’t have to imagine – you see those systems every day!

If you take this kind of system and migrate it to some fancy new hardware with a turbo-cool automagic hocus-pocus self-problem-solving platform, will it run faster? If this marvelous platform is not a network switch away from your app server, the answer is simple and brutal – NO.

Cloud won’t solve your problems. If your app is designed in a shitty way – cloud will escalate your problems. There is no magic – only physics.

If you want to check whether your systems are network-sensitive and chatty – just take a tcpdump and analyze the output with STADO, which can be found here:
https://github.com/ora600pl/stado