Search

Top 60 Oracle Blogs

Recent comments

Oakies Blog Aggregator

improving performance with stored procedures — a pgbench example.

improving performance with stored procedures — a pgbench example.

In a previous post I mentioned that I do not use pgbench to benchmark the platform. But when it comes to measuring client/server application, pgbench fully makes makes sense.

I initialize the pgbench schema with small data:

pgbench - initialize - init-steps=dtgvpf -h localhost -p 5432 -U postgres franck

And I run the pgbench builtin workload with does something like a TPC-B

tpcb-like builtin

pgbench --builtin tpcb-like --transactions 30000 --protocol=prepared --jobs=10 --client=10 -h localhost -p 5432 -U postgres franck

I run 30000 transactions there, from 10 threads. It runs for more than 4 minutes:

The rate is 1097 transactions per second with an average of 9 milliseconds per transaction.

That’s my baseline. What the builtin transaction runs is easy to get from the source:

postgres/postgres

As pgbench can also run custom workloads, I’ll run exactly the same workload by copying those statements in a file.

tpcb-like as a custom file

Here is the file containing the same as the builtin found in pgbench.c:

cat > /tmp/tpcb-like <<'CAT'
-- tpcb-like
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
CAT

Now running the workload from this file, still 30000 transactions from 10 threads.

pgbench --file /tmp/tpcb-like --transactions 30000 --protocol=prepared --jobs=10 --client=10 -h localhost -p 5432 -U postgres franck

The result is very similar:

The rate is 1095 transactions per second with an average of 9 milliseconds per transaction. Same statements and same throughput.

tpcb-like as a procedure + select

I create a stored procedure with all INSERT/UPDATE statements:

create procedure P_TPCB_LIKE
(p_aid integer, p_bid integer, p_tid integer, p_delta integer) AS $$
BEGIN
UPDATE pgbench_accounts SET abalance = abalance + p_delta WHERE aid = p_aid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (p_tid, p_bid, p_aid, p_delta, CURRENT_TIMESTAMP);
UPDATE pgbench_tellers SET tbalance = tbalance + p_delta WHERE tid = p_tid;
UPDATE pgbench_branches SET bbalance = bbalance + p_delta WHERE bid = p_bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (p_tid, p_bid, p_aid, p_delta, CURRENT_TIMESTAMP);
END;
$$ language plpgsql;

Now, the custom file will only call that procedure for the modifications, and run the select:

cat > /tmp/simple-update-p <<'CAT'
-- tpcb-like
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
BEGIN;
call P_SIMPLE_UPDATE(:aid, :bif, :tid, :delta);
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
END;
CAT

This is functionally equivalent and I run again the same number of transactions from the same number of threads:

pgbench --file /tmp/tpcb-like-p --transactions 30000 --protocol=prepared --jobs=10 --client=10 -h localhost -p 5432 -U postgres franck

Now I have a huge gain here as the throughput is 3 times higher:

The rate is 3024 transactions per second with an average of 3 milliseconds per transaction. Same statements but stored on the server.

tpcb-like as a function with refcursor

Ideally, each client/server call should be only one statement. And then I must include the SELECT part in my stored procedure. That is possible with a function that returns a refcursor:

create function F_TPCB_LIKE
(p_aid integer, p_bid integer, p_tid integer, p_delta integer) returns table(abalance integer) AS $$
DECLARE
c refcursor;
BEGIN
UPDATE pgbench_accounts SET abalance = pgbench_accounts.abalance + p_delta WHERE aid = p_aid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (p_tid, p_bid, p_aid, p_delta, CURRENT_TIMESTAMP);
UPDATE pgbench_tellers SET tbalance = tbalance + p_delta WHERE tid = p_tid;
UPDATE pgbench_branches SET bbalance = bbalance + p_delta WHERE bid = p_bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (p_tid, p_bid, p_aid, p_delta, CURRENT_TIMESTAMP);
return query SELECT pgbench_accounts.abalance FROM pgbench_accounts WHERE aid = p_aid;
END;
$$ language plpgsql;

You note that I didn’t even change the variable names and for this reason, I prefixed abalance in the update statement.

Here is my simple call for the transaction, which does all the DML and returns a cursor:

cat > /tmp/tpcb-like-f <<'CAT'
-- tpcb-like
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
BEGIN;
SELECT abalance from F_TPCB_LIKE(:aid, :bif, :tid, :delta);
END;
CAT

Running it with the same configuration:

pgbench --file /tmp/tpcb-like-f --transactions 30000 --protocol=prepared --jobs=10 --client=10 -h localhost -p 5432 -U postgres franck

the performance is even better:

The rate is 4167 transactions per second with an average of 2 milliseconds per transaction. Obviously this is better. The more you do in the database the better you can scale. But there’s even more: all the benefits from encapsulation. I’ll mention 3 main benefits:

  1. Here the API between the application and the database is independent of the SQL syntax. And the SQL statements that are tightly coupled with the data model (and the database system/version) are all encapsulated in the procedure, within this database.
  2. It is also a strong security advantage: no SQL injection possible as inputs go through the API which admits only a procedure name and typed parameters. And you can audit the procedure calls.
  3. When the junior architect tells you that your application is an ugly old monolith, add the following comment in front of each call to the stored procedures and you are trendy again:
--- calling the 'command' CQRS microservice
CALL my_app_procedure(...);
--- calling the 'query' CQRS microservice
SELECT my_app_function(...);

Video : Oracle REST Data Services (ORDS) : First Party (Basic) Authentication on Tomcat

In today’s video we demonstrate first party cookie-based authentication, or basic authentication, for Oracle REST Data Services when run on Tomcat.

For those of you that prefer to read, this is one of the authentication and authorization methods discussed in this article.

You can get more information about ORDS here.

The star of today’s video is Patrick Barel, who is somehow managing to remain out of focus when everything else in the video is in focus. I’m not sure how I managed that… </p />
</p></div>

    	  	<div class=

match_recognize()

A couple of days ago I posted a note with some code (from Stew Ashton) that derived the clustering_factor you would get for an index when you had set a value for the table_cached_blocks preference but had not yet created the index. In the note I said that I had produced a simple and elegant (though massively CPU-intensive) query using match_recognize() that seemed to produce the right answer, only to be informed by Stew that my interpretation of how Oracle calculated the clustering_factor was wrong and that the query was irrelevant.  (The fact that it happened to produce the right answer in all my tests was an accidental side effect of the way I had been generating test data. With Stew’s explanation of what Oracle was doing it was easy to construct a data set that proved my query was doing the wrong thing.)

The first comment I got on the posting was a request to publish my match_recognize() query – even though it was irrelevant to the problem in hand – simply because it might be a useful lesson in what could be done with the feature; so here’s some code that doesn’t do anything useful but does demonstrate a couple of points about match_recognize(). I have to stress, though, that I’m not an expert on match_recognize() so there may be a more efficient way of using it to acheieve the same result. There is certainly a simple and more efficient way to get the same result using some fairly straightforward PL/SQL code.

Requirement

I had assumed that if you set the table_cached_blocks preference to N then Oracle would keep track of the previous N rowids as it walked an index to gather stats and only increment the “clustering factor counter” if it failed to find a match for the block address of the current rowid in the block addresses extracted from the previous N rowids. My target was to emulate a way of doing this counting.

Strategy

Rather than writing code that “looked backwards” as it walked the index, I decided to take advantage of the symmetry of the situation and write code that looked forwards (you could think of this as viewing the index in descending order and looking backwards along the descending index). I also decided that I could count the number of times I did find a match in the trail of rowids, and subtract that from the total number of index entries.

So my target was to look for patterns where I start with the block address from the current rowid, then find the same block address after zero to N-1 failures to find the block address. Since the index doesn’t exist I’ll need to emulate its existence by selecting the columns that I want in the index along with the rowid, ordering the data in index order. Then I can walk this “in memory” index looking for the desired pattern.

Here’s some code to create a table to test against:


rem
rem     Script:         clustering_factor_est_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2019
rem     Purpose:        
rem
rem     Last tested 
rem             19.3.0.0
rem             12.2.0.1
rem 

create table t1
as
with generator as (
        select 
                rownum id
        from dual 
        connect by 
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                  id,
        cast(rownum as varchar2(10))            v1,
        trunc(dbms_random.value(0,10000))       rand,
        rpad('x',100,'x')                       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
/

My table has 1M rows, and there’s a column called rand which has 10,000 distinct values. This is generated through Oracle’s dbms_random package and the procedure I’ve used will give me roughly 100 occurrences for each value scattered uniformly across the table. An index on this column might be quite useful but it’s probably going to have a very high clustering_factor because, on average, any two rows of a particular value are likely to be separated by 10,000 rows, and even if I look for rows with a pair of consecutive values any two rows are likely to be separated by a few hundred other rows.

Here’s the code that I was using to get an estimate of (my erroneous concept of) the clustering_factor with table_cached_blocks set to 32. For debugging purposes it reports the first row of the pattern for each time my pattern is matched, so in this version of the code I’d  have to check the number of rows returned and subtract that from the number of rows in the table (where rand is not null).


select
        first_rand, first_file_id, first_block_id, first_row_id, step_size
from    (
                select
                        rand, 
                        dbms_rowid.rowid_relative_fno(rowid) file_id,
                        dbms_rowid.rowid_block_number(rowid) block_id,
                        dbms_rowid.rowid_row_number(rowid)   row_id
                from
                        t1
                where
                        rand is not null
        )
match_recognize(
        order by rand, file_id, block_id, row_id
        measures
                count(*) as step_size,
                first(strt.rand)        as first_rand,
                first(strt.file_id)     as first_file_id,
                first(strt.block_id)    as first_block_id,
                first(strt.row_id)      as first_row_id
        one row per match
        after match skip to next row
        pattern (strt miss{0, 31} hit)
        define 
                miss as (block_id != strt.block_id or  file_id != strt.file_id),
                hit  as (block_id  = strt.block_id and file_id  = strt.file_id)
)
order by 
        first_rand, first_file_id, first_block_id, first_row_id
;

The first point to note is that the inline view is the thing I use to model an index on column rand. It simply selects the value and rowid from each row. However I’ve used the dbms_rowid package to break the rowid down into the file_id, block_id and row_id within block. Technically (to deal with partitioned tables and indexes) I also ought to be thinking about the object_id which is the first component of the full rowid. I haven’t actually ordered the rows in the in-line view as I’m going to let the match_recognize() operation handle that.

A warning as we start looking at the match_recognize() section of the query: I’ll be jumping around the clause a little bit, rather than working through it in order from start to finish.

First I need the raw data sorted in order so that any results I get will be deterministic. I have an order by clause that sorts my data by rand and the three components I’ve extracted from the rowid. (It is also possible to have a partition clause – similar to the partition clause of an analytic function – but I don’t need one for this model.)

Then I need to define the “columns” that I’m planning to output in the final query. This is the set of measures and in the list of measures you can see a count(*) (which doesn’t quite mean what it usually does) and a number of calls to a function first() which takes an aliased column name as it’s input and, although the column names look familiar, the alias (strt) seems to have sprung from nowhere.

At this point I want to jump down to the define clause because that’s where the meaning of strt could have been defined.  The define clause is a list of “rules”, which are also treated as “variables”, that tell us how to classify a row. We are looking for patterns, and pattern is a set of rows that follows a set of rules in a particular order, so one of the things I need to do is create a rule that tells Oracle how to identify a row that qualifies as the starting point for a pattern – so I’ve defined a rule called strt to do this – except I haven’t defined it explicitly, it’s not visible in the define list, so Oracle has assumed that the rule is “1 = 1”, in other words every row I’m going to look at could be classified as a strt row.

Now that I have a definition for what strt means I could go back to the measures – but I’ll postpone doing that for a moment and look at the other rules in my define list. I have a rule called miss which says “if either of these comparisons evaluates to true” then the row is a miss;  but the predicates includes a reference to strt which means we are doing comparisons with the most recent row that was classified as a strt row. So a miss means we’ve found a starting row and we’re now looking at other rows comparing the block_id and file_id for each row to check that they don’t match the block_id and file_id of the starting row.

Similarly we have a hit rule which says a hit means we’ve previously found a starting row and we’re now looking at other rows checking for rows where the current block_id and file_id match the starting block_id and file_id.

Once we’ve got a set of rules explaining how to classify rows we can specify a pattern (which means going back up the match_recognize() clause one section). Our pattern reads:  “find a strt row, followed by zero and 31 miss rows, followed by a hit row”. And that’s just a description of my back-to-front way of saying “remember the last 32  rowids and check if the current block address matches the block address in one of those rowids”.

The last two clauses that need to be explained before I revisit the measures clause are the “one row per match” and “after match skip to next row”.

If I find a sequence of rows that matches my pattern there are several things I could do with that set of rows – for example I could report every row in that pattern along with the classification of strt/miss/hit (which would be useful if I’m looking for a few matches to a small pattern in a large data set), or (as I’ve done here) I could report just one row from the pattern and give Oracle some instruction about which one row I want reported.

Then, after I’ve found (and possibly reported) a pattern, what should I do next. Again there are several possibilities – the two most obvious ones, perhaps are: “just keep going” i.e. look at the row after the end of the pattern to see if it’s another strt row, and “go to the row after the start of the pattern you’ve just reported”. These translate into: “after match skip past last row” (which is the default if you don’t specify an “after match” clause) and “after match skip to next row” (which is what I’ve specified).

Finally we get back to the measures clause – I’ve defined four “columns” with names like ‘first_xxx’ and a step_size. The step_size is defined as count(*) which – in this context – means “count the number of rows in the current matched pattern”. The other measures are defined using the first() function, referencing strt alias which tells Oracle I want to retain the value from the first row that met the strt rule in the current matched pattern.

In summary, then my match_recognize() clause tells Oracle to

  • Sort the data by rand, file_id, block_id, row_id
  • For each row in turn
    • extract the file_id and block_id
    • take up to 32 steps down the list looking for a matching file_id and block_id
    • If you find a match pass a row to the parent operation that consists of: the number of rows between strt to hit inclusive, and the values of rand, file_id, block_id, and row_id of the strt row.

Before you try testing the code, you might like to know about some results.

As it stands my laptop with a virtual machine running 12.2.0.1 took 1 minute and 5 seconds to complete with “miss{0, 31}” in the pattern. In fact the first version of the code had the file_id and block_id tests in the define clause in the opposite order viz:

        define 
                miss as (file_id != strt.file_id or  block_id != strt.block_id),
                hit  as (file_id  = strt.file_id and block_id  = strt.block_id)

Since the file_id for my test is the same for every row in my table (it’s a single file tablespace), this was wasting a surprising amount of CPU, leading to a run time of 1 minute 17 seconds! Neither time looks particularly good when compared to the time required to create the index, set the table_cached_blocks to 32, and gather stats on the index – in a total time of less than 5 seconds. The larger the value I chose for the pattern match, the worse the workload became, until at “miss{0, 199}” – emulating a table_cached_blocks of 200 – the time to run was about 434 seconds of CPU!

A major portion of the problem is the way that Oracle is over-optimistic (or “greedy”, to use the technical term) with its pattern matching, combined with the nature of the data which (in this example) isn’t going to offer much opportunity for matching, combined with the fact that Oracle cannot determine that a row that is not a “miss” has to be a “hit”.  In this context “greedy” means Oracle will try to find as many consecutive occurrences of the first rule in the pattern before it tries to find and occurrence of the second rule – and when it fails to match a pattern it will “backtrack” one step and have another go, being slightly less greedy. So, for our example, the greedy algorithm will operate as follows:

  • find 31 rows that match miss, then discover the 32nd row does not match hit
  • go back to the strt and find 30 rows that match miss, then discover the 31st row does not match hit
  • go back to the strt and find 29 rows that match miss, then discover the 30th row does not match hit
  • … repeat until
  • go back to the strt and find 0 rows that match miss, then discover the 1st does not match hit
  • go to the next row, call it strt, and repeat the above … 1M times.

From a human perspective this is a pretty stupid strategy for this specific problem – but that’s because we happen to know that “hit” = “not(miss)” (ignoring nulls, of which there are none) while Oracle has to assume that there is no relationship between “hit” and “miss”.

There is hope, though, because you can tell Oracle that it should be “reluctant” rather than “greedy” which means Oracle will consume the smallest possible number of occurrences of the first rule before testing the second rule, and so on. All you have to do is append a “?” to the count qualifier:

        pattern (strt miss{0, 31 }? hit)

Unfortunately this seemed to have very little effect on execution time (and CPU usage) in our case. Again this may be due to the nature of the data etc., but it may also be a limitation in the way that the back tracking works. I had expected a response time that would more closely mirror the human expectation, but a few timed tests suggest the code uses exactly the same type of strategy for the reluctant strategy as it does for the greedy one, viz:

  • find 0 rows that match miss, then discover the next row does not match hit
  • go back to the strt and find 1 row that matches miss, then discover the 2nd row does not match hit
  • go back to the strt and find 2 rows that match miss, then discover the 3rd row does not match hit
  • … repeat until
  • go back to the strt and find 31 rows that match miss, then discover the 32nd does not match hit
  • go to the next row, call it strt, and repeat the above … 1M times.

Since there are relatively few cases in our data where a pattern match will occur both the reluctant and the greedy strategies will usually end up doing all 32 steps. I was hoping for a more human-like algorithm that would recognise that Oracle would recognise that if it’s just missed on the first X rows then it need only check the X+1th and not go back to the beginning (strt) – but my requirement makes it easy to see (from a human perspective) that that makes sense; in a generic case (with more complex patterns and without the benefit of having two mutially exclusive rules) the strategy of “start all over again” is probably a much safer option to code.

Plan B

Plan B was to send the code to Stew Ashton and ask if he had any thoughts on making it more efficient – I wasn’t expecting him to send a PL/SQL solution my return of post, but that’s what I got and published in the previous post.

Plan C

It occurred to me that I don’t really mind if the predicted clustering_factor is a little inaccurate, and that the “backtracking” introduced by the variable qualifer {0,31} was the source of a huge amount of the work done, so I took a different approach which simply said: “check that the next 32 (or preferred value of N) rows are all misses”.  This required two changes – eliminate one of the defines (the “hit”) and modify the pattern definition as follows:

         pattern (strt  miss{32} )
         define
                 miss as (block_id != strt.block_id or  file_id != strt.file_id)

The change in strategy means that the result is going to be (my version of) the clustering_factor rather than the number to subtract from num_rows to derive the clustering_factor. And I’ve introduced a small error which shows up towards the end of the data set – I’ve demanded that a pattern should include exactly 32 misses; but when you’re 32 rows from the end of the data set there aren’t enough rows left to match the pattern. So the result produced by the modified code could be as much as 32 short of the expected result.  However, when I showed the code to Stew Ashton he pointed out that I could include “alternatives” in the pattern, so all I had to do was add in to the pattern something which said “and if there aren’t 32 rows left, getting to the end of the data set is good enough” (technically that should be end of the current partition, but we have only one partition).

         pattern (strt  ( miss{32} | ( miss* $) ) )

The “miss” part of the pattern now reads:  “32 misses” or “zero or more misses and then the end of file/partition/dataset”.

It’s still not great – but the time to process the 1M rows with a table_cached_blocks of 32 came down to 31 seconds

Finally

I’ll close with one important thought. There’s a significant difference in the execution plans for the two strategies – which I’m showing as outputs from the SQL Monitor report using a version of the code that does a simple count(*) rather than listing any rows:


pattern (strt miss{0, 31} hit)
==========================================================================================================================================
| Id |         Operation          | Name  |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   |  Mem  | Activity | Activity Detail |
|    |                            |       | (Estim) |       | Active(s) | Active |       | (Actual) | (Max) |   (%)    |   (# samples)   |
==========================================================================================================================================
|  0 | SELECT STATEMENT           |       |         |       |        54 |    +14 |     1 |        1 |     . |          |                 |
|  1 |   SORT AGGREGATE           |       |       1 |       |        54 |    +14 |     1 |        1 |     . |          |                 |
|  2 |    VIEW                    |       |      1M | 12147 |        54 |    +14 |     1 |     2782 |     . |          |                 |
|  3 |     MATCH RECOGNIZE SORT   |       |      1M | 12147 |        60 |     +8 |     1 |     2782 |  34MB |          |                 |
|  4 |      VIEW                  |       |      1M |   325 |        14 |     +1 |     1 |       1M |     . |          |                 |
|  5 |       INDEX FAST FULL SCAN | T1_I1 |      1M |   325 |         7 |     +8 |     1 |       1M |     . |          |                 |
==========================================================================================================================================

pattern (strt  miss{32} )
==================================================================================================================================================================
| Id |                     Operation                      | Name  |  Rows   | Cost  |   Time    | Start  | Execs |   Rows   |  Mem  | Activity | Activity Detail |
|    |                                                    |       | (Estim) |       | Active(s) | Active |       | (Actual) | (Max) |   (%)    |   (# samples)   |
==================================================================================================================================================================
|  0 | SELECT STATEMENT                                   |       |         |       |        15 |    +16 |     1 |        1 |     . |          |                 |
|  1 |   SORT AGGREGATE                                   |       |       1 |       |        15 |    +16 |     1 |        1 |     . |          |                 |
|  2 |    VIEW                                            |       |      1M | 12147 |        15 |    +16 |     1 |     997K |     . |          |                 |
|  3 |     MATCH RECOGNIZE SORT DETERMINISTIC FINITE AUTO |       |      1M | 12147 |        29 |     +2 |     1 |     997K |  34MB |          |                 |
|  4 |      VIEW                                          |       |      1M |   325 |        16 |     +1 |     1 |       1M |     . |          |                 |
|  5 |       INDEX FAST FULL SCAN                         | T1_I1 |      1M |   325 |         9 |     +8 |     1 |       1M |     . |          |                 |
==================================================================================================================================================================

The significant line to note is operation 3 in both cases. The query with the pattern that’s going to induce back-tracking reports only “MATCH RECOGNIZE SORT”. The query with the “fixed” pattern reports “MATCH RECOGNIZE SORT DETERMINISTIC FINITE AUTO” Oracle can implement a finite state machine with a fixed worst case run-time. When you write some code that uses match_recognize() the three magic words you want to see in the plan are “deterministic finite auto” – if you don’t then, in principle, your query might be one of those that could (theoretically) run forever.

Addendum

Congratulations if you’ve got this far – but please remember that I’ve had very little practice using match_recognize; this was a little fun and learning experience for me and there may be many more things you could usefully know about the technology before you use it in production; there may also be details in what I’ve written which are best forgotten about. That being the case I’d be more than happy for anyone who wants to enhance or correct my code, descriptions and observations to comment below.

 

Oracle memory troubleshooting using analysis on heapdumps, part 2

In the article oracle memory troubleshooting using analysis on heap dumps I introduced heap_analyze.awk.

The reason the tool exists is because I am using it myself. Therefore, I ran into additional things that I wanted the tool to do. I added some stuff, which is that significant, that I decided to make another blogpost to introduce the new features.

1. Percentages
In order to get an idea of the relative size of the summarised topic, I added a percentage. For example:

top 5 allocation by total size per alloc reason per heap
==================================================================================================
heap             alloc_reason            #chunks       total_size   %
--------------------------------------------------------------------------------------------------
sga heap(1,0)    perm                         19        182947456  72
sga heap(1,0)                                 43         14488960   5
sga heap(1,0)    SO private sga               18         14284168   5
sga heap(1,0)    KGLHD                      5787          4318400   1
sga heap(1,0)    KSFD SGA I/O b                1          4190416   1

2. Enhanced perm (permanent) memory descriptions
It seems that for PGA heap dumps, sometimes there is a description for memory area’s that are perm (permanent memory, memory allocated for the lifetime of the process). This is how that’s visible in the dump:

PERMANENT CHUNKS:
  Chunk     7fcbcad6c020 sz=    20632    perm      "perm           "  alo=20632
            7fcbcad6c020 sz=    20632    cpmlst    "callback hsn   "

I must say I don’t know what ‘cpmlst’ means, so if anyone knows or has a good guess, please let me know. However, the two addresses and sizes are an exact match, so I now change the alloc_reason for the cpmlst text.
This is helpful because there is a quite some memory allocated as perm. Sadly, this is not done for SGA dumps.

3. Excel mode
If you want to store and compare different dumps, one way of doing that is pasting the output in Microsoft excel. Once you set the ‘Text To Columns’ option to space as a separator, it will put the information in its own cells. But there are a few problems with that, one of them is that the heap names and alloc_reasons can have spaces in them, so that the placement of the figures can vary. I created excel mode for that.
If you set excel mode (set the variable excel_mode to 1 on line 5):

#!/bin/awk -f
BEGIN {
  printf "Heap dump analyzer v0.2 by Frits Hoogland\n";
  group_sum_nr=5;
  excel_mode=1;
}

In this mode, the horizontal lines (with ‘-‘ and ‘=’) are omitted in the output, and all spaces are changed to underscores, so a table stays consistent when pasted in excel:

heap             alloc_reason            #chunks       total_size   %
sga_heap(1,0)    perm___________              19        182947456  72
sga_heap(1,0)    _______________              43         14488960   5
sga_heap(1,0)    SO_private_sga_              18         14284168   5
sga_heap(1,0)    KGLHD__________            5787          4318400   1
sga_heap(1,0)    KSFD_SGA_I/O_b_               1          4190416   1

4. New table which shows memory by type
Another way of looking at memory in a heap is by grouping it by type. This allows you to very quickly see if a certain type of chunk is dominating a heap:

heap                        type    #chunks         min_size         max_size       total_size   %
--------------------------------------------------------------------------------------------------
top call heap               free          7            16408            65512           390064  99
top call heap           recreate          2              992              992             1984   0
top call heap               perm          2              120              904             1024   0

This is an overview of the top call heap of a session that is not active, and therefore most of it should be empty, which is true for this dump.

Once again, get the awk script here: https://gitlab.com/FritsHoogland/oracle_memory_analyze/blob/master/heap_analyze.awk

The holistic SQL tuning series

I did a set of articles for Oracle Magazine on a more holistic view of SQL tuning. What do I mean by “holistic”? It was a reflection of a common problem that I see when questions come into AskTOM, or when people in the community approach me at conferences, namely, there is an inclination to dive straight into the deepest levels of the tuning exercise:

  • “What index should I create?”
  • “Should I increase the parallel degree?”

etc etc. And as technical practitioners, it is an easy trap to fall into. We too often fail to step back and approach the problem from its true requirement – that of, satisfying a business need. We might end up deep in the code, but we should probably not start there.

So I have consolidated the 4-part series here to provide my blueprint for tackling SQL tuning to give the most business success, which ultimately is the goal of the technology solutions we build.

Part 1 – The Business Requirement

https://blogs.oracle.com/oraclemagazine/a-higher-level-perspective-on-sql-tuning

Part 2 – Tuning the User Experience, finding problematic SQL

https://blogs.oracle.com/oraclemagazine/a-higher-level-perspective-on-sql-tuning-part-2

Part 3 – Tuning the SQL statement, the True Execution Plan

https://blogs.oracle.com/oraclemagazine/a-higher-level-perspective-on-sql-tuning-part-3

Part 4 – Common causes for poor performance

https://blogs.oracle.com/oraclemagazine/a-higher-level-perspective-on-sql-tuning-part-4

 

Enjoy!

Oracle memory troubleshooting using analysis on heapdumps

This blogpost is about analysing Oracle heap dumps. It is an extension to earlier work, Tanel Poder’s heap dump analyzer. So hat tip to Tanel, he’s done the hard work, I merely increased the presentation options. The heap analyser script that I wrote analyses Oracle heapdumps from the trace file that the dump was written to by the Oracle database. Because the heap dump representation is the same between PGA and SGA memory, it can work on both. The reason for this is that memory management is done by the same memory manager, and is commonly called ‘kgh’ (kernel generic heap) managed memory.

Please mind that for PGA analysis, not all memory is managed by the kgh memory manager. For example memory used for networking (sqlnet) is allocated totally outside of the kgh memory manager.

Let’s take the output of a PGA heap dump at level 29 (PGA, UGA, CGA, top call heaps, call heaps, session heap; executed via ‘alter session set events ‘immediate trace name heapdump level 29”):

$ ./heap_analyze.awk /u01/app/oracle/diag/rdbms/tt/tt/trace/tt_ora_6515.trc
Heap dump analyzer v0.1 by Frits Hoogland
heap size totals (all sizes in bytes)
==============================================================================================
heap name                       total size
----------------------------------------------------------------------------------------------
top call heap                       393096
callheap                             36200
pga heap                           1016816
top uga heap                        262048
session heap                        261744
------------------------------------------
total                              1969904

top 5 allocation by total size per alloc reason per heap
==============================================================================================
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
top call heap                                  5           354456
top call heap    callheap                     11            37616
top call heap    perm                          2             1024
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
callheap                                      10            35920
callheap         ctxtcds: kksQui               1              160
callheap         perm                          1              120
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
pga heap         perm                         44           290256
pga heap         Fixed UGA heap                1           226232
pga heap         diag pga                     21           214048
pga heap                                       7            62616
pga heap         kfkio bucket                  1            40984
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
top uga heap     session heap                  4           261928
top uga heap     perm                          1              120
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
session heap     perm                          3            63608
session heap                                   8            63528
session heap     koh-kghu sessi                5            28288
session heap     kxsFrame4kPage                6            24912
session heap     kxsFrame8kPage                3            24744

top 5 allocations by total size per alloc reason, type, chunk size per heap
==============================================================================================
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
top call heap                                  1       free           131048           131048
top call heap                                  2       free            65512           131024
top call heap                                  1       free            63496            63496
top call heap                                  1       free            28888            28888
top call heap    callheap                      6   freeable             4224            25344
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
callheap                                       6       free             4184            25104
callheap                                       2       free             4192             8384
callheap                                       1       free             1784             1784
callheap                                       1       free              648              648
callheap         ctxtcds: kksQui               1   freeable              160              160
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
pga heap         Fixed UGA heap                1   recreate           226232           226232
pga heap         perm                          1       perm            71720            71720
pga heap         diag pga                      1   freeable            55048            55048
pga heap         diag pga                      2   freeable            27456            54912
pga heap                                       1       free            47296            47296
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
session heap     perm                          1       perm            43584            43584
session heap                                   1       free            36544            36544
session heap     kxsFrame4kPage                6   freeable             4152            24912
session heap     kxsFrame8kPage                3   freeable             8248            24744
session heap                                   1       free            20352            20352
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
top uga heap     session heap                  3   freeable            65512           196536
top uga heap     session heap                  1   recreate            65392            65392
top uga heap     perm                          1       perm              120              120

top 5 allocations by total size per heap, alloc reason, type, chunk size
==============================================================================================
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
pga heap         Fixed UGA heap                1   recreate           226232           226232
top uga heap     session heap                  3   freeable            65512           196536
top call heap                                  1       free           131048           131048
top call heap                                  2       free            65512           131024
pga heap         perm                          1       perm            71720            71720

I figured that the first thing you want to see, are the heaps that are in the dump, and their sizes. That’s what is visible in rows 2-12.
I summed the heap sizes, which might make sense, or it might not.
In this case, with a heap dump that includes PGA, UGA and CGA plus session and call heaps, it means there are heaps in the dump that are part of another heap that is in the dump. So the total size here is bogus. This means that you need to have an understanding of what is actually dumped.

The next section, top 5 allocations by total size per alloc reason per dump, shows a per-heap summary by the allocation reason in the dump. If there’s no alloc reason, it’s free memory. Because this is a heap dump of a session that has done nothing (I just started sqlplus and ran the dump of its own PGA memory), you see that a lot of memory chunks are free memory. If you look closely to the allocation reasons, you can see that the ‘top call heap’ has a memory section that is called ‘callheap’ which is slightly larger than the ‘callheap’ section in the heap totals, and the ‘top uga heap’ section has a memory section called ‘session heap’ that is slightly larger than the ‘session heap’ section in the heap totals. In this case, it means that you can actually see the subheads in the parent heap allocation totals. The subheap size must be slightly larger in the parent heap because of headers in the memory allocations which are needed memory management. Please mind that this is based on my knowledge of how process memory is created, the only way to be absolutely sure is that a heap is part of another heap is to look at the memory addresses. This output only shows the heap names, not the addresses. The purpose of this section is to have an understanding of where memory is allocated to in a heap.

The following section, top 5 allocations by total size per alloc reason, type, chunk size per heap shows a per-heap summary by reason, type and chunksize, so you can investigate if specific types and sizes of chunks are causing issues or weird behaviour.

The last section is the same as the previous section, but doesn’t do it per heap. This is identical to what Tanel’s heapdump_analyzer shows.

You can find the heap_analyze.awk script here: https://gitlab.com/FritsHoogland/oracle_memory_analyze/blob/master/heap_analyze.awk

This is the output of a dump of the SGA of my small test database (oradebug dump heapdump 2):

$ ./heap_analyze.awk /u01/app/oracle/diag/rdbms/tt/tt/trace/tt_ora_9461.trc
Heap dump analyzer v0.1 by Frits Hoogland
heap size totals (all sizes in bytes)
==============================================================================================
heap name                       total size
----------------------------------------------------------------------------------------------
sga heap(1,0)                    285210904
sga heap(1,3)                     83885560
------------------------------------------
total                            369096464

top 5 allocation by total size per alloc reason per heap
==============================================================================================
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
sga heap(1,0)    perm                         22        188068904
sga heap(1,0)                                141         17975320
sga heap(1,0)    SO private sga               17         14268008
sga heap(1,0)    KQR PO                     8271          7568912
sga heap(1,0)    KGLHD                      9381          7023960
heap             alloc reason            #chunks       total size
----------------------------------------------------------------------------------------------
sga heap(1,3)                                 51         14120288
sga heap(1,3)    SQLA^6d9b8a7e               337          1380352
sga heap(1,3)    SQLA^31cc505b               167           684032
sga heap(1,3)    SQLA^aab93e92               162           663552
sga heap(1,3)    PLDIA^191e0a8d              155           634880

top 5 allocations by total size per alloc reason, type, chunk size per heap
==============================================================================================
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
sga heap(1,0)    perm                          1       perm         15937496         15937496
sga heap(1,0)    perm                          1       perm         15931312         15931312
sga heap(1,0)    perm                          1       perm         15811464         15811464
sga heap(1,0)    perm                          1       perm         15741952         15741952
sga heap(1,0)    perm                          1       perm         15723584         15723584
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
sga heap(1,3)                                  1       free          9883208          9883208
sga heap(1,3)                                  4     R-free           839480          3357920
sga heap(1,3)    SQLA^6d9b8a7e               336  freeableU             4096          1376256
sga heap(1,3)                                  1     R-free           839360           839360
sga heap(1,3)    SQLA^31cc505b               166  freeableU             4096           679936

top 5 allocations by total size per heap, alloc reason, type, chunk size
==============================================================================================
heap             alloc reason            #chunks       type       chunk size       total size
----------------------------------------------------------------------------------------------
sga heap(1,0)    perm                          1       perm         15937496         15937496
sga heap(1,0)    perm                          1       perm         15931312         15931312
sga heap(1,0)    perm                          1       perm         15811464         15811464
sga heap(1,0)    perm                          1       perm         15741952         15741952
sga heap(1,0)    perm                          1       perm         15723584         15723584

It’s interesting to see only subpool 1 sub-sub pool 1 and 3 are used. Subpool 1,1 contains a lot of permanent allocations and a lot of allocations that might have a more permanent nature, like KQR (dictionary cache) allocations, Subpool 1,3 seems to have allocations that are deemed more transient in nature, like SQLA (sql area) allocations. This might be wildly different in databases that are actually heavily used, this is an idle lab database.

Please mind it’s important to understand that dumping the shared pool requires obtaining the respective latches, so doing that on a live production system might lead to (severe) issues. Only do this if you know what you are doing. For the PGA there can only be one using process by definition, but be careful there too, if you request a PGA dump you are interacting with memory that is deemed private by the process that is using that.

If you require more rows to be shown than the 5 that are shown, set the ‘group_sum_nr’ variable to the amount you need on row 4 of the script.

Clustering_Factor

A few days ago I published a little note of a script I wrote some time ago to estimate the clustering_factor of an index before it had been built. At the time I pointed out that one of its limitations was that it would not handle cases where you were planning to set the table_cached_blocks preference, but a couple of days later I decided that I’d write another version of the code that would cater for the new feature – and that’s how I made an embarrassing discovery.

Having reviewed a number of notes I’ve published about the table_cached_blocks preference and its impact on the clustering_factor I’ve realised the what I’ve written has always been open to two interpretations – the one that I had in mind as I was writing, and the correct one.  I made this discovery because I had written a simple SQL statement – using the match_recognize() mechanism – to do what I considered to  be the appropriate calculation. After testing the query with a few sets of sample data that produced the correct results I emailed Stew Ashton (my “go-to” person for match_recognize() questions) asking if he would do a sanity check on the code because it was rather slow and I wondered if there was a better way of writing it.

His reply was roughly:

“I’ve read the notes you and Richard Foote have written about the clustering_factor and table_cached_blocks, and this isn’t doing what your description says it should.”

Then he explained what he had inferred from what I had written … and it made more sense than what I had been thinking when I wrote it. He also supplied some code to implement his interpretation – so I designed a couple of data models that would produce the wrong prediction for whichever piece of code implemented the wrong interpretation. His code gave the right answers, mine didn’t.

So here’s the difference in interpretation – the wrong one first – using 16 as a discussion value for the table_cached_blocks:

  • WRONG interpretation:  As you walk through index entries in order remember the last 16 rowids (that’s rowid for the rows in the table that the index is pointing to) you’ve seen. If the current rowid has a block id component that doesn’t match the block id from one of the remembered 16 rowids then increment the counter for the clustering_factor.
    • The simplicity of this algorithm means you can fix a “circular” array of 16 entries and keep walking around the circle overwriting the oldest entry each time you read a new one. It’s a pity that it’s the wrong idea because there’s a simple (though massively CPU -intensive match_recognize() strategy for implementing it – and if you were using an internal library mechanism during a proper gather_index_stats() it could be incredibly efficient.
  • RIGHT interpretation: set up an array for 16 block ids, each with an associated “row-number”. Walk through the index in order – giving each entry a row-number as you go. Extract the block id from the current entry and search through the array for a matching block id.  If you find a match then update its entry with the current row-number (so you can remembr how recently you saw the block id); if you don’t find a match then replace the entry that has the smallest (i.e. greatest distance into the past) row-number with the current block id and row-number and increment the counter for the clustering_factor.

The first piece of code that Stew Ashton sent me was an anonymous PL/SQL block that included some hard-coded fragments and embedded SQL to use a test table and index that I had defined, but he then sent a second piece of code that creates a generic function that uses dynamic SQL to construct a query against a table and an index definition that you want to test. The latter is the code I’ve published (with permission) below:


create or replace function predict_clustering_factor(
/*
Function to predict the clustering factor of an index,
taking into account the intended value of
the TABLE_CACHED_BLOCKS parameter of DBMS_STATS.SET_TABLE_PREFS.

Input is the table name, the list of column names
and the intended value of TABLE_CACHED_BLOCKS.

The function collects the last N block ids (not the last N entries).
When there is no more room, it increments the clustering factor
and replaces the least recently used block id with the current one.

Note: here a "block id" is a rowid with the row_number portion set to 0.
It is effectively a "truncated" rowid.
*/
  p_table_name in varchar2,
  p_column_list in varchar2,
  p_table_cached_blocks in number
) return number authid current_user is

  rc sys_refcursor;
  type tt_rids is table of rowid;
  lt_rids tt_rids;
  
  type t_block_list is record(
    rid rowid,
    last_hit number
  );

  type tt_block_list is table of t_block_list;
  lt_block_list tt_block_list := new tt_block_list();

  l_rid rowid;
  l_clustering_factor number := 0;
  b_block_found boolean;
  l_rn number := 0;
  l_oldest_hit number;
  i_oldest_hit binary_integer := 0;
  
  function truncated_rid(p_rid in rowid) return rowid is
    rowid_type number;
    object_number NUMBER;
    relative_fno NUMBER;
    block_number NUMBER;
    row_number NUMBER;
    rid rowid;

  begin

    DBMS_ROWID.ROWID_INFO (
      p_rid,
      rowid_type,
      object_number,
      relative_fno,
      block_number,
      row_number
    );

    rid := DBMS_ROWID.ROWID_CREATE (
      rowid_type,
      object_number,
      relative_fno,
      block_number,
      0
    );

    return rid;

  end truncated_rid;
  
begin
  if p_table_cached_blocks != trunc(p_table_cached_blocks)
  or p_table_cached_blocks not between 1 and 255 then
    raise_application_error(
      -20001, 
      'input parameter p_table_cached_blocks must be an integer between 1 and 255'
    );
  end if;

  open rc for 'select rowid from '||p_table_name||' order by '||p_column_list||', rowid';
  loop
    fetch rc bulk collect into lt_rids limit 1000;

    for irid in 1..lt_rids.count loop
      l_rn := l_rn + 1;
      l_rid := truncated_rid(lt_rids(irid));
      b_block_found := false;
      l_oldest_hit := l_rn;

      if l_rn = 1 then
        l_clustering_factor := l_clustering_factor + 1;
        lt_block_list.extend;
        lt_block_list(1).rid := l_rid;
        lt_block_list(1).last_hit := l_rn;

      else

        for i in 1..lt_block_list.count loop
          if l_oldest_hit > lt_block_list(i).last_hit then
            l_oldest_hit := lt_block_list(i).last_hit;
            i_oldest_hit := i;
          end if;
          if lt_block_list(i).rid = l_rid then
            b_block_found := true;
            lt_block_list(i).last_hit := l_rn;
            exit;
          end if;
        end loop;

        if not b_block_found then
          l_clustering_factor := l_clustering_factor + 1;
          if lt_block_list.count < p_table_cached_blocks then
            lt_block_list.extend;
            lt_block_list(lt_block_list.count).rid := l_rid;
            lt_block_list(lt_block_list.count).last_hit := l_rn; 
          else         
            lt_block_list(i_oldest_hit).rid := l_rid;
            lt_block_list(i_oldest_hit).last_hit := l_rn;
          end if;
        end if;

      end if;

    end loop;
    exit when rc%notfound;
  end loop;

  close rc;
  return l_clustering_factor;

exception when others then
  if rc%isopen then
    close rc;
  end if;
  raise;

end predict_clustering_factor;
/

After executing the above to create the function, here’s an example of usage:

rem
rem     Script:         clustering_factor_est_2.sql
rem     Author:         Jonathan Lewis
rem     Dated:          Oct 2019
rem
rem     Last tested
rem             19.3.0.0
rem             12.2.0.1
rem

create table t1
as
with generator as (
        select
                rownum id
        from dual
        connect by
                level <= 1e4 -- > comment to avoid WordPress format issue
)
select
        rownum                                  id,
        cast(rownum as varchar2(10))            v1,
        trunc(dbms_random.value(0,10000))       rand,
        rpad('x',100,'x')                       padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6 -- > comment to avoid WordPress format issue
/

-- -------------------------------------------------------------------

SQL> execute dbms_output.put_line('Predicted cf for t1(rand, id): ' || predict_clustering_factor('t1','rand, id',16))
Predicted cf for t1(rand, id): 997218
Elapsed: 00:00:07.54

SQL> execute dbms_output.put_line('Predicted cf for t1(rand, id): ' || predict_clustering_factor('t1','rand, id',255))
Predicted cf for t1(rand, id): 985607
Elapsed: 00:00:50.61

You’ll notice that the larger the setting for the “table_cached_blocks” parameter the more time it takes to predict the clustering_factor – and it was all CPU time in my example. This isn;t surprising given the need to search through an array holding the previous history. In this example the table t1 holds 1,000,000 rows, and the number and scatter of distinct values is so arranged that the code will hardly ever find a cached block id – essentially it’s the sort of index that isn’t going to cause much of confusion to the optimizer and isn’t likely to need special attention to make the optimizer use it when it should and ignore it when it’s inappropriate.

Finally a cut-n-paste to show the accuracy of the two predictions:

SQL> create index t1_i on t1(rand, id);
Elapsed: 00:00:02.96

SQL> execute dbms_stats.set_table_prefs(null,'t1','table_cached_blocks',16)
Elapsed: 00:00:00.01

SQL> execute dbms_stats.gather_index_stats(null,'t1_i')
Elapsed: 00:00:09.55

SQL> select clustering_factor from user_indexes where index_name = 'T1_I';

CLUSTERING_FACTOR
-----------------
           997218

Elapsed: 00:00:00.11

SQL> execute dbms_stats.set_table_prefs(null,'t1','table_cached_blocks',255)
Elapsed: 00:00:00.01

SQL> execute dbms_stats.gather_index_stats(null,'t1_i')
Elapsed: 00:00:07.80

SQL> select clustering_factor from user_indexes where index_name = 'T1_I';

CLUSTERING_FACTOR
-----------------
           985607

Elapsed: 00:00:00.00

Both match perfectly – but you might notice that creating the index and gathering the stats was much faster than predicting the clustering factor for the case where we set table_cached_blocks = 255.

(If you’re wondering, my “simple but irrelevant” match_recognize() query took 370 CPU second to complete for table_cached_blocks = 200 – and a limit on march_recognize() meant that 200 was the maximum value I was allowed to use – so now you know why I emailed Stew Ashton (and just for lagniappe. he also told me about a simple workaround for the 200 limit)).

 

 

October 2nd, #SQLSatOregon in Portland and the #SQLTrain!

I’m still not sure where October went….well, actually I do-  it was a flurry of travel, writing, (but not blogging!) and working with customers on a ton of new projects for the next quarter.  With the end of October, comes the preparation for PASS Summit.  Last year, I had to pass on the SQL Train, but this year?

SQL Train is the annual pilgrimage from SQL Saturday Oregon, in Portland, to PASS Summit in Seattle via Amtrak train.  Chris Hyde is the poor, tortured soul responsible for organizing this, but I hear that if you say you’ll volunteer, it’s like Hotel California-  you can check in, but you can never check out!

I will be speaking at SQL Saturday Oregon, talking automation and DevOps, before heading to Seattle on the train.  This event always has a huge turnout of fantastic speakers, getting an early prep on for Summit.  We commonly stay at hotels close to each other, if not at the same one and then head to Portland’s Union Station to take the train to Seattle on Sunday.  I believe we have two cars booked for the trip and two cars of hyped up on caffeine, techie geeks should be fun!

I have a few sessions at PASS Summit this year, both of the sessions will be part of the new Learning Pathways, (multiple sessions, leading to deeper learning vs. higher level, single sessions.)  I’ll blog more about that later this next week in preparation, but just be prepared-  any attendees should consider bringing their laptop with access to a Linux server to make the most of the session, (or a Linux docker container or Azure BASH cloud shell…some kind of terminal access to a Linux environment!)

Have a great week!

 



Tags:  ,


Del.icio.us



Facebook

TweetThis

Digg

StumbleUpon




Copyright ©  [October 2nd, #SQLSatOregon in Portland and the #SQLTrain!], All Right Reserved. 2019.

ANSI Plans

Here’s a thought that falls somewhere between philosophical and pragmatic. It came up while I was playing around with a problem from the Oracle database forum that was asking about options for rewriting a query with a certain type of predicate. This note isn’t really about that question but the OP supplied a convenient script to demonstrate their requirement and I’ve hi-jacked most of the code for my own purposes so that I can ask the question:

Should the presence of an intermediate view name generated by the optimizer in the course of cost-based query transformation cause two plans, which are otherwise identical and do exactly the same thing, to have different plan hash values ?

To demonstrate the issue let’s start with a simple script to create some data and generate an execution plan.


rem
rem     Script:         or_expand_plans.sql
rem     Author:         Jonathan Lewis
rem     Dated           Oct 2019
rem
rem     Last tested
rem             19.3.0.0
rem             12.2.0.1
rem
rem     Notes:
rem     Data creation copied from ODC
rem     https://community.oracle.com/thread/4297365
rem

create table t as   
select 1 as id, to_date('2019-10-11', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
union all  
select 2 as id, to_date('2019-10-12', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
union all  
select 3 as id, to_date('2019-10-15', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
union all  
select 4 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'Toronto' as place from dual  
;  

create table t_others as  
select 1 as id, to_date('2019-10-12', 'YYYY-MM-DD') as lastupdated, 'Blue'  as color, 'Zone 7' as zoneid from dual  
union all  
select 2 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'Red'   as color, 'Zone 7' as zoneid from dual  
union all  
select 3 as id, to_date('2019-10-16', 'YYYY-MM-DD') as lastupdated, 'White' as color, 'Zone 7' as zoneid from dual  
union all  
select 4 as id, to_date('2019-10-17', 'YYYY-MM-DD') as lastupdated, 'Green' as color, 'Zone 7' as zoneid from dual  
;  

create table t_further_info as  
select 1 as id, to_date('2019-10-12', 'YYYY-MM-DD') as lastupdated, 'A' as typeinfo from dual  
union all   
select 2 as id, to_date('2019-10-14', 'YYYY-MM-DD') as lastupdated, 'C' as typeinfo from dual  
union all  
select 3 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'D' as typeinfo from dual  
union all  
select 4 as id, to_date('2019-10-21', 'YYYY-MM-DD') as lastupdated, 'E' as typeinfo from dual  
;  

prompt  ====================
prompt  "Traditional" syntax
prompt  ====================

explain plan for
select 
        /*+ or_expand */ 
        * 
from
        t,
        t_others        pt,
        t_further_info  fi  
/*
where   (
             t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')   
         or pt.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
         or fi.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
        )  
*/
where   to_date('2019-10-21', 'YYYY-MM-DD') <= any(t.lastupdated, pt.lastupdated, fi.lastupdated)   
and     pt.id = t.id  
and     fi.id = t.id  
;

select * from table(dbms_xplan.display(null,null,'outline'));


You’ll see that I have a simple three-table join with the nasty little detail that I have a “non-join” predicates that may require Oracle to check across all three tables before it can decide whether or not a row should be discarded. I’ve shown two variants on a theme – they both have exactly the same effect but the ANY() presentation is just a little bit neater and more compact.

Essentially Oracle can use one of two strategies for this type of query/predicate; the first is to join all three tables and wait until the final join rowsource appears and then apply the check, or it can split the query into a union all of three separate queries where each query drives off a different table selecting only the rows from that table that match “its” part of the predicate.

In the latter case the second and third branches of the union all have to be modified to ensure that they discard any rows already returned by preceding parts of the union all; this can mean lots of new predicates appearing that use the lnnvl() function. (Of course there are variants between these two extremes, but the general principle is unchanged.)

The presence of the (incomplete) /*+ or_expand */ hint in my query is there to tell the optimizer that it should attempt to transform the query into the three-part union all. This, by the way, is a 12c feature, though older versions of Oracle could get similar effects in some cases from the /*+ use_concat */ hint. Here’s the plan, with outline,  I got from 12.2.0.1:


Plan hash value: 3181357500

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                    |     3 |   270 |    17   (6)| 00:00:01 |
|   1 |  VIEW                   | VW_ORE_67EF6547    |     3 |   270 |    17   (6)| 00:00:01 |
|   2 |   UNION-ALL             |                    |       |       |            |          |
|*  3 |    HASH JOIN            |                    |     1 |    55 |     6   (0)| 00:00:01 |
|   4 |     MERGE JOIN CARTESIAN|                    |     4 |   144 |     4   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL  | T_FURTHER_INFO     |     1 |    13 |     2   (0)| 00:00:01 |
|   6 |      BUFFER SORT        |                    |     3 |    69 |     2   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL | T_OTHERS           |     3 |    69 |     2   (0)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL   | T                  |     3 |    57 |     2   (0)| 00:00:01 |
|*  9 |    HASH JOIN            |                    |     2 |   182 |    11  (10)| 00:00:01 |
|  10 |     VIEW                | VW_JF_SET$FB5125FC |     2 |   156 |     9  (12)| 00:00:01 |
|  11 |      UNION-ALL          |                    |       |       |            |          |
|* 12 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 13 |        TABLE ACCESS FULL| T_OTHERS           |     1 |    23 |     2   (0)| 00:00:01 |
|* 14 |        TABLE ACCESS FULL| T                  |     3 |    57 |     2   (0)| 00:00:01 |
|* 15 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 16 |        TABLE ACCESS FULL| T                  |     1 |    19 |     2   (0)| 00:00:01 |
|  17 |        TABLE ACCESS FULL| T_OTHERS           |     4 |    92 |     2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS FULL   | T_FURTHER_INFO     |     4 |    52 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SEL$7C4216F7" "PT"@"SEL$1")
      LEADING(@"SEL$7C4216F7" "T"@"SEL$1" "PT"@"SEL$1")
      FULL(@"SEL$7C4216F7" "PT"@"SEL$1")
      FULL(@"SEL$7C4216F7" "T"@"SEL$1")
      USE_HASH(@"SEL$A4A33BE0" "T"@"SEL$1")
      LEADING(@"SEL$A4A33BE0" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SEL$A4A33BE0" "T"@"SEL$1")
      FULL(@"SEL$A4A33BE0" "PT"@"SEL$1")
      USE_HASH(@"SET$49E1C21B_3" "T"@"SEL$1")
      USE_MERGE_CARTESIAN(@"SET$49E1C21B_3" "PT"@"SEL$1")
      LEADING(@"SET$49E1C21B_3" "FI"@"SEL$1" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SET$49E1C21B_3" "T"@"SEL$1")
      FULL(@"SET$49E1C21B_3" "PT"@"SEL$1")
      FULL(@"SET$49E1C21B_3" "FI"@"SEL$1")
      USE_HASH(@"SEL$5FCD2D3C" "FI"@"SEL$1")
      LEADING(@"SEL$5FCD2D3C" "VW_JF_SET$FB5125FC"@"SEL$81DF0931" "FI"@"SEL$1")
      FULL(@"SEL$5FCD2D3C" "FI"@"SEL$1")
      NO_ACCESS(@"SEL$5FCD2D3C" "VW_JF_SET$FB5125FC"@"SEL$81DF0931")
      NO_ACCESS(@"SEL$67EF6547" "VW_ORE_67EF6547"@"SEL$67EF6547")
      OUTLINE(@"SET$49E1C21B_2")
      OUTLINE(@"SET$49E1C21B_1")
      OUTLINE(@"SEL$1")
      FACTORIZE_JOIN(@"SET$49E1C21B"("FI"@"SET$49E1C21B_2" "FI"@"SET$49E1C21B_1"))
      OUTLINE(@"SET$0E101D56")
      OUTLINE(@"SEL$81DF0931")
      OUTLINE(@"SEL$5AB42CD1")
      OR_EXPAND(@"SEL$1" (1) (2) (3))
      OUTLINE(@"SET$49E1C21B")
      OUTLINE_LEAF(@"SEL$67EF6547")
      FACTORIZE_JOIN(@"SET$49E1C21B"("FI"@"SET$49E1C21B_2" "FI"@"SET$49E1C21B_1"))
      OUTLINE_LEAF(@"SET$0E101D56")
      OUTLINE_LEAF(@"SEL$5FCD2D3C")
      OUTLINE_LEAF(@"SET$FB5125FC")
      OUTLINE_LEAF(@"SEL$A4A33BE0")
      OUTLINE_LEAF(@"SEL$7C4216F7")
      OUTLINE_LEAF(@"SET$49E1C21B_3")
      ALL_ROWS
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PT"."ID"="T"."ID" AND "FI"."ID"="T"."ID")
   5 - filter("FI"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   7 - filter(LNNVL("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   8 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   9 - access("FI"."ID"="ITEM_1")
  12 - access("PT"."ID"="T"."ID")
  13 - filter("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  14 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
  15 - access("PT"."ID"="T"."ID")
  16 - filter("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


This is a wonderful demonstration of how brilliant the optimizer can be.  The query has gone through several transformations and two of them have very high visibility. First, you can see the name VW_ORE_67EF6547 at operation 1. This is a view name that Oracle generates to express (cost-based) OR-Expansion” so the optimizer has clearly obeyed my hint. As a consequence of OR-expansion we can also see several examples of the lnnvl() function appearing in the Predicate Information section of the output.; we can also see the hint re-appearing in the completed form of OR_EXPAND(@”SEL$1″ (1) (2) (3)) in the Outline Data.

However, we don’t have the union all of three pieces that we might have expected; we have a union all of two pieces and the second piece is a hash join between the table t_further_info and a view called VW_JF_SET$FB5125FC. This view is the result of “join factorization”. The optimizer has taken the 2nd and 3rd sections of our union all view and decided that it would be cost-effective to “factor out” a common table, so this:

select from t_others, t, t_further_info ... where t_others.date_predicate ...
union all
select from t, t_others, t_further_info ... where t.date_predicate ...

changes to this:

select from 
        (
        select from t_others, t   ... where t_others.date_predicate ...
        union all
        select from t, t_others   ... where t.date_predicate ...
        ),
        t_further_info 
where   ...

Having said all that, I now have to change the code because the original query was written using “ANSI” style joins – like this:


explain plan for
select
        /*+ or_expand */
        * 
from
        t   
inner join 
        t_others       pt  
on      pt.id = t.id  
inner join 
        t_further_info fi  
on      fi.id = t.id  
where
        (
             t.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')   
         or pt.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
         or fi.lastupdated >= to_date('2019-10-21', 'YYYY-MM-DD')  
        )
;

select  * from table(dbms_xplan.display(null,null,'outline'));


In the ANSI example I happen to have used the explicit “OR” list for the date predicates but that’s not really signficant . Here’s the plan produced by this query – and the first thing I’d like you to note is the Plan hash value:

Plan hash value: 3309788271

----------------------------------------------------------------------------------------------
| Id  | Operation               | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                    |     3 |   270 |    17   (6)| 00:00:01 |
|   1 |  VIEW                   | VW_ORE_31069B60    |     3 |   270 |    17   (6)| 00:00:01 |
|   2 |   UNION-ALL             |                    |       |       |            |          |
|*  3 |    HASH JOIN            |                    |     1 |    55 |     6   (0)| 00:00:01 |
|   4 |     MERGE JOIN CARTESIAN|                    |     4 |   144 |     4   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL  | T_FURTHER_INFO     |     1 |    13 |     2   (0)| 00:00:01 |
|   6 |      BUFFER SORT        |                    |     3 |    69 |     2   (0)| 00:00:01 |
|*  7 |       TABLE ACCESS FULL | T_OTHERS           |     3 |    69 |     2   (0)| 00:00:01 |
|*  8 |     TABLE ACCESS FULL   | T                  |     3 |    57 |     2   (0)| 00:00:01 |
|*  9 |    HASH JOIN            |                    |     2 |   182 |    11  (10)| 00:00:01 |
|  10 |     VIEW                | VW_JF_SET$997549B1 |     2 |   156 |     9  (12)| 00:00:01 |
|  11 |      UNION-ALL          |                    |       |       |            |          |
|* 12 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 13 |        TABLE ACCESS FULL| T_OTHERS           |     1 |    23 |     2   (0)| 00:00:01 |
|* 14 |        TABLE ACCESS FULL| T                  |     3 |    57 |     2   (0)| 00:00:01 |
|* 15 |       HASH JOIN         |                    |     1 |    42 |     4   (0)| 00:00:01 |
|* 16 |        TABLE ACCESS FULL| T                  |     1 |    19 |     2   (0)| 00:00:01 |
|  17 |        TABLE ACCESS FULL| T_OTHERS           |     4 |    92 |     2   (0)| 00:00:01 |
|  18 |     TABLE ACCESS FULL   | T_FURTHER_INFO     |     4 |    52 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SEL$D12FC97A" "PT"@"SEL$1")
      LEADING(@"SEL$D12FC97A" "T"@"SEL$1" "PT"@"SEL$1")
      FULL(@"SEL$D12FC97A" "PT"@"SEL$1")
      FULL(@"SEL$D12FC97A" "T"@"SEL$1")
      USE_HASH(@"SEL$09C9729D" "T"@"SEL$1")
      LEADING(@"SEL$09C9729D" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SEL$09C9729D" "T"@"SEL$1")
      FULL(@"SEL$09C9729D" "PT"@"SEL$1")
      USE_HASH(@"SET$E8D85587_3" "T"@"SEL$1")
      USE_MERGE_CARTESIAN(@"SET$E8D85587_3" "PT"@"SEL$1")
      LEADING(@"SET$E8D85587_3" "FI"@"SEL$2" "PT"@"SEL$1" "T"@"SEL$1")
      FULL(@"SET$E8D85587_3" "T"@"SEL$1")
      FULL(@"SET$E8D85587_3" "PT"@"SEL$1")
      FULL(@"SET$E8D85587_3" "FI"@"SEL$2")
      USE_HASH(@"SEL$95B99BAF" "FI"@"SEL$2")
      LEADING(@"SEL$95B99BAF" "VW_JF_SET$997549B1"@"SEL$BB7F1ECF" "FI"@"SEL$2")
      FULL(@"SEL$95B99BAF" "FI"@"SEL$2")
      NO_ACCESS(@"SEL$95B99BAF" "VW_JF_SET$997549B1"@"SEL$BB7F1ECF")
      NO_ACCESS(@"SEL$31069B60" "VW_ORE_31069B60"@"SEL$31069B60")
      OUTLINE(@"SEL$1")
      OUTLINE(@"SEL$2")
      MERGE(@"SEL$1" >"SEL$2")
      OUTLINE(@"SEL$58A6D7F6")
      OUTLINE(@"SEL$3")
      OUTLINE(@"SET$E8D85587_2")
      OUTLINE(@"SET$E8D85587_1")
      MERGE(@"SEL$58A6D7F6" >"SEL$3")
      OUTLINE(@"SEL$9E43CB6E")
      FACTORIZE_JOIN(@"SET$E8D85587"("FI"@"SET$E8D85587_2" "FI"@"SET$E8D85587_1"))
      OUTLINE(@"SET$6117B24C")
      OUTLINE(@"SEL$BB7F1ECF")
      OUTLINE(@"SEL$344003E3")
      OR_EXPAND(@"SEL$9E43CB6E" (1) (2) (3))
      OUTLINE(@"SET$E8D85587")
      OUTLINE_LEAF(@"SEL$31069B60")
      FACTORIZE_JOIN(@"SET$E8D85587"("FI"@"SET$E8D85587_2" "FI"@"SET$E8D85587_1"))
      OUTLINE_LEAF(@"SET$6117B24C")
      OUTLINE_LEAF(@"SEL$95B99BAF")
      OUTLINE_LEAF(@"SET$997549B1")
      OUTLINE_LEAF(@"SEL$09C9729D")
      OUTLINE_LEAF(@"SEL$D12FC97A")
      OUTLINE_LEAF(@"SET$E8D85587_3")
      ALL_ROWS
      DB_VERSION('12.2.0.1')
      OPTIMIZER_FEATURES_ENABLE('12.2.0.1')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("FI"."ID"="T"."ID" AND "PT"."ID"="T"."ID")
   5 - filter("FI"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   7 - filter(LNNVL("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   8 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
   9 - access("FI"."ID"="ITEM_1")
  12 - access("PT"."ID"="T"."ID")
  13 - filter("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  14 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss')))
  15 - access("PT"."ID"="T"."ID")
  16 - filter("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))


Is the plan for the “ANSI” version of the query the same as the plan for the “traditional” version? How carefully have you checked – after simply noting that the two Plan hash values were different.

The plans are the same – in that the mechanics are exactly the same and that ought to be the criterion on which we should judge them. But the hash values are different because of the change from traditional to ANSI syntax. The traditional form of the query starts life with a single query block while the ANSI form is considered to be two query blocks, so the initial ANSI query goes through three stages:


1) select from t1 join t2 join t3

2) select /*+ qb_name(sel$2) */ from (select /* qb_name(sel$1) */ from t1, t2), t3

3) select /*+ qb_name(sel$9E43CB6E) */ from t1, t2, t3

So the query is rapidly transformed to the equivalent traditional syntax but we  now have a query block name of SEL$9E43CB6E instead of SEL$1 that the traditional query (in the absence of a /*+ qb_name() */ hint would have had. This is why you see the difference in the two or_expand() hints in the Outline Data section. One reads: OR_EXPAND(@SEL$1 (1) (2) (3)), the other reads OR_EXPAND(@”SEL$9E43CB6E” (1) (2) (3)), and all the subseqent query block name differences follow on from this initial transformation. (Note: the value “sel$9e43cb6e” is derived from the input query block names of sel$1 and sel$2 that the new query block is derived from)

You may decide that this mismatch isn’t really terribly important. If you’ve modified the code to switch to ANSI style joins then you may be prepared to put in a little extra effort to check the plan in detail to see that it hasn’t changed; but it would be a lot nicer if the hash value wasn’t dependent on generated view names. You may recall that at one time Oracle had problems with plans that used materialized CTEs (“with” subqueries) because the plan hash value was affected by object names like sys_temp_0fd9d6791_dfc12da. The same principle ought, I think, to apply here.

If you don’t mind the ANSI/tradiational switch though, you might find that you’re less happy when you upgrade to 19c, because the same effect appears there too, only it’s worse. Not only do “identical” traditional and ANSI plans have different hash values, they don’t match the values from 12c because the generated name for the join factorization views (VW_JF) change in the upgrade. So if you’re depending on SQL Plan Baselines to reproduce 12c plans on 19c when you upgrade you may find cases where you know the stored baseline is giving you the same plan but Oracle thinks it isn’t and refuses to use it.

tl;dr

Plans which are functionally identical can have different plan hash values because the plans were reached through a different series of tranformations. In particular if you rewrite queries from “traditional” Oracle syntax to “ANSI” syntax you will find cases where the plan doesn’t change but the plan hash value does thanks to a change in the names of views generated by some transformations.

More significantly, if you upgrade from 12c to 19c there are case where the names of views generated by transformations may change, which could cause the optimizer to discard some of your carefully constructed SQL Plan Baselines as “not reproducible”.

Footnote

For reference, if I add the hint /*+ opt_param(‘_optimizer_join_factorization’ ‘false’) */ to the “traditional query then I get the following plan which shows more clearly the three branches that the original query has been split into – each section starting with a different choice for the driving table:


-------------------------------------------------------------------------------------------
| Id  | Operation               | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |                 |     3 |   270 |    19   (6)| 00:00:01 |
|   1 |  VIEW                   | VW_ORE_67EF6547 |     3 |   270 |    19   (6)| 00:00:01 |
|   2 |   UNION-ALL             |                 |       |       |            |          |
|*  3 |    HASH JOIN            |                 |     1 |    55 |     7  (15)| 00:00:01 |
|*  4 |     HASH JOIN           |                 |     1 |    32 |     4   (0)| 00:00:01 |
|*  5 |      TABLE ACCESS FULL  | T               |     1 |    19 |     2   (0)| 00:00:01 |
|   6 |      TABLE ACCESS FULL  | T_FURTHER_INFO  |     4 |    52 |     2   (0)| 00:00:01 |
|   7 |     TABLE ACCESS FULL   | T_OTHERS        |     4 |    92 |     2   (0)| 00:00:01 |
|*  8 |    HASH JOIN            |                 |     1 |    55 |     6   (0)| 00:00:01 |
|   9 |     MERGE JOIN CARTESIAN|                 |     4 |   144 |     4   (0)| 00:00:01 |
|* 10 |      TABLE ACCESS FULL  | T_OTHERS        |     1 |    23 |     2   (0)| 00:00:01 |
|  11 |      BUFFER SORT        |                 |     4 |    52 |     2   (0)| 00:00:01 |
|  12 |       TABLE ACCESS FULL | T_FURTHER_INFO  |     4 |    52 |     2   (0)| 00:00:01 |
|* 13 |     TABLE ACCESS FULL   | T               |     3 |    57 |     2   (0)| 00:00:01 |
|* 14 |    HASH JOIN            |                 |     1 |    55 |     6   (0)| 00:00:01 |
|  15 |     MERGE JOIN CARTESIAN|                 |     4 |   144 |     4   (0)| 00:00:01 |
|* 16 |      TABLE ACCESS FULL  | T_FURTHER_INFO  |     1 |    13 |     2   (0)| 00:00:01 |
|  17 |      BUFFER SORT        |                 |     3 |    69 |     2   (0)| 00:00:01 |
|* 18 |       TABLE ACCESS FULL | T_OTHERS        |     3 |    69 |     2   (0)| 00:00:01 |
|* 19 |     TABLE ACCESS FULL   | T               |     3 |    57 |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Outline Data
-------------

  /*+
      BEGIN_OUTLINE_DATA
      USE_HASH(@"SET$BB614FD2_3" "T"@"SET$BB614FD2_3")
      USE_MERGE_CARTESIAN(@"SET$BB614FD2_3" "PT"@"SET$BB614FD2_3")
      LEADING(@"SET$BB614FD2_3" "FI"@"SET$BB614FD2_3" "PT"@"SET$BB614FD2_3"
              "T"@"SET$BB614FD2_3")
      FULL(@"SET$BB614FD2_3" "T"@"SET$BB614FD2_3")
      FULL(@"SET$BB614FD2_3" "PT"@"SET$BB614FD2_3")
      FULL(@"SET$BB614FD2_3" "FI"@"SET$BB614FD2_3")
      USE_HASH(@"SET$BB614FD2_2" "T"@"SET$BB614FD2_2")
      USE_MERGE_CARTESIAN(@"SET$BB614FD2_2" "FI"@"SET$BB614FD2_2")
      LEADING(@"SET$BB614FD2_2" "PT"@"SET$BB614FD2_2" "FI"@"SET$BB614FD2_2"
              "T"@"SET$BB614FD2_2")
      FULL(@"SET$BB614FD2_2" "T"@"SET$BB614FD2_2")
      FULL(@"SET$BB614FD2_2" "FI"@"SET$BB614FD2_2")
      FULL(@"SET$BB614FD2_2" "PT"@"SET$BB614FD2_2")
      USE_HASH(@"SET$BB614FD2_1" "PT"@"SET$BB614FD2_1")
      USE_HASH(@"SET$BB614FD2_1" "FI"@"SET$BB614FD2_1")
      LEADING(@"SET$BB614FD2_1" "T"@"SET$BB614FD2_1" "FI"@"SET$BB614FD2_1"
              "PT"@"SET$BB614FD2_1")
      FULL(@"SET$BB614FD2_1" "PT"@"SET$BB614FD2_1")
      FULL(@"SET$BB614FD2_1" "FI"@"SET$BB614FD2_1")
      FULL(@"SET$BB614FD2_1" "T"@"SET$BB614FD2_1")
      NO_ACCESS(@"SEL$49E1C21B" "VW_ORE_67EF6547"@"SEL$67EF6547")
      OUTLINE(@"SEL$1")
      OR_EXPAND(@"SEL$1" (1) (2) (3))
      OUTLINE_LEAF(@"SEL$49E1C21B")
      OUTLINE_LEAF(@"SET$BB614FD2")
      OUTLINE_LEAF(@"SET$BB614FD2_1")
      OUTLINE_LEAF(@"SET$BB614FD2_2")
      OUTLINE_LEAF(@"SET$BB614FD2_3")
      ALL_ROWS
      OPT_PARAM('_optimizer_join_factorization' 'false')
      DB_VERSION('19.1.0')
      OPTIMIZER_FEATURES_ENABLE('19.1.0')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PT"."ID"="T"."ID")
   4 - access("FI"."ID"="T"."ID")
   5 - filter("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
   8 - access("PT"."ID"="T"."ID" AND "FI"."ID"="T"."ID")
  10 - filter("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  13 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))
  14 - access("PT"."ID"="T"."ID" AND "FI"."ID"="T"."ID")
  16 - filter("FI"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss'))
  18 - filter(LNNVL("PT"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))
  19 - filter(LNNVL("T"."LASTUPDATED">=TO_DATE(' 2019-10-21 00:00:00',
              'syyyy-mm-dd hh24:mi:ss')))

88 rows selected.

Although the “traditional” and “ANSI” plans still show a difference in their Plan hash values when join factorization is blocked, the absence of the join factorization view means that the plan hash values are now consistent between 12c to 19c (the output above came from 19.3.0.0 as you can see in the Outline information).

Video : Oracle REST Data Services (ORDS) : OAuth Client Credentials Authorization

Today’s video is a zip through the OAuth Client Credentials Authorization flow in Oracle REST Data Services (ORDS).

For those of you that are afraid of videos, this is one of the authentication and authorization methods discussed in this article.

You can get more information about ORDS here.

The star of today’s video is Øyvind Isene, who is trying to act all cool about being in one of my videos, when in fact I’ve made all his dreams come true. </p />
</p></div>

    	  	<div class=