Oakies Blog Aggregator

temporary undo in 12c

This feature seems a no-brainer once you’re on 12c. After all, why would you want your global temporary tables to be hammering away at your redo logs?  With that in mind, my initial tinkering with the feature had me getting ready for a “blog rant” because it did not seem to work.  Let’s see how you might end up unimpressed.

Here’s the standard usage of undo (as per 11.2 and below).

 

SQL> create global temporary table T
  2  ( x char(1000));

Table created.

SQL> insert into T
  2  select 'a'
  3  from dual
  4  connect by level <= 50000;

50000 rows created.

SQL> set autotrace traceonly stat
SQL> delete from T;

50000 rows deleted.

Statistics
----------------------------------------------------------
         55  recursive calls
      71980  db block gets
       7288  consistent gets
          0  physical reads
   60562540  redo size
        859  bytes sent via SQL*Net to client
        822  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      50000  rows processed
      

So you can see that the delete (deletes probably being the most expensive of operations when it comes to undo, and hence redo for that undo) chewed up 60 megabytes of redo.
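As an aside, if you want to check the redo figure without autotrace, a before/after query against the session statistics is a minimal alternative (a sketch; it assumes the session can see the v$ views):

-- Sketch: run this before and after the DML and take the difference of "value".
-- v$mystat is the current session's slice of v$sesstat.
select sn.name, ms.value
from   v$mystat   ms,
       v$statname sn
where  sn.statistic# = ms.statistic#
and    sn.name = 'redo size';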

I then thought I’d try the new feature, and simply edited my script to set temp_undo_enabled before my delete statement:

 

SQL> create global temporary table T
  2  ( x char(1000));

Table created.

SQL> insert into T
  2  select 'a'
  3  from dual
  4  connect by level <= 50000;

50000 rows created.

SQL> set autotrace traceonly stat
SQL> alter session set temp_undo_enabled=true;

Session altered.

SQL> delete from T;

50000 rows deleted.


Statistics
----------------------------------------------------------
         65  recursive calls
      72100  db block gets
       7283  consistent gets
          0  physical reads
   60566000  redo size
        859  bytes sent via SQL*Net to client
        822  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      50000  rows processed

Well, that was an underwhelming result.  However, there is nothing wrong with the feature… just my usage of it.  This is actually documented in the reference manual.  I had just neglected to read it carefully enough. :)

“Once the value of the parameter is set, it cannot be changed for the lifetime of the session. If the session has temporary objects using temporary undo, the parameter cannot be disabled for the session. Similarly, if the session already has temporary objects using regular undo, setting this parameter will have no effect.”

So let’s try again, this time setting temp_undo_enabled as soon as we commence the session:

SQL> alter session set temp_undo_enabled=true;

Session altered.

SQL>
SQL>
SQL> create global temporary table T
  2  ( x char(1000));

Table created.

SQL>
SQL> insert into T
  2  select 'a'
  3  from dual
  4  connect by level <= 50000;

50000 rows created.

SQL>
SQL> set autotrace traceonly stat
SQL>
SQL> delete from T;

50000 rows deleted.


Statistics
----------------------------------------------------------
         64  recursive calls
      71923  db block gets
       7242  consistent gets
          0  physical reads
          0  redo size
        858  bytes sent via SQL*Net to client
        822  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
      50000  rows processed

So if you’re planning on using this feature,  consider setting it either system-wide, or with a logon trigger for appropriate sessions.
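For the logon-trigger route, a minimal sketch might look like the following – the trigger name and the user check are purely illustrative, not part of the feature itself:

-- Sketch only: enable temporary undo at logon for selected users, so the
-- parameter is already set before the session touches any global temporary tables.
create or replace trigger set_temp_undo_trg    -- hypothetical name
after logon on database
begin
  if sys_context('USERENV', 'SESSION_USER') in ('APP_USER') then  -- illustrative list
    execute immediate 'alter session set temp_undo_enabled = true';
  end if;
end;
/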

(And read the manuals more carefully than I did …. )

No Local Oracle User Group? Oh Well, Go to a Bar…

Is there no local Oracle user group in your area? Do you wish you could share experiences with like-minded people? Is there no opportunity to talk about the technology you work with? Do you feel you would benefit from expanding your network of friends and contacts? But without a local user group it’s really hard to do any of that! – At least face-to-face. And, let’s face it, meeting for real really does beat meeting on-line. I know, you are sad about it.

Well, go to a bar. Have a drink, it might make you feel better. Especially if you go with Dave in your team. Ask your friend across town who also works with Oracle Tech to come along. And maybe she could bring her friend who is an Oracle DBA too.

Well done! You now have an Oracle User Group!

It really is that simple to start a user group. You do not need an organisation, you do not need membership and you do not need presenters. You just need three of you and a place to meet. I might be saying a bar above (or, in England, it would be a local Pub, my good chap) but it can be a coffee house, a cafe, a wine bar, the local library maybe or anywhere you can meet easily and relax. Obviously increasing from 3 to 4 to 5 etc makes it all more interesting with more stories, tips and experiences to share.

I’m in a user group just like that, it’s called the LOB – London Oracle Beers. We started in around 2009, 2010. Initially it was myself, Neil Chandler and Doug Burns occasionally meeting for a pint or three (and later arguing about who started LOB). Soon Pete Scott was joining us, then Dawn, then Graham {what happened to Graham?} then Martin Bach… It got serious when I put together a mail list. We’ve been going ever since and although the regularity of the meetings fluctuates, as does the size of the group, it seems to keep going at between once a month to once every 3 months. Thinking about it, we are due a get-together.

How to Start a Small, Social User Group

There is one thing that IS needed for a user group like the above, and in fact for the others I am going to mention.

You need someone to regularly say “let’s meet”.

It does not need to be one person, it can be shared between several people. In the current LOB it is mostly myself that sends out a call but Neil does too. Anyone in the group can make the call and occasionally others do (Dawn, Pete) and some ask me to make the call on their behalf, which I do even if I can’t attend. But that’s really all you need, someone to make the call.

The other thing you need to do is, as a group, invite some others along. Not everyone you can think of, for a social user group let it grow at a steady, organic rate. People drop in and out of user groups so you need to constantly keep an eye on things and if the numbers drop, ask a few others. People’s lives and circumstances alter so they can’t come or they just decide they’ve had enough and that’s fine. For this sort of social-centric user group I would suggest you stick to inviting friends and friendly contacts and try not to let it get too large (A nice problem for a user group to have!)

So Just Do it! If you do not have a local user group and you want one, be the person to ask a couple of friends and if there are 3 or more of you, make that call. And a month or two later, make the call again. The worst that will happen is that it won’t take off and, if it does not, you know you tried (and not many people will know you failed :-) ). I’d honestly be surprised if it does not at least take off for a while.

The Presenting User Group

Another sort of user group is where you start off by wanting it to be a bit more structured, to have presentations involved. This does take more organisation: a location where you can present (it does not have to be the same place each time), someone to present, and it helps if you have a sponsor, as having somewhere to speak may well involve renting a room and it’s nice if you can offer some drinks and snacks. You don’t need a lot of sponsorship (if any). Ask some local Oracle-centric firms; the worst they will say is “no” and the best they will say is “sure, here is enough money for some pizza and Rohan in the Dev team is happy to talk about Blargh at the first meeting”. But work out what you need (say rental on the room and enough for a couple of samosas for everyone) and only ask for what you need. Your sponsor may well want to put up a banner or have someone say something, but that is part of the deal.

I’m involved in two such user groups in the UK at the moment:

Oracle Midlands run by Mike Mckay-Dirden. They are about to have their 8th meeting (Follow the link <- over there), on the 19th May in their usual location in Aston, Birmingham. I managed to get to most of their first meetings, spoke at one and sadly missed the last couple due to timing clashes. Might be true for this next one too :-(. Mike does brilliantly on this, he got a local firm (or part of the firm) Red Stack Technology to sponsor him and he has the gall to ask Oracle Ace Directors and other known people to speak :-)

Club Oracle London was started by Jonathan Lewis (I think prompted by the LOB and also what Mike was doing – but don’t blame Jonathan if I have my facts wrong) and is sponsored and run by e-DBA but with a very light touch. This will be their 4th or 5th meeting. I’m speaking at this one and I’ve been to all of them. Again, follow the link for more details and to register for the event on Thursday.

If you visit my blog often or follow me on Twitter you will have seen me promote these events. I’m very keen to support smaller, local user groups.

Again, it needs someone to Make The Call and also get at least the first speaker(s), but you can share the load for that. The other difference is that you probably want to spread the call a little wider. Tweet about it, use Facebook and all those other social media things.  Tell all the people you know who might be interested and ask them to spread the word as you want a reasonable crowd for the speaker.

There is more to organising these more formal user groups but nothing that one determined person or a small group of fairly determined people cannot make happen.

Larger User Groups

The next step up is the large user groups where you have membership and paid-for events, like national, regional (or state in the US) user groups. You need a run-up to create one of those! However, they are still user groups; they are all part of the “environment” of the total user community.

These user groups can still be created by a small number of people but doing so is a bigger task and I suggest you contact other people who are involved in such things and really plan what you want to achieve – it’s a topic beyond a single blog post like this. But it can be done and it can grow out of the two sorts of user group above.

I would like to highlight that starting your own local, small user group should be no barrier to being part of the large user groups. I attend, promote and present at the small user groups. Heck, you could say I run one with the LOB (along with other people). However, I am an active member of the UKOUG, deputy-chairing a Special Interest Group that meets 3 times a year and I’m involved in organising the content of UKOUG Tech15 (See the banner -> over there -> – at the time of writing the Call for Papers is open ). We can all live together.

 

Getting Speakers

If you want a speaker at your event (which you can have at the social sort of user group, but you need to make sure the speaker is not expecting a projector and can’t use PowerPoint!) then you can ask someone in your group to do it, you can do it yourself (we all have knowledge and experience to contribute) or you could try to get a speaker from outside your group.

If you are trying to start up the more formal Presenting User Group then a known name will help draw people to your event. But there is the danger that not enough people will turn up for the speaker! You will worry about that. Well, don’t. Just be honest with the speaker about the numbers you expect and be realistic. In many ways I personally like smaller crowds and I know other speakers who do. I’d rather have 5 enthusiastic people than 50 indifferent ones.

Obviously, the more geographically local the speaker is the more likely they will say yes  and asking the stellar stars is likely to get a “no” as they are simply too busy – but if they are Local-local, they may say yes! Remember, potential speakers have to earn a living so are not available at the drop of a hat and some only do conferences. Again, the worst you will get is a “no”.

I’ll make an offer – If you decide to start such a group in the UK and you need a presenter, let me know. I can’t promise but I’ll try to oblige. If you are further afield I’m afraid it is less likely I can help as I have to pay my own travel and expenses. But you never know, the odd jaunt over to Europe does happen.

Also, try looking up local Oracle ACEs and OakTable members. Again, they might say no, they might say yes, but Oracle ACEs and OakTable members are generally inclined to help – it’s a large part of why we have those labels.

 

As the annoying advert says “Just Do It”

So in summary: If you want a user group and there is not one, maybe you can start one. If you want it to be a little more formal with presentations, look for a sponsor and ask some local Oracle Names if they would present.

Good Luck!

Enabling Edition Based Redefinition On A Schema With Object-Relational Tables

This is just a heads-up for those thinking about using Edition Based Redefinition (EBR) and enabling it on an existing schema with objects. Although EBR isn't exactly a new feature, its current adoption level is probably not that high (which will probably change in future, as Oracle E-Business Suite now uses EBR as part of its default upgrade procedure, as far as I understood).

I was recently contacted by someone who enabled EBR on an existing schema using ALTER USER ... ENABLE EDITIONS and had to use the "FORCE" option since there were (according to the official ALTER USER documentation) "objects that are not editionable and that depend on editionable type objects in the schema. ... In this case, all the objects that are not editionable and that depend on the editionable type objects in the schema being editions-enabled become invalid".

Although one could say it is clearly mentioned in the documentation, the consequences are probably not that obvious to everyone if those non-editionable objects are tables having columns based on user-defined types. So I state it here to make it hopefully clear enough:

If you use the FORCE option of ALTER USER ... ENABLE EDITIONS to enable editions on an existing schema already containing objects, and among those objects are tables having columns based on user-defined types, then effectively those tables will become invalid and stay invalid. There is no officially documented way to reverse this step or compile tables to become valid again (there's no ALTER TABLE COMPILE or similar). The table cannot be accessed any longer and all data contained is hence officially gone, too.

This means the affected tables need to be restored from a (hopefully existing and usable) backup (I don't know if there's a dictionary hack available that is officially sanctioned by Oracle support to make the table valid again).

In my opinion the FORCE option should check if the schema contains such dependent tables and in such a case error out with a message that the table needs to be dropped first (or moved to a different schema) before ENABLE EDITIONS can succeed. This would make the situation much clearer, rather than leaving tables in INVALID state behind that cannot be fixed/recompiled afterwards.

Below is a simple test case that demonstrates the issue:


-- Create a test user for enabling editions
drop user ebr_test cascade;

create user ebr_test identified by ebr_test;

grant dba to ebr_test;

-- Connect as test user
connect ebr_test/ebr_test

-- and create an object-relational type
create type test_coll_type as table of number;

-- and a table having such a column of that type
create table test_ebr_table (col1 number, col2 number, test_coll test_coll_type) nested table test_coll store as test_coll_table;

-- Some test data
insert into test_ebr_table (col1, col2, test_coll) values (1, 1, test_coll_type(1, 2 ,3));

commit;

-- Everything is fine so far
select * from test_ebr_table, table(test_coll);

-- Enable editions, FORCE is required
alter user ebr_test enable editions force;

-- This no longer works (ORA-04063: table "EBR_TEST.TEST_EBR_TABLE" has errors)
select * from test_ebr_table, table(test_coll);

-- Not even simple scalar values can be accessed from the table
select col1 from test_ebr_table;

-- The table has gone INVALID
select status from user_objects where object_name = 'TEST_EBR_TABLE';
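If you want to check a schema for this risk before reaching for the FORCE option, a simple dictionary query along these lines (a sketch, run as the schema owner) lists the tables with columns based on user-defined types – the ones that would be left permanently INVALID:

-- Sketch: tables in the current schema with columns whose data type is owned
-- by a schema (i.e. user-defined/object types rather than built-in types).
-- Note: this deliberately broad filter also picks up SYS-owned types such as XMLTYPE.
select table_name, column_name, data_type, data_type_owner
from   user_tab_columns
where  data_type_owner is not null
order  by table_name, column_name;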

Not Exists

Another question on a seemingly simple “not exists” query has appeared on OTN just a few days after my last post about the construct. There are two little differences between the actual form of the two queries that make it worth repeating the analysis.

The first query was of the form:


select from big_table
where  not exists (select exact_matching_row from small_table);

while the new query is of the form:


select from big_table alias1
where not exists (select inexact_matching_row from big_table alias2)

In the absence of a complete explanation, we might guess that the intention of the first query is to: “check correctness of undeclared foreign key constraint” i.e. small_table is the parent table with unique values, and big_table is the child end with some data that may be invalid due to a recent bulk data load. (In my example for the previous posting I relaxed the uniqueness assumption in small_table to make the problem a little more expensive.)

Our guess for the second query ought to be different; we are using the same table twice, and we are checking for the non-existence of “imperfect matches”. This introduces two potential threats – first that the (possibly pseudo-)join between the two tables is between two large tables and therefore inherently likely to be expensive; second that we may be allowed to have very large numbers of “perfect matches” that will escalate the scale of the join quite dramatically. Here’s the actual query (second version) from the posting; this will make it easier to explain why the structure of the query introduces the second threat and requires us (as so often) to understand the data in order to optimise the query execution:


select  count(*)
from    ubl_stg.wk_sap_fat w1
where   not exists (
                select  1
                from    ubl_stg.wk_sap_fat w2
                where   w2.mes_mese_id =  w1.mes_mese_id
                and     w2.sistema     <> w1.sistema
        )
;

Note the schema name ubl_stg – doesn’t that hint at “staging tables” for a data load?
Note the column name mes_mese_id – an “id” column, but clearly not one that’s supposed to be unique, so possibly a foreign key column that allows repetitions.

The code is looking for, then counting, cases where for a given value of mes_mese_id there is only one corresponding value used for sistema. Since we don’t know the application we don’t know whether this count should be low or high relative to the number of rows in the table, nor do we know how many distinct values there might be of mes_mese_id – and these pieces of information are critical to identifying the best execution plan for the query.
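If we did have access to the data, a couple of quick queries (a sketch using the table and column names from the posting) would give us exactly those numbers – how many distinct mes_mese_id values there are, and how heavily the most popular (mes_mese_id, sistema) combinations repeat:

-- Sketch: overall volume and number of distinct key values.
select count(*)                     as total_rows,
       count(distinct mes_mese_id)  as distinct_ids
from   ubl_stg.wk_sap_fat;

-- Sketch: the ten most repeated (mes_mese_id, sistema) combinations.
select *
from  (
        select   mes_mese_id, sistema, count(*) as cnt
        from     ubl_stg.wk_sap_fat
        group by mes_mese_id, sistema
        order by cnt desc
      )
where rownum <= 10;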

The owner of this query told us that the table held 2 million rows which are “deleted and reloaded” every day, showed us that the (default) execution plan was a hash anti-join which took two hours to complete, told us that he (or she) didn’t think that an index would help because of the “not equal” predicate, and finally reported back that the problem was solved (no timing information supplied) by the addition of the /*+ no_unnest */ hint.

The question is – what has happened and why is there such a difference in performance?

The first observation is that the 2 hours does seem an unreasonably long time for the query to run – and since it’s only a select statement it would be good to run it again and take a snapshot of the session statistics and session events (v$sesstat, v$session_event) for the session to see what the workload was and where the time was spent. Given the “deleted and reloaded” comment it’s possible that there may be some unexpected overhead due to some strange effects of read-consistency or delayed block cleanout, so we might also take a snapshot of tablespace or file I/O to check for lots of I/O on the undo tablespace (which might also show an I/O problem on the table’s tablespace or the user’s TEMP tablespace). The snapshots give us a very cheap, non-invasive option for getting some summary stats – but the tablespace/file stats are system-wide, of course, so may not tell us anything about our specific task, so we might even enable extended tracing (event 10046 level 8 / dbms_monitor with waits) to see in detail where we are losing time.
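For reference, the session-level snapshots need nothing more sophisticated than something like this (a sketch; run it from the session executing the query, before and after, and compare the two sets of figures):

-- Sketch: what this session has done so far (v$mystat is the current
-- session's view of v$sesstat) ...
select sn.name, ms.value
from   v$mystat ms, v$statname sn
where  sn.statistic# = ms.statistic#
and    ms.value != 0
order  by sn.name;

-- ... and where it has waited.
select event, total_waits, time_waited_micro
from   v$session_event
where  sid = to_number(sys_context('USERENV', 'SID'))
order  by time_waited_micro desc;

-- If more detail is needed, extended tracing (10046 with waits) for the session:
execute dbms_monitor.session_trace_enable(waits => true, binds => false)
-- ... run the problem query, then:
execute dbms_monitor.session_trace_disable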

Since we don’t currently have the information we need to explain the two hours (it may have appeared by the time I post this note) it might be instructive to make some guesses about where the time could go in the hash anti-join, and why the /*+ no_unnest */ hint could make a difference.

Ignoring the possibility of strange undo/read-consistency/cleanout effects, the first possibility is simply that the hash join is large and turns into a multi-pass I/O thrash. The mes_mese_id column looks like it might be a number (id columns so often are) but the sistema column has the flavour of a reasonably large character column – so maybe our hash table has to be a couple of hundred megabytes – that could certainly be enough to spill to disc, though you’d have to have a really small PGA availability for it to turn into a multi-pass hash join.

Another possibility is that the pattern in the data makes the hash join burn up a huge amount of CPU – that should be easy to see on a re-run.  If there are relatively few distinct sets of values for (mes_mese_id, sistema) and there are very few cases where a mes_mese_id is associated with more than one sistema, then a large fraction of the hash table probes would have to follow a very long chain of matches to the very end, and that would take a large amount of CPU.

Pursue that “long chain” hypothesis to a slight extreme – what if there’s one mes_mese_id that appears in 250,000 of the 2M rows, and the sistema value is the same for every one of those quarter million rows, which would require Oracle to walk a chain of 250,000 elements 250,000 times, for a total of 62.5 billion pointers to follow and comparisons to make – how much CPU might that take?  Worse still, since having the same mes_mese_id (the hashing is only on the equality predicate) means the quarter of a million rows would have to go into the same hash bucket, we might end up doing a multipass operation because that one bucket was very much larger than anything Oracle had anticipated when it did its internal partitioning calculations for the hash table. (There are some related notes in this article and in chapter 12 of Cost Based Oracle – Fundamentals.)

Why would the /*+ no_unnest */ hint address these performance problems? My first guess would be that the OP may have created the index on (mes_mese_id, sistema), in which case a full scan of that index (or even a fast-full scan if the index were newly created, or even a tablescan if the data had been loaded in the right order) followed by the filter subquery being driven by an index range scan would result in a relatively efficient subquery being executed once per distinct value of (mes_mese_id, sistema) rather than once per row. This could be much more efficient than a really badly skewed data set doing the hash anti-join. In fact, even in the absence of the index, if the number of distinct combinations was really quite small, that many tablescans – if the table were cached, whether in Oracle or the filesystem or the SAN cache – might still be a lot faster than the hash anti-join. (We were told that the problem was solved – but not told how much time constituted a viable solution.)

Models

Hand-waving and talk is fine – but a few modelled results might make it easier to comprehend, so here’s a way of generating a few versions of data sets and testing:


drop table t1 purge;

create table t1
as
with generator as (
        select  --+ materialize
                rownum id
        from dual
        connect by
                level <= 1e4
)
select
        rownum                                          id,
/*
        --
        --      Random generation nearly guaranteeing no random
        --      duplicates of (x,y) but quite a lot of mismatches
        --      Count 735,715 in 3.36 seconds.
        --
        trunc(dbms_random.value(0,2e6))                 x,
        lpad(trunc(dbms_random.value(0,1e6)),64,'0')    y
*/
/*
        --
        --      One specific pair repeated many times with no mismatch the rest
        --      as previously generated above. Times with different repeat counts:
        --       10,000            14 seconds
        --       20,000            52 seconds
        --       40,000           202 seconds
        --      CPU time quadruples as the count doubles (twice the number of
        --      probes walking twice the number of steps in the hash chain)
        --       80,000 =>        800 seconds
        --      160,000 =>      3,200 seconds
        --      240,000 =>      ca. 2 hours.
*/
        case
                when rownum <= 40000
                        then 2e6
                        else trunc(dbms_random.value(0,2e6))
        end                                             x,
        case
                when rownum <= 40000
                        then
                                lpad(0,64,'0')
                        else
                                lpad(
                                        trunc((rownum - 1)/1000) +
                                                case mod(rownum-1,1000) when 0 then 0 else 0 end,
                                        64,'0'
                                )
        end                                             y
/*
        --
        --      2,000 distinct values repeated 1,000 times each
        --      Query result: 2,000,000 in 235 seconds.
        --
        trunc((rownum - 1)/1000)                x,
        lpad(
                trunc((rownum - 1)/1000) +
                        case mod(rownum-1,1000) when 0 then 0 else 0 end,
                64,'0'
        )                                       y
*/
from
        generator       v1,
        generator       v2
where
        rownum <= 2e6
;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          =>'T1',
                method_opt       => 'for all columns size 1'
        );
end;
/

select
        count(*)
from    t1 w1
where   not exists (
                select  1
                from    t1 w2
                where   w2.x = w1.x
                and     w2.y <> w1.y
);  

As you can see I’ve changed the table and column names – this keeps them in line with the original SQL statement presented on OTN before we got the graphic display of a similar statement and plan. The SQL to create the data includes three variants and 5 sets of results (and 3 conjectures) running 11.2.0.4. These results appeared when the optimizer took the hash anti-join execution plan, and also spilled to disc with a one-pass workarea operation that first extended to about 200MB of PGA then dropped back to about 8MB.

To summarise the results recorded in the SQL – if we use the term “bad data” to describe rows where more than one Y value appears for a given X value then:

  1. With a large number of distinct pairs and a lot of bad data: the anti-join is pretty fast at 3.36 seconds.
  2. With no bad data, a small number of distinct pairs (2,000) and lots of rows per pair (1,000): the anti-join takes 235 CPU seconds
  3. As for #1 above, but with one extreme “good” pair that appears a large number of times: CPU time is proportional to the square of the number of duplicates of this value

I didn’t actually test beyond 40,000 duplicates for the last case, but you can see the double/quadruple pattern very clearly and the CPU time would have hit 2 hours at around 240,000 identical copies of one (x,y) pair.

/*+ no_unnest */

So what happens if you try using the /*+ no_unnest */ hint? The target here is that the more repetitive the data the smaller the number of times you may have to run the subquery; and if you can get the driving data ordered you can guarantee the smallest possible number of runs of the subquery. I haven’t worked through all the possibilities, but to give you the flavour of what can happen, when I added the /*+ no_unnest */ hint to the query (and ensured that the table would be cached rather than read using direct path reads into the PGA) the execution time for the test with 1,000 copies of 2,000 pairs took 181 seconds to do 2,001 tablescans (compared with 235 seconds to do 2 tablescans and a hash anti-join) with the following execution path:


----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     1 |    69 |  8256K  (4)| 11:28:04 |
|   1 |  SORT AGGREGATE     |      |     1 |    69 |            |          |
|*  2 |   FILTER            |      |       |       |            |          |
|   3 |    TABLE ACCESS FULL| T1   |  2000K|   131M|  2877   (4)| 00:00:15 |
|*  4 |    TABLE ACCESS FULL| T1   |     2 |   138 |     4   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( NOT EXISTS (SELECT /*+ NO_UNNEST NO_INDEX ("W2") */ 0
              FROM "T1" "W2" WHERE "W2"."X"=:B1 AND "W2"."Y"<>:B2))
   4 - filter("W2"."X"=:B1 AND "W2"."Y"<>:B2)
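For completeness, the hinted form of the test query is simply the original with the hint inside the subquery – a sketch (the predicate section above also shows a NO_INDEX("W2") hint, which I’ve left out here):

select
        count(*)
from    t1 w1
where   not exists (
                select  /*+ no_unnest */
                        1
                from    t1 w2
                where   w2.x = w1.x
                and     w2.y <> w1.y
);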

More significantly, when I created the optimum index the execution time dropped to 0.9 seconds – here’s the create index statement and subsequent plan – the extreme benefit appears because the data was effectively loaded in sorted order; if this had not been the case I would have forced an index full scan for the driving data set (with a “not null” predicate or constraint to make it possible for the optimizer to use the index to drive the query)


create index t1_i1 on t1(x,y) compress pctfree 0;

-----------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       |     1 |    69 |  6008K  (1)| 08:20:45 |
|   1 |  SORT AGGREGATE     |       |     1 |    69 |            |          |
|*  2 |   FILTER            |       |       |       |            |          |
|   3 |    TABLE ACCESS FULL| T1    |  2000K|   131M|  2877   (4)| 00:00:15 |
|*  4 |    INDEX RANGE SCAN | T1_I1 |     2 |   138 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter( NOT EXISTS (SELECT /*+ NO_UNNEST */ 0 FROM "T1" "W2"
              WHERE "W2"."X"=:B1 AND "W2"."Y"<>:B2))
   4 - access("W2"."X"=:B1)
       filter("W2"."Y"<>:B1)

Similarly, when I had the index in place and the 40,000 repetitions of a “good pair”, Oracle took a total of 9 seconds even though it had to run the subquery 1,960,000 times for the non-repetitive data (and once for the repetitive pair). I have to say I was a little surprised at how rapidly it managed to get through those 2M subquery executions – but then I keep forgetting how ridiculously overpowered my new laptop is.

With these figures in mind you can appreciate that if the OP had lots of pairs with tens of thousands of repetitions, then even without creating the index on the table, the query time might drop from 2 hours for the hash anti-join to “a few minutes” for the filter subquery with a time that was good enough to look like “problem solved”.

Summary

If you have a few “long hash chains” in the build table – i.e. rows from the “first” table in the join that hash to the same value – then the amount of work Oracle has to do to check a single row from the probe (“second”) table that matches on the hash value can become significant. If a large number of rows from the probe table hit a long hash chain then the CPU time for the whole join can climb dramatically and you may want to force Oracle away from the hash join.

If the long chains are the result of a skewed distribution where a small number of values appear very frequently in a table with a large number of distinct values that each appear infrequently, then the optimizer may not notice the threat and may choose the hash plan when there is a much less resource-intensive alternative available.

Footnote:

There was another interesting observation I made while doing the experiments relating to whether the chains are due to hash collisions or require exact matches in the data – but I’ve spent an hour on designing and running tests, and nearly 4 hours writing up the results so far. I need to do a few more tests to work out whether I’m seeing a very clever optimisation or a lucky coincidence in a certain scenario – so I’m going to save that for another day.

 

 

Advanced Oracle Troubleshooting Guide – Part 12: control file reads causing enq: SQ – contention waits?

Vishal Desai systematically troubleshot an interesting case where the initial symptoms of the problem showed a spike of enq: SQ – contention waits, but he dug deeper – and found the root cause to be quite different. He followed the blockers of waiting sessions manually to reach the root cause – and also used my @ash/ash_wait_chains.sql and @ash/event_hist.sql scripts to extract the same information more conveniently (note that he had modified the scripts to take AWR snap_ids as time range parameters instead of the usual date/timestamp).

Definitely worth a read if you’re into troubleshooting non-trivial performance problems :)

Database landscape 2014 visualization

I saw this database landscape 2014 overview from “451 Research” with a very nice visualization…

Little things worth knowing: Is there a penalty in establishing a connection to Oracle using the MAA connection string? Part 2

In the very lengthy previous post about the MAA connect string I wanted to explain the use of the MAA connection string as promoted by Oracle. I deliberately kept the first part simple: both primary and standby cluster were up, and although the database was operating in the primary role on what I called standby cluster (again it’s probably not a good idea to include the intended role in the infrastructure names) there was no penalty establishing a connection.

As pointed out by readers of the blog entry that is of course only one part of the story (you may have guessed by the TNS alias MAA_TEST1 … there are more to come). When you define the connection string this way, you cater for the situation where something goes wrong. Let’s try a few more scenarios. Here is the current Data Guard situation:

DGMGRL> show configuration

Configuration - martin

  Protection Mode: MaxPerformance
  Members:
  CDB   - Primary database
    Error: ORA-16810: multiple errors or warnings detected for the database

    STDBY - Physical standby database 
      Error: ORA-12543: TNS:destination host unreachable

Fast-Start Failover: DISABLED

Configuration Status:
ERROR   (status updated 71 seconds ago)
DGMGRL> show database 'CDB'

Database - CDB

  Role:               PRIMARY
  Intended State:     TRANSPORT-ON
  Instance(s):
    CDB1
      Error: ORA-16737: the redo transport service for standby database "STDBY" has an error

    CDB2
      Error: ORA-16737: the redo transport service for standby database "STDBY" has an error

Database Status:
ERROR

As you can see CDB is operating in primary role, and redo transport is broken. This simple reason for broken redo transport is the fact that the standby cluster in its entirety is down. For the sake of completeness I have repeated the MAA connection string here:

MAA_TEST1 =
  (DESCRIPTION_LIST=
     (LOAD_BALANCE=off)(FAILOVER=on)
       (DESCRIPTION=
         (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST=rac12pri-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
       (DESCRIPTION= (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST= rac12sby-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
  )

Scenario 1: shut down of the standby cluster, database in primary role on primary cluster

This does not seem to be much of a problem. Using the same test as in part 1 I get no noteworthy penalty at all:

[oracle@lab tns]$ time sqlplus system/secret@maa_test1 < 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
CDB2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.246s
user	0m0.020s
sys	0m0.025s
[oracle@lab tns]$ time sqlplus system/secret@maa_test1 < 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
CDB1

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.171s
user	0m0.018s
sys	0m0.018s
[oracle@lab tns]$ 

Trying from a 12.1.0.1.0 client on a different machine:

[oracle@oracledev ~]$ time sqlplus system/secret@maa_test1 <<EOF
> select sys_context('userenv','instance_name') from dual;
> exit
> EOF

SQL*Plus: Release 12.1.0.1.0 Production on Thu Apr 23 10:13:50 2015

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Last Successful login time: Thu Apr 23 2015 10:12:11 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
CDB1

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.427s
user	0m0.012s
sys	0m0.027s
[oracle@oracledev ~]$ time sqlplus system/secret@maa_test1 < 
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
CDB2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m0.158s
user	0m0.012s
sys	0m0.016s

That seems ok, but can be easily explained by the (LOAD_BALANCE=off)(FAILOVER=on) in the DESCRIPTION_LIST. But what if we do it the other way around? Let’s start by reversing the roles:

DGMGRL for Linux: Version 12.1.0.2.0 - 64bit Production

Copyright (c) 2000, 2013, Oracle. All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected as SYSDBA.
DGMGRL> switchover to 'STDBY'
Performing switchover NOW, please wait...
New primary database "STDBY" is opening...
Oracle Clusterware is restarting database "CDB" ...
Switchover succeeded, new primary is "STDBY"

DGMGRL> show configuration

Configuration - martin

  Protection Mode: MaxPerformance
  Members:
  STDBY - Primary database
    CDB   - Physical standby database 

Fast-Start Failover: DISABLED

Configuration Status:
SUCCESS   (status updated 48 seconds ago)


Scenario 2: black-out of the primary cluster, database in primary role on standby cluster

The next is the worst case: a data centre has gone down, and none of the hosts in the other cluster (rac12pri) is reachable. To simulate this I simply shut down rac12pri1 and rac12pri2 – that’s as dead as it gets. The SCAN will still resolve in DNS, but there is neither a SCAN VIP nor a SCAN listener anywhere to answer on the primary cluster.

As you saw in the output leading to this paragraph the primary database is started on rac12sby, nodes rac12sby1 and rac12sby2. In the current format, there is – for the first time – a connection penalty:

[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <<EOF
> select sys_context('userenv','instance_name') from dual;
> exit;
> EOF

SQL*Plus: Release 12.1.0.2.0 Production on Tue Apr 21 07:17:36 2015

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Last Successful login time: Mon Apr 20 2015 10:39:29 +01:00

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL>
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m35.770s
user    0m0.024s
sys     0m0.043s

[oracle@rac12sby1 ~]$ time sqlplus system/secret@maa_test1 <
SYS_CONTEXT('USERENV','INSTANCE_NAME')
--------------------------------------------------------------------------------
STDBY2

SQL> Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real    0m36.219s
user    0m0.021s
sys     0m0.022s

During testing with 12c these connection attempts ranged between 30 and 38 seconds. Here is proof that I truly cannot connect to the primary cluster:

[oracle@lab tns]$ ping rac12pri-scan
PING rac12pri-scan.example.com (192.168.100.111) 56(84) bytes of data.
From lab.example.com (192.168.100.1) icmp_seq=1 Destination Host Unreachable
From lab.example.com (192.168.100.1) icmp_seq=2 Destination Host Unreachable
From lab.example.com (192.168.100.1) icmp_seq=3 Destination Host Unreachable
From lab.example.com (192.168.100.1) icmp_seq=4 Destination Host Unreachable
^C
--- rac12pri-scan.example.com ping statistics ---
5 packets transmitted, 0 received, +4 errors, 100% packet loss, time 4001ms

[oracle@rac12sby1 ~]$ time telnet rac12pri-scan 1521
Trying 192.168.100.112...
telnet: connect to address 192.168.100.112: No route to host
Trying 192.168.100.113...
telnet: connect to address 192.168.100.113: No route to host
Trying 192.168.100.111...
telnet: connect to address 192.168.100.111: No route to host

real	0m9.022s
user	0m0.001s
sys	0m0.003s

I found it interesting that the telnet command took almost no time at all to complete, yet trying to connect using SQL*Net seems to take forever. Why could that be? Trace!

Getting to the bottom of this

Tracing can help sometimes, so I tried that. I try and keep my working TNS configuration separate from the production entries in $ORACLE_HOME. Using ~/tns I defined sqlnet.ora to read:

  • diag_adr_enabled = off
  • trace_level_client = support
  • trace_directory_client = /home/oracle/tns
  • log_directory_client = /home/oracle/tns
  • trace_unique_client = true

And connected again. I noticed time increases whenever nsc2addr was invoked:

[oracle@lab tns]$ grep "nsc2addr.*DESC" cli_6630.trc | nl
     1  (1947452928) [23-APR-2015 11:07:45:376] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     2  (1947452928) [23-APR-2015 11:07:48:383] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     3  (1947452928) [23-APR-2015 11:07:51:390] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     4  (1947452928) [23-APR-2015 11:07:54:396] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     5  (1947452928) [23-APR-2015 11:07:57:402] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     6  (1947452928) [23-APR-2015 11:08:00:408] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     7  (1947452928) [23-APR-2015 11:08:03:414] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     8  (1947452928) [23-APR-2015 11:08:06:421] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     9  (1947452928) [23-APR-2015 11:08:09:427] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
    10  (1947452928) [23-APR-2015 11:08:12:433] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
    11  (1947452928) [23-APR-2015 11:08:15:439] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
    12  (1947452928) [23-APR-2015 11:08:18:445] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
    13  (1947452928) [23-APR-2015 11:08:21:452] nsc2addr: (DESCRIPTION=(CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.120)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))

Sorry for the lengthy listing but that way it shows best! The primary SCAN is set to 111, 112 and 113, with the standby SCAN on 119, 120 and 121.

What you can see clearly by looking at the timestamps is that the transport_connect_timeout works perfectly well – a new address is tried every 3 seconds. What surprises me is that each SCAN VIP is tried 4 times before the second SCAN is tried (lines 1-12). I thought I had that limited to 3 retries … Line 13 is the first reference to the second SCAN, and as soon as that is referenced a connection is established.

That can’t be it, can it?

Hmm maybe Oracle went a bit too far by adding 3 retries to every single SCAN VIP. The whole process can be sped up by trying every SCAN VIP only once before failing over to the second SCAN. This can be done with the following connection string:

MAA_TEST2 =
  (DESCRIPTION_LIST=
     (LOAD_BALANCE=off)(FAILOVER=on)
     (CONNECT_TIMEOUT=5)(TRANSPORT_CONNECT_TIMEOUT=3)(RETRY_COUNT=3)
       (DESCRIPTION=
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST=rac12pri-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
       (DESCRIPTION= 
           (ADDRESS_LIST= (LOAD_BALANCE=on) (ADDRESS=(PROTOCOL=TCP)(HOST= rac12sby-scan)(PORT=1521)))
           (CONNECT_DATA=(SERVICE_NAME=rootsrv)))
  )

Trying this works a lot better from a timing perspective:

[oracle@lab tns]$ time sqlplus system/secret@maa_test2< Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

real	0m9.829s
user	0m0.041s
sys	0m0.022s
[oracle@lab tns]$ 

Visible in the SQL*Net trace:

[oracle@lab tns]$ grep "nsc2addr.*DESC" cli_7305.trc | nl 
     1	(3474875904) [23-APR-2015 11:35:17:935] nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.113)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     2	(3474875904) [23-APR-2015 11:35:20:940] nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.112)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     3	(3474875904) [23-APR-2015 11:35:23:948] nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.111)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))
     4	(3474875904) [23-APR-2015 11:35:26:956] nsc2addr: (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=192.168.100.119)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=rootsrv)(CID=(PROGRAM=sqlplus)(HOST=lab.example.com)(USER=oracle))))

That was the solution to the “slow” connectivity that the person who asked me the question in Paris shared, but I hadn’t been able to find proof as to why it works better until now: this is it. The question remains: why is the MAA string in the Application Continuity white paper more complex? I think what’s failing here is the CONNECT_TIMEOUT. According to the tnsnames.ora reference in 12c, the purpose of the parameter is “To specify the timeout duration in seconds for a client to establish an Oracle Net connection to an Oracle database”. You could read this to mean that after CONNECT_TIMEOUT is reached the failover to the next SCAN occurs. However it appears – by looking at the trace – that the connection timeout behaves much like the transport timeout and the lower of the two is used.

There is a high probability that the syntax is in the document for a reason, although it escapes me right now. I guess the penalty when creating new connections with connection pools doesn’t really matter that much, because you create the pool (where you “pay” the penalty) only once, and subsequently pull connections from there, which should be very fast.

Singapore Maths Question Solution and Very Interesting Observation (The Trickster)

OK, time to reveal the solution to the somewhat tricky Singapore maths exam question I blogged previously. Remember, there were 10 dates: May 13, May 15, May 19, June 13, June 14, July 16, July 18, August 14, August 15 and August 16. Bowie only knew the month of my birthday, Ziggy only knew the day. Bowie […]

Golden Oldies

I’ve just been motivated to resurrect a couple of articles I wrote for DBAZine about 12 years ago on the topic of bitmap indexes. All three links point to Word 97 documents which I posted on my old website in September 2003. Despite their age they’re still surprisingly good.

Update: 26th April 2015

Prompted by my reply to comment #2 below, I looked at what I said about bitmap indexes in Practical Oracle 8i (published 15 years ago), and found this gem:

An interesting feature of bitmap indexes is that it is rather hard to predict how large the index segment will be. The size of a B-tree index is based very closely on the number of rows and the typical size of the entries in the index column. The size of a bitmap index is dictated by a fairly small number of bit-strings which may have been compressed to some degree depending upon the number of consecutive 1’s and 0’s.

To pick an extreme example, imagine a table of one million rows that has one column that may contain one of eight values ‘A’ to ‘H’ say, which has been generated in one of the two following extreme patterns:

  • All the rows for a given value appear together, so scanning down the table we get 125,000 rows with ‘A’ followed by 125,000 rows of ‘B’ and so on.
  • The rows cycle through the values in turn, so scanning down the table we get ‘A’,’B’. . . ‘H’ repeated 125,000 times.

What will the bitmap indexes look like in the two cases?

For the first example, the basic map for the ‘A’ value will be 125,000 one-bits, followed by 875,000 zero bits – which will be trimmed off. Splitting the 125,000 bits into bytes and adding the necessary overhead of about 12% we get an entry of the ‘A’ rows of 18K. A similar argument applies for each of the values ‘B’ to ‘H’, so we get a total index size of around 8 x 18K – giving 156K.

For the second example, the basic map for the ‘A’ value will be a one followed by 7 zeros, repeated 125,000 times. There is no chance of compression here, so the ‘A’ entry will start at 125,000 bytes. Adding the overhead this goes up to 140K, and repeating the argument for the values ‘B’ to ‘H’ we get a total index of 1.12 MB.

This wild variation in size looks like a threat, but to put this into perspective, a standard B-tree index on this column would run to about 12 Mb irrespective of the pattern of the data. It would probably take about ten times as long to build as well.

As we can see, the size of a bitmap index can be affected dramatically by the packing of the column it depends upon as well as the number of different possible values the column can hold and the number of rows in the table. The compression that is applied before the index is stored, and the amazing variation in the resulting index does mean that the number of different values allowed in the column can be much larger than you might first expect. In fact it is often better to think of bitmap indexes in terms of how many occurrences of each value there are, rather than in terms of how many different values exist. Viewing the issue from this direction, a bitmap is often better than a B-tree when each value occurs more than a few hundred times in the table (but see the note below following the description of bitmap index entries).

 

Kickstarting After a Failed Addition to the AWR Warehouse

A common issue I’ve noted is dump files generated for the AWR Warehouse that, on a failure to transfer, simply sit there: they never upload, and the data is stuck in a “limbo” state between the source database (target) and the AWR Warehouse.  This can be a very difficult issue to troubleshoot, as no errors are seen in the actual AWR Warehouse “View Errors” and no data from the source is present in the AWR Warehouse.


Empowered by EM Jobs

If you go to Enterprise –> Jobs –> Activity and inspect the Log Report after a search for the %CAW% jobs that perform the extraction, transfer and load, you will then be able to view the error for any job that experienced a problem and can inspect the details of the issue.


 

If you double click on the job in question, as in the example above, you’ll notice that the jobs have the naming convention of CAW_RUN_ETL_NOW.  This is due to a forced run via the AWR Warehouse console (Actions –> Upload Snapshots Now).  If the job was a standard ETL run, the naming convention will be CAW_TRANSFER.

The job is a multi-step process, so you will see where the extraction step failed.  This is a bit misleading, as the previous failure that set this into motion was a preferred credential issue that stopped the transfer step to the AWR Warehouse.  If you look far enough back in the jobs, you’ll find the original error, but now we are stuck in a loop – the source can’t go forward once the credentials are fixed, and yet it’s difficult for someone unfamiliar with the process to know where it all went wrong.  The first recommendation is often to remove the database and re-add it, but in this scenario, we are going to kickstart the process now that the credentials have been fixed.

Digging into a Job Error


 

 

As we stated earlier, you’ll see in the steps that the extract failed, but the transfer would have been successful if a file had been created.  Double click on “Failed” to see the error that occurred.


 

In the SQL output, the error states, ORA-20137:  NO NEW SNAPSHOTS TO EXTRACT

Now we know that no files have been uploaded to the AWR Warehouse, yet the logic in the AWR Warehouse package in the DBSNMP schema in the source database thinks it has already pulled all of these snapshots into an extract and created a dumpfile.  This error is very clear in telling the user what is going on: there is data telling the ETL process that the data has ALREADY been extracted.

Disclaimer:  The solution we are about to undertake is for a BRAND NEW ADDITION TO THE AWR WAREHOUSE ONLY.  You wouldn’t want to perform this on a source database that had been loading properly and then stopped after successful uploads to the warehouse (if you have one of those, I would want to proceed differently, so please keep this in mind before you attempt this in your own environment…).  This fix is also dependent upon all preferred credential issues being resolved BEFOREHAND.

The Kickstart

To “kickstart” the process after a failure, first, verify that there are no errors that aren’t displaying in the console:

select * from dbsnmp.caw_extract_metadata;

Next, gather the location for the dumpfiles:

select * from dbsnmp.caw_extract_properties;

There will be one line in this table – it will include the oldest snap ID, the newest ID and the location of the dumpfile (often the agent home unless otherwise configured).  This is the table the logic in the package uses to verify what has already been extracted.  We now need to remove this tracking information and the pre-existing dumpfiles created in the previous failed processes (a sketch of the SQL side of these steps follows the list below):

  1. Make a copy of this table, (create table dbsnmp.caw_extract_prop2 as select * from dbsnmp.caw_extract_properties;)
  2. Truncate dbsnmp.caw_extract_properties table.
  3. Delete the extract, (dumpfiles) from the directory shown in the caw_extract_properties table.  Don’t remove anything else from that directory!
  4. Log into the AWR Warehouse console.
  5. Click on the source database you just scoured of the ETL extract files.
  6. Highlight the database, click on Actions, Click on “Upload Snapshots Now.”
  7. View the job via the link displayed at the top of the console and monitor to completion.
  8. Once the job has succeeded completely, remove the dbsnmp.caw_extract_prop2 table and the backup files you moved that were created earlier from failed extracts.
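Put together, the dictionary side of steps 1 and 2 is no more than the following (a sketch; the backup table name is simply the one used in step 1, and step 3 is done at the OS level in the directory reported by caw_extract_properties):

-- Sketch of steps 1 and 2 from the list above.
-- Step 1: keep a copy of the tracking data, just in case.
create table dbsnmp.caw_extract_prop2 as
select * from dbsnmp.caw_extract_properties;

-- Step 2: clear the tracking information so the next ETL run starts fresh.
truncate table dbsnmp.caw_extract_properties;

-- Step 3 is an OS-level task: remove the old dump files from the directory
-- recorded in caw_extract_properties, and nothing else from that directory.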

You should now see successful upload jobs from this point on, along with data in your AWR Warehouse.




Copyright © DBA Kevlar [Kickstarting After a Failed Addition to the AWR Warehouse], All Right Reserved. 2015.