June 2011

InSync11 : 16-17 August 2011…

A few weeks ago one of the AUSOUG committee asked me to submit some abstracts for InSync11. I got the official acceptance through today, so travel approval permitting, I’ll be in Sydney on August 16-17 to speak at the conference.

I’m not looking forward to the flights, but it will be really cool to see the Aussies again. I was teaching some Oracle University courses in Australia last year, but it’s a couple of years since I’ve spoken at a conference down there.

Cheers

Tim…




Implicit Datatype Conversion + Histograms = Bad Execution Plan?

Earlier today I exchanged some tweets with @martinberx about some optimizer questions; after he posted more details to the ORACLE-L list, I was able to reproduce what he was observing.

The issue:

DB: 11.2.0.2.0 – 64bit
I have a small query with a little error, which causes big troubles.
The relevant part of the query is
WHERE ....
AND inst_prod_type=003
AND setid='COM01'

but INST_PROD_TYPE is VARCHAR2.

This leads to the filter [ (TO_NUMBER("INST_PROD_TYPE")=3 AND "SETID"='COM01') ].

Based on this TO_NUMBER (I guess!) the optimiser takes a fixed selectivity of 1%.

Can someone tell me if this 1% is right? Jonathan Lewis’ “Cost-Based Oracle Fundamentals” only talks about character expressions on page 133.

Unfortunately there are only 2 distinct values of INST_PROD_TYPE, so this artificially low selectivity leads to my problem:
An INDEX SKIP SCAN on PS0RF_INST_PROD is chosen. (Columns of PS0RF_INST_PROD: INST_PROD_TYPE, SETID, INST_PROD_ID.)

After fixing the statement to
AND inst_prod_type='003'
another index is used and the statement performs as expected.

Now I have no problem, but I want to find the optimizer's decisions in my 10053 traces.

The Important Bits of Information

From Martin’s email we need to pay close attention to:

  • Predicate of "inst_prod_type=003", where INST_PROD_TYPE is VARCHAR2 (note: no single quotes around 003)
  • Implicit datatype conversion in the predicate section of the explain plan – TO_NUMBER("INST_PROD_TYPE")=3
  • Only 2 distinct values of INST_PROD_TYPE

From this information I’ll construct the following test case:

create table foo (c1 varchar2(8));
insert into foo select '003' from dual connect by level <= 1000000;
insert into foo select '100' from dual connect by level <= 1000000;
commit;
exec dbms_stats.gather_table_stats(user,'foo');

And using the display_raw function we’ll look at the column stats.

col low_val     for a8
col high_val    for a8
col data_type   for a9
col column_name for a11

select
   a.column_name,
   display_raw(a.low_value,b.data_type) as low_val,
   display_raw(a.high_value,b.data_type) as high_val,
   b.data_type,
   a.density,
   a.histogram,
   a.num_buckets
from
   user_tab_col_statistics a, user_tab_cols b
where
   a.table_name='FOO' and
   a.table_name=b.table_name and
   a.column_name=b.column_name
/

COLUMN_NAME LOW_VAL  HIGH_VAL DATA_TYPE    DENSITY HISTOGRAM       NUM_BUCKETS
----------- -------- -------- --------- ---------- --------------- -----------
C1          003      100      VARCHAR2          .5 NONE                      1

Take note of the lack of a histogram.
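With no histogram in place, the CBO's estimate for an equality predicate is simply num_rows × density, i.e. num_rows / NDV. A quick sketch of the arithmetic (in Python, purely for illustration; the figures come from the test case above):

```python
# No histogram: selectivity of an equality predicate defaults to 1/NDV,
# which is also the DENSITY reported by user_tab_col_statistics.
num_rows = 2_000_000   # 1M rows of '003' plus 1M rows of '100'
ndv = 2                # two distinct values of C1
density = 1 / ndv      # 0.5, matching the DENSITY column above

estimated_rows = num_rows * density
print(estimated_rows)  # 1000000.0 - the 1000K estimate in the plans
```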

Now let’s see what the CBO estimates for a simple query with and without quotes (no conversion versus implicit conversion).

SQL> explain plan for select count(*) from foo where c1=003;

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 1342139204

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |   875   (3)| 00:00:11 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| FOO  |  1000K|  3906K|   875   (3)| 00:00:11 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER("C1")=003)

14 rows selected.

SQL> explain plan for select count(*) from foo where c1='003';

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 1342139204

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |   868   (2)| 00:00:11 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| FOO  |  1000K|  3906K|   868   (2)| 00:00:11 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("C1"='003')

14 rows selected.

In this case the estimated number of rows is spot on – 1 million rows. Now let’s regather stats; because our queries used C1 in predicates, it has become a candidate for a histogram. We can see this from sys.col_usage$.

select  oo.name owner,
        o.name table_name,
        c.name column_name,
        u.equality_preds,
        u.equijoin_preds,
        u.nonequijoin_preds,
        u.range_preds,
        u.like_preds,
        u.null_preds,
        u.timestamp
from    sys.col_usage$ u,
        sys.obj$ o,
        sys.user$ oo,
        sys.col$ c
where   o.obj#   = u.obj#
and     oo.user# = o.owner#
and     c.obj#   = u.obj#
and     c.col#   = u.intcol#
and     oo.name  = 'GRAHN'
and     o.name   = 'FOO'
/

OWNER TABLE_NAME COLUMN_NAME EQUALITY_PREDS EQUIJOIN_PREDS NONEQUIJOIN_PREDS RANGE_PREDS LIKE_PREDS NULL_PREDS TIMESTAMP
----- ---------- ----------- -------------- -------------- ----------------- ----------- ---------- ---------- -------------------
GRAHN FOO        C1                       1              0                 0           0          0          0 2011-06-08 22:29:59

Regather stats and re-check the column stats:

SQL> exec dbms_stats.gather_table_stats(user,'foo');

PL/SQL procedure successfully completed.

SQL> select
  2     a.column_name,
  3     display_raw(a.low_value,b.data_type) as low_val,
  4     display_raw(a.high_value,b.data_type) as high_val,
  5     b.data_type,
  6     a.density,
  7     a.histogram,
  8     a.num_buckets
  9  from
 10     user_tab_col_statistics a, user_tab_cols b
 11  where
 12     a.table_name='FOO' and
 13     a.table_name=b.table_name and
 14     a.column_name=b.column_name
 15  /

COLUMN_NAME LOW_VAL  HIGH_VAL DATA_TYPE    DENSITY HISTOGRAM       NUM_BUCKETS
----------- -------- -------- --------- ---------- --------------- -----------
C1          003      100      VARCHAR2  2.5192E-07 FREQUENCY                 2

Note the presence of a frequency histogram. Now let’s re-explain:

SQL> explain plan for select count(*) from foo where c1=003;

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 1342139204

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |   875   (3)| 00:00:11 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| FOO  |     1 |     4 |   875   (3)| 00:00:11 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER("C1")=003)

SQL> explain plan for select count(*) from foo where c1='003';

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 1342139204

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |   868   (2)| 00:00:11 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| FOO  |  1025K|  4006K|   868   (2)| 00:00:11 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("C1"='003')

And whammy! Note that the implicit cast [ filter(TO_NUMBER("C1")=003) ] now has an estimate of 1 row (when we know there are 1 million).

So what is going on here? Let’s dig into the optimizer trace for some insight:

SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for FOO[FOO]
  Column (#1):
    NewDensity:0.243587, OldDensity:0.000000 BktCnt:5458, PopBktCnt:5458, PopValCnt:2, NDV:2
  Column (#1): C1(
    AvgLen: 4 NDV: 2 Nulls: 0 Density: 0.243587
    Histogram: Freq  #Bkts: 2  UncompBkts: 5458  EndPtVals: 2
  Using prorated density: 0.000000 of col #1 as selectvity of out-of-range/non-existent value pred
  Table: FOO  Alias: FOO
    Card: Original: 2000000.000000  Rounded: 1  Computed: 0.50  Non Adjusted: 0.50
  Access Path: TableScan
    Cost:  875.41  Resp: 875.41  Degree: 0
      Cost_io: 853.00  Cost_cpu: 622375564
      Resp_io: 853.00  Resp_cpu: 622375564
  Best:: AccessPath: TableScan
         Cost: 875.41  Degree: 1  Resp: 875.41  Card: 0.50  Bytes: 0

As you can see from the line

Using prorated density: 0.000000 of col #1 as selectvity of out-of-range/non-existent value pred

The presence of the histogram combined with the implicit conversion TO_NUMBER("C1")=003 causes the CBO to use a density of 0, because it thinks it is looking at a non-existent value. The reason is that TO_NUMBER("C1")=003 is the same as TO_NUMBER("C1")=3, and for the histogram lookup the CBO uses TO_CHAR(C1)='3' – but '3' is not present in the histogram, only '003' and '100'.
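One way to picture the miss (a hypothetical Python sketch of the lookup logic, not Oracle's actual implementation): the frequency histogram is keyed on the stored character values, but the implicit conversion normalizes the literal through a number first, so its character form no longer matches:

```python
# Hypothetical sketch - a frequency histogram keyed on the stored
# VARCHAR2 values of C1.
histogram = {'003': 1_000_000, '100': 1_000_000}

def estimated_matches(numeric_literal):
    # Implicit conversion: 003 is parsed as the number 3, and its
    # character form for the histogram lookup is '3', not '003'.
    key = str(numeric_literal)
    # A miss is treated as an out-of-range/non-existent value:
    # prorated density 0, which rounds up to a cardinality of 1.
    return histogram.get(key, 0)

print(estimated_matches(3))    # 0 -> '3' missing; estimate collapses to 1 row
print(estimated_matches(100))  # 1000000 -> '100' survives the round trip
```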

Dumb Luck?

So, what if the predicate contained a number that was not left padded with zeros, say 100, the other value we put in the table?

SQL> explain plan for select count(*) from foo where c1=100;

Explained.

SQL> select * from table(dbms_xplan.display());

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------
Plan hash value: 1342139204

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     4 |   875   (3)| 00:00:11 |
|   1 |  SORT AGGREGATE    |      |     1 |     4 |            |          |
|*  2 |   TABLE ACCESS FULL| FOO  |  1009K|  3944K|   875   (3)| 00:00:11 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER("C1")=100)

While not exact, the CBO estimate is quite close to the 1 million rows with C1='100'.

Summary

It’s quite clear that Martin’s issue came down to the following:

  • implicit casting
  • presence of a histogram
  • a zero-left-padded number/string

The combination of these creates a scenario where the CBO thinks the value is out of range and uses a prorated density of 0, resulting in a cardinality estimate of 1 when there are far more rows than that.

The moral of the story here is to always cast your predicates correctly. This includes explicit casting of date types as well – never rely on NLS settings.
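The same discipline applies in any client language. As an analogy (a Python sketch; the Oracle equivalent is TO_DATE with an explicit format mask rather than trusting NLS_DATE_FORMAT), an explicit format makes the conversion unambiguous:

```python
from datetime import datetime

# Explicit format mask: the conversion is unambiguous regardless of
# locale or session settings - the analogue of
# TO_DATE('2011-06-08','YYYY-MM-DD') rather than TO_DATE('2011-06-08').
d = datetime.strptime('2011-06-08', '%Y-%m-%d')
print(d.year, d.month, d.day)  # 2011 6 8
```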

All tests performed on 11.2.0.2.

NOLOGGING Option and log file sync Waits – What is Wrong with this Quote?

June 8, 2011 I have not published a “What is Wrong with this Quote” blog article for a couple of months, so I thought that I would publish a few of those types of articles. The first quote is from the Alpha copy of the book “Oracle 11g Performance Tuning Recipes”, recipe 5-7 Resolving Log File [...]

Flashback Query "AS OF" - Tablescan costs

This is just a short note prompted by a recent thread on the OTN forums. In recent versions Oracle changes the cost of a full table scan (FTS, or index fast full scan / IFFS) quite dramatically if the "flashback query" clause gets used.

It looks like it simply uses the number of blocks of the segment as the I/O cost of the FTS operation – quite similar to setting "db_file_multiblock_read_count" ("dbfmbrc"), or from 10g on more precisely "_db_file_optimizer_read_count", to 1 for the cost estimate of the segment in question (but be aware of the MBRC setting of WORKLOAD system statistics – see comments below).
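To put rough numbers on the effect (a back-of-the-envelope sketch with made-up figures, not the CBO's full cost model): if a multiblock read normally covers several blocks, costing each block as its own read inflates the I/O cost by roughly that factor:

```python
# Illustrative arithmetic only - the real cost model includes MBRC
# adjustments, CPU cost and system-statistics timings.
blocks = 10_000   # hypothetical segment size in blocks
mbrc = 8          # hypothetical effective multiblock read count

normal_io_cost = blocks / mbrc   # ~1250 multiblock reads
flashback_io_cost = blocks / 1   # one "read" per block under AS OF

print(normal_io_cost, flashback_io_cost)  # 1250.0 10000.0
```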

This can lead to some silly plans depending on the available other access paths as can be seen from the thread mentioned.

How to hint – 1

Here’s a quick tutorial in hinting, prompted by a question on the OTN database forum.
The OP has a hash semi-join and Oracle appears to be ignoring a hint to use a nested loop:

    > I tried forcing the optimizer to not use hash join by adding NO_USE_HASH, USE_NL to my sql but it doesn’t seem to work.
    > Can anyone please help check what I have done wrong.
    > select /*+ NO_USE_HASH(C2)  USE_NL(C2) */
    >         SC.SID, SC.MID, SC.INDATE, SC.EXDATE, SC.AUDATE
    > FROM    SSCLASS SC
    > WHERE   SC.SID = 0
    > AND     SC.CID = 0
    > AND     SC.MID = 1
    > AND     SC.INDATE <= SC.EXDATE
    > AND     EXISTS (
    >                 SELECT  SSCID FROM SSCLASS C2
    >                 WHERE   C2.SSCID = SC.SSCID
    >                 AND     C2.AUDATE >= to_date('2009-01-01','yyyy-MM-dd')
    >         )
    > ORDER BY
    >        SSCID, INDATE, EXDATE
    >
    > PLAN_TABLE_OUTPUT
    > Plan hash value: 1476588646
    >
    > ------------------------------------------------------------------------------------------------------
    > | Id  | Operation                     | Name         | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
    > ------------------------------------------------------------------------------------------------------
    > |   0 | SELECT STATEMENT              |              |   204K|    10M|       | 35799   (1)| 00:07:10 |
    > |   1 |  SORT ORDER BY                |              |   204K|    10M|    25M| 35799   (1)| 00:07:10 |
    > |*  2 |   HASH JOIN SEMI              |              |   204K|    10M|    10M| 33077   (1)| 00:06:37 |
    > |*  3 |    TABLE ACCESS BY INDEX ROWID| SSCLASS      |   204K|  7983K|       |  9110   (1)| 00:01:50 |
    > |*  4 |     INDEX RANGE SCAN          | X5_SSCLASS   |   204K|       |       |   582   (1)| 00:00:07 |
    > |*  5 |    INDEX RANGE SCAN           | X6_SSCLASS   |  4955K|    66M|       | 17276   (1)| 00:03:28 |
    > ------------------------------------------------------------------------------------------------------
    

I’m not going to argue about what plans might be good or bad, and I’m going to assume the OP simply wants a nested loop semi join using a “good” index into the table aliased as C2; so I’m just going to demonstrate on this simple example how to approach that specific problem. The critical error the OP has made is that the join he’s trying to affect doesn’t exist in the query block where he’s put his hint – so he needs to find out what query will exist after the subquery has been nested and the optimizer is looking at the semi-join.

Here’s the initial query with its default execution plan. I’ll point out that there is an index on the n1 column that I’m using in the existence test:

select
	*
from
	t2
where	t2.n2 = 15
and	exists (
		select
			null
		from	t1
		where	t1.n1 = t2.n1
	)
;

-------------------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |    15 |  2865 |    26   (4)| 00:00:01 |
|*  1 |  HASH JOIN SEMI       |       |    15 |  2865 |    26   (4)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL   | T2    |    15 |  2805 |    22   (0)| 00:00:01 |
|   3 |   INDEX FAST FULL SCAN| T1_I1 |  3000 | 12000 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."N1"="T2"."N1")
   2 - filter("T2"."N2"=15)

So I’ve emulated the hash semi-join into the second table that the OP wants to get rid of, and I’m not using the target index in a “precision” fashion.

I happen to know that there is a hint that I can use to make the subquery operate as a nested loop semijoin. It’s /*+ nl_sj */ and it has to go in the subquery. Unfortunately it’s a hint that’s deprecated in 10g, but never mind that for the moment. I’m also going to adopt “sensible practice” and give each of my query blocks a name. Let’s see what we get from dbms_xplan with the hint.

explain plan
set statement_id = 'sj_hinted'
for
select
	/*+
		qb_name(main)
	*/
	*
from
	t2
where	t2.n2 = 15
and	exists (
		select
			/*+
				qb_name(subq) nl_sj
			*/
			null
		from	t1
		where	t1.n1 = t2.n1
	)
;

select * from table(dbms_xplan.display(null,'sj_hinted','outline'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value: 635111780

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |    15 |  2865 |    37   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI |       |    15 |  2865 |    37   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T2    |    15 |  2805 |    22   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | T1_I1 |  3000 | 12000 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      USE_NL(@"SEL$A93AFAED" "T1"@"SUBQ")
      LEADING(@"SEL$A93AFAED" "T2"@"MAIN" "T1"@"SUBQ")
      INDEX(@"SEL$A93AFAED" "T1"@"SUBQ" ("T1"."N1"))
      FULL(@"SEL$A93AFAED" "T2"@"MAIN")
      OUTLINE(@"SUBQ")
      OUTLINE(@"MAIN")
      UNNEST(@"SUBQ")
      OUTLINE_LEAF(@"SEL$A93AFAED")
      ALL_ROWS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T2"."N2"=15)
   3 - access("T1"."N1"="T2"."N1")

Note how I’ve used a statement_id to label my plan, and I’ve added the extra format option ‘outline’ to the call to dbms_xplan. The outline shows me the complete set of hints I need to reproduce the execution plan; technically it’s the information that would be stored by Oracle as an outline or an SQL baseline.

There are a few session-level parameter settings included that I don’t really need, and a couple of things which can’t qualify as “legal” SQL hints, so I’m going to ignore those. (Don’t you love the “ignore the hints” hint, though!)

So let’s take the minimum set of hints back into the SQL:

explain plan
set statement_id = 'full_hints'
for
select
	/*+
		qb_name(main)
		unnest(@subq)
		leading(@sel$a93afaed t2@main t1@subq)
		use_nl(@sel$a93afaed t1@subq)
		full(@sel$a93afaed t2@main)
		index(@sel$a93afaed t1@subq(t1.n1))
	*/
	*
from
	t2
where	t2.n2 = 15
and	exists (
		select
			/*+
				qb_name(subq)
			*/
			null
		from	t1
		where	t1.n1 = t2.n1
	)
;

select * from table(dbms_xplan.display(null,'full_hints','outline'));

----------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       |    15 |  2865 |    37   (0)| 00:00:01 |
|   1 |  NESTED LOOPS SEMI |       |    15 |  2865 |    37   (0)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T2    |    15 |  2805 |    22   (0)| 00:00:01 |
|*  3 |   INDEX RANGE SCAN | T1_I1 |  3000 | 12000 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------

Outline Data
-------------
  /*+
      BEGIN_OUTLINE_DATA
      USE_NL(@"SEL$A93AFAED" "T1"@"SUBQ")
      LEADING(@"SEL$A93AFAED" "T2"@"MAIN" "T1"@"SUBQ")
      INDEX(@"SEL$A93AFAED" "T1"@"SUBQ" ("T1"."N1"))
      FULL(@"SEL$A93AFAED" "T2"@"MAIN")
      OUTLINE(@"SUBQ")
      OUTLINE(@"MAIN")
      UNNEST(@"SUBQ")
      OUTLINE_LEAF(@"SEL$A93AFAED")
      ALL_ROWS
      OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
      IGNORE_OPTIM_EMBEDDED_HINTS
      END_OUTLINE_DATA
  */

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("T2"."N2"=15)
   3 - access("T1"."N1"="T2"."N1")

Job done – we used a bit of hackery to get the plan we wanted, then used the legal hints to reproduce the plan.

It is important to name your query blocks, as this helps you identify which transformations apply and when, and how to label your tables correctly in your code; and you have to remember that the “strange” query block names that appear (such as @”SEL$A93AFAED”) depend on the query block names you originally supplied.

The method isn’t perfect since (a) sometimes hints that are needed don’t get into the outline, and (b) sometimes the outline actually doesn’t reproduce the plan if all you use are the “legal” hints – but it may help you in most cases.
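The filtering step in this method (taking the Outline Data and keeping only the hints that are legal in SQL text) can be sketched mechanically. This is a rough helper, not an official API; the exclusion list is inferred from the outline shown above and is not exhaustive:

```python
# Sketch: strip outline-only directives from an Outline Data section,
# leaving the hints that can legally be embedded in the SQL text.
NON_SQL_HINTS = ('BEGIN_OUTLINE_DATA', 'END_OUTLINE_DATA',
                 'OUTLINE(', 'OUTLINE_LEAF(',
                 'ALL_ROWS', 'OPTIMIZER_FEATURES_ENABLE',
                 'IGNORE_OPTIM_EMBEDDED_HINTS')

outline = """
BEGIN_OUTLINE_DATA
USE_NL(@"SEL$A93AFAED" "T1"@"SUBQ")
LEADING(@"SEL$A93AFAED" "T2"@"MAIN" "T1"@"SUBQ")
INDEX(@"SEL$A93AFAED" "T1"@"SUBQ" ("T1"."N1"))
FULL(@"SEL$A93AFAED" "T2"@"MAIN")
OUTLINE(@"SUBQ")
OUTLINE(@"MAIN")
UNNEST(@"SUBQ")
OUTLINE_LEAF(@"SEL$A93AFAED")
ALL_ROWS
OPTIMIZER_FEATURES_ENABLE('10.2.0.3')
IGNORE_OPTIM_EMBEDDED_HINTS
END_OUTLINE_DATA
"""

legal = [h for h in (line.strip() for line in outline.splitlines())
         if h and not any(h.startswith(p) for p in NON_SQL_HINTS)]
for hint in legal:
    print(hint)
```

Running this over the outline above leaves exactly the five hints used in the fully hinted query: use_nl, leading, index, full and unnest.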

Getting Fit: Another Update…

It’s a little over a month since I joined the gym. In that time I’ve had 4 days off, so it’s been pretty full on. I had my weekly personal training session today and I’m staggered by my current level of fitness compared to where I started. The trainer is pure evil and he really put me through it today, but I did way more than I thought was possible.

The daily interval training has worked wonders for my recovery time. Towards the end of today’s session I finished one circuit, puked, and was ready to start again in a minute. Pretty phenomenal by my standards.

The weight is coming off very slowly now. It’s hard not to fixate on the scales, but I am aware that I’ve put on a lot of muscle this month, so the scales are not a very good indication of what’s going on. Even so, it’s kinda tough when people ask how much weight you’ve lost and you tell them you’ve killed yourself in the gym for a month and only lost 7 pounds.

So the journey continues. There is still a very long way to go, but I’m still moving in the right direction so that’s good news.

I’m hoping to speak at the InSync11 conference in Australia in a couple of months. It would be nice if I could drop some more weight before then. It would certainly make the economy class seats feel a little better. :)

Cheers

Tim…




Additional date for my Grid Infrastructure seminar in the UK

I was very pleasantly surprised that Oracle University is offering another day for my “Grid Infrastructure and Database High Availability Deep Dive” seminar. In addition to the imminent seminars in June (I blogged about them earlier), this one is in London, England. For anyone interested, here is the link:

http://education.oracle.com/pls/web_prod-plq-dad/db_pages.getCourseDesc?dc=D70365_1439069&p_org_id=28&lang=US

The date has been set for October 10th, so there is still plenty of time, but nevertheless I hope to see you there!


V$SESSION’s ROW_WAIT_OBJ# Column Shows the DATA_OBJECT_ID – Verify the Documentation Challenge

June 7, 2011 While reading through the alpha copy of a performance tuning book, I noticed a set of SQL statements that were identical to those found in the Oracle Database 11.2 Performance Tuning Guide.  The set of SQL statements from the Performance Tuning Guide follow: 10.3.1.1 Causes To determine the possible causes, first query [...]

Fedora 15: Further Observations…

I’ve mentioned Fedora 15 a couple of times recently:

A couple more things I’ve noticed along the way:

  • Lots of the early press complained about the “reduced productivity” of the GNOME3 desktop. Having used it for a while now I can safely say it doesn’t impact me at all. When I’m using my MacBook Pro I have to move the mouse down to the bottom of the screen to activate the dock. It’s like a reflex reaction. I’ve found I do much the same thing with GNOME3. If I’m using the mouse to switch to Activities (rather than the Windows key), I make a circular anti-clockwise motion with the mouse so it touches the top-left corner of the screen. That switches to Activities without a mouse click, so on the down stroke of the circle I’m over the dock. It’s really pretty stress free. A single mouse move with no extra clicks. Sorted.
  • I did an upgrade from Fedora 14 on one machine. Fedora 15 includes LibreOffice in place of OpenOffice, so OpenOffice was not “upgraded” and the existing OpenOffice installation stopped working. It wasn’t a problem – I just installed LibreOffice – just an observation.

Cheers

Tim…




Audit Excess

So you’ve decided you want to audit a particular table in your database, and think that Oracle’s built-in audit command will do what you want. You discover two options that seem to be relevant:

audit all on t1 by access;
audit all on t1 by session;

To check the audit state of anything in your schema you can then run a simple query – with a few SQL*Plus formatting commands – to see something like the following:

select
	*
from
	user_obj_audit_opts
;

OBJECT_NAME          OBJECT_TYPE       ALT AUD COM DEL GRA IND INS LOC REN SEL UPD REF EXE CRE REA WRI FBK
-------------------- ----------------- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
T1                   TABLE             S/S S/S S/S S/S S/S S/S S/S S/S S/S S/S S/S -/- -/- -/- -/- -/- S/S

The appearance of all the ‘S’ markers in my output is because I actually ran the two earlier commands in the order shown, so the ‘by session’ command after the ‘by access’ command. If I’d run the commands the other way round then you would have seen ‘A’s everywhere.

Each of the columns (‘ALT’, ‘SEL’, etc.) represents an action that can be audited, and the column content shows when and why it will be audited. The significance of the “X/X” is that the default audit is “always”, but could be limited to “whenever successful” (the first ‘X’) or “whenever not successful” (the second ‘X’).

The question is this: is auditing by session cheaper than auditing by access? Your initial thought is that it probably ought to be, because you only need one row per audited object per session when auditing by session, not one row per session per access – but how much do you really save? When in doubt, one of the first tests is to run a little code with sql_trace enabled.

Here, from 10.2.0.3, is the interesting bit from tkprof after running “select * from t1” eleven times – enabling sql_trace after the first select so that I can see the work done by auditing for the last ten selects:

Audit by access
===============
insert into sys.aud$( sessionid,entryid,statement,ntimestamp#, userid,
  userhost,terminal,action#,returncode, obj$creator,obj$name,auth$privileges,
  auth$grantee, new$owner,new$name,ses$actions,ses$tid,logoff$pread,
  logoff$lwrite,logoff$dead,comment$text,spare1,spare2,  priv$used,clientid,
  sessioncpu,proxy$sid,user$guid, instance#,process#,xid,scn,auditid,sqlbind,
  sqltext)
values
(:1,:2,:3,SYS_EXTRACT_UTC(SYSTIMESTAMP),     :4,:5,:6,:7,:8,     :9,:10,:11,
  :12,     :13,:14,:15,:16,:17,     :18,:19,:20,:21,:22,     :23,:24,:25,:26,
  :27,     :28,:29,:30,:31,:32,:33,:34)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute     10      0.00       0.00          0          0         41          10
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       10      0.00       0.00          0          0         41          10

Audit by session
================
update sys.aud$ set ses$actions=merge$actions(ses$actions,:3), spare2=
  nvl(spare2,:4)
where
 sessionid=:1 and ses$tid=:2 and action#=103 and (priv$used=:5 or priv$used
  is null and :5 is null)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        0      0.00       0.00          0          0          0           0
Execute     10      0.00       0.00          0         40         30          10
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total       10      0.00       0.00          0         40         30          10

If you thought that auditing by session was a light-weight operation, think again. Interestingly, the auditing anomaly that I reported recently re-appears in the “audit by session” option. You will recall that when auditing is enabled, Oracle turns “update row” into “lock row, update row” – this happens to the audit table (aud$) as well, so every update to an aud$ row generates two redo entries. Here’s a comparison of the redo generated (by 10.2.0.3) for the two tests:

Audit by access
===============
redo entries                                                                30
redo size                                                               10,360
undo change vector size                                                  2,280

Audit by session
================
redo entries                                                                30
redo size                                                                8,112
undo change vector size                                                  2,320

The results are startlingly similar. For “audit by access” we see one redo entry for the table insert, one for the index insert, one for the commit; for “audit by session” we see one redo entry for the “lock row”, one for the “update row”, one for the commit. You may recall, by the way, that one of the comments on my blog about the lock/update anomaly pointed out that this bug was limited to 10.2.0.1 – 10.2.0.3; that may be true most of the time, but when you get to the aud$ table the same lock/update pattern reappears.
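The redo-entry counts reconcile with simple arithmetic (a sketch restating the per-entry breakdown above):

```python
executes = 10  # audited selects traced in each test

# Audit by access: table insert + index insert + commit per statement.
by_access_entries = executes * 3

# Audit by session (with the lock/update anomaly):
# lock row + update row + commit per statement.
by_session_entries = executes * 3

print(by_access_entries, by_session_entries)  # 30 30 - matching both tests
```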

And here’s a final kicker when you think about auditing: sometimes it seems like a good idea to write a pl/sql function with an embedded query that you can use in a select list rather than doing an outer join to a table in the from clause. If you enable auditing on the embedded table, every call to the pl/sql function results in an insert (or update) of a row in the audit table.

Footnote: If you want to see what the three-letter column names in the user_obj_audit_opts view mean, you can run a query like this:


column column_name format a6 heading "Option"
column comments format a70 heading "Description"

select
	column_name, comments
from
	all_col_comments
where
	owner = 'SYS'
and	table_name = 'USER_DEF_AUDIT_OPTS'
;

Option Description
------ ----------------------------------------------------------------------
ALT    Auditing ALTER WHENEVER SUCCESSFUL / UNSUCCESSFUL
AUD    Auditing AUDIT WHENEVER SUCCESSFUL / UNSUCCESSFUL
COM    Auditing COMMENT WHENEVER SUCCESSFUL / UNSUCCESSFUL
DEL    Auditing DELETE WHENEVER SUCCESSFUL / UNSUCCESSFUL
GRA    Auditing GRANT WHENEVER SUCCESSFUL / UNSUCCESSFUL
IND    Auditing INDEX WHENEVER SUCCESSFUL / UNSUCCESSFUL
INS    Auditing INSERT WHENEVER SUCCESSFUL / UNSUCCESSFUL
LOC    Auditing LOCK WHENEVER SUCCESSFUL / UNSUCCESSFUL
REN    Auditing RENAME WHENEVER SUCCESSFUL / UNSUCCESSFUL
SEL    Auditing SELECT WHENEVER SUCCESSFUL / UNSUCCESSFUL
UPD    Auditing UPDATE WHENEVER SUCCESSFUL / UNSUCCESSFUL
REF    Dummy REF column. Maintained for backward compatibility of the view
EXE    Auditing EXECUTE WHENEVER SUCCESSFUL / UNSUCCESSFUL
FBK    Auditing FLASHBACK WHENEVER SUCCESSFUL / UNSUCCESSFUL
REA    Auditing READ WHENEVER SUCCESSFUL / UNSUCCESSFUL