Search

OakieTags

Who's online

There are currently 0 users and 13 guests online.

Recent comments

Affiliations

11.1.0.7

Time Model Bug

Tasks that are performed via jobs in the database will be double accounted in the system time model that has been introduced with Oracle 10g.

So if you execute significant workload via DBMS_JOB or DBMS_SCHEDULER any system time model related statistic like DB Time, DB CPU etc. that gets recorded for that workload gets double accounted.

This bug is not particularly relevant since your top workloads will still be the same top workloads, because all other statistics (like Elapsed Time, CPU, Buffer Gets etc.) are not affected by the bug.

I mention it only here since the bug (see below for details) as of the time of writing can't yet be found on My Oracle Support in the bug database but I recently came across several AWR reports where the majority of workload was generated via job processes and therefore the time model statistics were effectively doubled.

It might help as a viable explanation if you sometimes wonder why an AWR or Statspack report only captures 50% or less of the recorded total DB Time or DB CPU and where this unaccounted time has gone. If a significant part of the workload during the reporting period has been performed by sessions controlled via DBMS_JOB or DBMS_SCHEDULER then probably most of the unaccounted time is actually not unaccounted but the time model statistics are wrong.

So if you have such an otherwise unexplainable unaccounted DB Time / DB CPU etc. you might want to check if significant workload during the reporting period was executed via the job system. Note that I don't say that this is the only possible explanation of such unaccounted time - there might be other reasons like uninstrumented waits, other bugs etc.

Of course all the percentages that are shown in the AWR / ADDM / Statspack reports that refer to "Percentage of DB Time" or "Percentage of DB CPU" will be too small in such cases.

Compression Restrictions

When using table or index compression certain restrictions apply. Since I find it always hard to gather that information from the manuals and also some of the restrictions are not documented properly or not at all, I'll attempt to summarize the restrictions that I'm aware of here:

1. Compression attribute on subpartitions
Note that you can't define the COMPRESSION on subpartition level - the subpartitions inherit this attribute from the parent partition.

It is however perfectly valid and legal to exchange a subpartition with a compressed table and that way introduce individual compression on subpartition level. It is unclear why this is only possible using this indirect method - it simply looks like a implementation restriction. Since compressed data is most suitable for data warehouse environments and these ought to use EXCHANGE [SUB]PARTITION technology anyway this restriction is probably not that crucial.

2. Number of columns
Basic heap table compression and the new OLTP (in 11.1 called "for all operations") compression are both limited to 255 columns. If you attempt to compress a table with more than 255 columns then no compression will be performed although you don't get an error message. Interestingly the COMPRESSION column in the dictionary still shows "ENABLED" which is certainly misleading.

3. DDL Restrictions - modification of table structure
As soon as a segment contains compressed data or has been marked for compression (does not apply in all cases, see below for more information), the following restrictions regarding the modification of the table structure apply:

DBMS_STATS - Gather table statistics with many columns

Here is another good reason why you probably don't want to use tables with too many columns.

The first good reason is that Oracle stores rows of conventional heap tables with more than 255 columns (where at least one column value after the 255th is non-null) in multiple row pieces even when the entire row may fit into a single block (and up to 11.2 doesn't support basic and OLTP heap table compression on tables with more than 255 columns). This leads to something that is sometimes called "intra-row chaining" which means that Oracle needs to follow the row piece pointer to access columns after the 255th one leading to multiple logical I/Os per row, up to four for a row approaching the hard limit of 1,000 columns.

Of course such a multiple-piece row easily can result in actual row chaining where the row pieces are scattered across different blocks now potentially leading to additional random single block I/O, but I digress...

This simple example allows to see this (and the other, yet to describe) effect in action - it creates by default a table with 1,000 columns with 10,000 rows where each row resides in a separate block (and therefore only intra-row chaining should occur). I used a 8K block size tablespace with Manual Segment Space Management (MSSM) in order to avoid the ASSM overhead in this case:

----------------------------------------------------------------------------------------------------------
-- MANY_X_COLS.SQL
-- Usage: @MANY_X_COLS [NUM_COLS] [PCTFREE] [PCTUSED] [TABLESPACE] [COL1] [COL2] [COL3] [COL4] [NUM_ROWS]
-- Defaults: [1000 ] [99 ] [1 ] [TEST_8K ] [001 ] [256 ] [512 ] [768 ] [10000 ]
--
-- Create a table MANY_X_COLS with a configurable number of columns
-- and populate four columns of it using skewed data (normal distribution)

Advanced Oracle Troubleshooting Session - PGA/UGA memory fragmentation

A troubleshooting session at one of my clients started with the following description of the problem:

In some specific database environments sometimes (but not always) a particular batch process takes significantly longer than expected - which in that case meant hours instead of a few minutes.

It could also be observed that particular SQL statements showed up as most prominent in that case when monitoring the process.

So the first thing was to be able to reproduce the issue at will which was not the case so far.

It was quite quickly possible to narrow the problem down to a single call to a complex PL/SQL stored procedure (that was part of a package and called a lot of other stored procedures / packages) that dealt with handling of LOBs representing XML data.

It turned out that by simply calling this stored procedure multiple times in a loop the issue could be reproduced at will and also in different database environments as initially reported.

The rather unique aspect of that stored procedure was that it got executed in an "exclusive" mode which means that it was run on a quite large IBM pSeries server (AIX 5.3, 11.1.0.7, 240 GB RAM, numerous P6 multi-cores, expensive storage) with no other business processes active at the same time. It was mostly a serial execution and when it was monitored, it spent most of its time in two SQL statements that were very simple and straightforward. It actually were static SQLs embedded in PL/SQL that queried a single, very small table using a unique index access plus a table access by ROWID. So we had two SQL statements that under normal circumstances at most could generate two logical I/Os per execution, since the index/table queried was so small and therefore was completely held in the buffer cache.

Frequency histograms - edge cases

Oracle introduced with the 10.2.0.4 patch set a significant change how non-existing values are treated when there is a frequency histogram on a column / expression. See Jonathan Lewis' blog post which is probably the most concise description of the issue. In a nutshell the change is about the following (quoted from Jonathan's post): "If the value you supply does not appear in the histogram, but is inside the low/high range of the histogram then the cardinality will be half the cardinality of the least frequently occurring value that is in the histogram".

I'm still a bit puzzled why Oracle introduced such a significant change to the optimizer with a patch set, but one of the most obvious reasons might be that the change allows to generate frequency histograms using a rather low sample size, because there is no longer a similar threat as before when the frequency histogram misses one of the existing values (which would then return an estimate of 1 if they didn't appear in the histogram).

In fact when using the default DBMS_STATS.AUTO_SAMPLE_SIZE Oracle exactly does this: It uses by default a quite low sample size to perform the additional runs required for each histogram - probably an attempt to minimize the additional work that needs to be done for histogram generation.

In it is however quite interesting to see how exactly this behaviour together with the new treatment of non-existing values can turn into a threat as a recent thread on OTN demonstrated.

Consider the following scenario: You have a column with a highly skewed data distribution; there is a single, very popular value, and a few other, very unpopular values.

TEMP Table Transformation and Remote Procedure Calls

I've recently come across a interesting side-effect regarding temp table transformations at one of my clients.

There was a PL/SQL package procedure that worked fine when called locally but somehow "hung" when being called from a remote database - all it did was to call exactly the same package procedure with the same parameters as the local call, but one of the SQL statements executed as part of the procedure generated an suboptimal execution plan that never completed.

Further investigations revealed that the significant difference between the execution plan of the local and the remote execution of the procedure was the different treatment of a contained "WITH" clause.

The interesting point is that the procedure called itself didn't perform any "distributed" queries or DML - the only difference was that one time the procedure got called locally, and one time remotely per database link. All processing within the procedure was local - no activities using database links were involved.

There are (at least) two known areas where Oracle can optionally use a so called TEMP TABLE TRANSFORMATION as part of the execution plan:

1. Materialization of a Subquery Factoring, also known as "Common Table Expression" or simply "WITH clause"

Oracle uses this when the subquery is used more than once in the execution plan, or if forced with the undocumented MATERIALIZE hint as part of the SELECT in the WITH clause. There are a few (not really documented) limitations of this materialization, in particular if LOBs or LONGs are part of the projection then this TEMP TABLE transformation can't get used.

2. Star transformation with TEMP TABLE transformation

Star transformations can also make use of the TEMP TABLE transformation. This is enabled by default when STAR_TRANSFORMATION_ENABLED is set to TRUE, but can be disabled by setting STAR_TRANSFORMATION_ENABLED to TEMP_DISABLE.

Multi-column joins

Consider the following scenario with four tables. Two of them represent master data, the third one uses a concatenated primary key consisting of foreign keys to the first two, and the fourth one has a foreign key to the third one.

drop table t1 cascade constraints purge;

drop table t2 cascade constraints purge;

drop table t3 cascade constraints purge;

drop table t4 cascade constraints purge;

create table t1 (
t1_id integer not null constraint pk_t1 primary key,
filler1 varchar2(40),
filler2 varchar2(40)
);

create table t2 (
t2_id integer not null constraint pk_t2 primary key,
filler1 varchar2(40),
filler2 varchar2(40)
);

create table t3 (
t1_id integer not null,
t2_id integer not null,
filler1 varchar2(40),
filler2 varchar2(40),
constraint pk_t3 primary key (t1_id, t2_id) using index (
create index pk_t3 on t3 (t1_id, t2_id)
),
constraint fk_t3_1 foreign key (t1_id) references t1 (t1_id),
constraint fk_t3_2 foreign key (t2_id) references t2 (t2_id));

create table t4 (
t4_id integer not null constraint pk_t4 primary key,
t1_id integer,
t2_id integer,
filler1 varchar2(40),
filler2 varchar2(40),
constraint t4_fk_1 foreign key (t1_id, t2_id) references t3 (t1_id, t2_id)
);

Notice that the primary key of "t3" is using a non-unique index, which is supported and can be used e.g. for deferrable constraints or when loading data into tables that might be non-unique so that the constraint can be disabled without dropping the (unique) index. This allows to simply re-enable the constraint after cleaning up the non-unique rows instead of re-creating an unique index (and the risk of losing the index if anything goes wrong).

Optimizer cleverness

At present I'm quite busy and therefore don't have much time to spent on writing blog notes, but I couldn't resist to publish this small and simple test case.

Often you can read (mostly unqualified) rants in various places and forums about the Cost Based Optimizer how stupid, unpredictable etc. it seems to be.

So I think it's time to demonstrate how clever the optimizer sometimes can be.

Consider the following setup:

drop table t_opt_clever purge;

-- Use PCTFREE 99 so that only one row per (leaf) block
-- This can tell us how many "rows" had to be inspected
-- by checking the number of (leaf) blocks accessed
-- Unfortunately Oracle (usually) doesn't provide the information
-- how many rows have been accessed in the execution plan,
-- but only how many rows are returned by an operation
create table t_opt_clever (
id not null constraint pk_opt_clever primary key,
col1 not null,
col2 not null,
col3 not null,
col4 not null,
col5 not null,
filler
)
pctfree 99
pctused 1
as
select
level as id
, round(dbms_random.value(0, 200)) as col1
, round(dbms_random.value(0, 400)) as col2
, case
when level <= 666
then 'FIRST_BUCKET'
when level <= 833
then 'SECOND_BUCKET'
when level <= 1000
then 'THIRD_BUCKET'
end as col3
, round(dbms_random.value(0, 600)) as col4
, round(dbms_random.value(0, 800)) as col5
, rpad('x', 100, 'x') as filler
from
dual
connect by
level <= 1000;

create index idx_opt_clever1 on t_opt_clever (col5, col1, col4, col2) pctfree 99 compute statistics;

create index idx_opt_clever2 on t_opt_clever (col5, col1, col3, col4, col2) pctfree 99 compute statistics;

Locked table statistics and subsequent create index

Just a minor thing to consider: By default in 10g and later index statistics are generated along with an index creation (option COMPUTE STATISTICS in previous releases enabled by default), so a newly created index usually has computed statistics.

10g also introduced the option to lock table statistics.

Now if you lock statistics in 10g in later using DBMS_STATS.LOCK_TABLE_STATS or LOCK_SCHEMA_STATS and create an index on a locked table the statistics for the index will not be generated along with the CREATE INDEX command. Unfortunately there is no corresponding "FORCE" option in CREATE INDEX available to overwrite that behaviour that I'm aware of so it looks like you're only left with two choices:

1. Use a separate DBMS_STATS.GATHER_INDEX_STATS call with the FORCE=>true option to override the lock on the statistics

2. Temporarily unlock the table statistics before creating the index

The first option can be costly if the index is large, the second option requires additional steps to be taken, and it obviously needs to be ensured that the table statistics are not modified while they are unlocked (e.g. by the default statistics job in 10g and later).

A small testcase run on 10.2.0.4 Win32 follows to demonstrate the issue. I got the same result on 11.1.0.7 Win32.

SQL>
SQL> drop table lock_test purge;

Table dropped.

SQL>
SQL> create table lock_test
2 as
3 select
4 *
5 from
6 all_objects
7 where
8 rownum <= 1000;

Table created.

SQL>
SQL> exec dbms_stats.lock_table_stats(null, 'lock_test')

PL/SQL procedure successfully completed.

SQL>
SQL> create index lock_test_idx on lock_test (object_name) compute statistics;
create index lock_test_idx on lock_test (object_name) compute statistics

Temporary Tablespace Groups

Oracle 10g introduced the concept of temporary tablespace groups.

These allow to group multiple temporary tablespaces into a single group and assign a user this group of tablespaces instead of a single temporary tablespace.

This raises some interesting questions, and for some of these I don't find answers in the official documentation. Some of these questions are:

- Can a single workarea execution allocate space from more than one temporary tablespace, e.g. to support large serial sort operations?

A workarea belongs to a single operation of an execution plan. There are several different types of operations that require a workarea, among them are sorts, hash joins, group bys and sort/merge joins.

This workarea can fit into available PGA memory, but can also spill to disk in case there is insufficient memory available to support the operation.

Furthermore this implies that a execution of a single SQL statement can require multiple workareas, e.g. a quite simple statement might need two workareas for two hash joins and a third one for a subsequent sort order by operation.

Note that there are other types of operations that don't require a workarea, e.g. a nested loop join doesn't require a workarea (and therefore will never acquire temporary space).

Details about workareas can be obtained from various dynamic performance views, e.g. V$SQL_WORKAREA, V$SQL_WORKAREA_ACTIVE and V$SQL_WORKAREA_HISTOGRAM.

- Can multiple workareas of a single session allocate space from different tablespaces?

- According to the documentation different sessions of the same user can use different temporary tablespaces from the group. Is this correct?