Getting tired of typing and setting your ORACLE_HOME, ORACLE_SID and troubleshooting your . oraenv ?
going/connecting to different client sites with different platforms and server environments sometimes it is frustrating just how long it takes to get a proper SQL*Plus environment
So I always make use of this script
and for every server that I access I just have to do three bits of typing:
One of the easiest ways to understand something is to see a visualization. Looking at Active Session History (ASH) data is no exception and I’ll dive into how to do so with R and how I used R plots to visually present a problem and confirm a hypothesis. But first some background…
Frequently DBAs use the Automatic Workload Repository (AWR) as an entry point for troubleshooting performance problems and in this case the adventure started the same way. In the AWR report Top 5 Timed Foreground Events, the log file sync event was showing up as the #3 event. This needed deeper investigation as often times the cause for longer log file sync times is related to longer log file parallel write times.
This one’s so odd I nearly posted it as a “Quiz Night” – but decided that it would be friendlier simply to demonstrate it. Here’s a simple script to create a couple of identical tables. It’s using my standard environment but, apart from fiddling with optimizer settings, I doubt if there’s any reason why you need to worry too much about getting the environment exactly right.
This is not really anything new - in fact Tanel Poder has already blogged about it a while ago. Tanel has specifically covered the handling of "urgent" TCP packets and how this could be used to signal a "cancel" to another process, however this only works on Unix environments and not with Windows SQL*Plus clients. In Tanel's article it is also mentioned that there is an officially documented way of doing this via the Resource Manager if you happen to have an Enterprise Edition license.
In my quick tests however the call to DBMS_RESOURCE_MANAGER.SWITCH_CONSUMER_GROUP_FOR_SESS using "CANCEL_SQL" as consumer group only errors out with ORA-29366 saying that the specified consumer group is invalid.
If you want to make use of Oracle's cunning Star Transformation feature then you need to be aware of the fact that the star transformation logic - as the name implies - assumes that you are using a proper star schema.
Here is a nice example of what can happen if you attempt to use star transformation but your model obviously doesn't really correspond to what Oracle expects:
purge table d;
drop table t;
purge table t;
create table t
rownum as id
, mod(rownum, 100) + 1 as fk1
, 1000 + mod(rownum, 10) + 1 as fk2
, 2000 + mod(rownum, 100) + 1 as fk3
, rpad('x', 100) as filler
level <= 1000000
exec dbms_stats.gather_table_stats(null, 't')
create bitmap index t_fk1 on t (fk1);
Ok, I think it’s time to write another blog entry. I’ve been traveling and dealing with jetlag from 10-hour time difference, then traveling some more, spoken at conferences, drank beer, had fun, then traveled some more, trained customers, hacked some Exadatas and now I’m back home.
Anyway, do you know what is the SQL_EXEC_ID in V$SESSION and ASH views?
Oh yeah, it’s the “SQL Execution ID” just like the documentation says … all clear. Um … is it? I’d like to know more about it – what does it actually stand for?! Is it session level, instance level or a RAC-global counter? And why does it start from 16 million, not 1?
There are many reasons why a parallel execution might not run with the expected degree of parallelism (DOP), beginning with running out of parallel slaves (PARALLEL_MAX_SERVERS or PROCESSES reached), PARALLEL_ADAPTIVE_MULTI_USER, downgrades at execution time via the Resource Manager, or the more recent features like PARALLEL_DEGREE_LIMIT or the Auto DOP introduced in Oracle 11.2.
Time for another of those little surprises that catch you out after the upgrade.
Take a look at this “Top N” from a standard AWR report, from an instance running 18.104.22.168
Top 5 Timed Foreground Events ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Avg wait % DB Event Waits Time(s) (ms) time Wait Class ------------------------------ ------------ ----------- ------ ------ ---------- direct path read 3,464,056 6,593 2 33.5 User I/O DB CPU 3,503 17.8 db flash cache single block ph 2,293,604 3,008 1 15.3 User I/O db file sequential read 200,779 2,294 11 11.6 User I/O enq: TC - contention 82 1,571 19158 8.0 Other
Have you ever heard the suggestion that if you see time lost on event write complete waits you need to get some faster discs.
So what’s the next move when you’ve got 96GB of flash cache plugged into your server (check the parameters below) and see time lost on event write complete waits: flash cache ?
db_flash_cache_file /flash/oracle/flash.dat db_flash_cache_size 96636764160
Here’s an extract from a standard 22.214.171.124 AWR report:
Here’s one I keep forgetting – and spending 15 minutes trying to think of the answer before getting to the “deja vu” point again. I’ve finally decided that I’ve got to write the answer down because that will save me about 14 minutes the next time I forget.
Q. In a Statspack or AWR report there is a section titles “Segments by Row Lock Waits”. Why could an index be subject to a Row Lock Wait ?
A. Try inserting into a table from two different sessions (without committing) two rows with the same primary key. The second insert will wait on event enq: TX – row lock contention, and show up in v$lock with a lock request for a TX lock in mode 4. When you issue a commit or rollback on the first session, and the second statement errors or completes (depending on whether you commit or rollback the first session) it will increase the value for row lock waits in v$segstat (and v$segment_statistics) for the index by 1.
There are variations on the theme, of course, but the key feature is uniqueness with one session waiting for another session to commit or rollback on a conflicting value. This includes cases of foreign key constraint checking such as inserting a child for a parent that has been deleted but not committed (and there’s an interesting anomaly with that scenario which – in 10g, at least – reports more row lock waits on the parent PK than you might expect.)