Oakies Blog Aggregator

RuOUG in Saint Petersburg

On February 10th I was in Saint Petersburg at a seminar organized by the Russian Oracle User Group. Actually, it was mostly prepared by a company called Devexperts. The seminar took place in their office, with most presentations given by their people.

  • The first presentation was called “AWR, ASH – use cases and 11.2 enhancements”. The content was targeted at people who are not very familiar with these facilities. It was good enough and covered (I think) most of the new features of AWR reports, such as more OS-level data, Wait Event Histogram, aggregate IO statistics (based on V$IOSTAT_FUNCTION_DETAIL), colored SQL and others. After AWR, the presentation moved on to what’s new in ASH. Most of it comes down to the fact that the number of columns in dba_hist_active_sess_history has almost doubled in 11.2.0.2 compared to 10.2.0.5. There were several examples of custom queries against ASH to find what’s hidden by averages and how to identify a skew (aka spike). There are references to other good presentations at the end. One presentation that is missing an active link is John Beresniewicz’s Practical Active Session History. And here are some random notes on the topic of AWR and the like:
    • the function dbms_xplan.display_awr, which displays execution plans stored in AWR, is not working correctly – it misses the predicates information even in 11.2.0.3. Major gap.
    • v$sql is too costly a way to find heavy queries due to its nature – it is a shared cursors structure that needs to be protected by latches/mutexes. v$sqlstats is more suitable for such a task, and it has a longer retention time
    • there’s a thread on SQL.ru where it is noted that some of the Top N SQL sections of the AWR report are derived from the Top SQL by Elapsed Time, CPU Time, Parse Calls, Shareable Memory and Version Count. Namely, the Top N SQL by Gets, Physical Reads and Executions sections are formed based on the previously selected Top SQLs. This could potentially lead to a situation where a SQL statement is missing from the report even if it is, for example, the absolute #1 query by buffer gets. This doesn’t sound right. I have no confirmation or denial of it, but I believe it for now.
    • there are, of course, easier ways to miss a query in the AWR report – the SQL could be loaded into and flushed from the shared pool in between snapshots
    • Igor Usoltsev has recently published a note about how AWR snapshot creation could be stalled by a slow query in 11g caused by missing fixed object statistics. Although I know there are indeed cases when fixed objects statistics help Oracle build a good plan, I’ve also seen a number of cases where gathering fixed objects statistics led to other problems. So it’s not a win-win suggestion and has to be carefully tested – and the only way to test it “right” is to do so in production or, if you can reproduce the environment and the load, on its copy.
    • If you haven’t seen Doug Burns’ post on a query to find ASH outliers (written by John Beresniewicz), here it is. I haven’t tried it myself (yet).
  • The second presentation was “My experience building a custom ETL system“. It was given by a developer who has written a custom ETL solution based on Streams & Java. He has even written a custom query language – a mix of Java & PL/SQL. Most of the material was developer-related, so I didn’t listen very carefully (especially since the Yandex DBAs sitting next to me were recovering one of the production databases at the same time – it was more interesting :) ). One issue that caught my attention was around the Streams Apply process. As I understood it, there’s a possibility of getting two changes associated with the same point in time (SCN). I’ve never worked with Streams and can’t comment on the topic; a quick MOS search didn’t reveal anything close to this.
  • Next was the most interesting presentation – “Exadata: Deployment story“. It covered a real-life Exadata implementation and the issues encountered: more than one day of deploy time, a broken power supply, slow NFS to copy data, a RAT bug, and an ORA-00600 after upgrade. Since the environment has Streams in place, they used a tricky procedure to guarantee fallback if something went wrong during the upgrade. There are also some performance metrics on average waits (which are not representative, you know) and SQL performance. A very nice presentation, plus a discussion afterwards on patching and installing additional software on the DB grid in Exadata.
  • The final presentation was “GoldenGate Overview” (it’s mostly in Russian). I can’t add much about it, except that it was a good presentation, as usual :)
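As a minimal sketch of the v$sqlstats point above (treat it as an illustration rather than a tuning script – pick whichever statistic column matters to you), a “top by buffer gets” query might look like:

```sql
-- Top 10 statements by buffer gets from v$sqlstats, which is cheaper
-- to scan than v$sql (no per-child library cache latching) and is
-- retained longer after cursors age out.
select *
from (
    select sql_id, buffer_gets, executions, elapsed_time
    from   v$sqlstats
    order  by buffer_gets desc
)
where rownum <= 10;
```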

The conference was broadcast for the first time at anymeeting.com. The live stream allowed me to relay questions from online followers to the presenters and to take several screenshots to tweet. It was a positive experience with some (I think minor) sound issues. I hope it will become a standard thing so that more people can virtually attend such events.
I’d like to say big thanks to people who organized this – it was a pleasure to be there!

PS. I was in a book store for the first time in many years and found three books on Oracle on the shelf. I don’t know if that’s right or wrong.
PPS. If you want to see St Petersburg beauties online, have a look at Yandex maps for panoramic pictures taken from the air.

Filed under: Oracle Tagged: ASH, AWR, Exadata, RuOUG

Product Review

A while ago I said that I’d be posting a product review here. Unfortunately work got in the way of me posting the review as quickly as I would have liked. The nice people at Ucertify sent me a review kit for the OCA Oracle Database 11g: Administration 1 exam. I very much like the […]

Slow querying of DBA_EXTENTS, DBA_FREE_SPACE and dropping tables

Are you finding that queries to identify free space, the size of segments or details of extents in your database are very slow? You could have a database with a very, very large number of extents. If you are on version 9 or later, I would suggest you check the extent management settings for any tablespace holding large segments (see later).

{Caveat – I’m writing this posting based on what I remember of some issues I had on two V10.2 databases 2 and 3 years ago respectively, and those systems are no longer available to me. So please treat everything I say here with some caution – I’ve not checked and verified it to the level I normally would. But I wanted to move this information from another posting I am writing on a recent experience I will verify…}

First, what this article is NOT about. It has long been claimed that creating tables with a large number of extents has a negative impact on SQL performance against that table. Except it never really did – that part was a myth. Way back, prior to Oracle 7.3, you could only create so many extents per segment, depending on block size, but that was a different issue. It used to be argued that SQL selects against tables with many extents were slower. I did some tests and they were not – unless you were in the strange situation where your extent size was less than your multi-block read count, and even then the impact was not huge: a slowdown of a few percent to maybe 25%.

However, dropping such tables, truncating such tables and querying the dictionary objects that deal with extents and free space could and still can become very slow. As we have progressed through Oracle versions 8, 9 and 10 this problem has become less common and the impact has lessened, mostly due to Locally Managed Tablespaces (LMTs) and Automatic Segment Space Management {though the latter is more in respect of concurrent DML than select}.

LMTs in particular have generally removed the issue. If you do not use LMTs and have no very, very pressing reason not to {like 3rd party support for applications}, then swap to LMTs. LMTs have been the default for user-defined tablespaces since Oracle 9 and have two options: Uniform (fixed) extent sizes, or Automatic, where Oracle uses 64KB (8 block) extents for a new segment to start, then 1MB (128 block) extents, then 8MB (1024 block) extents as the segment grows {I might be wrong on the exact size details but don’t worry about it, I certainly don’t}. You can check the settings for tablespaces as demonstrated below. I create two tablespaces, one with uniform extent sizes and one with auto-managed extent sizes, and check the relevant information (this is on 11.2.0.3):

create tablespace mdw_uni_1m
datafile 'C:\APP\MARTIN\ORADATA\MDW1123\MDW_UNI_1M'
size 100m
extent management local uniform size 1M;

create tablespace mdw_auto
datafile 'C:\APP\MARTIN\ORADATA\MDW1123\MDW_AUTO'
size 100m
extent management local autoallocate;

select tablespace_name,initial_extent,next_extent
      ,extent_management,min_extlen,allocation_type
from dba_tablespaces where tablespace_name like 'MDW%';

TABLESPACE_NAME    INITIAL_EXTENT NEXT_EXTENT EXTENT_MAN MIN_EXTLEN ALLOCATIO
------------------ -------------- ----------- ---------- ---------- ---------
MDW_UNI_1M                1048576     1048576 LOCAL         1048576 UNIFORM
MDW_AUTO                    65536             LOCAL           65536 SYSTEM

As you can see, tablespace MDW_UNI_1M uses uniform extents of 1M and MDW_AUTO has system managed allocation and starts with 64K extents – 8 blocks with my 8k block size.

As a quick demo, I’ll just create and populate two simple tables and see what extent sizes are created for them:

create table mdw_big
tablespace mdw_auto
as select rownum id,lpad('A',1000,'A') pad
from dual connect by level <10000;

create table mdw_big_uni
tablespace mdw_uni_1m
as select rownum id,lpad('A',1000,'A') pad
from dual connect by level <10000;

select owner,segment_name,blocks,count(*)
from dba_extents
where segment_name like 'MDW%'
group by owner,segment_name,blocks;

OWNER           SEGMENT_NAME        BLOCKS   COUNT(*)
--------------- --------------- ---------- ----------
ERIC            MDW_BIG_UNI            128         12
ERIC            MDW_BIG                  8         16
ERIC            MDW_BIG                128         11

3 rows selected.

So, how do issues with large numbers of extents still arise on modern Oracle systems? Well, the two situations I’ve seen on Oracle 10 had the same cause {and, as a teaser for later this week, I’ve seen a variation of this issue on a nice, shiny Exadata X2-2 Oracle 11.2.0.3 box :-) You can read about that here}. What alerted me was slow performance querying the data dictionary, in particular with my scripts for checking free space, the size of segments and how many extents they consisted of.

If you create a tablespace with fixed extent sizes of, say, 64KB and then create a table in there that is 100GB in size, you will get an awful lot of extents {over 1.6 million of them}. Now make it worse and have a partitioned table that ends up being several hundreds of GB in size with all those segments in that tablespace (or similarly defined tablespaces).

Since I hit the problem twice myself, I’ve chatted to others who have had the same issue. The above usually happens because of a mistake. The tablespace(s) in question are set up to hold small tables and then get used for large tables, either through a mistake in stating the exact tablespace to use or by having the segments default to your default tablespace – which just happens to be one with fixed but small extent sizes.

The end result is a massive number of small extents in these tablespaces, usually with extents for different objects mixed in. Some dictionary queries slow down and, in particular, anything to do with looking at extents. At one site, I was trying to use my own code, which replaced the standard automated job, to gather statistics on tables. It’s fairly “smart” code and chooses a sample size based on the size of the segments. Only, the data dictionary was performing so slowly that the check on segment size was taking over 5 seconds – longer than some of the stats gathers themselves.
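A quick way to spot this situation (a sketch – the 10,000 threshold is an arbitrary number I’ve picked for illustration) is to look for segments with pathological extent counts:

```sql
-- Segments with a suspiciously high number of extents.
-- dba_segments.extents is far cheaper than counting rows in dba_extents.
select owner, segment_name, segment_type, tablespace_name, extents
from   dba_segments
where  extents > 10000
order  by extents desc;
```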

You can logically understand why dropping or truncating such a table is slow: Oracle has to sort out all those extents and remove the information from the data dictionary. This is not helped by the fact that the relevant part of the data dictionary is itself being slowed down by all those pesky records…

You MAY be able to get some relief from this situation by gathering fixed object statistics. I did so at one site, where the queries against free space and segment size sped up by around 80%. I have no demonstrated proof of this, it is just what I saw in one situation, so feel free to try it but don’t sue me if it does not help. Also, it took over 3 hours to gather the fixed object stats, and you only do this sort of thing, untested, on a production system if you are already in a bad place.

{update – I just tested this on a private 11.2 db that was taking 10.3 seconds to count all extents, all 12,742 of them. Gathering fixed object stats made no difference at all.}
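For reference, gathering fixed object statistics is a single call (with all the caveats above – test it before letting it loose on production):

```sql
-- Gathers statistics on the X$ fixed tables that underlie the
-- dictionary and dynamic performance views. Can take hours on a busy system.
exec dbms_stats.gather_fixed_objects_stats;
```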

However, the real answer is to laboriously rebuild those segments in tablespaces with correctly specified uniform extent sizes. Which we did, over several weeks, and it made a difference.

If I were doing this task today, and could get the outage to do it, I would create COPIES of those segments that were in the wrong tablespaces, re-name them and re-apply any constraints, move the other, smaller tables and indexes to a new tablespace – and then drop the old tablespaces including contents. Why? Because dropping a table with lots and lots of small extents seemed to take a very long time (many minutes per partition, and we had thousands of them). Again, my proof is lost in the mists of time, but that’s what I would aim to do.
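A sketch of that approach (the object and tablespace names are made up for illustration; in real life you would also deal with indexes, grants, triggers and statistics):

```sql
-- Rebuild a badly-extented table into a tablespace with sensible extents.
create table big_tab_new
tablespace good_ts            -- uniform 1M (or larger) extents
as select * from big_tab;

-- re-apply constraints and indexes here, then swap the names over
rename big_tab to big_tab_old;
rename big_tab_new to big_tab;

-- finally, remove everything in the old tablespace in one operation
drop tablespace bad_ts including contents and datafiles;
```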

Introducing the Hardline According to Tim Hall…

I’ve made quite a big decision today. I’ve decided that I will no longer waste my time answering crap questions on my forum. I don’t mean I will not answer any questions. Just that I will not answer questions where the poster has put little to no effort into the question. The sort of things that drive me mad are:

  • No indentation in code.
  • Refusing to use the CODE tags around formatted text, thereby making it unreadable.
  • Refusing to provide CREATE TABLE and INSERT statements to set up a test case. I don’t care that I could knock up the test case in 5 minutes, I shouldn’t have to. You are asking for help. You are not paying me to help you. You should do everything in your power to minimize the time I have to spend on your problem.
  • Sending me whole scripts or packages when you already know the issue is focused around a single query. If you can’t be bothered to strip that query out and send it to me as part of the test case, I can’t be bothered to answer your question.
  • Ignoring my requests for specific information.
  • People who post things like “URGENT”. This implies your issue is more important than anyone else’s, or indeed my time. That’s rather rude. If you need an SLA, you should pay someone for support. What you consider urgent and what I consider urgent may be two different things. I had a guy post a question on a Friday evening with “URGENT” plastered all over it. I felt charitable so I took the time out to help. A few minutes later I got a reply saying, “Thanks for the reply. I’ll check it out on Monday.” Are you kidding me? It’s so urgent it can wait until Monday?
  • People who post, “Give me a detailed answer” or “Give me a step-by-step procedure”. I know you don’t want an RTFM, but those sort of sentences make my blood boil.

The list could go on, but you get the idea.

I’ve tried to be tougher a few times, then felt guilty and reverted. I’ve also considered closing the forum a number of times and hanging out on the OTN forums instead. That way I could be more selective, helping the people that actually take the time to post reasonably structured questions.

For the most part I really enjoy answering questions and helping people, but the lack of effort on some people’s part gets on my tit end.

There is a sticky post at the top of every forum describing how you should ask a question in the forum. From now on, people who don’t follow the rules get ignored or their accounts deleted. I simply don’t have the will to carry on trying to help people who can’t even be bothered to write a reasonable question and then ignore my calls for clarification.

If you are one of the people in question and you think I’m being unreasonable, then I would politely suggest that you jog on!

Cheers

Tim…




Subquery Factoring

Here’s an interesting little conundrum about subquery factoring that hasn’t changed in the recent (11.2.0.3) patch for subquery factoring. It came to me from Jared Still (a fellow member of Oak Table Network) shortly after I’d made some comments about the patch. It’s an example based on the scott/tiger schema – which I’ve extracted from the script $ORACLE_HOME/rdbms/admin/utlsampl.sql (though the relevant scripts may be demobld.sql or scott.sql, depending on version).

As usual I’ve used an 8KB block size, LMT with uniform 1MB extents, and no ASSM to hold the data. I won’t reproduce the code to generate the schema, just the versions of the query with, then without, subquery factoring:

with e as (
	select
		/*+ gather_plan_statistics */
		deptno
	from	emp
	order by
		deptno
)
select
	deptno, count(*)
from
	e
group by deptno
;

select   deptno, count(*)
from (
	select
		/*+ gather_plan_statistics */
		deptno
	from	emp
	order by
		deptno
) e
group by deptno
;

I have to say that I’m not entirely sure what Oracle should do with these queries – but I do think it should end up with the same result in both cases. Here are the two sets of results and their execution plans (in the same order as I listed the queries above):

    DEPTNO   COUNT(*)
---------- ----------
        10          3
        20          5
        30          6

3 rows selected.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID  fk4xrmz1wpcb9, child number 0
-------------------------------------
with e as (  select   /*+ gather_plan_statistics */   deptno  from emp
order by   deptno ) select  deptno, count(*) from  e group by deptno

Plan hash value: 1536940522

-------------------------------------
| Id  | Operation            | Name |
-------------------------------------
|   0 | SELECT STATEMENT     |      |
|   1 |  SORT GROUP BY NOSORT|      |
|   2 |   VIEW               |      |
|   3 |    SORT ORDER BY     |      |
|   4 |     TABLE ACCESS FULL| EMP  |
-------------------------------------

    DEPTNO   COUNT(*)
---------- ----------
        30          6
        20          5
        10          3

3 rows selected.

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID  17m8h7wgmxvr0, child number 0
-------------------------------------
select   deptno, count(*) from (  select   /*+ gather_plan_statistics
*/   deptno  from emp  order by   deptno ) e group by deptno

Plan hash value: 4067220884

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|   1 |  HASH GROUP BY     |      |
|   2 |   TABLE ACCESS FULL| EMP  |
-----------------------------------

As you can see, with the inline view Oracle has managed to eliminate the order by clause; but when we use subquery factoring Oracle inlines the subquery as a non-mergeable view and preserves the order by, then recognises that it can achieve the aggregation through a “sort group by” on data that doesn’t need any more sorting.

Personally I think that the original authors of this SQL have got lucky – so far. Neither version of the statement, as written, can guarantee any order to the data because the final operator is not an order by. It is simply a limitation in the current optimizer code that the order by clause hasn’t been eliminated from the factored subquery. At the moment, then, this limitation means that this third party application is getting the expected results by accident – and the next upgrade might make the accident stop happening.

There is a fairly simple workaround to this type of problem – if you write the query with the inline view but add the hint /*+ no_eliminate_oby */ to the inline view, you can capture the baseline for the view and attach it to the version of the query with the inline view but without the hint. (See Fake Baselines for one possible method of doing this capture.)
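To illustrate, the hinted version of the inline-view query would look something like this (the no_eliminate_oby hint is applied to the inline view; the baseline capture itself follows the Fake Baselines method mentioned above):

```sql
select  deptno, count(*)
from (
        select  /*+ no_eliminate_oby */
                deptno
        from    emp
        order by
                deptno
) e
group by deptno
;
```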

Temporary tablespaces in RAC

Temporary tablespaces are shared objects: they are associated with a user or with the whole database (as the default temporary tablespace). So, in RAC, temporary tablespaces are shared between the instances. Many temporary tablespaces can be created in a database, but all of them are shared between the instances. Hence, temporary tablespaces must be allocated on shared storage or in ASM. In this blog entry, we will explore space allocation in temporary tablespaces in RAC.

In contrast, UNDO tablespaces are owned by an instance, and undo for all transactions from that instance is allocated exclusively in that UNDO tablespace. Remember that other instances can read blocks from a remote undo tablespace, so undo tablespaces must also be allocated on shared storage or in ASM.

Space allocation in TEMP tablespace

TEMP tablespaces are divided into extents (in 11.2 the extent size is 1MB; I am not sure whether the size of an extent is controllable or not). These extent maps are cached in the local SGA, essentially soft-reserving those extents for the use of sessions connecting to that instance. Note, though, that extents in a temporary tablespace are not cached at instance startup; instead the instance caches extents as the need arises. We will explore this with a small example:

This database has two instances and a TEMP tablespace. TEMP tablespace has two temp files, 300M each.


Listing 1-1: dba_temp_files

  1* select file_name, bytes/1024/1024 sz_in_mb from dba_temp_files
SYS@solrac1:1>/

FILE_NAME                                                      SZ_IN_MB
------------------------------------------------------------ ----------
+DATA/solrac/tempfile/temp.266.731449235                            300
+DATA/solrac/tempfile/temp.448.775136163                            300

Initially, no extents were cached and none were in use, as shown in the output of the gv$temp_extent_pool view in Listing 1-2.

Listing 1-2: Initial view of temp extents

select inst_id, file_id, extents_cached, extents_used from gv$temp_extent_pool order by 1,2;

   INST_ID    FILE_ID EXTENTS_CACHED EXTENTS_USED
---------- ---------- -------------- ------------
         1          1              0            0
         1          2              0            0
         2          1              0            0
         2          2              0            0

We are ready to start a test case

Listing 1-3: Script in execution

select inst_id, file_id, extents_cached, extents_used from gv$temp_extent_pool order by 1,2;
  INST_ID    FILE_ID EXTENTS_CACHED EXTENTS_USED
---------- ---------- -------------- ------------
         1          1              0            0
         1          2              0            0
         2          1             22           22
         2          2             23           23
...
/
   INST_ID    FILE_ID EXTENTS_CACHED EXTENTS_USED
---------- ---------- -------------- ------------
         1          1              0            0
         1          2              0            0
         2          1            108          108
         2          2            111          111

I started a small SQL script that joins multiple tables with hash joins so as to induce disk-based sorting. After starting the SQL script in instance 2, you can see that extents are cached and used in instance 2, as shown in Listing 1-3. Initially, 45 extents were in use; a few seconds later, temp tablespace usage had grown to 219 extents.

Listing 1-4: script completion

  INST_ID    FILE_ID EXTENTS_CACHED EXTENTS_USED
---------- ---------- -------------- ------------
         1          1              0            0
         1          2              0            0
         2          1            163            0
         2          2            166            0

After the completion of the script, as shown in Listing 1-4, the extents_used column is set to 0, but extents_cached is still at the maximum usage level (329 extents). This means that extents are cached (soft reserved) by an instance and not released (until another instance asks for them, as we will see later).

You should also note that the extents are spread equally between the two files in the temporary tablespace. If you have more files in the temporary tablespace, the extents will be allocated uniformly across all those temp files.

Space reallocation

Even if the cached extents are free, they are not immediately available for use in other instances. An instance must request that the owning instance uncache the extents, and only then do those extents become available for use in the requesting instance. We will demonstrate this with the same test case, except that this time we will execute it in instance 1.

Listing 1-5: script in instance #1 execution

  INST_ID    FILE_ID EXTENTS_CACHED EXTENTS_USED
---------- ---------- -------------- ------------
         1          1             42           42
         1          2             42           42
         2          1            163            0
         2          2            166            0

At the start of the SQL execution, instance 1 started to reserve extents by caching them. My session was using those extents, as visible from gv$temp_extent_pool. The number of extents used by instance #1 was slowly growing. See Listing 1-5.

Listing 1-6: instance #1 stole the extents from instance #2

   INST_ID    FILE_ID EXTENTS_CACHED EXTENTS_USED
---------- ---------- -------------- ------------
         1          1            195           71
         1          2            133          116
         2          1             63            0 <-- note here
         2          2            166            0

It gets interesting. Notice that instance 2 still had 329 extents reserved in Listing 1-5. Since my SQL script needs 329MB of space in the temp tablespace, instance 1 had to steal space from instance 2.

In Listing 1-6, instance 1 needed more extents, and so instance 2 uncached 100 extents: the extents_cached column went down from 163 to 63 (third row in the output above). Essentially, instance 1 requested that instance 2 uncache the extents, and instance 2 obliged, uncaching 100 of them. Prior to 11g, extents were uncached one extent per request. From 11g onwards, 100 extents are released for a single request, and all 100 are acquired by the requesting instance. Instance 1 acquired those 100 extents, cached them, and the session continued to use those temp extents.

Listing 1-7: script completion and node #1 has more extents cached.

  INST_ID    FILE_ID EXTENTS_CACHED EXTENTS_USED
---------- ---------- -------------- ------------
         1          1            195            0
         1          2            133            0
         2          1             63            0
         2          2            166            0

After the completion of the script execution, instance 1 did not release the extents. Cached extents are not released (they remain soft reserved) until another instance asks for them to be uncached.

I also enabled SQL trace in my session on instance 1 while executing the script. The SQL trace file spills out the details of the un-reserving of these extents.

Listing 1-8: SQL Trace
...
#1: nam='enq: TS - contention' ela= 4172867 name|mode=1414725636 tablespace ID=3 dba=2 obj#=0 tim=6322835898
#2: nam='enq: SS - contention' ela= 608 name|mode=1397948417 tablespace #=3 dba=2 obj#=0 tim=6322837101
#3: nam='enq: SS - contention' ela= 414 name|mode=1397948422 tablespace #=3 dba=2 obj#=0 tim=6322837710
#4: nam='DFS lock handle' ela= 389 type|mode=1128857605 id1=14 id2=1 obj#=0 tim=6322838264
#5: nam='DFS lock handle' ela= 395 type|mode=1128857605 id1=14 id2=3 obj#=0 tim=6322838788
#6: nam='DFS lock handle' ela= 260414 type|mode=1128857605 id1=14 id2=2 obj#=0 tim=6323099335
...

Line #1 above shows that a tablespace-level lock (TS enqueue) is taken on the TEMP tablespace (ID=3 is the ts# column in the sys.ts$ table). Then SS locks were acquired on that tablespace, first with mode=1 and then with mode=6 (lines #2 and #3). In lines #4 to #6, a Cross Invocation Call (CIC) was used to ask the remote SMON process to un-reserve the cached extents, using CI type locks and the DFS lock handle mechanism with lock types CI-14-1, CI-14-2, and CI-14-3.

Listing 1-9: Enqueue type

select chr(bitand(&&p1,-16777216)/16777215) || chr(bitand(&&p1,16711680)/65535) type,
mod(&&p1, 16) md from dual;
Enter value for p1: 1397948422

TY         MD
-- ----------
SS          6

Enter value for p1: 1128857605
TY         MD
-- ----------
CI          5

From Listing 1-8, approximately 4.5 seconds were spent moving the cached extents from one instance to the other. Prior to 11g, this test case would have run much longer, since extents were uncached one extent per request. Hundreds of such requests would trigger a tsunami of SS and CI enqueue requests, leading to massive application performance issues. In 11g, Oracle Development resolved this issue by uncaching 100 extents per request.

Important points to note

  1. As you can see, extents are allocated from all temporary files uniformly. There are also changes to the file header blocks during this operation. This is one of the reasons to create many temp files in RAC. The recommendation is to create as many temp files as there are instances. If you have 24 nodes in your RAC cluster then, yes, that implies you should add 24 temp files to the TEMP tablespace.
  2. As we saw in the locking output from our test case, having more temporary tablespaces might help alleviate SS enqueue contention, since SS locks are at the tablespace level – essentially, more temporary tablespaces means more distinct SS enqueues. But you may simply move the contention from SS locks to ‘DFS lock handle’ waits, as the Cross Invocation Call is one per instance for the extent un-caching operation.
  3. Temporary tablespace groups are of no use here, since the contention will be on the Cross Invocation Call. In fact, there is a potential for temporary tablespace groups to cause more issues, since the free space in one temp tablespace cannot be reassigned to another temp tablespace dynamically, even if they are in the same tablespace group. In theory, it is possible to have more SS and CI locking contention with temp tablespace groups.
  4. Probably a good approach is to assign different temporary tablespaces to OLTP users and DSS users and to affinitize the workload to specific instances.
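Point #1 above, in SQL form (the file sizes and the ASM disk group name are illustrative; match the file count to your instance count):

```sql
-- For a 4-node cluster: one tempfile per instance in the TEMP tablespace.
-- Repeat the statement once per node; OMF generates distinct file names.
alter tablespace temp add tempfile '+DATA' size 10g;
alter tablespace temp add tempfile '+DATA' size 10g;
alter tablespace temp add tempfile '+DATA' size 10g;
alter tablespace temp add tempfile '+DATA' size 10g;
```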

Update 1: Remember that you need to understand your application workload before following this advice.

Next Public Appearance – Scottish SIG on 29th Feb

Who’s up for a beer or whiskey in Edinburgh on the evening of 28th Feb?

I’ve been promising myself I’d do the Scottish SIG for three or four years but life has always conspired to stop me. However, at last I am going to manage it this year.

The meeting is on the 29th of February at the Oracle {was Sun} office in Linlithgow. You can see the schedule and details here. As ever, it is being chaired by Thomas Presslie, though I {and, I suspect, Mr Hasler} will be hoping he is not forcing drams of whiskey on people before 10am, as he did at the last UKOUG conference…

I’m presenting on Index Organised Tables again, following up on the series of posts I did {and have still to finish}. As well as myself, there is Tony Hasler talking about stabilising statistics {one of the keys to stable, and thus acceptable, performance, from a very knowledgeable man}, a presentation by Wayne Lewis on Unbreakable Enterprise Kernel 2 {which I understand is Oracle Linux with the extra bits Larry wants in there before they have gone through to the official Open Source release} and Harry Hall talking about all the new stuff in OEM 12C. If he says Cloud too often I might lob something heavy at him :-) {nothing personal Harry, I’m just tired of the C word in connection with Oracle already}. Additionally, Gordon Wilkie will be giving an Oracle Update.

Part of the reason I want to do the Scottish SIG is that I really like Edinburgh {and Scotland in general – wonderful geography}. My original intention was to take my wife up there and make the trip into a short break – but she has to go to the US that week and I have a new client that needs my time, so it will be a dash up there the evening before and back in the afternoon.

So, is anyone around in Edinburgh from late evening on the 28th of Feb and fancies showing me one or two nice pubs?

Tuning is in the eye of the beholder… Memory is memory right?

It is human nature to draw from experience to make sense of our surroundings.  This holds true in life and in performance tuning.   A veteran systems administrator will typically tune a system differently from an Oracle DBA.  This is fine, but often what is obvious to one is not to the other.  It is sometimes necessary to take a step back and tune from another perspective.

I have recently run across a few cases where a customer was tuning “sorts” in the database by adding memory. Regardless of your perspective, everyone knows memory is faster than disk, and the goal of any good tuner is to do as much in memory as possible.   So, when the systems administrator noticed that the “TEMP” disks for Oracle were doing a tremendous amount of IO, the answer was obvious, right?

RamDisk to the rescue

To solve this problem, the savvy systems administrator added a RAM disk to the database.  Since it was only for “TEMP” space, this seemed reasonable.

ramdiskadm -a oratmp1 1024m
/dev/ramdisk/oratmp1

Indeed, user performance improved.  There are some minor issues around recovery after a system reboot or failure that are annoying, but they are easily addressed with startup scripts.  So, SLAs were met and everyone was happy.  And things were fine for a few years.
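For completeness, a hedged sketch of how such a RAM disk might be wired into Oracle's TEMP tablespace on Solaris; the mount point, tablespace name and sizes are assumptions, not details from the story:

```
# Put a UFS filesystem on the ram disk created above, then mount it
newfs /dev/ramdisk/oratmp1
mount /dev/ramdisk/oratmp1 /oratmp

# Then, in SQL*Plus, point a temp tablespace at the ram-disk-backed mount:
#   CREATE TEMPORARY TABLESPACE temp_ram
#     TEMPFILE '/oratmp/temp01.dbf' SIZE 900M REUSE;
```

The startup scripts mentioned above would need to recreate the ram disk, filesystem and tempfile after every reboot, since the contents do not survive one.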

Double the HW means double the performance… right?

Fast forward a few years.  The system was upgraded to keep up with demand by doubling the amount of memory and CPU resources.  Everything should be faster, right? Well, not so fast.  This action increased the NUMA ratio of the machine.  After doubling memory and CPU, the average user response time doubled from ~1 second to 2 seconds.  Needless to say, this was not going to fly.   Escalations mounted and the pressure to resolve this problem reached a boiling point. The systems administrator contacted the Solaris support team, and some of the best kernel engineers in the business began to dig into the problem, searching for ways to make the “ramdisk” respond faster in the face of an increased NUMA ratio.

A fresh set of eyes

Since I have worked with the Solaris support engineers on anything Oracle-performance-related for many years, they asked me to take a look.  I took a peek at the system and noticed the ramdisk in use for TEMP.  To me this seemed odd, but I continued to look at SQL performance.   Things became clear once I saw that “sort_area_size” was at its default.

It turns out Oracle was attempting to do in-memory sorts, but with the default settings all users were spilling out to temp.  With hundreds of users on the system, this became a problem real fast.  I had the customer increase sort_area_size until the sorts occurred in memory, without the added overhead of spilling out to disk (albeit fast disk).  With this slight adjustment, the average user response time was better than it had ever been.
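A quick way to see whether sorts are spilling, sketched under the assumption that manual workarea sizing is in use, as in this story (the 10 MB value is illustrative):

```sql
-- Ratio of in-memory to on-disk sorts since instance startup
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('sorts (memory)', 'sorts (disk)');

-- With WORKAREA_SIZE_POLICY = MANUAL, raise the per-session sort area
ALTER SYSTEM SET sort_area_size = 10485760 SCOPE=SPFILE;
```

Note that sort_area_size only takes effect when workarea_size_policy is MANUAL; under AUTO, PGA_AGGREGATE_TARGET governs how much memory a sort may use instead.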

Lessons learned

  • Memory is memory, but how you use it makes all the difference.
  • It never hurts to broaden your perspective and get a second opinion.

Filed under: Linux, Oracle, Solaris, Storage Tagged: Oracle, ramdisk, Solaris, sort_area_size, temp, tuning

Will Enabling a 10046 Trace Make My Query Slower (or Faster)?

February 13, 2012 (Modified February 14, 2012) It seems that I often wonder about the things that I read… is it really true?  Several years ago, one of the items that I wondered about is whether or not an execution plan for a SQL statement could change magically simply because the Oracle RDBMS knows that it is [...]

Oracle Security Training in Berlin ... and more ...

I am going to be teaching my two-day Oracle security training course in Berlin on March 6th and 7th 2012 for DOAG - the German Oracle users group. You can find details of the course and also register to....[Read More]

Posted by Pete On 13/02/12 At 11:57 AM