Oakies Blog Aggregator

Database Links and PeopleSoft

I have seen Oracle database links used in conjunction with PeopleSoft at a number of customer sites over the years. In some scenarios it is reasonable to use a database link; in others it is not.
In Oracle RDBMS, a database link (see also the Oracle Concepts Manual) specifies how a database server can create a session on another database in order to perform a remote operation. The remote database may be another Oracle database, or another type of database. The remote operation may be a query, or it may modify remote data (DML operation), in which case a two-phase commit mechanism ensures transaction integrity across the databases.
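As a minimal sketch (the link name, credentials and table are entirely made up, not taken from any real system), a database link and a remote query might look like this:

create database link hr_to_fin
  connect to fin_user identified by fin_password
  using 'FINDB';  -- TNS alias of the remote Financials database

select count(*) from some_table@hr_to_fin;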

PeopleSoft does not use database links between databases because they are a database-platform-specific technology. Instead, the PeopleSoft methodology is to replicate data between databases with an application message. This works well with small data sets, and with larger data sets that change slowly. However, there are scenarios with very large data volumes where Application Messaging will struggle to keep up, and Oracle RDBMS-specific replication technologies are an appropriate alternative.

Data can be replicated between databases with Materialised Views (formerly known as Snapshots). This is a SQL-based technology. Materialised View Logs on the source database track changes in the source table. An incremental or full refresh process is run on the target, usually as a scheduled job.
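A minimal sketch of the mechanism (object and link names here are made up):

-- on the source database: record changes to the source table
create materialized view log on src_table with primary key;

-- on the target database: an incrementally refreshable copy, read via a link
create materialized view src_table_mv
  refresh fast on demand
  as select * from src_table@src_db_link;

-- incremental ('fast') refresh, typically run as a scheduled job on the target
exec dbms_mview.refresh('SRC_TABLE_MV', 'F')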

Oracle introduced an alternative replication technology in Oracle 9i called 'Streams'. This uses supplementary redo logging on the source database which can then be applied to the target. Additional database server processes transmit, receive and apply this redo stream. This technology is aimed at large-volume replication for data warehouses. I know of one site where several hundred database tables are replicated from PeopleSoft HR to a data warehouse.

I think the clearest way to explain the good and bad use of links is by example.

At two different sites with HR and Finance, I have seen similar examples of both good and bad use of database links (one was on PeopleTools 7.0, before Application Messaging was available).

  • GL data needs to be sent from Payroll in the HR database to GL in the Financials database. The PeopleSoft-delivered mechanism was to generate and then reload an interface file. This customer replaced the interface table in HR with a view that referenced a table with the same name and structure on the Financials system via a link (a sketch of this pattern follows the list). The payroll GL extract process then inserted directly into the table on Financials. I think this is a perfectly reasonable use of a database link. It was simple. It performed better than the interface file, and would certainly have been faster than Application Messaging. 
  • However, the same customer also used the employee expenses module in Financials. This required employee personal data, so they made the PS_PERSONAL_DATA table in Financials a view of the corresponding table in HR. The result was that every time somebody opened the expenses component in Financials, the application server process referenced the database link to HR. There is a limit on the maximum number of database links that a session can concurrently open (OPEN_LINKS) and that the whole database instance can concurrently open (OPEN_LINKS_PER_INSTANCE). Both default to 4, which gives an indication of how Oracle expects you to use this feature. This system ran out of database links. No Oracle errors were generated; instead, sessions waited to be allowed to make the connection to the remote database. There are specific database link wait events that report the amount of time lost. Eventually you reach the point where Tuxedo services time out and users receive errors. I think this is an example of the wrong way to use a database link. Instead, the data should have been replicated from HR to Financials. In a modern PeopleSoft system this should be done with an application message (on the 7.0 system a Materialised View could have been used). 
  • Not only were database links used to provide data to components, but the same views, which referenced remote objects, were also used in queries and reports, resulting in complex distributed queries. When multiple remote objects are referenced in a single SQL statement, I have seen Oracle decide to copy the whole of one or more of these objects to the local temporary tablespace. This does not result in good performance. Remember also that even query-only operations via a database link create a transaction that remains open on the remote database until a commit or rollback is issued.
  • I worked on another site where Materialised Views were used to incrementally replicate data from CRM and HR to EPM databases. Processes in the EPM database then referenced the Materialised Views. In this system, there was never any question of EPM processes referencing objects via the links. The data volumes were such that Application Messaging would never have coped. In this case only the Materialised View refresh process referenced the database links, and so the DBA could manage the usage of the links. This is a system where (after upgrade to Oracle 10g) Streams could also have been used to replicate the data.
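To illustrate the first bullet: with a link like the one sketched earlier in place, the interface table in HR is replaced by a view over the corresponding table in Financials (the object name here is made up, not the delivered PeopleSoft record name):

-- in the HR database
create or replace view ps_gl_interface as
  select * from ps_gl_interface@hr_to_fin;

-- the payroll GL extract's insert into ps_gl_interface now writes its rows
-- directly into the identically named table in the Financials database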

Conclusion

My view is that database links should not be used to directly support functionality in either on-line components or reporting. When this is done in reports and queries it presents the problem of tuning distributed queries, and of having to decide whether the local or the remote database should drive the query. Instead, data should be replicated to the local database, preferably by PeopleSoft messaging, but if necessary by Oracle replication. However, I think it can be reasonable to use a database link in an occasional batch process that moves data between systems.

Multi-column joins

Consider the following scenario with four tables. Two of them represent master data, the third one uses a concatenated primary key consisting of foreign keys to the first two, and the fourth one has a foreign key to the third one.

drop table t1 cascade constraints purge;

drop table t2 cascade constraints purge;

drop table t3 cascade constraints purge;

drop table t4 cascade constraints purge;

create table t1 (
t1_id integer not null constraint pk_t1 primary key,
filler1 varchar2(40),
filler2 varchar2(40)
);

create table t2 (
t2_id integer not null constraint pk_t2 primary key,
filler1 varchar2(40),
filler2 varchar2(40)
);

create table t3 (
t1_id integer not null,
t2_id integer not null,
filler1 varchar2(40),
filler2 varchar2(40),
constraint pk_t3 primary key (t1_id, t2_id) using index (
create index pk_t3 on t3 (t1_id, t2_id)
),
constraint fk_t3_1 foreign key (t1_id) references t1 (t1_id),
constraint fk_t3_2 foreign key (t2_id) references t2 (t2_id));

create table t4 (
t4_id integer not null constraint pk_t4 primary key,
t1_id integer,
t2_id integer,
filler1 varchar2(40),
filler2 varchar2(40),
constraint t4_fk_1 foreign key (t1_id, t2_id) references t3 (t1_id, t2_id)
);

Notice that the primary key of "t3" is using a non-unique index. This is supported and can be used, for example, for deferrable constraints, or when loading data that might contain duplicates, because the constraint can then be disabled without dropping the index. This allows you simply to re-enable the constraint after cleaning up the duplicate rows, instead of having to re-create a unique index (with the risk of losing the index if anything goes wrong).
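As a sketch of that pattern with the tables above (because pk_t3 is enforced by the pre-existing non-unique index, the index survives the disable/enable cycle):

-- the foreign key on t4 depends on pk_t3, so it has to be disabled first
alter table t4 disable constraint t4_fk_1;
alter table t3 disable constraint pk_t3;

-- ... load data, then remove any duplicate (t1_id, t2_id) rows ...

-- re-enabling reuses the existing non-unique index pk_t3
alter table t3 enable constraint pk_t3;
alter table t4 enable constraint t4_fk_1;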

Now when using an uncorrelated data set for the concatenated keys, Oracle's default (join) selectivity formulas apply and the estimated cardinalities are correct. Table "t1" has 10,000 rows, "t2" 3 rows, table "t3" holds 30,000 rows combining "t1" and "t2" data. "t4" has 300,000 rows.

-- non-correlated column values
exec dbms_stats.set_table_stats(null, 't1', numrows=>10000, numblks=>100, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't2', numrows=>3, numblks=>1, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't3', numrows=>30000, numblks=>300, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't4', numrows=>300000, numblks=>3000, avgrlen=>100)

exec dbms_stats.set_column_stats(null, 't1', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't2', 't2_id', distcnt=>3, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't3', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't3', 't2_id', distcnt=>3, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't4_id', distcnt=>300000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't2_id', distcnt=>3, nullcnt=>0)

exec dbms_stats.set_index_stats(null, 'pk_t1', numdist=>10000, clstfct=>10000, indlevel=>2, numlblks=>10, numrows=>10000)

exec dbms_stats.set_index_stats(null, 'pk_t2', numdist=>3, clstfct=>3, indlevel=>1, numlblks=>1, numrows=>3)

exec dbms_stats.set_index_stats(null, 'pk_t3', numdist=>30000, clstfct=>30000, indlevel=>2, numlblks=>30, numrows=>30000)

exec dbms_stats.set_index_stats(null, 'pk_t4', numdist=>300000, clstfct=>300000, indlevel=>2, numlblks=>300, numrows=>300000)

Joining t4 to t3 results in a correct estimate of 300K rows:

select
count(*)
from
t4
, t3
where
t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | | 1456 (3)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 52 | | | |
|* 2 | HASH JOIN | | 300K| 14M| 1120K| 1456 (3)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
----------------------------------------------------------------------------------------

Things look different, however, if we have the awkward situation of correlated column values for the concatenated keys:

-- correlated column values
exec dbms_stats.set_table_stats(null, 't1', numrows=>10000, numblks=>100, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't2', numrows=>20000, numblks=>200, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't3', numrows=>30000, numblks=>300, avgrlen=>100)

exec dbms_stats.set_table_stats(null, 't4', numrows=>300000, numblks=>3000, avgrlen=>100)

exec dbms_stats.set_column_stats(null, 't1', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't2', 't2_id', distcnt=>20000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't3', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't3', 't2_id', distcnt=>20000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't4_id', distcnt=>300000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't1_id', distcnt=>10000, nullcnt=>0)

exec dbms_stats.set_column_stats(null, 't4', 't2_id', distcnt=>20000, nullcnt=>0)

exec dbms_stats.set_index_stats(null, 'pk_t1', numdist=>10000, clstfct=>10000, indlevel=>2, numlblks=>10, numrows=>10000)

exec dbms_stats.set_index_stats(null, 'pk_t2', numdist=>20000, clstfct=>20000, indlevel=>2, numlblks=>20, numrows=>20000)

exec dbms_stats.set_index_stats(null, 'pk_t3', numdist=>30000, clstfct=>30000, indlevel=>2, numlblks=>30, numrows=>30000)

exec dbms_stats.set_index_stats(null, 'pk_t4', numdist=>300000, clstfct=>300000, indlevel=>2, numlblks=>300, numrows=>300000)

Here we simulate 20,000 distinct values in one column, 10,000 distinct values in the second one, but only 30,000 distinct values for the combination of both columns. In this case Oracle's default selectivity formula underestimates the cardinality since it is assuming uncorrelated values:

select /*+ opt_param('_optimizer_join_sel_sanity_check', 'false') */
count(*)
from
t4
, t3
where
t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | | 1455 (3)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 52 | | | |
|* 2 | HASH JOIN | | 45 | 2340 | 1120K| 1455 (3)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
----------------------------------------------------------------------------------------

Oracle simply multiplies the selectivities of the two join columns and arrives at a join cardinality of 45 rows (1/20,000 * 1/10,000 * 300,000 * 30,000).
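Spelling out that parenthesis, the default calculation works out as:

join selectivity = 1/10,000 * 1/20,000 = 1/200,000,000
join cardinality = 1/200,000,000 * 30,000 * 300,000 = 45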

You'll notice that I had to use an undocumented optimizer parameter to arrive at that default selectivity. If you run an EXPLAIN PLAN for the same statement without the hint, you'll get the following estimate:

select
count(*)
from
t4
, t3
where
t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | | 1455 (3)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 52 | | | |
|* 2 | HASH JOIN | | 30000 | 1523K| 1120K| 1455 (3)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
----------------------------------------------------------------------------------------

It can be seen from a 10053 optimizer trace file that, by default, Oracle applies a "Multi-column cardinality sanity check" in cases where the calculated multi-column selectivity falls below a certain limit. It then uses the smaller selectivity available from the different 1/num_rows values of the tables/row sources involved in the join, arriving at an estimate of 30,000 rows in this particular case.
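In other words, if my reading of the trace is right, the combined column selectivity is replaced by the smaller of the 1/num_rows values:

sanity-checked selectivity = least(1/30,000, 1/300,000) = 1/300,000
join cardinality           = 1/300,000 * 30,000 * 300,000 = 30,000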

Changing the non-unique index used for the primary key on "t3" to a unique index brings another sanity check into the picture: the "concatenated index" sanity check, which uses the number of distinct keys of a unique index that corresponds exactly to the join columns used.

create table t3 (
t1_id integer not null,
t2_id integer not null,
filler1 varchar2(40),
filler2 varchar2(40),
constraint pk_t3 primary key (t1_id, t2_id) using index (
create unique index pk_t3 on t3 (t1_id, t2_id)
),
constraint fk_t3_1 foreign key (t1_id) references t1 (t1_id),
constraint fk_t3_2 foreign key (t2_id) references t2 (t2_id));

With this unique index in place Oracle uses the number of distinct keys from this index to calculate the selectivity of the join and therefore arrives at the correct cardinality again:

select
count(*)
from
t4
, t3
where
t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 52 | | 1455 (3)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 52 | | | |
|* 2 | HASH JOIN | | 300K| 14M| 1120K| 1455 (3)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 4 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
----------------------------------------------------------------------------------------
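Working backwards from the plan, the arithmetic behind the concatenated index sanity check appears to be:

join selectivity = 1 / num_distinct(pk_t3) = 1/30,000
join cardinality = 1/30,000 * 30,000 * 300,000 = 300,000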

So this is another case where the uniqueness of an index makes a significant difference for optimizer calculations.

Note that from 11g on there are more options to help the optimizer come up with a better estimate even with the non-unique index on (t3.t1_id, t3.t2_id). In particular, 11g introduced extended statistics on column groups, so we can do the following:

variable ext_name varchar2(30)

exec :ext_name := dbms_stats.create_extended_stats(null, 't3', '(t1_id, t2_id)')

exec dbms_stats.set_column_stats(null, 't3', :ext_name, distcnt=>30000, nullcnt=>0)

This allows the optimizer to derive the correct selectivity for these correlated column values from the extended statistics set.
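If you want to check what has actually been created (just a quick sanity check, nothing more), the 11g dictionary view USER_STAT_EXTENSIONS should show the new column group:

select extension_name, extension
  from user_stat_extensions
 where table_name = 'T3';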

Another option in 11g is adding an index on (t4.t1_id, t4.t2_id), like this:

create index ix_t4 on t4 (t1_id, t2_id);

exec dbms_stats.set_index_stats(null, 'ix_t4', numdist=>30000, clstfct=>30000, indlevel=>2, numlblks=>30, numrows=>300000)

With these two non-unique indexes in place, Oracle 11g again comes up with the correct join cardinality of 300K. Notice that this doesn't work in pre-11g versions, which require the index on t3 to be unique to take advantage of the "concatenated index" sanity check.

Having demonstrated all these sanity checks available for multi-column joins (the general multi-column and the concatenated index sanity check), let's see what happens when joining three tables:

select
count(*)
from
t1
, t3
, t4
where
t4.t1_id = t1.t1_id
and t3.t1_id = t4.t1_id
and t3.t2_id = t4.t2_id;

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | | 1468 (4)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 65 | | | |
|* 2 | HASH JOIN | | 300K| 18M| | 1468 (4)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN | PK_T1 | 10000 | 126K| | 4 (0)| 00:00:01 |
|* 4 | HASH JOIN | | 300K| 14M| 1120K| 1455 (3)| 00:00:18 |
| 5 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
-----------------------------------------------------------------------------------------

All I've done is to add "t1", in this case joining to "t4" on "t1_id". Thanks to the concatenated index sanity check (or the extended column group statistics in 11g) the calculated join cardinality is still 300K.

Now what happens if one decides to join "t3" to "t1" on "t1_id" instead of "t4.t1_id"? From a logical point of view this should lead to exactly the same result, since we can deduce that if "t4.t1_id" = "t1.t1_id" and "t3.t1_id = t1.t1_id" then "t3.t1_id = t4.t1_id".

select
count(*)
from
t1
, t3
, t4
where
t4.t1_id = t1.t1_id
and t3.t1_id = t1.t1_id
and t3.t2_id = t4.t2_id;

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 65 | | 1475 (4)| 00:00:18 |
| 1 | SORT AGGREGATE | | 1 | 65 | | | |
|* 2 | HASH JOIN | | 45 | 2925 | | 1475 (4)| 00:00:18 |
| 3 | INDEX FAST FULL SCAN | PK_T1 | 10000 | 126K| | 4 (0)| 00:00:01 |
|* 4 | HASH JOIN | | 450K| 22M| 1120K| 1459 (3)| 00:00:18 |
| 5 | INDEX FAST FULL SCAN| PK_T3 | 30000 | 761K| | 11 (10)| 00:00:01 |
| 6 | TABLE ACCESS FULL | T4 | 300K| 7617K| | 833 (3)| 00:00:10 |
-----------------------------------------------------------------------------------------

The result is astonishing. By making this simple change we have effectively disabled all the available sanity checks and arrive at a result based on the default, uncorrelated selectivity.
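My reading of the numbers in this plan: the hash join between t3 and t4 now only has the t2_id predicate available, giving 30,000 * 300,000 * 1/20,000 = 450,000 rows (plan line 4), and the final join to t1 then applies the two t1_id predicates, giving 450,000 * 10,000 * 1/10,000 * 1/10,000 = 45 rows (plan line 2) - the same uncorrelated default we saw earlier.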

So whenever you perform multi-column joins and the column data is correlated, be very careful how you join the tables - it might make a significant difference to the calculations of the optimizer.

UKOUG 2009

I'll be giving two one hour presentations at the upcoming UKOUG conference 2009 at Birmingham (30th November - 2nd December):

1. Monday, 30th November, 16:00-17:00: "CBO fundamentals: Understanding the different modes of System Statistics"

2. Wednesday, 2nd December, 11:05-12:05: "Everything you always wanted to know about FIRST_ROWS_N but were afraid to ask"

Both presentations cover fundamental functionality of the Cost-Based Optimizer that will help you to better understand why and how the optimizer comes up with certain execution plans, and in particular how the underlying cost calculation works. So if you haven't got an explanation yet, I'll show you what the calculated "cost" effectively means and how to use the different optimization modes (ALL_ROWS, FIRST_ROWS, FIRST_ROWS_n) appropriately.

So if you happen to attend the conference I'm looking forward to meeting you there.

By the way, I recommend visiting this link and saving your personalised agenda to help with the room planning.

CBO: NewDensity for Frequency Histograms,11g-10.2.0.4 (densities part IV)

As we have seen in the previous posts of this series, in 11g a new figure named "NewDensity" has been introduced as a replacement for the "density" column statistic for columns on which a histogram has been collected; this change has also been backported to 10.2.0.4.
In the previous post we discussed how NewDensity influences the CBO [...]

Use NULL for unknown data...

Hah, it goes back much further than I thought... Snopes.com pointed out this morning that the use of a 'bad default value' dates back to at least 1979...

See http://tkyte.blogspot.com/2009/10/back-from-oracle-openworld.html for the original reference.

Back from Oracle OpenWorld

I am back and have been taking a set of new questions on asktom. Last week was a busy one out in California and I'm finally getting caught up on emails and questions (100% on the former, still working on the latter).

Anyway, I saw an article and it made me laugh - and sort of cry at the same time. It has to do with the use of default values. A lot of developers/DBAs have a certain fear (that is the best word I can think of to describe their attitude) of NULL - the 'unknown' value. So, instead of using NULL for an effective_end_date field (for records whose end date we don't know because they don't have one yet), or using NULL for values they simply do not know, they use some 'fake' value. This fake value is assumed to be a value that could never possibly be used.

But....

Things change over time.

I'm sure that when the developers implemented this system, using XXXXXXX for an unknown license plate value seemed 'reasonable'. I mean - who would ever ask for a vanity plate with seven X's on it?

Talk to the guy with almost $20k in fines that aren't his to see if he might know someone that might want a vanity plate with XXXXXXX on it :)

Do not fear NULL.

Understand it, but don't fear it.

http://tkyte.blogspot.com/2006/01/something-about-nothing.html
http://tkyte.blogspot.com/2006/01/mull-about-null.html

Things that make you go "hmmmmm"

I was reading a blog post entitled How Good is Good?. It was written 8 years ago by a graphic designer by the name of Stefan Sagmeister. The post was about creating designs that are meaningful and make an impact on the world (or the part of the world each design is directed toward). While the post was interesting, it was one of the comments that caught my eye. Most of the commenters were inspired by the post, but one commenter, referring to the idea of making a match between good causes and good design a priority vs the more mundane everyday marketing stuff, said:

Simple, don't go into graphic design, Choose something else. Cause if you don't do "that" work, other designers will. Do people really think that everybody enjoys their job?

It was the commenter's question that made me thoughtful: Do people really think that everybody enjoys their job?

I've always felt very fortunate to be able to do work that I really love doing. There may be times when I get frustrated with some of the particulars, but overall I know I'm doing "the thing" I'm supposed to be doing.

My choice of career path came to me in college. I started out as an accounting major but after my first computer-related course, I switched majors and knew that working in the information technology field was for me. So, I got my degree and have worked in the field ever since.

I honestly can't imagine working in a job that I didn't like. So, it was just a bit of an "ah-ha" moment to consider the idea that not everyone enjoys their job. What would that be like? What would it be like to wake up every morning and go do something that you don't enjoy for 8 or more hours? That would seem like torture to me.

But, I suppose it may be more likely that more people don't enjoy their jobs than do. Have you ever seen the TV show "Dirty Jobs"? Some of those jobs are (in my opinion) just horrid. I can't imagine having to do some of those tasks every day. But there are also a lot of jobs out there that I can't imagine doing which many people nonetheless love.

However, I think that it may be more about "doing what you gotta do" many times vs actually choosing a job you want/enjoy/love. I just find it sad, and somewhat humbling, to think about all the people who do things they don't enjoy just to make enough money to maintain a life for themselves and their families. To think of spending such a large portion of my life doing something I found no enjoyment in really makes my heart ache.

Are you doing what you really enjoy?

OpenWorld 2009 Recap

Having now had several days to recover from, and reflect on, my trip to beautiful San Francisco for Oracle OpenWorld 2009, I wanted to follow up with a blog post describing my experience. First, before anything else, I went to In-N-Out Burger where I had my usual Double-Double with fries animal style; oh, how I missed thee. Next, with the […]

Oracle Closed World - now closed

It's Friday morning and I'm on my way away from San Francisco after a splendid week of OOW, good guys, a few beers, and a lot of tech talk.

We ran OCW four times from Monday to Thursday, and we had talked some really good presenters into showing up:

Monday: Jeff Needham on processors and how Oracle runs on them. Opteron good. Nehalem good. A reporter named Kate was present in order to write about OCW. Code: 41.

Tuesday: Jonathan Lewis showing why the crowd were not experts. Ouch. Code 43.

Wednesday: Jeremiah Wilton about the Cloud, and especially the Amazon Cloud. He seems to know a good deal about Amazon. Code 24.

Thursday: Uri Shaft on counting (e.g. NDV in the optimizer) and some compression theory - and then Dan Norris & Greg Rahn on the Database Machine. Code 42.

And Kate's funny article about OCW appeared in the daily conference newspaper on Thursday. She got all the technical and non-technical stuff right - very impressive! She also gave away the secret location (Thirsty Bear on 661 Howard, upstairs), but thankfully only on the very last day of OCW :).

I truly enjoyed it, and so did several others, so we'll probably do it again next year.

Apart from that, it also appears that the guys from Miracle who were here with me (Morten Tangaa, Jesper Haure, Kaj Christensen, Claus Sørensen) got good things out of the conference.

While I remember it: Thank you to Victoria Lira, Lillian Buziak, and Justin Kestelyn for allocating a reporter for OCW, for managing the whole ACE Director thing, and many other favors that make the conference work.

OOW09 and my favorite 11gR2 feature

I decided to wait until the end of Oracle Open World 2009 before I posted anything. So many people have been tweeting and blogging and shooting video of this year's event, I figured I'd put in my two cents worth at the end.

Man...it's been a l-o-n-g week. I don't know if it's my increasing age or what, but being in a herd of this size for an entire week has left me feeling somewhat bruised (both physically and mentally). I started at 9am on Sunday morning by attending Tom Kyte's keynote for Oracle Develop and have made it through Thursday - which seems like a minor miracle somehow. But I have survived!

So much goes on here, it's hard to keep moving through it given the mass of people and multiple session locations (not to mention the after hours opportunities for fun and frolic). Add a bit of rain (some kind of monsoon as it seemed to me) and, at times, it was like being in an undersized kennel filled with wet dogs. :)

The sessions I attended, and there were many, were all quite excellent. I sat in on most all of the 11gR2 sessions and am really amazed at many of the new features and options 11gR2 offers. My favorite is edition-based redefinition (also referred to as "online application upgrades"). This feature isn't getting nearly as much press time as some of the other more sexy things (for instance, you couldn't move 10 feet without seeing or hearing about Exadata) but as far as I'm concerned, editioning is a hugely beneficial new feature.

By the way, it is a feature, not an option. Wow...for once it's something you don't have to pay extra for! I loved that when Tom Kyte did his "10 - no 11 - top things about 11gR2" session, that he gave editioning two spots on the list. So what is editioning?

1. You can create a new edition to contain any new code changes.
2. Data changes can be made (add new columns or new tables) and the new code can safely write to the new stuff without any of it being seen by the old edition.
3. Different projections of tables are exposed via editioning views so that each edition is allowed to see only its own columns.
4. If the old edition makes data changes, those changes are propagated into the new edition's columns using cross-edition triggers (or vice-versa).

The bottom line is that now, with 11gR2, if you want to make changes to procedures, packages, views, synonyms, etc., you can do it using editioning and maintain both the old (original) version and the new one. It also means a lot less headache. Have you ever tried to CREATE OR REPLACE PACKAGE in production while the instance is in use? What happens? Well, if the package is in use, Oracle has to wait until it is no longer in use in order to replace it. So, Oracle waits. But if new requests to use the package come in during this time, those sessions must wait as well. And that doesn't even begin to touch what happens if the new package invalidates other objects that then need recompiling too. It can be a pretty long, arduous and resource-intensive process to get something seemingly so simple done.

With editioning, a new edition is created and the modified package is compiled into that edition. No muss, no fuss...and no waiting! This is just a bit about this great feature, but I recommend you look on OTN for more info and several whitepapers Oracle has available on the topic.
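A rough sketch of the mechanics (names are invented, and the privilege grants and cutover details are omitted), just to show the shape of it:

-- one-off: allow the application schema to use editions
alter user app_owner enable editions;

-- create a new edition for the next release
create edition release_2 as child of ora$base;

-- compile the changed code into the new edition only
alter session set edition = release_2;

create or replace package pkg_payroll as
  procedure calc_pay; -- new version, invisible to sessions still on ora$base
end pkg_payroll;
/

-- when ready, point new sessions at the new edition
-- alter database default edition = release_2;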

There's so much more I could blabber on about, but I'll save other thoughts for other posts. All in all, it's been a good week, my presentation on Tuesday was well attended and I got great feedback, and I've got lots of great material to take back home for further review and sharing with colleagues.