Top 60 Oracle Blogs

Recent comments

Upgrade Whoa

(The title’s a pun, by the way – an English form of humour that is not considered good unless it’s really bad.)

Very few people try to email me or call me with private problems – which is the way it should be, and I am grateful to my audience for realizing that this blog isn’t trying to compete with AskTom – but I do get the occasional communication and sometimes it’s an interesting oddity that’s worth a little time.

Today’s blog item is one such oddity – it was a surprise, it looked like a nasty change in behaviour, and it came complete with a description of environment, and a neatly formatted, complete, demonstration. For a discussion of the problem in Spanish you can visit the blog of John Ospino Rivas, who sent me the original email and has written his own blog post on the problem.

We start with a simple table, and then query it with a ‘select for update from two different sessions:

drop table tab1 purge;

create table tab1(
	id	number,
	info	varchar2(10),
	constraint tab1_pk primary key (id)
		using index (create index idx_tab1_pk on tab1(id))

insert into tab1 values(1,'a');
insert into tab1 values(2,'a');
insert into tab1 values(3,'a');

execute dbms_stats.gather_table_stats(user,'tab1',cascade=>true)

column id new_value m_id

set autotrace on explain

select  id
from    tab1
where   id = (
            select  min(id)
            from    tab1
for update

set autotrace off

prompt	=============================================================
prompt  Now repeat the query in another session and watch it lock
prompt	And use a third session to check v$lock
prompt  Then delete here, commit and see what the second session does
prompt	=============================================================

accept X prompt 'Press return to delete and commit'

set verify on
delete from tab1 where id = &m_id;

The fact that the primary key index is created as a non-unique index isn’t a factor that affects this demonstration.

Given the query and the data in the table, you won’t be surprised by the result of the query from the first session (for convenience I’ve captured the selected value using the ‘column new_value’ option). Here’s the result of the query and its execution plan:


| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
|   0 | SELECT STATEMENT             |             |     1 |     3 |     1   (0)| 00:00:01 |
|   1 |  FOR UPDATE                  |             |       |       |            |          |
|*  2 |   INDEX RANGE SCAN           | IDX_TAB1_PK |     1 |     3 |     0   (0)| 00:00:01 |
|   3 |    SORT AGGREGATE            |             |     1 |     3 |            |          |
|   4 |     INDEX FULL SCAN (MIN/MAX)| IDX_TAB1_PK |     3 |     9 |     1   (0)| 00:00:01 |

Predicate Information (identified by operation id):
   2 - access("ID"= (SELECT MIN("ID") FROM "TAB1" "TAB1"))

At this point the program issues instructions to repeat the query from a second session, then waits for you to press Return. When you run the same query from another session it’s going to see the data in read-consistent mode and try to select and lock the row where ID = 1, so the second session is going to hang waiting for the first session to commit or rollback.

Here’s the key question: what’s the second session going to return when you allow the first session to continue, delete the row it has selected, and commit ? Here’s the answer if you’re running or (which is what I happen to have easily available):

SQL> select  id
  2  from    tab1
  3  where   id = (
  4              select  min(id)
  5              from    tab1
  6          )
  7  for update
  8  ;


1 row selected.

Now, this seems perfectly reasonable to me – especially since I’ve read Tom Kyte’s notes on “write consistency” and seen the “rollback and restart” mechanism that kicks in when updates have to deal with data that’s changed since the start of the update. Session 2 had a (select for) update, and when it finally got to a point where it could lock the data it found that the read-consistent version of the data didn’t match the current version of the data so it restarted the statement at a new SCN. At the new SCN the current highest value was 2.

Now here’s what happened when I ran the test under

SQL> select  id
  2  from    tab1
  3  where   id = (
  4              select  min(id)
  5              from    tab1
  6          )
  7  for update
  8  ;

no rows selected

The upgrade produces a different answer !

At first sight (or guess) it looks as if the query has run in two parts – the first part producing the min(id) of 1 using a read-consistent query block, with the second part then using the resulting “known value” to execute the outer select (shades of “precompute_subquery”) and restarting only the second part when it discovers that the row it has been waiting for has gone away.

It doesn’t really matter whether you think the old behaviour or the new behaviour is correct – the problem is that the behaviour has changed in a way that could silently produce unexpected results. Be careful if any of your code uses select for update with subqueries.

As a defensive measure you might want to change the code to use the serializable isolation level – that way the upgraded code will crash with Oracle error ORA-08177 instead of silently giving different answers:

SQL> alter session set isolation_level = serializable;

Session altered.

SQL> get afiedt.buf
  1  select  /*+ gather_plan_statistics */
  2          id
  3  from    tab1
  4  where   id = (
  5              select  min(id)
  6              from    tab1
  7          )
  8* for update
  9  /
from    tab1
ERROR at line 3:
ORA-08177: can't serialize access for this transaction

It might be a way of avoiding this specific problem, of course, but it’s not a frequently used feature (the first pages of hits on Google are mostly about SQL Server) so who knows what other anomalies this change in isolation level might introduce.