Search

OakieTags

Who's online

There are currently 0 users and 30 guests online.

Recent comments

Affiliations

Running Total

Here’s a question on OTN from a  SQL Server user that should prompt a few interesting ideas. Re-arranged and paraphrased it goes something like this:

In SQL Server I can write the following code:

DECLARE @Counter INT
SELECT @Counter = 0

UPDATE TempDB.dbo.TransactionDetail
SET @Counter = AccountRunningCount = @Counter + 1
FROM TempDB.dbo.TransactionDetail WITH (TABLOCKX)

What I want to do is more like this:

DECLARE @Total INT = 0
UPDATE StringOutput
set @Total = SumOfLength = @Total + ColLength
OPTION (MAXDOP 1)

How do I do something similar in Oracle ?

I suspect that there may be an error in the second piece of code if it is to run on SQL Server since it doesn’t have a “FROM StringOutput” to match the “FROM TempDB.dbo.TransactionDetail” that appears in the first piece – but perhaps it’s implied by the reference to StringOutput in the UPDATE clause.

To an Oracle user, the set clause is particularly strange – but this is a feature of how T-SQL (roughly the equivalent of  PL/SQL) can work for SQL Server.  This set clause allows each row to modify, and be modified by, a variable. So, for each row, this double assignment first sets the column sumoflength to the value of the variable total plus the value of the column collength,   then it sets the variable total to this new value of sumoflength. The net effect is that the column sumoflength acquires a running total of collength as you read through the table. (I am basing my explanation on “reasonable guesswork” combined with the title of the thread which was:  “Update Failed for Sum of previous row and current row”.

So how do you do this in Oracle ? The simple answer is “you shouldn’t, but if you do the strategy has to be completely different because the tools are different”.

The reason for the “shouldn’t” is that there is no implicit order in a select statement unless it includes an “order by” clause – so any thoughts of “previous” or “next” for data in a table are erroneous. Assume then that we define an order through a unique, non-null column – we would then look at the SQL Server code and wonder whether its internal implementation was effectively a row by row mechanism, or whether there was some cunning array based processing taking place (perhaps like the trick that PL/SQL uses to turn an implicit cursor for loop into array fetches of 100 rows at a time).

My first thoughts about the task (assuming we actually do want to do it) point me to analytic functions as a way of producing the data – but I fear I may have to use PL/SQL because of the strong flavour of row-based activity that the code suggests. Let’s start with the first stab at an analtyic approach:

create table t1
as
with generator as (
	select	--+ materialize
		rownum id 
	from dual 
	connect by 
		level <= 1e4
)
select
	rownum			id,
	rownum			n1,
	cast(null as number(9))	running_tot
from
	generator	v1
where
	rownum <= 1e3
;

alter table t1 add constraint t1_pk primary key(id);

-- collect stats

This gives us a primary key for ordering, a column to generate a running total for, and a column to store the running total – currently null. Let’s write a query that shows us the data we ultimately want to store in the table:

select
	id,
	n1,
--	running_tot,
	sum(n1) over (order by id)	running_tot
from
	t1
;

        ID         N1 RUNNING_TOT
---------- ---------- -----------
         1          1           1
         2          2           3
         3          3           6
         4          4          10
         5          5          15
         6          6          21
         7          7          28
         8          8          36
         9          9          45
        10         10          55

Since we can calculate the right information in a select statement it would be nice if we could turn that statement into an updatable view; unfortunatley it doesn’t work.

update
	(
	select
		id,
		n1,
		running_tot,
		sum(n1) over (order by id)	running_tot_calc
	from
		t1
	)
set
	running_tot = running_tot_calc
;

        (
        *
ERROR at line 2:
ORA-01732: data manipulation operation not legal on this view

That doesn’t seem entirely reasonable, but we can take the code and wrap it in pl/sql as a cursor for loop and (left as an exercise to the reader) we could even make the code more complex to do array updates rather than using the simplest code to do row by row updates:

declare
	cursor c1 is
		select
			id,
			n1,
			running_tot,
			sum(n1) over (order by id)	running_tot_calc
		from
			t1
		for update of 
			running_tot
	;
begin
	for r1 in c1 loop
		update t1 set running_tot = r1.running_tot_calc where current of c1;
	end loop;
	commit;
end;
/

For the small data set I had generated, this runs pretty quickly, and gets the right answer. I find it slightly surprising, though, that I’m allowed to use the “select for update”/”current of” approach when a straight update isn’t legal.

Of course we might consider going for the full, non-procedural, array update – and luckily we’ve got a primary key that will help us to generate an updatable join view, or a simple, update-only, merge.

merge
	into t1	od
using	(
	select
		id,
		n1,
		running_tot,
		sum(n1) over (order by id)	running_tot_calc
	from
		t1
	)	nd
on
	(nd.id = od.id)
when matched then
	update set od.running_tot = nd.running_tot_calc
;

I won’t show you the updatable join view version – because it didn’t work – I got the “non-key-preserved” error ORA-01779. I think the optimizer is having some trouble (or being a little too clever) with the analytic sum(). Again, with the small data set, this update was pretty quick. For larger data sets, the choice of general strategy is left to the reader.

I’d still be interested to know, though, whether the SQL Server code was internally array based or operates row by row.