Search

OakieTags

Who's online

There are currently 0 users and 30 guests online.

Recent comments

Affiliations

Parallel to Serial

Here’s a little problem that came up on the Oracle-L listserver today:

I’m trying to write a query which reads the corresponding partition of the fact, extracts the list of join keys, materialises this result set, and finally joins the necessary dimensions. The key thing I’m trying to do is to run the initial query on the fact in parallel and then the rest of the query serially.

The full requirement, if you follow the link, may seem a little puzzling but there’s no point in second-guessing every question that people ask – there’s usually a reason for doing something in a particular way – so I just rattled off the first thing that came to mind, which was this:  when you include rownum in a parallel query Oracle has to serialise to generate the rownum – so create an inline view which does the parallel but adds a rownum to the select list, then join to the inline view. The plan should include a VIEW operator holding the parallel bit, and then you can hint as necessary to make the subsequent activity serial.

I’m not going to guarantee that it’s the best strategy (or even the only strategy) – but it sounded like a strategy that would work without much extra coding effort, risk of error, or run-time resource consumption. So I ran up a quick check in the laziest way possible (the tables are a cut-n-paste from a very old script, so don’t ask why I used their definition):


create table t1
as
select
	trunc((rownum-1)/15)	n1,
	trunc((rownum-1)/15)	n2,
	rpad(rownum,180)	v1
from all_objects
where rownum <= 3000
;

create table t2
as
select
	mod(rownum,200)		n1,
	mod(rownum,200)		n2,
	rpad(rownum,180)	v1
from all_objects
where rownum <= 3000 ;

begin
 	dbms_stats.gather_table_stats(
 		user,
 		't1',
 		method_opt => 'for all columns size 1'
	);

	dbms_stats.gather_table_stats(
		user,
		't2',
		method_opt => 'for all columns size 1'
	);
end;
/

alter table t1 parallel (degree 2);
alter table t2 parallel (degree 2);

set autotrace traceonly explain

select
	t1.v1 t1v1, t2.v1 t2v1, t2.n1
from
	t1, t2
where
	t2.n1 = t1.n1
;

select
	/*+ no_parallel(t3) */
	t3.n2, t1v1, t2v1
from	(
	select
		t1.v1 t1v1, t2.v1 t2v1, t2.n2, rownum r
	from
		t1, t2
	where
		t2.n1 = t1.n1
	)	v1,
	t1	t3
where
	t3.n1 = v1.n2
;

set autotrace off

As you can see I’ve written a query to join a couple of “parallelised” tables, then I’ve put that query into an inline view with rownum added to the select list, and joined the inline view to a third table (which happens to be an aliased copy of the first table) with a no_parallel hint. Here are the two plans – first the simple parallel join:


------------------------------------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |          | 45000 |    15M|    15  (14)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR          |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)    | :TQ10001 | 45000 |    15M|    15  (14)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |
|*  3 |    HASH JOIN             |          | 45000 |    15M|    15  (14)| 00:00:01 |  Q1,01 | PCWP |            |
|   4 |     PX BLOCK ITERATOR    |          |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,01 | PCWC |            |
|   5 |      TABLE ACCESS FULL   | T1       |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|   6 |     BUFFER SORT          |          |       |       |            |          |  Q1,01 | PCWC |            |
|   7 |      PX RECEIVE          |          |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|   8 |       PX SEND BROADCAST  | :TQ10000 |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,00 | P->P | BROADCAST  |
|   9 |        PX BLOCK ITERATOR |          |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|  10 |         TABLE ACCESS FULL| T2       |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,00 | PCWP |            |
------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("T2"."N1"="T1"."N1")

Since the tables are small Oracle has used the broadcast strategy when doing the hash join – and obviously the closing stages (passing data to the query co-ordinator) have to introduce a P->S (parallel to serial) step.

Now the extension that joins the extra table:

---------------------------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |          |   675K|   126M|    35  (29)| 00:00:01 |        |      |            |
|*  1 |  HASH JOIN                  |          |   675K|   126M|    35  (29)| 00:00:01 |        |      |            |
|   2 |   TABLE ACCESS FULL         | T1       |  3000 | 24000 |    12   (0)| 00:00:01 |        |      |            |
|   3 |   VIEW                      |          | 45000 |  8261K|    15  (14)| 00:00:01 |        |      |            |
|   4 |    COUNT                    |          |       |       |            |          |        |      |            |
|   5 |     PX COORDINATOR          |          |       |       |            |          |        |      |            |
|   6 |      PX SEND QC (RANDOM)    | :TQ10001 | 45000 |    15M|    15  (14)| 00:00:01 |  Q1,01 | P->S | QC (RAND)  |
|*  7 |       HASH JOIN             |          | 45000 |    15M|    15  (14)| 00:00:01 |  Q1,01 | PCWP |            |
|   8 |        PX BLOCK ITERATOR    |          |  3000 |   547K|     7   (0)| 00:00:01 |  Q1,01 | PCWC |            |
|   9 |         TABLE ACCESS FULL   | T2       |  3000 |   547K|     7   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|  10 |        BUFFER SORT          |          |       |       |            |          |  Q1,01 | PCWC |            |
|  11 |         PX RECEIVE          |          |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,01 | PCWP |            |
|  12 |          PX SEND BROADCAST  | :TQ10000 |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,00 | P->P | BROADCAST  |
|  13 |           PX BLOCK ITERATOR |          |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|  14 |            TABLE ACCESS FULL| T1       |  3000 |   541K|     7   (0)| 00:00:01 |  Q1,00 | PCWP |            |
---------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T3"."N1"="V1"."N2")
   7 - access("T2"."N1"="T1"."N1")

Note the VIEW operator at line 3 which calls the COUNT operator that is generating the rownum, and the P->S at line 5 feeding the query co-ordinator that makes this possible. Then (thanks to the no_parallel hint) the join to alias t3 is serialised.

Footnote: I did try (though not exhaustively) a couple of options to eliminate the need for the reference to rownum, one using a no_merge hint on the inline view, but that allowed Oracle to scan the t3 alias serially then broadcast it (S->P) to continue with a parallel join. It is possible, though, that a little extra thought might produce a solution without the rownum. A trap to watch out for – it’s just possible that in a newer version of Oracle the optimizer will detect that the rownum is not used in the final select list and decide to eliminate it from the inline view, which would make the serialisation redundant, leading to a simple 3 table parallel join (with an S->P to step to obey my no_parallel hint).