Search Oracle Blogs

List of Blogs maintained by Paweł Barut.


Wednesday, November 22, 2006

MERGE and SEQUENCE

Last days I was working on code and I was using merges often. And i've notice some side-effect. When you use sequence in WHEN NOT FOUND clause then next sequence value is gathered for all rows processed in query. I've observed it on 10g R2 EE (10.2.0.2) and on Oracle XE. Let me show test case. First create sequence:
SQL> create sequence a_sq start with 1 increment by 1 nocache;
Sequence created.
This sequence generator starts with 1 with step 1. Lets taka look how many rows we have in USER_OBJECTS view:
SQL> select count(*) from user_objects;
 COUNT(*)
----------
 46
Lets create sample table and populate it with 10 rows:
SQL> create table a_tab as
 2 select object_id o_id, object_name o_name, a_sq.nextval o_sq
 3 from user_objects where rownum <=10;
Table created.
So we have table with 10 rows, and last number generated by sequence is also 10:
SQL> select count(*), max(o_sq) from a_tab;
 COUNT(*) MAX(O_SQ)
---------- ----------
 10 10
Now we run merge:
SQL> merge into a_tab
 2 using (select object_id , object_name
 3 from user_objects )
 4 on (object_id = o_id)
 5 when matched then
 6 update set o_name = object_name
 7 when not matched then
 8 insert (o_id, o_name, o_sq)
 9 values (object_id, object_name, a_sq.nextval);
47 rows merged.
47 rows were merged, but 10 rows was updated and 37 inserted. Lets check it:
SQL> select count(*), max(o_sq) from a_tab;
 COUNT(*) MAX(O_SQ)
---------- ----------
 47 57
It seems that sequence generated 47 new values, but only 37 rws were inserted. Lets run the same merge one more time. This time rows will be only updated, as all rows are already in a_tab table.
SQL> merge into a_tab
 2 using (select object_id , object_name
 3 from user_objects )
 4 on (object_id = o_id)
 5 when matched then
 6 update set o_name = object_name
 7 when not matched then
 8 insert (o_id, o_name, o_sq)
 9 values (object_id, object_name, a_sq.nextval);
47 rows merged.
Now no new rows where created. But what happen to sequence:
SQL> select a_sq.nextval from dual;
 NEXTVAL
----------
 105
It generated another 47  values, that are useless. It can be proved:
SQL> select count(*), max(o_sq) from a_tab;
 COUNT(*) MAX(O_SQ)
---------- ----------
 47 57
This side effect might have impact on performance. Also you loose lot of numbers, when most of rows are updated. Cheers, Paweł

6 comments:

Samer said...

I faced the same problem developing data warehousing solutions. The side effects of this go beyond performance, they can lead to failure of the database. When you load high volumes of data (most of which will be updates), the sequence will rapidly reach very large numbers (like 1.0E7). If you have columns with precision of NUMBER(6,0), those new values of sequence could not be inserted any more, leading to failure of the warehousing process. This bug has created a huge mess in my data warehouse!

My workaround consists of wrapping the sequence.nextval in a pl/sql function, which seems to solve the problem. So if we call this function seq_iterator, then our merge statement will look like:

merge
...
when matched
update ...
when not matched
insert (...,...)
values (seq_iterator('your_sequence'), ...,...)

This bug can actually be a huge pain in the ass, and is less innocent than it seems.

Anonymous said...

can you post the scrip for the solution u used

Paweł Barut said...

Hi Anonymous,

The function that Samer probably had in mind can look like that:

function my_seq_gen return number
is
v_number number;
begin
select my_seq.nextval
into v_number
from dual;
return v_number;
end;

or in Oracle 11g even shorter:

function my_seq_gen return number
is begin
return my_seq.nextval;
end;

/Paweł

benjamin said...

A little late but this function can do the stuff:


create or replace
FUNCTION GET_SEQUENCE (name_sequence VARCHAR2)
RETURN INTEGER IS v_seq INTEGER;
BEGIN
EXECUTE IMMEDIATE 'select ' || name_sequence || '.nextval FROM dual' INTO v_seq;
RETURN v_seq;
END GET_SEQUENCE;

Paweł Barut said...

Hello benjamin,

Yes this function does job, but has some disadvantages, and personally I do not recommend it.
1. It is not optimized for performance - Execute immediate is much more expensive to execute then embedded SQL code.
2. If this function will be available to some users - it make be used to break in using SQL-Injection. So The first step in this function should be to check that sequence name references to valid object. And that will cost even more

Best Regards,
Paweł

Anonymous said...

Samer,

It's not a bug. It is documented that this will happen in a merge.

 

Copyright © Paweł Barut
Printing from DOS to USB Printer