MERGE!

5

When I was confronted with the challenge to import about 260 tables from two different Rdb’s into a single Oracle database (9.0.1.x) on a daily basis via a database link my first thought was: Oh no, I am not going to write 260 times two individual DML statements, one UPDATE and one INSERT. Or one of the alternatives: UPDATE a row and if SQL%NOTFOUND then INSERT, or try to INSERT a row and if it fails with a DUP_VAL_ON_INDEX exception, then do the UPDATE.

So, I re-read the features of 9i and I re-discovered the MERGE statement. In various Oracle articles is mentioned that MERGE is a useful statement in a data warehouse environment, where tables need to be periodically refreshed with the new data arriving from on-line systems. Well, same here. There are two Rdb environments which come together in a single Oracle 9i environment, in which the data has to be kept at least a couple of years. So, I wrote procedures with the MERGE statement, a parameter is included to switch between the two Rdb’s.

The MERGE statement is of course not something new. It was introduced in Oracle 9i and it allows you to select rows from one table for update or insertion into another table. The decision whether to update or insert into the target table is based on a condition in the ON clause. This statement is a convenient way to combine at least two operations. It lets you avoid multiple INSERT and UPDATE DML statements.

And now in 10g, the MERGE statement has been extended to cover a larger variety of complex and conditional data transformations. This will allow faster loading of large volumes of data (1). And in 10g it is possible to DELETE rows from the target table (2).

MERGE [ hint ]
INTO [ schema. ]table [ t_alias ]
USING [ schema. ] { table | view | subquery }
[ t_alias ]
ON ( condition )
WHEN MATCHED THEN
UPDATE SET column = { expr | DEFAULT }
[, column = { expr | DEFAULT } ]…
[ where_clause ]
[ DELETE where_clause ]
WHEN NOT MATCHED THEN
INSERT [ (column [, column ]…) ]
VALUES ({ expr [, expr ]… | DEFAULT })
[ where_clause ]

In 10g it is possible to add a WHERE clause to the UPDATE statement and the INSERT statement. The condition can refer to either the data source or the target table. And it is possible to delete records in the update statement. The only rows affected by this clause are those rows in the destination table that are updated by the merge operation. That is, the DELETE WHERE condition evaluates the updated value, not the original value that was evaluated by the UPDATE SET … WHERE condition. If a row of the destination table meets the DELETE condition but is not included in the join defined by the ON clause, then it is not deleted (2).

But when do you use this MERGE statement?

In the PL/SQL User’s Guide and Reference 10g Release 1 (10.1) is stated that this statement is primarily useful in data warehousing situations where large amounts of data are commonly inserted and updated. If you only need to insert or update a single row, it is more efficient to do that with the regular PL/SQL techniques: try to update the row, and do an insert instead if the update affects zero rows; or try to insert the row, and do an update instead if the insert raises an exception because the table already contains that primary key (4).

And Tom Kyte says about MERGE: Truncate and reload would most likely be faster than merge if most of the table is going to be modified. The reload could be done in direct path, without indexes — rebuild indexes in parallel with no logging afterwards. Merge does an outer join between the two sets basically and updates matches as it hits them and inserts misses. It is not like a “procedure”, it is like insert, update or delete (5).

According to an Oracle Technical White Paper (3) the performance improved on average with 30% with a Serial Execution and 22% with a Parallel Execution.
(Serial Execution: table SALES_FACT contained 5 million rows and table SALES_NOV99 contained 1 million rows. Parallel Execution: table SALES_FACT contained 50 million rows (10 partitions), and table SALES_NOV99 contained 5 million rows).

Besides performance it is also good to know if there are some unknown features in this statement. After creating the procedures and testing them I came across a bug in 9i. If you are using a sequence in the target table, say as a unique ID, this sequence will be incremented for each row in the source table each time the MERGE statement is executed, even if there are no insert statements. So if you call your procedure every day and the source table contains 100 records, the sequence will be incremented by 100, even if there are no new records inserted. This bug is solved in 10g.

So, if you expect to update or insert a single row, just use UPDATE – SQL%NOTFOUND – INSERT. If the entire table is going to be modified; truncate and reload the table. In my situation I think it’s better to use MERGE. Every day there will be new records inserted, some updated. At this moment there are no exact figures available on the annually growth of these tables, but one thing is for sure: they will increase and multiple insertions and updates are going to take place on a daily basis. It’s too bad that I had to create triggers on the different tables in Oracle in order to use a sequence, but besides this minor problem I would say: use MERGE!

1. Oracle® Database New Features Guide 10g Release 1 (10.1) Part No. B10750-01 December 2003, p. 1-50
2. http://download-west.oracle.com/docs/cd/B14117_01/server.101/b10759/statements_9016.htm#SQLRF01606
3. Performance and Scalability in DSS Environment with Oracle9i, page 8-11
4. http://download-west.oracle.com/docs/cd/B14117_01/appdev.101/b10807/13_elems029.htm#sthref1735
5. http://asktom.oracle.com/pls/ask/f?p=4950:8:11788848576231155945::NO::F4950_P8_DISPLAYID,F4950_P8_CRITERIA:5318183934935

Share.

About Author

5 Comments

  1. In Oracle 10gR2 you solve this with the DML Error Log clause, see post 10gR2 New Feature: DML Error Logging .

    Pre 10gR2 you are sort of stuck: the MERGE statement is an Atomic operation: it either completes successfully and in its entirety or it fails – again in its entirety. There is no easy way to programmatically and automatically deal with such recordlevel exceptions in statements.

    In PL/SQL, you can make use of Bulk DML operations that are actually bundles up packages of individual statements. Those statements can be processed even if one of them fails, using the EXCEPTIONS INTO clause. However, this requires a procedural approach, which is of course what you wanted to avoid in the first place.

    I can also think of a simple way to find the offending record in the MERGE statement: create before record and after record insert and update triggers on the table that you are merging into. In the before record trigger, store the ID of the record in a package variable, in the after record trigger you can remove it again (not really necessary by the way). When the statement fails, you will find the identification of the offending record in the package variable. This may help you deal with the problem, by for example running the MERGE statement again, this time excluding the offending record.

    You could try to find the offending record

  2. Arindam Mukherjee on

    I need one solution.

    Suppose 100 rows are being processed in Merge. Now 55th row raises an exception. Now I need to commit the previous 54 rows before leaving the block containing the MERGE statement. So I have written an EXCEPTION block but COMMIT is not Working. In addition to, I have also written “PRAGMA AUTONOMOUS_TRANSACTION;” but this is an exercise in futility.

    EXCEPTION
    WHEN OTHERS THEN
    commit;
    v_ErrCode := SQLCODE;
    v_ErrMsg := SQLERRM;
    DBMS_OUTPUT.PUT_LINE(‘Code : ‘||SQLCODE||’ And Error Msg –>> ‘||v_ErrMsg);

    If you have time, please help me get the right way to meet the requirement.

  3. Hi,
    Merge have four cases:
    1. inbound merge: means upsert local DB (where you are) with remote one via DB link.
    2. outbound merge: upsert remote DB with local DB.
    3. local merge: target and source are in local DB.
    4. remote merge: target and source are in remote DB.

    Have no idea if all works with MERG command, please lend you advise on the four MERGEs.

    Hank

  4. Hi,
    I read your document and confirm that my problem (sequence number continues increaing even if no really insert in merge statement)
    is an Oracle bug. Thank you for your good article.