DBMS_COMPARISON – to compare and synchronize tables (new in Oracle 11g)

3

One of the interesting new pieces of functionality in Oracle 11g is offered by the DBMS_COMPARISON package. This package allows us to compare two tables – or two views on tables – and inform us on the differences in data. More specifically: the package can summarize if there any changes between the data sets and if so how many. It can also inform us in detail about the specific rows that are different between the two. Finally, DBMS_COMPARISON can converge the data sets, taking one of the two as the master and fully aligning the other one. Using DBMS_COMPARISON, we get on-demand replication for ordinary tables, within schemas, across schemas or even across databases.

In this article, I will demonstrate – on the 11.1.0.6 (Beta) release for Windows – how I can compare table EMP with a clone, whose data has been modified. We will then see how we can re-align the clone with the original (master) table. In this example, I work within a single schema. However, everything I do can be done across schemas and even across a database link, catering for custom data replication scenarios.

Here is our challenge:....

The following steps have been taken, to create a clone EMP_COPY of the EMP table in the SCOTT schema and make some changes in the data of EMP_COPY:

create table emp_copy as select * from emp<br />/<br /><br />alter table emp_copy<br />add constraint emp_copy_pk primary key (empno)<br />/<br /><br />update emp_copy<br />set    comm = comm * 1.21<br />/<br /><br />update emp_copy<br />set    ename= initcap(ename)<br />where  job ='CLERK'<br />/<br /><br />delete emp_copy<br />where  job = 'ANALYST' <br />/<br /><br />insert into emp_copy<br />(empno, ename, deptno, job)<br />values<br />(1313, 'JELLEMA', 30, 'TRAINER')<br />/<br /><br /><br />select empno<br />,      ename<br />,      comm<br />from   emp_copy<br />/<br /><br />     EMPNO ENAME            COMM<br />---------- ---------- ----------<br />      7369 Smith<br />      7499 ALLEN             363<br />      7521 WARD              605<br />      7566 JONES<br />      7654 MARTIN           1694<br />      7698 BLAKE<br />      7782 CLARK<br />      7839 KING<br />      7844 TURNER              0<br />      7876 Adams<br />      7900 James<br /><br />     EMPNO ENAME            COMM<br />---------- ---------- ----------<br />      7934 Miller<br />      1313 JELLEMA<br /><br />13 rows selected.<br />&nbsp;

Now let’s pretend EMP_COPY is a remote table that really ought to be synchronized with table EMP. Of course we could try to write a MERGE statement possibly updating far too many rows that have no differences – , combined with a separate DELETE operation for removing rows in EMP_COPY that do not exist in EMP. Or we can use DBMS_COMPARISON.

First our user needs execute privileges on DBMS_COMPARISON.

Then we create a comparison for these two objects:

BEGIN<br />  DBMS_COMPARISON.CREATE_COMPARISON<br />  ( comparison_name =&gt; 'compare_emp_and_clone'<br />  , schema_name     =&gt; 'scott'<br />  , object_name     =&gt; 'emp'<br />  , dblink_name     =&gt; null<br />  , remote_schema_name=&gt;'scott'<br />  , remote_object_name=&gt;'EMP_COPY'<br />  );<br />END;<br />&nbsp;

Note that since the two tables are in the same database, we specify the dblink_name parameter as null. The CREATE_COMPARISON procedure has more parameters, such as INDEX_NAME to indicate the index that is used for matching records in the two data sets and COLUMN_LIST to specify which columns are involved in comparing and converging.

Having created a comparison, we can perform a compare operation:

DECLARE<br />  consistent   BOOLEAN;<br />  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;<br />BEGIN<br />  consistent := DBMS_COMPARISON.COMPARE<br />                ( comparison_name =&gt; 'compare_emp_and_clone'<br />                , scan_info       =&gt; scan_info<br />                , perform_row_dif =&gt; TRUE<br />                );<br />  DBMS_OUTPUT.PUT_LINE('Scan ID: '||scan_info.scan_id);<br />  IF consistent=TRUE THEN<br />    DBMS_OUTPUT.PUT_LINE('No differences were found.');<br />  ELSE<br />    DBMS_OUTPUT.PUT_LINE('Differences were found.');<br />  END IF;<br />END;<br /><br />Scan ID: 79<br />Differences were found.<br /><br />PL/SQL procedure successfully completed.<br />&nbsp;

The result of this operation can be retrieved from a number of Data Dictionary Views: USER_COMPARISON, USER_COMPARISON_SCAN_SUMMARY, USER_COMPARISON_COLUMNS and USER_COMPARISON_ROW_DIF.

We can find out whether the two objects involved in this COMPARISON – or rather their data sets – are equal or have been found to have differences and if so, how many.

SELECT s.scan_id <br />,      c.COMPARISON_NAME<br />,      c.SCHEMA_NAME<br />,      c.OBJECT_NAME <br />,      s.CURRENT_DIF_COUNT <br />FROM   USER_COMPARISON c<br />,      USER_COMPARISON_SCAN_SUMMARY s<br />WHERE  c.COMPARISON_NAME = s.COMPARISON_NAME <br />AND    s.scan_id = 79<br /><br />   SCAN_ID Comparison Name      Schema Name Object Name Differences<br />---------- -------------------- ----------- ----------- -----------<br />        79 COMPARE_EMP_AND_CLON SCOTT       EMP                  10<br />           E<br />&nbsp;

Drilling down, we can find out the rowid or index values of the rows that either exist only in one of the two data sets or have differences in their data values.

SELECT c.COLUMN_NAME<br />,      r.INDEX_VALUE <br />,      case <br />       when r.LOCAL_ROWID is null<br />       then 'No'<br />       else 'Yes'<br />       end  LOCAL_ROWID<br />,      case <br />       when r.REMOTE_ROWID is null<br />       then 'No'<br />       else 'Yes'<br />       end  REMOTE_ROWID<br />FROM   USER_COMPARISON_COLUMNS c<br />,      USER_COMPARISON_ROW_DIF r<br />,      USER_COMPARISON_SCAN s<br />WHERE  c.COMPARISON_NAME = 'COMPARE_EMP_AND_CLONE' <br />AND    r.SCAN_ID         = s.SCAN_ID <br />AND    s.last_update_time &gt; systimestamp - 1/24/15  <br />AND    r.STATUS          = 'DIF' <br />AND    c.INDEX_COLUMN    = 'Y' <br />AND    c.COMPARISON_NAME = r.COMPARISON_NAME <br />ORDER <br />BY     r.INDEX_VALUE<br /><br /><p>Index Column    Index Value     Local Row Exists?    Remote Row Exists?<br />--------------- --------------- -------------------- --------------------<br />EMPNO           1313            No                   Yes<br />EMPNO           7369            Yes                  Yes<br />EMPNO           7499            Yes                  Yes<br />EMPNO           7521            Yes                  Yes<br />EMPNO           7654            Yes                  Yes<br />EMPNO           7788            Yes                  No<br />EMPNO           7876            Yes                  Yes<br />EMPNO           7900            Yes                  Yes<br />EMPNO           7902            Yes                  No<br />EMPNO           7934            Yes                  Yes<br /><br />10 rows selected. <br /></p>

Knowing all this, we can decide to go for the synchronization. Using DBMS_COMPARISON.CONVERGE, we can synchronize either the local table with the remote data set or vice versa. In this example, we will roll EMP into EMP_COPY, making sure that EMP_COPY is back in line with its master:

DECLARE<br />  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;<br />BEGIN<br />  DBMS_COMPARISON.CONVERGE<br />  ( comparison_name  =&gt; 'compare_emp_and_clone'<br />  , scan_id          =&gt; 79<br />  , scan_info        =&gt; scan_info<br />  , converge_options =&gt; DBMS_COMPARISON.CMP_CONVERGE_LOCAL_WINS<br />  );<br />  DBMS_OUTPUT.PUT_LINE('Local Rows Merged: '||scan_info.loc_rows_merged);<br />  DBMS_OUTPUT.PUT_LINE('Remote Rows Merged: '||scan_info.rmt_rows_merged);<br />  DBMS_OUTPUT.PUT_LINE('Local Rows Deleted: '||scan_info.loc_rows_deleted);<br />  DBMS_OUTPUT.PUT_LINE('Remote Rows Deleted: '||scan_info.rmt_rows_deleted);<br /
>END;<br /><br />Local Rows
Merged: 0<br />Remote Rows Merged: 9<br />Local Rows Deleted: 0<br />Remote Rows Deleted: 1<br /><br />PL/SQL procedure successfully completed. <br />

This simple statement – combined with the earlier creation of the COMPARISON itself – is all that is needed to synchronize the two tables! When we verify the situation in EMP_COPY, we find:

SQL&gt; select empno<br />  2  ,      ename<br />  3  ,      comm<br />  4  from   emp_copy<br />  5  /<br /><br />     EMPNO ENAME            COMM<br />---------- ---------- ----------<br />      7369 SMITH<br />      7499 ALLEN             300<br />      7521 WARD              500<br />      7566 JONES<br />      7654 MARTIN           1400<br />      7698 BLAKE<br />      7782 CLARK<br />      7839 KING<br />      7844 TURNER              0<br />      7876 ADAMS<br />      7900 JAMES<br /><br />     EMPNO ENAME            COMM<br />---------- ---------- ----------<br />      7934 MILLER<br />      7788 SCOTT<br />      7902 FORD<br /><br />14 rows selected. <br />

And indeed EMP_COPY is once again in line with EMP.

A second scenario: The most convoluted UPDATE statement in the world

In this second scenario, I wil show the probably most unlikely way to update the salaries of the managers in table EMP. However, it serves as a demonstration of two important facets of the COMPARISONS:

  • Comparisons not only can be applied to Tables, they can also be used with (single table) Views and Materialized Views
  • Comparisons do not have to apply to all columns in the tables or views they refer to – both for comparison and update, it is possible to specify a subset to process

Our challenge is to raise the salaries of the three MANAGERs in EMP by 15%. But we tread carefully. We first take a long and close look at the effect before we ask DBMS_COMPARISON to perform the update.

In the view CHANGED_MANAGERS, we see the result of our update:

create view changed_managers<br />as<br />select empno<br />,      ename<br />,      sal * 1.15 as sal<br />from   emp<br />where  job = 'MANAGER'<br />/<br /><br />select *<br />from   changed_managers<br />/<br /><br />

The original situation is visible in the view MANAGERS:

Now we create a COMPARISON between CHANGED_MANAGERS and MANAGERS:

< <code>>BEGIN<br />  DBMS_COMPARISON.CREATE_COMPARISON<br />  ( comparison_name =&gt; 'update_managers'<br />  , schema_name     =&gt; 'scott'<br />  , object_name     =&gt; 'MANAGERS'<br />  , dblink_name     =&gt; null<br />  , column_list =&gt; 'EMPNO,SAL'<br />  , remote_schema_name=&gt;'scott'<br />  , remote_object_name=&gt;'CHANGED_MANAGERS'<br />  );<br />END;<br />&nbsp;>

Two things to notice: we create the COMPARISON for two views rather than tables. Note that we could have used a Table and View. Also notice how we specified a column list: only differences in Salary are of interest right now (EMPNO is included since that is the INDEX_COLUMN through which the managers are identified.

Using this COMPARISON, we can do a compare:

DECLARE<br />  consistent   BOOLEAN;<br />  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;<br />BEGIN<br />  consistent := DBMS_COMPARISON.COMPARE(<br />                  comparison_name =&gt; 'update_managers',<br />                  scan_info       =&gt; scan_info,<br />                  perform_row_dif =&gt; TRUE);<br />  DBMS_OUTPUT.PUT_LINE('Scan ID: '||scan_info.scan_id);<br />  IF consistent=TRUE THEN<br />    DBMS_OUTPUT.PUT_LINE('No differences were found.');<br />  ELSE<br />    DBMS_OUTPUT.PUT_LINE('Differences were found.');<br />  END IF;<br />END;<br />&nbsp;

which does not tell us anything surprising. We can then perform a converge from CHANGED_MANAGERS to MANAGERS to apply the changes. Indirectly, this is the same as performing the update emp set sal = sal * 1.15 where job=’MANAGER’ – very indirectly!

prompt Merge the Remote (Changed_Managers) into the Local (Managers) data set (updating the EMP table in the process)<br /><br />DECLARE<br />  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;<br />BEGIN<br />  DBMS_COMPARISON.CONVERGE(<br />    comparison_name  =&gt; 'update_managers',<br />    scan_id          =&gt; &amp;scan_id,<br />    scan_info        =&gt; scan_info,<br />    converge_options =&gt; DBMS_COMPARISON.CMP_CONVERGE_REMOTE_WINS);<br />  DBMS_OUTPUT.PUT_LINE('Local Rows Merged: '||scan_info.loc_rows_merged);<br />  DBMS_OUTPUT.PUT_LINE('Remote Rows Merged: '||scan_info.rmt_rows_merged);<br />  DBMS_OUTPUT.PUT_LINE('Local Rows Deleted: '||scan_info.loc_rows_deleted);<br />  DBMS_OUTPUT.PUT_LINE('Remote Rows Deleted: '||scan_info.rmt_rows_deleted);<br />END;<br />/<br />&nbsp;

The result of the converge is – not surprisingly -:

SQL&gt; select *<br />  2  from   managers<br />  3  /<br /><br />     EMPNO ENAME             SAL<br />---------- ---------- ----------<br />      7566 JONES         3421.25<br />      7698 BLAKE          3277.5<br />      7782 CLARK          2817.5<br /><br />3 rows selected. <br />

With the COMPARISON we already had created, it is very easy to return the salaries back to the original situation. We recreate CHANGED_MANAGERS to:

create or replace view changed_managers<br />as<br />select empno<br />,      ename<br />,      sal/1.15 as sal<br />from   emp<br />where  job = 'MANAGER'<br />/<br />&nbsp;

and re-execute the CONVERGE operation

< <code>>DECLARE<br />  consistent   BOOLEAN;<br />  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;<br />BEGIN<br />  consistent := DBMS_COMPARISON.COMPARE<br />                ( comparison_name =&gt; 'update_managers'<br />                , scan_info       =&gt; scan_info<br />                , perform_row_dif =&gt; TRUE<br />                );<br />  DBMS_COMPARISON.CONVERGE<br />  ( comparison_name  =&gt; 'update_managers'<br />  , scan_id          =&gt; scan_info.scan_id<br />  , scan_info        =&gt; scan_info<br />  , converge_options =&gt; DBMS_COMPARISON.CMP_CONVERGE_REMOTE_WINS -- CHANGED_MANAGERS rolls into MANAGERS<br />  );<br />END; <br />>

With the effect that the salaries of the Managers are back to where they were.

Resources

Oracle® Database 2 Day + Data Replication and Integration Guide – Chapter 10 – Comparing and Converging Data (http://download.oracle.com/docs/cd/B28359_01/server.111/b28324/tdpii_diverge.htm )

Oracle Database PL/SQL Packages and Types Reference – DBMS_COMPARISON (http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/d_comparison.htm#ARPLS868 )

PSOUG – Quick Reference DBMS_COMPARISON (http://psoug.org/reference/dbms_comparison.html

Share.

About Author

Lucas Jellema, active in IT (and with Oracle) since 1994. Oracle ACE Director for Fusion Middleware. Consultant, trainer and instructor on diverse areas including Oracle Database (SQL & PLSQL), Service Oriented Architecture, BPM, ADF, Java in various shapes and forms and many other things. Author of the Oracle Press book: Oracle SOA Suite 11g Handbook. Frequent presenter on conferences such as JavaOne, Oracle OpenWorld, ODTUG Kaleidoscope, Devoxx and OBUG. Presenter for Oracle University Celebrity specials.

3 Comments

  1. Hi,
    The procedure DBMS_COMPARISON.CREATE_COMPARISON creates a comparison. What if I want to delete the created comparison and its data in the dictionary views?

  2. The challenge we have is not comparing table data (which is offered in other products) but to compare ALL the metadata associated with the two tables. This is the challenge and has been the challenge with profiling and cleansing. I have recently become a convert of an open-standard called JUMP Metadata (sourceforge project JUMPER has script plug-ins for some tools) and cannot wait for vendors to finally see the light and begin to adopt this method.

  3. You did describe how to compare tables by setting the scan info. But i have a query how do you go for comparing two adjacent rows of same table. I was thinking to use varrays aong with data dictionary but the coulmn type wouldn’t be same for each column in the varray.
    So how do i use the DBMS.compasrison package.