One of the interesting new pieces of functionality in Oracle 11g is offered by the DBMS_COMPARISON package. This package allows us to compare two tables – or two views on tables – and inform us on the differences in data. More specifically: the package can summarize if there any changes between the data sets and if so how many. It can also inform us in detail about the specific rows that are different between the two. Finally, DBMS_COMPARISON can converge the data sets, taking one of the two as the master and fully aligning the other one. Using DBMS_COMPARISON, we get on-demand replication for ordinary tables, within schemas, across schemas or even across databases.

In this article, I will demonstrate – on the 11.1.0.6 (Beta) release for Windows – how I can compare table EMP with a clone, whose data has been modified. We will then see how we can re-align the clone with the original (master) table. In this example, I work within a single schema. However, everything I do can be done across schemas and even across a database link, catering for custom data replication scenarios.

Here is our challenge:

The following steps have been taken, to create a clone EMP_COPY of the EMP table in the SCOTT schema and make some changes in the data of EMP_COPY:

create table emp_copy as select * from emp
/

alter table emp_copy
add constraint emp_copy_pk primary key (empno)
/

update emp_copy
set    comm = comm * 1.21
/

update emp_copy
set    ename= initcap(ename)
where  job ='CLERK'
/

delete emp_copy
where  job = 'ANALYST' 
/

insert into emp_copy
(empno, ename, deptno, job)
values
(1313, 'JELLEMA', 30, 'TRAINER')
/


select empno
,      ename
,      comm
from   emp_copy
/

     EMPNO ENAME            COMM
---------- ---------- ----------
      7369 Smith
      7499 ALLEN             363
      7521 WARD              605
      7566 JONES
      7654 MARTIN           1694
      7698 BLAKE
      7782 CLARK
      7839 KING
      7844 TURNER              0
      7876 Adams
      7900 James

     EMPNO ENAME            COMM
---------- ---------- ----------
      7934 Miller
      1313 JELLEMA

13 rows selected.

Now let’s pretend EMP_COPY is a remote table that really ought to be synchronized with table EMP. Of course we could try to write a MERGE statement possibly updating far too many rows that have no differences – , combined with a separate DELETE operation for removing rows in EMP_COPY that do not exist in EMP. Or we can use DBMS_COMPARISON.

First our user needs execute privileges on DBMS_COMPARISON.

Then we create a comparison for these two objects:

BEGIN
  DBMS_COMPARISON.CREATE_COMPARISON
  ( comparison_name => 'compare_emp_and_clone'
  , schema_name     => 'scott'
  , object_name     => 'emp'
  , dblink_name     => null
  , remote_schema_name=>'scott'
  , remote_object_name=>'EMP_COPY'
  );
END;

Note that since the two tables are in the same database, we specify the dblink_name parameter as null. The CREATE_COMPARISON procedure has more parameters, such as INDEX_NAME to indicate the index that is used for matching records in the two data sets and COLUMN_LIST to specify which columns are involved in comparing and converging.

Having created a comparison, we can perform a compare operation:

DECLARE
  consistent   BOOLEAN;
  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;
BEGIN
  consistent := DBMS_COMPARISON.COMPARE
                ( comparison_name => 'compare_emp_and_clone'
                , scan_info       => scan_info
                , perform_row_dif => TRUE
                );
  DBMS_OUTPUT.PUT_LINE('Scan ID: '||scan_info.scan_id);
  IF consistent=TRUE THEN
    DBMS_OUTPUT.PUT_LINE('No differences were found.');
  ELSE
    DBMS_OUTPUT.PUT_LINE('Differences were found.');
  END IF;
END;

Scan ID: 79
Differences were found.

PL/SQL procedure successfully completed.

The result of this operation can be retrieved from a number of Data Dictionary Views: USER_COMPARISON, USER_COMPARISON_SCAN_SUMMARY, USER_COMPARISON_COLUMNS and USER_COMPARISON_ROW_DIF.

We can find out whether the two objects involved in this COMPARISON – or rather their data sets – are equal or have been found to have differences and if so, how many.

SELECT s.scan_id 
,      c.COMPARISON_NAME
,      c.SCHEMA_NAME
,      c.OBJECT_NAME 
,      s.CURRENT_DIF_COUNT 
FROM   USER_COMPARISON c
,      USER_COMPARISON_SCAN_SUMMARY s
WHERE  c.COMPARISON_NAME = s.COMPARISON_NAME 
AND    s.scan_id = 79

   SCAN_ID Comparison Name      Schema Name Object Name Differences
---------- -------------------- ----------- ----------- -----------
        79 COMPARE_EMP_AND_CLON SCOTT       EMP                  10
           E

Drilling down, we can find out the rowid or index values of the rows that either exist only in one of the two data sets or have differences in their data values.

SELECT c.COLUMN_NAME
,      r.INDEX_VALUE 
,      case 
       when r.LOCAL_ROWID is null
       then 'No'
       else 'Yes'
       end  LOCAL_ROWID
,      case 
       when r.REMOTE_ROWID is null
       then 'No'
       else 'Yes'
       end  REMOTE_ROWID
FROM   USER_COMPARISON_COLUMNS c
,      USER_COMPARISON_ROW_DIF r
,      USER_COMPARISON_SCAN s
WHERE  c.COMPARISON_NAME = 'COMPARE_EMP_AND_CLONE' 
AND    r.SCAN_ID         = s.SCAN_ID 
AND    s.last_update_time > systimestamp - 1/24/15  
AND    r.STATUS          = 'DIF' 
AND    c.INDEX_COLUMN    = 'Y' 
AND    c.COMPARISON_NAME = r.COMPARISON_NAME 
ORDER 
BY     r.INDEX_VALUE

Index Column Index Value Local Row Exists? Remote Row Exists?
————— ————— ——————– ——————–
EMPNO 1313 No Yes
EMPNO 7369 Yes Yes
EMPNO 7499 Yes Yes
EMPNO 7521 Yes Yes
EMPNO 7654 Yes Yes
EMPNO 7788 Yes No
EMPNO 7876 Yes Yes
EMPNO 7900 Yes Yes
EMPNO 7902 Yes No
EMPNO 7934 Yes Yes

10 rows selected.

Knowing all this, we can decide to go for the synchronization. Using DBMS_COMPARISON.CONVERGE, we can synchronize either the local table with the remote data set or vice versa. In this example, we will roll EMP into EMP_COPY, making sure that EMP_COPY is back in line with its master:

DECLARE
  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;
BEGIN
  DBMS_COMPARISON.CONVERGE
  ( comparison_name  => 'compare_emp_and_clone'
  , scan_id          => 79
  , scan_info        => scan_info
  , converge_options => DBMS_COMPARISON.CMP_CONVERGE_LOCAL_WINS
  );
  DBMS_OUTPUT.PUT_LINE('Local Rows Merged: '||scan_info.loc_rows_merged);
  DBMS_OUTPUT.PUT_LINE('Remote Rows Merged: '||scan_info.rmt_rows_merged);
  DBMS_OUTPUT.PUT_LINE('Local Rows Deleted: '||scan_info.loc_rows_deleted);
  DBMS_OUTPUT.PUT_LINE('Remote Rows Deleted: '||scan_info.rmt_rows_deleted);
END;

Local Rows
Merged: 0
Remote Rows Merged: 9
Local Rows Deleted: 0
Remote Rows Deleted: 1

PL/SQL procedure successfully completed.

This simple statement – combined with the earlier creation of the COMPARISON itself – is all that is needed to synchronize the two tables! When we verify the situation in EMP_COPY, we find:

SQL> select empno
  2  ,      ename
  3  ,      comm
  4  from   emp_copy
  5  /

     EMPNO ENAME            COMM
---------- ---------- ----------
      7369 SMITH
      7499 ALLEN             300
      7521 WARD              500
      7566 JONES
      7654 MARTIN           1400
      7698 BLAKE
      7782 CLARK
      7839 KING
      7844 TURNER              0
      7876 ADAMS
      7900 JAMES

     EMPNO ENAME            COMM
---------- ---------- ----------
      7934 MILLER
      7788 SCOTT
      7902 FORD

14 rows selected.

And indeed EMP_COPY is once again in line with EMP.

A second scenario: The most convoluted UPDATE statement in the world

In this second scenario, I wil show the probably most unlikely way to update the salaries of the managers in table EMP. However, it serves as a demonstration of two important facets of the COMPARISONS:

Comparisons not only can be applied to Tables, they can also be used with (single table) Views and Materialized Views
Comparisons do not have to apply to all columns in the tables or views they refer to – both for comparison and update, it is possible to specify a subset to process

Our challenge is to raise the salaries of the three MANAGERs in EMP by 15%. But we tread carefully. We first take a long and close look at the effect before we ask DBMS_COMPARISON to perform the update.

In the view CHANGED_MANAGERS, we see the result of our update:

create view changed_managers
as
select empno
,      ename
,      sal * 1.15 as sal
from   emp
where  job = 'MANAGER'
/

select *
from   changed_managers
/

The original situation is visible in the view MANAGERS:

Now we create a COMPARISON between CHANGED_MANAGERS and MANAGERS:

< >BEGIN
  DBMS_COMPARISON.CREATE_COMPARISON
  ( comparison_name => 'update_managers'
  , schema_name     => 'scott'
  , object_name     => 'MANAGERS'
  , dblink_name     => null
  , column_list => 'EMPNO,SAL'
  , remote_schema_name=>'scott'
  , remote_object_name=>'CHANGED_MANAGERS'
  );
END;
 >

Two things to notice: we create the COMPARISON for two views rather than tables. Note that we could have used a Table and View. Also notice how we specified a column list: only differences in Salary are of interest right now (EMPNO is included since that is the INDEX_COLUMN through which the managers are identified.

Using this COMPARISON, we can do a compare:

DECLARE
  consistent   BOOLEAN;
  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;
BEGIN
  consistent := DBMS_COMPARISON.COMPARE(
                  comparison_name => 'update_managers',
                  scan_info       => scan_info,
                  perform_row_dif => TRUE);
  DBMS_OUTPUT.PUT_LINE('Scan ID: '||scan_info.scan_id);
  IF consistent=TRUE THEN
    DBMS_OUTPUT.PUT_LINE('No differences were found.');
  ELSE
    DBMS_OUTPUT.PUT_LINE('Differences were found.');
  END IF;
END;

which does not tell us anything surprising. We can then perform a converge from CHANGED_MANAGERS to MANAGERS to apply the changes. Indirectly, this is the same as performing the update emp set sal = sal * 1.15 where job=’MANAGER’ – very indirectly!

prompt Merge the Remote (Changed_Managers) into the Local (Managers) data set (updating the EMP table in the process)

DECLARE
  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;
BEGIN
  DBMS_COMPARISON.CONVERGE(
    comparison_name  => 'update_managers',
    scan_id          => &scan_id,
    scan_info        => scan_info,
    converge_options => DBMS_COMPARISON.CMP_CONVERGE_REMOTE_WINS);
  DBMS_OUTPUT.PUT_LINE('Local Rows Merged: '||scan_info.loc_rows_merged);
  DBMS_OUTPUT.PUT_LINE('Remote Rows Merged: '||scan_info.rmt_rows_merged);
  DBMS_OUTPUT.PUT_LINE('Local Rows Deleted: '||scan_info.loc_rows_deleted);
  DBMS_OUTPUT.PUT_LINE('Remote Rows Deleted: '||scan_info.rmt_rows_deleted);
END;
/

The result of the converge is – not surprisingly -:

SQL> select *
  2  from   managers
  3  /

     EMPNO ENAME             SAL
---------- ---------- ----------
      7566 JONES         3421.25
      7698 BLAKE          3277.5
      7782 CLARK          2817.5

3 rows selected.

With the COMPARISON we already had created, it is very easy to return the salaries back to the original situation. We recreate CHANGED_MANAGERS to:

create or replace view changed_managers
as
select empno
,      ename
,      sal/1.15 as sal
from   emp
where  job = 'MANAGER'
/

and re-execute the CONVERGE operation

< >DECLARE
  consistent   BOOLEAN;
  scan_info    DBMS_COMPARISON.COMPARISON_TYPE;
BEGIN
  consistent := DBMS_COMPARISON.COMPARE
                ( comparison_name => 'update_managers'
                , scan_info       => scan_info
                , perform_row_dif => TRUE
                );
  DBMS_COMPARISON.CONVERGE
  ( comparison_name  => 'update_managers'
  , scan_id          => scan_info.scan_id
  , scan_info        => scan_info
  , converge_options => DBMS_COMPARISON.CMP_CONVERGE_REMOTE_WINS -- CHANGED_MANAGERS rolls into MANAGERS
  );
END; 
>

With the effect that the salaries of the Managers are back to where they were.

Resources

OracleÂ® Database 2 Day + Data Replication and Integration Guide – Chapter 10 – Comparing and Converging Data (http://download.oracle.com/docs/cd/B28359_01/server.111/b28324/tdpii_diverge.htm )

Oracle Database PL/SQL Packages and Types Reference – DBMS_COMPARISON (http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28419/d_comparison.htm#ARPLS868 )

PSOUG – Quick Reference DBMS_COMPARISON (http://psoug.org/reference/dbms_comparison.html )

3 Comments

Ahmed Baraka February 3, 2008

Hi,
The procedure DBMS_COMPARISON.CREATE_COMPARISON creates a comparison. What if I want to delete the created comparison and its data in the dictionary views?
Kumar Sansar February 1, 2008

The challenge we have is not comparing table data (which is offered in other products) but to compare ALL the metadata associated with the two tables. This is the challenge and has been the challenge with profiling and cleansing. I have recently become a convert of an open-standard called JUMP Metadata (sourceforge project JUMPER has script plug-ins for some tools) and cannot wait for vendors to finally see the light and begin to adopt this method.
Pratik K October 17, 2007

You did describe how to compare tables by setting the scan info. But i have a query how do you go for comparing two adjacent rows of same table. I was thinking to use varrays aong with data dictionary but the coulmn type wouldn’t be same for each column in the varray.
So how do i use the DBMS.compasrison package.

DBMS_COMPARISON – to compare and synchronize tables (new in Oracle 11g)

A second scenario: The most convoluted UPDATE statement in the world

Resources

Like this:

About The Author

Lucas Jellema

3 Comments

A second scenario: The most convoluted UPDATE statement in the world

Resources

Share this:

Like this:

Related Posts

About The Author

Lucas Jellema

3 Comments