ETL using Oracle DBMS_HS_PASSTHROUGH and SQL Server etMlT1M8 VkFEOtnvJ5B3moWF4U

ETL using Oracle DBMS_HS_PASSTHROUGH and SQL Server

While I prefer a “loosely coupled architecture” for replication between Oracle and SQL Server, sometimes a direct (database) link cannot be avoided. By using DBMS_HS_PASSTHROUGH for data extraction the 2 other ETL processes (transformation and load) can be configured and administered with more flexibility, providing an almost acceptable level of “loosely coupled processing“.
Consider this as a really simple ETL config:

    Extract: Select SQL Server data with native sql, using DBMS_PASSTHROUGH and a PIPELINED function.
    Transform: Define a view on top of the function and transform column_names and column datatypes correctly.
    Load: SQL> insert into oracle_table select * from oracle_view;

When you use DBMS_HS_PASSTHROUGH Oracle doesn’t interpret the data you receive from SQL Server. By default this is done by the dg4odbc process, and the performance benefit in bypassing this process is considerable. Also, you’re not restricted by the limitations of dg4odbc and can transform the data into anything you need.

Like dg4odbc DBMS_HS_PASSTHROUGH depends on Heterogeneous Services (a component built-in to Oracle) to provide the connectivity between Oracle and SQL Server. Installation of unixODBC and a freeTDS driver on Linux is required to setup the SQL Server datasource… installation and configuration steps can be found here and here. DBMS_HS_PASSTHROUGH is invoked through an Oracle database link. The package conceptually resides at SQL Server but, in reality, calls to this package are intercepted and mapped to one or more Heterogeneous Services calls. The freeTDS driver, in turn, maps these calls to the API of SQL Server. More about DBMS_HS_PASSTHROUGH here.

Next a short example of how to setup data extraction from SQL Server with DBMS_HS_PASSTHROUGH and data transformation within the definition of a view. In this example the SQL Server column names differ from the ones in Oracle in case, length and/or in name and/or in datatype, and are transformed by the view. NLS_DATE_FORMAT synchronization is an exception… it’s done in the extract package itself. Reason is that all dates in this particular SQL Server database use a specific format, and it doesn’t really obscure the code. But if you choose to refrain from all transformation code in the extract package, you could create types with VARCHAR2’s only, and put all your to_number, to_date and to_timestamp conversion code in the view definition.

Extract

-- create Oracle types for uninterpreted SQL Server data
CREATE OR REPLACE TYPE E01_REC 
AS OBJECT(
  C01    NUMBER(8),
  C02    VARCHAR2(25 CHAR),
  C03    VARCHAR2(3 CHAR),
  C04    NUMBER(8),
  C05    DATE,
  C06    DATE );
/

CREATE OR REPLACE TYPE E01_TAB AS TABLE OF E01_REC;
/

-- create the extract package
CREATE OR REPLACE PACKAGE E AUTHID DEFINER AS
--------------------------------------------------------- 
  FUNCTION E01 RETURN E01_TAB PIPELINED;
---------------------------------------------------------
END E;
/

-- create the extract package body
CREATE OR REPLACE PACKAGE BODY E AS
  v_cursor   BINARY_INTEGER;  
  v_out_e01  E01_REC:=E01_REC(NULL,NULL,NULL,NULL,NULL,NULL);
-------------------------------------------------------------------------
  v_stat_e01 VARCHAR2(100):= 'Select SiteID
                                   , SiteName
                                   , SiteMnemonic
                                   , PointRefNumber
                                   , OpeningDate
                                   , ClosingDate
                               From ObjSite';
                                 
-------------------------------------------------------------------------
FUNCTION E01
RETURN E01_TAB PIPELINED
  IS
BEGIN
  execute immediate 'alter session set NLS_DATE_FORMAT = ''YYYY-MM-DD HH24:MI:SS'' ';
  v_cursor := DBMS_HS_PASSTHROUGH.OPEN_CURSOR@<DBLINK>;
  DBMS_HS_PASSTHROUGH.PARSE@<DBLINK>(v_cursor,v_stat_e01);
  WHILE DBMS_HS_PASSTHROUGH.FETCH_ROW@<DBLINK>(v_cursor) > 0
    LOOP
      DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,1,v_out_e01.c01);
      DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,2,v_out_e01.c02);
      DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,3,v_out_e01.c03);
      DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,4,v_out_e01.c04);
      DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,5,v_out_e01.c05);
      DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,6,v_out_e01.c06);
    PIPE ROW(v_out_e01);
    END LOOP;
  DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@<DBLINK>(v_cursor);
  RETURN;
EXCEPTION
  WHEN NO_DATA_NEEDED THEN
    DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@<DBLINK>(v_cursor);
  WHEN OTHERS THEN
    DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@<DBLINK>(v_cursor);
    DBMS_OUTPUT.PUT_LINE(SQLERRM||'--'||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE);
  RAISE;
END E01;
------------------------------------------------------------------------
END E;
/

Transform

CREATE OR REPLACE FORCE VIEW SITE_VW
AS 
SELECT TO_NUMBER(C01) SITEID,
       C02            STATIONNAME,
       C03            SITEMNEMONIC,
       TO_NUMBER(C04) STATIONID,
       C05            OPENINGDATE,
       C06            CLOSINGDATE
FROM TABLE(E.E01);

Load

INSERT INTO SITE SELECT * FROM SITE_VW;
COMMIT;