While I prefer a “loosely coupled architecture” for replication between Oracle and SQL Server, sometimes a direct (database) link cannot be avoided. By using DBMS_HS_PASSTHROUGH for data extraction the 2 other ETL processes (transformation and load) can be configured and administered with more flexibility, providing an almost acceptable level of “loosely coupled processing“.
Consider this as a really simple ETL config:
- Extract: Select SQL Server data with native sql, using DBMS_PASSTHROUGH and a PIPELINED function.
- Transform: Define a view on top of the function and transform column_names and column datatypes correctly.
- Load: SQL> insert into oracle_table select * from oracle_view;
When you use DBMS_HS_PASSTHROUGH Oracle doesn’t interpret the data you receive from SQL Server. By default this is done by the dg4odbc process, and the performance benefit in bypassing this process is considerable. Also, you’re not restricted by the limitations of dg4odbc and can transform the data into anything you need.
Like dg4odbc DBMS_HS_PASSTHROUGH depends on Heterogeneous Services (a component built-in to Oracle) to provide the connectivity between Oracle and SQL Server. Installation of unixODBC and a freeTDS driver on Linux is required to setup the SQL Server datasource… installation and configuration steps can be found here and here. DBMS_HS_PASSTHROUGH is invoked through an Oracle database link. The package conceptually resides at SQL Server but, in reality, calls to this package are intercepted and mapped to one or more Heterogeneous Services calls. The freeTDS driver, in turn, maps these calls to the API of SQL Server. More about DBMS_HS_PASSTHROUGH here.
Next a short example of how to setup data extraction from SQL Server with DBMS_HS_PASSTHROUGH and data transformation within the definition of a view. In this example the SQL Server column names differ from the ones in Oracle in case, length and/or in name and/or in datatype, and are transformed by the view. NLS_DATE_FORMAT synchronization is an exception… it’s done in the extract package itself. Reason is that all dates in this particular SQL Server database use a specific format, and it doesn’t really obscure the code. But if you choose to refrain from all transformation code in the extract package, you could create types with VARCHAR2’s only, and put all your to_number, to_date and to_timestamp conversion code in the view definition.
Extract
-- create Oracle types for uninterpreted SQL Server data CREATE OR REPLACE TYPE E01_REC AS OBJECT( C01 NUMBER(8), C02 VARCHAR2(25 CHAR), C03 VARCHAR2(3 CHAR), C04 NUMBER(8), C05 DATE, C06 DATE ); / CREATE OR REPLACE TYPE E01_TAB AS TABLE OF E01_REC; / -- create the extract package CREATE OR REPLACE PACKAGE E AUTHID DEFINER AS --------------------------------------------------------- FUNCTION E01 RETURN E01_TAB PIPELINED; --------------------------------------------------------- END E; / -- create the extract package body CREATE OR REPLACE PACKAGE BODY E AS v_cursor BINARY_INTEGER; v_out_e01 E01_REC:=E01_REC(NULL,NULL,NULL,NULL,NULL,NULL); ------------------------------------------------------------------------- v_stat_e01 VARCHAR2(100):= 'Select SiteID , SiteName , SiteMnemonic , PointRefNumber , OpeningDate , ClosingDate From ObjSite'; ------------------------------------------------------------------------- FUNCTION E01 RETURN E01_TAB PIPELINED IS BEGIN execute immediate 'alter session set NLS_DATE_FORMAT = ''YYYY-MM-DD HH24:MI:SS'' '; v_cursor := DBMS_HS_PASSTHROUGH.OPEN_CURSOR@<DBLINK>; DBMS_HS_PASSTHROUGH.PARSE@<DBLINK>(v_cursor,v_stat_e01); WHILE DBMS_HS_PASSTHROUGH.FETCH_ROW@<DBLINK>(v_cursor) > 0 LOOP DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,1,v_out_e01.c01); DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,2,v_out_e01.c02); DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,3,v_out_e01.c03); DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,4,v_out_e01.c04); DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,5,v_out_e01.c05); DBMS_HS_PASSTHROUGH.GET_VALUE@<DBLINK>(v_cursor,6,v_out_e01.c06); PIPE ROW(v_out_e01); END LOOP; DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@<DBLINK>(v_cursor); RETURN; EXCEPTION WHEN NO_DATA_NEEDED THEN DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@<DBLINK>(v_cursor); WHEN OTHERS THEN DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@<DBLINK>(v_cursor); DBMS_OUTPUT.PUT_LINE(SQLERRM||'--'||DBMS_UTILITY.FORMAT_ERROR_BACKTRACE); RAISE; END E01; ------------------------------------------------------------------------ END E; /
Transform
CREATE OR REPLACE FORCE VIEW SITE_VW AS SELECT TO_NUMBER(C01) SITEID, C02 STATIONNAME, C03 SITEMNEMONIC, TO_NUMBER(C04) STATIONID, C05 OPENINGDATE, C06 CLOSINGDATE FROM TABLE(E.E01);
Load
INSERT INTO SITE SELECT * FROM SITE_VW; COMMIT;