Model driven Design of ETL functions

5

Analysis and functional design for regular application development has been standardised over the years. Now working on Business Intelligence (BI) projects for the last one and a half years, it seems to me that these kind of standards do not yet apply to Data Warehouse environments and more specific to ETL-processes (Extract Transform Load). Last week I read an interesting article about “Model driven Design of ETL functions” by Mark Zwijsen in Database Magazine. It interested me because I both know the author and the Data Warehouse environment he uses as an example. His thesis is that by structuring the functional design of the ETL process, it should be possible to automatically generate about 80% of the actual ETL code. The underlying model shows a possible structure.

ETL Designmodel3

This model is a usefull structure for functional design documents of ETL functions. Most of the model is self explanatory. I had some difficulty with the difference between Enrichment Rules and Coupling Rules.

Enrichment Rules add extra related data from the source model, that are necessary for further processing. For instance, if you would process Customers of a Telecom Network Operator and these Customers could be either Customer of an Internal Service Provider or an External Service Provider, then you would have to Enrich the Customer data with data of the related Service Provider in order to Select the Customers of the internal Service Provider.

Coupling Rules are necessary to relate the Source data to Target data. If for instance the Source contains a Source specific Service Provider code and the Target contains a source independent Dimension for Service Provider, then you have to couple the Source specific Service Provider code to a source independent Service Provider code in order to be able to relate the Customer to the Service Provider Dimension of the Target Model.

To conclude with, the History Processing is defined as the process to handle Slowly Changing Dimensions (SCD) in case of the SCD-2 mechanism. If you work in the field of BI and Data Warehousing, this should be enough information to interpret the above Model.

Share.

About Author

5 Comments

  1. Quite simple – use Oracle Warehouse Builder. The 10gR2 release – of which we have been using beta-drops since december 2004 – is awesome.

    regards,

    Lucas

  2. Manikantha varma on

    Hi,

    my query is:

    actually my role is cleanse_and_transform
    my source tables are nearly 150 and target were 90
    my question is

    How procedures need for above requirement
    and also i need sample coding

  3. Mark Zwijsen on

    Kherat,

    Although I wonder why you posted this question in this blog, I might be able to answer your question about handling type 1, 2 and 3 changes in Business Objects Data Integrator.
    Please mail me at mark.zwijsen@atosorigin.com, so that I can plan to give you a comprehensive answer.

    Besides that, what do you think of the original blog. Since I am the author quoted in this blog, I am very interested in you opinion

    Regards,
    Mark Zwijsen

  4. dear sir;
    could you tell me how to make generic way to detect type one,two,three of SlowlyChangingDimenssions using busniess object Data Integrator designer
    and is there any function to do that
    thanks