Monday, August 2, 2010

Extract / Transform Process

Extract
Generally this is a two stage processes sometimes one stage depending on the source system. If the source system is a non standard source, such as flat file, excel files, EAV model databases, ERP systems, CRM systems, etc, then the Extract phase becomes a two stage process, i.e.,

  • extract data from source to temporary storage area (sometimes called PREP area)
  • load the extracted data in to the Staging area

If the source system is any operational database, then the extract the data from different operational databases for e.g. : individual stores level data of a retail chain, etc. and load into Stage area as is and proceed to the next steps to apply transformation/business rules and load from Stage area to Data warehouse/mart.

Transformation:
Here the following are addressed:

  • The business logics to the necessary fields
  • Populate the surrogate keys and check referential integrity constraints by performing look up for foreign keys.
  • Aggregation of data
  • Data conversion if necessary
  • Implement SCD's

No comments:

Post a Comment


All Rights Reserved @ Raju Das