Extract
Generally this is a two stage processes sometimes one stage depending on the source system. If the source system is a non standard source, such as flat file, excel files, EAV model databases, ERP systems, CRM systems, etc, then the Extract phase becomes a two stage process, i.e.,
- extract data from source to temporary storage area (sometimes called PREP area)
- load the extracted data in to the Staging area
If the source system is any operational database, then the extract the data from different operational databases for e.g. : individual stores level data of a retail chain, etc. and load into Stage area as is and proceed to the next steps to apply transformation/business rules and load from Stage area to Data warehouse/mart.
Transformation:
Here the following are addressed:
- The business logics to the necessary fields
- Populate the surrogate keys and check referential integrity constraints by performing look up for foreign keys.
- Aggregation of data
- Data conversion if necessary
- Implement SCD's
No comments:
Post a Comment