Follow

How to set up Source Based Incremental Load

Introduction

Source based incremental load is a method of doing incremental loads where one or more fields in the source table is used to determine if a record is new or modified. This will often be a datetime (ie CreatedDateTime and ModifiedDataTime) but it could also be a numeric field that increases over time, like a voucher number or a invoice number.

The prerequisites for using source based incremental load are:

  • A Unique primary key must be defined on the table. The Primary Key is used to identify whether an incoming record is new or an update of an existing record
  • One or more fields in the source table can identify new or modified records

Please note that only inserts and updates on the source table are handled by the source based incremental load mechanism. It is possible to do periodical full loads of the table to handle deletes. You can read more about this in another article here on the support-site.

Step-by-step

  1. Define a primary key on the table by right-clicking the relevant fields:

    2012-07-17_11h51_15.png
  2. Enable source based incremental load on the table (Right click table, Advanced / Advanced Settings):

    2012-07-17_11h52_44.png
  3. Go to the Data Source and set up an Incremental Selection rule for the table:

    2012-07-17_11h54_15.png
  4. Select the field(s) that should be used for source based incremental load:

    2012-07-17_11h58_03.png
  5. For each selected field you can choose a value to subtract from the last maximum value to ensure an overlap. This can be useful if there could be uncommitted transactions on the source system that holds a smaller value in the incremental selection rules field. The primary key will be used to determine if the record is already present or not. When using a datetime field, you can subtract Days, Hours, Minutes and Seconds, when using a numeric field you can choose a value to subtract:

    2012-07-17_12h01_37.png

Notes

When Source based incremental load is enabled, the first load will always be a full load. This means that all data in the Valid (_V) and Raw (_R) instance are deleted and all data are transferred from the source table to the Raw instance in the staging database. Data cleansing is performed on all data in the raw instance and - hopefully - ending up in the valid table instance. After the data cleansing has finished the Incremental table (_I or _Incr) is updated with the maximum value for the Incremental Selection Rules field.

On subsequent loads, the Raw instance is emptied. When extracting records from the source table to the raw instance a Where-clause is added, selecting only records where <Incremental Selection Rule Fields> are greater than Last Max Value (From the Incremental Table) minus the subtraction value(s), if any. Data Cleansing is performed on records in the raw instance. New records are added, modified records are updated.


This article applies to: tX2012, TX2014.

Was this article helpful?
1 out of 1 found this helpful
Have more questions? Submit a request

0 Comments

Please sign in to leave a comment.