0

Allow SCD changes within one execution / data load

Currently, TimeXtender will report a primary key violation if you load multiple records for the primary key to a table with history within one execution.

Imagine a table containing a task ID and a status ID where the status can change quickly, faster than the reload granularity of the DWH. For the client it is important to be able to track all status changes (even switching back-and-forth between two statuses). There is no way to guarantee that increasing the rate of reload will result in not having multiple records per fact id. Given the volume of data, having an extra field to identify status updates (like a sequence number) would lead to unmanageably large tables.

If you make the table historic, with the task ID as primary (and natural) key and the statusID as type 2 history enabled, TX will not pass through any records other than the first.

Intuitively you would expect that any subsequent records are treated as type 2 changes and stored as such.

Outreach to TX consultants leads to the conclusion that the requested behaviour would require scripting, but there may be other options:

- add an option to process record-by-record. This would be less performant, but should capture records with the same key as SCD type 2 changes (given correct ordering of input). The current implementation for this specific client runs row-by-row and is deemed acceptable, so performance is not the key issue here.

- allow historic tables to operate with the warning flag. Duplicates that only differ in type 2 columns would result in warnings, other key clashes would result in error.

3 comments

Please sign in to leave a comment.