I often get the question about how to handle deletes in the source system on source based incrementally loaded tables in TX.
The source based incremental load method requires a field in the source table that can help us identify new or modified records – and this will be included in the selection rule for the table. Deleted records in the source system are obviously not affected by the selection rule and as a result the deleted data will still be present in the TX Staging Area and Data Warehouse.
Does this mean that we should not do incremental load on source tables where deletes could occur?
No! – but we need to find a solution. How do we get those deleted records deleted from our data?
I have with success used the following method.
The scenario is that I have a large table that I am loading incrementally:
To figure out which records has been deleted, I load the primary key fields from the table into the same staging area:
I have selected not to have a physical valid table on the “DeleteCheck” table:
The “DeleteCheck” table will contain the complete list of records that still exist in the source system. The records are uniquely identified by the primary key field(s).
Create a Script Action that deletes the rows in the Incrementally loaded that no longer exists in the “DeleteCheck” table.
Set the Script Action as a Post Data Cleansing Script on the Incrementally loaded table:
To ensure the correct execution order in managed Thread Execution, remember to setup table dependencies on the incrementally loaded table, so it becomes dependant on the “DeleteCheck” table:
Note: If you want to do “Soft Delete” instead of physically removed deleted rows from the table, then you could change the script into an update statement instead – and update an “DeletedDateTime” column.
Warning! Be careful about executing the table individually. If the script is executed and there is no records at all in the “DeleteCheck” table, then all records in the incrementally loaded table is deleted and a full load is the only recovery.