The history feature is like "track changes" for tables. It allows you to record how data in a table changes over time so you can create reports that show how data looked at a particular moment in time.
The principle behind history on tables is quite simple. Instead of simply storing every unique record once, a version of the record is stored every time an important change is made to the record. To keep track of what record is the currently valid one, meta data fields are added to the table. Two of these, "valid from" and "valid to", can be combined to get the time span the table was valid in.
When a table has history enabled, Discovery Hub compares the records loaded in from the data source with the records already in the data warehouse. If the record is new, it is added to the data warehouse. If there is a record with the same key already, hashed versions of the two records are compared to each other to find changes. If no changes are detected, nothing is updated in the data warehouse. If one or more changes are detected, what happens depend on the type of the field or fields that have changed. Discovery Hub support the following types:
- Type 0: Fields that are basically ignored by the history logic. They are inserted together with the rest of the record when it is created – either on the initial load or on a type II change - but they do not trigger an update. Type 0 fields give you flexibility in your history-enabled tables. For instance, you might have a type 0 field with values that are calculated by custom code inside or outside Discovery Hub.
- Type I (default): Fields that are overwritten with new data when data changes in the source. This means that there will be no history of the change. When a type I field changes, the current version of the record is updated with the new field value. You can also configure Discovery Hub to overwrite all instances of the record with the new value. If, for instance, a customer changes name from ABC Consulting to Acme Consulting, the default behavior is to update the current record with the new name. Previous records will still contain the old name, ABC Consulting. If you enable the option, all instances will contain the new name, Acme Consulting. This is often useful in reporting where the purpose is to have a recognizable name for the customer, not the correct name at a specific time.
- Type II: Fields that will cause a new record to be created, thus creating history about the change. Discovery Hub will create a new record if one of the type II fields on a record has changed, not a new record for each change. The new record will be a copy of the record from the data source. This means that any type 0 fields will be updated.
In addition to these three types, a field can be part of the natural key that uniquely identifies a record to the history logic. The natural key is usually identical with the primary key on the table.
Follow the steps below to enable history on a table.
- Right click the table and click Table Setting. The Table Settings window appears. Click the History tab.
- Click Enable History.
- Check Use soft deletes to mark records that have been deleted in the source system as deleted in the table. This is done by setting the "Is Tombstone" system field on the record to 1. Check Create a new record when a deleted record reappears in the source to keep track of history when a record is deleted and later restored in the source system. When this option is disabled, the status of the record will simply change between deleted and not deleted with no information saved about the fact that the record was missing from the source system for a while.
- Check Update all records with new value on type I change to update all versions of a record with the new value when a type I change is detected. The default behavior is to only change the value in the currently valid record.
- Check Treat type II as type I when field value is null to not insert a new record when a type II field changes from null to a non-null value. Enable this when you are not interested in keeping track in this kind of change, e.g. when you have added a new field to the table, thus creating a field where all values are null.
- Under New columns valid from, click Time of load to have Discovery Hub insert the time of load in the "valid from" field when a new record is added. Depending on you reporting needs, this might make more sense than the default, 1900-01-01.
- Resolve any conflicting table settings (marked with error icons) and the click OK.
- Locate the table. The table icon will be overlaid with an "H" to make it easy to identify as a history-enabled table. Expand the node and click on History Settings. The History settings pane appears.
- Under Natural Key, select the fields you want to use to uniquely identify the records. The primary key fields are selected per default. Note that if you add another field to the primary key, you have to add it to the natural key as well to have it work with the history logic.
- Select the fields you want to be either type 0 or type II under the respective headings - Type 0 fields (ignore) and Type II fields (insert new record). Any fields that you do not select as any of these types will be type I.
- Deploy the table to have the settings take effect.
Note: You might see a Clean Up Tombstone Field button under history settings. This refers to the "Is Tombstone" field used for keeping track of records that have been deleted in the source. Earlier versions of Discovery Hub would create an "Is Tombstone" field on history-enabled tables even if deletes was not enabled. Click the button to remove the unnecessary field from the table.