deselect fields in valid tables

Kimball is very clear. Fact tables should only contain surrrogate keys, measures and degenerate dimensions. Currently this is not possible with timextender.


In the datawarehouse, surrogate keys are calculated for every dimension table using the dw_id using a custom transformation. ex. SK_KEY = isnull(SK_KEY,dw_id)

Surrogate keys NEED to be calculated at DWH level and cannot be calculated at staging level due to a variety of reasons.  When a fact table is loaded in the DWH, lookups are performed to pickup the correct surrogate key from the dimension. To do this, the business key needed to perform the lookup remains in the fact table, polluting the fact tables.

This causes the record width of fact table to be too large, especially with very large fact tables, the amount of unneccary physical disc reads skyrockets.

The fact table should only contain measures, surrogate keys and degenerate dimensions meaning that all the business keys should be dropped when the lookups are done.

By allowing a user to exclude fields from the valid table, this behavior can be achieved. Raw table contains business keys, lookup is performed and valid table only contains what is needed. Unneccesary fields are dropped.

Currently this can only be achieved by copying the entire fact again (without the BK), leading to waste of storage.


Please sign in to leave a comment.