Virtual table option (improve performance + model semantics)

I would like to make a case for modeling tables that in reality will be completely virtual. For example by generating views instead of ETL.



Business Units have all the tables.

For all kinds of reasons, you may have multiple business units.

This prohibits integration and transformation in the business unit tier.

To model these transformations in the DWH tier, you have to move the data there.



Problem 1. In almost all situations, moving this data to the DWH is not needed but cost time.

  • If you are in Azure, you will typically have all your tiers in the same database.
  • If you are not in Azure, your tiers are probably physically close enough to enable direct queries between them.

Problem 2. Additional tables exist in the DWH that are "2nd grade citizens". That is, they are only there to support another table but not because you want them there. This makes the model more complex than you want.



Several solutions come to mind.


Solution a. Improve the "disable physical table" feature on the Performance settings page so that we can create a completely virtual table.

In this scenario, the valid table would be a view directly on the source table. This has the advantage of eliminating two data movements compared to a regular table. The drawback is that you still have the table in your DWH model.

It is expected that for such a table, functionality is limited. (No field transformations, validations, checkpoints, incremental, history)

Currently, we can achieve something close to this by dragging a table from the business unit to the "views" node in the DWH. However

* this does not keep the mapping. If the source table changes, the view stops working.

* objects modeled as views kill data lineage. Since in the proposed solution there is a 1:1 field mapping, data lineage should be preserved.


Solution b. Allow for business units objects to be used in parameters (for scripts and views), so that the existing option to drag a business unit table to the DWH "views" node is more robust. This has the advantage of eliminating data movements and decluttering the list of tables. Significant drawback: no data lineage which is probably one of the key reasons of using a model driven design tool like TX.


Solution c. Allow for objects from business units to be used directly from other tiers in certain transformations, such as lookups and table inserts.


Please sign in to leave a comment.