Skip to main content

A Junk Dimension is a concept in dimensional modeling. It combines multiple low-cardinality attributes and indicators into a single dimension table as opposed to adding separate dimension tables. This reduces the size of the fact table and makes the dimensional model easier to work with.

The Junk Dimension table contains a row for all the distinct combinations of the junk dimension attributes, and each row contains a key that identifies that specific combination. The junk dimension attribute fields can be removed from the fact table and then replaced with a single field that references a row in the junk dimension table.

Multiple tables can utilize the same junk dimension table.

Add a Junk Dimension Table

Use the following steps to add Junk Dimension Table to an existing table in the Prepare instance.

  1. Right-click on a table, select Advanced, and then select Add Junk Dimension Table

     

  2. Select the fields to be added to the Junk Dimension Table and click OK.
  3. Enter a Name for the table.

     

  4. (Optional) Select a table in the Available tables list and click Add to associate this other table to the junk dimension table. In the prompt that appears, select whether to implement automatic field mapping. The automatic field mapping algorithm will map a field in the new table to a field in the junk dimension table based on whether one of the following conditions is true:
    1. The new table field has the same name as the junk dimension table field.
    2. The new table field has the same name as another field that is mapped to the junk dimension table field.

       

  5. Map the fields in the junk dimension table with the fields in the associated tables. Each row in the table represents one field in the junk dimension table while each column represents a table.
    • To add a new field to the junk dimension table, click the empty field in the bottom row of the second column and type a name.
    • To remove a field from the junk dimension table, right-click the first column of the corresponding row and click Remove.
    • To remove a table from the junk dimension table, right-click the header row of the corresponding column - the table name - and click Remove.
    • To change the order the tables are loaded in, click on the header row of the corresponding column - the table name - and drag it to the desired location.
  6. (Optional) In the Hashing algorithm list, click on the hashing algorithm to be used for the dimension table key. See this guide for information regarding the different algorithms. Junk dimensions have a special hashing algorithm that is available called "Legacy integer", which is for compatibility with older versions of Analysis Services. This algorithm should be avoided if possible, however, since it is less safe than other algorithms and only uses 8 bytes for the hash value, which means an increased risk that two different data sets may end up having the same hash value. 

After it is created, the new Junk Dimension Table will appear with a yellow table icon. Use the generated key to create relations between the junk dimension and the fact tables.

When the junk dimension table is executed, it will insert non-existing junk dimension combinations from the associated tables. The junk dimension table preforms no truncation of the raw table.

Hi ​@Syed Yousuf ,

Junk dimensions are a very powerful feature. However, the lack of key column data types in Junk Dimension tables is the reason I don’t use it very often. If you choose any hashing algorithm other than ‘Legacy Integer’, the key column will get a varbinary(64) datatype.

You cannot create relationships in Tabular based on varbinary(64) columns, not even in the latest Tabular versions. Therefore, the only option when using Tabular is to use ‘Legacy Integer’. However, this is a very slow and not recommended hashing algorithm.

As a workaround, I create the Junk Dimension key based on a modern and fast hashing algorithm. From this varbinary(64) column, I will create a Supernatural Key to get the Bigint datatype. It would be much better if we could directly generate a (big)int datatype key column in the Junk Dimensions using a modern hashing algorithm, just as is possible for Supernatural keys.

A few years ago, an idea was raised for this: Change Junk Dimension Key Data Type | Community

 


Reply