Follow

How to handle that DistinctCount counts NULL values

Introduction

This article describes a few approaches to handling the fact that DistinctCount counts NULL values.

The reason for this behaviour is that in contrast to most other systems, Analysis Services handles NULL and 0 as the same thing, and 0 would normally be considered something worth counting, whereas NULL wouldn't.

I will describe two approaches here. There are multiple other ways of doing this, and Google will be helpful explaining the alternatives if you don't like either of these.

Consider this example:

Documents:
100
100
200
NULL
200
NULL

A distinct count of the documents in this fact table would return 3, counting 100, 200 and NULL. In some cases we would want this measure to return 2 and ignore any NULL values.

Subtract One (1)

This is a very simple approach, but is a little bit more 'dirty'.

  1. Create a new column on the fact table that indicates whether the field we want to distinctcount is null. To do this we add a new column and give it the fixed value 1 when the DocumentNumber is empty:

  2. Create three measures
    • DistinctOriginal: Standard measure, DistinctCount on DocumentNumber, hidden measure
    • IsDocumentNumberNull:Standard measure, MAX on isDocumentNumberNull, hidden measure
    • DistinctDocuments: Calculated measure, Formula: DistinctOriginal - IsDocumentNumberNull

    And that's it, use the new calculated measure for a distinct count that does not count NULLs.

Separate fact table

Another way to go is to create a separate fact table that contains only the field that you want to do the distinct count on, and all dimension keys, and filter out all records where the value for the measure column is NULL. Then add this fact table to the cube, link it to all the dimensions and create your distinctcount measure on this table.

To create this new fact table drag the original fact table on the Data Warehouse to the Views node to create a custom view. Edit this view and remove all columns except for the dimension keys that you use to link to the dimensions and the distinct count column, for example DocumentNumber. Add a where clause the end of the statement:

WHERE [DocumentNumber] IS NOT NULL

This article applies to: timeXtender 4.5, tX2012, TX2014.

Was this article helpful?
5 out of 5 found this helpful
Have more questions? Submit a request

0 Comments

Please sign in to leave a comment.