Skip to main content

Handling DistinctCount for NULL Values


Christian Hauggaard
Community Manager
Forum|alt.badge.img+5

In TimeXtender, DistinctCount includes NULL values in its calculations, which may not always be the desired outcome. This behavior stems from SQL Server Analysis Services (OLAP Multidimensional cubes) treating NULL and 0 as equivalent, where 0 is typically counted while NULL is often excluded. Here are two approaches to address this issue:

Subtract One Method

This straightforward method involves creating a new column and three measures:

  1. Create a new column on the fact table:
    • Add a column named "isDocumentNumberNull"
    • Set its value to 1 when DocumentNumber is empty (NULL)
  2. Create the following measures:
    1. DistinctOriginal: Standard measure, DistinctCount on DocumentNumber, hidden measure
    2. IsDocumentNumberNull:Standard measure, MAX on isDocumentNumberNull, hidden measure
    3. DistinctDocuments: Calculated measure, Formula: DistinctOriginal - IsDocumentNumberNull
  3. Use the DistinctDocuments measure for a distinct count that excludes NULL values.

Separate Fact Table Method

This approach involves creating a new fact table:

  1. Create a custom view:
    • Drag the original fact table to the Views node in the Data Warehouse
    • Edit the view to include only:
      • Dimension keys used for linking
      • The distinct count column (e.g., DocumentNumber)
  2. Add a WHERE clause to filter out NULL values at the end of the new statement
    WHERE [DocumentNumber] IS NOT NULL
  3. Add the new fact table to the cube
  4. Link it to all relevant dimensions
  5. Create your DistinctCount measure on this new table

Additional Considerations

Both methods effectively exclude NULL values from DistinctCount calculations, allowing for more accurate reporting when NULL values should not be considered distinct entities. Here are some additional considerations:

  • Choose the method that best fits your specific use case and data model
  • Consider performance implications, especially for large datasets
  • Test thoroughly to ensure the chosen method produces the expected results across various scenarios
Did this topic help you find an answer to your question?

0 replies

Be the first to reply!

Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings