Skip to main content
Solved

Data Profile does not always give the correct highest/lowest occurence

  • January 10, 2023
  • 1 reply
  • 26 views

rory.smith
TimeXtender Xpert
Forum|alt.badge.img+7

When using the Data Profile function on the Person.Person table in the AdventureWorks2014 database I see strange results for the highest and lowest occurence in the LastName field: Diaz is the highest and Billstrom is the lowest. In reality Abbas is the lowest when sorting by LastName ASC where count(*) == 1

 

I am not entirely sure that 'Highest occurence’ is clear to everyone that is not a natvie speaker: you could also assume this should be the highest sorted value.

Best answer by JTreadwell

Hi @rory.smith, In the Data Profile feature highest and lowest occurrence is indicating which value in a given field occurs the most, and which occurs the least. In SQL this would look like:

SELECT [LastName], Count([LastName])
FROM [MDW].[Person]
GROUP BY [LastName]
ORDER BY Count([LastName]) DESC

In my version of AdventureWorks, the Person table results in Diaz occurring the most at 211 times. 

If you reverse the Order By ASC, I get Barlow, then Billstrom preceding 687 other names that only occur once.  Beyond the count, I don’t believe there is any additional sorting happening, so the 687 single occurrences may appear in a different order. 

 

Using this logic, it seems to be working as expected. 

View original
Did this topic help you find an answer to your question?

1 reply

JTreadwell
Employee
Forum|alt.badge.img+5
  • Employee
  • 179 replies
  • Answer
  • January 10, 2023

Hi @rory.smith, In the Data Profile feature highest and lowest occurrence is indicating which value in a given field occurs the most, and which occurs the least. In SQL this would look like:

SELECT [LastName], Count([LastName])
FROM [MDW].[Person]
GROUP BY [LastName]
ORDER BY Count([LastName]) DESC

In my version of AdventureWorks, the Person table results in Diaz occurring the most at 211 times. 

If you reverse the Order By ASC, I get Barlow, then Billstrom preceding 687 other names that only occur once.  Beyond the count, I don’t believe there is any additional sorting happening, so the 687 single occurrences may appear in a different order. 

 

Using this logic, it seems to be working as expected. 


Reply


Cookie policy

We use cookies to enhance and personalize your experience. If you accept you agree to our full cookie policy. Learn more about our cookies.

 
Cookie settings