Written By: Doug Wynkoop
Edited By: Adam McCormack, Joseph Treadwell
Introduction to Aggregate Tables
TX DWA aggregate tables allow you to quickly and easily combine data to find averages, sums, counts, and minimum and maximum values. Aggregate tables are most useful when serving as the basis for standardized reports that don't change very often. For this kind of set reporting format, aggregate tables are quick, reliable, and easy for both developers and users. The fact that they are easy to set up makes aggregate tables useful for ad-hoc reporting as well.
This speed and simplicity comes at the cost of flexibility. Aggregate tables are less useful for analysts who want to be able to look at data in lots of different ways. Unlike an OLAP cube, aggregate table data can't be pivoted, nor can you drill down to a lower level and see the underlying transactions. Still, they can be very useful when used in the right context.
To illustrate how to create an aggregate table, we will create a table to aggregate sales data by customer, a common requirement. For the purposes of this demonstration, please refer to the simple staging area shown here:
This guide uses the demonstration database AdventureWorks 2012 as a data source, which can be downloaded here. Two sample projects are attached to this article as well. Aggregate Demo (Start) contains the staging area shown above. Aggregate Demo (Complete) contains the advanced aggregate table shown at the end of this guide.
Creating an Aggregate Table
Aggregate tables are based on the data of a pre-existing table. In this case, the requirement is to aggregate sales data, so the base table will be SalesOrderDetail. Right-click on the table and navigate to Add Aggregate Table as shown below.
Configuring an Aggregate Table
There are two main sections to configure in an aggregate table. In Group By, specify which fields to analyze the data by. In Aggregate, choose which fields aggregate and how to aggregate them - by count, sum, average, etc. In this case, we want to total up net sales by customer.
- Be sure to give your table a name
- Under GroupBy, choose CustomerKey, a unique customer identifier
- Under Aggregate, choose LineTotal, a net sales amount for each sales transaction line
- Rename the LineTotal aggregation so that your users will know they are looking at aggregated data
- Add all of the sales totals together with the Sum Aggregation Type
Click OK when you are done. The aggregate table will appear at the bottom of the table list as a yellow table with a black sigma character: .
Add Lookups (Optional)
If the aggregate table feeds a highly standardized report that requires no more advanced analysis, it can sometimes be a good idea to add additional information to the aggregate table that you'd ordinarily want to retain in a dimension table. If your analysts only ever want to see customer name and number, and no other customer-related data, it can save time and effort to bring customer name in as a lookup, as shown here:
After deploying and executing the staging area, you should be able to preview your aggregate table and see the data as shown below.
This format makes asking certain questions very easy. For instance, if you wanted to see the top ten all-time customers, you could simply use a query like this:
SELECT TOP 10 CustomerFullName, TotalSales
ORDER BY TotalSales DESC
Advanced Aggregate Tables
Other common fields your analysts might be interested in are the customer's first order date, most recent order date, and the number of lifetime orders they've placed. An aggregate table that includes this information might look like this:
FirstOrder and LastOrder are simply aggregate fields based on OrderDate from SalesOrderDetail that use the minimum and maximum aggregation functions. When building this type of table, it's helpful to remember that lookups are actually aggregations as well! In this case, the SalesCount is a count lookup of SalesOrderIDs from the SalesOrderHeader table.
When run, the data in the advanced table looks like this:
With aggregate tables and a few minutes of development time, you have now provided your analysts with some vital customer data. For more advanced reporting, a proper data warehouse, cube, or front-end reporting system will be required, but for highly formatted or ad-hoc reporting, aggregate tables can be a powerful tool.