We looked at the significant performance gains achieved by aggregating the balances fact. The case for aggregations is not as clear-cut with the transactions fact.
Sparse transactions
The key difference between balances and transactions for aggregations is that the transactions are sparse.
With balances, you need to maintain a balance for every customer, every product, every day irrespective of whether the balance changes or not. Thus, aggregation on balances always results in significant and known reduction in data volume.
You will NOT have a transaction for every customer, every product, every day. The transaction volume is small relative to balances:
| 30 million Accounts | 90 million Balances per day | 2 million transactions per day |
Aggregating transactions does not help much
Let us continue with the in the transactions fact example:
| Account (customer) 30 million records | Product | |||
| \ | / | |||
| Account customer demographics mini dimension | - | Summarized Transactions 2 million records per day | - | Organization (branch, financial adviser) |
| / | \ | |||
| Account status mini dimension | Calendar |
When the transactions are aggregated by eliminating the customer dimension, the chances are much higher that there will be a transaction for every day, every product, every organization unit, and every account status, etc. The volume of aggregated transactions fact table, therefore, ends up being very similar to the aggregated balances table that was derived after eliminating the customer dimension:
| Account customer demographics mini dimension | Product | |||
| \ | / | |||
| Aggregated Transactions 1 million records per day | - | Organization (branch, financial adviser) | ||
| / | \ | |||
| Account status mini dimension | Calendar |
You do not achieve the kind of performance improvement here that was achieved for the aggregated balances fact. The dimensionality of the balances and transactions being same at this point, they can be stored together in the same table. All the types of aggregations including the reduction in time periods stored can now be applied to this fact.