In the absence of other advanced techniques for star join optimization, nested fact join with dimensions while ensuring that the indexes are used for each dimension join delivers reasonable performance.
The following indexes are created on the star schema example shown:
- Primary key unique index on surrogate key for each dimension. One index each is created on: calendar (calendar_key), transaction_details (transaction_detail_key), product (product_key), sales_organization (sales_org_key)
- One index is created for each foreign key on the fact table: sales_transaction (calendar_key), sales_transaction (transaction_detail_key), sales_transaction (product_key), sales_organization (sales_org_key)
- In addition some indexes may be created on the dimension attributes that are frequently queried: transaction_details (transaction_type), product (category), sales_organization (store), etc.
As long as the optimization rules for a star schema are followed, the explain plan determined by the optimizer will look very similar to this:mysql> SELECT transaction_type, product.category, store, sum(dollar_retail_amount - dollar_discount_amount) AS dollar_sales
-----> FROM sales_transaction, calendar, transaction_details, product, sales_organization
-----> WHERE sales_transaction.calendar_key = calendar.calendar_key
-----> ANDsales_transaction.transaction_detail_key = transaction_details.transaction_detail_keyAND sales_transaction.product_key = product.product_key
----->
-----> AND sales_transaction.sales_org_key = sales_organization.sales_org_key
-----> AND calendar.month = 200702
-----> AND transaction_type = '1-Regular'
-----> AND product.category in ('Apparel', 'Personal Care')
-----> AND store in ('Sierra', 'Tundra')
-----> GROUP BYtransaction_type, product.category, store;
The following steps are executed in this explain plan:.+---+-------------+---------------------+--------+-----------------------------------------------------------------------------------+------------------------+------------------------------------------------+----------------------------------------------+
.|id | select_type | table | type | possible_keys | key | ref | Extra |
.+---+-------------+---------------------+--------+-----------------------------------------------------------------------------------+------------------------+------------------------------------------------+----------------------------------------------+
1| 1 | SIMPLE | transaction_details | range | PRIMARY,trans_type_idx |trans_type_idx| NULL | Using where; Using temporary; Using filesort |
2| 1 | SIMPLE | sales_transaction | ref | s_trans_cal_key_idx,s_trans_detail_key_idx,s_trans_prod_key_idx,s_trans_s_org_idx |s_trans_detail_key_idx| dwh.transaction_details.transaction_detail_key | Using where |
3| 1 | SIMPLE | product | eq_ref | PRIMARY,prod_category_idx | PRIMARY | dwh.sales_transaction.product_key | Using where |
4| 1 | SIMPLE | sales_organization | eq_ref | PRIMARY,sales_org_store_idx | PRIMARY | dwh.sales_transaction.sales_org_key | Using where |
5| 1 | SIMPLE | calendar | eq_ref |PRIMARY |PRIMARY | dwh.sales_transaction.calendar_key |Using where |.+---+-------------+---------------------+--------+-----------------------------------------------------------------------------------+------------------------+------------------------------------------------+----------------------------------------------+
- Filter dimension (transaction_details) rows using the index in the filtered attribute in the where clause
- Filter fact rows by joining the fact to the dimension rows filtered in step 1, the fact foreign key index is used for the join
- Filter the fact rows further by joining it with the next dimension (product), the dimension primary key index is used for the join
- Filter the fact rows further by joining it with the next dimension (sales_organization), the dimension primary key index is used for the join
- Filter the fact rows further by joining it with the next dimension (calendar), the dimension primary key index is used for the join