Wednesday, December 26, 2007

Range partitioning large database tables

Divide and Conquer --Gaius Julius Caesar

Table partitioning in data warehouses is used to divide a large database table into smaller physical tables called partitions1 that are easier to manage. The partitions still retain a single logical view for the user query execution. Range partitioning by date facilitates historical ETL processing and trend reporting:
  1. ETL: Only current partition is impacted for load and index rebuild, the downtime is minimized as new data is rolled in while old data is rolled out
  2. Querying and Reporting: Only relevant partitions are read based on the query's where clause, there is a significant improvement in response times
ETL

In a properly designed star schema, the fact table is the only large table. The daily load of the fact table is a resource intensive process and can take several hours. Rebuilding the indexes (after the load) on a large table can sometimes takes longer than the load itself. Range partitioning provides and easy way out of this bind. In the star schema example, the sales transaction fact is partitioned by months:
Oracle> CREATE TABLE sales_transaction (
calendar_key NUMBER(8),
transaction_detail_key NUMBER(5),
product_key NUMBER(6),
sales_org_key NUMBER(5),
transaction_number VARCHAR2(20),
transaction_line NUMBER(4),
dollar_retail_amount NUMBER(10,2),
dollar_discount_amount NUMBER(10,2),
dollar_tax_amount NUMBER(10,2),
cost_of_goods_sold NUMBER(10,2),
insert_date
DATE
) NOLOGGING PARTITION BY RANGE(calendar_key) (
PARTITION sales_jan2007 VALUES LESS THAN(20070201),
PARTITION sales_feb2007 VALUES LESS THAN(20070301),
PARTITION sales_mar2007 VALUES LESS THAN(20070401),
PARTITION sales_apr2007 VALUES LESS THAN(20070501),
PARTITION sales_may2007 VALUES LESS THAN(20070601),
PARTITION sales_jun2007 VALUES LESS THAN(20070701)
);
A few months are shown here, in reality the months span several years. The indexes created on this table are also partitioned by month, as signified by the LOCAL clause:
Oracle> CREATE BITMAP INDEX sales_trans_calendar_key_bix ON sales_transaction (calendar_key) NOLOGGING LOCAL;
Oracle> CREATE BITMAP INDEX sales_trans_trans_det_key_bix ON sales_transaction (transaction_detail_key) NOLOGGING LOCAL;
Oracle> CREATE BITMAP INDEX sales_trans_product_key_bix ON sales_transaction (product_key) NOLOGGING LOCAL;
Oracle> CREATE BITMAP INDEX sales_trans_sales_org_key_bix ON sales_transaction (sales_org_key) NOLOGGING LOCAL;
Duration, several hours: To load the data for the month of July 2007, first create a new table. The data load for the new month is then done to this independent table. After the load is complete, the indexes can be created (The LOCAL clause is not specified as this is not a partitioned table):
Oracle> CREATE TABLE sales_transaction_load (
calendar_key NUMBER(8),
transaction_detail_key NUMBER(5),
product_key NUMBER(6),
sales_org_key NUMBER(5),
transaction_number VARCHAR2(20),
transaction_line NUMBER(4),
dollar_retail_amount NUMBER(10,2),
dollar_discount_amount NUMBER(10,2),
dollar_tax_amount NUMBER(10,2),
cost_of_goods_sold NUMBER(10,2),
insert_date
DATE
) NOLOGGING;

Oracle> CREATE BITMAP INDEX sales_trans_load_cal_key_bix ON sales_transaction_load (calendar_key) NOLOGGING;
Oracle> CREATE BITMAP INDEX sales_trans_load_t_det_key_bix ON sales_transaction_load (transaction_detail_key) NOLOGGING;
Oracle> CREATE BITMAP INDEX sales_trans_load_prod_key_bix ON sales_transaction_load (product_key) NOLOGGING;
Oracle> CREATE BITMAP INDEX sales_trans_load_s_org_key_bix ON sales_transaction_load (sales_org_key) NOLOGGING;
Duration: sub-second: The loaded data and pre-built indexes are now appended to the main partitioned table:
Oracle> ALTER TABLE sales_transaction ADD PARTITION sales_jul2007 VALUES LESS THAN (20070801);

Oracle> ALTER TABLE sales_transaction EXCHANGE PARTITION
sales_jul2007
WITH TABLE sales_transaction_load INCLUDING INDEXES WITHOUT VALIDATION;
Duration, sub-second: When a new month added, an old month probably needs to be deleted as well. The data in the old month partition is swapped into an empty independent table for archival processing, and then deleted from the main table:
Oracle> CREATE TABLE sales_transaction_archive (
calendar_key NUMBER(8),
transaction_detail_key NUMBER(5),
product_key NUMBER(6),
sales_org_key NUMBER(5),
transaction_number VARCHAR2(20),
transaction_line NUMBER(4),
dollar_retail_amount NUMBER(10,2),
dollar_discount_amount NUMBER(10,2),
dollar_tax_amount NUMBER(10,2),
cost_of_goods_sold NUMBER(10,2),
insert_date
DATE
) NOLOGGING;

Oracle> CREATE BITMAP INDEX sales_trans_arch_cal_key_bix ON sales_transaction_archive (calendar_key) NOLOGGING;
Oracle> CREATE BITMAP INDEX sales_trans_arch_t_det_key_bix ON sales_transaction_archive (transaction_detail_key) NOLOGGING;
Oracle> CREATE BITMAP INDEX sales_trans_arch_prod_key_bix ON sales_transaction_archive (product_key) NOLOGGING;
Oracle> CREATE BITMAP INDEX sales_trans_arch_s_org_key_bix ON sales_transaction_archive (sales_org_key) NOLOGGING;

Oracle> ALTER TABLE sales_transaction EXCHANGE PARTITION sales_jan2007
WITH TABLE sales_transaction_archive
INCLUDING INDEXES WITHOUT VALIDATION;
Oracle> ALTER TABLE sales_transaction DROP PARTITION sales_jan2007;

Duration, several hours: The sales_transaction_archive table can now be archived or backed-up to offline storage.

As shown above, the resource and time intensive load, index build, and archive processing is all done on smaller tables. The addition or deletion of data on the partitioned table is an extremely fast process with negligible downtime for the main table that is being queried by the users.

Note that constraints such as primary keys and associated non-partitioned indexes (GLOBAL indexes in Oracle-speak) are not used in data warehousing. All the constraint checks are performed by the ETL process. Also, database logging (known as hot backup or online backup) is not required for ETL batch processing as signified by the NOLOGGING clause in the DDL statements shown above.

Querying and Reporting

Consider a typical query on the star schema:
Oracle> SELECT SUM(dollar_retail_amount), SUM(cost_of_goods_sold)
FROM sales_transaction, calendar, transaction_detail
WHERE sales_transaction.calendar_key = calendar.calendar_key
AND sales_transaction.transaction_detail_key = transaction_detail.transaction_detail_key
AND calendar.quarter = '2007Q2'
AND transaction_detail.transaction_type = '1-Regular';
The sales_transaction fact table has 10 million rows per month and half a billion rows over 4 years of history. This query needs to read data only for 3 partitions related to the second quarter of 2007. The explain plan2 shows the partition pruning on the fact table, as seen by the Pstart and Pstop columns, to indicate that only a fraction of the overall table and index data is being accessed to return the results of the query:
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Pstart| Pstop |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | |
| 1 | SORT AGGREGATE | | | |
| 2 | HASH JOIN | | | |
| 3 | TABLE ACCESS FULL | CALENDAR | | |
| 4 | HASH JOIN | | | |
| 5 | TABLE ACCESS FULL | TRANSACTION_DETAIL | | |
| 6 | PARTITION RANGE SUBQUERY | |KEY(SQ)|KEY(SQ)|
| 7 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES_TRANSACTION |KEY(SQ)|KEY(SQ)|
| 8 | BITMAP CONVERSION TO ROWIDS | | | |
| 9 | BITMAP AND | | | |
| 10 | BITMAP MERGE | | | |
| 11 | BITMAP KEY ITERATION | | | |
| 12 | BUFFER SORT | | | |
| 13 | TABLE ACCESS FULL | TRANSACTION_DETAIL | | |
| 14 | BITMAP INDEX RANGE SCAN | SALES_TRANS_TRANS_DET_KEY_BIX |KEY(SQ)|KEY(SQ)|
| 15 | BITMAP MERGE | | | |
| 16 | BITMAP KEY ITERATION | | | |
| 17 | BUFFER SORT | | | |
| 18 | TABLE ACCESS FULL | CALENDAR | | |
| 19 | BITMAP INDEX RANGE SCAN | SALES_TRANS_CALENDAR_KEY_BIX |KEY(SQ)|KEY(SQ)|
----------------------------------------------------------------------------------------------
Note that the query above did not refer to the partition key (sales_transaction.calendar_key) directly:
WHERE sales_transaction.calendar_key between (20070401 and 20070630)
...but through the join with the calendar dimension. The database optimizer, is therefore, only able to perform partion pruning at runtime (as opposed to query parse time). This is called dynamic partition pruning, and is required to support the typical data warehouse trend queries.


1 In Oracle, a partition is a table partition. To confuse the terminology, DB2 has traditionally used the term partition to define the hash partitioned node of an MPP cluster. Now with newer releases, DB2 also has table level range partitions in addition to node level hash partitions.

2 You also need to know the use of bitmap indexes in a star transformation to fully understand this explain plan.