I’ve recently been involved in a number of projects requiring some complex database partitioning strategies. The partitioning strategies chosen were not immediately obvious – in fact they were designed in an iterative manner, taking into account a number of different requirements which, at first, seemed somewhat contradictory, but with refinement of the requirements resulted in strategies that fulfilled multiple goals. This has spurred me to write this article, as I feel this is a much overlooked, yet essential topic for large datawarehouses.
For the sake of clarity, the partitioning referred to in this article is horizontal table partitioning – the division of a table into multiple partitions by associating rows together in each partition. (Another form of partitioning is vertical partitioning – the division of a table by associating groups of columns together. Since vertical partitioning is a structural change to a design, it necessitates coding changes to access and modify the data for any one row. Vertical partitioning is not discussed in this article.)
There are a number of goals that many partitioning strategies try to solve. Not all are present in each scenario:
One of the main goals of partitioning is to ensure that database performance is consistent even as the data volume grows over time: With an unpartitioned table, the worst case performance is that performance will decline either in proportion to the number of rows of data in the table, or at a greater rate still, disproportional to the number of rows of data in the table.
The goal of partitioning in this scenario is to ensure that the worst case performance is that performance will decline either in proportion to the number of rows of data in a partition, or at a greater rate still, disproportional to the number of rows of data in a partition.
Therefore the goal of partitioning is to change the relationship between data volume and performance from one where the number of rows in a table is a determinant of performance to one where the number of rows in a partition is a determinant.
Another goal of partitioning is to isolate the loading of data, so that loading of data does not impact the availability of data in partitions not being subject to a data load.
An example of such a scenario is a centralised datawarehouse for global data, where data is loaded from different regions in different timezones around the world. Another example is where the data load must be atomic – either all the data in the load succeeds in loading or none of it must succeed, for example where partial data will distort the inference derived from the data. Yet another example is where data locking is required to ensure consistency of concurrent query results during the load process – and it is desired to isolate the data being loaded from a locking perspective.
Sometimes there is a requirement to isolate ranges of data for other reasons: Perhaps the data stored is for multiple clients and there is a requirement to ensure that each client’s data is isolated from each other, not so much from a security perspective (although row security may be a requirement), but to ensure high speed query performance. In order to ensure such performance, queries will need to include a mandatory client criterion, but if the main bulk of queries comes from clients themselves about their own data, then partitioning by the client key may be a great strategy.
It is frequently the case that older data is no longer required after a period of time. That data may be archived elsewhere, or simply deleted. However deleting large amounts of data can be a very time-consuming operation that can impact both reporting and loading cycles.
Utilising partitioning, an entire partition of data can be instantaneously dropped, reducing both the duration and any locking impact on the remaining data. Since the dropping of a partition is typically a meta-data operation (involving the de-allocation of an entire region of file or memory), the operation will always be near-instantaneous, in contrast to a typical delete operation, which will vary in duration, often in proportion to the quantity of data deleted.
In a datawarehouse where older data is largely immutable, partitioning can be used to reduce the dataset backed up in each backup cycle, by only backing up the latest partitions.
Where restoration time is important and where restoring the latest data before older data has precedence, partitioning can be used to restore the most recent data before older data.
Simple strategies are those that essentially partition the data by one simple logical dimension key. In many warehouses, this is likely to be a time-correlated dimension, because time-correlated partitioning strategies inherently facilitate the pruning of old data. Also a time-correlated partitioning strategy inherently facilitates any query where the time-correlated dimension is a query criterion.
Where the underlying database supports backups of individual partitions, or permits the exclusion of partitions marked as read-only, time-correlated partitioning also facilitates the goal of reducing backup and restore times.
With so many goals fulfilled by time-correlated partitioning, it is no wonder that this partitioning strategy is the most popular strategy adopted among many datawarehouses.
However time-correlated partitioning is a dynamic partitioning strategy – partitions need to be created over time according as new time-correlated dimension keys are created. Some database systems automatically create partitions for each new value, making this operation simple (such as HP Vertica) while others (such as Microsoft SQL Server) do not, necessitating the writing of custom code for partition management.
Partitioning by a dimension key that is non-time correlated can fulfill specific goals, such as data isolation and isolation of loading processes, but on its own, it will not fulfill goals such as high-speed bulk deletion of aged data and reducing backup/restore times by excluding aged, immutable, read-only data. Because these latter requirements are frequently requested, it is not surprising that partitioning by a dimension that is not correlated with time is a less popular strategy.
Range partitioning is a strategy where the rows in a table are partitioned according to a range of data values, rather than one partition per discreet value. Its advantage is that there can be many fewer partitions than the discreet values present in the partition. This is particularly useful where the underlying database has a limit on the number of partitions that any table can support and the number of discreet values exceeds this. It’s also useful where the underlying database uses columnar compression, because the compression may be much more efficient operating on fewer, larger partitions than many, smaller partitions. The downside to this is that there may be drastically different number of rows present in each partition where the distribution of partition key values is highly skewed.
List partitioning is a static partition strategy where a pre-defined list of partition key values determines into which partition data is inserted. Because it is a static partition strategy, this is only useful where the list of all data values is known from the outset; otherwise manual intervention will be necessary if new, as yet unknown partition values are added to the database.
Hash Partitioning is a special partitioning strategy worthy of mention: It involves creating a hash of a dimension key or other value, rather than using a dimension key or other value itself. Depending upon the hashing algorithm, the number of partitions will be determined by the size of the hash bucket – all values sharing the same hash will be placed into the same partition. The advantage of this algorithm is that the number of partitions required will be constant and the data when hashed, should provide a fairly even distribution of data in each partition. Some database systems will automatically create the partitions depending upon the column/s serving as the input to the hashing algorithm.
The disadvantage to hash partitioning is that it will often preclude many of the goals of partitioning, such as the pruning of aged data or reducing the time taken to backup/restore data.
For range queries spanning a number of different partition values, partition elimination will not be possible without re-writing the queries to express the query predicates in terms of the hash values. This will preclude the use of many BI front-end tools which dynamically generate SQL based upon the underlying entity-relationship model.
To provide any performance enhancement, the column/s required for the hashing algorithm should be commonly used query criterion/criteria.
Hash partitioning does not solve many of the goals mentioned in this article. It’s main use is in database systems where large tables having no obvious partitioning key require to be distributed over multiple storage devices to increase throughput and/or parallelism, and there is no other structure in the database system’s architecture for providing storage on multiple devices. For this reason, its use should be considered very carefully.
Composite partitioning is a strategy of partitioning that partitions based upon values in more than one column. In a dimensional model, if the multiple columns are different dimension keys, then a partition can be created for each combination of dimension values, or range/list of values.
Some database systems may support composite partitioning on multiple columns: For those that do not, similar functionality may be introduced into the model by creating a dimension key that refers to a snowflake dimension, where multiple branches of the snowflaked dimension represent different dimensions.
Related to composite partitioning is the concept of correlated dimension partitioning. This is where two logical dimensions are combined into a single dimension, with a single hierarchy, where it is realised that one dimension is always correlated with values in another dimension.
An example of this is the modelling of financial risk systems: Simulations are run on a large number of trading positions in order to calculate the risk associated with each simulated scenario. Each set of simulations is normally calculated together and collated as a batch of data to be loaded. But each set of simulations is normally calculated for a specific point in time. Therefore there is a correlation between the unit of load and the time dimension. While there may be many sets of data (constituting multiple loads) at a single point in time, a batch of data never contains data from more than one point in time. Therefore we have a natural hierarchy, where any point in time can contain one or more batches, but never the other way around. This realisation permits us to combine the time and batch/load dimensions into a single dimension. Not only will this enhance performance, but combining the the load and the time dimensions into a single dimension permits us to partition the fact table on a single dimension key value and, more importantly, it permits us to reap the benefits associated with partitioning by a time-correlated dimension and partitioning by the unit of load. This is particularly useful where a database platform does not permit partitioning by multiple dimension keys.
Some database platforms enable different granularities of partitioning in the same table: They permit both the splitting and merging of ranges of the partition key column/s. An example would be where a system initially creates a partition for each date of data loaded, but after a month-end former months of immutable data have their partitions merged into a partition-per-month partitioning scheme.
This strategy is useful in particular scenarios:
However, there is a big downside to variable partition granularity: If data is loaded into partitions at a lower level of granularity than they are subsequently merged into, then should data in subsequently merged partitions require updating, it may be necessary either to split the previously merged partitions in order to update the data, or to develop an entirely different load cycle to deal with this scenario (never a great idea from a code-maintenance perspective).
Therefore partitioning with variable partition granularity is normally only a good idea where the data in the partitions being merged is forever after immutable.
There are certainly more partitioning strategies than would initially meet the untrained eye. A successful partitioning implementation depends upon an accurate analysis of the goals to be fulfilled by partitioning and selection of the most suitable strategy to achieve those goals.