Oracle - Partitions (Table and Indexes)
Table of Contents
1 - About
partition in Oracle.
Partitioning can provide great performance improvements because of partition elimination (pruning) capabilities, but also because parallel execution plans can take advantage of partitioning.
The data placement is controlled with additional information about the object, such as ranges of order data or hash buckets of customer id information.
A partition cuts out the storage of a table in several part. (ie segment)
Usually, the tables are:
- partitioned by time
- and sub-partitioned by the respective functional area.
The recommended partitioning range for most implementations is a month, however, you may want to consider implementing quarterly or yearly partitioning ranges.
2 - Articles Related
3 - Benefits
- Rolling off older periods with a partition by time. The older period's partition can be easily “DROPPED” and need not be deleted. Normally deletion will take longer time than a DDL
4 - The Partition Key Decision
The real challenge for database designers is not creating the partitioned objects; rather, it's the decision that precedes it—what type of partitioning to use and what column to use as a partition key.
The most important decisions in partition design are choosing the partitioning scheme and the column(s) to partition on, and these decisions depend heavily on the priorities of potentially conflicting objectives. Once you are familiar with the advantages of each type of partitioning scheme, you will be able to choose the scheme and column(s) appropriate for your partitioning needs
How do you decide on the type and columns used for partitioning? Your choice of partitioning has to address key partitioning objectives for manageability and performance. How you partition will be based on the priorities you assign to your objectives into the following categories:
- Ease of administration
- Data purge
- Data archiving
- Data movement
- Data lifecycle management
- Efficiency of backup
4.1 - Performance
This, of course, is the primary objective of many partitioning schemes. Performance advantages come from partition pruning or partitionwise joins, so if your queries do a lot of full-table scans, partitioning will help immensely, because partitions will limit the scope of the search.
4.1.1 - partition pruning
4.1.2 - partition-wise joins
4.1.3 - hash partition
hash partition on id between a fact and a dim table
4.2 - Ease of administration on large table (index rebuilds)
See Oracle Partition - Index (Local, Global), local index can be rebuild only for one partition:
ALTER INDEX in_trans_01 rebuild partition y05q1;
In addition to doing index rebuilds, you can also move tables across tablespaces, export tables, delete data, and so on—one partition at a time.
4.3 - Data purge / How to drop a partition?
Dropping a partition is a metadata change, It's very fast.
4.4 - Data archiving - Partition Exchange
4.5 - Data movement / Data Load
4.6 - Data lifecycle management
Most business data follows a predictable lifecycle: In the beginning, it is accessed moderately; then it is updated heavily; and finally access slows down to almost nothing. Regulatory requirements may mandate retention in the database of even this least accessed data. However, because the older data is accessed with decreasing frequency, you can use partitioning to develop a disk access strategy that lowers the total cost of ownership without sacrificing performance. Put partitions with the most accessed and updated data on the fastest disk storage, and put older, less accessed data on slower—and cheaper—disk storage.
For instance, in the TRANS table, the records of more-recent transactions, such as in the current partition, Y05Q4, are accessed heavily, followed by those of the partition immediately preceding it—Y05Q3—and so on. Because the older partitions are not accessed that frequently, it might be worthwhile to put them on a storage tier that is slower than the one on which the current (and most-accessed) partitions reside. One way to save money by using inexpensive storage is to create a new tablespace on the cheaper storage and move the partition to this new tablespace:
alter table trans move partition y05q1 tablespace y05q1_inexpensive;
During the partition move, the partition will be available for using SELECTs but not for UPDATEs.
Efficiency of backup. Consider the example at the beginning of this article—the TRANS table. In that example, each partition is located in a different tablespace named for the partition. If you can say with certainty that the records with a TRANS_DT value earlier than today (or earlier than some specific date) will not change, then you can also consider the corresponding partition to be read-only. In that case, you can convert the tablespace the partition resides in to read-only, as follows:
alter tablespace y05q1 read only;
When you make the tablespace read-only, the Oracle Recovery Manager (RMAN) backup can exclude it while making backups, because the tablespace will not change over time; one backup is enough. The more tablespaces you can make read-only, the shorter the duration of the RMAN job, which also reduces the load the RMAN job puts on the database. This benefit is most visible in databases containing historical data, especially data warehouses, in which the total amount of data to be backed up is usually quite high and you can make numerous tablespaces read-only.
5 - Decisions, Decisions
To make an informed partitioning decision, first prioritize the objectives for your own partitioning setup. For example, using some of this article's alternatives, suppose you decide on the following order of priority:
- 1. Data archiving
- 2. Data purge
- 3. Efficiency of backup
- 4. Ease of administration
- 5. Performance
You have deliberately chosen performance after the other objectives; in a different partitioning situation, however, performance may be the top priority.
5.1 - Data archiving and data purge
For the data archiving and data purge priorities, you have to choose a differentiating column. Is it time-based, as in the case of TRANS_DT in the TRANS example table? If so, which column differentiates the records to be archived and purged? The TRANS table example uses the TRANS_DT column as the partition key in a range-partitioned table, but suppose you have a similar table with a column named EXPIRY_DT, which indicates the date after which the record will definitely not be updated. In that case, to better meet the data archiving and data purge priorities, make EXPIRY_DT the partitioning key in the range-partitioning scheme.
Suppose the differentiating column contains discrete values, such as CAMPAIGN_CODE (in the ADVERTISING table), denoting advertising campaigns. After each campaign is completed, its records are archived and purged. To meet the data archiving and data purge priorities in this situation, use list partitioning with the CAMPAIGN_CODE column as the partitioning key. A range scheme would be useless here, because you would need to drop partitions of a specific CAMPAIGN_CODE, not the range that contains it.
Now suppose that most queries against the TRANS table do not choose TRANS_DT in the WHERE clause. Choosing this column as a partitioning column does not help performance. Is it a good decision to choose a range-partition scheme, with TRANS_DT as the partition key? The answer lies in your prioritizing. In this example, you list performance after data archiving and data purge, so this scheme offers the best solution. If you had prioritized performance over data archiving and data purge, you would have chosen a different scheme.
5.2 - Efficiency of backup
If efficient backup is your primary objective, ask yourself this question: Which column makes a record read-only? Suppose the answer is the TRANS_DT column. A record with a TRANS_DT value that is more than a month old will not be updated and should be considered read-only. In such a case, choose range partitioning on TRANS_DT, with each partition residing on a different tablespace, as shown in Listing 1. When the records on a partition will never be updated again, make the corresponding tablespace read-only. The RMAN backups will skip this tablespace during the backup, making the backup process faster.
5.3 - Ease of administration
If ease of administration is the primary objective, your partition decision must be based on how the data is modified. Suppose you do a lot of index rebuilds on tables, due to data loads. You should note the column that determines which data is to be loaded. Is it a date—as in the TRANS_DT column in the TRANS table? If so, a range-partitioned table with TRANS_DT as the partitioning key is most appropriate. Is this column a discrete value, as in PRODUCT_CODE? If so, list partitioning will be useful. If no such pattern exists, you can use hash partitioning on the table and rebuild the indexes of the generated partitions.
5.4 - Performance
If performance is the highest priority, choose the column that is used mostly in WHERE conditions to filter rows and in joins with other tables. If it's a time-based column, then range partitioning is very useful, as in the case of TRANS_DT in the TRANS table. Even if it's not time-based, the presence of some logical range will help. For instance, suppose the PRODUCT_CODE column (a number) follows a pattern such as 1000 to 1999 for consumer products, 2000 to 2999 for industrial products, and so on. When users query the table, are they focused on only a specific type of product, such as the industrial category? If that is the case, range partitioning can be used on the PRODUCT_CODE column, with partitions in the ranges 1000-1999, 2000-2999, and so on.
Combining partitioning strategies. You can also combine several of your most important goals with Oracle Database's composite partitioning strategies. For example, you can create a range/list-partitioned ADVERTISING table, using range partitioning on the AD_DT column to satisfy the performance requirement and list partitioning on the PRODUCT_CODE column to satisfy the data purge requirement.