16.10 - Multilevel Partitioning - Teradata Database

Teradata Database Design

Product
Teradata Database
Release Number
16.10
Release Date
June 2017
Content Type
User Guide
Publication ID
B035-1094-161K
Language
English (United States)

Row Partitioning Across AMPs for 3 Partitioning Levels

The following graphic, which is best viewed on a color monitor and best printed on a color printer, shows the partitioning of rows based on a table that has 3 partitioning levels based on the table defined in Detailed Multilevel Partitioning Example.

The only difference between this graphic and the data in the table for the multilevel row partitioning example is that there is only one row per combined partition number, while the graphic has anywhere between 1 and 8 rows per combined partition number.

This graphic appears again later in this chapter in support of a more detailed explanation of multilevel partitioning.



Multilevel partitioning allows each partition at a given level to be further partitioned into subpartitions. Each partition for a level is subpartitioned the same per a partitioning expression or by column partitioning defined for the next lower level. Rows are grouped by the combined partition number, and within a group are ordered first by hash value and then, if for a partitioned primary index, by uniqueness value. The combined partition numbers are mapped 1-to-1 with internal partition numbers on which the rows on an AMP are ordered. The file system orders rows by the internal partition number, rowhash value, and then uniqueness value.

For multilevel partitioning, Teradata Database hash orders the rows within the lowest partition levels. A multilevel partitioning undertakes efficient searches by using row partition elimination at the various levels or combinations of levels. See SQL Request and Transaction Processing for a description of row partition elimination and its various forms.

The following list describes the various access methods that are available when multilevel partitioning is defined for a table.

  • If there is an equality constraint on the primary index and there are constraints on the partitioning columns such that access is limited to a single partition at each level, access is as efficient as with a nonpartitioned table.

    This is a single-AMP, single-hash access in a single subpartition at the lowest level of the partition hierarchy.

  • With constraints defined on the partitioning columns, performance of a primary index access can approach the performance of a nonpartitioned primary index depending on the extent of row partition elimination that can be achieved.

    This is a single-AMP, single-hash access in multiple (but not all) subpartitions at the lowest level of the partition hierarchy.

  • Access by means of equality constraints on the primary index columns that does not also include all the partitioning columns, and without constraints defined on the partitioning columns, might not be as efficient as access with a nonpartitioned primary index. The efficiency of the access depends on the number of populated subpartitions at the lowest level of the row partition hierarchy.

    This is a single-AMP, single-hash access in all subpartitions at the lowest level of the partition hierarchy.

  • With constraints on the partitioning columns of a partitioning expression such that access is limited to a subset of, say n percent, of the partitions for that level, the scan of the data is reduced to about n percent of the time required by a full-table scan.

    This is an all-AMP scan of only the non-eliminated row partitions for that level. This allows multiple access paths to a subset of the data: one for each partitioning expression.

    If constraints are defined on partitioning columns for more than one of the partitioning expressions in a multilevel partitioning definition, row partition elimination can lead to even less of the data needing to be scanned.

Partitioning CHECK Constraint for Multilevel Partitioning

A multilevel partitioned table has the following partitioning CHECK constraint, which the system stores in DBC.TableConstraints. Call this partitioning constraint form 3.



where:

Syntax element … Specifies the …
nn number of levels, or number of partitioning expressions, in the multilevel partitioning.

nn can range between 02 and 62, inclusive.

partitioning_expression_1 the first multilevel partitioning level.
partitioning_expression_2 the second multilevel partitioning level.
partitioning_expression_n the n th multilevel partitioning level.

Note that if the multilevel partitioning has 3 levels, there are 3 NOT NULL partitioning expressions in the implied constraint, if the multilevel partitioning has 10 levels, there are 10 NOT NULL partitioning expressions in the implied constraint, and so on.

Single-level partitioned tables have a different implied table-level partitioning CHECK constraint (see Single-Level Partitioning).

You can use the following query to retrieve a list of tables and join indexes that have PPIs and their partitioning constraint text.

     SELECT DatabaseName, TableName (TITLE 'Table/Join Index Name'),             ConstraintText
     FROM DBC.IndexConstraintsV
     WHERE ConstraintType = 'Q'
     ORDER BY DatabaseName, TableName;

You can use a query like the following to retrieve partitioning constraint information for each of the multilevel partitioned objects.

     SELECT *
     FROM DBC.TableConstraints
     WHERE ConstraintType = 'Q'
     AND   SUBSTRING(TableCheck FROM 1 FOR 13) >= 'CHECK (/*02*/'
     AND   SUBSTRING(TableCheck FROM 1 FOR 13) <= 'CHECK (/*15*/';

The TableCheck column of DBC.TableConstraints contains the unresolved condition text for a table-level constraint check or implicit table-level constraint such as a partitioning constraint. The ConstraintType code for such an implicit table-level constraint is Q for an object with a partitioned primary index, where in the case of a multilevel partitioning, each of the partitioning levels for the index appears once in the order defined for the table in the text contained in TableCheck. See Data Dictionary for details.

Rows that violate this implied index CHECK constraint, including those whose partitioning expression evaluates to null, are not allowed in the table.

Assume you create the following table:

     CREATE TABLE orders (
       o_orderkey      INTEGER NOT NULL,
       o_custkey       INTEGER,
       o_orderstatus   CHARACTER(1) CASESPECIFIC,
       o_totalprice    DECIMAL(13,2) NOT NULL,
       o_orderdate     DATE FORMAT 'yyyy-mm-dd' NOT NULL,
       o_orderpriority CHARACTER(21),
       o_clerk         CHARACTER(16),
       o_shippriority  INTEGER,
       o_comment       VARCHAR(79))
     PRIMARY INDEX (o_orderkey)
     PARTITION BY (RANGE_N(o_custkey   BETWEEN 0
                                       AND 49999
                                       EACH  100),
                   RANGE_N(o_orderdate BETWEEN DATE '2000-01-01'
                                       AND     DATE '2006-12-31'
                                       EACH INTERVAL '1' MONTH))
     UNIQUE INDEX (o_orderkey);

The partitioning CHECK constraint SQL text that would be stored in DBC.TableConstraints for this multilevel partitioned primary index is as follows.

     CHECK (/*02*/ RANGE_N(o_custkey   BETWEEN 0
                                       AND 49999
                                       EACH  100)
                   IS NOT NULL
               AND RANGE_N(o_orderdate BETWEEN DATE '2000-01-01'
                                       AND     DATE '2006-12-31'
                                       EACH INTERVAL '1' MONTH)
                   IS NOT NULL )

The maximum size of this partitioning CHECK constraint is 16,000 characters.

Partitioning CHECK Constraints

A partitioned table or join index has the following partitioning CHECK constraint.

     CHECK (/*nn bb cc*/ partitioning_constraint_1
            [AND partitioning_constraint_2] … )

where:

Syntax element … Specifies …
nn the number of partitioning levels.
  • For 2-byte partitioning, nn ranges between 01 and 15, inclusive.
  • For 8-byte partitioning, nn ranges between 01 and 62, inclusive.
bb the type of partitioning.
  • For 2-byte partitioning, bb is 02.
  • For 8-byte partitioning, bb is 08.
cc the column partitioning level.
  • For no column partitioning, cc is 00.
  • Otherwise, cc ranges between 01 and nn, inclusive.
Each of the partitioning constraints corresponds to a level of partitioning in the order defined for the table or join index.

The constraint specified by partitioning_constraint_i can be the partitioning constraint for any level and can represent either a row partitioning expression or a column partitioning expression.

partitioning_constraint_i When partitioning_expression_i is the row partitioning expression at level i, and partitioning constraint is the following.
   partitioning_expression_i /*i d+a*/ IS NOT NULL
When partitioning_expression_i is the column partitioning expression at level i is the following.
   PARTITION#Li /*i d+a*/ =1
i an integer ranging between 1 and nn, inclusive.

Leading zeros are not specified.

d the number of currently defined partitions for the level.

Leading zeros are not specified.

For a column-partitioned level, this includes the two internal column partitions.

a the number of additional partitions that could be added (this can be 0) or x.

Leading zeros are not specified.

x occurs for level 2 and higher if the partitioning constraint form is only being used because the cost profile constant PartitioningConstraintForm is set to 1 to force use of the new constraint form in all cases.

If the constraint form would be used regardless of the setting of PartitioningConstraintForm or this is for level 1, a is an integer number.

Assume you create the following table:

     CREATE TABLE orders (
       o_orderkey      INTEGER NOT NULL,
       o_custkey       INTEGER,
       o_orderstatus   CHARACTER(1) CASESPECIFIC,
       o_totalprice    DECIMAL(13,2) NOT NULL,
       o_orderdate     DATE FORMAT 'yyyy-mm-dd' NOT NULL,
       o_comment       VARCHAR(79))
     NO PRIMARY INDEX
     PARTITION BY (RANGE_N(o_custkey BETWEEN  0
                                     AND 100000
                                     EACH     1),
                   COLUMN)
     UNIQUE INDEX (o_orderkey);

The table-level partitioning CHECK constraint SQL text for this 8-byte partitioning is as follows:

     CHECK (/*02 08 02*/ RANGE_N(o_custkey BETWEEN  0
                                           AND 100000
                                           EACH     1
            /*1 100001+485440633518572409*/
            IS NOT NULL AND PARTITION#L2 /*2 9+10*/ =1)

The maximum size of this table-level CHECK constraint is 16,000 characters.

You can use the following request to retrieve the level for the column partitioning for each of the objects that have column partitioning in the system.

     SELECT DBaseId, TVMId, ColumnPartitioningLevel
            (TITLE ‘Column Partitioning Level’)
     FROM DBC.TableConstraints
     WHERE ConstraintType = 'Q'
     AND   ColumnPartitioningLevel >= 1
     ORDER BY DBaseId, TVMId;

See Partitioning CHECK Constraints for Single-Level Partitioning and Partitioning CHECK Constraint for Multilevel Partitioning for information about the table-level CHECK constraints that Teradata Database creates for single-level and multilevel partitioned primary index tables and join indexes.

Row Partition Elimination With Multilevel Partitioning

The following examples taken from various industries present examples of how multilevel partitioned primary indexes and row partition elimination can greatly enhance the performance of a query workload.

Example From the Insurance Industry

There are many cases for which row partition elimination using multiple expressions for WHERE clause predicate filtering can enhance query performance (see SQL Request and Transaction Processing for details about the various forms of row partition elimination). For example, consider an insurance company that frequently performs an analysis for a specific state and within a date range that constitutes a relatively small percentage of the many years of claims history in its data warehouse.

If an analysis is being performed only for claims filed in Connecticut, only for claims filed in all states in June 2005, or only for Connecticut claims filed during the month of June 2005, a partitioning of the data that allows elimination of all but the desired claims should deliver a dramatic performance advantage.

The following example shows how a claims table could be partitioned by a range of claim dates and subpartitioned by a range of state identifiers using multilevel partitioning.

Consider the following multilevel PPI table definition:

     CREATE TABLE claims (
       claim_id    INTEGER NOT NULL,
       claim_date  DATE NOT NULL,
       state_id    BYTEINT NOT NULL,
       claim_info  VARCHAR(20000) NOT NULL)
     PRIMARY INDEX (claim_id)
     PARTITION BY (RANGE_N(claim_date BETWEEN DATE '1999-01-01'
                                      AND     DATE '2005-12-31'
                                      EACH INTERVAL '1' MONTH),
                   RANGE_N(state_id   BETWEEN 1
                                      AND    75
                                      EACH 1))
     UNIQUE INDEX (claim_id);

Eliminating all but one month out of their many years of claims history would facilitate scanning of less than 5% of the claims history (because of business growth, there are many more recent than past claims) for satisfying the following query:

     SELECT *
     FROM claims
     WHERE claim_date BETWEEN DATE '2005-06-01' AND DATE '2005-06-30';

Similarly, eliminating all but the Connecticut claims from the many states in which this insurance company does business would make it possible to scan less than 5% of the claims history to satisfy the following query:

     SELECT *
     FROM claims, states
     WHERE claims.state_id = states.state_id
     AND   states.state = 'Connecticut';

State selectivity varies by the density of their business book. The company has more business in Connecticut than, for example Oregon and, therefore, a correspondingly larger number of claims for Connecticut. Note that the partitioning in the example specifies up to 75 state_id values to allow for any of the 50 US states plus US territories and protectorates such as Puerto Rico, Guam, American Samoa, the Marshall Islands, Federated States of Micronesia, Northern Mariana Islands, Palau, and the American Virgin Islands in the future.

Combining both of these predicates for row partition elimination makes it possible to scan less than 0.5% of the claims history to satisfy the following query.

     SELECT *
     FROM claims, states
     WHERE claims.state_id = states.state_id
     AND   states.state = 'Connecticut'
     AND   claim_date BETWEEN DATE '2005-06-01' AND DATE '2005-06-30';

For evenly distributed data, you would expect scanning of less than 0.25% of the data; however, the data for this company is not evenly distributed among states and dates, leading to a higher percentage of data to scan for the query.

Clearly, combining both predicates for row partition elimination has a significant performance advantage. Row partition elimination by both of these columns, as described, provides higher performance, more space efficiency, and more maintenance efficiency than a composite NUSI or join index for most of the queries run at this insurance company.

In fact, the performance advantage described previously could theoretically be achieved with the current single-level partitioning by using a single partitioning expression that combines both state and monthly date ranges. There are issues with using such a complex, single partitioning expression in this scenario but it is possible to do so.

The bigger problem is that a significant portion of the workload at this insurance company does not specify equality conditions on both partitioning columns. Though theoretically possible, Teradata Database is currently unable to evaluate more complex conditions with such a partitioning expression. With single-level partitioning, you would have to choose between partitioning by state or partitioning by date ranges.

If you chose partitioning by state, and a user submits a query that specifies a narrow date range without a state filter in the WHERE clause, the system does not perform row partition elimination. At this insurance company, both state-level (and, in some cases, also for a specific date range) analysis and enterprise-level (all states, but for a specific date range) analyses are common.

The users performing state-level analysis do not want to be penalized by having their data combined with all other states, but the users performing enterprise-level analyses also want their queries to be reasonably high-performing, at least by getting date range elimination, and be easy to construct.

Example from the Retail Industry

An example for a large retailer is similar to the insurance example, substituting division for state. In this case, the differences in size of partitions are even more exaggerated. For this company, division 1 is all of the United States, and represents greater than 85% of the entire business. But there are a number of divisions in other countries. The retailer currently replicates the data model for each country in order to provide reasonably efficient access to the data for an individual country. However, this is difficult to maintain, and limits the ability to do analyses across the entire company.

With multilevel partitioning, all the data can be stored in the same table, and analyses can be performed efficiently across either all of the data, or with subsets of the data. This example is also applicable to manufacturers with multiple divisions and date associated data.

Example from the Telecommunications Industry

Now consider the telecommunications industry. An example might be providing access to corporate billing information while also using the data for wider analysis. The difference is that there are many more customers than there are states or divisions. For example, suppose the company wants to start with 3,000 business customers, but they expect that number to grow considerably. This in turn results in more finely grained partitions. It also puts a great deal more pressure on the number of partitions and, therefore, limits choices in the granularity of date ranges or the next level partitioning.

You can use multilevel partitioning to improve query performance via row partition elimination, either at each of the partition levels or by combining all of them. Multilevel partitioning provides multiple access paths to the rows in the base table. As with other indexes, the Optimizer determines if the index is usable for a query and, if usable, whether its use provides the estimated least costly plan for executing the query.

You create multilevel partitioning by specifying two or more partitioning expressions, where each expression must be defined using either a RANGE_N function or a CASE_N function exclusively. The system combines the individual partitioning expressions internally into a single partitioning expression that defines how the data is partitioned on an AMP.

For example, assume there are three partitioning expressions, p 1, p 2, and p 3, defined as follows:

PARTITION BY(p 1,p 2,p 3 )

The first partitioning expression is the highest level partitioning. Within each of those partitions, the second partitioning expression defines how each of the highest-level partitions is subpartitioned. Within each of those second-level partitions, the third-level partitioning expression defines how each of the second level partitions is subpartitioned.

Within each of these lowest level partitions, rows are ordered by the row hash value of their primary index and their assigned uniqueness value.

Advantages of multilevel partitioning include the following:

  • Simplicity of the partitioning specification.
  • Partitioning expressions can be validated when they are created.
  • Multilevel partitioning allows for efficient altering of partitioning expressions in some common cases (see “ALTER TABLE” in SQL Data Definition Language).

Note that the number of levels of partitioning cannot exceed 15 for 2-byte partitioning or 62 for 8-byte partitioning because each level must define at least two partitions. The number of levels of partitioning might be further restricted by other limits such as the maximum size of the table header, data dictionary entry sizes, and so on.

Importance of Partition Order for Specifying Partitioning Expressions

The specification order of partitioning expressions can be important for multilevel partitioning. The system maps multilevel partitioning expressions into a single-level combined partitioning expression. It then maps the resulting combined partition number 1-to-1 to an internal partition number. Rows are in logical RowID order, where a RowID consists of an internal partition number, a row hash value, and a row uniqueness value (RowIDs are identical for single-level and multilevel partitioning).

This implies a partial physical ordering based on how the file system manages the data blocks and cylinders, though this is not a strict relationship. This physical ordering maintains the ordering of partitions (except for a possible wraparound of internal numbers at one point in the internal number sequence for each level). Multilevel partitioning expressions are analogous with a single-level partitioning expression that is identical to the combined partitioning expression for multilevel partitioning, at least in terms of expressing how rows are logically ordered by the file system.

There are several implications of this ordering.

  • Usage

    You can alter a partition level to change the number of partitions at that level to within 1 and the maximum number of partitions defined for that level when the table is populated with rows.

  • Query performance

    Row partition elimination at the lowest levels can increase overhead because of the frequent need to skip to the next internal partition to be read. This is because a partition at a lower level is split among the partitions at higher levels in the partition hierarchy.

    At higher levels in the partition hierarchy, there are more contiguous internal partitions to scan and skip.

  • Bulk data load performance

    If a load is order-based on one of the partitioning columns, having that partitioning column at the highest level in the partition hierarchy can improve load performance because those partitions are contiguous. For example, a date-based partition with daily data loads might benefit from having the date-based partitioning at the first level.

You define the ordering of the partitioning expressions in your CREATE TABLE SQL text, and that ordering implies the logically ordering by RowID. Because the partitions at each level are distributed among the partitions of the next higher level in the hierarchy, scanning a partition at a certain level requires skipping some internal partitions.

If the number of rows in each internal partition is large, then skipping to the next internal partition to be read incurs relatively little overhead.

If the system reads or skips only a few rows for each internal partition, performance might be somewhat worse than a full-table scan (or even significantly worse if a full-table scan were able to get more benefit out of block read-ahead or cylinder reads). You can exploit this ordering for a locality of reference performance benefit (meaning that by increasing the probability that certain partitions fall in the same cylinder by having their partitioning expression at a higher level, you can greatly enhance the performance of the scan).

Partition expression order does not affect the ability to eliminate partitions, but does affect the efficiency of a partition scan. As a general rule, this should not be a concern if there are many rows, which implies multiple data blocks, in each of the partitions.

Consider the following 2 cases.

  • For a table with 65,535 combined partitions, the maximum number of combined partitions per partitioned primary index is 6.5535 x 109 (billion) rows per AMP, 100 byte rows, and 50 KB data blocks, and assuming an equal distribution of rows among the partitions, each combined partition spans 200 to 201 data blocks.

    In this case, skipping over internal partitions should not incur significant overhead. Either all, or nearly all, of the rows in the data blocks read qualify, and the system skips at least 199 data blocks between each set of data blocks read.

  • If the table has only 6.5535 x 106 (million) rows per AMP, each combined partition has about 0.2 data blocks. In the worst case, where the system eliminates only every other partition at the lowest level, every data block is read, and a full-table scan would be more efficient.

    If there are 5 partitions at the lowest level, and the system can eliminate 4 out of 5 partitions, it still must read every data block.

    If there are 6 partitions at the lowest level, and the system can eliminate 5 out of the 6, it can skip some data blocks, but possibly not enough to overcome the overhead burden, so this might not be more efficient than a full-table scan.

    With a large number of partitions at the lowest level, and a large number of eliminated partitions at the lowest level, the system can probably skip enough more data blocks that the overhead burden can be overcome, and the operation might be more efficient than a full-table scan.

    This second case is somewhat artificial, and probably not a good use of multilevel partitioning. Instead of the demonstrated case, you should consider an alternative partitioning that results in multiple data blocks for each internal partition.

    A partitioning scheme that defines fewer levels of partitioning and fewer partitions per level, where the lowest level has the greatest number of partitions, ensures that there are more rows per combined partition, and would be far more useful. For example, if one level was initially partitioned in intervals of one day, changing the interval to one week or one month might be better.

Row partition elimination at the lower partition levels of a row partition hierarchy requires more skipping, which can both cause more I/O operations and increase the CPU path length.

To achieve optimal performance, you should specify a partitioning expression that is more likely to evoke row partition elimination for queries at a higher level and specify those expressions that are not as likely to evoke row partition elimination either at a lower level, or not at all. You should also consider specifying the row partitioning expression with the greatest number of row partitions at the lowest level.

As previously noted, the order of the row partitioning expressions might not be a significant concern if there are many data blocks per combined partition.

You can use the following query to find the average number of rows per populated combined row partition.

     SELECT AVG(pc)
     FROM (SELECT COUNT(*) AS pc
           FROM t
           GROUP BY PARTITION) AS pt;

Assuming the average block size is b and the row size is r, you can use the following query to find the average number of data blocks per populated combined row partition. The ideal number of data blocks per populated combined row partition is 10 or more.

     USING (b FLOAT, r FLOAT)
     SELECT (:r / :b) * AVG(pc) 
     FROM (SELECT COUNT(*) AS pc
           FROM t
           GROUP BY PARTITION) AS pt;

Different Multilevel Row Partitioning of the Same Table

The purpose of the following multipart example is to demonstrate the various properties of different multilevel row partitionings of the same data.

Stage 1: First multilevel row partitioning of the orders table.

     CREATE TABLE orders (
       o_orderkey INTEGER NOT NULL,
       o_custkey1 INTEGER,
       o_custkey2 INTEGER)
     PRIMARY INDEX (o_orderkey)
     PARTITION BY (RANGE_N(o_custkey1 BETWEEN 0
                                      AND    50
                                      EACH   10),  /* p1 */
                   RANGE_N(o_custkey2 BETWEEN 0
                                      AND   100
                                      EACH   10)) /* p2 */;

This definition implies the following information about the row partitioning of orders.

  • Number of row partitions in the first level, or highest, level = 6
  • Number of row partitions in second, or lowest, level = 11
  • Total number of combined partitions = 66
  • Combined partitioning expression = (p1 - 1) * d2 + p2 = (p1 - 1) * 11 + p2

Now if o_custkey1 is 15 and o_custkey2 is 55, the following additional information is implied:

  • Row partition number for level 1 = 2.
  • Row partition number for level 2 = 6.
  • PARTITION#L3 through PARTITION#L62 are all 0.
  • Combined partition number = PARTITION = (2-1)*11 + 6 = 17.

The following table indicates the row partition numbers for the various defined ranges for o_custkey1 and o_custkey2.

Value of o_custkey1 Value of o_custkey2 Value of (p1-1)*d2+p2 PARTITION Value of p1 PARTITION#L1 Value of p2 PARTITION#L2
0 - 9 0 - 9                  1 1                   1
10 - 19                  2                   2
20 - 29                  3                   3
30 - 39                  4                   4
40 - 49                  5                   5
50 - 59                  6                   6
60 - 69                  7                   7
70 - 79                  8                   8
80 - 89                  9                   9
90 - 99                10                 10
100                11                 11
10 - 19 0 - 9                 12 2                   1
10 - 19                 13                   2
20 - 29                 14                   3
30 - 39                 15                   4
40 - 49                 16                   5
50 - 59                 17                   6
60 - 69                 18                   7
70 - 79                 19                   8
80 - 89                 20                   9
90 - 99                 21                 10
100                 22                 11
20 - 29 0 - 9                 23 3                   1
10 - 19                 24                   2
20 - 29                 25                   3
30 - 39                 26                   4
40 - 49                 27                   5
50 - 59                 28                   6
60 - 69                 29                   7
70 - 79                 30                   8
80 - 89                 31                   9
90 - 99                 32                 10
100                 33                 11
30 - 39 0 - 9                34 4                   1
10 - 19                35                   2
20 - 29                36                   3
30 - 39                37                   4
40 - 49                38                   5
50 - 59                39                   6
60 - 69                40                   7
70 - 79                41                   8
80 - 89                42                   9
90 - 99                43                 10
100                44                 11
40 - 49 0 - 9                45 5                   1
10 - 19                46                   2
20 - 29                47                   3
30 - 39                48                   4
40 - 49                49                   5
50 - 59                50                   6
60 - 69                51                   7
70 - 79                52                   8
80 - 89                53                   9
90 - 99                54                 10
100                55                 11
50 0 - 9                56 6                   1
10 - 19                57                   2
20 - 29                58                   3
30 - 39                59                   4
40 - 49                60                   5
50 - 59                61                   6
60 - 69                62                   7
70 - 79                63                   8
80 - 89                64                   9
90 - 99                65                 10
100                66                 11

The following case examples demonstrate some of the properties of this multilevel partitioning scheme.

   SELECT * FROM orders WHERE o_custkey1 = 15;

In this case, there might be qualifying rows in the row partitions where the combined row partition numbers = (2 -1)*11 + (1 to 11) = 11 + (1 to 11) = 12 to 22.

   SELECT * FROM orders WHERE (o_custkey1 = 15) OR o_custkey1 = 25 AND 
o_custkey 2 BETWEEN 20 AND 50; 

In this case, there might be qualifying rows in the row partitions where the combined partition numbers are in (14 to 17, 25 to 28).

   SELECT * FROM ORDERS WHERE o_custkey2 BETWEEN 42 AND 47;

In this case, there might be qualifying rows in the row partitions where the combined partition numbers are in (5, 16, 27, 38, 49, 60).

Stage 2:

Suppose you submit the following ALTER TABLE request on orders.

     ALTER TABLE orders
     MODIFY PRIMARY INDEX
       DROP RANGE BETWEEN   0
                  AND       9
                  EACH     10
       ADD RANGE  BETWEEN  51
                  AND      70
                  EACH     10,
       DROP RANGE BETWEEN 100
                  AND     100
       ADD RANGE -100 TO -2;

This alters the row partitioning expressions to be the following expressions.

     RANGE_N(o_custkey1 BETWEEN   10
                        AND       50
                        EACH      10, 51 AND 70
                        EACH      10),
     RANGE_N(o_custkey2 BETWEEN -100
                        AND       -2, 0
                        AND       99
                        EACH      10)

In other words, the table definition after you have performed the ALTER TABLE request is as follows.

     CREATE TABLE orders (
       o_orderkey INTEGER NOT NULL,
       o_custkey1 INTEGER,
       o_custkey2 INTEGER)
     PRIMARY INDEX (o_orderkey)
     PARTITION BY (RANGE_N(o_custkey1 BETWEEN   10
                                      AND       50
                                      EACH      10, /* p1 */
                                                51
                                      AND       70
                                      EACH      10),
                   RANGE_N(o_custkey2 BETWEEN -100
                                      AND       -2, 0
                                      AND       99
                                      EACH      10)) /* p2 */;

This changes the information implied by the initial table definition about the row partitioning of orders as follows.

  • Number of partitions in the first, and highest, level = d1 = 7
  • Number of partitions in the second, and lowest, level = d2 = 11
  • Total number of combined partitions = d1 * d2 = 7 * 11 = 77
  • Combined partitioning expression = (p1 - 1) * d2 + p2 = (p1 - 1) * 11 + p2

Now if o_custkey1 is 15 and o_custkey2 is 55, the following additional information is implied.

  • Partition number for level 1 = PARTITION#L1 = p1(15) = 1.
  • Partition number for level 2 = PARTITION#L2 = p2(55) = 7.
  • PARTITION#L3 through PARTITION#L15 are all 0.
  • Combined partition number = PARTITION = (1-1)*11 + 7 = 7.

The following table indicates the row partition numbers for the various defined ranges for o_custkey1 and o_custkey2.

Value of o_custkey1 Value of o_custkey2 Value of (p1-1)*d2+p2  PARTITION Value of p1  PARTITION#L1 Value of p2  PARTITION#L2
10 - 19         -100  -  -2                     1 1                     1
0  -  9                     2                     2
10  -  19                     3                     3
20  -  29                     4                     4
30  -  39                     5                     5
40  -  49                     6                     6
50  -  59                     7                     7
60  -  69                     8                     8
70  - 7 9                     9                     9
80  -  89                   10                   10
90  -  99                   11                   11
20 - 29         -100  -  -2                   12 2                     1
0 -  9                   13                     2
10 - 19                   14                     3
20 - 29                   15                     4
30 - 39                   16                     5
40 - 49                   17                     6
50 - 59                   18                     7
60 - 69                   19                     8
70 - 79                   20                     9
80 - 89                   21                   10
90 - 99                   22                   11
30 - 39         -100  -  -2                   23 3                     1
0 - 9                   24                     2
10 - 19                   25                     3
20 - 29                   26                     4
30 - 39                   27                     5
40 - 49                   28                     6
50 - 59                   29                     7
60 - 69                   30                     8
70 - 79                   31                     9
80 - 89                   32                   10
90 - 99                   33                   11
40 - 49         -100  -  -2                   34 4                     1
0 - 9                   35                     2
10 - 19                   36                     3
20 - 29                   37                     4
30 - 39                   38                     5
40 - 49                   39                     6
50 - 59                   40                     7
60 - 69                   41                     8
70 - 79                   42                     9
80 - 89                   43                   10
90 - 99                   44                   11
50         -100  -  -2                   45 5                     1
0 - 9                   46                     2
10 - 19                   47                     3
20 - 29                   48                     4
30 - 39                   49                     5
40 - 49                   50                     6
50 - 59                   51                     7
60 - 69                   52                     8
70 - 79                   53                     9
80 - 89                   54                   10
90 - 99                   55                   11
51 - 60         -100  -  -2                   56 6                     1
0 - 9                   57                     2
10 - 19                   58                     3
20 - 29                   59                     4
30 - 39                   60                     5
40 - 49                   61                     6
50 - 59                   62                     7
60 - 69                   63                     8
70 - 79                   64                     9
80 - 89                   65                   10
90 - 99                   66                   11
61 - 70         -100  -  -2                   67 7                     1
0 - 9                   68                     2
10 - 19                   69                     3
20 - 29                   70                     4
30 - 39                   71                     5
40 - 49                   72                     6
50 - 59                   73                     7
60 - 69                   74                     8
70 - 79                   75                     9
80 - 89                   76                   10
90 - 99                   77                   11

The following cases provide examples of how multilevel row partitioning might be useful.

  SELECT * FROM orders WHERE o_custkey1 = 15;

In this case, there might be qualifying rows in the row partitions where the combined partition numbers are in the range (1-1)*11 + (1 TO 11) = 1 TO 11.

  SELECT * FROM orders WHERE (o_custkey1 = 15 OR o_custkey1 = 25) 
AND o_custkey2 BETWEEN 20 AND 50;

In this case, there might be qualifying rows in the row partitions where the combined partition numbers are in the ranges 4 TO 7 and 15 TO 18.

  SELECT * FROM orders WHERE o_custkey2 BETWEEN 42 AND 47;

In this case, there might be qualifying rows in the row partitions where the combined partition number is any of the following.

  •   6
  • 17
  • 28
  • 39
  • 50
  • 61
  • 72