Three-Level Row Partitioning Example | Teradata Vantage - Three-Level Row Partitioning Example - Advanced SQL Engine - Teradata Database

Database Design

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
kko1591750222108.ditamap
dita:ditavalPath
kko1591750222108.ditaval
dita:id
B035-1094
lifecycle
previous
Product Category
Teradata Vantageā„¢

The following CREATE TABLE request defines a table with three levels of row partitioning.

CREATE TABLE sales (
  storeid      INTEGER NOT NULL,
  productid    INTEGER NOT NULL,
  salesdate    DATE FORMAT 'yyyy-mm-dd' NOT NULL,
  totalrevenue DECIMAL(13,2),
  totalsold    INTEGER,
  note         VARCHAR(256))
UNIQUE PRIMARY INDEX (storeid, productid, salesdate)
PARTITION BY (RANGE_N(salesdate BETWEEN DATE '2003-01-01'
                                AND     DATE '2005-12-31'
                                EACH INTERVAL '1' YEAR),
              RANGE_N(storeid   BETWEEN 1
                                AND   300
                                EACH  100),
              RANGE_N(productid BETWEEN 1
                                AND   400
                                EACH  100));

Combined Partitioning Expression for the Sales Table

The corresponding combined partitioning expression is the following.

(RANGE_N(salesdate  BETWEEN DATE '2003-01-01'
                    AND     DATE '2005-12-31'
                    EACH INTERVAL '1' YEAR)-1)*12+
(RANGE_N(storeid    BETWEEN 1
                    AND   300
                    EACH  100)-1)*4+
(RANGE_N(productid  BETWEEN 1
                    AND   400
                    EACH  100)

There are 3 row partitions for both the first level and second levels, and 4 row partitions for the third level. Therefore, the total number of combined partitions is the product of 3*3*4, or 36.

The following diagram indicates the logical hierarchy of the three row partitioning levels for the sales table (65,535 assumes 2-byte partitioning):


Logical hierarchy of 3-row partitioning

How Rows for the Sales Table Are Grouped on an AMP

The following table shows how rows would be grouped on an AMP and, for each row, the partition number for each level (the result of the row partitioning expression for that level) and combined partition number (the result of the combined partitioning expression).

Rows are grouped by the combined partition number, and within a group are ordered by hash value first, then by uniqueness value. The Note column is truncated in all of these examples.

Note that for this example, only one sample row is shown for each combined partition number.

You can see that all the rows for a particular year are grouped together in this table; therefore, accessing those rows can be accomplished by reading only a subset of the data on each of the AMPs.

Combined Partition Number Partition Number Sales
L1 L2 L3 Storeid Productid Salesdate Total Revenue Total Sold Note
1 1 1 1 96 10 2003-04-15 4158 42 Good day
2 1 1 2 71 184 2003-07-06 1972 68 Marginal Sa
3 1 1 3 80 241 2003-11-09 3055 47 Slow day
4 1 1 4 82 363 2003-12-24 1261 13 Promotion
5 1 2 1 186 1 2003-01-11 255 17 Shelf Life
6 1 2 2 122 163 2003-06-13 2405 65 Multiple
7 1 2 3 110 234 2003-05-01 2618 77 2 for 1
8 1 2 4 187 384 2003-08-03 684 9 Buy 3, 1 Fr
9 1 3 1 280 31 2003-03-10 493 29 Rain
10 1 3 2 202 116 2003-02-17 6732 68 Cash Sale
11 1 3 3 292 272 2003-01-30 439 11 Quarterly
12 1 3 4 213 385 2003-04-29 4233 83 Promotion
13 2 1 1 76 51 2004-10-05 442 34 2 for 1
14 2 1 2 26 149 2004-01-09 365 5 Good day
15 2 1 3 57 295 2004-11-04 2684 61 Marginal Sa
16 2 1 4 87 338 2004-08-02 696 58 Slow day
17 2 2 1 108 34 2004-04-27 4218 74 2 for 1
18 2 2 2 171 143 2004-10-30 5925 79 Shelf Life
19 2 2 3 114 228 2004-05-24 2064 48 Rain
20 2 2 4 135 347 2004-08-03 110 55 Promotion
21 2 3 1 257 75 2004-09-18 3417 51 Promotion
22 2 3 2 295 191 2004-11-11 376 8 Quarterly
23 2 3 3 208 204 2004-03-17 864 36 Multiple
24 2 3 4 221 330 2004-12-15 1456 52 Buy 3, 1 Fr
25 3 1 1 39 85 2005-09-15 192 48 Cash Sale
26 3 1 2 55 112 2005-07-12 232 29 Shelf Life
27 3 1 3 76 243 2005-03-13 6696 93 2 for 1
28 3 1 4 7 309 2005-08-20 116 58 Discounted
29 3 2 1 186 44 2005-05-30 4275 75 Credit
30 3 2 2 183 167 2005-06-14 2982 42 Promotion
31 3 2 3 171 218 2005-05-22 80 8 Rain
32 3 2 4 180 389 2005-06-09 4128 96 Good day
33 3 3 1 278 61 2005-09-22 1581 51 Buy 4, 1 Fr
34 3 3 2 256 110 2005-04-22 3280 80 Sale
35 3 3 3 246 233 2005-11-29 5133 59 Discounted
36 3 3 4 251 342 2005-02-11 968 44 50% off
For example:
  • All the rows with a storeid in the range of 101 through 200, inclusively, can be found in the row partitions for combined partition numbers 5 to 8, 17 to 20, and 29 to 32.

    As a result, if you were to specify a predicate of WHERE storeid BETWEEN 101 AND 200, Vantage can use row partition elimination to scan only combined partitions 5-8, 17-20, and 29-32 because only they contain rows that evaluate to TRUE for the condition.

  • All the rows with a productid in the range of 201 through 300, inclusively, can be found in the combined partition numbers 3, 7, 11, 15, 19, 23, 27, 31, and 35.

    As a result, if you were to specify a predicate of WHERE productid BETWEEN 201 AND 300, Vantage can use row partition elimination to scan only partitions 3, 7, 11, 15, 19, 23, 27, 31, and 35 because only they contain rows that evaluate to TRUE for the condition.

  • All the rows with a storeid in the range of 1 and 100, inclusively, and a productid in the range 301 through 400, inclusively, can be found in the combined partitions for combined partition numbers 4, 16, and 28.

    As a result, if you were to specify a predicate of WHERE storeid BETWEEN 1 AND 100 AND productid BETWEEN 301 AND 400, Vantage can use row partition elimination to scan only combined partitions 4, 16, and 28 because only they contain rows that evaluate to TRUE for the condition.

Similarly, Vantage can find rows by reading only a subset of the data for other combinations of two or more conditions on the partitioning columns. If you are looking for rows with a specific value of each of the partitioning columns, then only one combined partition for the specific combined row partition number needs to be read.

Performance Implications of Multilevel Row Partitioning

  • If a SELECT request specifies values for all the primary index columns, Vantage can determine the AMP on which the rows reside, and only a single AMP needs to be accessed.

    If conditions are not specified on the partitioning columns, then Vantage can probe each combined partition to find the rows based on their hash value.

    If conditions are also specified on the partitioning columns, row partition elimination might reduce the number of row partitions to be probed on that AMP.

  • If a SELECT request does not specify the values for all the primary index columns or there is no primary index, then Vantage must do an all-AMP full-table scan for a nonpartitioned table.

    However, if row partitioning is defined on the table, and if you specify conditions on the partitioning columns, row partition elimination can reduce an all-AMP full file scan to an all-AMP scan of only the partitions of the combined partitioning expression that are not eliminated.

    The degree of row partition elimination that can be achieved depends on the partitioning expressions, the conditions in the query, and the ability of the Optimizer to recognize such opportunities.

    You need not specify values for all the partitioning columns in a query for row partition elimination to occur. Row partition elimination occurs at each level independently; the combination of the row partition elimination, if any, for each level determines which combined partitions need to be processed.