15.00 - Considerations for Basing the Partitioning Expression on a CASE_N or RANGE_N Character Column - Teradata Database

Teradata Database SQL Data Definition Language Detailed Topics

prodname
Teradata Database
vrm_release
15.00
category
Programming Reference
featnum
B035-1184-015K

Considerations for Basing the Partitioning Expression on a CASE_N or RANGE_N Character Column

You can use CASE_N functions to define a character partitioning expression that groups the rows of a table into partitions that are based on a particular condition. You can then evaluate the conditions to return the number of the first condition that evaluates to TRUE without encountering a condition that evaluates to UNKNOWN. You can also use various options to handle cases where a condition evaluates to UNKNOWN or when all conditions evaluate to FALSE.

You can use the following comparison operators and string functions within a CASE_N function to evaluate partitioning expression character strings using the current session collation if the expression is not part of a partitioning expression.

  • =
  • <
  • >
  • >=
  • <=
  • <>
  • BETWEEN
  • LIKE
  • If one of the partitioning expressions in a column partitioning uses a CASE_N function, then all of the comparisons use the session collation that was in effect when the table or join index was created or had its partitioning altered.

    The rules for CASE_N expressions in a column partitioning created with a CREATE TABLE request are the same as those for a non‑character partitioning expression with the following exceptions.

  • For single‑level partitioning, if a partitioning expression consists only of a RANGE_N function with INTEGER data type, the total number of partitions defined (including all the range partitions, and the NO RANGE, UNKNOWN, and NO RANGE OR UNKNOWN partitions if specified) must be less than or equal to 2,147,483,647.
  • If you copy the primary index definition from a table to create another table using a CREATE TABLE … AS … WITH [NO] DATA request, and the primary index of the source table is partitioned, the copied table definition is partitioned in the same way.
  • Teradata Database uses the collation used for the character partitioning of the source table for the definition of the target table.

    The session collation does not need to be the same as the collation of the character partitioning for the source table.

    If the collation for the source table is MULTINATIONAL or CHARSET_COLL, and the definition of the collation or collation character set has changed since the table was created, then an attempt to copy the table using the CREATE TABLE … AS syntax aborts and returns an error to the requestor.

    You must revalidate the table before it can be copied using the CREATE TABLE … AS syntax.

    The rules for CREATE TABLE … AS requests are otherwise the same as those for tables that do not have character partitioning.

  • You can define a partitioning expression using a character or graphic comparison, but the partitioning expression cannot be defined on Kanji1 or KanjiSJIS columns or constant expressions.
  • A partitioning expression can specify the UPPERCASE qualifier and any of the following character functions.

  • CHAR2HEXINT
  • INDEX
  • LOWER
  • MINDEX
  • POSITION
  • TRANSLATE
  • TRANSLATE_CHK
  • TRIM
  • UPPER
  • VARGRAPHIC
  • Expressions and referenced columns in the partitioning expression must not have BLOB, CLOB, or Geospatial data types, nor can they be defined using any of the following functions and function types.

  • External UDFs
  • SQL UDFs
  • Built‑in functions, including the following.
  • Aggregate functions
  • Grouped‑row OLAP functions
  • HASHAMP function
  • HASHBAKAMP function
  • RANDOM function
  • ROWID keyword
  • PARTITION or PARTITION[#Ln] columns
  • Set operators
  • Subqueries
  • A partitioning expression must be based on 1 or more columns in the table. Specifically, a partitioning expression must be a deterministic expression that is based on the columns within a row.

  • The evaluation of the partitioning expression at any character partitioning level uses the session collation in effect when the table was created to determine the ordering of character data for comparison operators.
  • The evaluation of a partitioning expression that is not based on a RANGE_N function at any character partitioning level uses the same case sensitivity rules that were in effect when the table was created to determine ordering of character data for comparison operators and string functions.
  •  

    IF a table is created in this session mode …

    THEN all comparisons with constant literals treat the literal as …

    ANSI

    CASESPECIFIC unless the constant literals are explicitly cast to be NOT CASESPECIFIC.

    When an expression in a comparison is CASESPECIFIC, then the comparison is case sensitive; otherwise, it is case blind.

    Teradata

    NOT CASESPECIFIC by default.

    A comparison between a constant and a NOT CASESPECIFIC column or expression is treated as case blind unless the constant literals are explicitly cast to be CASESPECIFIC.

    See SQL Functions, Operators, Expressions, and Predicates for more information on the rules for case sensitivity in character data comparisons.

  • A partitioning expression at a given partitioning level should have the same case sensitivity in comparisons and string functions as what is commonly used in WHERE or ON clause conditions of DML requests involving the partitioning column. Otherwise, the character partitioning might not be eligible for optimizations that eliminate partitions, either statically or dynamically. For information about row partition elimination, see Database Design and SQL Request and Transaction Processing. For a row partition to be statically eliminated, the conditional expression associated with that partition must be a combination of predicates that all have the same case sensitivity.
  • The evaluation of a RANGE_N partitioning expression at any character partitioning level is independent of the default case sensitivity rules in effect for the session when the table was created, but only when the test value is an expression that does not contain any string literals without an explicit CAST.
  • For a RANGE_N partitioning expression,
  •  

    IF a test value is …

    THEN its range boundaries are …

    NOT CASESPECIFIC

    NOT CASESPECIFIC

    CASESPECIFIC

    CASESPECIFIC

  • If the test value is an expression that contains NOT CASESPECIFIC columns or expressions and string literals with no explicit cast, then the test value behaves according to the following table.
  •  

    IF the partitioning is created in this session mode …

    THEN the test value …

    ANSI

    is CASESPECIFIC

    Teradata

    for a RANGE_N expression is NOT CASESPECIFIC

  • A test value for a RANGE_N function must result in a BYTEINT, BIGINT, INTEGER, SMALLINT, DATE, TIMESTAMP(n), TIMESTAMP(n) WITH TIME ZONE, CHARACTER, or VARCHAR data type.
  • The sum of the sizes of all constant literals referenced in all partitioning expressions in a partitioned primary index definition must be less than 64 KB; otherwise, the CREATE TABLE request aborts and returns an error to the requestor.
  • If you use multiple character sets on your system, you should always create character partitioning columns using the Unicode server character set and store the partitioning column using the same server character set. Otherwise, a DML request against this table that can use a rowkey-based merge join might be restricted to direct rowkey‑based merge joins between 2 partitioned tables that have identical partitioning.
  • Another reason to create your character partitioning columns using the Unicode server character set is to fully optimize dynamic row partition elimination (see SQL Request and Transaction Processing for details).

  • To maximize the cases that can use character partitioning for optimization, you should create your partitioned tables in a Teradata mode session if the character partitioning expression set involves columns that are NOT CASESPECIFIC. All constant and non‑constant character expressions in a comparison should be defined as NOT CASESPECIFIC.
  • If the partitioning columns are CASESPECIFIC, then the partitioning should either be created in an ANSI mode session, or you should define any non‑constant character expressions that are specified in a comparison to be CASESPECIFIC.

    The easiest way to guarantee that all comparisons are CASESPECIFIC is to make those comparisons between the CASESPECIFIC partitioning column (instead of an expression on the partitioning column) and a constant expression.

    While all comparisons should be made using the same case sensitivity to enable static row partition elimination, the recommend best practice is to specify all partitioning to be case blind, which enables row partition elimination to occur when you specify either case blind or case sensitive WHERE clause conditions are present.

  • Teradata Database might convert any string literals in partitioning expression comparison operations from the session client character set or explicitly specified character set, if a qualifier is present, of the CREATE TABLE request, to the character set of the non‑constant expression in the comparison. Teradata Database always performs this conversion for partitioning expressions based on the RANGE_N function.
  • For example, consider CASE_N (t1.a < 'cat'), where t1.a is the non‑constant expression and 'cat' is the string literal. The characters specified in the string literal should exist in the client character set to ensure the proper translation of the non‑constant expression to the server character set. You should submit a SHOW TABLE request and examine the SQL create text it returns to ensure that the system did not make any undesired translations.

    This example shows the use of native Japanese ideographs in a character partitioning expression.

         CREATE SET TABLE df2.t1c, FALLBACK, NO BEFORE JOURNAL,
                                   NO AFTER JOURNAL, CHECKSUM = DEFAULT (
           c1 INTEGER,
           c2 CHARACTER(30) CHARACTER SET UNICODE NOT CASESPECIFIC,
           c3 CHARACTER(4)  CHARACTER SET GRAPHIC CASESPECIFIC)
         PRIMARY INDEX (c1)
         PARTITION BY CASE_N(c2 BETWEEN ''
                                AND     '',
                             c2 = '',
                             c2 BETWEEN ''
                                AND     '',
                             NO CASE, UNKNOWN); 

    To create the same table using BTEQ or another client API that does not support client character sets such as KanjiEUC_0U natively, you must create the table using hexadecimal Unicode literals in place of the Japanese ideographs, like this.

         CREATE SET TABLE df2.t1c, FALLBACK, NO BEFORE JOURNAL, 
                                   NO AFTER JOURNAL, CHECKSUM = DEFAULT (
           c1 INTEGER,
           c2 CHARACTER(30) CHARACTER SET UNICODE NOT CASESPECIFIC,
           c3 CHARACTER(4)  CHARACTER SET GRAPHIC CASESPECIFIC)
         PRIMARY INDEX (c1)
         PARTITION BY CASE_N(c2 BETWEEN _UNICODE '9758'XC
                                AND     _UNICODE '9759'XC,
                             c2 =       _UNICODE '9797'XC,
                             c2 BETWEEN _UNICODE '97D3'XC 
                                AND     _UNICODE '9AFB'XC,
                             NO CASE, UNKNOWN);

    It is your responsibility to ensure that the client character set is correct for the string literals you specify, and that the client character set is compatible with the server character set of the test value for RANGE_N expressions or non‑constant expressions that are compared in CASE_N partitioning. Teradata Database does not do this for you.