Rules and Restrictions for Column-Partitioned Join Indexes - Advanced SQL Engine

Rules and Restrictions for Column-Partitioned Join Indexes - Advanced SQL Engine - Teradata Database

SQL Data Definition Language Detailed Topics

Product

Advanced SQL Engine

Teradata Database

Release Number

17.05

17.00

Published

June 2020

Language

English (United States)

Last Update

2021-01-24

dita:mapPath

jpx1556733107962.ditamap

dita:ditavalPath

lze1555437562152.ditaval

dita:id

B035-1184

lifecycle

Product Category

Teradata Vantage™

Column-partitioned join indexes have the following rules and restrictions.

You can only specify column-partitioning, and optionally, column-partitioning mixed with row-partitioning, for a join index if the following assertions are all true.
- The join index is defined on a single table.
  It cannot be a multitable index.
- The join index is a non-aggregate index.
- The join index is not row-compressed.
If you specify the COLUMN option in a PARTITION BY clause, the join index is by definition a column-partitioned join index.
If specify NO PRIMARY INDEX or PRIMARY AMP INDEX, you must also specify a PARTITION BY clause and the partitioning must specify a COLUMN partitioning level.
When you do not specify a column grouping for a COLUMN clause, Teradata Database defines a separate column partition for each column and column group specified for a CREATE JOIN INDEX request.
You can only specify column or constraint grouping that is done in the select list if you also specify COLUMN partitioning in the PARTITION BY clause.
You cannot group columns or constraints in the select list of join indexes that are not column-partitioned.
You cannot specify column or constraint grouping in both the select list and in the COLUMN option of the PARTITION BY clause of the same CREATE JOIN INDEX request.
You cannot specify a column more than once within a column partition.
Otherwise, Teradata Database returns an error to the requestor.
You cannot specify a column to be in more than one column partition.
Otherwise, Teradata Database returns an error to the requestor.
The number of defined column partitions for a column partitioning level is the number of user-specified column partitions plus two column partitions reserved for internal use.
Teradata Database uses one of the reserved partitions as a sparse bit map to indicate deleted rows and the other is reserved for future use. There is always at least one column partition number that is not assigned to a column partition.
The number of defined partitions for a row partitioning level is the number of user-defined row partitions specified by the PARTITION BY clause.
If you do not specify a RANGE_N or CASE_N function in the PARTITION BY clause, Teradata Database uses 65,535 row partitions for the join index.
For a row partitioning level, the maximum partition number is the same as the maximum number of partitions for that level.
For a column partitioning level, the maximum partition number is one more than the maximum number of column partitions for that level.
This ensures that there is at least one unused column partition number that is always available for altering a column partition.
For each row partitioning level that does not specify the ADD option in level order, the maximum number of partitions for the row partitioning level is, if using the number of row partitions defined as the maximum for this and any lower row partitioning level without an ADD clause, the partitioning would be 2-byte partitioning, and is the largest value that does not cause the partitioning to be 8-byte partitioning.
Otherwise, the maximum number of partitions for this level is the largest value that does not exceed 9,223,372,036,854,775,807. If there is no such largest value, Teradata Database returns an error to the requestor.
You can specify ADD 0 for a partitioning level to specify explicitly that the maximum number of partitions for this level is the same as the number of defined partitions.
This can be useful for a column partitioning level if you want to override the default of ADD 10 so that other levels can have more partitions.

This can also be useful for a row partitioning level if you want a lower level that does not specify the ADD clause to have any excess partitions.
The maximum number of partitions for a row partitioning level must be at least two.
Otherwise, Teradata Database returns an error to the requestor.

This error occurs when only one partition is defined for a row partitioning level with an ADD 0 or with no ADD option and the maximum is not increased to at least two.
Specifying a maximum number of partitions for a partitioning level that is larger than the number of defined partitions for that level can enable the number of defined partitions for that level to be increased using an ALTER TABLE request.
This does not mean that you can increase the maximum number of partitions for a level using an ALTER TABLE request.
If you define a partitioning expression using only a single CASE_N function, the number of partitions defined must be less than or equal to 2,147,483,647 because the CASE_N function has INTEGER data type. (otherwise, existing error 5715 occurs).
Otherwise, Teradata Database returns an error to the requestor.
If you define the partitioning expression for single-level partitioning using only a single RANGE_N function with INTEGER data type, the total number of partitions defined must be less than or equal to 2,147,483,647.
Otherwise, Teradata Database returns an error to the requestor.

For 2-byte single-level partitioning, the RANGE_N function cannot define more than 65,533 range partitions because by definition, more range partitions would require 8-byte partitioning.

The total number of partitions you can define can be as many as 65,535 if you define both a NO RANGE and an UNKNOWN partition.
If you define a partitioning expression using only a single RANGE_N function with BIGINT data type, the number of ranges defined must be less than or equal to 9,223,372,036,854,775,805.
Otherwise, Teradata Database returns an error to the requestor.

For single-level partitioning, the total number of partitions you can define can be as many as 9,223,372,036,854,775,807 if you define both a NO RANGE and an UNKNOWN partition.
The number of partitions defined for a row partitioning level is the number of row partitions specified by the RANGE_N or CASE_N function, or 65,535 if you specify neither a RANGE_N or a CASE_N function.
You can specify an ADD option for a partitioning level. The BIGINT number following the ADD keyword plus the number of partitions defined for the level is the maximum number of defined partitions allowed for that level.
If this maximum exceeds 9,223,372,036,854,775,807, Teradata Database returns an error to the requestor.

For single-level partitioning, the maximum combined partition number is the same as the maximum partition number for the partitioning level.

The ADD option allows the number of partitions to be increased for that level up to this maximum using ALTER TABLE requests.
If you do not specify an ADD option for a column partitioning level, and the level is the only partitioning level, the maximum number of partitions, including the two for internal use, is 65,534.
If you do not specify an ADD option for a column partitioning level and the following list of items is also true, the partitioning is stored using 2 bytes in the row header and the maximum number of partitions for the column partitioning level is the number of column partitions defined plus 10. In other words, the default for this case is ADD 10.
- There is row partitioning
- At least one of the row partitioning levels does not specify the ADD option
The maximum number of partitions for the column partitioning level is the largest value that does not cause the partitioning to be 8-byte partitioning; otherwise, the maximum number of partitions for the column partitioning level is the largest value that does not exceed 9,223,372,036, 854,775,807.

For single-level partitioning, the maximum combined partition number is the same as the maximum partition number for this partitioning level.
The number of combined partitions for a join index is the product of the number of partitions defined by each partitioning level.
The maximum number of combined partitions for a join index is the product of the maximum number of partitions defined for each partitioning level.
For single-level partitioning, the maximum number of combined partitions is the same as the maximum number of partitions for the partitioning level.

If there is column partitioning, the maximum number of combined partitions is smaller than the maximum combined partition number. Otherwise, it is the same.
If the maximum combined partition number is greater than 65,535, Teradata Database stores the partition number in an 8-byte field in the row header. This is referred to as 8-byte partitioning.
If the maximum combined partition number is less than or equal to 65,535, Teradata Database stores the partitioning in a 2-byte field in the row header. This is referred to as 2-byte partitioning.

Single-level column partitioning with no ADD option is stored in a 2-byte field.
The maximum combined partition number for a join index is the product of the maximum partition numbers for each partitioning level.
The maximum cannot exceed 9,223,372,036, 854,775,807.

For single-level partitioning, the maximum number of combined partitions is the maximum number of partitions for this partitioning level.
For 2-byte partitioning, the maximum number of partitions for the first level is the largest value that does not cause the maximum combined partition number to exceed 65,535. This is repeated for each of the other levels, if any, from the second level to the last level.
For 2-byte partitioning, the maximum number of partitioning levels is 15.
For 8-byte partitioning, the maximum number of partitions for the first level is the largest value that does not cause the maximum combined partition number to exceed 9,223,372,036,854,775,807. This is repeated for each of the other levels from the second level to the last level.
For 8-byte partitioning, the maximum number of partitioning levels is 62.
If the maximum column partition number is more than two at one or more levels, the number of partitioning levels may be further limited because of the limit placed by the rule that the product of the maximum combined partition numbers at each level cannot exceed 9,223,372,036,854,775,807.

Teradata Database derives a partitioning constraint from the partitioning level definitions for a column-partitioned join index.

The text for this partitioning constraint cannot exceed 16,000 characters; otherwise, Teradata Database returns an error to the requestor.

The following definitions apply to this rule.

Variable	Definition
cs	the number of user-specified column partitions.
cm	the maximum partition number for the column partitioning level.

Teradata Database initially assigns the user-specified column partitions numbers starting at 1 in increments of 1 up to cs.

Teradata Database assigns an internal partition a partition number of cm-1 and the delete column internal partition is assigned a partition number of cm.

At first, no column partitions are assigned to column partition numbers cs+1 to cm-2. Initially, there is at least one unused column partition number because cm-cs-2 is greater than 0.

As you drop or alter partitions, there can be gaps in the numbering. As you add or alter column partitions, unused column partition numbers between 1 and cm-2 can be assigned to the newly added or altered column partitions as long as one column partition number remains unused. This is necessary because at least one column partition number must be available for use by ALTER TABLE to alter a column partition.

Teradata Database uses the column partition number to compute the combined partition number for a column partition value of a join index row. Apart from that, there is no significance to the column partition number assigned to a column partition.

If a join index is partitioned, its fallback rows are partitioned identically to its primary data rows.

If the primary data rows have a primary index, the fallback data rows have the same primary index.

If the primary data rows do not have a primary index, the fallback data rows also do not have a primary index.

The following table summarizes to which partitioning level any excess partitions are added.

IF partitioning is …	AND …	THEN Teradata Database adds as many left over combined partitions as possible to …
single-level		the only row or column partitioning level. If you specify an ADD clause, Teradata Database overrides it.
multilevel	all the row partitioning levels have an ADD clause, but there is a column partitioning level without an ADD clause,	the column partitioning level. This does not need to be added at the first partitioning level.
	a column partitioning level and at least one of the row partitioning levels does not have an ADD clause, including the case where none of the row partitioning levels has an ADD clause specified,	the first row partitioning level without an ADD clause after using a default of ADD 10 for the column partitioning level. This is repeated for each of the other row partitioning levels without an ADD clause, if any, in level order.
	a column partitioning level has an ADD clause and at least one of the row partitioning levels does not have an ADD clause	the first row partitioning level without an ADD clause. This is repeated for each of the other row partitioning levels without an ADD clause, if any, in level order.
	there is no column partitioning level and at least one of the row partitioning levels does not have an ADD clause. This includes the case where none of the row partitioning levels has an ADD clause specified.	the first row partitioning level without an ADD clause.
	all of the partitioning levels have an ADD clause or after applying left over combined partitions.	the first row or column partitioning level and overrides the ADD clause for the first partitioning level if one is specified. Teradata Database repeats this for each of the other levels, if any, from the second level to the last level.

You can define a large variety of partitioning expressions and column groupings for column partitioning with a large range in the number of combined partitions. You must consider the usefulness of defining a particular partitioning and its impact on performance and storage.

See Performance Issues for Column-Partitioned Tables for performance considerations.

Column grouping in the COLUMN option of a PARTITION BY clause enables more flexibility in specifying which columns belong to which partitions while still being able to specify the display order in the column list.

Column grouping in the select list enables a simpler, but less flexible, specification of column groupings.

A column partition value consists of the values of the columns in the column partition for a specific join index physical row.
If you specify a column grouping clause for a COLUMN specification of a PARTITION BY clause, the columns specified for the grouping columns must have been specified in the select list for the join index.
If you specify a column grouping clause for a COLUMN specification of a PARTITION BY clause, Teradata Database defines column partitions as specified by that column grouping clause.
If you specify a column or constraint grouping or both with COLUMN format in the column list, the grouping defines a column partition and Teradata Database stores one or more column partition values in a physical container using COLUMN format.
If you specify a column or constraint grouping or both with ROW format in the column list, the grouping defines a column partition that stores only one column partition value in a physical subrow using ROW format.

If you specify neither a COLUMN format nor a ROW format for a column or constraint grouping, the grouping defines a column partition and Teradata Database determines whether to use a COLUMN or ROW format for the column partition.
You can specify either AUTO COMPRESS or NO AUTO COMPRESS as the default for column partitions. If you do not explicitly specify either, AUTO COMPRESS is the default..
If you specify a column or constraint grouping (or both) with AUTO COMPRESS or NO AUTO COMPRESS in the column list, Teradata Database applies the appropriate autocompression for the specified physical rows and applies any user-specified compression to the column partition.
Teradata Database also applies row header compression for column partitions with COLUMN format.

If you specify neither AUTO COMPRESS nor NO AUTO COMPRESS, Teradata Database uses the default that you specify explicitly for COLUMN in the PARTITION BY clause.

If you do not define a default for COLUMN, Teradata Database uses AUTO COMPRESS.
If you specify neither PRIMARY INDEX (column_list) nor NO PRIMARY INDEX explicitly in your CREATE JOIN INDEX request, but do specify a PARTITION BY clause, Teradata Database creates the join index without a primary index regardless of the setting of the DBS Control flag PrimaryIndexDefault. For details and exceptions, see Teradata Vantage™ - Database Design, B035-1094 and Teradata Vantage™ - Database Utilities , B035-1102 .
If a COLUMN option that is a partitioning level of a PARTITION BY clause neither specifies column grouping nor specifies a column or constraint definition delimited by parentheses (which defines a group), Teradata Database treats the specified individual column as a single-element group, defining a single-column partition with AUTO COMPRESS and with a system-determined COLUMN or ROW format.
To specify either a COLUMN or ROW column partition format or to specify NO AUTO COMPRESS for a single-column column partition, you must delimit the column or constraint definition within parentheses to denote it as a single-column group.
You can group one or more column or constraint definitions into a column partition by delimiting the definitions with parentheses.

You can achieve the same result by specifying the same options and grouping in a COLUMN option in the PARTITION BY clause.
Column partitions have either a COLUMN format or a ROW format. Formats cannot mixed within the same column partition.
However, different column partitions of a column-partitioned join index can have different formats. For example, the column partitions of a column-partitioned join index can have all COLUMN format (and be stored in containers), all ROW format (and be stored in subrows), or some partitions in COLUMN format and others in ROW format.
You can specify a different column grouping for a target table than the source table has in the column list of a CREATE TABLE … AS request. The rules listed here also apply.

COLUMN Partitioning in the PARTITION BY Clause of a Join Index

You can create a single-table, nonaggregate, noncompressed join index with column partitioning and, optionally, row partitioning. If you specify COLUMN partitioning in the PARTITION BY clause of a join index, the following rules and restrictions apply.

You can specify a PRIMARY INDEX, PRIMARY AMP INDEX, or NO PRIMARY INDEX clause.
You cannot specify a UNIQUE PRIMARY INDEX clause.
If you do not specify a PRIMARY INDEX, PRIMARY AMP INDEX, or a NO PRIMARY INDEX clause, the default is NO PRIMARY INDEX regardless of the setting of the DBS Control field PrimaryIndexDefault.
You cannot specify an ORDER BY clause.
You can define column-partitioning for a sparse join index.
You must specify an alias for the system-derived column ROWID in the select list of the CREATE JOIN INDEX request used to create a column-partitioned join index.
You can then use the alias you specify for ROWID in a column grouping clause specified in a COLUMN clause in the PARTITION BY clause.
Any expression in the select list of the join index definition that is not a column name must be specified with an alias.
None of the columns specified in the select list for the join index definition can have a server character set of CHARACTER SET KANJI1.
Instead, you should change the server character set to CHARACTER SET UNICODE.

Specifying a Column Grouping for the COLUMN Option

Grouping columns in a COLUMN specification of the PARTITION BY clause enables flexibility in specifying the partitions to which columns belong.

The following rules apply when you specify a column grouping for the COLUMN option.

When you specify a column grouping for a COLUMN clause, you can only specify the name of a column that is defined in the select list for the join index.
If you attempt to specify a column by something other than its name, Teradata Database returns an error to the requestor.
Teradata Database defines a column partition for each non-group and group column partition specified in the select list of the join index definition.
If you specify COLUMN for a column partition, Teradata Database stores one or more column-partition values in a physical container.
If you specify ROW for a column partition, Teradata Database stores only one column partition value in a physical subrow.
A subrow is just a standard Teradata row format, and the term subrow is used to emphasize that it is part of a column partitioning.
If you specify neither COLUMN nor ROW for a column partition, Teradata Database determines whether to use COLUMN or ROW format for the column partition.
A column partition value consists of the values of the columns in the column partition for a specific join index row.

The following table provides the rules for using the ALL BUT option in the PARTITION BY clause of a join index definition.

IF …	THEN Teradata Database …
you specify ALL BUT	Defines a single-column partition with autocompression. Defines a system-determined COLUMN or ROW format for any column that is not specified in select list.
do not specify ALL BUT	groups any columns that you do not specify in the select list into one column partition with autocompression and a system-determined COLUMN or ROW format.

The following table provides the rules for using the NO AUTO COMPRESS option in the PARTITION BY clause of a join index definition.

IF you …	THEN Teradata Database …
specify NO AUTO COMPRESS for a column partition	does not apply autocompression for physical rows.
do not specify NO AUTO COMPRESS for a column partition	applies autocompression for physical rows.

A column partition either has COLUMN format or ROW format. It cannot have a mix of both formats.
Different column partitions of a column-partitioned join index can have different formats. In other words, the column partitions of a column-partitioned join index can have all COLUMN format, all ROW format, or some COLUMN format and others ROW format.
If you do not specify an explicit COLUMN or ROW format for a partition, Teradata Database makes the determination for you.
When the COLUMN or ROW format is system-determined, Teradata Database bases its choice of format on the size of a column partition value for the column partition and other factors such as whether a column-partition value for the column partition has fixed or variable length and whether the column partition is a single-column or multicolumn partition.

In general, Teradata Database assigns a COLUMN format to a narrow column partition and assigns a ROW format to a wide column partition.

You can submit a HELP COLUMN request or select the ColumnPartitionFormat column from the DBC.ColumnsV(X) view to determine which format Teradata Database elected to use for a column partition.

Note that you can explicitly specify the format if you want to use a specific format for a column partition.

Archive/Recovery Utility with Column-Partitioned Tables

The following rules apply to using the Archive/Recovery utility with column-partitioned tables.

You can archive, restore, and copy a column-partitioned join index with column partitioning or row partitioning or both.
You can also archive, restore, and copy databases or users that contain column-partitioned join indexes.
You cannot archive, restore, or copy selected partitions of a column-partitioned join index.
You should not archive, restore, or copy selected partitions of a column-partitioned join index.
This is currently valid for 2-byte PPI join indexes, but it is a deprecated feature.

You cannot archive, restore, or copy selected partitions of an 8-byte column-partitioned join index.

Options for Column-Partitioned Join Indexes

You can define the following options for column-partitioned join indexes.

Column partitioning
Fallback
If a table is column-partitioned, its fallback rows are also column-partitioned using the same partitioning as its primary data rows.
Secondary indexes
BLOB, CLOB, ARRAY, VARRAY, and Geospatial columns.
The validity of UDTs as index elements does not include a NUSI defined on a UDT that contains one or more LOB elements.

There is a limit of approximately 64K rows per row hash value for LOB columns. Because there is normally only one row hash value per AMP for column-partitioned join indexes, there is also a limit of approximately 64K rows per AMP for column-partitioned join indexes that contain columns typed as BLOBs or CLOBs.