index (CREATE JOIN INDEX) - Teradata VantageCloud Lake

Lake - Working with SQL

Deployment
VantageCloud
Edition
Lake
Product
Teradata VantageCloud Lake
Release Number
Published
February 2025
ft:locale
en-US
ft:lastEdition
2025-11-21
dita:mapPath
jbe1714339405530.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
jbe1714339405530

You can use the indexes clause to define a primary index, primary AMP index, partitioning, and a set of secondary indexes for the join index.

You can use this clause with multitable and single-table join indexes.

PRIMARY INDEX

A primary index defined on the join index. The primary index for an aggregate join index must be from the columns specified in the GROUP BY clause of the join index definition. You cannot specify aggregated columns as part of the primary index. If the primary index is not compressed, you can define it as a single-level or multilevel row partitioned primary index by specifying one or more partitioning expressions. You cannot partition the primary index of a join index if the index is defined with compression.

The following rules apply to specifying a row partitioned primary index for a join index.
  • The partitioning columns for an aggregate join index must be drawn from the columns specified in the GROUP BY clause of the join index definition.
  • You cannot specify an aggregated column as a partitioning column for the PPI.
  • The partitioning level in a join index acts as a constraint on its underlying base tables. If an insert, delete, or update operation on a base table causes a partition violation in the join index by making one or more of the partitioning expressions evaluate to null, the request returns an error message, and neither the base table nor the join index is updated.
  • If a join index definition includes an outer join, deleting base table rows may cause inserts into that join index for the unmatched rows.
  • For your partitioned join indexes, you must define partitioning levels that do not prevent rows from being inserted into, updated, or deleted from the base tables when required.
  • If you do not define an explicit NUPI, the first column defined for the join index is assigned to be the NUPI by default. You can define an uncompressed join index with a partitioned primary index. However, the primary index cannot be partitioned if the join index is compressed.
  • Each NUSI counts toward the maximum number of 32 secondary indexes that you can define on a join index. Each multicolumn NUSI defined with an ORDER BY clause counts as two consecutive indexes against the limit of 32 per join index.
You cannot define the primary index for a join index on a column with any of the following data types:
  • XML
  • Geospatial
  • Derived Period
  • Period
UNIQUE
You can define a unique primary index for an uncompressed single-table join index, but the primary index for any other join index must be nonunique even when it is defined on a unique column.
index_name
The optional name of the primary being defined. For information about naming database objects, see Object Names.
(primary_index_column)
Names of the primary index columns. For a composite primary index, primary_index_column indicates a comma-separated list of all the index columns in parenthesis. You cannot specify the begin or end columns of a derived period column in a primary index.

You cannot alter a table to have a row-level security constraint column as a component of its primary index.
You cannot define a primary index on a column defined with the JSON data type.

PRIMARY AMP

Add a primary AMP index.

Rows are hash-distributed to AMPs for a column-partitioned table or join index. Column partition values are ordered on each AMP by an internal partition number and a row hash for a column-partitioned table or join index.

If a PRIMARY AMP clause is specified, you must specify a PARTITION BY clause that includes a column-partitioning level, either in the PRIMARY AMP clause or by itself in the index list.

INDEX
Optionally, the INDEX keyword can be specified with AMP for readability.
index_name
Name of the primary AMP index. For information about naming database objects, see Object Names.
(index_column_name)
The primary AMP index columns. For a composite primary index, index_column_name indicates a comma-separated list of all the index columns in parenthesis. You cannot specify the begin or end columns of a derived period column in a primary index.

You cannot alter a table to have a row-level security constraint column as a component of its primary index.
You cannot define a primary index on a column defined with the JSON data type.

Example: Join Index with a Primary AMP Index and Column Partitioning

This example assumes the following table has been created:

CREATE TABLE t1 (a INT, b INT, c INT, d INT) PRIMARY INDEX (a);

The following CREATE JOIN INDEX statements are equivalent:

CREATE JOIN INDEX jt1_1 AS SELECT ROWID AS rw, a, b FROM t1 WHERE a<10
  PRIMARY AMP (a) PARTITION BY COLUMN (rw NO AUTO COMPRESS, ROW(a,b));
CREATE JOIN INDEX jt1_1 AS SELECT ROWID AS rw, a, b FROM t1 WHERE a<10
  PRIMARY AMP INDEX (a) PARTITION BY COLUMN (rw NO AUTO COMPRESS, ROW(a,b));

Example: Join Index with a Primary Index and Partitioning by Row and Column

This example assumes the following table has been created:

CREATE TABLE t1 (a INT, b INT, c INT, d INT) PRIMARY INDEX (a);

This example shows a CREATE JOIN INDEX statement with a primary index and partitioned by row and column:

CREATE JOIN INDEX jt1_4 AS SELECT ROWID rw, a, b, d
  FROM t1
  PRIMARY INDEX (a) PARTITION BY (RANGE_N(b BETWEEN 1 AND 1000 EACH 1), COLUMN);

NO PRIMARY

The join index is defined with no primary index or primary AMP index.

You cannot specify NO PRIMARY if you specify UNIQUE PRIMARY INDEX or PRIMARY INDEX.

The NO PRIMARY specification is optional for column-partitioned join indexes. The default is no primary index.

If the preceding item in the index list is a partitioning clause that is not part of an index clause, you must specify a COMMA character preceding NO PRIMARY. Otherwise, the comma is optional.

You cannot specify a column name list following a NO PRIMARY specification.

If the preceding item in the index list is a partitioning clause that is not part of an index clause, you must specify a COMMA character preceding NO PRIMARY INDEX. Otherwise, the comma is optional.

See Primary Index Defaults for the complete set of rules that control primary index defaults.

PARTITION BY

The join index is partitioned by one or more partitioning levels. You can define either a single-level or multilevel partitioning for a join index, but only if the index is not row-compressed.

You cannot specify a PARTITION BY clause in the same join index definition as an ORDER BY CLAUSE.

You cannot collect statistics on the system-derived PARTITION#L n columns for a join index.

partitioning expression
Defines partitions for the level. You can also specify the ADD keyword for a partitioning level. For more information about partitioning tables and join indexes, see When Join Indexes Are Useful . If the preceding item in the index list is a partitioning clause that is not part of an index clause, you must specify a COMMA character following the PARTITION BY clause. Otherwise, the comma is optional. You can define multilevel partitioning for an uncompressed join index, with as many as 62 partitioning levels. Each level must be defined by a RANGE_N or CASE_N function or by the COLUMN keyword. You cannot specify character partitioning expressions for columns or constants that use the Kanji1 or KanjiSJIS server character sets. The result of a partitioning expression that is not a RANGE_N or CASE_N function is cast to an INTEGER, if it is not already an INTEGER. The value must be from 1 through 65,535. A RANGE_N or CASE_N function is only allowed for single-level partitioning. The maximum number of partitioning levels that you can specify for a join index with 2-byte partitioning is 15. The maximum number of partitioning levels that you can specify for a join index with 8-byte partitioning is 62. A partitioning expression for a join index must be a deterministic expression and cannot specify external or SQL UDFs or columns having any of the following data types:
  • ARRAY
  • VARRAY
  • BLOB
  • CLOB
  • Geospatial
While the partitioning expression for a join index cannot be defined on a Geospatial or ARRAY/VARRAY column, the partitioning expression can be defined on a Period column. This is particularly useful for defining updatable current dates and updatable current timestamps in a PPI.
In deciding whether to use a CASE_N or RANGE_N function in your partitioning expression, see the following:
  • Use a CASE_N function to define a mapping between conditions to INTEGER numbers.

    If a partitioning expression is based only on a CASE_N function, the maximum number of partitions you can define is a maximum of approximately 4,000 conditions.

    This maximum can be further limited by other limits such as the size of the request text.

  • Use a RANGE_N function to define a mapping of ranges of INTEGER, CHARACTER, or DATE values to INTEGER numbers. Use a RANGE_N function to map BIGINT and TIMESTAMP values to BIGINT numbers.
If a single-level partitioning expression is based only on a RANGE_N function with INTEGER data type, the maximum number of partitions you can define is 65,535 for 2-byte partitioning and 2,147,483,647 for 8-byte partitioning.
If a partitioning expression is based only on a RANGE_N function with BIGINT data type, the maximum number of partitions you can define is 9,223,372,036,854,775,805.
You can define up to 9,223,372,036,854,775,807 ranges if you specify both the NO RANGE and UNKNOWN partitions.
CASE_N and RANGE_N are useful for use as partitioning levels.
You can define a wide variety of partitions with a large range in the number of combined partitions. However, you must consider the usefulness of defining a partitioning and its impact, both positive and negative, on performance and storage.
You must specify only a single COLUMN partitioning level for a column-partitioned join index.
You can group multiple columns together into a single partition by specifying a list of columns to be grouped together.
If you specify column grouping in a partitioning level of a join index definition, you cannot also specify column grouping in its select list.
If you specify neither COLUMN nor ROW for a column partition, the system determines whether COLUMN or ROW format is used for the column partition (for information about COLUMN and ROW formats, see Storage and Other Overhead Considerations for Partitioning .
COLUMN
Keyword to indicate partitioning level.
AUTO COMPRESS
NO AUTO COMPRESS
To enable or disable autocompression:
  • If you specify AUTO COMPRESS for a column partitioning level in a PARTITION BY clause, the system applies autocompression for a column partition unless the you explicitly specify NO AUTO COMPRESS. This is the default.
  • If you specify NO AUTO COMPRESS for a column partitioning level in a PARTITION BY clause, the system does not apply autocompression for a column partition unless you explicitly specify AUTO COMPRESS.
The system applies row header autocompression for column partitions with COLUMN format.
ALL BUT (column_name_list)
ALL BUT (column_group_list)
A multicolumn partition with autocompression and system-determined COLUMN or ROW format that includes all of the columns in the join index definition that are not specified in column_name_list or column_group_list.
You can only specify this option for column-partitioned join index.
COLUMN
Specifies that the join index has a column-partitioned level.
If you precede a column grouping with the COLUMN keyword, the system stores the grouped columns using COLUMN format.
ROW
Specifies that a column partition has ROW format. A ROW format means that only one column-partition value is stored in a physical row as a subrow.
If you precede a column grouping with the ROW keyword, the system stores the grouped columns using ROW format.
ADD constant
The maximum number of partitions for a partitioning level is the number of partitions it defines plus the value of constant.
The value of constant must be an unsigned BIGINT constant and cannot exceed 9,223,372,036,854,775,807.
You can only specify this option for column-partitioned join indexes.

INDEX

A set of nonunique secondary indexes defined on the join index.

index_name
The optional name of the NUSI being defined. For information about naming database objects, see Object Names.
ALL
The defined NUSI is to maintain Row ID pointers for each logical row of the join index, instead of only the compressed physical rows.
ALL also ignores the NOT CASESPECIFIC attribute of data types so a NUSI can include case-specific values.
ALL enables a NUSI to cover a join index, which enhances performance by eliminating the need to access the join index when all values needed by a query are in the secondary index. However, ALL may also require the use of additional index storage space.
Use this keyword only when a secondary index is being defined on top of a join index.
You cannot specify ALL with a PRIMARY index.
You cannot specify multiple indexes that differ only by the presence or absence of the ALL option.
index_column_name
A column set whose values are to be an index on this join index. If you specify more than one column, the new index is based on the combined values of each column.
A maximum of 64 columns can be defined for one index.
Multiple indexes can be defined on the same columns if each index differs in its ordering option, for example, VALUES or HASH.
If two specified columns have the same name, both names must be aliased with unique column name aliases. See Using a Column Alias.
Columns in a secondary index cannot have any of the following data types: .
  • Period
  • Geospatial
  • BLOB
  • CLOB
  • ARRAY/VARRAY
In the index_column_name list, you cannot specify the system-derived columns PARTITION or PARTITION#Ln, where n is an integer from 1 through 62. However, you can specify a user-defined column named PARTITION or PARTITION#L n.
ORDER BY
Row ordering on each AMP by a single NUSI column. The ordering of the NUSI column values can either value-ordered or hash-ordered. Each multicolumn NUSI defined with an ORDER BY clause counts as two consecutive indexes against the limit of 32 per join index. You cannot specify an ORDER BY clause in the same join index definition as a partitioned primary index.
Following are the rules for specifying an ORDER BY clause:
  • If you specify ORDER BY VALUES, column_name must be numeric with four bytes or less.
  • If you specify ORDER BY without specifying the HASH or VALUES keywords, VALUES is assumed by default.
  • The ORDER BY column_name must be one of the columns specified in the select list.
  • You cannot specify the system-derived PARTITION column as the ORDER BY column.
You can specify a user-defined column named partition.
VALUES
Value-ordering for the ORDER BY column. This is the default specification.
Select VALUES to optimize queries that return a contiguous range of values, especially for a covered index or a nested join.
HASH
Hash-ordering for the ORDER BY column.
Select HASH to limit hash-ordering to one column, rather than all columns of the primary index, which is the default.
Hash-ordering a multicolumn NUSI on one of its columns allows the index to participate in a nested join where join conditions involve only that ordering column.
order_column_name
A column in the select list that specifies the sort order to be used for NUSI ordering.
Columns in a join index cannot have any of the following data types: Geospatial, BLOB, CLOB, or XML.
You cannot specify the system-derived columns PARTITION or PARTITION#Ln , where n is an integer ranging from 1 through 62, inclusive, as part of the column_name_2 list.
However, you can specify a user-defined column named PARTITION or PARTITION#L n.
Supported data types for a value-ordered, four-bytes-or-less column_name_2 are:
  • BYTEINT
  • DATE
  • DECIMAL
  • INTEGER
  • SMALLINT