Row compression automatically divides rows into repeating and nonrepeating portions. The system appends multiple sets of nonrepeating column values to a single set of repeating column values. This allows the system to store the repeating value set only once, while any nonrepeating column values are stored as logical extensions of the base repeating set. The nonrepeating column set has a pointer to each of the repeating column sets so it can reconstruct the rows when required.
Row compression significantly reduces index storage space for index columns that have a large number of repeating values.
For hash indexes, the system compresses rows by default.
For join indexes you can:
- Specify the columns to be compressed using the (column_1_name) syntax elements in the SELECT clause of a CREATE JOIN INDEX statement.
- Specify the columns to remain uncompressed using the (column_2_name) syntax elements in the SELECT clause for the same CREATE JOIN INDEX statement.
Example 1: Join Index Row Compression
- Two columns with repeating values:
- Column A indicates whether the customer owns a truck or car
- Column B indicates whether the vehicle is domestic or foreign made
- Three columns of unique, or mostly unique, information:
- Column C is the customer name
- Column D is the customer address
- Column E is the customer phone number
The uncompressed join index row data is as follows:
(car,foreign,joe,addr1,ph1) (car,foreign,jane,addr2,ph2) (car,foreign,max,addr3,ph3) (car,domestic,bill,addr6,ph6) (car,domestic,linda,addr7,ph7) (truck,domestic,brad,addr4,ph4) (truck,domestic,sam,addr5,ph5) (truck,foreign,dave,addr8,ph8) (truck,foreign,rick,addr9,ph9)
You can specify row compression in a CREATE JOIN INDEX statement, for example:
CREATE JOIN INDEX ji_name AS SELECT (col_a,col_b),(col c,col_d,col_e) FROM table_name . . . ;
where the first set of parentheses defines the columns (col_a and col_b) for which duplicate row values are compressed, and the second set of parentheses defines the remaining columns (col c, col_d, and col_e), for which row values are not compressed.
As a result, the system stores each possible combination of the repeating values for col_a and col_b once, grouped with the related row values for the uncompressed columns, for example:
((car,foreign),(joe,addr1,ph1),(jane,addr2,ph2),(max,addr3,ph3)) ((car,domestic),(bill,addr6,ph6),(linda,addr7,ph7)) ((truck,domestic),(brad,addr4,ph4),(sam,addr5,ph5)) ((truck,foreign),(dave,addr8,ph8),(rick,addr9,ph9))
Benefits
Row compression significantly reduces index storage space for index columns that have a large number of repeating values.
Usage Notes
Row compression is valuable where column values repeat from row to row. Columns in a join index that have a high percentage of distinct (non-repeating) values are not a good candidates for row compression. Row compression cannot be used with data using row-level security constraints. A join index with row compression cannot be column-partitioned.
In addition to any row compression you define for a join index, the index also inherits MVC compression defined for its base tables, with some exceptions.
Related Topics
For information on... | See... |
---|---|
strategies and restrictions when specifying row compression in a join index, including how join indexes inherit MVC from base tables | “CREATE JOIN INDEX” in Teradata Vantage™ - SQL Data Definition Language Detailed Topics , B035-1184 |
syntax and examples for using the CREATE JOIN INDEX statement | “CREATE JOIN INDEX” in Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144 |
join index row compression as it relates to database design | Join Index Storage |