17.05 - Unpack Syntax Elements - Teradata Database

Teradata Vantage™ - Advanced SQL Engine Analytic Functions

Product: Advanced SQL Engine, Teradata Database
Release: 17.00, 17.05
Created: June 2020
Category: Programming Reference
Document number: B035-1206-170K
TargetColumn
Specify the name of the input column that contains the packed data.
OutputColumns
Specify the names to give to the output columns, in the order in which the corresponding virtual columns appear in target_column. The names must be valid object names, as defined in Teradata Vantage™ - SQL Fundamentals, B035-1141.
If you specify fewer output column names than there are virtual input columns, the function ignores the extra virtual input columns. That is, if the packed data contains x+y virtual columns and the OutputColumns syntax element specifies x output column names, the function assigns the names to the first x virtual columns and ignores the remaining y virtual columns.
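The naming rule above can be sketched in Python. This is only an illustration of the described behavior, not the function's implementation; the helper name `name_virtual_columns` and the sample data are hypothetical.

```python
def name_virtual_columns(packed, output_columns, delimiter=","):
    """Pair output column names with the leading virtual columns (illustrative only)."""
    values = packed.split(delimiter)
    # zip() stops at the shorter sequence, so virtual columns beyond the
    # last output column name are ignored, as the text describes.
    return dict(zip(output_columns, values))


# Four virtual columns, three names: the fourth virtual column is dropped.
row = name_virtual_columns("Nashville,Tennessee,35.1,unused",
                           ["city", "state", "temp_f"])
print(row)
```

With x = 3 names and x + y = 4 virtual columns, only the first three values receive names.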
OutputDataTypes
Specify the data types of the unpacked output columns. Supported data types are VARCHAR, INTEGER, DOUBLE PRECISION, TIME, DATE, and TIMESTAMP.
If OutputDataTypes specifies only one value and OutputColumns specifies multiple columns, the specified value applies to every output_column.
If OutputDataTypes specifies multiple values, it must specify a value for each output_column. The nth data type corresponds to the nth output_column.
The function can output at most 16 VARCHAR columns.
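The one-value-versus-many rule for OutputDataTypes can be pictured with a short Python sketch. The helper `expand_data_types` is hypothetical and only mirrors the rule as stated; it does not model the VARCHAR column limit.

```python
def expand_data_types(output_columns, output_data_types):
    """Return one data type per output column (illustrative only)."""
    if len(output_data_types) == 1:
        # A single value applies to every output column.
        return list(output_data_types) * len(output_columns)
    if len(output_data_types) != len(output_columns):
        raise ValueError("specify one data type, or one per output column")
    # The nth data type corresponds to the nth output column.
    return list(output_data_types)


print(expand_data_types(["city", "state"], ["VARCHAR"]))
print(expand_data_types(["city", "temp_f"], ["VARCHAR", "DOUBLE PRECISION"]))
```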
Delimiter
[Optional] Specify the delimiter that separates the virtual columns in the packed data. The delimiter must be a single Unicode character in Normalization Form C (NFC) and is case-sensitive.
Do not specify both this syntax element and the ColumnLength syntax element. If the virtual columns are separated by a delimiter, specify the delimiter with this syntax element; otherwise, specify the ColumnLength syntax element.
Default: ',' (comma)
ColumnLength
[Optional] Specify the lengths of the virtual columns. To use this syntax element, you must know the length of each virtual column.
If ColumnLength specifies only one value and OutputColumns specifies multiple columns, the specified value applies to every output_column.
If ColumnLength specifies multiple values, it must specify a value for each output_column. The nth length corresponds to the nth output_column. However, the last value can be an asterisk (*), which represents a single virtual column that contains the remaining data. For example, if the first three virtual columns have the lengths 2, 1, and 3, and all remaining data belongs to the fourth virtual column, you can specify ColumnLength ('2', '1', '3', *).
Do not specify both this syntax element and the Delimiter syntax element.
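As a rough picture of fixed-width unpacking, the following Python sketch applies the length rules described above, including the trailing asterisk. The helper `split_fixed_width` and its `n_columns` parameter are hypothetical, and the handling of edge cases (such as data shorter than the stated lengths) is an assumption.

```python
def split_fixed_width(packed, column_lengths, n_columns):
    """Cut a packed string into virtual columns by length (illustrative only)."""
    lengths = list(column_lengths)
    if len(lengths) == 1 and n_columns > 1:
        # A single length applies to every output column.
        lengths = lengths * n_columns
    values = []
    position = 0
    for index, length in enumerate(lengths):
        if length == "*":
            if index != len(lengths) - 1:
                raise ValueError("'*' is allowed only as the last length")
            # The asterisk takes all remaining data.
            values.append(packed[position:])
            position = len(packed)
        else:
            width = int(length)
            values.append(packed[position:position + width])
            position += width
    return values


# Lengths 2, 1, 3, then everything that remains.
print(split_fixed_width("ABCDEFGHIJ", ["2", "1", "3", "*"], 4))
```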
Regex
[Optional] Specify a regular expression that describes a row of packed data, enabling the function to find the data values.
A row of packed data contains a data value for each virtual column, but the row might also contain other information (such as the virtual column name). In the regular_expression, each data value is enclosed in parentheses.
For example, suppose that the packed data has two virtual columns, age and sex, and that one row of packed data is age:34,sex:male. The regular_expression that describes the row is '.*:(.*)'. The '.*:' matches the virtual column names, age and sex, and the '(.*)' matches the values, 34 and male.
To represent multiple data groups in regular_expression, use multiple pairs of parentheses. By default, the last data group in regular_expression represents the data value (other data groups are assumed to be virtual column names or unwanted data). If a different data group represents the data value, specify its group number with the RegexSet syntax element.
Default: '(.*)', which matches the whole string (between delimiters, if any). When applied to the preceding sample row, the default regular_expression causes the function to return 'age:34' and 'sex:male' as data values.
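The age/sex example can be reproduced with Python's `re` module. This is a sketch under two assumptions: that Python's regular expression flavor behaves like the function's for these simple patterns, and that the pattern is matched against each delimited piece as a whole. The helper `extract_values` is hypothetical.

```python
import re


def extract_values(packed, regex="(.*)", delimiter=","):
    """Apply the regex to each delimited piece and keep the last group (illustrative only)."""
    pattern = re.compile(regex)
    values = []
    for piece in packed.split(delimiter):
        match = pattern.fullmatch(piece)
        # pattern.groups is the number of capturing groups, so this selects
        # the last data group, which is the default described above.
        values.append(match.group(pattern.groups) if match else None)
    return values


print(extract_values("age:34,sex:male", regex=".*:(.*)"))  # values only
print(extract_values("age:34,sex:male"))                   # default keeps the whole piece
```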
RegexSet
[Optional] Specify the ordinal number of the data group in regular_expression that represents the data value in a virtual column.
Default behavior: The last data group in regular_expression represents the data value. For example, suppose that regular_expression is '([a-zA-Z]*):(.*)'. If group_number is '1', '([a-zA-Z]*)' represents the data value. If group_number is '2', or if RegexSet is omitted, '(.*)' represents the data value.
Maximum: 30
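Group selection can be illustrated the same way in Python. The helper `extract_group` and its `regex_set` parameter are hypothetical stand-ins for the Regex and RegexSet syntax elements, and Python's `re` module is assumed to behave like the function's regex engine for this pattern.

```python
import re


def extract_group(piece, regex, regex_set=None):
    """Return the data group selected by regex_set, or the last group by default."""
    pattern = re.compile(regex)
    match = pattern.fullmatch(piece)
    if match is None:
        return None
    # With no explicit group number, fall back to the last data group.
    group_number = pattern.groups if regex_set is None else regex_set
    return match.group(group_number)


regex = "([a-zA-Z]*):(.*)"
print(extract_group("age:34", regex, regex_set=1))  # first group: the name
print(extract_group("age:34", regex, regex_set=2))  # second group: the value
print(extract_group("age:34", regex))               # default: last group
```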
IgnoreInvalid
[Optional] Specify whether the function ignores rows that contain invalid data.
IgnoreInvalid may not behave as you expect if an item in a virtual column has trailing special characters. See Unpack Example: IgnoreInvalid ('true') with Trailing Special Characters.
Default: 'false' (The function fails if it encounters a row with invalid data.)
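The difference between the two settings can be pictured with a small Python sketch. Here "invalid" is assumed to mean that a value fails conversion to the declared type (INTEGER, modeled with `int()`); the function's actual validity rules, including its treatment of trailing special characters, may differ. The helper `unpack_integers` is hypothetical.

```python
def unpack_integers(rows, ignore_invalid=False, delimiter=","):
    """Unpack rows of delimited integers, optionally skipping bad rows (illustrative only)."""
    unpacked = []
    for packed in rows:
        try:
            unpacked.append([int(value) for value in packed.split(delimiter)])
        except ValueError:
            if not ignore_invalid:
                # Mirrors the default: stop at the first invalid row.
                raise
            # Otherwise drop the whole row and continue.
    return unpacked


rows = ["1,2", "3,x", "5,6"]
print(unpack_integers(rows, ignore_invalid=True))  # the "3,x" row is skipped
```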