Unpack_MLE Syntax Elements - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
dita:id
B700-4003
TargetColumn
Specify the name of the input column that contains the packed data.
OutputColumns
Specify the names to give to the output columns, one for each virtual column in the packed data, in the order in which the virtual columns appear in target_column.
OutputDataTypes
Specify the data types of the unpacked output columns. Each must be an MLE data type; for information about these, see Teradata Vantage™ User Guide, B700-4002.

If OutputDataTypes specifies only one value and OutputColumns specifies multiple columns, the specified value applies to every output_column.

If OutputDataTypes specifies multiple values, it must specify a value for each output_column. The nth datatype corresponds to the nth output_column.
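The pairing rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the function's implementation; the helper name resolve_types and the column names are hypothetical.

```python
def resolve_types(output_columns, output_types):
    """Pair each output column with a data type (illustrative helper)."""
    # A single type applies to every output column.
    if len(output_types) == 1:
        output_types = output_types * len(output_columns)
    # Otherwise there must be exactly one type per output column.
    if len(output_types) != len(output_columns):
        raise ValueError("specify one type, or one type per output column")
    # The nth type corresponds to the nth output column.
    return dict(zip(output_columns, output_types))


# Hypothetical column names, used only for this example.
print(resolve_types(["city", "state", "temp_f"], ["varchar"]))
print(resolve_types(["city", "state", "temp_f"], ["varchar", "varchar", "real"]))
```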

Delimiter
[Optional] Specify the delimiter (a string) that separates the virtual columns in the packed data. If the delimiter contains a character that is a regular-expression symbol, such as an asterisk (*) or pipe character (|), precede that character with two escape characters. For example, if the delimiter is the pipe character, specify '\\|'.
Do not specify both this syntax element and the ColumnLength syntax element. If the virtual columns are separated by a delimiter, specify the delimiter with this syntax element; otherwise, specify the ColumnLength syntax element.
Default: ',' (comma)
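The reason for escaping is that a delimiter treated as a regular expression gives symbols such as | a special meaning. The following Python sketch shows the general idea with Python's own re module; it is an analogy, not the function's implementation, and the sample row and the helper name split_on_delimiter are hypothetical.

```python
import re


def split_on_delimiter(packed, delimiter=","):
    """Split packed data on a literal delimiter (illustrative helper)."""
    # re.escape makes regex symbols such as | or * match literally,
    # which is the effect the escape characters are meant to achieve.
    return re.split(re.escape(delimiter), packed)


row = "Nashville|Tennessee|35.1"  # hypothetical packed row
print(split_on_delimiter(row, "|"))        # ['Nashville', 'Tennessee', '35.1']
print(split_on_delimiter("a,b,c"))         # ['a', 'b', 'c'] (default comma)
```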
ColumnLength
[Optional] Specify the lengths of the virtual columns. To use this syntax element, you must know the length of each virtual column.

If ColumnLength specifies only one value and OutputColumns specifies multiple columns, the specified value applies to every output_column.

If ColumnLength specifies multiple values, it must specify a value for each output_column. The nth length corresponds to the nth output_column. However, the last value can be an asterisk (*), which represents a single virtual column that contains the remaining data. For example, if the first three virtual columns have the lengths 2, 1, and 3, and all remaining data belongs to the fourth virtual column, you can specify ColumnLength (2, 1, 3, *).

Do not specify both this syntax element and the Delimiter syntax element.
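Fixed-length unpacking with a trailing asterisk can be pictured as slicing the packed string at known offsets. The Python sketch below illustrates that behavior under the rules described above; it is not the function's implementation, and the helper name split_fixed_width and the sample string are hypothetical.

```python
def split_fixed_width(packed, lengths):
    """Slice packed data into fields of the given lengths (illustrative).

    A trailing "*" stands for one field that takes all remaining data.
    """
    fields, pos = [], 0
    for i, length in enumerate(lengths):
        if length == "*":
            if i != len(lengths) - 1:
                raise ValueError("'*' is allowed only as the last length")
            fields.append(packed[pos:])
            pos = len(packed)
        else:
            fields.append(packed[pos:pos + length])
            pos += length
    return fields


# Lengths 2, 1, 3, then everything that remains.
print(split_fixed_width("CA1SFOremaining text", [2, 1, 3, "*"]))
# ['CA', '1', 'SFO', 'remaining text']
```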
Regex
[Optional] Specify a regular expression that describes a row of packed data, enabling the function to find the data values.
A row of packed data contains a data value for each virtual column, but the row might also contain other information (such as the virtual column name). In the regular_expression, each data value is enclosed in parentheses.
For example, suppose that the packed data has two virtual columns, age and sex, and that one row of packed data is age:34,sex:male. The regular_expression that describes the row is '.*:(.*)'. The '.*:' matches the virtual column names, age and sex, and the '(.*)' matches the values, 34 and male.
To represent multiple data groups in regular_expression, use multiple pairs of parentheses. By default, the last data group in regular_expression represents the data value (other data groups are assumed to be virtual column names or unwanted data). If a different data group represents the data value, specify its group number with the RegexSet syntax element.
Default: '(.*)', which matches the whole string (between delimiters, if any). When applied to the preceding sample row, the default regular_expression causes the function to return 'age:34' and 'sex:male' as data values.
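The age and sex example can be reproduced with Python's re module. This sketch assumes the expression is applied to each delimited piece of the row and that the last data group supplies the value; it mirrors the behavior described above but is not the function's implementation, and the helper name extract_values is hypothetical.

```python
import re


def extract_values(packed, pattern, delimiter=","):
    """Apply pattern to each delimited piece and keep the last group."""
    values = []
    for piece in packed.split(delimiter):
        match = re.fullmatch(pattern, piece)
        # match.re.groups is the number of data groups in the pattern,
        # so this picks the last one.
        values.append(match.group(match.re.groups) if match else None)
    return values


row = "age:34,sex:male"
print(extract_values(row, r".*:(.*)"))  # ['34', 'male']
print(extract_values(row, r"(.*)"))     # ['age:34', 'sex:male']
```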
RegexSet
[Optional] Specify the ordinal number of the data group in regular_expression that represents the data value in a virtual column. For example, suppose that regular_expression is '([a-zA-Z]*):(.*)'. If group_number is '1', '([a-zA-Z]*)' represents the data value. If group_number is '2', '(.*)' represents the data value.
Default behavior: The last data group in regular_expression represents the data value.
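Group selection works the same way as numbered capture groups in most regular-expression libraries. The Python sketch below illustrates the example above with the sample value age:34; it is an analogy rather than the function's implementation, and the helper name select_group is hypothetical.

```python
import re


def select_group(piece, pattern, group_number=None):
    """Return the data group chosen by group_number (illustrative).

    When group_number is None, the last data group is used, matching
    the default behavior described above.
    """
    match = re.fullmatch(pattern, piece)
    if match is None:
        return None
    if group_number is None:
        group_number = match.re.groups
    return match.group(group_number)


pattern = r"([a-zA-Z]*):(.*)"
print(select_group("age:34", pattern, 1))  # age
print(select_group("age:34", pattern, 2))  # 34
print(select_group("age:34", pattern))     # 34 (last group by default)
```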
IgnoreInvalid
[Optional] Specify whether the function ignores rows that contain invalid data.
Default: 'false' (The function fails if it encounters a row with invalid data.)
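The two behaviors can be contrasted with a small Python sketch. It assumes, purely for illustration, that a row is invalid when it has the wrong number of values or when a value cannot be converted to its declared type; the source does not define invalid data, and the helper name unpack_rows and the sample rows are hypothetical.

```python
def unpack_rows(rows, converters, ignore_invalid=False):
    """Unpack comma-separated rows, converting each value (illustrative).

    Assumption: a row is "invalid" if its value count is wrong or a
    conversion raises ValueError.
    """
    unpacked = []
    for row in rows:
        try:
            pieces = row.split(",")
            if len(pieces) != len(converters):
                raise ValueError(f"expected {len(converters)} values: {row!r}")
            unpacked.append([conv(p) for conv, p in zip(converters, pieces)])
        except ValueError:
            if not ignore_invalid:
                raise  # default: fail on the first invalid row
            # otherwise skip the invalid row and continue
    return unpacked


rows = ["Nashville,35.1", "Knoxville,n/a"]  # second row cannot convert
print(unpack_rows(rows, [str, float], ignore_invalid=True))
# [['Nashville', 35.1]]
```

With ignore_invalid left at False, the same call raises ValueError on the second row instead of skipping it.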