VectorDistance Syntax Elements - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 9.02, 9.01, 2.0, 1.3
Published: February 2022
Language: English (United States)
Last Update: 2022-02-10
dita:id: B700-4003
Lifecycle: previous
Product Category: Teradata Vantage™
TargetIDColumns
Specify the names of the columns that comprise the target vector identifier. You must partition the target input table by these columns and specify them with this syntax element.
RefIDColumns
[Optional] Specify the names of the columns that comprise the reference vector identifier.
Default: TargetIDColumns value
TargetAttributeNameColumn
[Optional] Specify the name of the column that contains the target vector feature name (for example, the axis of a 3-D vector).
The function drops any entry that has a NULL value in this column.
TargetAttributeValueColumn
[Optional] Specify the name of the column that contains the value for the target vector feature. If you omit this syntax element, each feature (that is, each row) has the target value 1.
The function drops any entry that has a NULL value in this column.
RefAttributeNameColumn
[Optional] Specify the name of the column that contains the reference vector feature name.
The function drops any entry that has a NULL value in this column.
Default: TargetAttributeNameColumn value
RefAttributeValueColumn
[Optional] Specify the name of the column that contains the value for the reference vector feature.
The function drops any entry that has a NULL value in this column.
Default: TargetAttributeValueColumn value
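The sparse-input rules above (drop entries with a NULL feature name or value; treat each feature as having the value 1 when no value column is given) can be illustrated with a minimal Python sketch. This is not Teradata code: the function name `build_sparse_vectors`, the tuple layout `(id, feature, value)`, and the use of `None` for NULL are assumptions made for illustration only.

```python
from collections import defaultdict


def build_sparse_vectors(rows, has_value_column=True):
    """Group sparse-format rows into {vector_id: {feature: value}}.

    rows: tuples of (id, feature, value), or (id, feature) when there is
    no value column. None stands in for SQL NULL (an assumption).
    """
    vectors = defaultdict(dict)
    for row in rows:
        vec_id, feature = row[0], row[1]
        # With no value column, every feature gets the value 1.
        value = row[2] if has_value_column else 1
        # Entries with a NULL feature name or NULL value are dropped.
        if feature is None or value is None:
            continue
        vectors[vec_id][feature] = value
    return dict(vectors)


rows = [(1, "x", 0.5), (1, "y", None), (1, None, 2.0), (2, "x", 3.0)]
vectors = build_sparse_vectors(rows)
```

In this sketch, vector 1 keeps only its `x` entry, because its other two rows contain a NULL.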
TargetColumns
[Optional] Specify the names of the columns that contain target vector feature values (for example, the names of the three axes of a 3-D vector).
TargetColumns and RefColumns must specify the same number of columns.
For dense-format input, TargetColumns and RefColumns must specify the same columns; otherwise results are invalid.
RefColumns
[Optional] Specify the names of the columns that contain reference vector feature values (for example, the names of the three axes of a 3-D vector).
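As a rough illustration of dense-format input, the following Python sketch reads one target row and one reference row into vectors using two column lists, and rejects lists of different lengths. The helper names and the dictionary-per-row representation are hypothetical; the sketch pairs columns by position, which is an assumption rather than documented behavior.

```python
def dense_vector(row, columns):
    """Read the listed columns of a row (a dict) into a list of values."""
    return [row[name] for name in columns]


def paired_vectors(target_row, target_columns, ref_row, ref_columns):
    """Return (target_vector, ref_vector), pairing columns by position."""
    # TargetColumns and RefColumns must specify the same number of columns.
    if len(target_columns) != len(ref_columns):
        raise ValueError("TargetColumns and RefColumns differ in length")
    return (dense_vector(target_row, target_columns),
            dense_vector(ref_row, ref_columns))


target = {"id": 1, "x": 1.0, "y": 2.0, "z": 3.0}
reference = {"id": 7, "x": 0.0, "y": 2.0, "z": 5.0}
p, q = paired_vectors(target, ["x", "y", "z"], reference, ["x", "y", "z"])
```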
RefTableSize
[Optional] Specify the size of the ReferenceTable. Specify 'LARGE' only if the ReferenceTable does not fit in memory, because 'SMALL' allows faster processing.
Default: 'SMALL'
DistanceMeasure
[Optional] Specify the distance measures that the function uses.
Option Description
'cosine' Cosine distance between vectors p and q:

$$d_{\text{cosine}}(p, q) = 1 - \frac{\sum_{i=1}^{n} p_i q_i}{\sqrt{\sum_{i=1}^{n} p_i^2}\,\sqrt{\sum_{i=1}^{n} q_i^2}}$$

'euclidean' Euclidean distance between vectors p and q:

$$d_{\text{euclidean}}(p, q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$

'manhattan' Manhattan distance between vectors p and q:

$$d_{\text{manhattan}}(p, q) = \sum_{i=1}^{n} \lvert p_i - q_i \rvert$$

'binary' Binary distance between two vectors is 1 if vectors are identical and 0 otherwise.
Default: 'cosine'
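For readers who want to check results by hand, here is a minimal Python sketch of the cosine, Euclidean, and Manhattan measures for two dense vectors of equal length. It assumes the usual textbook definitions, with cosine distance taken as 1 minus cosine similarity. The function names are illustrative and are not part of the VectorDistance interface.

```python
import math


def cosine_distance(p, q):
    """1 minus the cosine similarity of p and q (assumed definition)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    # Raises ZeroDivisionError if either vector has zero length.
    return 1.0 - dot / (norm_p * norm_q)


def euclidean_distance(p, q):
    """Square root of the sum of squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan_distance(p, q):
    """Sum of absolute component differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


p = [1.0, 0.0, 0.0]
q = [0.0, 1.0, 0.0]
results = {
    "cosine": cosine_distance(p, q),
    "euclidean": euclidean_distance(p, q),
    "manhattan": manhattan_distance(p, q),
}
```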
IgnoreMismatch
[Optional for sparse input, ignored otherwise.] Specify whether to drop mismatched dimensions, that is, features that appear in only one of the two vectors being compared. If DistanceMeasure is 'cosine', this syntax element is always 'false'. If you specify 'true', two vectors with no common features become two empty vectors when only their common features are considered, and the function cannot measure the distance between them.
Default: 'true'
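The effect of dropping mismatched dimensions can be sketched in Python with sparse vectors stored as dictionaries. The helper name `align` is hypothetical, and treating a missing feature as 0 when mismatches are kept is an assumption of this sketch, not a statement about how the function handles them.

```python
def align(p, q, ignore_mismatch):
    """Turn two sparse vectors (dicts) into two equal-length lists.

    ignore_mismatch=True keeps only features present in both vectors.
    ignore_mismatch=False keeps every feature and fills gaps with 0.0
    (an assumption made for this sketch).
    """
    if ignore_mismatch:
        keys = sorted(p.keys() & q.keys())
    else:
        keys = sorted(p.keys() | q.keys())
    return [p.get(k, 0.0) for k in keys], [q.get(k, 0.0) for k in keys]


p = {"x": 1.0, "y": 2.0}
q = {"y": 3.0, "z": 4.0}
common_only = align(p, q, ignore_mismatch=True)
all_features = align(p, q, ignore_mismatch=False)

# Vectors with no common features collapse to two empty lists,
# so no distance can be computed between them.
disjoint = align({"x": 1.0}, {"z": 1.0}, ignore_mismatch=True)
```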
ReplaceInvalid
[Optional] Specify the value to return when the function encounters an infinite value or empty vectors. To return a custom value, supply any DOUBLE PRECISION value.
Default: 'PositiveInfinity'
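A minimal Python sketch of the replacement rule follows. It assumes that a distance of `None` stands for "empty vectors, nothing to measure" and that positive infinity is the default replacement. The function name and the `None` convention are illustrative, not part of the product.

```python
import math


def replace_invalid(distance, replacement=math.inf):
    """Return replacement for an infinite distance or for empty vectors.

    distance is None when the vectors were empty (a convention of this
    sketch); any finite distance is returned unchanged.
    """
    if distance is None or math.isinf(distance):
        return replacement
    return distance


default_case = replace_invalid(None)               # positive infinity
custom_case = replace_invalid(math.inf, 1.0e9)     # custom DOUBLE PRECISION value
valid_case = replace_invalid(2.5)                  # valid distance, unchanged
```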
TopK
[Optional] Specify, for each target vector and for each measure, the maximum number of closest reference vectors to include in the output table. The value, k, must be a positive INTEGER.
Default behavior: Function includes all reference vectors in the output table.
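The selection that TopK describes can be sketched in Python for a single target vector and a single measure. The function name `top_k` and the `(ref_id, distance)` pair layout are assumptions; returning the pairs sorted by distance when no limit is given is a choice made for this sketch.

```python
import heapq


def top_k(distances, k=None):
    """Return the k closest reference vectors for one target and measure.

    distances: list of (ref_id, distance) pairs.
    k=None mirrors the default behavior: every reference vector is kept.
    """
    if k is None:
        return sorted(distances, key=lambda pair: pair[1])
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # nsmallest returns the k pairs with the smallest distance, ascending.
    return heapq.nsmallest(k, distances, key=lambda pair: pair[1])


distances = [("r1", 0.9), ("r2", 0.1), ("r3", 0.4)]
closest_two = top_k(distances, k=2)
everything = top_k(distances)
```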
MaxDistance
[Optional] Specify the maximum distance between a pair of target and reference vectors. If the distance exceeds the threshold, the pair does not appear in the output table.
If the DistanceMeasure syntax element specifies multiple measures, then the MaxDistance syntax element must specify a threshold for each measure. The ith threshold corresponds to the ith measure. Each threshold can be any DOUBLE PRECISION value.
Default behavior: The function returns all results.
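The pairing of thresholds with measures can be sketched in Python as follows. The row layout `(target_id, ref_id, measure, distance)` and the function name are hypothetical. The sketch keeps a pair when its distance is less than or equal to the threshold, following the statement that a pair is dropped only when its distance exceeds the threshold.

```python
def filter_by_max_distance(rows, measures, thresholds):
    """Keep rows whose distance does not exceed the measure's threshold.

    rows: tuples of (target_id, ref_id, measure, distance).
    measures and thresholds are matched by position: the ith threshold
    applies to the ith measure.
    """
    if len(measures) != len(thresholds):
        raise ValueError("one threshold is required for each measure")
    limit = dict(zip(measures, thresholds))
    return [row for row in rows if row[3] <= limit[row[2]]]


rows = [
    (1, 10, "cosine", 0.2),
    (1, 11, "cosine", 0.8),
    (1, 10, "euclidean", 3.0),
    (1, 11, "euclidean", 7.5),
]
kept = filter_by_max_distance(rows, ["cosine", "euclidean"], [0.5, 5.0])
```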
OutputFormat
[Optional] Specify the format of the output table. For large data sets, Teradata recommends input in dense format, for which computing distances is faster.
Default: 'SPARSE'
InputTablesSame
[Optional without TopK, disallowed otherwise.] When TargetTable and ReferenceTable are the same, specify 'true' to increase speed of computing distances.
If you specify InputTablesSame, you must include the ORDER BY clause in the ReferenceTable specification.