VectorDistance Example 1: Default Thresholds

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.00, 1.0
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22

SQL Call

SELECT * FROM VectorDistance (
  ON target_mobile_data AS target PARTITION BY UserID
  ON ref_mobile_data AS ref DIMENSION
  USING
  TargetIDColumns ('UserID')
  TargetFeatureColumn ('Feature')
  TargetValueColumn ('value1')
  DistanceMeasure ('Cosine', 'Euclidean', 'Manhattan')
) AS dt ORDER BY Target_UserID;

Output

target_userid  ref_userid  type       distance
1              5           cosine     0.454865178527558
1              5           euclidean  1.12465019517762
1              5           manhattan  1.72996669672284
2              5           cosine     0.0260892301077248
2              5           euclidean  0.524309064791334
2              5           manhattan  0.729999989271164
3              5           cosine     0.0241505454220814
3              5           euclidean  0.452658810804166
3              5           manhattan  0.669999986886978
4              5           cosine     0.438222433743287
4              5           euclidean  1.04709120838197
4              5           manhattan  1.41999999247491
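
For readers who want to see what the three reported measures compute, the following is a minimal Python sketch of the textbook definitions, not the VectorDistance implementation. It assumes that cosine distance means 1 minus cosine similarity and that a feature missing from a vector counts as 0. The vectors `target` and `ref` and their feature names are hypothetical; they are not rows of target_mobile_data or ref_mobile_data.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse vectors (dict: feature -> value)."""
    dot = sum(value * b.get(feature, 0.0) for feature, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    # Assumes neither vector is all zeros; a zero norm would divide by zero.
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Return the square root of the summed squared differences over all features."""
    features = set(a) | set(b)
    return math.sqrt(sum((a.get(f, 0.0) - b.get(f, 0.0)) ** 2 for f in features))


def manhattan_distance(a, b):
    """Return the sum of absolute differences over all features."""
    features = set(a) | set(b)
    return sum(abs(a.get(f, 0.0) - b.get(f, 0.0)) for f in features)


# Hypothetical vectors for illustration only (not the example's input tables).
target = {"calls": 3.0, "sms": 4.0}
ref = {"calls": 4.0, "sms": 3.0}

cos_d = cosine_distance(target, ref)
euc_d = euclidean_distance(target, ref)
man_d = manhattan_distance(target, ref)
print("cosine", cos_d)
print("euclidean", euc_d)
print("manhattan", man_d)
```

Representing each vector as a dictionary mirrors the feature/value layout that the TargetFeatureColumn and TargetValueColumn arguments describe, where each row holds one feature of one vector.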

The following table (which is not output by the VectorDistance function) shows the distance of each target vector from the reference vector (UserID 5) and its similarity rank. The shorter the distance, the higher the similarity rank (rank 1 is the most similar). In this example, the similarity ranks are the same under all three measures, although in general different distance measures can order the same vectors differently. UserID 3 is most similar to UserID 5.

Target Distances from Reference and Similarity Ranks
target_userid  Cosine Distance  Euclidean Distance  Manhattan Distance  Similarity Rank
1              0.454865179      1.124650195         1.7299667           4
2              0.02608923       0.524309065         0.72999999          2
3              0.024150545      0.452658811         0.66999999          1
4              0.438222434      1.047091208         1.41999999          3
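
Because the function does not output the similarity ranks, they have to be derived from its output. The following Python sketch shows one possible way to do that: it copies the rows from the Output section, groups them by measure, and ranks the targets by ascending distance. The helper name `rank_by_measure` is hypothetical and is not part of Teradata Vantage.

```python
from collections import defaultdict

# (target_userid, type, distance) rows copied from the Output section.
rows = [
    (1, "cosine", 0.454865178527558),
    (1, "euclidean", 1.12465019517762),
    (1, "manhattan", 1.72996669672284),
    (2, "cosine", 0.0260892301077248),
    (2, "euclidean", 0.524309064791334),
    (2, "manhattan", 0.729999989271164),
    (3, "cosine", 0.0241505454220814),
    (3, "euclidean", 0.452658810804166),
    (3, "manhattan", 0.669999986886978),
    (4, "cosine", 0.438222433743287),
    (4, "euclidean", 1.04709120838197),
    (4, "manhattan", 1.41999999247491),
]


def rank_by_measure(rows):
    """Map each measure to {target_userid: rank}, where rank 1 is the shortest distance."""
    by_measure = defaultdict(list)
    for target, measure, distance in rows:
        by_measure[measure].append((distance, target))

    ranks = {}
    for measure, pairs in by_measure.items():
        ordered = sorted(pairs)  # ascending distance: most similar first
        ranks[measure] = {
            target: position
            for position, (_, target) in enumerate(ordered, start=1)
        }
    return ranks


ranks = rank_by_measure(rows)
for measure in sorted(ranks):
    print(measure, ranks[measure])
```

For these rows, every measure orders the targets the same way (UserID 3, then 2, 4, and 1), which matches the Similarity Rank column.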