Input
SQL Call
SELECT * FROM VectorDistance (
  ON target_mobile_data AS TargetTable PARTITION BY UserID
  ON ref_mobile_data AS ReferenceTable DIMENSION
  USING
  TargetIdColumns ('UserID')
  TargetAttributeNameColumn ('Feature')
  TargetAttributeValueColumn ('value1')
  DistanceMeasure ('Cosine','Euclidean','Manhattan')
) AS dt ORDER BY 1;
Output
target_userid | ref_userid | type | distance |
---|---|---|---|
1 | 5 | euclidean | 1.1246501958870991 |
1 | 5 | manhattan | 1.7299667 |
1 | 5 | cosine | 0.45486517827424 |
2 | 5 | manhattan | 0.73 |
2 | 5 | euclidean | 0.5243090691567331 |
2 | 5 | cosine | 0.02608922985452755 |
3 | 5 | cosine | 0.024150544155866593 |
3 | 5 | manhattan | 0.67 |
3 | 5 | euclidean | 0.4526588119102511 |
4 | 5 | manhattan | 1.42 |
4 | 5 | euclidean | 1.047091209016674 |
4 | 5 | cosine | 0.43822243299800046 |
The following table (which is not output by the VectorDistance function) shows the distance of each target vector from the reference vector (UserID 5) and its similarity rank. The shorter the distance, the higher the similarity rank. In this example, the rank is the same under all three measures: a target that is closer to the reference in one measure is also closer in the other two. UserID 3 is the most similar to UserID 5.
target_userid | Cosine Distance | Euclidean Distance | Manhattan Distance | Similarity Rank |
---|---|---|---|---|
1 | 0.454865179 | 1.124650195 | 1.7299667 | 4 |
2 | 0.02608923 | 0.524309065 | 0.72999999 | 2 |
3 | 0.024150545 | 0.452658811 | 0.66999999 | 1 |
4 | 0.438222434 | 1.047091208 | 1.41999999 | 3 |
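The similarity ranks in the table above can be reproduced from the function output with a few lines of client-side code. The following Python sketch is illustrative only and is not part of VectorDistance: it copies the twelve output rows shown earlier into a list, groups them by distance measure, and ranks the targets by ascending distance. The helper name `rank_by_measure` is hypothetical, and the sketch assumes there are no tied distances.

```python
# Output rows copied from the VectorDistance result above:
# (target_userid, ref_userid, type, distance)
rows = [
    (1, 5, "euclidean", 1.1246501958870991),
    (1, 5, "manhattan", 1.7299667),
    (1, 5, "cosine", 0.45486517827424),
    (2, 5, "manhattan", 0.73),
    (2, 5, "euclidean", 0.5243090691567331),
    (2, 5, "cosine", 0.02608922985452755),
    (3, 5, "cosine", 0.024150544155866593),
    (3, 5, "manhattan", 0.67),
    (3, 5, "euclidean", 0.4526588119102511),
    (4, 5, "manhattan", 1.42),
    (4, 5, "euclidean", 1.047091209016674),
    (4, 5, "cosine", 0.43822243299800046),
]


def rank_by_measure(rows):
    """Hypothetical helper: return {measure: {target_userid: rank}}.

    Rank 1 is the shortest distance to the reference vector.
    Ties are not handled; this sketch assumes distinct distances.
    """
    by_measure = {}
    for target, _ref, measure, distance in rows:
        by_measure.setdefault(measure, []).append((distance, target))

    ranks = {}
    for measure, pairs in by_measure.items():
        ordered = sorted(pairs)  # ascending distance
        ranks[measure] = {
            target: position
            for position, (_distance, target) in enumerate(ordered, start=1)
        }
    return ranks


ranks = rank_by_measure(rows)
for measure in sorted(ranks):
    print(measure, ranks[measure])
```

For this data, each of the three measures yields the same ordering (UserID 3, then 2, then 4, then 1), which matches the Similarity Rank column. That agreement is a property of this example; the ranks should be computed per measure rather than assumed to coincide for other data.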
Download a zip file of all examples and a SQL script file that creates their input tables.