TD_VectorDistance Function

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-02-17

The TD_VectorDistance function accepts a table of target vectors and a table of reference vectors and returns a table that contains the distance between each target-reference pair. If you provide only one input table, the function takes both the target vectors and the reference vectors from that table. The columns in the TargetFeatureColumns argument and the RefFeatureColumns argument must be listed in the same order. During distance computation, the function ignores any feature value that is NULL, NaN, or INF.
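The behavior described above can be illustrated outside the database with a small Python sketch. This is not Teradata's implementation: the helper names (`is_usable`, `euclidean_skipping_invalid`, `pairwise_distances`) are hypothetical, Euclidean distance is chosen only for illustration, and "ignores the feature value" is assumed here to mean that a feature position is skipped when either side is NULL (`None`), NaN, or INF.

```python
import math

def is_usable(value):
    """Return True if value is a finite number (not None, NaN, or +/-inf)."""
    return value is not None and math.isfinite(value)

def euclidean_skipping_invalid(target, reference):
    """Euclidean distance over the positions where both values are usable.

    Assumption: a feature is skipped entirely if either side is unusable.
    """
    total = 0.0
    for t, r in zip(target, reference):
        if is_usable(t) and is_usable(r):
            total += (t - r) ** 2
    return math.sqrt(total)

def pairwise_distances(targets, references):
    """Return (target_id, reference_id, distance) for every target-reference pair."""
    return [
        (t_id, r_id, euclidean_skipping_invalid(t_vec, r_vec))
        for t_id, t_vec in targets.items()
        for r_id, r_vec in references.items()
    ]

# Feature order must match between target and reference vectors.
targets = {1: [0.0, 0.0, None], 2: [1.0, float("nan"), 2.0]}
references = {10: [3.0, 4.0, 5.0]}
rows = pairwise_distances(targets, references)
for row in rows:
    print(row)
```

For target 1, the third feature is skipped, so the distance uses only the first two features; for target 2, the second feature is skipped.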

Important: If you set TopK to -1, the function returns on the order of N² output rows, because it includes all reference vectors in the output table.
The algorithm used in this function is of the order of N² (where N is the number of rows), so the query runs significantly longer as the number of rows increases in either the target table or the reference table. Because the reference table is a DIMENSION input, it is copied to the spool for each AMP before the query runs. The user's spool limit therefore constrains the size and scalability of the input.
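To make the quadratic growth concrete, the following Python sketch mimics the effect of a top-k cutoff on the number of output rows. It is an illustration under assumptions, not the function's actual algorithm: `top_k_neighbors` is a hypothetical helper, plain Euclidean distance is used, the same set of points serves as both target and reference input, and self-pairs are kept.

```python
import math

def top_k_neighbors(targets, references, top_k):
    """For each target, return its top_k closest references.

    top_k=-1 keeps every reference, mirroring the TopK(-1) case described above.
    """
    rows = []
    for t_id, t_vec in targets.items():
        # Every target is compared with every reference: N_target * N_ref distances.
        scored = sorted(
            (math.dist(t_vec, r_vec), r_id) for r_id, r_vec in references.items()
        )
        if top_k != -1:
            scored = scored[:top_k]
        rows.extend((t_id, r_id, d) for d, r_id in scored)
    return rows

# Four points used as both the target set and the reference set.
points = {i: [float(i), 0.0] for i in range(4)}
all_pairs = top_k_neighbors(points, points, top_k=-1)
nearest_two = top_k_neighbors(points, points, top_k=2)
print(len(all_pairs), len(nearest_two))  # 16 8
```

With 4 rows, keeping all references yields 4 × 4 = 16 output rows, while a cutoff of 2 yields 8. The number of distance computations is 16 in both cases, which is why the running time grows quadratically regardless of the cutoff.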

Vector distance is a measure of the similarity or dissimilarity between two vectors in multidimensional space. It is a fundamental concept in machine learning and data analysis, where it is used to determine the distance between data points, cluster centers, or features in a dataset. The distance between vectors is usually calculated with a measure such as Euclidean distance, Manhattan distance, or cosine similarity.

Overall, the choice of distance metric depends on the nature of the data and the problem at hand.

Vector distance, also called a distance metric or similarity measure, is a mathematical calculation used in machine learning to determine the similarity or dissimilarity between two data points represented as vectors in a multidimensional space.

Several distance metrics are commonly used in machine learning, including Euclidean distance, Manhattan distance, and cosine similarity. For example, in k-means clustering, vector distance measures how far a data point is from the centroid of a cluster.
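The three measures named above, and the k-means use case, can be sketched in plain Python as follows. The function names are hypothetical and the code uses the standard textbook definitions; it does not show how TD_VectorDistance computes them internally.

```python
import math

def euclidean(a, b):
    """Straight-line distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute differences along each dimension."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (undefined for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_centroid(point, centroids):
    """Index of the centroid closest to point, as in a k-means assignment step."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))

print(euclidean([3.0, 4.0], [0.0, 0.0]))          # 5.0
print(manhattan([3.0, 4.0], [0.0, 0.0]))          # 7.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
print(nearest_centroid([1.0, 1.0], [[0.0, 0.0], [5.0, 5.0]]))  # 0
```

Note that cosine similarity grows as vectors become more alike, whereas the two distances shrink, so the two kinds of measure rank neighbors in opposite directions.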

Euclidean distance is the most commonly used distance metric and calculates the straight-line distance between two data points in the multi-dimensional space.