Using the K-Means Algorithm | Teradata Vantage - Example: How to Use the k-means Algorithm - Analytics Database

Database Analytic Functions

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database
Release Number: 17.20
Published: June 2022
Language: English (United States)
Last Update: 2024-04-06
Product Category: Teradata Vantage™

In the following example, you have a set of unlabeled or unclustered points.

TD_KMEANSPREDICT random points

The k-means algorithm groups the points into clusters. In the following image, the clustered points are shown as squares and triangles, and the cluster centers are shown as crosses:

Then, new unlabeled data points are added to the set of points. The following image shows the new data points as circles:

To predict the labels of the two new points, the k-means algorithm calculates the distance from each new point to each cluster center (centroid). It then assigns each new point to the cluster whose centroid is closest to that point.
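
The nearest-centroid assignment described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the TD_KMEANSPREDICT implementation: the function name nearest_centroid, the centroid coordinates, and the new points are hypothetical, and Euclidean distance is assumed as the distance measure.

```python
import math

def nearest_centroid(point, centroids):
    """Return (cluster_id, distance) for the centroid closest to point."""
    best_id, best_dist = None, math.inf
    for cluster_id, centroid in centroids.items():
        # Assumption: Euclidean distance between the point and the centroid.
        dist = math.dist(point, centroid)
        if dist < best_dist:
            best_id, best_dist = cluster_id, dist
    return best_id, best_dist

# Hypothetical centroids (the "crosses") and new points (the "circles"),
# chosen only for illustration.
centroids = {"square": (1.0, 1.0), "triangle": (5.0, 5.0)}
new_points = [(2.0, 1.5), (4.5, 6.0)]

# Each new point takes the label of its closest centroid.
assignments = [nearest_centroid(p, centroids)[0] for p in new_points]
print(assignments)  # ['square', 'triangle']
```

With these made-up values, the first point lies about 1.1 units from the square centroid and about 4.6 units from the triangle centroid, so it is labeled as a square. The second point is labeled as a triangle for the same reason.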

In the previous image, the new point on the left is closer to the square cluster and the other new point is closer to the triangle cluster. The k-means algorithm assigns each new point to its closest cluster. The following image shows these assignments by redrawing each new point with the shape of its assigned cluster: