Examples: How to Use TD_KMeansPredict

Database Analytic Functions

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database
Release Number: 17.20
Published: June 2022
Language: English (United States)
Last Update: 2024-04-06
Product Category: Teradata Vantage™

The TD_KMeansPredict function uses the cluster centroids in the TD_KMeans function output to assign each input data point to its nearest cluster centroid.

Input Table

This example uses the following input table, kmeans_input_table:

 id C1 C2 
 -- -- -- 
  1  1  1
  2  2  2
  3  8  8
  4  9  9

KMeans_Model (generated using TD_KMeans)

The following TD_KMeans call, which is supplied with an initial centroids table, produces the model:

SELECT * FROM TD_KMeans (
  ON kmeans_input_table AS InputTable
  ON kmeans_initial_centroids_table AS InitialCentroidsTable DIMENSION
  USING
  IdColumn('id')
  TargetColumns('c1','c2')
  StopThreshold(0.0395)
  MaxIterNum(3)
) AS dt;

Result:

td_clusterid_kmeans  C1    C2    td_size_kmeans  td_withinss_kmeans  id    td_modelinfo_kmeans
-------------------  ----  ----  --------------  ------------------  ----  -------------------
0                    1.5   1.5   2               1                   NULL  NULL
1                    8.5   8.5   2               1                   NULL  NULL
NULL                 NULL  NULL  NULL            NULL                NULL  Converged : True
NULL                 NULL  NULL  NULL            NULL                NULL  Number of Iterations : 2
NULL                 NULL  NULL  NULL            NULL                NULL  Number of Clusters : 2
NULL                 NULL  NULL  NULL            NULL                NULL  Total_WithinSS : 2.00000000000000E+00
NULL                 NULL  NULL  NULL            NULL                NULL  Between_SS : 9.80000000000000E+01
NULL                 NULL  NULL  NULL            NULL                NULL  Method for InitialCentroids : Externally supplied InitialCentroidsTable
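
The centroid and sum-of-squares figures in this result can be checked by hand. The following Python sketch is illustrative only: it is not Teradata code, and the helper names (mean, sq_dist) are hypothetical. It assumes that td_withinss_kmeans is the sum of squared Euclidean distances from each member point to its centroid, and that Between_SS is the size-weighted sum of squared distances from each centroid to the overall mean. Both assumptions are consistent with the values shown above.

```python
# Illustrative check of the TD_KMeans result above; not Teradata code.
# Assumption: within-SS and between-SS use squared Euclidean distance.

points = {1: (1.0, 1.0), 2: (2.0, 2.0), 3: (8.0, 8.0), 4: (9.0, 9.0)}
clusters = {0: [1, 2], 1: [3, 4]}  # membership implied by the centroids


def mean(rows):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Centroid of each cluster is the mean of its member points.
centroids = {cid: mean([points[i] for i in ids]) for cid, ids in clusters.items()}

# Within-cluster sum of squares, one value per cluster.
within_ss = {
    cid: sum(sq_dist(points[i], centroids[cid]) for i in ids)
    for cid, ids in clusters.items()
}

# Between-cluster sum of squares, weighted by cluster size.
overall_mean = mean(list(points.values()))
between_ss = sum(
    len(ids) * sq_dist(centroids[cid], overall_mean)
    for cid, ids in clusters.items()
)

print(centroids)                # {0: (1.5, 1.5), 1: (8.5, 8.5)}
print(within_ss)                # {0: 1.0, 1: 1.0}
print(sum(within_ss.values()))  # 2.0
print(between_ss)               # 98.0
```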

TD_KMeansPredict Call

SELECT * FROM TD_KMeansPredict (
  ON kmeans_input_table AS InputTable
  ON kmeans_model AS ModelTable DIMENSION
  USING
  OutputDistance('true')
  Accumulate('c1','c2')
) AS dt ORDER BY 1, 2, 3;

TD_KMeansPredict Output

         id   td_clusterid_kmeans      td_distance_kmeans   C1       C2 
         --   -------------------      ------------------   --       -- 
         1     0                        0.707               1        1
         2     0                        0.707               2        2
         3     1                        0.707               8        8
         4     1                        0.707               9        9
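
The cluster assignments and distances above can be reproduced with a nearest-centroid lookup. The following Python sketch is illustrative only: it is not Teradata code, and the predict helper is a hypothetical name. It assumes the distance is Euclidean, which matches the 0.707 values in the output (the square root of 0.5).

```python
# Illustrative nearest-centroid assignment; not Teradata code.
# Assumption: distance is Euclidean, which matches the 0.707 values above.
import math

centroids = {0: (1.5, 1.5), 1: (8.5, 8.5)}           # from the model table
rows = [(1, 1, 1), (2, 2, 2), (3, 8, 8), (4, 9, 9)]  # id, C1, C2


def predict(point, centroids):
    """Return (cluster_id, distance) for the closest centroid."""
    return min(
        ((cid, math.dist(point, c)) for cid, c in centroids.items()),
        key=lambda pair: pair[1],
    )


predictions = []
for row_id, c1, c2 in rows:
    cluster_id, distance = predict((c1, c2), centroids)
    # Column order mirrors the output: id, cluster, distance, C1, C2.
    predictions.append((row_id, cluster_id, round(distance, 3), c1, c2))

for p in predictions:
    print(p)
```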

If you set OutputDistance to 'false' and rerun the query, the output omits the td_distance_kmeans column:

         id   td_clusterid_kmeans      C1       C2
         --   -------------------      --       --
         1     0                       1        1
         2     0                       2        2
         3     1                       8        8
         4     1                       9        9