The TD_KMeansPredict function uses the cluster centroids in the TD_KMeans function output to assign each input data point to its nearest cluster centroid.
Input Table
This example uses the following input table:
id C1 C2
-- -- --
1  1  1
2  2  2
3  8  8
4  9  9
KMeans_Model (generated using TD_KMeans)
The following TD_KMeans call, which is supplied with an initial centroids table, generates the model:
SELECT * FROM TD_KMeans (
ON kmeans_input_table AS InputTable
ON kmeans_initial_centroids_table AS InitialCentroidsTable DIMENSION
USING
IdColumn('id')
TargetColumns('c1','c2')
StopThreshold(0.0395)
MaxIterNum(3)
) AS dt;
Result:
td_clusterid_kmeans C1 C2 td_size_kmeans td_withinss_kmeans id td_modelinfo_kmeans
------------------- -- -- -------------- ------------------ -- -------------------
0 1.5 1.5 2 1 NULL NULL
1 8.5 8.5 2 1 NULL NULL
NULL NULL NULL NULL NULL NULL Converged : True
NULL NULL NULL NULL NULL NULL Number of Iterations : 2
NULL NULL NULL NULL NULL NULL Number of Clusters : 2
NULL NULL NULL NULL NULL NULL Total_WithinSS : 2.00000000000000E+00
NULL NULL NULL NULL NULL NULL Between_SS : 9.80000000000000E+01
NULL NULL NULL NULL NULL NULL Method for InitialCentroids : Externally supplied InitialCentroidsTable
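The figures in the Result table can be checked by hand. The following Python sketch is not part of TD_KMeans; it assumes the usual definitions (td_withinss_kmeans as the sum of squared Euclidean distances from a cluster's points to its centroid, and Between_SS as the total sum of squares minus Total_WithinSS), which are consistent with the values shown. The cluster membership is inferred from the centroids and sizes in the table, and the names `points`, `clusters`, and `model_stats` are hypothetical.

```python
# Rows of the input table: id -> (C1, C2).
points = {1: (1.0, 1.0), 2: (2.0, 2.0), 3: (8.0, 8.0), 4: (9.0, 9.0)}

# Assumed membership, inferred from the centroids and td_size_kmeans values:
# ids 1 and 2 form cluster 0, ids 3 and 4 form cluster 1.
clusters = {0: [1, 2], 1: [3, 4]}


def mean(rows):
    """Component-wise mean of a list of 2-D points."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def model_stats(points, clusters):
    """Return per-cluster statistics, total within-SS, and between-SS."""
    stats = {}
    for cid, ids in clusters.items():
        rows = [points[i] for i in ids]
        centroid = mean(rows)
        stats[cid] = {
            "centroid": centroid,
            "size": len(rows),
            "withinss": sum(sq_dist(r, centroid) for r in rows),
        }
    total_within = sum(s["withinss"] for s in stats.values())
    grand_mean = mean(list(points.values()))
    total_ss = sum(sq_dist(p, grand_mean) for p in points.values())
    return stats, total_within, total_ss - total_within


if __name__ == "__main__":
    stats, total_within, between = model_stats(points, clusters)
    for cid, s in stats.items():
        print(cid, s["centroid"], s["size"], s["withinss"])
    print("Total_WithinSS:", total_within)
    print("Between_SS:", between)
```

Under these assumptions the sketch gives centroids (1.5, 1.5) and (8.5, 8.5), a within-cluster sum of squares of 1 for each cluster, a Total_WithinSS of 2, and a Between_SS of 98, matching the Result table.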
TD_KMeansPredict Call
SELECT * FROM TD_KMeansPredict (
ON kmeans_input_table AS InputTable
ON kmeans_model AS ModelTable DIMENSION
USING
OutputDistance('true')
Accumulate('c1','c2')
) AS dt ORDER BY 1, 2, 3;
TD_KMeansPredict Output
id td_clusterid_kmeans td_distance_kmeans C1 C2
-- ------------------- ------------------ -- --
1 0 0.707 1 1
2 0 0.707 2 2
3 1 0.707 8 8
4 1 0.707 9 9
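The assignment shown in this output can be reproduced outside the database. The following Python sketch is an illustration only, not the TD_KMeansPredict implementation: it assumes Euclidean distance (which agrees with the 0.707 values above) and uses the centroids from the model output. The names `centroids`, `rows`, and `predict` are hypothetical.

```python
import math

# Centroids from the TD_KMeans model output: cluster id -> (C1, C2).
centroids = {0: (1.5, 1.5), 1: (8.5, 8.5)}

# Rows of the input table as (id, C1, C2).
rows = [(1, 1.0, 1.0), (2, 2.0, 2.0), (3, 8.0, 8.0), (4, 9.0, 9.0)]


def predict(rows, centroids):
    """Assign each row to its nearest centroid (Euclidean distance assumed).

    Returns tuples of (id, cluster id, distance rounded to 3 places, C1, C2).
    """
    out = []
    for row_id, c1, c2 in rows:
        # Distance from this point to every centroid; keep the smallest.
        cid, dist = min(
            ((cid, math.dist((c1, c2), c)) for cid, c in centroids.items()),
            key=lambda pair: pair[1],
        )
        out.append((row_id, cid, round(dist, 3), c1, c2))
    return out


if __name__ == "__main__":
    for record in predict(rows, centroids):
        print(record)
```

Each point lies sqrt(0.5), or about 0.707, from its assigned centroid, so ids 1 and 2 fall in cluster 0 and ids 3 and 4 fall in cluster 1.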
If you set OutputDistance to 'false' and rerun the query, the output omits the td_distance_kmeans column:
id td_clusterid_kmeans C1 C2
-- ------------------- -- --
1 0 1 1
2 0 2 2
3 1 8 8
4 1 9 9