The output displayed on the screen is a summary table containing statistics about the KModes run. The table has four columns if a single model is trained, five if multiple models are trained simultaneously. The three right-most columns are separated from the summary so that you can sort by them and quickly find the best model.
Column Name | Data Type | Description |
---|---|---|
model_id | VARCHAR | Only appears if multiple models are trained. Integer, starting with 0, identifying the model. |
summary | VARCHAR | Presents the following data about the model: |
between_cluster_error | DOUBLE PRECISION | Sum of squared distances of the centroids to the global mean, where each centroid's squared distance to the global mean is multiplied by the number of data points in its cluster. |
total_within_cluster_error | DOUBLE PRECISION | The sum over all clusters of the within cluster error (within_cluster_ss). |
pseudo_f | DOUBLE PRECISION | The value given by this formula: (between_cluster_error / (K - 1)) / (total_within_cluster_error / (N - K)) where N is the total number of data points, or the total weight if the points are weighted, and K is the number of clusters. |
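
The pseudo_f formula in the table above can be checked by hand. The following Python sketch is illustrative only: the helper name `pseudo_f` and the sample numbers are assumptions made for this example and are not part of the KModes function or its output.

```python
def pseudo_f(between_cluster_error, total_within_cluster_error, n, k):
    """Pseudo-F statistic, following the formula given for the pseudo_f column.

    n: total number of data points (or total weight, if points are weighted)
    k: number of clusters
    """
    if k < 2 or n <= k:
        raise ValueError("requires k >= 2 and n > k")
    if total_within_cluster_error == 0:
        raise ValueError("total_within_cluster_error must be nonzero")
    return (between_cluster_error / (k - 1)) / (total_within_cluster_error / (n - k))


# Hypothetical values: 3 clusters over 103 data points.
example = pseudo_f(
    between_cluster_error=400.0,
    total_within_cluster_error=200.0,
    n=103,
    k=3,
)
print(example)  # (400 / 2) / (200 / 100) = 100.0
```

A larger pseudo_f generally indicates clusters that are well separated relative to their internal spread, which is why sorting by this column is a convenient way to compare models.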
Column Name | Data Type | Description |
---|---|---|
model_id | INTEGER | ID of the model. |
cluster_id | INTEGER | ID assigned to the cluster. |
numerical attributes | DOUBLE PRECISION | One column for each numerical dimension from the input table. |
categorical attributes | VARCHAR | One column for each categorical dimension from the input table. |
within_cluster_ss | DOUBLE PRECISION | Sum, over all points in the cluster, of the distance between the point and the cluster center, as calculated by the Distance metric. |
cluster_weight | DOUBLE PRECISION | Total weight of the data points assigned to the cluster. |
distance_metric | VARCHAR | Value of the Distance argument in the function call (copied to the output table so that you need not specify it again when calling KModesPredict). |
category_weights | VARCHAR | Value of the CategoryWeights argument in the function call (copied to the output table so that you need not specify it again when calling KModesPredict). |
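
The total_within_cluster_error column is described as the sum of within_cluster_ss over all clusters of a model, so it can be recomputed from the per-cluster rows. The sketch below assumes the rows have already been fetched into Python as `(model_id, cluster_id, within_cluster_ss)` tuples; the sample values and the helper name `total_within_cluster_error` are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-cluster rows: (model_id, cluster_id, within_cluster_ss)
rows = [
    (0, 0, 12.5),
    (0, 1, 7.5),
    (1, 0, 4.0),
    (1, 1, 6.0),
    (1, 2, 5.0),
]


def total_within_cluster_error(rows):
    """Sum within_cluster_ss over the clusters of each model."""
    totals = defaultdict(float)
    for model_id, _cluster_id, within_cluster_ss in rows:
        totals[model_id] += within_cluster_ss
    return dict(totals)


totals = total_within_cluster_error(rows)
print(totals)  # {0: 20.0, 1: 15.0}
```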