Output - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product
Aster Analytics
Release Number
6.21
Published
November 2016
Language
English (United States)
Last Update
2018-04-14
dita:id
B700-1021
Product Category
Software

The output displayed on the screen is a summary table containing statistics about the KModes run. The table has four columns if a single model is trained and five if multiple models are trained simultaneously. The three right-most columns (between_cluster_error, total_within_cluster_error, and pseudo_f) are reported separately from the summary column so that you can sort by them and quickly find the best model.
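As a rough illustration of sorting by the right-most columns, the sketch below ranks summary rows after they have been fetched into Python. The row values are invented for illustration, the helper name `best_model` is hypothetical, and the ranking assumes that a larger pseudo_f indicates better-separated clusters, which is the usual reading of this kind of statistic but is not stated in this guide.

```python
# Hypothetical summary rows for three models trained in one run.
# All numbers are made up for illustration only.
summary_rows = [
    {"model_id": "0", "between_cluster_error": 120.0,
     "total_within_cluster_error": 80.0, "pseudo_f": 35.25},
    {"model_id": "1", "between_cluster_error": 150.0,
     "total_within_cluster_error": 50.0, "pseudo_f": 70.5},
    {"model_id": "2", "between_cluster_error": 90.0,
     "total_within_cluster_error": 110.0, "pseudo_f": 19.2},
]


def best_model(rows):
    """Return the row with the largest pseudo_f.

    Assumption: a larger pseudo_f means a better model.
    """
    return max(rows, key=lambda row: row["pseudo_f"])


# Full ranking, best first, using the same assumption.
ranked = sorted(summary_rows, key=lambda row: row["pseudo_f"], reverse=True)

print(best_model(summary_rows)["model_id"])
print([row["model_id"] for row in ranked])
```

Sorting by total_within_cluster_error in ascending order is an alternative ranking when compact clusters matter more than separation.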

KModes Output Summary Table
Column Name | Data Type | Description
model_id | VARCHAR | Appears only if multiple models are trained. Integer, starting with 0, that identifies the model.
summary | VARCHAR | Presents the following data about the model:
  • Number of Clusters (number of clusters found in the model)
  • Number of Iterations (number of iterations required)
  • Model Converged (whether or not the algorithm converged)
  • Number of Data Points (number of input rows used to build the model)
between_cluster_error | DOUBLE PRECISION | Sum of squared distances of the cluster centroids to the global mean, where each centroid's squared distance to the global mean is multiplied by the number of data points in its cluster.
total_within_cluster_error | DOUBLE PRECISION | Sum, over all clusters, of the within-cluster error (within_cluster_ss).
pseudo_f | DOUBLE PRECISION | The value given by this formula:

(between_cluster_error / (K - 1)) / (total_within_cluster_error / (N - K))

where N is the total number of data points, or the total weight if the points are weighted, and K is the number of clusters.
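The two error terms and the pseudo_f formula can be sketched in a few lines of Python. This is a minimal illustration, not the function's implementation: the function names are hypothetical, and `between_cluster_error` assumes purely numerical attributes with squared Euclidean distance and unweighted points, so it does not cover how KModes treats categorical attributes.

```python
def between_cluster_error(centroids, sizes):
    """Size-weighted sum of squared distances from each centroid to the global mean.

    Assumption: numerical attributes only, squared Euclidean distance.
    The global mean is taken as the size-weighted average of the centroids.
    """
    total = sum(sizes)
    dims = len(centroids[0])
    global_mean = [
        sum(c[d] * n for c, n in zip(centroids, sizes)) / total
        for d in range(dims)
    ]
    return sum(
        n * sum((c[d] - global_mean[d]) ** 2 for d in range(dims))
        for c, n in zip(centroids, sizes)
    )


def pseudo_f(between_error, total_within_error, n, k):
    """(between_error / (K - 1)) / (total_within_error / (N - K))."""
    if k <= 1 or n <= k:
        raise ValueError("pseudo_f requires K > 1 and N > K")
    if total_within_error == 0:
        raise ValueError("pseudo_f is undefined when the within-cluster error is 0")
    return (between_error / (k - 1)) / (total_within_error / (n - k))


# Two clusters of sizes 1 and 3; the global mean is [3.0, 0.0].
print(between_cluster_error([[0.0, 0.0], [4.0, 0.0]], [1, 3]))  # 12.0
# Illustrative values: B = 120, W = 80, N = 50, K = 3.
print(pseudo_f(120.0, 80.0, 50, 3))  # 35.25
```

The guards reflect the formula's denominators: K - 1 and N - K must both be positive, and a zero within-cluster error leaves the ratio undefined.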

KModes Output Model Table
Column Name | Data Type | Description
model_id | INTEGER | ID of the model.
cluster_id | INTEGER | ID assigned to the cluster.
numerical attributes | DOUBLE PRECISION | One column for each numerical dimension from the input table.
categorical attributes | VARCHAR | One column for each categorical dimension from the input table.
within_cluster_ss | DOUBLE PRECISION | Sum, over all points in the cluster, of the distance between the point and the cluster center, as calculated by the Distance metric.
cluster_weight | DOUBLE PRECISION | Total weight of the data points assigned to the cluster.
distance_metric | VARCHAR | Value of the Distance argument in the function call (copied to the output table so that you need not specify it again when calling KModesPredict).
category_weights | VARCHAR | Value of the CategoryWeights argument in the function call (copied to the output table so that you need not specify it again when calling KModesPredict).
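The model table holds one row per cluster, so per-model totals can be recovered by grouping on model_id. The sketch below is a hedged illustration with invented rows and a hypothetical helper, `summarize_models`. It sums within_cluster_ss to obtain the total within-cluster error, counts rows to obtain K, and sums cluster_weight as an estimate of N, on the assumption that the cluster weights add up to the total weight of the input.

```python
from collections import defaultdict


def summarize_models(model_rows):
    """Aggregate model-table rows (one per cluster) into per-model totals."""
    totals = defaultdict(
        lambda: {"total_within_cluster_error": 0.0, "n": 0.0, "k": 0}
    )
    for row in model_rows:
        agg = totals[row["model_id"]]
        agg["total_within_cluster_error"] += row["within_cluster_ss"]
        agg["n"] += row["cluster_weight"]  # assumed to sum to N
        agg["k"] += 1                      # one row per cluster
    return dict(totals)


# Hypothetical rows; attribute columns are omitted for brevity.
model_rows = [
    {"model_id": 0, "cluster_id": 0, "within_cluster_ss": 10.0, "cluster_weight": 20.0},
    {"model_id": 0, "cluster_id": 1, "within_cluster_ss": 5.5, "cluster_weight": 30.0},
    {"model_id": 1, "cluster_id": 0, "within_cluster_ss": 7.0, "cluster_weight": 50.0},
]

for model_id, agg in summarize_models(model_rows).items():
    print(model_id, agg)
```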