Syntax | Fast K-Means Clustering | Vantage Analytics Library

Vantage Analytics Library User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, Lake, VMware
Product: Vantage Analytics Library
Release Number: 2.2.0
Published: March 2023
Language: English (United States)
Last Update: 2024-01-02
Product Category: Teradata Vantage
CALL td_analyze (
  'kmeans',
  'required_parameter_list [ optional_parameter; [...] ]'
);
required_parameter_list
database = input_database_name;
tablename = input_table_name;
columns = { column_name [,...] | keyword };
outputdatabase = output_database_name;
outputtablename = output_table_name;
kvalue = k_value;
optional_parameter
{ columnstoexclude = column_name [,...] |
  continuation = { true | false } |
  iterations = iterations |
  operatordatabase = operator_database_name |
  overwrite = { true | false } |
  threshold = threshold
}
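
As an illustration of the syntax above, a minimal call that supplies only the required parameters might look like the following sketch. The database, table, and column names (sales_db, customer_features, results_db, customer_clusters, age, income, tenure_months) are hypothetical placeholders, not objects from this guide, and the parameter string is kept on one line so that no line breaks end up inside the quoted literal.

```sql
-- Sketch only: all object and column names below are hypothetical placeholders.
-- Clusters three columns of sales_db.customer_features into 4 clusters and
-- writes the cluster model to results_db.customer_clusters.
CALL td_analyze (
  'kmeans',
  'database=sales_db;tablename=customer_features;columns=age,income,tenure_months;outputdatabase=results_db;outputtablename=customer_clusters;kvalue=4;'
);
```

Because no optional parameters are given, this call would run with the documented defaults.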

Syntax Elements

database
The database containing the input table.
tablename
The name of the table containing the data to cluster.
columns
The columns to analyze. Specify a comma-separated list of column names or one of these keywords:
Keyword       Description
all           All columns.
allnumeric    All numeric columns.
outputdatabase
The database in which to create the output table that represents the cluster model.
outputtablename
The name of the output table representing the cluster model.
kvalue
The number of clusters in the cluster model.
columnstoexclude
[Optional] The columns to exclude from the analysis when columns specifies a keyword (all or allnumeric).
continuation
[Optional] Whether clustering begins with values determined by pre-existing result tables rather than random values.
Default: false
iterations
[Optional] The maximum number of iterations to perform during modeling.
Default: 50
operatordatabase
[Optional] The database where the table operators that td_analyze calls reside.
Default behavior: The function searches the standard search path for table operators.
overwrite
[Optional] Whether to drop the output tables before creating new ones.
Default: true
threshold
[Optional] The decimal value that determines if the algorithm has converged, based on how much the cluster centroids change from one iteration to the next.
Default: 0.001
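
To show how the optional parameters combine with a columns keyword, the following sketch selects every numeric column except an identifier column, raises the iteration limit, and tightens the convergence threshold. All object and column names (sales_db, customer_features, customer_id, results_db, customer_clusters) are hypothetical, and the specific values for iterations and threshold are chosen for illustration rather than recommended settings.

```sql
-- Sketch only: all object and column names below are hypothetical placeholders.
-- columns=allnumeric selects every numeric column; columnstoexclude drops the
-- identifier column from that set. iterations and threshold override the
-- documented defaults (50 and 0.001), and overwrite=true restates the default.
CALL td_analyze (
  'kmeans',
  'database=sales_db;tablename=customer_features;columns=allnumeric;columnstoexclude=customer_id;outputdatabase=results_db;outputtablename=customer_clusters;kvalue=4;iterations=100;threshold=0.0001;overwrite=true;'
);
```

Excluding identifier-like columns is a design choice worth considering whenever a keyword is used, since such columns are numeric but carry no meaning as distances between rows.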