KMeansTrain - Aster Analytics

Teradata Aster® Spark Connector User Guide

Product

Aster Analytics

Release Number

7.00.00.01

Published

May 2017

Language

English (United States)

Last Update

2018-04-13

Product Category

Software

The KMeansTrain class defines a wrapper function that uses the Aster Spark API to implement the training phase of the Spark MLlib K-means clustering algorithm. The function generates a model that is typically used by the KMeansRun function.

Run Method Signature

run(input: RDD[DataRow], sparkFunctParams: String): RDD[DataRow]

Parameters

sparkFunctParams: String representing the parameters specific to this function. The string has this syntax:

'--option_value_pair [,...]'

option_value_pair is one of the following:

initializationMode { "random" | "k-means||" }
    Cluster initialization algorithm. Default: "k-means||".
k clusters
    Number of clusters.
maxIterations max_iterations
    Maximum number of iterations.
modelLocation model_location
    Required. HDFS path to the location where the function saves the model.
runs runs
    Number of parallel runs. Default: 1.
seed seed
    Random seed value for cluster initialization.
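
As a rough illustration, the option string could be assembled from a list of settings before it is passed as sparkFunctParams. This is only a sketch: the object KMeansTrainParams and the helper buildParams are hypothetical and not part of the Aster Spark API, and the code assumes that each pair is written as --name value and that pairs are separated by commas, which is one reading of the syntax shown above. Verify the exact format against a working call before relying on it.

```scala
// Hypothetical helper, not part of the Aster Spark API.
object KMeansTrainParams {
  // Option names taken from the list above.
  val knownOptions: Set[String] =
    Set("initializationMode", "k", "maxIterations", "modelLocation", "runs", "seed")

  // Assumed format: "--name value" pairs joined by commas.
  def buildParams(options: Seq[(String, String)]): String = {
    // Reject names that are not in the documented option list.
    val unknown = options.map(_._1).filterNot(knownOptions)
    require(unknown.isEmpty, s"Unknown option(s): ${unknown.mkString(", ")}")
    // modelLocation is the only option documented as required.
    require(options.exists(_._1 == "modelLocation"), "modelLocation is required")
    options.map { case (name, value) => s"--$name $value" }.mkString(",")
  }
}
```

Under these assumptions, buildParams(Seq("k" -> "3", "modelLocation" -> "/tmp/kmeans_model")) yields --k 3,--modelLocation /tmp/kmeans_model, where /tmp/kmeans_model is a placeholder path. A call that omits modelLocation fails with an IllegalArgumentException.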

Returns

An RDD that contains the input data and, for each row, the predicted value (that is, the cluster number).

Side Effects

The function saves the trained model to the HDFS location specified by modelLocation.
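
The training, prediction, and save steps described above correspond roughly to the following Spark MLlib calls. This is a minimal sketch of the underlying MLlib usage, not the wrapper's actual source: the object KMeansTrainSketch and the function trainAndSave are hypothetical, the input is assumed to be already converted from DataRow to MLlib vectors, the default seed is arbitrary, and the runs option is left out to keep the example short.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Hypothetical illustration, not the KMeansTrain implementation.
object KMeansTrainSketch {
  // Trains a K-means model, saves it, and returns each input vector
  // paired with the number of the cluster it is assigned to.
  def trainAndSave(
      sc: SparkContext,
      points: RDD[Vector],
      k: Int,
      maxIterations: Int,
      modelLocation: String,
      initializationMode: String = KMeans.K_MEANS_PARALLEL, // "k-means||"
      seed: Long = 42L // arbitrary value chosen for this sketch
  ): RDD[(Vector, Int)] = {
    // K-means is iterative, so cache the input to avoid recomputing it.
    points.cache()

    val model: KMeansModel = new KMeans()
      .setK(k)
      .setMaxIterations(maxIterations)
      .setInitializationMode(initializationMode)
      .setSeed(seed)
      .run(points)

    // Side effect: persist the model so that a later step can load it.
    model.save(sc, modelLocation)

    // Return value: the input data plus the predicted cluster number.
    points.map(p => (p, model.predict(p)))
  }
}
```

A later prediction step could reload the saved model with KMeansModel.load(sc, modelLocation).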

Version

Spark 1.4 and later.