KMeansTrain - Aster Analytics

Teradata Aster® Spark Connector User Guide

Product: Aster Analytics
Release Number: 7.00.00.01
Published: May 2017
Language: English (United States)
Last Update: 2018-04-13
Product Category: Software

The KMeansTrain class defines a wrapper function that uses the Aster Spark API to implement the training phase of the Spark MLlib K-means clustering algorithm. The function generates a model, which is typically used by the KMeansRun function.

Run Method Signature

run(input: RDD[DataRow], sparkFunctParams: String): RDD[DataRow]

Parameters

input
The input data, as an RDD of DataRow objects.

sparkFunctParams
String containing the parameters specific to this function. The string has this syntax:
'--option_value_pair [,...]'
option_value_pair is one of the following:
  • initializationMode { "random" | "k-means||" }

    Method used to initialize the cluster centers. Default: "k-means||"

  • k clusters

    Number of clusters.

  • maxIterations max_iterations

    Maximum number of iterations.

  • modelLocation model_location

    Required. Specifies the HDFS path to the location where the function is to save the model.

  • runs runs

    Number of parallel runs. Default: 1.

  • seed seed

    Random seed value for cluster initialization.
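
The options above, with their documented defaults and the one required value, can be modeled in a small value class before they are turned into the parameter string. The following is a minimal sketch, not part of the Aster Spark API: the class name KMeansTrainOptions, the method toPairs, and the check that k is positive are assumptions made for illustration. The sketch deliberately stops short of building the final string, because the separator between option/value pairs should be taken from the syntax line above rather than guessed.

```scala
// Hypothetical helper (not part of the Aster Spark API). It only models the
// documented KMeansTrain options and checks the documented constraints.
final case class KMeansTrainOptions(
    modelLocation: String,                    // required: HDFS path where the model is saved
    k: Option[Int] = None,                    // number of clusters
    maxIterations: Option[Int] = None,        // maximum number of iterations
    initializationMode: String = "k-means||", // documented default
    runs: Int = 1,                            // documented default
    seed: Option[Long] = None                 // random seed for cluster initialization
) {
  require(modelLocation.trim.nonEmpty, "modelLocation is required")
  require(
    KMeansTrainOptions.Modes.contains(initializationMode),
    "initializationMode must be one of: " + KMeansTrainOptions.Modes.mkString(", ")
  )
  // Assumption: a cluster count only makes sense when it is positive.
  require(k.forall(_ > 0), "k must be positive")

  /** Option names paired with their values; options without a value are left out. */
  def toPairs: Seq[(String, String)] =
    Seq("initializationMode" -> initializationMode) ++
      k.map(v => "k" -> v.toString) ++
      maxIterations.map(v => "maxIterations" -> v.toString) ++
      Seq("modelLocation" -> modelLocation, "runs" -> runs.toString) ++
      seed.map(v => "seed" -> v.toString)
}

object KMeansTrainOptions {
  // The two initialization modes listed in the option description.
  val Modes: Set[String] = Set("random", "k-means||")
}
```

For example, KMeansTrainOptions(modelLocation = "/tmp/kmeans_model", k = Some(3)) (the path is a placeholder) yields pairs for initializationMode, k, modelLocation, and runs, while an empty modelLocation is rejected at construction time.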

Returns

The input data and the predicted value (that is, the cluster number).

Side Effects

The function saves the model in model_location.
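
For orientation, the training and save steps that this wrapper exposes can be approximated directly with the Spark MLlib API. The following is a minimal sketch under stated assumptions, not the connector's implementation: it runs on a local SparkContext, uses a tiny in-memory data set instead of an RDD[DataRow], and the object name KMeansTrainSketch and the default output path are placeholders. It uses only MLlib calls available in Spark 1.4 and later (KMeans setters, run, predict, and KMeansModel.save).

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

// Illustrative only: plain MLlib, no Aster Spark API involved.
object KMeansTrainSketch {

  // Trains a K-means model; the arguments mirror the k, maxIterations,
  // initializationMode, and seed options described above.
  def train(points: RDD[Vector], k: Int, maxIterations: Int, seed: Long): KMeansModel =
    new KMeans()
      .setK(k)
      .setMaxIterations(maxIterations)
      .setInitializationMode("k-means||") // the documented default mode
      .setSeed(seed)
      .run(points)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kmeans-train-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Two well-separated groups of points (made-up sample data).
      val points = sc.parallelize(Seq(
        Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.2),
        Vectors.dense(9.0, 9.0), Vectors.dense(9.2, 8.9)
      )).cache()

      val model = train(points, k = 2, maxIterations = 20, seed = 42L)

      // Placeholder path; the save call fails if the location already exists.
      val modelLocation = args.headOption.getOrElse("/tmp/kmeans_model_sketch")
      model.save(sc, modelLocation)

      // Pair each input point with its predicted cluster number.
      points.collect().foreach(p => println(s"$p -> cluster ${model.predict(p)}"))
    } finally {
      sc.stop()
    }
  }
}
```

The runs option is left out of the sketch: MLlib has a corresponding setRuns setter in Spark 1.x, but later Spark releases deprecate it.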

Version

Spark 1.4 and later.