Arguments - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product
Aster Analytics
Release Number
6.21
Published
November 2016
Language
English (United States)
Last Update
2018-04-14
Product Category
Software
Argument Category Description
SampleIdColumn Required Name of the column in the InputTable that contains the identifier of the training samples.
AttributeColumns Required Specifies all the attribute columns. Attribute columns must have a numeric value.
KernelFunction Optional Specifies the kernel function that the DenseSVMTrainer function uses to compute the hash function:
  • 'linear' (default)

    DenseSVMTrainer uses a Pegasos algorithm to solve the linear SVM.

  • 'polynomial'

    DenseSVMTrainer uses a Hash-SVM algorithm.

    The formula for the polynomial kernel is: $\gamma (u^T v + c)^d$

  • 'rbf'

    DenseSVMTrainer uses a Hash-SVM algorithm.

    The formula for RBF is: $\exp(-\gamma \lVert x - x' \rVert^2)$

  • 'sigmoid'

    DenseSVMTrainer uses a Hash-SVM algorithm.

    The formula for sigmoid is: $\tanh(\gamma u^T v + c)$

When DenseSVMTrainer uses a Hash-SVM algorithm, each sample is represented by compact hash bits, over which an inner product is defined to serve as the surrogate of the original nonlinear kernels.
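The three nonlinear kernel formulas listed above can be sketched in plain Python. This is only an illustration of the formulas as this guide writes them, evaluated exactly; it is not the Hash-SVM approximation that DenseSVMTrainer applies, and the function names (`dot`, `polynomial_kernel`, `rbf_kernel`, `sigmoid_kernel`) are hypothetical. The defaults mirror the documented defaults for Gamma, Constant, and Degree.

```python
import math


def dot(u, v):
    """Inner product u^T v of two equal-length numeric sequences."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, gamma=1.0, c=1.0, d=2):
    """gamma * (u^T v + c)^d, following the formula as given in this guide."""
    return gamma * (dot(u, v) + c) ** d


def rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * ||x - y||^2), using the squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=1.0, c=1.0):
    """tanh(gamma * u^T v + c)."""
    return math.tanh(gamma * dot(u, v) + c)
```

For example, `polynomial_kernel([1, 2], [3, 4])` computes 1.0 * (11 + 1)^2 with the default Gamma, Constant, and Degree.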

Gamma Optional Use only when KernelFunction is polynomial, RBF, or sigmoid. A positive double that specifies γ. The minimum value is 0.0. The default value is 1.0.
Constant Optional Use only when KernelFunction is polynomial or sigmoid. A double value that specifies c. If KernelFunction is polynomial, the minimum value is 0.0. The default value is 1.0.
Degree Optional Use only when KernelFunction is polynomial. A positive integer that specifies the degree (d) of the polynomial kernel. The input value must be greater than 0. The default value is 2.
SubspaceDimension Optional Use only when KernelFunction is polynomial, RBF, or sigmoid. A positive integer that specifies the random subspace dimension of the basis matrix V obtained by the Gram-Schmidt process. Because the Gram-Schmidt process cannot be parallelized, this dimension should not be too large. Higher values increase accuracy but also increase computation cost. The input value must be in the range [1, 2048]. The default value is 256.
HashBits Optional Use only when KernelFunction is polynomial, RBF, or sigmoid. A positive integer that specifies the number of compact hash bits used to represent a data point. Higher values increase accuracy but also increase computation cost. The input value must be in the range [8, 8192]. The default value is 256.
InputTable Required Name of the table containing the training samples. Each row consists of a sample_id, a set of attribute values, and a corresponding label.
ModelTable Required Name for the model table that the function creates.
LabelColumn Required Column that identifies the class of the corresponding sample. Must be an integer or a string.
Cost Optional The regularization parameter λ in the SVM soft-margin loss function. Must be greater than 0.0. The default value is 1.0.

Bias Optional A nonnegative value. If the value is greater than zero, each sample x in the training set is converted to (x, b); that is, the function adds another dimension containing the bias value b. This argument addresses situations where not all samples are centered at 0. The default value is 0.0.
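The effect of the Bias argument can be pictured with a short Python sketch. It assumes a sample is a plain list of attribute values and that the extra dimension is appended at the end; the helper name `add_bias` is hypothetical and the function's internal layout is not documented here.

```python
def add_bias(sample, bias=0.0):
    """Return the sample with one extra dimension holding the bias value b.

    A bias of 0.0 (the documented default) leaves the sample unchanged.
    """
    if bias < 0:
        raise ValueError("Bias must be nonnegative")
    if bias > 0:
        return list(sample) + [bias]
    return list(sample)
```

With `bias=1.0`, the sample `[2.5, -0.3]` becomes `[2.5, -0.3, 1.0]`.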
ClassWeights Optional Specifies the weights for different classes. The format is: classlabel_m:weight_m, classlabel_n:weight_n. If a weight is given for a class, the cost parameter for that class is weight * cost. A weight larger than 1 often increases the accuracy of the corresponding class; however, it may decrease global accuracy. Classes not assigned a weight have the default weight 1.0.
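As a rough illustration of how a ClassWeights specification maps to per-class cost, the following Python sketch parses a `label:weight` list and applies the weight * cost rule described above. The helper names are hypothetical, and the parsing details (whitespace handling, splitting on the last colon) are assumptions rather than documented behavior.

```python
def parse_class_weights(spec):
    """Parse 'labelA:2.0, labelB:0.5' into {'labelA': 2.0, 'labelB': 0.5}."""
    weights = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last colon so the weight is always the final field.
        label, sep, weight = item.rpartition(":")
        if not sep:
            raise ValueError("expected 'classlabel:weight', got %r" % item)
        weights[label.strip()] = float(weight)
    return weights


def effective_cost(label, cost, weights):
    """Cost for one class: weight * cost, with a default weight of 1.0."""
    return weights.get(label, 1.0) * cost
```

A class that does not appear in the specification keeps the unweighted cost.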
MaxStep Optional A positive integer value that specifies the maximum number of iterations of the training process. One step means that each sample is seen once by the trainer. The input value must be in the range (0, 10000]. The default value is 100.
Epsilon Optional Termination criterion. When the difference between the values of the loss function in two sequential iterations is less than this number, the function stops. Must be greater than 0.0. The default value is 0.01.
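The interaction between MaxStep and Epsilon can be sketched as a generic training loop in Python. This is a minimal sketch of the stopping rule only, under the assumption that the loss is compared by absolute difference; `step_loss` is a hypothetical callback standing in for one pass over the training set, not part of DenseSVMTrainer.

```python
def run_until_converged(step_loss, max_step=100, epsilon=0.01):
    """Run passes until the loss change drops below epsilon or max_step is reached.

    step_loss(step) performs one pass over the samples and returns the loss.
    Returns (number_of_steps_run, last_loss).
    """
    previous = None
    for step in range(1, max_step + 1):
        loss = step_loss(step)
        # Stop when two sequential iterations differ by less than epsilon.
        if previous is not None and abs(previous - loss) < epsilon:
            return step, loss
        previous = loss
    return max_step, previous
```

With a loss that decays as 1/step and the default Epsilon of 0.01, the loop stops once two consecutive losses differ by less than 0.01; a smaller MaxStep cuts it off earlier.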
Seed Optional A long integer value used to order the training set randomly and consistently. This value can be used to ensure that the same model is generated if the function is run multiple times in a given database with the same arguments. The input value must be in the range [0, 9223372036854775807]. The default value is 0.
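The idea of a seed producing a random but repeatable ordering can be illustrated with Python's standard `random` module. This only demonstrates the concept; the random number generator that the function uses internally is not described in this guide, so the orderings below will not match it. The helper name `shuffled_order` is hypothetical.

```python
import random


def shuffled_order(sample_ids, seed=0):
    """Return the sample ids in a pseudo-random order determined by the seed."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    order = list(sample_ids)
    rng.shuffle(order)
    return order
```

Calling `shuffled_order` twice with the same ids and the same seed returns the same ordering, which is what makes repeated runs reproducible.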
OverwriteOutput Optional If true, the function overwrites the output table specified in the ModelTable argument. The default value is 'false'.
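The numeric ranges listed for these arguments can be collected into a small pre-flight check. The following Python sketch is a hypothetical client-side helper, not part of Aster Analytics; it assumes the arguments are held in a dictionary keyed by argument name and uses the documented defaults when a key is absent.

```python
MAX_SEED = 9223372036854775807  # upper bound documented for Seed


def check_args(args):
    """Return a list of range violations; an empty list means none were found."""
    problems = []
    kernel = args.get("KernelFunction", "linear")
    if kernel not in ("linear", "polynomial", "rbf", "sigmoid"):
        problems.append("KernelFunction: unknown kernel %r" % kernel)
    if not 1 <= args.get("SubspaceDimension", 256) <= 2048:
        problems.append("SubspaceDimension must be in [1, 2048]")
    if not 8 <= args.get("HashBits", 256) <= 8192:
        problems.append("HashBits must be in [8, 8192]")
    if args.get("Degree", 2) <= 0:
        problems.append("Degree must be greater than 0")
    if args.get("Cost", 1.0) <= 0.0:
        problems.append("Cost must be greater than 0.0")
    if args.get("Bias", 0.0) < 0.0:
        problems.append("Bias must be nonnegative")
    if not 0 < args.get("MaxStep", 100) <= 10000:
        problems.append("MaxStep must be in (0, 10000]")
    if args.get("Epsilon", 0.01) <= 0.0:
        problems.append("Epsilon must be greater than 0.0")
    if not 0 <= args.get("Seed", 0) <= MAX_SEED:
        problems.append("Seed must be in [0, 9223372036854775807]")
    return problems
```

An empty dictionary passes because every default lies inside its documented range.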