LDATrainer Arguments - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product

Aster Analytics

Release Number

7.00.02

Published

September 2017

Language

English (United States)

Last Update

2018-04-17

dita:id

B700-1022

Product Category

Software

InputTable

Specifies the name of the table or view that contains the training documents.

ModelTable

Specifies the name for the model table that the function creates in the database. This table must not already exist.

OutputTable

[Optional] Specifies the name of the output table, which the function creates in the database and which contains the topic distribution of each document in the input table. This table must not already exist. If you omit this argument, the function does not generate this table.

TopicNum

Specifies the number of topics for all the documents in the input table, an INTEGER value in the range [2, 1000].

Alpha

[Optional] Specifies a hyperparameter of the model, the prior smoothing parameter for the distribution of topics in each document. As alpha decreases, fewer topics are associated with each document. Default: 0.1.

Eta

[Optional] Specifies a hyperparameter of the model, the prior smoothing parameter for the distribution of words in each topic. As eta decreases, fewer words are associated with each topic. Default: 0.1.
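The effect of Alpha and Eta on sparsity can be illustrated outside the database. The following Python sketch is not LDATrainer code; it assumes the usual LDA formulation in which these hyperparameters are symmetric Dirichlet concentrations, and the helper names, the threshold, and the value 10.0 are hypothetical choices for illustration. It counts how many of 10 topics receive a noticeable weight per simulated document.

```python
import random

def sample_dirichlet(concentration, size, rng):
    """Draw one point from a symmetric Dirichlet via normalized Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    if total == 0.0:
        # Extremely small draws can underflow; fall back to a uniform vector.
        return [1.0 / size] * size
    return [d / total for d in draws]

def mean_active_topics(alpha, topic_num=10, docs=500, threshold=0.05, seed=7):
    """Average number of topics whose weight exceeds `threshold` per document."""
    rng = random.Random(seed)
    active = 0
    for _ in range(docs):
        weights = sample_dirichlet(alpha, topic_num, rng)
        active += sum(1 for w in weights if w > threshold)
    return active / docs

sparse = mean_active_topics(alpha=0.1)   # the documented default
dense = mean_active_topics(alpha=10.0)   # a much larger, illustrative value
print(f"alpha=0.1  -> {sparse:.2f} topics above threshold")
print(f"alpha=10.0 -> {dense:.2f} topics above threshold")
```

Under these assumptions the smaller concentration yields fewer active topics per document. The same reasoning applies to Eta and the number of words per topic.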

DocIDColumn

Specifies the name of the input column that contains the document identifiers.

WordColumn

Specifies the name of the input column that contains the words (one word in each row).

CountColumn

[Optional] Specifies the name of the input column that contains the count of the corresponding word in the row, a positive value. Default behavior: The count of each word is 1.
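The input layout implied by DocIDColumn, WordColumn, and CountColumn is one row per word in each document. As a rough sketch of how such rows could be prepared before loading, the Python below turns raw text into (document identifier, word, count) tuples. The function name, the sample documents, and the whitespace tokenizer are assumptions for illustration, not part of the product.

```python
from collections import Counter

def to_training_rows(documents):
    """Turn {doc_id: text} into (doc_id, word, count) rows, one word per row."""
    rows = []
    for doc_id, text in documents.items():
        counts = Counter(text.lower().split())  # naive whitespace tokenizer
        for word, count in sorted(counts.items()):
            rows.append((doc_id, word, count))
    return rows

docs = {1: "late fee late notice", 2: "card fee"}
rows = to_training_rows(docs)
for row in rows:
    print(row)
```

If the count is left out of each row, the documented default treats every row as a count of 1.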

MaxIterate

[Optional] Specifies the maximum number of iterations to perform if the model does not converge, a positive INTEGER value. Default: 50.

ConvergenceDelta

[Optional] Specifies the convergence delta of log perplexity, a NUMERIC value in the range [0.0, 1.0]. Default: 1e-4.
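The interaction between MaxIterate and ConvergenceDelta can be sketched as a stopping rule. The Python below is an illustration only: it assumes that convergence means the absolute change in log perplexity between consecutive iterations falls below the delta, which this guide does not spell out, and `log_perplexity_at` is a hypothetical callback standing in for a real training step.

```python
def train_until_converged(log_perplexity_at, max_iterate=50, convergence_delta=1e-4):
    """Run iterations until log perplexity changes by less than the delta.

    `log_perplexity_at` is a hypothetical callback returning the log
    perplexity after a given iteration; the absolute-change test below is
    an assumed stopping rule, not the documented one.
    """
    previous = None
    for iteration in range(1, max_iterate + 1):
        current = log_perplexity_at(iteration)
        if previous is not None and abs(previous - current) < convergence_delta:
            return iteration, True   # converged before the cap
        previous = current
    return max_iterate, False        # cap reached without converging

# Toy curve that flattens out geometrically: 5 + 0.5**i.
iterations, converged = train_until_converged(lambda i: 5.0 + 0.5 ** i)
print(iterations, converged)
```

A larger delta stops training earlier at the cost of a less settled model, while MaxIterate bounds the run time when the curve never flattens.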

Seed

[Optional] Specifies the seed with which to initialize the model, a LONG value. Given the same seed, cluster configuration, and input table, the function generates the same model. Default behavior: The function initializes the model randomly.
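The reproducibility that Seed provides can be pictured with a generic seeded initializer. This Python sketch does not show how LDATrainer initializes its model, which this guide does not describe; the function and its parameters are hypothetical, and it only demonstrates that a fixed seed gives identical starting assignments.

```python
import random

def initial_topic_assignments(word_count, topic_num, seed=None):
    """Assign each word occurrence a random starting topic in [0, topic_num)."""
    rng = random.Random(seed)  # seed=None falls back to nondeterministic seeding
    return [rng.randrange(topic_num) for _ in range(word_count)]

first = initial_topic_assignments(word_count=8, topic_num=5, seed=2)
second = initial_topic_assignments(word_count=8, topic_num=5, seed=2)
print(first == second)
```

In the actual function, the cluster configuration and the input table must also match for the model to be reproduced.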

OutputTopicNum

[Optional] Ignored unless OutputTable is specified. Specifies the number of top-weighted topics and their weights to include in the output table for each training document.

'all' (Default): All topics and their weights.
num_topics: Positive integer.
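The two accepted forms of OutputTopicNum can be mimicked with a small selection routine. The Python below is a sketch under assumptions: the function name and the sample weights are hypothetical, and ordering by descending weight (ties broken by topic identifier) is an illustrative choice, not documented output-table behavior.

```python
def top_topics(topic_weights, output_topic_num="all"):
    """Return (topic_id, weight) pairs, heaviest first, limited as requested."""
    ranked = sorted(topic_weights.items(), key=lambda item: (-item[1], item[0]))
    if output_topic_num == "all":
        return ranked
    if not isinstance(output_topic_num, int) or output_topic_num < 1:
        raise ValueError("output_topic_num must be 'all' or a positive integer")
    return ranked[:output_topic_num]

weights = {0: 0.05, 1: 0.60, 2: 0.25, 3: 0.10}
print(top_topics(weights, 2))
print(top_topics(weights))
```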

OutputTopicWordNum

[Optional] Ignored unless OutputTable is specified. Specifies the number of top topic words and their topic identifiers to include in the output table for each training document.

'none' (Default): No topic words or identifiers.
'all': All topic words and their identifiers.
num_topic_words: Positive integer.

Seed settings and cluster configurations affect function results.