- InputTable
- Specifies the name of the table or view that contains the training documents.
- ModelTable
- Specifies the name for the model table that the function creates in the database. This table must not already exist.
- OutputTable
- [Optional] Specifies the name of the output table that the function creates in the database. This table contains the topic distribution of each document in the input table. The table must not already exist. If you omit this argument, the function does not generate this table.
- TopicNum
- Specifies the number of topics for all the documents in the input table, an INTEGER value in the range [2, 1000].
- Alpha
- [Optional] Specifies a hyperparameter of the model: the smoothing parameter of the prior on each document's topic distribution. As Alpha decreases, fewer topics are associated with each document. Default: 0.1.
- Eta
- [Optional] Specifies a hyperparameter of the model: the smoothing parameter of the prior on each topic's word distribution. As Eta decreases, fewer words are associated with each topic. Default: 0.1.
- DocIDColumn
- Specifies the name of the input column that contains the document identifiers.
- WordColumn
- Specifies the name of the input column that contains the words (one word in each row).
- CountColumn
- [Optional] Specifies the name of the input column that contains the count of the corresponding word in the row, a positive value. Default behavior: The count of each word is 1.
- MaxIterate
- [Optional] Specifies the maximum number of iterations to perform if the model does not converge, a positive INTEGER value. Default: 50.
- ConvergenceDelta
- [Optional] Specifies the convergence delta of log perplexity, that is, the change in log perplexity below which the model is considered to have converged. A NUMERIC value in the range [0.0, 1.0]. Default: 1e-4.
- Seed
- [Optional] Specifies the seed with which to initialize the model, a LONG value. Given the same seed, cluster configuration, and input table, the function generates the same model. Default behavior: The function initializes the model randomly.
- OutputTopicNum
- [Optional] Ignored unless OutputTable is specified. Specifies how many of the top-weighted topics, along with their weights, to include in the output table for each training document:
  - 'all' (Default): All topics and their weights.
  - num_topics: A positive integer.
- OutputTopicWordNum
- [Optional] Ignored unless OutputTable is specified. Specifies how many of the top topic words, along with their topic identifiers, to include in the output table for each training document:
  - 'none' (Default): No topic words or identifiers.
  - 'all': All topic words and their identifiers.
  - num_topic_words: A positive integer.
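
The ranges and defaults documented in the argument list can be collected into a small pre-flight check that runs before the function is called. The following is a minimal Python sketch, not part of the function: `validate_lda_args`, its helpers, and the dictionary layout are hypothetical names introduced for illustration. It checks only the constraints stated in the list. For example, it does not check Alpha or Eta, because the list gives no valid range for them.

```python
# Hypothetical helper: checks a dict of argument values against the ranges
# stated in the argument list. It does not connect to or query the database.

REQUIRED = ("InputTable", "ModelTable", "TopicNum", "DocIDColumn", "WordColumn")


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def _check_top_n(name, value, keywords, errors):
    # Accept one of the documented keywords or a positive integer.
    if value in keywords:
        return
    if not (_is_int(value) and value > 0):
        errors.append(f"{name} must be a positive integer or one of {list(keywords)}")


def validate_lda_args(args):
    """Return a list of error messages; an empty list means no documented
    constraint is violated."""
    errors = []
    for name in REQUIRED:
        if name not in args:
            errors.append(f"{name} is required")

    topic_num = args.get("TopicNum")
    if topic_num is not None and not (_is_int(topic_num) and 2 <= topic_num <= 1000):
        errors.append("TopicNum must be an INTEGER in [2, 1000]")

    max_iterate = args.get("MaxIterate", 50)  # documented default
    if not (_is_int(max_iterate) and max_iterate > 0):
        errors.append("MaxIterate must be a positive INTEGER")

    delta = args.get("ConvergenceDelta", 1e-4)  # documented default
    is_number = isinstance(delta, (int, float)) and not isinstance(delta, bool)
    if not (is_number and 0.0 <= delta <= 1.0):
        errors.append("ConvergenceDelta must be NUMERIC in [0.0, 1.0]")

    # Both arguments are ignored unless OutputTable is specified,
    # so they are validated only in that case.
    if "OutputTable" in args:
        if "OutputTopicNum" in args:
            _check_top_n("OutputTopicNum", args["OutputTopicNum"], ("all",), errors)
        if "OutputTopicWordNum" in args:
            _check_top_n("OutputTopicWordNum", args["OutputTopicWordNum"],
                         ("none", "all"), errors)
    return errors
```

A call such as `validate_lda_args({"InputTable": "docs", "ModelTable": "lda_model", "TopicNum": 1, "DocIDColumn": "docid", "WordColumn": "word"})` (table and column names are placeholders) returns a single message about TopicNum, since 1 lies outside [2, 1000].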
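Both the Seed value and the cluster configuration affect function results: the function reproduces the same model only when the seed, the cluster configuration, and the input table are all unchanged.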
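
To illustrate the role of a seed, and the Alpha description's statement that a smaller value associates fewer topics with each document, the sketch below draws document-topic distributions from a symmetric Dirichlet prior using only the Python standard library. It illustrates the prior alone, under the assumption that Alpha acts as a symmetric Dirichlet concentration parameter, as is usual for LDA. It is not the function's training algorithm, and `sample_topic_distribution` and `mean_top_weight` are hypothetical names.

```python
import random


def sample_topic_distribution(alpha, topic_num, rng):
    """Draw one topic distribution from a symmetric Dirichlet(alpha) prior."""
    # Independent Gamma(alpha, 1) draws, normalized to sum to 1,
    # follow a Dirichlet distribution.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(topic_num)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_top_weight(alpha, topic_num=10, docs=500, seed=42):
    """Average weight of the single heaviest topic across simulated documents."""
    rng = random.Random(seed)  # same seed -> same sequence of draws
    top_weights = (
        max(sample_topic_distribution(alpha, topic_num, rng)) for _ in range(docs)
    )
    return sum(top_weights) / docs


if __name__ == "__main__":
    # Small alpha: most of each document's weight falls on a few topics.
    print("alpha=0.1 :", mean_top_weight(0.1))
    # Large alpha: weight is spread more evenly across topics.
    print("alpha=10.0:", mean_top_weight(10.0))
    # Reusing the seed reproduces the result exactly.
    print("reproducible:", mean_top_weight(0.1, seed=7) == mean_top_weight(0.1, seed=7))
```

In this sketch the seed fixes only a single-process random number stream. In the distributed function, the cluster configuration also determines how the data and random draws are partitioned, which is why it must be held fixed as well.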