LDA Output - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 9.02, 9.01, 2.0, 1.3
Published: February 2022
Language: English (United States)
Last Update: 2022-02-10
dita:id: B700-4003
lifecycle: previous
Product Category: Teradata Vantage™

Output Message Schema

Column | Data Type | Description
--- | --- | ---
message | VARCHAR | Reports iteration steps and perplexity of model.

Perplexity formula:

$$\text{perplexity} = 2^{H(p)} = 2^{-\sum_{x} p(x) \log_2 p(x)}$$

where H(p) is the entropy of the distribution.
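To make the formula concrete, the following is a minimal Python sketch that evaluates 2^H(p) for a discrete distribution supplied as a list of probabilities. It only illustrates the arithmetic; it is not the ML Engine implementation, and the function name `perplexity` is an assumption introduced here.

```python
import math


def perplexity(probs):
    """Return 2 ** H(p) for a discrete distribution given as probabilities.

    Illustrative helper only; not part of any Teradata API.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, rel_tol=1e-9, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p(x) == 0 contribute nothing to the entropy, so skip them.
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy


uniform_four = perplexity([0.25, 0.25, 0.25, 0.25])
```

For a uniform distribution over four outcomes the entropy is 2 bits, so the perplexity equals the number of equally likely outcomes.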

Although perplexity varies with training documents, you can use perplexity to find the best model for a specified set of training documents: Create models for several subsets of the training documents and then choose the model with the lowest perplexity.
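The selection step described above can be sketched in Python as follows. The model names and perplexity values are made-up placeholders, not output from the LDA function, and `pick_lowest_perplexity` is a hypothetical helper; in practice you would read each perplexity from that model's output message.

```python
def pick_lowest_perplexity(perplexity_by_model):
    """Return the name of the model with the smallest perplexity."""
    if not perplexity_by_model:
        raise ValueError("no candidate models supplied")
    return min(perplexity_by_model, key=perplexity_by_model.get)


# Placeholder values for illustration only.
candidates = {"model_a": 912.4, "model_b": 655.1, "model_c": 701.8}
best = pick_lowest_perplexity(candidates)
```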

ModelTable Schema

Column | Data Type | Description
--- | --- | ---
topicid | INTEGER | Internally created topic identifier.
value_col | BLOB | Model in binary format, which is not human-readable. To see the binary contents, use the LDATopicSummary (ML Engine) function.

OutputTable Schema

This table appears only with the OutputTable syntax element.

Column | Data Type | Description
--- | --- | ---
docid | Same as doc_id_column in input table | Document identifier from input table.
topicid | INTEGER | Topic identifier from ModelTable.
topicweight | DOUBLE PRECISION | [Column appears number of times specified by OutputTopicNum syntax element.] Topic weight.
topicwords | VARCHAR | [Column appears number of times specified by OutputTopicWordNum syntax element.] Topic words in document, separated by commas.
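Because topicwords holds a comma-separated string, client code typically splits it before further processing. The following Python sketch assumes a row has already been fetched into a dictionary; the row values are invented for illustration, and whether spaces follow the commas is not specified here, so the helper strips whitespace either way.

```python
def split_topic_words(topicwords):
    """Split a comma-separated topicwords value into a list of words."""
    return [word.strip() for word in topicwords.split(",") if word.strip()]


# Invented example row; real values come from the OutputTable.
row = {
    "docid": 1,
    "topicid": 3,
    "topicweight": 0.42,
    "topicwords": "price, market,stock",
}
words = split_topic_words(row["topicwords"])
```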