LDA Input - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantage™

InputTable Schema

Column | Data Type | Description
doc_id_column | INTEGER, SMALLINT, BIGINT, NUMERIC, VARCHAR, VARBYTE(n), or BLOB | Document identifier.
word_column | INTEGER, SMALLINT, BIGINT, or VARCHAR | Word.
count_column | INTEGER, SMALLINT, BIGINT, NUMERIC, or DOUBLE PRECISION | [Column appears only with CountColumn syntax element.] Number of times word appears in document.
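
To illustrate the shape of this input, the following minimal Python sketch turns tokenized documents into (doc_id, word, count) rows that mirror the three columns of the schema. The function name to_lda_rows and the sample data are hypothetical and are not part of Vantage; loading the rows into a table is left out.

```python
from collections import Counter

def to_lda_rows(docs):
    """Turn {doc_id: [token, ...]} into sorted (doc_id, word, count) rows.

    Hypothetical helper for illustration only; not a Vantage API.
    """
    rows = []
    for doc_id, tokens in docs.items():
        # Counter gives the number of times each word appears in the document.
        for word, count in Counter(tokens).items():
            rows.append((doc_id, word, count))
    return sorted(rows)

# Made-up sample documents, already split into words.
docs = {1: ["apple", "pie", "apple"], 2: ["pie", "crust"]}
rows = to_lda_rows(docs)
for row in rows:
    print(row)
```

Each tuple corresponds to one row of the input table: the document identifier, the word, and how many times the word appears in that document.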

You can use the output of the TextParser function as input to the LDA function. Teradata recommends filtering out words with very low or very high frequency before running LDA, because they degrade the topic model, for example by producing topics that consist of common words that are not meaningful.
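
The frequency filtering recommended here can be sketched in Python as follows. This is one possible approach under stated assumptions, not Teradata's procedure: frequency is measured as the number of documents a word appears in, and the function name filter_by_doc_frequency, the parameters min_docs and max_doc_ratio, and the sample rows are all hypothetical.

```python
from collections import defaultdict

def filter_by_doc_frequency(rows, min_docs=2, max_doc_ratio=0.8):
    """Drop (doc_id, word, count) rows whose word is too rare or too common.

    A word is kept if it appears in at least min_docs documents and in at
    most max_doc_ratio of all documents. Thresholds are illustrative
    assumptions, not values recommended by Teradata.
    """
    docs_per_word = defaultdict(set)
    all_docs = set()
    for doc_id, word, _count in rows:
        docs_per_word[word].add(doc_id)
        all_docs.add(doc_id)
    n_docs = len(all_docs)
    keep = {
        word
        for word, doc_ids in docs_per_word.items()
        if len(doc_ids) >= min_docs and len(doc_ids) / n_docs <= max_doc_ratio
    }
    return [row for row in rows if row[1] in keep]

# Made-up sample rows in (doc_id, word, count) form.
rows = [
    (1, "the", 5), (1, "engine", 2), (1, "zyx", 1),
    (2, "the", 4), (2, "engine", 1), (2, "brake", 3),
    (3, "the", 6), (3, "brake", 2),
    (4, "the", 3), (4, "wheel", 1),
]
filtered = filter_by_doc_frequency(rows, min_docs=2, max_doc_ratio=0.75)
for row in filtered:
    print(row)
```

With these sample rows, "the" is dropped because it appears in every document, and "zyx" and "wheel" are dropped because each appears in only one document. Suitable thresholds depend on the corpus and should be tuned rather than copied from this sketch.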