Input - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product: Aster Analytics
Release Number: 6.21
Published: November 2016
Language: English (United States)
Last Update: 2018-04-14
LDATrainer Training Table Schema

Column Name | Data Type | Description
doc_column | INTEGER, SMALLINT, BIGINT, NUMERIC, NUMERIC(p), NUMERIC(p,a), TEXT, VARCHAR, VARCHAR(n), UUID, or BYTEA | Contains the document identifiers.
word_column | INTEGER, SMALLINT, BIGINT, TEXT, VARCHAR, or VARCHAR(n) | Contains the words (one word in each row).
count_column | INTEGER, SMALLINT, BIGINT, NUMERIC, NUMERIC(p), NUMERIC(p,a), or DOUBLE PRECISION | Optional. Contains the counts of the words. The default value is 1.
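
To make the one-word-per-row layout concrete, the following Python sketch builds rows of the form (doc, word, count) from a small set of documents. It is only an illustration of the table shape, not part of Aster Analytics: the function name to_training_rows, the sample documents, and the naive whitespace tokenization are all assumptions made for this example.

```python
from collections import Counter


def to_training_rows(docs):
    """Turn {doc_id: text} into (doc_id, word, count) rows, one word per row.

    Hypothetical helper for illustration; it mirrors the doc_column,
    word_column, and count_column layout of the training table.
    """
    rows = []
    for doc_id, text in docs.items():
        # Naive tokenization: lowercase and split on whitespace (an assumption).
        counts = Counter(text.lower().split())
        for word, count in sorted(counts.items()):
            rows.append((doc_id, word, count))
    return rows


# Sample documents invented for this sketch.
docs = {1: "the cat sat on the mat", 2: "the dog sat"}
rows = to_training_rows(docs)
for row in rows:
    print(row)
```

Each printed tuple corresponds to one row of the training table. If the count were omitted, every row would be treated as a count of 1, matching the default described for count_column.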
You can use the output of the TextTokenizer function, called with the argument OutputByWord('true'), as input to the LDATrainer function. Teradata recommends that you filter out words with very low or very high frequency, because such words can skew the model toward topics that consist of common words that are not meaningful in a topic model.
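
The frequency filtering recommended above could be done in many ways before the rows are loaded into the training table. The following Python sketch shows one possible approach, based on how many documents each word appears in. The function name filter_by_doc_frequency, the thresholds min_docs and max_doc_fraction, and the sample rows are assumptions chosen for illustration; the source does not prescribe specific cutoffs or a specific method.

```python
from collections import defaultdict


def filter_by_doc_frequency(rows, min_docs=2, max_doc_fraction=0.8):
    """Drop words that occur in too few or too many documents.

    rows: iterable of (doc_id, word, count) tuples.
    min_docs and max_doc_fraction are hypothetical thresholds; tune per corpus.
    """
    rows = list(rows)
    docs_per_word = defaultdict(set)
    all_docs = set()
    for doc_id, word, _count in rows:
        docs_per_word[word].add(doc_id)
        all_docs.add(doc_id)

    # Upper bound on the number of documents a kept word may appear in.
    max_docs = max_doc_fraction * len(all_docs)
    keep = {
        word
        for word, doc_ids in docs_per_word.items()
        if min_docs <= len(doc_ids) <= max_docs
    }
    return [row for row in rows if row[1] in keep]


# Sample rows invented for this sketch: (doc_id, word, count).
sample_rows = [
    (1, "the", 2), (1, "cat", 1), (1, "pet", 1),
    (2, "the", 1), (2, "dog", 1), (2, "pet", 1),
    (3, "the", 3), (3, "tax", 1),
]
filtered = filter_by_doc_frequency(sample_rows, min_docs=2, max_doc_fraction=0.8)
print(filtered)
```

In this sample, "the" appears in all three documents and is removed as too common, while "cat", "dog", and "tax" each appear in only one document and are removed as too rare. Only the rows for "pet" remain.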