Input - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product

Aster Analytics

Release Number

7.00.02

Published

September 2017

Language

English (United States)

Last Update

2018-04-17

Product Category

Software

LDATrainer Training Table Schema
Column Name	Data Type	Description
doc_column	INTEGER, SMALLINT, BIGINT, NUMERIC, NUMERIC(p), NUMERIC(p,a), TEXT, VARCHAR, VARCHAR(n), UUID, or BYTEA	Contains the document identifiers.
word_column	INTEGER, SMALLINT, BIGINT, TEXT, VARCHAR, or VARCHAR(n)	Contains the words (one word in each row).
count_column	INTEGER, SMALLINT, BIGINT, NUMERIC, NUMERIC(p), NUMERIC(p,a), or DOUBLE PRECISION	Optional. Contains the counts of the words. The default value is 1.

You can use the output of the TextTokenizer function, called with the argument OutputByWord('true'), as input to the LDATrainer function. Teradata recommends filtering out words with very low or very high frequency before training. Such words can skew the topics toward common words that are not meaningful in a topic model.
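The frequency filter recommended above can be sketched outside the database as follows. This is a minimal illustration, not part of the Aster function set: the function name filter_by_document_frequency and the thresholds min_docs and max_doc_ratio are hypothetical, and measuring frequency as the share of documents that contain a word is an assumption. The rows follow the training table layout (doc_column, word_column, count_column).

```python
from collections import defaultdict


def filter_by_document_frequency(rows, min_docs=2, max_doc_ratio=0.5):
    """Drop words that occur in too few or too many documents.

    rows: iterable of (doc_id, word, count) tuples, one word per row.
    min_docs and max_doc_ratio are illustrative thresholds (assumptions),
    not values taken from the Aster documentation.
    """
    rows = list(rows)

    # Collect the set of documents in which each word appears.
    docs_per_word = defaultdict(set)
    all_docs = set()
    for doc_id, word, _count in rows:
        docs_per_word[word].add(doc_id)
        all_docs.add(doc_id)

    total_docs = len(all_docs)

    # Keep a word only if its document frequency lies inside both bounds.
    keep = {
        word
        for word, docs in docs_per_word.items()
        if len(docs) >= min_docs and len(docs) / total_docs <= max_doc_ratio
    }

    return [row for row in rows if row[1] in keep]


sample_rows = [
    (1, "the", 3), (2, "the", 1), (3, "the", 2), (4, "the", 1),
    (1, "topic", 1), (2, "topic", 2),
    (3, "rare", 1),
]
filtered_rows = filter_by_document_frequency(sample_rows)
```

With these sample rows, "the" appears in every document and "rare" in only one, so only the rows for "topic" remain. Suitable thresholds depend on the corpus; the defaults here are placeholders.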