NaiveBayesTextClassifierTrainer Input - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.10, 1.1
Published: October 2019
Language: English (United States)
Last Update: 2019-12-31
Product Category: Teradata Vantage™
| Table | Description |
|---|---|
| InputTable | Contains classified training tokens. Usually output by a tokenizing function, such as TextTokenizer (ML Engine) or TextParser (ML Engine). |
| CategoriesTable | [Optional] Contains prediction categories to use in model, which you can also specify with Categories syntax element. If you omit both this table and Categories syntax element, function uses all categories in column specified by DocCategoryColumn syntax element. |
| StopWords | [Optional] Contains stop words (a, an, the, and so on). You can specify stop words with either this table or StopWordsList syntax element. |
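
The category fallback described above can be sketched in Python. This is a minimal illustration, not ML Engine code: the function name `resolve_categories`, its parameters, and the tuple layout of the rows are all hypothetical. The source does not say what happens when both CategoriesTable and Categories are supplied, so preferring the table here is purely an assumption.

```python
from typing import Iterable, Optional, Set, Tuple

# Hypothetical row layout mirroring InputTable: (doc_id, token, doc_category).
Row = Tuple[str, str, str]


def resolve_categories(
    input_rows: Iterable[Row],
    categories_table: Optional[Iterable[str]] = None,
    categories_arg: Optional[Iterable[str]] = None,
) -> Set[str]:
    """Choose the prediction categories for training.

    Assumption: if both sources are given, the table wins. The source
    document does not specify this case.
    """
    if categories_table is not None:
        return set(categories_table)
    if categories_arg is not None:
        return set(categories_arg)
    # Neither source given: fall back to every category seen in the
    # doc_category column of the input rows.
    return {category for _, _, category in input_rows}
```

For example, calling `resolve_categories(rows)` with no explicit categories collects every distinct value in the third field of `rows`.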

InputTable Schema

| Column | Data Type | Description |
|---|---|---|
| doc_id_column | CHARACTER, VARCHAR, INTEGER, or SMALLINT | [Column appears once for each specified doc_id_column.] Identifier of document that contains classified training tokens. |
| token_column | CHARACTER or VARCHAR | Classified training token. |
| doc_category_column | CHARACTER or VARCHAR | Category of document. Partition table by this column. |
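
To make the InputTable shape concrete, the following Python sketch builds one row per token and groups the rows by category. It is only an illustration under stated assumptions: the helper names `to_input_rows` and `partition_by_category` are hypothetical, and the lowercase whitespace split is a stand-in for a real tokenizer such as TextTokenizer or TextParser, not a reproduction of their behavior.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical row layout: (doc_id, token, doc_category).
Row = Tuple[int, str, str]


def to_input_rows(docs: List[Tuple[int, str, str]]) -> List[Row]:
    """Expand (doc_id, text, category) documents into one row per token.

    Assumption: a lowercase whitespace split stands in for the tokenizer.
    """
    rows: List[Row] = []
    for doc_id, text, category in docs:
        for token in text.lower().split():
            rows.append((doc_id, token, category))
    return rows


def partition_by_category(rows: List[Row]) -> Dict[str, List[Row]]:
    """Group rows by the category field, analogous to partitioning
    the table by doc_category_column."""
    partitions: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        partitions[row[2]].append(row)
    return dict(partitions)
```

Each partition then holds all training tokens for a single category, which is the grouping a per-category token count would need.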

CategoriesTable Schema

| Column | Data Type | Description |
|---|---|---|
| category_column | CHARACTER or VARCHAR | Prediction category. |

StopWords Schema

| Column | Data Type | Description |
|---|---|---|
| stop_words_column | CHARACTER or VARCHAR | Stop word. |