1.1 - 8.10 - NaiveBayesTextClassifierTrainer Input - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Published: October 2019
Content Type: Programming Reference
Language: English (United States)
| Table | Description |
|---|---|
| InputTable | Contains classified training tokens. Usually output by a tokenizing function, such as TextTokenizer (ML Engine) or TextParser (ML Engine). |
| CategoriesTable | [Optional] Contains prediction categories to use in model, which you can also specify with Categories syntax element. If you omit both this table and Categories syntax element, function uses all categories specified by DocCategoryColumn syntax element. |
| StopWords | [Optional] Contains stop words (a, an, the, and so on). You can specify stop words with either this table or StopWordsList syntax element. |
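
The fallback rules in the table descriptions above can be illustrated with a small Python sketch. It is not the function's implementation. The helper names `resolve_categories` and `filter_rows` are hypothetical, and the sketch makes three assumptions that the descriptions do not state: a categories table takes precedence over a categories list, rows whose category is outside the chosen set are ignored, and stop words are matched exactly.

```python
from typing import Iterable, List, Optional, Set, Tuple

# One training row: (doc_id, token, doc_category), mirroring the InputTable columns.
Row = Tuple[str, str, str]


def resolve_categories(
    rows: Iterable[Row],
    categories_table: Optional[Iterable[str]] = None,
    categories_arg: Optional[Iterable[str]] = None,
) -> Set[str]:
    """Choose the prediction categories (hypothetical helper)."""
    if categories_table is not None:
        # Assumption: the table wins when both sources are supplied.
        return set(categories_table)
    if categories_arg is not None:
        return set(categories_arg)
    # Neither source given: use every category found in the category column.
    return {category for _, _, category in rows}


def filter_rows(
    rows: Iterable[Row],
    categories: Set[str],
    stop_words: Iterable[str] = (),
) -> List[Row]:
    """Keep rows in a chosen category whose token is not a stop word."""
    stop = set(stop_words)  # Assumption: exact, case-sensitive matching.
    return [r for r in rows if r[2] in categories and r[1] not in stop]


sample_rows: List[Row] = [
    ("1", "the", "sports"),
    ("1", "goal", "sports"),
    ("2", "vote", "politics"),
]
```

With `sample_rows`, calling `resolve_categories(sample_rows)` with no table or list yields both categories, while passing `categories_arg=["sports"]` restricts the set to that one category.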

InputTable Schema

| Column | Data Type | Description |
|---|---|---|
| doc_id_column | CHARACTER, VARCHAR, INTEGER, or SMALLINT | [Column appears once for each specified doc_id_column.] Identifier of document that contains classified training tokens. |
| token_column | CHARACTER or VARCHAR | Classified training token. |
| doc_category_column | CHARACTER or VARCHAR | Category of document. Partition table by this column. |
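
To make the row shape and the partitioning requirement concrete, here is a minimal Python sketch under stated assumptions. The sample rows and the helpers `partition_by_category` and `token_counts` are hypothetical. The grouping only mimics what partitioning by the category column achieves, and the per-category token counts are the kind of statistic a naive Bayes trainer typically needs, not a description of this function's internals.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical sample rows shaped like InputTable: (doc_id, token, doc_category).
input_rows: List[Tuple[int, str, str]] = [
    (1, "goal", "sports"),
    (1, "match", "sports"),
    (2, "vote", "politics"),
    (3, "goal", "sports"),
]


def partition_by_category(
    rows: List[Tuple[int, str, str]],
) -> Dict[str, List[Tuple[int, str]]]:
    """Group (doc_id, token) pairs by document category."""
    partitions: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
    for doc_id, token, category in rows:
        partitions[category].append((doc_id, token))
    return dict(partitions)


def token_counts(rows: List[Tuple[int, str, str]]) -> Dict[str, Counter]:
    """Count how often each token occurs within each category."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for _, token, category in rows:
        counts[category][token] += 1
    return dict(counts)
```

Because all rows of one category end up in the same group, the counts for that category can be computed without looking at any other group.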

CategoriesTable Schema

| Column | Data Type | Description |
|---|---|---|
| category_column | CHARACTER or VARCHAR | Prediction category. |

StopWords Schema

| Column | Data Type | Description |
|---|---|---|
| stop_words_column | CHARACTER or VARCHAR | Stop word. |