1.0 - 8.00 - NaiveBayesTextClassifierTrainer Input - Teradata Vantage

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Date: May 2019
Content Type: Programming Reference
Language: English (United States)
| Table | Description |
|---|---|
| tokens | Contains classified training tokens. Usually output by a tokenizing function, such as TextTokenizer or TextParser. |
| categories | [Optional] Contains the prediction categories to use in the model, which you can also specify with the Categories argument. If you omit both this table and the Categories argument, the function uses all categories in the column specified by the DocCategoryColumn argument. |
| stop_words | [Optional] Contains stop words (a, an, the, and so on). If you omit this table, you must specify stop words with the StopWords argument. |
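
The category fallback described for the categories table can be illustrated with a minimal Python sketch. It is not the function's implementation: the sample rows, the helper name `resolve_categories`, and the sorted output order are all hypothetical and chosen only for illustration. The `explicit` parameter stands in for categories supplied by either the categories table or the Categories argument.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical tokens rows: (doc_id, token, doc_category).
TOKEN_ROWS: List[Tuple[int, str, str]] = [
    (1, "goal", "sports"),
    (1, "match", "sports"),
    (2, "election", "politics"),
    (3, "vote", "politics"),
]


def resolve_categories(
    token_rows: Iterable[Tuple[int, str, str]],
    explicit: Optional[Iterable[str]] = None,
) -> List[str]:
    """Pick the prediction categories to train on (illustrative only)."""
    if explicit is not None:
        # Categories were supplied explicitly, so use only those.
        return sorted(set(explicit))
    # Nothing supplied: fall back to every category present in the tokens rows.
    return sorted({category for _, _, category in token_rows})


print(resolve_categories(TOKEN_ROWS))
print(resolve_categories(TOKEN_ROWS, ["sports"]))
```

How the function behaves when both the categories table and the Categories argument are supplied is not stated here, so the sketch deliberately takes a single explicit list.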

tokens Schema

| Column | Data Type | Description |
|---|---|---|
| doc_id_column | CHARACTER, VARCHAR, INTEGER, or SMALLINT | [Column appears once for each specified doc_id_column.] Identifier of the document that contains the classified training tokens. |
| token_column | CHARACTER or VARCHAR | Classified training token. |
| doc_category_column | CHARACTER or VARCHAR | Category of the document. Partition the table by this column. |
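
As a rough illustration of the tokens schema and of partitioning by the document category column, the following Python sketch checks row types and groups rows by category. The sample data and the helper names `validate_row` and `partition_by_category` are hypothetical. The mapping of CHARACTER/VARCHAR to `str` and INTEGER/SMALLINT to `int` is an assumption made for the example, not part of the function's interface.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, Union

DocId = Union[int, str]
TokenRow = Tuple[DocId, str, str]  # (doc_id, token, doc_category)

# Hypothetical tokens rows, for illustration only.
SAMPLE_ROWS: List[TokenRow] = [
    (1, "goal", "sports"),
    (1, "match", "sports"),
    ("doc-2", "election", "politics"),
]


def validate_row(row: TokenRow) -> None:
    """Raise TypeError if a row does not fit the tokens schema."""
    doc_id, token, category = row
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(doc_id, bool) or not isinstance(doc_id, (int, str)):
        raise TypeError("doc_id must be an integer or a string")
    if not isinstance(token, str) or not isinstance(category, str):
        raise TypeError("token and category must be strings")


def partition_by_category(rows: List[TokenRow]) -> Dict[str, List[Tuple[DocId, str]]]:
    """Group (doc_id, token) pairs by document category."""
    partitions: Dict[str, List[Tuple[DocId, str]]] = defaultdict(list)
    for row in rows:
        validate_row(row)
        doc_id, token, category = row
        partitions[category].append((doc_id, token))
    return dict(partitions)


for category, members in partition_by_category(SAMPLE_ROWS).items():
    print(category, members)
```

Each partition holds all training tokens for one category, which is the grouping that partitioning by doc_category_column produces.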

categories Schema

| Column | Data Type | Description |
|---|---|---|
| category_column | CHARACTER or VARCHAR | Prediction category. |

stop_words Schema

| Column | Data Type | Description |
|---|---|---|
| stop_words_column | CHARACTER or VARCHAR | Stop word. |