7.00.02 - Input - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Release Date: September 2017
Content Type: Programming Reference, User Guide
Publication ID: B700-1022-700K
Language: English (United States)

The NaiveBayesTextClassifierTrainer function has these input tables:
  • token
  • [Optional] categories
  • [Optional] stop_words

The token table, which contains the classified training tokens, is usually generated by a tokenizing function, such as TextTokenizer or Text_Parser. The following table describes its schema.

NaiveBayesTextClassifierTrainer Token Table Schema
Column Name | Data Type | Description
doc_id_column | CHARACTER, VARCHAR, text, INTEGER, or SMALLINT | Contains the identifiers of the documents that contain the classified training tokens. The table can have more than one such column.
token_column | CHARACTER, VARCHAR, or text | Contains the classified training tokens.
doc_category_column | CHARACTER, VARCHAR, or text | Contains the categories of the documents that contain the classified training tokens. Partition the table by this column.
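
To make the shape of the token table concrete, the following Python sketch builds rows with the three columns described above and groups them by category. It is only an illustration under stated assumptions: the sample documents, the TokenRow type, and the regular-expression tokenizer are hypothetical stand-ins, not the behavior of TextTokenizer, Text_Parser, or the trainer itself.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class TokenRow(NamedTuple):
    """One row of a token table (hypothetical in-memory stand-in)."""
    doc_id: int    # corresponds to doc_id_column
    token: str     # corresponds to token_column
    category: str  # corresponds to doc_category_column


def tokenize(doc_id: int, text: str, category: str) -> list:
    """Emit one row per lowercase word token; a crude stand-in for a real tokenizer."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [TokenRow(doc_id, word, category) for word in words]


# Hypothetical labeled training documents: (doc_id, text, category).
docs = [
    (1, "The match ended in a draw", "sports"),
    (2, "Markets fell sharply", "business"),
]

token_rows = [row for doc in docs for row in tokenize(*doc)]

# Group tokens by category, mirroring "partition the table by doc_category_column".
partitions = defaultdict(list)
for row in token_rows:
    partitions[row.category].append(row.token)
```

Each document contributes one row per token, and every row repeats the document's identifier and category, which is why the category column can serve as the partitioning key.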

The categories table contains all possible prediction categories. If you omit this table, then you must specify all possible prediction categories with the Categories argument.

NaiveBayesTextClassifierTrainer Categories Table Schema
Column Name | Data Type | Description
category_column | CHARACTER, VARCHAR, or text | Contains all possible prediction categories.

The stop_words table contains all possible stop words (a, an, the, and so on). If you omit this table, then you must specify all possible stop words with the Stop_Words argument.

NaiveBayesTextClassifierTrainer Stop_Words Table Schema
Column Name | Data Type | Description
stop_words_column | CHARACTER, VARCHAR, or text | Contains all possible stop words.
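
The two optional tables each have a single column. The Python sketch below shows what their contents might look like alongside a few token rows. All of the data is hypothetical, and two points are assumptions rather than documented behavior: that the category list can be derived from the training rows, and that stop words are simply excluded from the training tokens.

```python
# Hypothetical token rows: (doc_id, token, category).
token_rows = [
    (1, "the", "sports"),
    (1, "match", "sports"),
    (2, "a", "business"),
    (2, "merger", "business"),
]

# Categories table: one column, one row per possible prediction category.
# Assumption: every category of interest appears in the training rows.
categories = sorted({category for _, _, category in token_rows})

# Stop-words table: one column, one row per stop word.
stop_words = ["a", "an", "the"]

# Assumed effect of the stop-word list: matching tokens are dropped.
stop_set = set(stop_words)
kept = [row for row in token_rows if row[1] not in stop_set]

# Sanity check: categories used in training that the category list does not cover.
unknown = {category for _, _, category in token_rows} - set(categories)
```

A check like `unknown` is more useful when the category list is maintained separately from the training data, since it flags training rows whose category is missing from that list.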