Input - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product: Aster Analytics
Release Number: 6.21
Published: November 2016
Language: English (United States)
Last Update: 2018-04-14
Document ID: B700-1021
Product Category: Software

The NaiveBayesTextClassifierTrainer function has these input tables:

Input table, token
categories [Optional]
stop_words [Optional]

The token table, which contains the classified training tokens, is usually generated by a tokenizing function, such as TextTokenizer or Text_Parser. The following table describes its schema.

NaiveBayesTextClassifierTrainer Token Table Schema
Column Name	Data Type	Description
doc_id_column	CHARACTER, VARCHAR, text, INTEGER, or SMALLINT	Contains the identifiers of the documents that contain the classified training tokens. The table can have more than one such column.
token_column	CHARACTER, VARCHAR, or text	Contains the classified training tokens.
doc_category_column	CHARACTER, VARCHAR, or text	Contains the categories of the documents that contain the classified training tokens. Partition the table by this column.
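
To make the shape of the token table concrete, the following is a minimal Python sketch, not Aster code. It turns a few labeled documents into rows with the three columns described above and then groups the rows by category, which loosely mirrors partitioning the table by doc_category_column. The sample documents, the function names, and the whitespace tokenizer are all assumptions made for illustration; a real token table would normally come from a tokenizing function such as TextTokenizer or Text_Parser.

```python
from collections import defaultdict

# Hypothetical labeled training documents: (doc_id, category, text).
documents = [
    (1, "sports", "The team won the match"),
    (2, "finance", "Stocks fell as the market closed"),
]


def build_token_rows(docs):
    """Return one (doc_id, token, category) row per token.

    The tuple positions stand in for doc_id_column, token_column,
    and doc_category_column of the token table.
    """
    rows = []
    for doc_id, category, text in docs:
        # Lowercased whitespace split: a simple stand-in for a real tokenizer.
        for token in text.lower().split():
            rows.append((doc_id, token, category))
    return rows


def partition_by_category(rows):
    """Group rows by category, analogous to partitioning by doc_category_column."""
    partitions = defaultdict(list)
    for doc_id, token, category in rows:
        partitions[category].append((doc_id, token))
    return dict(partitions)


token_rows = build_token_rows(documents)
partitions = partition_by_category(token_rows)
```

Each partition holds every training token for one category, so any per-category token statistics can be computed from that partition alone.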

The categories table contains all possible prediction categories. If you omit this table, then you must specify all possible prediction categories with the Categories argument.

NaiveBayesTextClassifierTrainer Categories Table Schema
Column Name	Data Type	Description
category_column	CHARACTER, VARCHAR, or text	Contains all possible prediction categories.

The stop_words table contains all possible stop words (a, an, the, and so on). If you omit this table, then you must specify all possible stop words with the Stop_Words argument.

NaiveBayesTextClassifierTrainer Stop_Words Table Schema
Column Name	Data Type	Description
stop_words_column	CHARACTER, VARCHAR, or text	Contains all possible stop words.
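
The categories and stop_words inputs follow the same rule: the values come either from the optional table or from the corresponding argument. The sketch below models that rule in plain Python as an illustration only. The helper name resolve_values and its parameters are hypothetical, and preferring the table when both sources are present is an assumption of this sketch, not documented behavior of the function.

```python
def resolve_values(table_rows=None, argument_values=None, name="values"):
    """Return the set of values from the optional table, else from the argument.

    table_rows is a list of one-column rows, like category_column or
    stop_words_column. Preferring the table when both are given is an
    assumption of this sketch.
    """
    if table_rows is not None:
        return {row[0] for row in table_rows}
    if argument_values is not None:
        return set(argument_values)
    # Neither source was supplied, which the rule described above does not allow.
    raise ValueError(f"{name}: supply either the table or the argument")


# Categories supplied as a table, stop words supplied as an argument.
categories = resolve_values(table_rows=[("sports",), ("finance",)], name="categories")
stop_words = resolve_values(argument_values=["a", "an", "the"], name="stop_words")
```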