Table | Description |
---|---|
InputTable | Contains classified training tokens. Usually output by a tokenizing function, such as TextTokenizer (ML Engine) or TextParser (ML Engine). |
CategoriesTable | [Optional] Contains prediction categories to use in model, which you can also specify with Categories syntax element. If you omit both this table and Categories syntax element, function uses all categories specified by DocCategoryColumn syntax element. |
StopWords | [Optional] Contains stop words (a, an, the, and so on). You can specify stop words with either this table or StopWordsList syntax element. |

InputTable Schema

Column | Data Type | Description |
---|---|---|
doc_id_column | CHARACTER, VARCHAR, INTEGER, or SMALLINT | [Column appears once for each specified doc_id_column.] Identifier of document that contains classified training tokens. |
token_column | CHARACTER or VARCHAR | Classified training token. |
doc_category_column | CHARACTER or VARCHAR | Category of document. Partition table by this column. |

CategoriesTable Schema

Column | Data Type | Description |
---|---|---|
category_column | CHARACTER or VARCHAR | Prediction category. |

StopWords Schema

Column | Data Type | Description |
---|---|---|
stop_words_column | CHARACTER or VARCHAR | Stop word. |
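
The rules in the table descriptions above can be illustrated outside the database. The following is a minimal Python sketch, not ML Engine code: the sample rows and the helpers `resolve_categories` and `drop_stop_words` are hypothetical. It mimics the documented fallback (with no CategoriesTable and no Categories syntax element, every category in the document category column is used) and assumes that stop words are excluded from the training tokens by exact match.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical InputTable rows: (doc_id, token, category).
Row = Tuple[int, str, str]

INPUT_ROWS: List[Row] = [
    (1, "the", "sports"),
    (1, "match", "sports"),
    (2, "a", "finance"),
    (2, "market", "finance"),
    (3, "goal", "sports"),
]


def resolve_categories(
    rows: Iterable[Row],
    explicit_categories: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the categories to train on.

    explicit_categories stands in for CategoriesTable or the Categories
    syntax element. When it is None, fall back to every distinct value
    in the category column, as the table description states.
    """
    if explicit_categories is not None:
        return sorted(set(explicit_categories))
    return sorted({category for _, _, category in rows})


def drop_stop_words(rows: Iterable[Row], stop_words: Iterable[str]) -> List[Row]:
    """Remove rows whose token is a stop word (exact match is an assumption)."""
    blocked = set(stop_words)
    return [row for row in rows if row[1] not in blocked]


# No explicit categories given, so all categories in the input are used.
categories = resolve_categories(INPUT_ROWS)

# Stop words could come from the StopWords table or StopWordsList.
training_rows = drop_stop_words(INPUT_ROWS, ["a", "an", "the"])
```

Passing a list as `explicit_categories` models supplying either CategoriesTable or Categories. The sketch does not say what happens when both are supplied, because the descriptions above do not specify it.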