1.1 - 8.10 - TextTagger Syntax Elements - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

InputLanguage
[Optional] Specify the language of the input text:
Option            Description
'en' (Default)    English
'zh_CN'           Simplified Chinese
'zh_TW'           Traditional Chinese
TaggingRules
[Required if you do not specify a Rules table, disallowed otherwise.] Specify the tag names and tagging rules. For information about defining tagging rules, see Defining Tagging Rules.
Tokenize
[Optional] Specify whether the function tokenizes both the input text (before evaluating the rules) and the text string parameter in each rule definition (when parsing the rule).

If you specify 'true', then you must also specify the InputLanguage syntax element. The function uses the value of InputLanguage to create the word tokenizer.

Default: 'false'
OutputByTag
[Optional] Specify whether the function outputs a separate tuple for each matched tag when a text document matches multiple tags.
Default: 'false' (Each output tuple represents one document, and its matched tags are listed in the output column tag.)
TagDelimiter
[Optional] Specify the delimiter, a string, that separates multiple tags in the output column tag if OutputByTag has the value 'false'. If OutputByTag has the value 'true', specifying this syntax element causes an error.
Default: ',' (comma)
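
The interaction between OutputByTag and TagDelimiter can be illustrated with a small Python sketch. This is not the function's implementation. The helper name format_output and the in-memory shape of the match results (a mapping from a document ID to its matched tags) are assumptions made only for illustration, as is the reading that 'true' yields one row per matched tag.

```python
def format_output(matches, output_by_tag=False, tag_delimiter=None):
    """Model the output rows for documents and their matched tags.

    matches: dict mapping a document ID to a list of matched tag names
    (a hypothetical stand-in for the function's match results).
    """
    if output_by_tag:
        # The reference states that TagDelimiter causes an error in this mode.
        if tag_delimiter is not None:
            raise ValueError(
                "TagDelimiter cannot be specified when OutputByTag is 'true'"
            )
        # Assumed behavior: one row per (document, tag) pair.
        return [(doc_id, tag) for doc_id, tags in matches.items() for tag in tags]

    # Default mode: one row per document, tags joined in the 'tag' column.
    delimiter = "," if tag_delimiter is None else tag_delimiter
    return [(doc_id, delimiter.join(tags)) for doc_id, tags in matches.items()]


# Hypothetical match results for two documents.
matches = {1: ["fire", "flood"], 2: ["quake"]}

rows_default = format_output(matches)                     # tags joined with ','
rows_by_tag = format_output(matches, output_by_tag=True)  # one row per tag
rows_custom = format_output(matches, tag_delimiter="|")   # tags joined with '|'
```

In this model, document 1 produces the single row (1, 'fire,flood') by default and the two rows (1, 'fire') and (1, 'flood') when output_by_tag is true.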
Accumulate
[Optional] Specify the names of text table columns to copy to the output table.
Do not use the name 'tag' for an accumulate_column, because the function uses that name for the output table column that contains the tags.
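
The constraints among these syntax elements can be summarized as a client-side check, sketched below in Python. The function check_arguments and its parameter names are hypothetical and are not part of the product. The sketch only encodes the rules stated above, and treating the column name 'tag' case-insensitively is an assumption.

```python
SUPPORTED_LANGUAGES = {"en", "zh_CN", "zh_TW"}


def check_arguments(tagging_rules=None, has_rules_table=False,
                    input_language=None, tokenize=False,
                    output_by_tag=False, tag_delimiter=None,
                    accumulate=()):
    """Return a list of constraint violations (empty if there are none)."""
    problems = []

    # TaggingRules is required without a Rules table and disallowed with one.
    if has_rules_table and tagging_rules is not None:
        problems.append("TaggingRules is disallowed when a Rules table is specified")
    if not has_rules_table and tagging_rules is None:
        problems.append("TaggingRules is required when no Rules table is specified")

    # InputLanguage, when given, must be one of the documented options.
    if input_language is not None and input_language not in SUPPORTED_LANGUAGES:
        problems.append("InputLanguage must be 'en', 'zh_CN', or 'zh_TW'")

    # Tokenize('true') requires an explicit InputLanguage.
    if tokenize and input_language is None:
        problems.append("Tokenize 'true' requires InputLanguage")

    # TagDelimiter is an error when OutputByTag is 'true'.
    if output_by_tag and tag_delimiter is not None:
        problems.append("TagDelimiter is not allowed when OutputByTag is 'true'")

    # 'tag' is reserved for the output column that holds the tags.
    # Assumption: the comparison ignores case.
    if any(name.lower() == "tag" for name in accumulate):
        problems.append("Accumulate must not include a column named 'tag'")

    return problems


ok = check_arguments(tagging_rules=["rule"], input_language="en", tokenize=True)
bad = check_arguments(tagging_rules=["rule"], has_rules_table=True,
                      tokenize=True, output_by_tag=True, tag_delimiter=";",
                      accumulate=["id", "tag"])
```

Here ok is an empty list, while bad collects four violations: the TaggingRules conflict, the missing InputLanguage, the TagDelimiter conflict, and the reserved column name.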