TextTagger Arguments - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
dita:id
B700-4003
Product Category
Teradata Vantage™
InputLanguage
[Optional] Specify the language of the input text:
Option     Description
'en'       (Default) English
'zh_CN'    Simplified Chinese
'zh_TW'    Traditional Chinese
TaggingRules
[Required if you do not specify a rules table, disallowed otherwise.] Specify the tag names and tagging rules. For information about defining tagging rules, see Defining Tagging Rules.
Tokenize
[Optional] Specify whether the function tokenizes the input text before evaluating the rules. The same setting also controls whether the function tokenizes the text string parameter of a rule definition when it parses the rule.

If you specify 'true', then you must also specify the InputLanguage argument. The function uses the value of InputLanguage to create the word tokenizer.

Default: 'false'
OutputByTag
[Optional] Specify whether the function outputs a separate tuple for each matched tag when a text document matches multiple tags.
Default: 'false' (Each output tuple represents one document, and the matched tags are listed in the output column tag.)
TagDelimiter
[Optional] Specify the delimiter, a string, that separates multiple tags in the output column tag if OutputByTag has the value 'false'. If OutputByTag has the value 'true', specifying this argument causes an error.
Default: ',' (comma)
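The interaction between OutputByTag and TagDelimiter can be pictured with a small sketch. It is plain Python, not a TextTagger call, and it models only the shape of the output described above. The function name shape_output, the (doc_id, tags) input format, and the sample tags are assumptions made for illustration. The one-tuple-per-tag layout for OutputByTag 'true' is an assumed reading of the argument description.

```python
def shape_output(matches, output_by_tag=False, tag_delimiter=None):
    """Model how matched tags are laid out in the output.

    matches: list of (doc_id, [tag, ...]) pairs (hypothetical input format).
    """
    if output_by_tag:
        # The reference states that specifying TagDelimiter together with
        # OutputByTag 'true' causes an error.
        if tag_delimiter is not None:
            raise ValueError("TagDelimiter is not allowed when OutputByTag is 'true'")
        # Assumed layout: one tuple per (document, tag) pair.
        return [(doc_id, tag) for doc_id, tags in matches for tag in tags]
    # One tuple per document; tags joined by the delimiter (default: comma).
    delimiter = "," if tag_delimiter is None else tag_delimiter
    return [(doc_id, delimiter.join(tags)) for doc_id, tags in matches]


# Hypothetical match results: document 1 matched two tags, document 2 matched one.
matches = [(1, ["weather", "finance"]), (2, ["finance"])]
by_document = shape_output(matches)
by_tag = shape_output(matches, output_by_tag=True)
piped = shape_output(matches, tag_delimiter="|")
```

With the defaults, document 1 yields a single tuple whose tag value is the comma-joined list; with output_by_tag=True it yields two tuples, one per tag.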
Accumulate
[Optional] Specify the names of text table columns to copy to the output table.
Do not use the name 'tag' for an accumulate_column, because the function uses that name for the output table column that contains the tags.
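The constraints among these arguments can be summarized as a checklist. The sketch below is a hypothetical pre-flight check written in Python, not part of TextTagger. The dictionary representation of the arguments, the has_rules_table flag, and the name check_arguments are assumptions, and the comparison with 'tag' is a simple exact match.

```python
VALID_LANGUAGES = {"en", "zh_CN", "zh_TW"}


def check_arguments(args, has_rules_table):
    """Return a list of problems found in a dict of TextTagger arguments.

    args maps argument names to values (hypothetical representation);
    has_rules_table says whether a rules table is supplied as input.
    """
    problems = []
    language = args.get("InputLanguage")
    if language is not None and language not in VALID_LANGUAGES:
        problems.append("InputLanguage must be 'en', 'zh_CN', or 'zh_TW'")
    # TaggingRules is required without a rules table and disallowed with one.
    if has_rules_table == ("TaggingRules" in args):
        problems.append("specify exactly one of a rules table or TaggingRules")
    # Tokenize 'true' requires an explicit InputLanguage.
    if args.get("Tokenize") == "true" and language is None:
        problems.append("Tokenize 'true' requires InputLanguage")
    # TagDelimiter is an error when OutputByTag is 'true'.
    if args.get("OutputByTag") == "true" and "TagDelimiter" in args:
        problems.append("TagDelimiter is not allowed when OutputByTag is 'true'")
    # The output column 'tag' is reserved for the matched tags.
    if "tag" in args.get("Accumulate", []):
        problems.append("Accumulate must not include a column named 'tag'")
    return problems


# A consistent set of arguments (the rule text is a placeholder, not real rule syntax).
ok = check_arguments(
    {"TaggingRules": "<rule> AS <tag_name>", "Tokenize": "true",
     "InputLanguage": "en", "Accumulate": ["id"]},
    has_rules_table=False,
)
# An inconsistent set: no rules at all, reserved column name, delimiter with OutputByTag.
bad = check_arguments(
    {"OutputByTag": "true", "TagDelimiter": ";", "Accumulate": ["tag"]},
    has_rules_table=False,
)
```

Here ok is an empty list, while bad reports three problems: missing rules, the disallowed TagDelimiter, and the reserved column name.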