1.0 - 8.00 - NamedEntityFinder Arguments - Teradata Vantage

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.0, 8.00
Release Date: May 2019
Content Type: Programming Reference
Publication ID: B700-4003-098K
Language: English (United States)
TextColumn
Specify the name of the input table column that contains the text to analyze.
Model
[Optional if you specify configuration_table; required otherwise.] Specify the model items to load. Each model item has the form entity_type:model_type:model_file or, for model_type 'reg exp', entity_type:model_type:regular_expression. If you do not specify configuration_table, you cannot specify 'all'.
If you specify both configuration_table and this argument, the function loads only the specified model items from configuration_table.
The entity_type is the name of an entity type (for example, PERSON, LOCATION, or EMAIL). It appears in the output table.
model_type      Description
'max entropy'   Maximum entropy language model output by training.
'rule'          Rule-based model, a plain text file with one regular expression on each line.
'dictionary'    Dictionary-based model, a plain text file with one word on each line.
'reg exp'       Regular expression that describes entity_type.
If model_type is 'reg exp', specify regular_expression (a regular expression that describes entity_type); otherwise, specify model_file (the name of the model file).
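
As a rough illustration of how the plain-text 'dictionary' and 'rule' model files described above could drive entity matching, here is a minimal Python sketch. It is not the function's implementation: the helper names (load_lines, find_with_dictionary, find_with_rules), the word tokenization, and the case-sensitive exact matching are all assumptions made for the example.

```python
import re
from typing import List, Tuple


def load_lines(file_text: str) -> List[str]:
    """Return the non-empty lines of a plain-text model file (one entry per line)."""
    return [line.strip() for line in file_text.splitlines() if line.strip()]


def find_with_dictionary(text: str, dictionary_text: str, entity_type: str) -> List[Tuple[str, str]]:
    """Dictionary-style model: report every word of the text that is listed in the file.

    Assumption: words are runs of word characters and matching is case-sensitive.
    """
    words = set(load_lines(dictionary_text))
    return [(entity_type, token) for token in re.findall(r"\w+", text) if token in words]


def find_with_rules(text: str, rules_text: str, entity_type: str) -> List[Tuple[str, str]]:
    """Rule-style model: apply each regular expression in the file to the text."""
    hits: List[Tuple[str, str]] = []
    for pattern in load_lines(rules_text):
        hits.extend((entity_type, match.group(0)) for match in re.finditer(pattern, text))
    return hits
```

A 'reg exp' model item can be viewed as the single-rule case of find_with_rules, with the expression supplied inline instead of in a file.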

If you specify configuration_table, you can use entity_type alone as a shortcut. For example, if configuration_table has the row 'organization, max entropy, en-ner-organization.bin', you can specify Model ('organization') as a shortcut for Model ('organization:max entropy:en-ner-organization.bin').

For model_type 'max entropy', if you specify configuration_table and omit this argument, the JVM of the worker node needs more than 2 GB of memory.
Default: 'all' (applies only if you specify configuration_table and omit this argument)
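
The model-item format and the entity_type shortcut described above can be sketched in Python as follows. This is an illustrative parser under stated assumptions, not the function's own logic: ModelItem, parse_model_item, and the config_rows mapping (entity_type to a model_type and model file pair, standing in for rows of configuration_table) are hypothetical names introduced for the example.

```python
from typing import Dict, NamedTuple, Optional, Tuple

MODEL_TYPES = {"max entropy", "rule", "dictionary", "reg exp"}


class ModelItem(NamedTuple):
    entity_type: str
    model_type: str
    source: str  # model file name, or the regular expression for 'reg exp'


def parse_model_item(
    spec: str,
    config_rows: Optional[Dict[str, Tuple[str, str]]] = None,
) -> ModelItem:
    """Parse 'entity_type:model_type:source', or expand a bare entity_type shortcut.

    Assumption: config_rows maps entity_type -> (model_type, model_file) and
    stands in for the rows of configuration_table.
    """
    # Split at most twice so a regular expression may itself contain ':'.
    parts = spec.split(":", 2)
    if len(parts) == 3:
        entity_type, model_type, source = parts
        if model_type not in MODEL_TYPES:
            raise ValueError(f"unknown model_type: {model_type!r}")
        return ModelItem(entity_type, model_type, source)
    if len(parts) == 1:
        # The shortcut form is only meaningful when a configuration table exists.
        if config_rows is None or spec not in config_rows:
            raise ValueError(f"shortcut {spec!r} needs a matching configuration row")
        model_type, model_file = config_rows[spec]
        return ModelItem(spec, model_type, model_file)
    raise ValueError(f"malformed model item: {spec!r}")
```

With a configuration row for 'organization', parse_model_item('organization', rows) and parse_model_item('organization:max entropy:en-ner-organization.bin') yield the same item, mirroring the shortcut example in the text.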
ShowEntityContext
[Optional] Specify context_words, the number of context words to output. If context_words is n (a positive integer), the function outputs the n words that precede the entity, the entity itself, and the n words that follow the entity.
Default: 0
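
To make the context window concrete, the following Python sketch returns the n words before an entity, the entity, and the n words after it. It is only an illustration: the function name entity_context, whitespace tokenization, and the single-string output are assumptions, and the actual function's tokenization and output layout may differ.

```python
from typing import List


def entity_context(words: List[str], start: int, end: int, n: int) -> str:
    """Return the entity words[start:end] with up to n words of context on each side.

    Fewer than n context words are returned when the entity is near either
    end of the text. With n == 0, only the entity itself is returned.
    """
    if n < 0:
        raise ValueError("n must not be negative")
    before = words[max(0, start - n):start]
    entity = words[start:end]
    after = words[end:end + n]
    return " ".join(before + entity + after)


# Usage: "Acme Corp" occupies word positions 3 and 4 of this sentence.
sentence = "The chairman of Acme Corp resigned on Monday".split()
window = entity_context(sentence, 3, 5, 2)
```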
EntityColumn
[Optional] Specify the name of the output table column that contains the entity names.
Default: 'entity'
Accumulate
[Optional] Specify the names of the input columns to copy to the output table. An accumulate_column cannot have the same name as entity_column.
Default: All input columns