NamedEntityFinder Syntax Elements

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage

Release Number: 9.02, 9.01, 2.0, 1.3

Published: February 2022

Language: English (United States)

Last Update: 2022-02-10

dita:id

B700-4003

TextColumn

Specify the name of the input table column that contains the text to analyze.

Models

[Optional] Specify the model items to load. This syntax element is required if you do not specify ConfigurationTable; in that case, you cannot specify 'all'.

If you specify both ConfigurationTable and this syntax element, the function loads the specified model items from ConfigurationTable.

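Each model item has the form 'entity_type:model_type:model_file' or, for model_type 'reg exp', 'entity_type:reg exp:regular_expression'. The entity_type is the name of an entity type (for example, PERSON, LOCATION, or EMAIL), which appears in the output table.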

model_type	Description
'max entropy'	Maximum entropy language model output by training.
'rule'	Rule-based model, a plain text file with one regular expression on each line.
'dictionary'	Dictionary-based model, a plain text file with one word on each line.
'reg exp'	Regular expression that describes entity_type.

If model_type is 'reg exp', specify regular_expression (a regular expression that describes entity_type); otherwise, specify model_file (the name of the model file).
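The following Python sketch illustrates what the 'rule', 'dictionary', and 'reg exp' model types describe. It is not the function's implementation: find_entities is a hypothetical helper, the patterns use Python regular expression syntax (which may differ from the syntax the function accepts), and whole-word, case-sensitive dictionary matching is an assumption. The 'max entropy' model type is not covered.

```python
import re

def find_entities(text, entity_type, model_type, model_lines):
    """Return (entity_type, matched_text) pairs found in text.

    model_lines stands in for the lines of a model file ('rule' or
    'dictionary') or holds the single pattern of a 'reg exp' model.
    Hypothetical helper for illustration only.
    """
    if model_type == "dictionary":
        # One word per line: match each word literally, as a whole word (assumption).
        patterns = [r"\b" + re.escape(line.strip()) + r"\b"
                    for line in model_lines if line.strip()]
    elif model_type in ("rule", "reg exp"):
        # One regular expression per line; blank lines are skipped.
        patterns = [line.strip() for line in model_lines if line.strip()]
    else:
        raise ValueError(f"unsupported model_type in this sketch: {model_type!r}")

    found = []
    for pattern in patterns:
        for match in re.finditer(pattern, text):
            found.append((entity_type, match.group(0)))
    return found

text = "Contact ann@example.com in Boston or bob@example.org in Chicago."
emails = find_entities(text, "EMAIL", "reg exp", [r"[\w.]+@[\w.]+\.\w+"])
cities = find_entities(text, "LOCATION", "dictionary", ["Boston", "Chicago"])
```

In this sketch, a 'rule' model and a 'reg exp' model differ only in where the patterns come from: a file with one pattern per line versus a single pattern given inline.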

If you specify ConfigurationTable, you can use entity_type as a shortcut. For example, if the ConfigurationTable has the row 'organization, max entropy, en-ner-organization.bin', you can specify Models ('organization') as a shortcut for Models ('organization:max entropy:en-ner-organization.bin').

For model_type 'max entropy', if you specify ConfigurationTable and omit this syntax element, the JVM of the worker node needs more than 2 GB of memory.

Default: 'all' (applies only if you specify ConfigurationTable and omit this syntax element)
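The rules for Models can be summarized in a short Python sketch. It assumes that a full model item is a colon-separated string of the form 'entity_type:model_type:model_file' (as in the example above) and that ConfigurationTable rows can be represented as (entity_type, model_type, model_spec) tuples. The names parse_model_item and resolve_models are hypothetical, and the sketch takes fully specified items as given rather than looking them up in the configuration rows.

```python
def parse_model_item(item):
    """Split a model item into (entity_type, model_type, model_spec).

    A bare entity_type (the shortcut form) yields (entity_type, None, None).
    Splitting at most twice keeps any ':' inside a regular expression intact.
    """
    parts = item.split(":", 2)
    if len(parts) == 1:
        return (parts[0], None, None)
    if len(parts) != 3:
        raise ValueError(f"malformed model item: {item!r}")
    return tuple(parts)

def resolve_models(models, config_rows=None):
    """Return the list of (entity_type, model_type, model_spec) to load.

    models: list of model item strings, or None if Models is omitted.
    config_rows: rows of the ConfigurationTable, or None if not specified.
    """
    if models is None or models == ["all"]:
        # 'all' (the default) is only meaningful with a ConfigurationTable.
        if config_rows is None:
            raise ValueError("Models is required, and cannot be 'all', "
                             "without ConfigurationTable")
        return list(config_rows)

    resolved = []
    for item in models:
        entity_type, model_type, model_spec = parse_model_item(item)
        if model_type is None:
            # Shortcut form: look up the entity_type in the configuration rows.
            if config_rows is None:
                raise ValueError(f"shortcut {item!r} needs ConfigurationTable")
            matches = [row for row in config_rows if row[0] == entity_type]
            if not matches:
                raise ValueError(f"no configuration row for {entity_type!r}")
            resolved.extend(matches)
        else:
            resolved.append((entity_type, model_type, model_spec))
    return resolved

config = [("organization", "max entropy", "en-ner-organization.bin")]
shortcut = resolve_models(["organization"], config)
explicit = resolve_models(["organization:max entropy:en-ner-organization.bin"])
```

Here shortcut and explicit resolve to the same model item, mirroring the equivalence between Models ('organization') and Models ('organization:max entropy:en-ner-organization.bin').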

ShowContext

[Optional] Specify the number of context words to output (context_words), which must be a positive integer. If context_words is n, the function outputs the n words that precede the entity, the entity, and the n words that follow the entity.

Default: 0
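As an illustration of the ShowContext window, the following Python sketch returns the n words before an entity, the entity itself, and the n words after it. It assumes whitespace tokenization and truncates the window at the start and end of the text; how the function actually tokenizes and formats context is not specified here, and with_context is a hypothetical helper.

```python
def with_context(words, start, end, n):
    """Return the entity words[start:end] with up to n words on each side.

    words: list of tokens (whitespace tokenization is an assumption).
    The window is cut short where the text begins or ends.
    """
    before = words[max(0, start - n):start]
    entity = words[start:end]
    after = words[end:end + n]
    return " ".join(before + entity + after)

words = "The office of John Smith is in Boston today".split()
person_ctx = with_context(words, 3, 5, 2)   # entity: "John Smith"
no_ctx = with_context(words, 3, 5, 0)       # n = 0: entity only
city_ctx = with_context(words, 7, 8, 2)     # entity: "Boston", near end of text
```

With n set to 0, the default, only the entity itself is returned.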

EntityColName

[Optional] Specify the name of the output table column that contains the entity names.

Default: 'entity'

Accumulate

[Optional] Specify the names of the input columns to copy to the output table. An accumulate_column cannot be the entity_column.

Default: All input columns