FindNamedEntity Arguments - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product

Aster Analytics

Release Number

7.00.02

Published

September 2017

Language

English (United States)

Last Update

2018-04-17

TextColumn

Specifies the name of the input table column that contains the text to analyze.

Model

[Optional] Required if you do not specify configuration_table; in that case, the value cannot be 'all'.

Specifies the model items to load.

If you specify both configuration_table and this argument, the function loads the specified model items from configuration_table.

Default: 'all' (if you specify configuration_table but omit this argument).

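Each model item names an entity_type and, unless the shortcut form described below is used, a model_type and either a model_file or a regular_expression, separated by colons (for example, 'organization:max entropy:en-ner-organization.bin').

The entity_type is the name of an entity type (for example, PERSON, LOCATION, or EMAIL), which appears in the output table.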

The model_type is one of these model types:

'max entropy'
Maximum entropy language model generated by training.
'rule'
Rule-based model, a plain text file with one regular expression on each line.
'dictionary'
Dictionary-based model, a plain text file with one word on each line.
'reg exp'
Regular expression that describes entity_type.
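
The 'dictionary' and 'rule' file formats described above can be illustrated with a minimal Python sketch. It is not the Aster implementation: the helper names, the sample file contents, and the whitespace-insensitive, case-sensitive matching are all assumptions made for illustration.

```python
import re


def load_lines(file_text):
    """Return the non-empty, stripped lines of a plain text model file."""
    return [line.strip() for line in file_text.splitlines() if line.strip()]


def find_with_dictionary(text, dictionary_text):
    """Return the words of `text` that appear in the dictionary.

    Assumption: matching is case-sensitive and word-based.
    """
    words = set(load_lines(dictionary_text))
    return [token for token in re.findall(r"\w+", text) if token in words]


def find_with_rules(text, rules_text):
    """Return every substring of `text` matched by any rule (one regex per line)."""
    matches = []
    for pattern in load_lines(rules_text):
        matches.extend(m.group(0) for m in re.finditer(pattern, text))
    return matches


# Hypothetical file contents, for illustration only.
city_dictionary = "Boston\nChicago\n"
phone_rules = r"\d{3}-\d{3}-\d{4}" + "\n"

sentence = "Call 555-123-4567 before flying from Boston to Chicago."
print(find_with_dictionary(sentence, city_dictionary))
print(find_with_rules(sentence, phone_rules))
```

In this sketch a dictionary file contributes literal words and a rule file contributes patterns; how the function itself tokenizes text or handles case is not stated in this section.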

If model_type is 'reg exp', specify regular_expression (a regular expression that describes entity_type); otherwise, specify model_file (the name of the model file). Before calling the function, add the location of every specified model_file to the user/session default search path.

If you specify configuration_table, you can use entity_type as a shortcut. For example, if configuration_table has the row 'organization, max entropy, en-ner-organization.bin', you can specify Model('organization') as a shortcut for Model('organization:max entropy:en-ner-organization.bin').
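
To make the shortcut concrete, the following Python sketch resolves a model item string either from its full colon-separated form or by looking up the entity_type in a list of configuration rows. The names ModelItem and parse_model_item are hypothetical, and splitting on only the first two colons (so that a regular expression containing ':' stays intact) is an assumption, not documented behavior.

```python
from typing import NamedTuple, Optional, Sequence


class ModelItem(NamedTuple):
    entity_type: str
    model_type: str
    model_source: str  # model_file, or regular_expression for 'reg exp'


def parse_model_item(item: str,
                     config_rows: Optional[Sequence[ModelItem]] = None) -> ModelItem:
    """Resolve one Model argument value into a full model item."""
    # Assumption: split on the first two colons only.
    parts = item.split(":", 2)
    if len(parts) == 3:
        entity_type, model_type, source = parts
        return ModelItem(entity_type.strip(), model_type.strip(), source)
    if len(parts) == 1 and config_rows is not None:
        # Shortcut form: look the entity type up in the configuration rows.
        for row in config_rows:
            if row.entity_type == item.strip():
                return row
        raise KeyError(f"entity type {item!r} not found in configuration rows")
    raise ValueError(f"cannot resolve model item {item!r}")


# Configuration row taken from the example in the text.
config = [ModelItem("organization", "max entropy", "en-ner-organization.bin")]

full = parse_model_item("organization:max entropy:en-ner-organization.bin")
short = parse_model_item("organization", config)
print(full == short)
```

Without configuration rows, the shortcut form raises ValueError in this sketch, mirroring the rule that Model is required in full when configuration_table is not specified.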

For model_type 'max entropy', if you specify configuration_table and omit this argument, then the JVM of the worker node needs more than 2 GB of memory.

ShowEntityContext

[Optional] Specifies the number of context words (context_words) to output. If context_words is n (which must be a positive integer), the function outputs the n words that precede the entity, the entity, and the n words that follow the entity. Default: 0.
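
The context window can be sketched in Python as follows. This is an illustration only: the function name entity_with_context, the whitespace tokenization, and the space-joined output are assumptions, and the actual output format of the function may differ.

```python
def entity_with_context(words, start, end, context_words):
    """Return the entity words[start:end] with up to `context_words`
    words of context on each side, joined by spaces."""
    if context_words < 0:
        raise ValueError("context_words must not be negative")
    # Clamp at the start of the text so the slice never wraps around.
    before = words[max(0, start - context_words):start]
    after = words[end:end + context_words]
    return " ".join(before + words[start:end] + after)


words = "The chairman of Acme Corporation visited Paris last week".split()

# The entity "Acme Corporation" occupies words[3:5].
print(entity_with_context(words, 3, 5, 2))
print(entity_with_context(words, 3, 5, 0))
```

When fewer than n words exist on one side of the entity, this sketch returns only the words that are available.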

EntityColumn

[Optional] Specifies the name of the output table column that contains the entity names. Default: 'entity'.

Accumulate

[Optional] Specifies the names of input columns to copy to the output table. An accumulate_column cannot have the same name as entity_column. Default: All input columns.