Named entity recognition (NER) is a process for finding specified entities in text. For example, a simple news named-entity recognizer for English might find the person "John J. Smith" and the location "Seattle" in the text string "John J. Smith lives in Seattle."
NER functions let you specify how to extract named entities when you train the data models. The Aster Analytics Foundation provides two sets of NER functions: one built on a Conditional Random Fields model and one built on a maximum entropy model.
The NER functions that use the Conditional Random Fields (CRF) model are:
- NERTrainer, which takes training data and outputs a CRF model (a binary file)
- NER, which takes input documents and extracts specified entities, using one or more CRF models and, if appropriate, rules (regular expressions) or a dictionary
  The NER function uses models to extract the names of persons, locations, and organizations; rules to extract entities that match a fixed pattern (such as phone numbers, times, and dates); and a dictionary to extract known entities.
- NEREvaluator, which evaluates a CRF model
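The rule and dictionary extraction described for the NER function can be illustrated with a minimal Python sketch. This is not the Aster implementation and does not use any Aster API: the names RULES, DICTIONARY, and extract_entities, the entity type labels, and the specific regular expressions are all assumptions made for illustration. The model-based (CRF) step is omitted entirely.

```python
import re

# Hypothetical rules: entity type -> regular expression.
# These patterns are illustrative assumptions, not Aster's built-in rules.
RULES = {
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "TIME": re.compile(r"\b\d{1,2}:\d{2}\s?(?:AM|PM)\b", re.IGNORECASE),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

# Hypothetical dictionary: known entity string -> entity type.
DICTIONARY = {
    "John J. Smith": "PERSON",
    "Seattle": "LOCATION",
}


def extract_entities(text):
    """Return (start, end, type, matched_text) tuples sorted by start offset."""
    found = []

    # Rule-based extraction: every regex match becomes an entity.
    for entity_type, pattern in RULES.items():
        for match in pattern.finditer(text):
            found.append((match.start(), match.end(), entity_type, match.group()))

    # Dictionary-based extraction: look up each known entity literally.
    for known, entity_type in DICTIONARY.items():
        for match in re.finditer(re.escape(known), text):
            found.append((match.start(), match.end(), entity_type, known))

    return sorted(found)


if __name__ == "__main__":
    sample = "John J. Smith lives in Seattle. Call 206-555-0100 at 9:30 AM on 2016-05-01."
    for start, end, entity_type, value in extract_entities(sample):
        print(f"{entity_type:<9}{start:>3}-{end:<3} {value}")
```

In this sketch, rules and the dictionary are applied independently and their results are merged by position. A real recognizer would also need a policy for overlapping matches, which is left out here.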
The CRF model implementation supports English, simplified Chinese, and traditional Chinese text. The maximum entropy model implementation supports only English text.
The NER functions that use the maximum entropy model are documented in NER Functions (Max Entropy Model Implementation).