7.00.02 - Background - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Content Type: Programming Reference, User Guide
Publication ID: B700-1022-700K
Language: English (United States)
Last Update: 2018-04-17

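Lemmatization is a basic text analysis technique that determines the lemma (the standard, or dictionary, form) of each word, so that all forms of a word can be grouped together. Grouping the forms improves the accuracy of text analysis, because counts and matches are no longer split across variants of the same word. For example, "went" and "going" both have the lemma "go".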

The TextMorph function implements a lemmatization algorithm based on the WordNet 3.0 dictionary, which is packaged with the function. If an input word is in the dictionary, the function outputs its morphs with their parts of speech; otherwise, the function outputs the input word itself and sets its part of speech to NULL.
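The lookup-with-fallback behavior described above can be sketched in Python. This is only an illustration, not the function's implementation: `TOY_DICTIONARY` and `lookup_morphs` are hypothetical names, the three entries are hand-written stand-ins for the WordNet 3.0 dictionary, and Python's `None` stands in for SQL NULL.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for the packaged dictionary:
# input word -> list of (morph, part of speech) pairs.
TOY_DICTIONARY: Dict[str, List[Tuple[str, str]]] = {
    "regression": [("regression", "noun")],
    "went": [("go", "verb")],
    "better": [("good", "adj")],
}


def lookup_morphs(word: str) -> List[Tuple[str, Optional[str]]]:
    """Return (morph, pos) pairs for a word found in the dictionary.

    A word that is not in the dictionary is returned unchanged,
    with its part of speech set to None (NULL in the SQL output).
    """
    entries = TOY_DICTIONARY.get(word)
    if entries:
        return list(entries)
    return [(word, None)]


known = lookup_morphs("went")
unknown = lookup_morphs("xyzzy")
```

Here `known` holds the dictionary entry for "went", while `unknown` holds the input word itself paired with `None`, mirroring the NULL part of speech for words that are not in the dictionary.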

When an input word has multiple morphs, the function outputs them in order of the precedence of their parts of speech: noun, verb, adjective (adj), and adverb (adv). That is, if an input word has a noun form, that form is listed first. If the same word has a verb form, it is listed next, and so on.
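The ordering rule amounts to a sort on a fixed part-of-speech ranking. The following Python sketch shows the idea under stated assumptions: `POS_PRECEDENCE` and `order_morphs` are hypothetical names, and the sample morphs for "left" are hand-picked for illustration rather than taken from actual function output.

```python
from typing import List, Tuple

# Precedence from the text: noun first, then verb, adj, adv.
POS_PRECEDENCE = {"noun": 0, "verb": 1, "adj": 2, "adv": 3}


def order_morphs(morphs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (morph, pos) pairs by part-of-speech precedence.

    sorted() is stable, so morphs that share a part of speech
    keep their original relative order.
    """
    return sorted(morphs, key=lambda pair: POS_PRECEDENCE[pair[1]])


# Illustrative, deliberately unordered morphs for the input word "left".
sample = [("left", "adv"), ("leave", "verb"), ("left", "adj"), ("left", "noun")]
ordered = order_morphs(sample)
```

After sorting, `ordered` lists the noun morph first, followed by the verb, adj, and adv morphs, which matches the precedence described above.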