Background - Aster Analytics

Teradata Aster® Analytics Foundation User Guide Update 2

Product: Aster Analytics
Release Number: 7.00.02
Published: September 2017
Language: English (United States)
Last Update: 2018-04-17

Lemmatization is a basic text analysis tool that determines the lemmas (standard forms) of words, so that all forms of a word can be grouped together, improving the accuracy of text analysis.

The TextMorph function implements a lemmatization algorithm based on the WordNet 3.0 dictionary, which is packaged with the function. If an input word is in the dictionary, the function outputs its morphs with their parts of speech; otherwise, the function outputs the input word itself and sets its part of speech to NULL.
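The lookup-with-fallback behavior described above can be sketched in a few lines of Python. This is only an illustration, not the TextMorph implementation: the `TOY_DICTIONARY` table and the `lookup_morphs` helper are hypothetical names, the entries are a hand-written stand-in for the WordNet 3.0 dictionary, and Python's `None` plays the role of NULL.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for the WordNet 3.0 dictionary:
# input word -> list of (morph, part of speech). Entries are illustrative only.
TOY_DICTIONARY: Dict[str, List[Tuple[str, str]]] = {
    "geese": [("goose", "noun")],
    "ran": [("run", "verb")],
    "happier": [("happy", "adj")],
}


def lookup_morphs(word: str) -> List[Tuple[str, Optional[str]]]:
    """Return the morphs of `word` with their parts of speech.

    If the word is not in the dictionary, return the word itself
    with its part of speech set to None (standing in for NULL).
    """
    entries = TOY_DICTIONARY.get(word)
    if entries:
        return list(entries)
    return [(word, None)]


known = lookup_morphs("geese")     # word found in the toy dictionary
unknown = lookup_morphs("xyzzy")   # word not found: echoed back with None
```

In this sketch an unknown word is never dropped; it passes through unchanged, which mirrors the fallback rule stated above.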

When an input word has multiple morphs, the function outputs them in order of the precedence of their parts of speech: noun, verb, adjective (adj), and adverb (adv). That is, if an input word has a noun form, it is listed first; if the same word has a verb form, it is listed next; and so on.
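The precedence rule amounts to a sort keyed on part of speech. The following Python sketch shows one way to express it; the `POS_PRECEDENCE` table, the `order_morphs` helper, and the sample candidates are assumptions made for illustration and are not taken from the TextMorph function or its dictionary.

```python
from typing import List, Tuple

# Precedence stated in the text: noun first, then verb, adj, adv.
POS_PRECEDENCE = {"noun": 0, "verb": 1, "adj": 2, "adv": 3}


def order_morphs(morphs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (morph, part of speech) pairs by part-of-speech precedence.

    sorted() is stable, so morphs that share a part of speech keep
    their original relative order.
    """
    return sorted(morphs, key=lambda pair: POS_PRECEDENCE[pair[1]])


# Hypothetical morphs for one input word, deliberately out of order.
candidates = [("well", "adv"), ("better", "verb"), ("good", "adj"), ("better", "noun")]
ordered = order_morphs(candidates)
# ordered lists the noun first, then the verb, the adjective, and the adverb.
```

A lookup table is used here rather than a chain of comparisons so that the precedence order lives in one place.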