Text Analysis - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product
Aster Analytics
Release Number
6.21
Published
November 2016
Language
English (United States)
Last Update
2018-04-14
Product Category
Software
Text Analysis Functions
| Function | Description |
| --- | --- |
| LDA Functions | Build a topic model based on the supplied training data and parameters, estimate the topic distribution for each document based on the generated model, and display information from the model. The LDA functions are LDATrainer, LDAInference, and LDATopicPrinter. |
| Levenshtein Distance (LDist) | Computes the Levenshtein distance between two text values, that is, the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one string into the other. |
| Naive Bayes Text Classifier | Uses the Naive Bayes algorithm to classify data objects. The Naive Bayes Text Classifier is composed of the functions NaiveBayesTextClassifierTrainer and NaiveBayesTextClassifierPredict. |
| NER Functions (CRF Model Implementation) | Use the Conditional Random Fields (CRF) model to specify how to extract features (for example, person, location, and organization) when training data models. Train, evaluate, and apply models. These NER functions are NERTrainer, NER, and NEREvaluator. |
| NER Functions (Max Entropy Model Implementation) | Use the Max Entropy model to specify how to extract features (for example, person, location, and organization) when training data models. Train, evaluate, and apply models. These NER functions are FindNamedEntity, TrainNamedEntityFinder, and EvaluateNamedEntityFinder. |
| nGram | Tokenizes (splits) an input stream and emits n-grams based on the specified delimiter and reset parameters. Useful for sentiment analysis, topic identification, and document classification. |
| POSTagger | Tags the parts of speech of the input text. |
| Sentenizer | Extracts the sentences from the input paragraphs. |
| Sentiment Extraction Functions | Deduce user opinion (positive, negative, or neutral) from text. The sentiment extraction functions are TrainSentimentExtractor, ExtractSentiment, and EvaluateSentimentExtractor. |
| Text Classifier | Chooses the correct class label for a given text. Text Classifier is composed of the functions TextClassifierTrainer, TextClassifier, and TextClassifierEvaluator. |
| Text_Parser | Tokenizes a stream of words, optionally stems them, and outputs the individual words and their counts. |
| TextChunker | Divides text into phrases and assigns each phrase a tag identifying its type. |
| TextMorph | Provides lemmatization, a basic tool in text analysis. Outputs a standard form of the input words. |
| TextTagging | Tags input tuples according to user-defined rules that use logical and text processing operators. |
| TextTokenizer | Extracts tokens (for example, words, punctuation marks, and numbers) from text. |
| TF_IDF | Evaluates the importance of a word within a specific document, discounted by how common the word is across the entire document set. |
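
To make three of the descriptions above more concrete, the following Python sketch illustrates the underlying techniques: Levenshtein distance, n-gram generation, and TF-IDF scoring. It is not the Aster implementation and does not use Aster SQL syntax. The function names (`levenshtein`, `ngrams`, `tf_idf`) are hypothetical, tokenization is assumed to be lowercase whitespace splitting, and the TF-IDF variant shown (relative term frequency multiplied by the natural log of document count over document frequency) is one common textbook formula, assumed here for illustration only.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between a[:i-1] and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def ngrams(text: str, n: int) -> list[str]:
    """Split text on whitespace (an assumed delimiter) and return
    every run of n consecutive tokens, joined by single spaces."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs: list[str]) -> list[dict[str, float]]:
    """Return one {term: score} mapping per document, using the
    assumed variant tf * ln(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores
```

For example, `levenshtein("kitten", "sitting")` returns 3 (two substitutions and one insertion). Under this TF-IDF variant, a term that occurs in every document scores 0, which reflects the idea that words common to the whole document set carry little weight for any single document.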