Defining Text Parameters and Running Analysis - Teradata Analytic Apps

Vantage Analyst with Machine Learning Engine User Guide

Product

Teradata Analytic Apps

Vantage Analyst

Release Number

1.1

Published

December 2019

Language

English (United States)

Last Update

2020-08-06

dita:id

B035-3805

Product Category

Teradata Vantage™

Prerequisite: The database user for Text must have CREATE TABLE permission.

On the Select system page, select the System and your authentication method:

Option	Description
Use current session	Use the current user's session.
Enter credentials	Enter the Username and Password of a user with access to the selected system.

Define the parameters for analysis.

Parameter	Description
Data source	Select the Database and Table for the analysis. You must also specify: Text column: Select a column that contains the text to analyze. Identifier column: Select a column as the identifier for each text element. Sample: Select the sample size.
Data filter	[Optional] Use these fields to include or exclude specific terms from the analysis: Filter column: Select the column from the Table, then enter the text to filter against. Words or phrases to exclude: Enter text (stop words) to exclude.
Analysis type	Select the type of analysis to perform: Sentiment: Infer positive, negative, or neutral user sentiment. Key Terms: Extract the most common keywords.
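
As a rough illustration of what the Data filter fields do conceptually, the following Python sketch keeps only the rows whose filter column contains a given text, then drops stop words from the text column. The row layout, column names, and function names are hypothetical, and substring matching is an assumption. This is not the SQL that the application generates.

```python
def filter_rows(rows, filter_column, filter_text):
    """Keep rows whose filter column contains filter_text (case-insensitive).

    Substring matching is an assumption made for this sketch.
    """
    needle = filter_text.lower()
    return [row for row in rows if needle in str(row[filter_column]).lower()]


def remove_stop_words(text, stop_words):
    """Drop whitespace-separated tokens that appear in stop_words."""
    blocked = {word.lower() for word in stop_words}
    return " ".join(tok for tok in text.split() if tok.lower() not in blocked)


# Hypothetical sample data: identifier, filter, and text columns.
rows = [
    {"id": 1, "product": "Phone X", "review": "The battery is great"},
    {"id": 2, "product": "Tablet Y", "review": "The screen is too dim"},
]

kept = filter_rows(rows, "product", "phone")
cleaned = [remove_stop_words(r["review"], ["the", "is"]) for r in kept]
print(cleaned)
```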

For Sentiment analysis, use the following options to configure the analysis:

Parameter	Description
Level of Analysis	Analyze the document (default) or each sentence in the document.
High Priority	Specify which results receive the highest priority: None (default): Give all results the same priority. Negative Recall: Give highest priority to negative results, including those with lower-confidence sentiment classifications (maximizes the number of negative results returned). Negative Precision: Give highest priority to negative results with high-confidence sentiment classifications. Positive Recall: Give highest priority to positive results, including those with lower-confidence sentiment classifications (maximizes the number of positive results returned). Positive Precision: Give highest priority to positive results with high-confidence sentiment classifications.
Filter	Return all (default) results, only positive results, or only negative results.
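
The High Priority options trade recall against precision. One common way to picture that trade-off is a confidence threshold: a low threshold returns more negative results (higher recall) but lets in more misclassifications, while a high threshold returns fewer, more reliable results (higher precision). The sketch below uses made-up scores and labels. The thresholds, data, and function name are assumptions, and it does not describe how the product implements these options.

```python
def precision_recall(scored, threshold):
    """Compute precision and recall for the 'negative' class.

    scored: list of (confidence_negative, is_actually_negative) pairs.
    An item is predicted negative when its confidence >= threshold.
    """
    tp = sum(1 for conf, truth in scored if conf >= threshold and truth)
    fp = sum(1 for conf, truth in scored if conf >= threshold and not truth)
    fn = sum(1 for conf, truth in scored if conf < threshold and truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up classifier confidences that a document is negative.
scored = [
    (0.95, True),
    (0.80, True),
    (0.60, False),
    (0.55, True),
    (0.30, False),
    (0.20, True),
]

# A lower threshold favors recall; a higher threshold favors precision.
recall_first = precision_recall(scored, threshold=0.5)
precision_first = precision_recall(scored, threshold=0.75)
print(recall_first, precision_first)
```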

For Key Terms analysis, use the following options to configure the analysis:

Parameter	Description
Analysis Types	Select the specific analysis type: Cleanse and Gram, nGram, or Text Parser.
Options	Depending on the selected analysis type, additional options may be available: Stemming: Specify whether similar words are grouped by their stem. When enabled, words such as like, liked, and likes are analyzed together. nGram options: Specify the minimum and maximum size of the n-grams. Overlap: Specify whether the analysis allows overlapping n-grams. When enabled (default), each word in each sentence starts an n-gram. Use TF-IDF: Use term frequency-inverse document frequency to evaluate the importance of specific terms. If enabled, you must select the specific TF-IDF formula to use.
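
To make the nGram size, Overlap, and TF-IDF options more concrete, here is a minimal Python sketch. It assumes that with Overlap disabled the window advances by n words, and it uses tf * log(N / df), which is only one common TF-IDF variant. The function names and sample text are hypothetical, and the formulas offered in the application may differ.

```python
import math
from collections import Counter


def ngrams(sentence, n_min, n_max, overlap=True):
    """Return word n-grams of every size from n_min to n_max.

    With overlap=True each word starts an n-gram; with overlap=False the
    window advances by n words (an assumption about the disabled case).
    """
    words = sentence.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        step = 1 if overlap else n
        for start in range(0, len(words) - n + 1, step):
            grams.append(" ".join(words[start:start + n]))
    return grams


def tf_idf(documents):
    """Score terms with tf * log(N / df), one common TF-IDF variant."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


overlapping = ngrams("the quick brown fox", 2, 2)
non_overlapping = ngrams("the quick brown fox", 2, 2, overlap=False)
print(overlapping)
print(non_overlapping)

# Unigrams from three short, made-up documents.
docs = [ngrams(text, 1, 1) for text in ["good service", "good price", "slow service"]]
scores = tf_idf(docs)
print(scores[1])
```

In this sample, "price" appears in only one document, so it scores higher than "good", which appears in two.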

[Optional] Click SHOW SQL to preview the generated SQL for the analysis.
1. Select a Database, enter a unique Prefix to identify the output table created by the analysis, and click VIEW SQL.
2. Review the generated SQL and click COPY to copy to your PC.
  You can then paste the generated SQL into a SQL node in a workflow. See Using Text Results in Workflows.
Click RUN.
If you make changes to the parameters, you must select RUN again to see the updated results.
Select a Database, enter a unique Prefix to identify the output table created by the analysis, and click RUN.
Standard character requirements for Advanced SQL Engine tables apply.
If you enter a prefix used for existing tables, those tables are overwritten.