SentimentExtractor
Description
The td_sentiment_extractor_sqle()
function uses a dictionary model
to extract the sentiment (positive, negative, or neutral)
of each input document or sentence.
The dictionary model consists of WordNet, a lexical database
of the English language, and a set of negation words such as
no, not, neither, and never.
The function handles negated sentiments as follows:
-1 if the sentiment is negated (for example, "I am not happy")
-1 if the sentiment and a negation word are separated by one word (for example, "I am not very happy")
+1 if the sentiment and a negation word are separated by two or more words (for example, "I am not saying I am happy")
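The distance rule above can be sketched in plain R for a single positive sentiment word. This is only an illustration and not the dictionary model used by td_sentiment_extractor_sqle(): the helper name score_positive_word, the short negation list, the whitespace and punctuation tokenizer, and the choice to consider only negation words that precede the sentiment word are all assumptions made for the example.

```r
# Illustrative sketch only; names and tokenization are hypothetical.
negation_words <- c("no", "not", "neither", "never")

score_positive_word <- function(sentence, sentiment_word) {
  # Split on anything that is not a letter or apostrophe, then lowercase.
  tokens <- tolower(strsplit(sentence, "[^A-Za-z']+")[[1]])
  tokens <- tokens[nzchar(tokens)]

  # Position of the first occurrence of the sentiment word.
  pos <- match(tolower(sentiment_word), tokens)
  if (is.na(pos)) return(0L)

  # Assumption: only negation words before the sentiment word matter.
  neg_pos <- which(tokens %in% negation_words)
  neg_pos <- neg_pos[neg_pos < pos]
  if (length(neg_pos) == 0) return(1L)

  # Words between the nearest preceding negation and the sentiment word.
  gap <- pos - max(neg_pos) - 1L
  if (gap <= 1L) -1L else 1L
}

score_positive_word("I am not happy", "happy")
score_positive_word("I am not very happy", "happy")
score_positive_word("I am not saying I am happy", "happy")
```

The three calls correspond to the three cases listed above: a gap of zero or one word between the negation and the sentiment word flips the score to -1, and a gap of two or more words leaves it at +1.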
Notes:
This function requires the UTF8 client character set for UNICODE data.
This function does not support Pass Through Characters (PTCs).
For information about PTCs, see Teradata Vantage™ - Analytics Database International Character Set Support.
This function does not support KanjiSJIS or Graphic data types.
Only the English language is supported.
The maximum supported length of a sentiment word in the dictionary data is 128 characters.
The maximum length of the sentiment_words output column is 32000 characters. If the sentiment_words output column value exceeds this limit, an ellipsis (...) is displayed at the end of the string.
The maximum length of the content output column is 32000 characters; that is, the maximum supported length of a sentence is 32000 characters.
A sentiment phrase can contain up to 10 words.
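As a rough pre-check for the dictionary limits in the notes above, entries can be screened locally before they are loaded as custom dictionary data. The sketch below is an assumption-laden illustration: check_dictionary_entries is a hypothetical helper, words are counted by splitting on whitespace, and the 128-character limit is applied to each whole entry, which may not match how the function itself counts.

```r
# Hypothetical helper; not part of tdplyr.
check_dictionary_entries <- function(entries, max_chars = 128L, max_words = 10L) {
  entries <- as.character(entries)
  # Character count of each entry.
  n_chars <- nchar(entries, type = "chars")
  # Word count, assuming words are separated by whitespace.
  n_words <- lengths(strsplit(trimws(entries), "[[:space:]]+"))
  data.frame(
    entry = entries,
    n_chars = n_chars,
    n_words = n_words,
    ok = n_chars <= max_chars & n_words <= max_words,
    stringsAsFactors = FALSE
  )
}

entries <- c("good",
             "not up to the mark",
             paste(rep("a", 11), collapse = " "),  # 11 words
             strrep("x", 129))                     # 129 characters
check_dictionary_entries(entries)
```

Rows with ok equal to FALSE exceed at least one of the two limits under these assumptions and would be worth reviewing before the dictionary is used.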
Usage
td_sentiment_extractor_sqle (
    data = NULL,
    cust.dict = NULL,
    add.dict = NULL,
    text.column = NULL,
    accumulate = NULL,
    analysis.type = "DOCUMENT",
    priority = "NONE",
    output.type = "ALL",
    ...
)
Arguments
data | Required Argument.
cust.dict | Optional Argument.
add.dict | Optional Argument.
text.column | Required Argument.
accumulate | Optional Argument.
analysis.type | Optional Argument. Default Value: "DOCUMENT"
priority | Optional Argument. Default Value: "NONE"
output.type | Optional Argument. Default Value: "ALL"
... | Specifies the generic keyword arguments that SQLE functions accept. The generic keyword arguments include volatile. The function also allows the user to partition, hash, order, or local order the input data. These generic arguments are available for each argument that accepts tbl_teradata as input.
Value
The function returns an object of class "td_sentiment_extractor_sqle",
which is a named list containing objects of class "tbl_teradata".
Named list members can be referenced directly with the "$" operator
using the following names:
result
output.dictionary.data
Examples
# Get the current context/connection.
con <- td_get_context()$connection
# Load the example data.
loadExampleData("sentimentextractor_example", "sentiment_extract_input",
                "sentiment_word_input",
                "additional_table")
# Create tbl_teradata object.
sentiment_extract_input <- tbl(con, "sentiment_extract_input")
sentiment_word_input <- tbl(con, "sentiment_word_input")
additional_table <- tbl(con, "additional_table")
# Check the list of available analytic functions.
display_analytic_functions()
# Example 1 : Extracting the sentiment (positive, negative, or neutral)
# of each input document or sentence.
sentimentextractor_out <- td_sentiment_extractor_sqle(text.column="review",
                                                      data=sentiment_extract_input)
# Print the result.
print(sentimentextractor_out$result)
print(sentimentextractor_out$output.dictionary.data)
# Example 2 : Extracting the sentiment (positive, negative, or neutral)
# of each input document by specifying custom dictionary data
# and adding additional entries to custom dictionary data.
sentimentextractor_out_1 <- td_sentiment_extractor_sqle(
    text.column="review",
    accumulate=c('id', 'product'),
    analysis.type="DOCUMENT",
    priority="NONE",
    output.type="ALL",
    data=sentiment_extract_input,
    cust.dict=sentiment_word_input,
    add.dict=additional_table)
# Print the result.
print(sentimentextractor_out_1$result)
print(sentimentextractor_out_1$output.dictionary.data)