Description
The POSTagger function creates part-of-speech (POS) tags for the words in the input text. POS tagging is the first step in the syntactic analysis of a language, and an important preprocessing step in many natural language processing applications.
Usage
td_pos_tagger_mle(
  data = NULL,
  text.column = NULL,
  language = "en",
  accumulate = NULL,
  data.sequence.column = NULL,
  data.order.column = NULL
)
Arguments
data | Required Argument.
data.order.column | Optional Argument.
text.column | Required Argument.
language | Optional Argument.
accumulate | Optional Argument.
data.sequence.column | Optional Argument.
Value
Function returns an object of class "td_pos_tagger_mle", which is a
named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator
using the name: result.
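The shape of the returned object can be sketched without a database connection. The snippet below builds a stand-in list with the same class and member name, then reads the member with "$". It is only an illustration: a real call returns a "tbl_teradata" in result rather than a local data frame, and the column names word and pos_tag, like the two sample rows, are invented here and are not taken from the function's actual output.

```r
# Stand-in for the object returned by td_pos_tagger_mle().
# Hypothetical: the real $result is a "tbl_teradata", and the column
# names and rows below are made up for illustration.
mock_out <- structure(
  list(result = data.frame(word = c("the", "dog"),
                           pos_tag = c("DT", "NN"),
                           stringsAsFactors = FALSE)),
  class = "td_pos_tagger_mle"
)

# List the members of the named list, then read one with `$`.
names(mock_out)
tagged <- mock_out$result
nrow(tagged)
```

With a real result, the same `$result` access yields the "tbl_teradata" that holds the tagged words.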
Examples
# Get the current context/connection
con <- td_get_context()$connection
# Load example data.
loadExampleData("pos_tagger_example", "paragraphs_input")
# Create object(s) of class "tbl_teradata".
paragraphs_input <- tbl(con, "paragraphs_input")
# Example 1 - Applying POSTagger using default language 'en'.
pos_tagger_out <- td_pos_tagger_mle(data = paragraphs_input,
                                    text.column = 'paratext',
                                    language = 'en',
                                    accumulate = 'paraid')
# Example 2 - Using the output of td_sentence_extractor_mle() as input.
td_sentence_extractor_out <- td_sentence_extractor_mle(data = paragraphs_input,
                                                       text.column = "paratext",
                                                       accumulate = c("paraid", "paratopic"))
pos_tagger_out <- td_pos_tagger_mle(data = td_sentence_extractor_out$result,
                                    text.column = 'sentence',
                                    accumulate = c('sentence', 'sentence_sn'))