Description
The NaiveBayesTextClassifierTrainer function takes training data as input and outputs a model as a tbl_teradata object.
Usage
td_naivebayes_textclassifier_mle (
  data = NULL,
  data.partition.column = NULL,
  token.column = NULL,
  doc.id.columns = NULL,
  doc.category.column = NULL,
  model.type = "MULTINOMIAL",
  categories.data = NULL,
  category.column = "[0:0]",
  prediction.categories = NULL,
  stopwords.data = NULL,
  stopwords.column = NULL,
  stopwords.list = NULL,
  data.sequence.column = NULL,
  stopwords.data.sequence.column = NULL,
  categories.data.sequence.column = NULL,
  data.order.column = NULL,
  stopwords.data.order.column = NULL,
  categories.data.order.column = NULL
)
Arguments
data
Required Argument.
data.partition.column
Required Argument.
data.order.column
Optional Argument.
token.column
Required Argument.
doc.id.columns
Optional Argument. Required when "model.type" is 'BERNOULLI'.
doc.category.column
Required Argument.
model.type
Optional Argument.
categories.data
Optional Argument.
categories.data.order.column
Optional Argument.
category.column
Optional Argument.
prediction.categories
Optional Argument.
stopwords.data
Optional Argument.
stopwords.data.order.column
Optional Argument.
stopwords.column
Optional Argument.
stopwords.list
Optional Argument.
data.sequence.column
Optional Argument.
stopwords.data.sequence.column
Optional Argument.
categories.data.sequence.column
Optional Argument.
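The requirement that "doc.id.columns" be supplied when "model.type" is 'BERNOULLI' follows from how the two Naive Bayes variants are generally defined: a multinomial model counts every token occurrence, while a Bernoulli model counts the documents in which a token appears. The base R sketch below illustrates that general distinction on a small invented data frame. It is not the function's internal implementation, and the rows and the helper names (tokens, multinomial_counts, bernoulli_counts) are assumptions made for illustration; only the column names mirror the example data used on this page.

```r
# Toy stand-in for a token table. Column names mirror the "token_table"
# example; the rows are invented for illustration only.
tokens <- data.frame(
  doc_id   = c(1, 1, 1, 2, 2, 3),
  token    = c("goal", "goal", "match", "goal", "vote", "vote"),
  category = c("sports", "sports", "sports", "sports", "sports", "politics"),
  stringsAsFactors = FALSE
)

# Multinomial view: count every token occurrence per category.
# No document identifier is needed for this count.
multinomial_counts <- aggregate(
  n ~ category + token,
  data = transform(tokens, n = 1),
  FUN = sum
)

# Bernoulli view: count the distinct documents per category that contain
# the token. Duplicates within a document are dropped first, which is only
# possible when a document identifier is available.
distinct_rows <- unique(tokens[c("doc_id", "token", "category")])
bernoulli_counts <- aggregate(
  n ~ category + token,
  data = transform(distinct_rows, n = 1),
  FUN = sum
)

print(multinomial_counts)
print(bernoulli_counts)
```

In this toy data the token "goal" occurs three times in the "sports" category but in only two distinct documents, so the two views give different counts for it.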
Value
The function returns an object of class "td_naivebayes_textclassifier_mle",
which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator
using the name: result.
Examples
# Get the current context/connection
con <- td_get_context()$connection
# Load example data.
loadExampleData("naivebayes_textclassifier_example", "token_table")
# Create object(s) of class "tbl_teradata".
token_table <- tbl(con, "token_table")
# Example 1 - Train a Bernoulli model. "doc.id.columns" is required
# for this model type.
naivebayes_textclassifier_out <- td_naivebayes_textclassifier_mle(
  data = token_table,
  data.partition.column = c("category"),
  token.column = "token",
  doc.id.columns = c("doc_id"),
  doc.category.column = "category",
  model.type = "Bernoulli"
)