Description
The NamedEntityFinderTrainer (td_namedentity_finder_trainer_mle) function takes
training data and outputs a Maximum Entropy data model. The function is based
on OpenNLP and follows its annotation conventions. For more information on OpenNLP,
see http://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html.
Usage
td_namedentity_finder_trainer_mle (
    data = NULL,
    text.column = NULL,
    entity.type = NULL,
    model.file = NULL,
    iter.num = 100,
    cutoff = 5,
    data.sequence.column = NULL
)
Arguments
data | Required Argument.
text.column | Required Argument.
entity.type | Required Argument.
model.file | Required Argument.
iter.num | Optional Argument.
cutoff | Optional Argument.
data.sequence.column | Optional Argument.
Value
The function returns an object of class "td_namedentity_finder_trainer_mle",
which is a named list containing a Teradata tbl object.
The named list member can be referenced directly with the "$" operator
using the name: output.
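The access pattern described above can be sketched in plain R without a database connection. This is only an illustration: mock_out is a hypothetical stand-in built by hand, and a local data.frame takes the place of the Teradata tbl object that the real function would return.

```r
# Hypothetical stand-in for the returned object: a named list carrying the
# class attribute described above. A local data.frame replaces the Teradata
# tbl object (assumption for illustration only).
mock_out <- structure(
  list(output = data.frame(message = "placeholder row",
                           stringsAsFactors = FALSE)),
  class = "td_namedentity_finder_trainer_mle"
)

# Reference the list member directly with the "$" operator.
result_tbl <- mock_out$output

# Inspect the class of the object and the names of its members.
class(mock_out)
names(mock_out)
```

With a real result, the same "$" access would apply to the object returned by td_namedentity_finder_trainer_mle, for example td_neft_out$output.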
Examples
# Get the current context/connection
con <- td_get_context()$connection

# Load example data.
loadExampleData("namedentityfindertrainer_example", "nermem_sports_train")

# Create remote tibble objects.
nermem_sports_train <- tbl(con, "nermem_sports_train")

# Example: Train a namedentity finder model on entity type: "LOCATION"
# The trained model is stored in a binary file: "location.sports"
td_neft_out <- td_namedentity_finder_trainer_mle(data = nermem_sports_train,
                                                 text.column = "content",
                                                 entity.type = "LOCATION",
                                                 model.file = "location.sports"
                                                 )