Description
TF (Term Frequency) is used in conjunction with the TF-IDF (Term Frequency - Inverse Document Frequency) function. TF-IDF is a technique for weighting words in a document. The resulting weights can be combined in a vector space model and used as input for document clustering or classification algorithms. To compute TF-IDF values, the TF_IDF function relies on the TF function, which computes the TF value of the input.
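To make the idea of a term frequency concrete, the following base R sketch computes a normalized TF on a small in-memory data frame. It assumes the common definition in which a term's raw count is divided by the total count of all terms in the same document; whether this matches exactly what td_tf_mle computes for formula = "normal" is an assumption, not something stated on this page. The data frame and its columns term and count are hypothetical, and no Teradata connection is involved.

```r
# Toy token counts: one row per (document, term) pair.
# Column names "term" and "count" are illustrative only.
tokens <- data.frame(
  docid = c(1, 1, 1, 2, 2),
  term  = c("apple", "banana", "cherry", "apple", "date"),
  count = c(2, 1, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Total number of term occurrences per document, repeated on every
# row that belongs to that document.
doc_total <- ave(tokens$count, tokens$docid, FUN = sum)

# Normalized term frequency (assumed definition): raw count divided
# by the total count of the document.
tokens$tf <- tokens$count / doc_total

print(tokens)
```

Grouping by docid here plays the same role as data.partition.column in the function call: each document is processed independently, so the TF values within one document sum to 1.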
Usage
td_tf_mle (
data = NULL,
formula = "normal",
data.sequence.column = NULL,
data.partition.column = NULL,
data.order.column = NULL
)
Arguments
data |
Required Argument. |
data.partition.column |
Required Argument. |
data.order.column |
Optional Argument. |
formula |
Optional Argument. |
data.sequence.column |
Optional Argument. |
Value
Function returns an object of class "td_tf_mle", which is a named list
containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator
using the name: result.
Examples
# Get the current context/connection
con <- td_get_context()$connection
# Load example data.
loadExampleData("tf_example", "tfidf_input1")
# Create object(s) of class "tbl_teradata".
tfidf_input1 <- tbl(con, "tfidf_input1")
# Example 1 - Calculate TF values using an input tbl_teradata containing tokens and their counts in
# all documents.
tf_out <- td_tf_mle(tfidf_input1, data.partition.column="docid")
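# Inspect the computed TF values through the "result" member of the returned list
# (see Value). This continues Example 1 and needs the same live connection.
tf_out$result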