td_kmeans_predict_valib | Teradata Package for R Function Reference 17.00

Teradata® Package for R Function Reference

Product: Teradata Package for R
Release Number: 17.00
Published: July 2021
Language: English (United States)
Last Update: 2023-08-08
dita:id: B700-4007
Product Category: Teradata Vantage
K-Means Clustering Predict.

Description

The function predicts clusters for test data using the cluster model generated by the VALIB td_kmeans_valib() function. It generates a tbl_teradata object containing a progress report with two columns: a timestamp and a progress message.

Usage

td_kmeans_predict_valib(model, data, ...)

Arguments

model

Required Argument.
Specifies the input containing the K-Means model to use in scoring. This must be the "result" tbl_teradata generated by td_kmeans_valib(), or a tbl_teradata created on a table generated by the 'KMEANS' function of the Vantage Analytic Library.
Types: tbl_teradata

data

Required Argument.
Specifies the input data for which clusters are to be predicted.
Types: tbl_teradata

...

Specifies other arguments supported by the function as described in the 'Other Arguments' section.

Value

The function returns an object of class "td_kmeans_predict_valib", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result.

Other Arguments

cluster.column

Optional Argument.
Specifies the name of the column representing cluster identifier.
Default Value: "clusterid"
Types: character

index.columns

Optional Argument.
Specifies the name(s) of the column(s) in the input tbl_teradata to use as the primary index of the scored output tbl_teradata.
Types: character OR vector of Strings (character)

accumulate

Optional Argument.
Specifies the name(s) of the column(s) from the input tbl_teradata that can be passed along to the output tbl_teradata.
Types: character OR vector of Strings (character)

fallback

Optional Argument.
Specifies whether the scored output tbl_teradata has the fallback attribute (that is, has a mirrored copy). Set to TRUE to enable fallback.
Default Value: FALSE
Types: logical

operator.database

Optional Argument.
Specifies the database where the table operators called by Vantage Analytic Library reside. If not specified, the library searches the standard search path for table operators, including the current database.
Types: character

Examples


# Notes:
#   1. To execute Vantage Analytic Library functions, set the option
#      'val.install.location' to the name of the database where the Vantage
#      Analytic Library functions are installed.
#   2. Datasets used in these examples can be loaded using the Vantage
#      Analytic Library installer.

# Set the option 'val.install.location'.
options(val.install.location = "SYSLIB")

# Get remote data source connection.
con <- td_get_context()$connection

# Create an object of class "tbl_teradata".
cust <- tbl(con, "customer_analysis")
print(cust)

# Example 1: Shows how K-Means cluster prediction is performed.
# First generate the model using the td_kmeans_valib() function.
kmeans_obj <- td_kmeans_valib(data=cust,
                              columns=c("avg_cc_bal", "avg_ck_bal", "avg_sv_bal"),
                              centers=3)

# Use kmeans result tbl_teradata (from above step) to predict the clusters 
# for the data in tbl 'cust'.
obj <- td_kmeans_predict_valib(data=cust,
                               model=kmeans_obj$result,
                               cluster.column="clusterid",
                               index.columns="cust_id",
                               fallback=FALSE,
                               accumulate=c("cust_id", "age", "city_name", "state_code", "gender"))
# Print the results.
print(obj$result)

# Score using S3 predict function and the model generated above.
obj <- predict(object=kmeans_obj,
               data=cust)

# Print the results.
print(obj$result)
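
# Example 2: A minimal sketch of scoring with non-default optional arguments,
# reusing 'cust' and 'kmeans_obj' from Example 1.
# Assumptions (not taken from the example above): the two-column primary
# index and the reduced 'accumulate' list are illustrative choices only, and
# the target system is assumed to permit fallback tables.
obj2 <- td_kmeans_predict_valib(data=cust,
                                model=kmeans_obj$result,
                                index.columns=c("cust_id", "state_code"),
                                fallback=TRUE,
                                accumulate=c("cust_id", "state_code", "gender"))

# Print the results.
print(obj2$result)

# If the Vantage Analytic Library table operators are not on the standard
# search path, the 'operator.database' argument can additionally be passed
# with the name of the database that holds them.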