| |
- KMeansPredict(data, model, cluster_column='clusterid', index_columns=None, accumulate=None, fallback=False, operator_database=None)
- DESCRIPTION:
The function predicts clusters for test data using the cluster model generated
by the KMeans() VALIB function, and generates a teradataml DataFrame containing
a progress report with two columns: a timestamp and a progress message.
PARAMETERS:
data:
Required Argument.
Specifies the input data for which clusters are to be predicted.
Types: teradataml DataFrame
model:
Required Argument.
Specifies the model generated by the VALIB KMeans() function.
Types: teradataml DataFrame
cluster_column:
Optional Argument.
Specifies the name of the column representing the cluster identifier.
Default Value: "clusterid"
Types: str
index_columns:
Optional Argument.
Specifies the names of one or more columns in the input teradataml DataFrame
to use as the primary index of the scored output DataFrame.
Types: str OR list of Strings (str)
accumulate:
Optional Argument.
Specifies the names of one or more columns from the input teradataml DataFrame
to pass through unchanged to the output teradataml DataFrame.
Types: str OR list of Strings (str)
fallback:
Optional Argument.
Specifies whether the scored output teradataml DataFrame has the fallback
attribute, that is, whether a mirrored copy of it is kept. Set to True to enable it.
Default Value: False
Types: bool
operator_database:
Optional Argument.
Specifies the database where the table operators called by Vantage Analytic
Library reside. If not specified, the library searches the standard search path
for table operators, including the current database.
Types: str
RETURNS:
An instance of KMeansPredict.
Output teradataml DataFrames can be accessed using attribute references, such as
KMeansPredObj.<attribute_name>.
Output teradataml DataFrame attribute name is: result.
RAISES:
TeradataMlException, TypeError, ValueError
EXAMPLES:
# Notes:
# 1. To execute Vantage Analytic Library functions,
# a. import "valib" object from teradataml.
# b. set 'configure.val_install_location' to the database name where Vantage
# analytic library functions are installed.
# 2. Datasets used in these examples can be loaded using Vantage Analytic Library
# installer.
# Import valib object from teradataml to execute this function.
from teradataml import valib
# Set the 'configure.val_install_location' variable.
from teradataml import configure
configure.val_install_location = "SYSLIB"
# Create the required teradataml DataFrame.
from teradataml import DataFrame
df = DataFrame("customer_analysis")
print(df)
# Run KMeans() first.
kmeans_out = valib.KMeans(data=df,
columns=["avg_cc_bal", "avg_ck_bal", "avg_sv_bal"],
centers=3)
# Use the KMeans result teradataml DataFrame (from the step above) to predict the
# clusters for the data in DataFrame 'df'.
obj = valib.KMeansPredict(data=df,
model=kmeans_out.result,
cluster_column="clusterid",
index_columns="cust_id",
fallback=False,
accumulate=["cust_id", "age", "city_name", "state_code",
"gender"])
# Print the results.
print(obj.result)
|