Teradata Package for Python Function Reference | 17.10 - SVMSparse - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.

Product: Teradata Package for Python
Release Number: 17.10
Published: April 2022
Language: English (United States)
Last Update: 2022-08-19
Product Category: Teradata Vantage

teradataml.analytics.mle.SVMSparse = class SVMSparse(builtins.object)

Methods defined here:

__init__(self, data=None, sample_id_column=None, attribute_column=None, value_column=None, label_column=None, cost=1.0, bias=0.0, hash=False, hash_buckets=None, class_weights=None, max_step=100, epsilon=0.01, seed=0, data_sequence_column=None, force_mapreduce=False):
    DESCRIPTION:
        The SVMSparse function takes training data (in sparse format) and
        outputs a predictive model in binary format, which is input to the
        functions SVMSparsePredict and SVMSparseSummary.

    PARAMETERS:
        data:
            Required Argument.
            Specifies the teradataml DataFrame that contains the training
            samples.

        sample_id_column:
            Required Argument.
            Specifies the name of the column in "data" that contains the
            identifiers of the training samples.
            Types: str

        attribute_column:
            Required Argument.
            Specifies the name of the column in "data" that contains the
            attributes of the samples.
            Types: str

        value_column:
            Optional Argument. Required when teradataml is connected to
            Vantage version 1.3.
            Specifies the name of the column in "data" that contains the
            attribute values.
            Types: str

        label_column:
            Required Argument.
            Specifies the name of the column in "data" that contains the
            classes of the samples.
            Types: str

        cost:
            Optional Argument.
            Specifies the regularization parameter in the SVM soft-margin
            loss function. The value must be greater than 0.0.
            Default Value: 1.0
            Types: float

        bias:
            Optional Argument.
            Specifies a non-negative value. If the value is greater than
            zero, each sample x in the training set is converted to (x, b);
            that is, another dimension containing the bias value b is added.
            This argument addresses situations where not all samples center
            at 0.
            Default Value: 0.0
            Types: float

        hash:
            Optional Argument.
            Specifies whether to use hash projection on attributes. Hash
            projection can accelerate processing but can slightly decrease
            accuracy.
            Note: You must use hash projection if the dataset has more
                  features than fit into memory.
            Default Value: False
            Types: bool

        hash_buckets:
            Optional Argument. Valid only if "hash" is True.
            Specifies the number of buckets for hash projection. In most
            cases, the function can determine the appropriate number of
            buckets from the scale of the input dataset. However, if the
            dataset has a very large number of features, you might have to
            specify "hash_buckets" to accelerate the function.
            Types: int

        class_weights:
            Optional Argument.
            Specifies the weights for different classes. The format is:
            "classlabel m:weight m, classlabel n:weight n". If a weight is
            given for a class, the cost parameter for that class is
            weight * cost. A weight larger than 1 often increases the
            accuracy of the corresponding class; however, it may decrease
            global accuracy. Classes not assigned a weight in this argument
            are assigned a weight of 1.0.
            Types: str OR list of Strings (str)

        max_step:
            Optional Argument.
            Specifies the maximum number of iterations of the training
            process, as a positive integer. One step means that each sample
            is seen once by the trainer. The value must be in the range
            (0, 10000].
            Default Value: 100
            Types: int

        epsilon:
            Optional Argument.
            Specifies the termination criterion. When the difference between
            the values of the loss function in two sequential iterations is
            less than this number, the function stops. The value must be
            greater than 0.0.
            Default Value: 0.01
            Types: float

        seed:
            Optional Argument.
            Specifies a long integer value used to order the training set
            randomly and consistently. This value can be used to ensure that
            the same model is generated if the function is run multiple
            times in a given database with the same arguments. The value
            must be in the range [0, 9223372036854775807].
            Default Value: 0
            Types: int

        data_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each
            row of the input argument "data". The argument is used to ensure
            deterministic results for functions which produce results that
            vary from run to run.
            Types: str OR list of Strings (str)

        force_mapreduce:
            Optional Argument.
            Specifies whether the function is to use MapReduce. If set to
            'False', a lighter version of the function runs for faster
            results.
            Note:
                1. The model may differ between "force_mapreduce" set to
                   'True' and "force_mapreduce" set to 'False'.
                2. The "force_mapreduce" argument is supported only when
                   teradataml is connected to Vantage version 1.3.
            Default Value: False
            Types: bool

    RETURNS:
        Instance of SVMSparse.
        Output teradataml DataFrames can be accessed using attribute
        references, such as SVMSparseObj.<attribute_name>.
        Output teradataml DataFrame attribute names are:
            1. model_table
            2. output

    RAISES:
        TeradataMlException, TypeError, ValueError

    EXAMPLES:
        # Assumes an active Vantage connection (for example, one created
        # with create_context()).
        from teradataml import DataFrame, load_example_data
        from teradataml.analytics.mle import SVMSparse

        # Load the data to run the example.
        load_example_data("SVMSparse", "svm_iris_input_train")

        # Create teradataml DataFrame.
        svm_iris_input_train = DataFrame.from_table("svm_iris_input_train")

        # Example 1
        svm_sparse_out = SVMSparse(data=svm_iris_input_train,
                                   sample_id_column='id',
                                   attribute_column='attribute',
                                   label_column='species',
                                   value_column='value1',
                                   max_step=150,
                                   seed=0)

        # Print the result DataFrames.
        print(svm_sparse_out.model_table)
        print(svm_sparse_out.output)
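The "class_weights" and "bias" descriptions above can be illustrated with a small local sketch in plain Python. It does not call teradataml or Vantage, and it is not the function's in-database implementation. The helper names (parse_class_weights, effective_costs, add_bias), the "__bias__" key, and the class labels are all hypothetical and chosen only for illustration. The sketch assumes the documented rules: the per-class cost is weight * cost, unlisted classes get a weight of 1.0, and a positive bias adds one extra dimension holding b to each sparse sample.

```python
def parse_class_weights(class_weights):
    """Parse 'label:weight, label:weight' (or a list of 'label:weight') into a dict."""
    if isinstance(class_weights, str):
        items = class_weights.split(",")
    else:
        items = list(class_weights)
    weights = {}
    for item in items:
        # Split on the last colon so labels that contain ':' still parse.
        label, _, weight = item.strip().rpartition(":")
        weights[label.strip()] = float(weight)
    return weights


def effective_costs(labels, cost=1.0, class_weights=None):
    """Per-class cost is weight * cost; classes without a weight use 1.0."""
    weights = parse_class_weights(class_weights) if class_weights else {}
    return {label: weights.get(label, 1.0) * cost for label in labels}


def add_bias(sample, bias=0.0, bias_key="__bias__"):
    """Return (x, b): a copy of the sparse sample with one extra bias dimension."""
    if bias < 0:
        raise ValueError("bias must be non-negative")
    augmented = dict(sample)
    if bias > 0:
        # bias_key is a hypothetical attribute name for the added dimension.
        augmented[bias_key] = bias
    return augmented


# Hypothetical class labels and weights, for illustration only.
costs = effective_costs(
    ["setosa", "versicolor", "virginica"],
    cost=2.0,
    class_weights="setosa:1.5, virginica:0.5",
)
print(costs)

# A sparse sample maps attribute names to values; missing attributes are omitted.
sample = {"sepal_length": 5.1, "petal_width": 0.2}
print(add_bias(sample, bias=1.0))
```

In this sketch, "versicolor" has no entry in the weight string, so it keeps the base cost, which matches the documented default weight of 1.0.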

__repr__(self): Returns the string representation of an SVMSparse class instance.

get_build_time(self): Returns the build time of the algorithm in seconds. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_prediction_type(self): Returns the prediction type of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_target_column(self): Returns the target column of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

show_query(self): Returns the underlying SQL query. When the model object is created using retrieve_model(), None is returned.