Teradata Package for Python Function Reference - HMMSupervised - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.

Product: Teradata Package for Python
Release Number: 17.00
Published: November 2021
Language: English (United States)
Last Update: 2021-11-19
Product Category: Teradata Vantage

teradataml.analytics.mle.HMMSupervised = class HMMSupervised(builtins.object)

Methods defined here:

__init__(self, vertices=None, model_key=None, sequence_key=None, observed_key=None, state_key=None, skip_key=None, batch_size=None, vertices_sequence_column=None, vertices_partition_column=None, vertices_order_column=None)

    DESCRIPTION:
        The HMMSupervised function runs on the SQL-GR framework. The function
        can produce multiple HMM models simultaneously, where each model is
        learned from a set of sequences and where each sequence represents
        a vertex.

    PARAMETERS:
        vertices:
            Required Argument.
            Specifies the teradataml DataFrame containing the vertex data.

        vertices_partition_column:
            Required Argument.
            Specifies Partition By columns for vertices.
            Values to this argument can be provided as a list, if multiple
            columns are used for partition.
            Note:
                1. This argument must contain the name of the column
                   specified in the 'sequence_key' argument.
                2. If the 'model_key' argument is used, this argument should
                   also contain the name of the column specified in
                   'model_key'. That column must come first, followed by the
                   name of the column specified in 'sequence_key'.
            Types: str OR list of Strings (str)

        vertices_order_column:
            Required Argument.
            Specifies Order By columns for vertices.
            Values to this argument can be provided as a list, if multiple
            columns are used for ordering.
            Note:
                This argument must include the name of the column that
                contains the time-ordered sequence.
            Types: str OR list of Strings (str)

        model_key:
            Optional Argument.
            Specifies the name of the column that contains the model
            attribute. The values in the column can be integers or strings.
            Note:
                The 'vertices_partition_column' argument should contain the
                name of the column specified in this argument.
            Types: str

        sequence_key:
            Required Argument.
            Specifies the name of the column that contains the sequence
            attribute. The sequence_key must be a sequence attribute in the
            vertices_partition_column. A sequence (value in this column) must
            contain more than two observation symbols. Each sequence
            represents a vertex.
            Types: str

        observed_key:
            Required Argument.
            Specifies the name of the column that contains the observed
            symbols. The function scans the input teradataml DataFrame to
            find all possible observed symbols.
            Note:
                Observed symbols are case-sensitive.
            Types: str

        state_key:
            Required Argument.
            Specifies the column containing state attributes. You can specify
            multiple states. The states are case-sensitive.
            Types: str

        skip_key:
            Optional Argument.
            Specifies the name of the column whose values determine whether
            the function skips the row. The function skips the row if the
            value is "true", "yes", "y", or "1". The function does not skip
            the row if the value is "false", "f", "no", "n", "0", or NULL.
            Types: str

        batch_size:
            Optional Argument.
            Specifies the number of models to process. The size must be
            positive. If the batch size is not specified, the function avoids
            out-of-memory errors by determining the appropriate size. If the
            batch size is specified and there is insufficient free memory,
            the function reduces the batch size. The batch size is determined
            dynamically, based on the memory conditions. For example, at time
            T1, the specified batch size 1000 might be adjusted to 980, and
            at time T2, the batch size might be adjusted to 800.
            Types: int

        vertices_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each row
            of the input argument "vertices". The argument is used to ensure
            deterministic results for functions which produce results that
            vary from run to run.
            Types: str OR list of Strings (str)

    RETURNS:
        Instance of HMMSupervised.
        Output teradataml DataFrames can be accessed using attribute
        references, such as HMMSupervisedObj.<attribute_name>.
        Output teradataml DataFrame attribute names are:
            1. output_initialstate_table
            2. output_statetransition_table
            3. output_emission_table
            4. output

    RAISES:
        TeradataMlException

    EXAMPLES:
        # Load example data.
        load_example_data("hmmsupervised", "customer_loyalty")

        # The "customer_loyalty" dataset contains events that are related to
        # customer transactions. Each event combines two attributes:
        #
        # Time elapsed since the last transaction:
        #     small(S), medium(M) and large(L)
        # Amount spent compared with the amount spent in the last transaction:
        #     less(L), about same(S) and more(M)
        #
        # So there are 9 possible combinations, resulting in 9 events.
        # For example, the event SM implies a transaction where the time
        # elapsed since the last transaction is small and the customer spent
        # more than last time.
        #
        # The dataset also contains 3 hidden states corresponding to 3 levels
        # of loyalty: low(L), normal(N), high(H).

        # Create teradataml DataFrame objects.
        customer_loyalty = DataFrame.from_table("customer_loyalty")

        # Example 1 - Train an HMM Supervised model on the customer loyalty
        # dataset.
        HMMSupervised_out = HMMSupervised(vertices = customer_loyalty,
                                          vertices_partition_column = ["user_id", "seq_id"],
                                          vertices_order_column = ["user_id", "seq_id", "purchase_date"],
                                          model_key = "user_id",
                                          sequence_key = "seq_id",
                                          observed_key = "observation",
                                          state_key = "loyalty_level"
                                          )

        # Print the results.
        print(HMMSupervised_out.output_initialstate_table)
        print(HMMSupervised_out.output_statetransition_table)
        print(HMMSupervised_out.output_emission_table)
        print(HMMSupervised_out.output)
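The reference above does not say how the function estimates its models internally. As a rough illustration of how the arguments relate to one another, the following pure-Python sketch uses the textbook relative-frequency (counting) estimate for a supervised HMM. It is an assumption for illustration only, not the SQL-GR implementation. The row keys ('model', 'sequence', 'order', 'observed', 'state', 'skip') and the function name count_hmm_parameters are hypothetical stand-ins for the columns named by model_key, sequence_key, vertices_order_column, observed_key, state_key and skip_key. The three returned dictionaries only loosely correspond to the initial-state, state-transition and emission outputs; their layout is not the layout of the output DataFrames.

```python
from collections import Counter, defaultdict

# Values that, per the skip_key description, cause a row to be skipped.
SKIP_VALUES = {"true", "yes", "y", "1"}


def count_hmm_parameters(rows):
    """Estimate supervised-HMM probabilities by relative-frequency counting.

    Each row is a dict with the hypothetical keys 'model', 'sequence',
    'order', 'observed', 'state' and, optionally, 'skip'.
    Returns {model: (initial, transition, emission)}, where each element
    maps a key to a probability. No smoothing is applied.
    """
    # Group rows by (model, sequence), mirroring vertices_partition_column.
    sequences = defaultdict(list)
    for row in rows:
        if str(row.get("skip")) in SKIP_VALUES:
            continue  # skipped rows do not contribute to any count
        sequences[(row["model"], row["sequence"])].append(row)

    # One (initial, transition, emission) counter triple per model.
    counts = defaultdict(lambda: (Counter(), Counter(), Counter()))
    for (model, _), seq in sequences.items():
        seq.sort(key=lambda r: r["order"])  # mirrors vertices_order_column
        init, trans, emit = counts[model]
        init[seq[0]["state"]] += 1
        for prev, cur in zip(seq, seq[1:]):
            trans[(prev["state"], cur["state"])] += 1
        for r in seq:
            emit[(r["state"], r["observed"])] += 1

    def normalize(counter, group):
        # Divide each count by the total of the group it belongs to.
        totals = Counter()
        for key, n in counter.items():
            totals[group(key)] += n
        return {key: n / totals[group(key)] for key, n in counter.items()}

    return {
        model: (
            normalize(init, lambda key: None),     # over all sequences
            normalize(trans, lambda key: key[0]),  # per source state
            normalize(emit, lambda key: key[0]),   # per emitting state
        )
        for model, (init, trans, emit) in counts.items()
    }


if __name__ == "__main__":
    demo_rows = [
        {"model": "u1", "sequence": 1, "order": 1, "state": "L", "observed": "SM"},
        {"model": "u1", "sequence": 1, "order": 2, "state": "N", "observed": "SS"},
        {"model": "u1", "sequence": 1, "order": 3, "state": "N", "observed": "ML"},
    ]
    initial, transition, emission = count_hmm_parameters(demo_rows)["u1"]
    print(initial, transition, emission)
```

Grouping by model first and by sequence second follows the ordering rule stated for vertices_partition_column, and sorting within each group is what makes "previous state" well defined when transitions are counted.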

__repr__(self): Returns the string representation for a HMMSupervised class instance.

get_build_time(self): Returns the build time of the algorithm in seconds. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_prediction_type(self): Returns the prediction type of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_target_column(self): Returns the target column of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

show_query(self): Returns the underlying SQL query. When the model object is created using retrieve_model(), None is returned.