Teradata R Package Function Reference - 16.20 - HMMSupervised - Teradata R Package

prodname: Teradata R Package
vrm_release: 16.20
created_date: February 2020
category: Programming Reference
featnum: B700-4007-098K

Description

The HMMSupervisedLearner (td_hmm_supervised_mle) function is available on the SQL-Graph platform. The function can produce multiple HMM models simultaneously. Each model is learned from a set of sequences, and each sequence represents a vertex.
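To make the idea of learning one model per set of labelled sequences concrete, here is a minimal local sketch in plain R. It assumes ordinary count-based maximum-likelihood estimates (initial-state, state-transition, and emission probabilities) and is not the function's in-database implementation. The toy rows, the "step" column, and the helper name estimate_hmm are hypothetical and exist only for illustration.

```r
# Toy labelled data (made up): two models (user_id), each with one or more sequences.
toy <- data.frame(
  user_id       = c("u1", "u1", "u1", "u1", "u1", "u1", "u2", "u2", "u2"),
  seq_id        = c(1, 1, 1, 2, 2, 2, 1, 1, 1),
  step          = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  observation   = c("small", "large", "large", "small", "large", "large",
                    "large", "small", "small"),
  loyalty_level = c("low", "high", "high", "low", "low", "high",
                    "high", "low", "low"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count-based estimates for the rows of one model.
estimate_hmm <- function(df) {
  df <- df[order(df$seq_id, df$step), ]
  states  <- sort(unique(df$loyalty_level))
  symbols <- sort(unique(df$observation))

  # Initial-state probabilities: state of the first row of each sequence.
  first_rows <- df[!duplicated(df$seq_id), ]
  init <- table(factor(first_rows$loyalty_level, levels = states))
  init <- init / sum(init)

  # Transition probabilities: consecutive state pairs within the same sequence.
  n <- nrow(df)
  same_seq <- df$seq_id[-1] == df$seq_id[-n]
  from <- factor(df$loyalty_level[-n][same_seq], levels = states)
  to   <- factor(df$loyalty_level[-1][same_seq], levels = states)
  trans <- prop.table(table(from, to), margin = 1)

  # Emission probabilities: observed symbol counts per state.
  emis <- table(state  = factor(df$loyalty_level, levels = states),
                symbol = factor(df$observation, levels = symbols))
  emis <- prop.table(emis, margin = 1)

  list(initial = init, transition = trans, emission = emis)
}

# One model per user_id, mirroring the role of "model.key".
models <- lapply(split(toy, toy$user_id), estimate_hmm)
models$u1$transition
```

In this sketch a state with no outgoing transitions would produce NaN in its transition row; how the actual function handles that case is not stated here.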

Usage

  td_hmm_supervised_mle (
      vertices = NULL,
      model.key = NULL,
      sequence.key = NULL,
      observed.key = NULL,
      state.key = NULL,
      skip.key = NULL,
      batch.size = NULL,
      vertices.sequence.column = NULL,
      vertices.partition.column = NULL,
      vertices.order.column = NULL
  )

Arguments

vertices

Required Argument.
Specifies the input vertex table.

vertices.partition.column

Specifies the Partition By columns for "vertices".
Values for this argument can be provided as a vector if multiple columns are used for partitioning.

vertices.order.column

Specifies the Order By columns for "vertices".
Values for this argument can be provided as a vector if multiple columns are used for ordering.

model.key

Required Argument.
Specifies the name of the column that contains the model attribute. Its value must match one of the columns specified in the "vertices.partition.column" argument.

sequence.key

Required Argument.
Specifies the name of the column that contains the sequence attribute. The value for this argument must be a sequence attribute in the "vertices.partition.column" argument. A sequence must contain more than two observation symbols.

observed.key

Required Argument.
Specifies the name of the column that contains the observed symbols. The function scans the input tbl_teradata to find all possible observed symbols.
Note: Observed symbols are case-sensitive.

state.key

Required Argument.
Specifies the state attributes. You can specify multiple states. The states are case-sensitive.

skip.key

Optional Argument.
Specifies the name of the column whose values determine whether the function skips the row. The function skips the row if the value is "true", "yes", "y", or "1". The function does not skip the row if the value is "false", "f", "no", "n", "0", or NULL.
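The skip rule above can be sketched as a small local R helper. This is only an illustration of the documented value lists, not code from the package. The name should_skip is hypothetical, a database NULL is represented as NA (or NULL) in R, and exact, case-sensitive matching is an assumption because the text does not say how case is handled. Values outside both lists return NA here because their treatment is not documented.

```r
# Hypothetical helper mirroring the documented skip.key value lists.
should_skip <- function(value) {
  skip_values <- c("true", "yes", "y", "1")
  keep_values <- c("false", "f", "no", "n", "0")

  # NULL (NA in R) means the row is not skipped.
  if (is.null(value) || is.na(value)) return(FALSE)

  v <- as.character(value)
  if (v %in% skip_values) return(TRUE)
  if (v %in% keep_values) return(FALSE)

  NA  # assumption: behaviour for other values is not documented
}

should_skip("yes")
should_skip("0")
should_skip(NA)
```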

batch.size

Optional Argument.
Specifies the number of models to process. The size must be positive. If the batch size is not specified, the function avoids out-of-memory errors by determining the appropriate size. If the batch size is specified and there is insufficient free memory, the function reduces the batch size. The function determines the batch size dynamically based on the memory conditions. For example, if the batch size is set to 1000, it might be adjusted to 980 at time T1 and to 800 at time T2.

vertices.sequence.column

Optional Argument.
Specifies the vector of column(s) that uniquely identifies each row of the input argument "vertices". The argument is used to ensure deterministic results for functions which produce results that vary from run to run.

Value

The function returns an object of class "td_hmm_supervised_mle", which is a named list containing Teradata tbl objects. Named list members can be referenced directly with the "$" operator using the following names:

  1. output.initialstate.table

  2. output.statetransition.table

  3. output.emission.table

  4. output

Examples

    # Get the current context/connection
    con <- td_get_context()$connection
    
    # Load example data.
    loadExampleData("hmmsupervised_example", "customer_loyalty")
    
    # Create remote tibble objects.
    customer_loyalty <- tbl(con, "customer_loyalty")
    
    # Example 1 - Train a HMM Supervised model on the customer loyalty dataset
    td_hmm_supervised_out <- td_hmm_supervised_mle(vertices = customer_loyalty,
                                               vertices.partition.column = c("user_id", "seq_id"),
                                               vertices.order.column = c("user_id", "seq_id", "purchase_date"),
                                               model.key = "user_id",
                                               sequence.key = "seq_id",
                                               observed.key = "observation",
                                               state.key = "loyalty_level"
                                               )
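
The members listed under Value can be pulled out of the returned object by name. The following is a minimal sketch that assumes only that the result behaves like a named list, as described above. The helper name hmm_tables is hypothetical and is not part of the package.

```r
# Hypothetical helper: collect the three model tables from a fitted result.
hmm_tables <- function(fit) {
  wanted <- c("output.initialstate.table",
              "output.statetransition.table",
              "output.emission.table")

  # Fail early if the object does not carry the documented members.
  absent <- setdiff(wanted, names(fit))
  if (length(absent) > 0) {
    stop("Missing members: ", paste(absent, collapse = ", "))
  }

  unclass(fit)[wanted]
}

# With a live connection and the result of Example 1:
# tables <- hmm_tables(td_hmm_supervised_out)
# tables$output.emission.table
```

Each member can also be read directly, for example td_hmm_supervised_out$output.initialstate.table.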