1.1 - 8.10 - HMMUnsupervised Syntax Elements - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)
InitStateTable
[Optional] Specify the name for the initial state probability table that the function outputs.
Default: Pi in the current schema
StateTransitionTable
[Optional] Specify the name for the state transition probability table that the function outputs.
Default: A in the current schema
EmissionTable
[Optional] Specify the name for the emission probability table that the function outputs.
Default: B in the current schema
ModelColumn
[Required if the PARTITION BY clause specifies model_key, disallowed otherwise.] Specify the name of the Vertices column that contains the model attributes. The model_column must match a model_key in the PARTITION BY clause. The values in the column can be either integers or strings.
SeqColumn
Specify the name of the Vertices column that contains the sequence attribute. The sequence_column must be a sequence attribute in the PARTITION BY clause and contain more than two observation symbols.
ObservationColumn
Specify the name of the Vertices column that contains the observed symbols. The function scans the input table to find all possible observed symbols.
Observed symbols are case-sensitive.
HiddenStateNum
Specify the number of hidden states.
The number of hidden states can influence model quality and performance, so choose the number appropriately.
MaxIterNum
[Optional] Specify the maximum number of iterations that the training process runs before the function completes.
Default: 10
Epsilon

[Optional] Specify the threshold value that determines the convergence of HMM training. If the parameter value difference is less than the threshold, the training process converges.

Default behavior: Only MaxIterNum determines when the training process stops.
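The interaction between MaxIterNum and Epsilon can be illustrated with a minimal Python sketch. It is not the function's implementation: the `train` and `halve_gap` names are hypothetical, and the way the "parameter value difference" is measured (here, the largest absolute change in any parameter) is an assumption, because the text above does not define it.

```python
def train(update, params, max_iter_num=10, epsilon=None):
    """Apply `update` until max_iter_num is reached or, if epsilon is set,
    until the largest parameter change drops below epsilon."""
    iterations = 0
    for iterations in range(1, max_iter_num + 1):
        new_params = update(params)
        # Assumed measure: largest absolute change across all parameters.
        diff = max(abs(n - o) for n, o in zip(new_params, params))
        params = new_params
        if epsilon is not None and diff < epsilon:
            break  # converged before the iteration limit
    return params, iterations


def halve_gap(params):
    """Toy update rule (hypothetical): move each value halfway toward 1.0."""
    return [x + (1.0 - x) / 2 for x in params]


# Without Epsilon, only the iteration limit ends training.
_, n_default = train(halve_gap, [0.0])
# With Epsilon, training can end earlier.
_, n_eps = train(halve_gap, [0.0], epsilon=0.05)
print(n_default, n_eps)  # 10 5
```

In the toy run, the change per iteration halves each time (0.5, 0.25, 0.125, 0.0625, 0.03125), so the threshold of 0.05 is first met on the fifth iteration.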

SkipColumn
[Optional] Specify the name of the Vertices column whose value determines whether the function skips the row. Each value in this column must be an INTEGER or VARCHAR that represents the value true or false (INTEGER 1 or 0 or VARCHAR 'true', 't', 'yes', 'y', '1', 'false', 'f', 'no', 'n', or '0'). If the value represents true, the function skips the row; otherwise it does not.
Default behavior: The function does not skip any rows.
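As an illustration of the accepted SkipColumn values, the following Python sketch maps a value to a skip decision. The `should_skip` name is hypothetical. The sketch assumes exact, lowercase string matches and raises an error for any other value; the text above does not say whether other casings are accepted or how invalid values are handled.

```python
TRUE_VALUES = {"true", "t", "yes", "y", "1"}
FALSE_VALUES = {"false", "f", "no", "n", "0"}


def should_skip(value):
    """Return True if a SkipColumn value represents true, False if it represents false."""
    if isinstance(value, int):
        if value in (0, 1):
            return value == 1
        raise ValueError(f"integer skip value must be 0 or 1, got {value}")
    text = str(value)
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    # Assumption: anything else is rejected rather than treated as false.
    raise ValueError(f"unrecognized skip value: {value!r}")


# Hypothetical rows of (observation, skip flag); rows flagged true are dropped.
rows = [("a", 0), ("b", "yes"), ("c", "n")]
kept = [symbol for symbol, flag in rows if not should_skip(flag)]
print(kept)  # ['a', 'c']
```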
InitMethods
[Optional] Specify the method for creating the initial parameters for the initial state probabilities, state transition probabilities, and observation emission probabilities:
Option Description
'random' (Default) Initial parameters are based on a uniform distribution. The seed is meaningful only with 'random'. Specify the random seed the algorithm uses for repeatable results (for more information, see Nondeterministic Results and UniqueID Syntax Element).
'flat' Probabilities are equal. Each cell in the matrix or vector contains the same probability.
'input' The function takes initial parameters from the InitParams syntax element.
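To make the difference between 'flat' and 'random' concrete, here is a small Python sketch of the two initializations. The `init_matrix` name is hypothetical, and the 'random' branch (uniform draws normalized so that each row sums to 1) is an assumption: the description above only says the parameters are based on a uniform distribution. The 'input' method is not covered, because it takes its values from InitParams.

```python
import random


def init_matrix(rows, cols, method="random", seed=None):
    """Build a rows x cols probability matrix whose rows each sum to 1."""
    if method == "flat":
        # Every cell holds the same probability.
        return [[1.0 / cols] * cols for _ in range(rows)]
    if method == "random":
        rng = random.Random(seed)  # a fixed seed gives repeatable draws
        matrix = []
        for _ in range(rows):
            draws = [rng.random() for _ in range(cols)]
            total = sum(draws)
            # Assumed normalization so the row is a valid distribution.
            matrix.append([d / total for d in draws])
        return matrix
    raise ValueError(f"unsupported method: {method!r}")


flat = init_matrix(3, 3, method="flat")
first = init_matrix(3, 2, method="random", seed=42)
second = init_matrix(3, 2, method="random", seed=42)
print(first == second)  # True: the same seed reproduces the same matrix
```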
InitParams
[Required with InitMethods ('input')] When InitMethods specifies 'input', this syntax element specifies the initial parameters for the models:
InitParams Item Description
init_state_probability_vector Vector that contains initial state probabilities for model.
state_transition_probability_matrix Matrix that contains state transition probabilities for model.
observation_emission_probability_matrix Matrix that contains observation emission probabilities for model.
For example, if the HiddenStateNum syntax element specifies three hidden states and the input table contains two observed symbols ('yes' and 'no'), the InitParams values could be:
InitParams Item Value
init_state_probability_vector '0.3333333333 0.3333333333 0.3333333333'
state_transition_probability_matrix '0.3333333333 0.3333333333 0.3333333333; 0.3333333333 0.3333333333 0.3333333333; 0.3333333333 0.3333333333 0.3333333333'
observation_emission_probability_matrix 'no:0.25 yes:0.75; no:0.35 yes:0.65; no:0.45 yes:0.55'

In the initial state probability vector and in each row of the state transition and observation emission probability matrices, the probabilities must sum to a value that rounds to 1.0. The number of states and the number of observed symbols must be consistent with the HiddenStateNum syntax element and the observed symbols in the input table; otherwise, the function displays error messages.
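Before passing values to InitParams, it can help to check them on the client side. The following Python sketch validates the shapes and row sums and then formats the three strings in the layout of the example above. The `check_distribution` and `build_init_params` names are hypothetical, the tolerance of 1e-6 is an assumption (the requirement is only that sums round to 1.0), and the exact whitespace the function accepts between rows is not confirmed here.

```python
def check_distribution(probs, label, tol=1e-6):
    """Raise ValueError unless the probabilities sum to 1.0 within `tol` (assumed tolerance)."""
    total = sum(probs)
    if abs(total - 1.0) > tol:
        raise ValueError(f"{label}: probabilities sum to {total}, expected 1.0")


def build_init_params(pi, transition, emission, hidden_state_num, symbols):
    """Validate shapes and row sums, then return the three InitParams strings."""
    n = hidden_state_num
    if len(pi) != n:
        raise ValueError("initial state vector length must equal the number of hidden states")
    if len(transition) != n or any(len(row) != n for row in transition):
        raise ValueError("state transition matrix must be n x n")
    if len(emission) != n or any(set(row) != set(symbols) for row in emission):
        raise ValueError("each emission row must cover exactly the observed symbols")

    check_distribution(pi, "initial state vector")
    for i, row in enumerate(transition):
        check_distribution(row, f"transition row {i}")
    for i, row in enumerate(emission):
        check_distribution(row.values(), f"emission row {i}")

    pi_text = " ".join(str(p) for p in pi)
    a_text = "; ".join(" ".join(str(p) for p in row) for row in transition)
    b_text = "; ".join(
        " ".join(f"{s}:{row[s]}" for s in sorted(row)) for row in emission
    )
    return pi_text, a_text, b_text


third = 0.3333333333
pi_text, a_text, b_text = build_init_params(
    pi=[third] * 3,
    transition=[[third] * 3 for _ in range(3)],
    emission=[
        {"no": 0.25, "yes": 0.75},
        {"no": 0.35, "yes": 0.65},
        {"no": 0.45, "yes": 0.55},
    ],
    hidden_state_num=3,
    symbols=["yes", "no"],
)
print(b_text)  # no:0.25 yes:0.75; no:0.35 yes:0.65; no:0.45 yes:0.55
```

Catching a mismatched dimension or a row that does not sum to 1.0 locally avoids a round trip to the function just to receive an error message.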