HMMSupervised Example: Customer Loyalty Prediction - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.10
1.1
Published
October 2019
Language
English (United States)
Last Update
2019-12-31
dita:mapPath
ima1540829771750.ditamap
dita:ditavalPath
jsj1481748799576.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantage™

Customer loyalty cannot be directly observed or measured, but can be inferred probabilistically. This use case shows how to predict the loyalty level of customers to a retailer from customer shopping patterns.

Customers who make frequent purchases or costlier purchases tend to be more loyal to a brand than infrequent shoppers buying low-value items. A customer who makes a purchase each week is more loyal than a customer who makes a purchase once a year. Similarly, a customer spending more money is more loyal than a customer spending less money.

You can use the shopping patterns of customers with extensive purchase histories to build a Hidden Markov Model to analyze the loyalty of new customers. Gaining insight about the loyalty levels of customers can help retailers devise strategic marketing plans to retain loyal customers and convert low-loyalty customers to high-loyalty. Retailers can also take proactive action to turn around the fading relationship with customers who show a downward trajectory of loyalty. This is particularly important for high-value customers.

The data set for this use case is the shopping history of customers. This data set is transformed into observations that can be used to train an HMM. Based on the time between purchases and the difference in amount spent between purchases, purchases are classified into the following levels:

Time between purchases Difference in amount spent
Small (S) Less (L)
Medium (M) Same (S)
Large (L) More (M)

To determine the level that a purchase belongs to, you can use K-Means clustering with K=3 to create clusters based on either time difference between purchases or difference in amount spent. Purchases clustered using either metric are classified into one of the three purchase levels. The observation associated with a purchase is the combination of the levels from both metrics.

From the different levels of time difference and spending amount difference between purchases, nine spending profiles are possible:
SL SS SM
ML MS MM
LL LS LM
These nine spending profiles serve as the observation symbols for the HMM.

Assuming that a customer has one of three loyalty levels—low (L), normal (N), or high (H)—the number of hidden states is three. The business determines the definitions of the loyalty levels.

This example uses supervised learning, so each purchase in the input data set is labeled with the customer loyalty level. You can use these labeled observations to train an HMM which can later be used to assign loyalty levels to new, unlabeled purchase data.

Input

Vertices: customer_loyalty
id user_id seq_id purchase_date loyalty_level observation
1 1 1 2014-01-01 L LL
2 1 1 2014-01-02 L ML
3 1 1 2014-01-03 L SL
4 1 1 2014-01-04 L LM
5 1 1 2014-01-05 L ML
6 1 1 2014-01-06 L LL
7 1 1 2014-01-07 L MM
8 1 1 2014-01-08 L MS
9 1 1 2014-01-09 L ML
10 1 1 2014-01-10 L LM
... ... ... ... ... ...
101 2 2 2014-01-01 N ML
102 2 2 2014-01-02 L LS
103 2 2 2014-01-03 N ML
104 2 2 2014-01-04 N ML
105 2 2 2014-01-05 N MS
106 2 2 2014-01-06 N ML
107 2 2 2014-01-07 N SM
108 2 2 2014-01-08 N MS
109 2 2 2014-01-09 N LL
110 2 2 2014-01-10 N MM
... ... ... ... ... ...
201 3 3 2014-01-01 L LL
202 3 3 2014-01-02 L LS
203 3 3 2014-01-03 H SS
204 3 3 2014-01-04 H SM
205 3 3 2014-01-05 H LL
206 3 3 2014-01-06 N ML
207 3 3 2014-01-07 H SS
208 3 3 2014-01-08 H MM
209 3 3 2014-01-09 H MS
210 3 3 2014-01-10 H ML
... ... ... ... ... ...

SQL Call

The following SQL query creates the probabilities with state information:

DROP TABLE pi_loyalty;
DROP TABLE A_loyalty;
DROP TABLE B_loyalty;

SELECT * FROM HMMSupervised (
  ON customer_loyalty AS Vertices
    PARTITION BY user_id, seq_id
    ORDER BY user_id, seq_id, purchase_date ASC
  OUT TABLE InitStateTable (pi_loyalty)
  OUT TABLE StateTransitionTable (A_loyalty)
  OUT TABLE EmissionTable (B_loyalty)
  USING
  ModelColumn ('user_id')
  SeqColumn ('seq_id')
  ObservationColumn ('observation')
  StateColumn ('loyalty_level')
) AS dt;
SELECT * FROM HMMSupervised(
ON customer_loyalty AS Vertices PARTITION BY user_id, seq_id
ORDER BY user_id, seq_id, purchase_date ASC
OUT TABLE InitStateTable(pi_loyalty)
OUT TABLE StateTransitionTable(A_loyalty)
OUT TABLE EmissionTable(B_loyalty)
USING
ModelColumn('user_id')
SeqColumn('seq_id')
ObservationColumn('observation')
StateColumn('loyalty_level')
) AS dt ;

Output

 message                                                                                             
 --------------------------------------------------------------------------------------------------- 
 HMM models will be saved to the  output tables once the training process is successfully completed.
SELECT * FROM pi_loyalty ORDER BY 1, 2;
 user_id state probability 
 ------- ----- ----------- 
       1 h             0.0
       1 l             1.0
       1 n             0.0
       2 h             0.0
       2 l             0.0
       2 n             1.0
       3 h             0.0
       3 l             1.0
       3 n             0.0
SELECT * FROM A_loyalty ORDER BY 1, 2, 3;
 user_id from_state to_state probability          
 ------- ---------- -------- -------------------- 
       1 h          h         0.16666666666666666
       1 h          l          0.6666666666666666
       1 h          n         0.16666666666666666
       1 l          h        0.044444444444444446
       1 l          l          0.9333333333333333
       1 l          n        0.022222222222222223
       1 n          h          0.3333333333333333
       1 n          l          0.6666666666666666
       1 n          n                         0.0
       2 h          h         0.46153846153846156
       2 h          l         0.23076923076923078
       2 h          n          0.3076923076923077
       2 l          h                         0.2
       2 l          l                         0.1
       2 l          n                         0.7
       2 n          h         0.06578947368421052
       2 n          l         0.07894736842105263
       2 n          n          0.8552631578947368
       3 h          h          0.9213483146067416
       3 h          l         0.02247191011235955
       3 h          n        0.056179775280898875
       3 l          h                        0.75
       3 l          l                        0.25
       3 l          n                         0.0
       3 n          h          0.8333333333333334
       3 n          l                         0.0
       3 n          n         0.16666666666666666
SELECT * FROM B_loyalty ORDER BY 1, 2, 3;
 user_id state observed probability          
 ------- ----- -------- -------------------- 
       1 h     ll                        0.0
       1 h     lm                        0.0
       1 h     ls                        0.0
       1 h     ml                        0.0
       1 h     mm        0.16666666666666666
       1 h     ms                        0.5
       1 h     sl        0.16666666666666666
       1 h     sm        0.16666666666666666
       1 h     ss                        0.0
       1 l     ll         0.3076923076923077
       1 l     lm        0.14285714285714285
       1 l     ls        0.13186813186813187
       1 l     ml        0.15384615384615385
       1 l     mm       0.054945054945054944
       1 l     ms        0.08791208791208792
       1 l     sl       0.054945054945054944
       1 l     sm        0.01098901098901099
       1 l     ss       0.054945054945054944
       1 n     ll                        0.0
       1 n     lm                        0.0
       1 n     ls         0.3333333333333333
       1 n     ml         0.3333333333333333
       1 n     mm                        0.0
       1 n     ms         0.3333333333333333
       1 n     sl                        0.0
       1 n     sm                        0.0
       1 n     ss                        0.0
       2 h     ll                        0.0
       2 h     lm        0.07692307692307693
       2 h     ls                        0.0
       2 h     ml        0.15384615384615385
       2 h     mm        0.07692307692307693
       2 h     ms        0.15384615384615385
       2 h     sl        0.15384615384615385
       2 h     sm        0.15384615384615385
       2 h     ss        0.23076923076923078
       2 l     ll                        0.1
       2 l     lm                        0.2
       2 l     ls                        0.3
       2 l     ml                        0.0
       2 l     mm                        0.2
       2 l     ms                        0.0
       2 l     sl                        0.2
       2 l     sm                        0.0
       2 l     ss                        0.0
       2 n     ll        0.09090909090909091
       2 n     lm        0.07792207792207792
       2 n     ls        0.07792207792207792
       2 n     ml         0.2077922077922078
       2 n     mm        0.12987012987012986
       2 n     ms        0.19480519480519481
       2 n     sl        0.05194805194805195
       2 n     sm        0.09090909090909091
       2 n     ss        0.07792207792207792
       3 h     ll        0.05555555555555555
       3 h     lm       0.044444444444444446
       3 h     ls        0.07777777777777778
       3 h     ml        0.06666666666666667
       3 h     mm        0.14444444444444443
       3 h     ms         0.1111111111111111
       3 h     sl        0.05555555555555555
       3 h     sm                        0.2
       3 h     ss        0.24444444444444444
       3 l     ll                       0.25
       3 l     lm                       0.25
       3 l     ls                        0.5
       3 l     ml                        0.0
       3 l     mm                        0.0
       3 l     ms                        0.0
       3 l     sl                        0.0
       3 l     sm                        0.0
       3 l     ss                        0.0
       3 n     ll                        0.0
       3 n     lm                        0.0
       3 n     ls        0.16666666666666666
       3 n     ml         0.3333333333333333
       3 n     mm        0.16666666666666666
       3 n     ms                        0.0
       3 n     sl                        0.0
       3 n     sm        0.16666666666666666
       3 n     ss        0.16666666666666666

Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.