Customer loyalty cannot be directly observed or measured, but it can be inferred probabilistically. This use case shows how to predict the loyalty levels of a retailer's customers from their shopping patterns.
Customers who make frequent or costlier purchases tend to be more loyal to a brand than infrequent shoppers buying low-value items. A customer who makes a purchase every week is likely more loyal than one who makes a purchase once a year, and a customer who spends more per purchase is likely more loyal than one who spends less.
You can use the shopping patterns of customers with extensive purchase histories to build a Hidden Markov Model (HMM) for analyzing the loyalty of new customers. Insight into customer loyalty levels can help retailers devise strategic marketing plans to retain loyal customers and convert low-loyalty customers into high-loyalty ones. Retailers can also act proactively to repair fading relationships with customers whose loyalty is trending downward, which is particularly important for high-value customers.
The data set for this use case is the shopping history of customers. This data set is transformed into observations that can be used to train an HMM. Based on the time between consecutive purchases and the change in amount spent from one purchase to the next, each purchase is assigned one of the following levels for each metric:
Time between purchases | Difference in amount spent |
---|---|
Small (S) | Less (L) |
Medium (M) | Same (S) |
Large (L) | More (M) |
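The raw values behind these levels are the time since the customer's previous purchase and the change in amount spent relative to that purchase. A minimal sketch of computing them, assuming a hypothetical source table named purchases with columns user_id, purchase_date, and amount (not part of this use case's data set):

SELECT
  user_id,
  purchase_date,
  -- time since this customer's previous purchase (NULL for the first purchase);
  -- assumes DATE subtraction yields a day count on your platform
  purchase_date - LAG(purchase_date)
    OVER (PARTITION BY user_id ORDER BY purchase_date) AS time_since_last,
  -- change in amount spent relative to the previous purchase
  amount - LAG(amount)
    OVER (PARTITION BY user_id ORDER BY purchase_date) AS amount_change
FROM purchases;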
To determine the level that a purchase belongs to, you can use K-Means clustering with K=3 on each metric separately: one clustering on the time between purchases and one on the difference in amount spent. Each purchase is thereby classified into one of the three levels for each metric, and its observation is the combination of the two level codes, giving the nine possible symbols below (see the sketch after the table).
SL | SS | SM |
ML | MS | MM |
LL | LS | LM |
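After the two K-Means runs, every purchase carries one cluster assignment per metric. A minimal sketch of turning those assignments into observation symbols, assuming a hypothetical table purchase_clusters whose columns time_cluster_rank and spend_cluster_rank rank each purchase's cluster from 1 (smallest centroid) to 3 (largest centroid):

SELECT
  user_id,
  purchase_date,
  -- time between purchases: smallest centroid = Small (S), largest = Large (L)
  CASE time_cluster_rank WHEN 1 THEN 'S' WHEN 2 THEN 'M' ELSE 'L' END ||
  -- difference in amount spent: smallest centroid = Less (L), largest = More (M)
  CASE spend_cluster_rank WHEN 1 THEN 'L' WHEN 2 THEN 'S' ELSE 'M' END
    AS observation
FROM purchase_clusters;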
Assuming that a customer has one of three loyalty levels—low (L), normal (N), or high (H)—the number of hidden states is three. The business determines the definitions of the loyalty levels.
This example uses supervised learning, so each purchase in the input data set is labeled with the customer's loyalty level. You can use these labeled observations to train an HMM that can later assign loyalty levels to new, unlabeled purchase data.
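Because the states are fully labeled, supervised training amounts to counting: the initial-state, state-transition, and emission probabilities are all relative frequencies taken over the labeled sequences. As a rough illustration of the idea (not how HMMSupervised is implemented), the state-transition estimates can be approximated directly from the training table customer_loyalty used in the SQL call below:

WITH transitions AS (
  SELECT
    user_id,
    -- loyalty level of the previous purchase in the same sequence
    LAG(loyalty_level) OVER (PARTITION BY user_id, seq_id
                             ORDER BY purchase_date) AS from_state,
    loyalty_level AS to_state
  FROM customer_loyalty
)
SELECT
  user_id,
  from_state,
  to_state,
  -- transitions of this kind divided by all transitions out of from_state
  CAST(COUNT(*) AS FLOAT)
    / SUM(COUNT(*)) OVER (PARTITION BY user_id, from_state) AS probability
FROM transitions
WHERE from_state IS NOT NULL
GROUP BY user_id, from_state, to_state;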
Input
id | user_id | seq_id | purchase_date | loyalty_level | observation |
---|---|---|---|---|---|
1 | 1 | 1 | 2014-01-01 | L | LL |
2 | 1 | 1 | 2014-01-02 | L | ML |
3 | 1 | 1 | 2014-01-03 | L | SL |
4 | 1 | 1 | 2014-01-04 | L | LM |
5 | 1 | 1 | 2014-01-05 | L | ML |
6 | 1 | 1 | 2014-01-06 | L | LL |
7 | 1 | 1 | 2014-01-07 | L | MM |
8 | 1 | 1 | 2014-01-08 | L | MS |
9 | 1 | 1 | 2014-01-09 | L | ML |
10 | 1 | 1 | 2014-01-10 | L | LM |
... | ... | ... | ... | ... | ... |
101 | 2 | 2 | 2014-01-01 | N | ML |
102 | 2 | 2 | 2014-01-02 | L | LS |
103 | 2 | 2 | 2014-01-03 | N | ML |
104 | 2 | 2 | 2014-01-04 | N | ML |
105 | 2 | 2 | 2014-01-05 | N | MS |
106 | 2 | 2 | 2014-01-06 | N | ML |
107 | 2 | 2 | 2014-01-07 | N | SM |
108 | 2 | 2 | 2014-01-08 | N | MS |
109 | 2 | 2 | 2014-01-09 | N | LL |
110 | 2 | 2 | 2014-01-10 | N | MM |
... | ... | ... | ... | ... | ... |
201 | 3 | 3 | 2014-01-01 | L | LL |
202 | 3 | 3 | 2014-01-02 | L | LS |
203 | 3 | 3 | 2014-01-03 | H | SS |
204 | 3 | 3 | 2014-01-04 | H | SM |
205 | 3 | 3 | 2014-01-05 | H | LL |
206 | 3 | 3 | 2014-01-06 | N | ML |
207 | 3 | 3 | 2014-01-07 | H | SS |
208 | 3 | 3 | 2014-01-08 | H | MM |
209 | 3 | 3 | 2014-01-09 | H | MS |
210 | 3 | 3 | 2014-01-10 | H | ML |
... | ... | ... | ... | ... | ... |
SQL Call
The following SQL call trains the HMMs and writes the initial-state, state-transition, and emission probabilities to the output tables pi_loyalty, A_loyalty, and B_loyalty:
DROP TABLE pi_loyalty;
DROP TABLE A_loyalty;
DROP TABLE B_loyalty;

SELECT * FROM HMMSupervised (
  ON customer_loyalty AS vertices
    PARTITION BY user_id, seq_id
    ORDER BY user_id, seq_id, purchase_date ASC
  OUT TABLE InitStateTable (pi_loyalty)
  OUT TABLE StateTransitionTable (A_loyalty)
  OUT TABLE EmissionTable (B_loyalty)
  USING
  ModelColumn ('user_id')
  SeqColumn ('seq_id')
  ObservationColumn ('observation')
  StateColumn ('loyalty_level')
) AS dt;
Output
message |
---|
HMM models will be saved to the output tables once the training process is successfully completed. |
The following query returns the initial-state probability table (pi_loyalty):
SELECT * FROM pi_loyalty ORDER BY 1, 2;
user_id | state | probability |
---|---|---|
1 | H | 0 |
1 | L | 1 |
1 | N | 0 |
2 | H | 0 |
2 | L | 0 |
2 | N | 1 |
3 | H | 0 |
3 | L | 1 |
3 | N | 0 |
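In this example each user_id has a single training sequence, so the initial-state probability is 1 for the loyalty level that labels the first purchase of that sequence (L for users 1 and 3, N for user 2) and 0 for the other levels.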
The following query returns the state-transition probability table (A_loyalty):
SELECT * FROM A_loyalty ORDER BY 1, 2, 3;
user_id | from_state | to_state | probability |
---|---|---|---|
1 | H | H | 0.166666666666667 |
1 | H | L | 0.666666666666667 |
1 | H | N | 0.166666666666667 |
1 | L | H | 0.0444444444444444 |
1 | L | L | 0.933333333333333 |
1 | L | N | 0.0222222222222222 |
1 | N | H | 0.333333333333333 |
1 | N | L | 0.666666666666667 |
1 | N | N | 0 |
2 | H | H | 0.461538461538462 |
2 | H | L | 0.230769230769231 |
2 | H | N | 0.307692307692308 |
2 | L | H | 0.2 |
2 | L | L | 0.1 |
2 | L | N | 0.7 |
2 | N | H | 0.0657894736842105 |
2 | N | L | 0.0789473684210526 |
2 | N | N | 0.855263157894737 |
3 | H | H | 0.921348314606742 |
3 | H | L | 0.0224719101123595 |
3 | H | N | 0.0561797752808989 |
3 | L | H | 0.75 |
3 | L | L | 0.25 |
3 | L | N | 0 |
3 | N | H | 0.833333333333333 |
3 | N | L | 0 |
3 | N | N | 0.166666666666667 |
The following query returns the emission probability table (B_loyalty):
SELECT * FROM B_loyalty ORDER BY 1, 2, 3;
user_id | state | observed | probability |
---|---|---|---|
1 | H | LL | 0 |
1 | H | LM | 0 |
1 | H | LS | 0 |
1 | H | ML | 0 |
1 | H | MM | 0.166666666666667 |
1 | H | MS | 0.5 |
1 | H | SL | 0.166666666666667 |
1 | H | SM | 0.166666666666667 |
1 | H | SS | 0 |
1 | L | LL | 0.307692307692308 |
1 | L | LM | 0.142857142857143 |
1 | L | LS | 0.131868131868132 |
1 | L | ML | 0.153846153846154 |
1 | L | MM | 0.0549450549450549 |
1 | L | MS | 0.0879120879120879 |
1 | L | SL | 0.0549450549450549 |
1 | L | SM | 0.010989010989011 |
1 | L | SS | 0.0549450549450549 |
1 | N | LL | 0 |
1 | N | LM | 0 |
1 | N | LS | 0.333333333333333 |
1 | N | ML | 0.333333333333333 |
1 | N | MM | 0 |
1 | N | MS | 0.333333333333333 |
1 | N | SL | 0 |
1 | N | SM | 0 |
1 | N | SS | 0 |
2 | H | LL | 0 |
2 | H | LM | 0.0769230769230769 |
2 | H | LS | 0 |
2 | H | ML | 0.153846153846154 |
2 | H | MM | 0.0769230769230769 |
2 | H | MS | 0.153846153846154 |
2 | H | SL | 0.153846153846154 |
2 | H | SM | 0.153846153846154 |
2 | H | SS | 0.230769230769231 |
2 | L | LL | 0.1 |
2 | L | LM | 0.2 |
2 | L | LS | 0.3 |
2 | L | ML | 0 |
2 | L | MM | 0.2 |
2 | L | MS | 0 |
2 | L | SL | 0.2 |
2 | L | SM | 0 |
2 | L | SS | 0 |
2 | N | LL | 0.0909090909090909 |
2 | N | LM | 0.0779220779220779 |
2 | N | LS | 0.0779220779220779 |
2 | N | ML | 0.207792207792208 |
2 | N | MM | 0.12987012987013 |
2 | N | MS | 0.194805194805195 |
2 | N | SL | 0.051948051948052 |
2 | N | SM | 0.0909090909090909 |
2 | N | SS | 0.0779220779220779 |
3 | H | LL | 0.0555555555555556 |
3 | H | LM | 0.0444444444444444 |
3 | H | LS | 0.0777777777777778 |
3 | H | ML | 0.0666666666666667 |
3 | H | MM | 0.144444444444444 |
3 | H | MS | 0.111111111111111 |
3 | H | SL | 0.0555555555555556 |
3 | H | SM | 0.2 |
3 | H | SS | 0.244444444444444 |
3 | L | LL | 0.25 |
3 | L | LM | 0.25 |
3 | L | LS | 0.5 |
3 | L | ML | 0 |
3 | L | MM | 0 |
3 | L | MS | 0 |
3 | L | SL | 0 |
3 | L | SM | 0 |
3 | L | SS | 0 |
3 | N | LL | 0 |
3 | N | LM | 0 |
3 | N | LS | 0.166666666666667 |
3 | N | ML | 0.333333333333333 |
3 | N | MM | 0.166666666666667 |
3 | N | MS | 0 |
3 | N | SL | 0 |
3 | N | SM | 0.166666666666667 |
3 | N | SS | 0.166666666666667 |
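For each user_id and loyalty level, the emission probabilities across the nine observation symbols sum to 1. For user 3 in state H, for example, the most likely observations are SS and SM, that is, frequent purchases of the same or greater value.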