Column Name | Data Type | Description |
---|---|---|
col1_item1 | VARCHAR | Name of item1. |
col1_item2 | VARCHAR | Name of item2. |
cntb | INTEGER | Count of co-occurrence of both items in the partition. |
cnt1 | INTEGER | Count of occurrence of item1 in the partition. |
cnt2 | INTEGER | Count of occurrence of item2 in the partition. |
score | DOUBLE PRECISION | Product of two conditional probabilities: P({ item2 \| item1 }) * P({ item1 \| item2 }). This product equals the following quotient: (cntb * cntb)/(cnt1 * cnt2) |
support | DOUBLE PRECISION | Percentage of transactions in the partition in which the two items co-occur, calculated with this formula: cntb/tran_cnt where tran_cnt is the number of transactions in the partition. For example, if eggs and milk were purchased together 3 times in 5 transactions in the same store, and the data is partitioned by store, then the support value in the partition is 3/5 = 0.6 = 60%. |
confidence | DOUBLE PRECISION | Percentage of the transactions in the partition that contain item1 in which item2 also occurs, calculated with this formula: cntb/cnt1. For example, if, in the same store, the number of times that a customer buys both milk (item1) and butter (item2) is 3 (cntb) and the number of times that a customer buys milk is 4 (cnt1), then the confidence that a person who buys milk will also buy butter is 3/4 = 0.75 = 75%. |
lift | DOUBLE PRECISION | Ratio of the observed support value to the expected support value if item1 and item2 were independent; that is: support(item1 and item2) / [support(item1) * support(item2)]. The value is calculated with this formula: (cntb/tran_cnt) / [(cnt1/tran_cnt) * (cnt2/tran_cnt)]. If Lift > 1, the occurrence of item1 or item2 has a positive effect on the occurrence of the other item. If Lift = 1, the occurrence of item1 or item2 has no effect on the occurrence of the other item. If Lift < 1, the occurrence of item1 or item2 has a negative effect on the occurrence of the other item. |
z_score | DOUBLE PRECISION | Significance of co-occurrence, assuming that cntb follows a normal distribution, calculated with this formula: (cntb - mean(cntb)) / sd(cntb). If all cntb values are equal, then sd(cntb) is 0, and the function does not calculate z_score. |
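
The formulas in the table can be checked by hand with a short Python sketch. This is an illustration only, not the function's implementation: the helper names `pair_metrics` and `z_scores` are hypothetical, the sample counts (including `cnt2 = 3`) are made up, and the use of the population standard deviation for `sd(cntb)` is an assumption, because the table does not say which variant the function uses.

```python
from statistics import mean, pstdev


def pair_metrics(cntb, cnt1, cnt2, tran_cnt):
    """Compute score, support, confidence, and lift for one item pair.

    Arguments mirror the table: cntb, cnt1, cnt2 are the co-occurrence and
    single-item counts, tran_cnt is the number of transactions in the partition.
    """
    support = cntb / tran_cnt
    confidence = cntb / cnt1
    # P(item2 | item1) * P(item1 | item2) = (cntb/cnt1) * (cntb/cnt2)
    score = (cntb * cntb) / (cnt1 * cnt2)
    # Observed support divided by the support expected under independence.
    lift = support / ((cnt1 / tran_cnt) * (cnt2 / tran_cnt))
    return {
        "score": score,
        "support": support,
        "confidence": confidence,
        "lift": lift,
    }


def z_scores(cntb_values):
    """Return (cntb - mean(cntb)) / sd(cntb) for every pair in a partition.

    Assumption: sd is the population standard deviation. When all counts are
    equal, sd is 0 and no z-score is produced (None for every pair).
    """
    mu = mean(cntb_values)
    sd = pstdev(cntb_values)
    if sd == 0:
        return [None] * len(cntb_values)
    return [(c - mu) / sd for c in cntb_values]


# Hypothetical partition: 5 transactions, milk (item1) in 4, butter (item2) in 3,
# both together in 3.
metrics = pair_metrics(cntb=3, cnt1=4, cnt2=3, tran_cnt=5)
print({name: round(value, 4) for name, value in metrics.items()})
print(z_scores([1, 2, 3]))
print(z_scores([2, 2, 2]))
```

With these sample counts, the confidence matches the 3/4 = 0.75 from the milk and butter example, and the lift is greater than 1, which indicates a positive association between the two items.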