Association Rule Measures

Vantage Analytics Library User Guide

Deployment
VantageCloud
VantageCore
Edition
VMware
Enterprise
IntelliFlex
Lake
Product
Vantage Analytics Library
Release Number
2.2.0
Published
June 2025
ft:lastEdition
2025-07-02
Product Category
Teradata Vantage

Support

Support is a measure of the generality of an entire association rule, its antecedent or consequent, or a single item that it references.

For an entire rule, antecedent, or consequent, Support is the percentage of groups that contain all items referenced by the rule, antecedent, or consequent.

For a single item, Support is the percentage of groups that contain it.

Assume these definitions:
  • N is the total number of customers.
  • L is the number of customers who own the set of products in the antecedent.
  • R is the number of customers who own the set of products in the consequent.
  • LR is the number of customers who own all products in the association rule (this notation does not mean L*R).
These are the formulas for support:
  • Support (L) = L/N
  • Support (R) = R/N
  • Support (L → R) = LR/N

For example, assume there are 10 customers (N=10). Six have a checking account (L=6), five have a savings account (R=5), and four have both (LR=4). Support (L) = 6/10 = 0.6, Support (R) = 5/10 = 0.5, and Support (L → R) = 4/10 = 0.4.
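
The Support formulas can be sketched in Python as follows. This is only an illustration and not part of the Vantage Analytics Library: the function name `support` and the `groups` data set are hypothetical, built to match the 10-customer example.

```python
from fractions import Fraction

def support(groups, items):
    """Return the fraction of groups that contain every item in `items`."""
    if not groups:
        raise ValueError("at least one group is required")
    wanted = set(items)
    matching = sum(1 for group in groups if wanted <= set(group))
    return Fraction(matching, len(groups))

# Hypothetical data matching the example: N=10, L=6, R=5, LR=4.
groups = (
    [{"checking", "savings"}] * 4   # customers with both accounts
    + [{"checking"}] * 2            # checking only
    + [{"savings"}] * 1             # savings only
    + [set()] * 3                   # neither account
)

support_l = support(groups, {"checking"})              # Support (L)
support_r = support(groups, {"savings"})               # Support (R)
support_lr = support(groups, {"checking", "savings"})  # Support (L → R)

print(float(support_l), float(support_r), float(support_lr))
```

Exact fractions are used here so that the results can be compared with the hand calculation without rounding effects.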

Confidence

The Confidence of an association rule is the probability of R occurring in an item group given that L is in the item group:

Confidence (L → R) = Support (L → R) / Support (L)

Equivalently, Confidence is the percentage of groups containing L that also contain R:

Confidence (L → R) = LR/L

For example, the Confidence that checking account ownership implies savings account ownership is 4/6 (approximately 0.667).
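
As a minimal sketch of the two equivalent Confidence formulas, the following Python computes Confidence from the example counts. The function names are hypothetical and chosen only for illustration.

```python
def confidence_from_counts(lr_count, l_count):
    """Confidence (L → R) = LR/L."""
    if l_count == 0:
        raise ValueError("the antecedent must occur in at least one group")
    return lr_count / l_count

def confidence_from_support(support_lr, support_l):
    """Confidence (L → R) = Support (L → R) / Support (L)."""
    if support_l == 0:
        raise ValueError("Support (L) must be greater than zero")
    return support_lr / support_l

# Example counts: N=10, L=6, LR=4.
n, l_count, lr_count = 10, 6, 4

conf_counts = confidence_from_counts(lr_count, l_count)
conf_support = confidence_from_support(lr_count / n, l_count / n)

print(conf_counts, conf_support)
```

Both calls describe the same ratio because N cancels when Support (L → R) is divided by Support (L).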

Expected Value

The Expected Value of an association rule is the number of customers expected to have both L and R if there is no relationship between L and R. No relationship between L and R means customers who have L are neither more nor less likely to have R than are customers who do not have L.

These formulas for Expected Value (E_Value) are equivalent:
  • E_Value (L → R) = (L*R)/N
  • E_Value (L → R) = (Support (L)) * (Support (R)) * N

In the example, the Expected Value of the number of customers with both checking and savings accounts is (6*5)/10 = 3.
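
The equivalence of the two Expected Value formulas can be checked with a short Python sketch. The function names are hypothetical, and the inputs are the counts from the example.

```python
import math

def e_value_from_counts(n, l_count, r_count):
    """E_Value (L → R) = (L*R)/N."""
    if n <= 0:
        raise ValueError("N must be positive")
    return l_count * r_count / n

def e_value_from_support(n, support_l, support_r):
    """E_Value (L → R) = Support (L) * Support (R) * N."""
    return support_l * support_r * n

# Example counts: N=10, L=6, R=5.
n, l_count, r_count = 10, 6, 5

ev_counts = e_value_from_counts(n, l_count, r_count)
ev_support = e_value_from_support(n, l_count / n, r_count / n)

# The two results agree up to floating-point rounding.
print(ev_counts, math.isclose(ev_counts, ev_support))
```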

Expected Confidence

The Expected Confidence of an association rule is the Confidence that results if there is no relationship between L and R:

E_Confidence (L → R) = R/N

Because owning L has no effect on owning R, the Expected Confidence of the rule is also the percentage of customers who own R:

E_Confidence (L → R) = Support (R)

The Expected Confidence of the rule that having a checking account implies having a savings account is 5/10.

Lift

The Lift of an association rule is the factor by which the presence of L in an item group changes the probability of R. For example:
  • Lift (L → R) = 1 means there are exactly as many occurrences of R as expected. The presence of L neither increases nor decreases the probability of R.
  • Lift (L → R) = 5 means there are 5 times as many occurrences of R as expected. The presence of L increases the probability of R by a factor of 5.
  • Lift (L → R) = 0.5 means there are half as many occurrences of R as expected. The presence of L decreases the probability of R by half.

This is a formula for Lift:

Lift (L → R) = LR / E_Value (L → R)

Equivalently, Lift is the ratio of Confidence to Expected Confidence, and you can calculate it by either of these formulas:
  • Lift (L → R) = (Confidence (L → R)) / (E_Confidence (L → R))
  • Lift (L → R) = (Confidence (L → R)) / (Support (R))

The Lift of the rule that having a checking account implies having a savings account is 4/3.
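
The following Python sketch computes Lift for the example in two ways: as the actual count divided by the Expected Value, and as Confidence divided by Expected Confidence. The function names are hypothetical and are not part of any library.

```python
import math

def lift_from_counts(n, l_count, r_count, lr_count):
    """Lift (L → R) = LR / E_Value (L → R), where E_Value = (L*R)/N."""
    e_value = l_count * r_count / n
    if e_value == 0:
        raise ValueError("Expected Value must be greater than zero")
    return lr_count / e_value

def lift_from_confidence(n, l_count, r_count, lr_count):
    """Lift (L → R) = Confidence (L → R) / E_Confidence (L → R)."""
    confidence = lr_count / l_count   # LR/L
    e_confidence = r_count / n        # R/N, which equals Support (R)
    return confidence / e_confidence

# Example counts: N=10, L=6, R=5, LR=4.
lift_a = lift_from_counts(10, 6, 5, 4)
lift_b = lift_from_confidence(10, 6, 5, 4)

print(lift_a, math.isclose(lift_a, lift_b))
```

A result above 1 indicates that customers with a checking account are more likely than expected to have a savings account.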

The Lift formulas do not account for ordered items. For more information, see Sequence Analysis.

Z-Score

The Z-score of an association rule is the number of standard deviations by which the actual result differs from the expected result. For example:
  • Z-score (L → R) = 0 means the actual and expected results are the same. The presence of L neither increases nor decreases the likelihood of owning R.
  • Z-score (L → R) = 1 means the actual result is 1 standard deviation more than the expected result. The presence of L increases the likelihood of owning R.
  • Z-score (L → R) = -3 means the actual result is 3 standard deviations less than the expected result. The presence of L decreases the likelihood of owning R.

A Z-score greater than 3 or less than -3 is statistically significant, which means the difference between the actual and expected result is very unlikely to be due to chance.

A Z-score helps answer the question of how confident you can be about the observed relationship between L and R, but does not directly indicate the magnitude of the relationship.

These formulas for Z-score are equivalent:
  • Z-score (L → R) = (LR - E_Value (L → R)) / SQRT (E_Value (L → R) * (1 - (E_Value (L → R) / N)))
  • Z-score (L → R) = (N * Support (L → R) - N * Support (L) * Support (R)) / SQRT (N * Support (L) * Support (R) * (1 - Support (L) * Support (R)))

In the example, the mean value is the Expected Value, E_Value (L → R) = (6*5)/10 = 3.

The actual value is LR, which is 4.

The standard deviation is SQRT (E_Value (L → R) * (1 - (E_Value (L → R) / N))), which is SQRT (3 * (1 - 3/10)) = 1.449.

Therefore, the Z-score is (4 - 3) / 1.449 = 0.690.
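
The worked Z-score calculation can be reproduced with the Python sketch below, which implements both equivalent formulas. The function names are hypothetical, and the code assumes the counts from the example.

```python
import math

def z_score_from_counts(n, l_count, r_count, lr_count):
    """Z-score (L → R) = (LR - E_Value) / SQRT (E_Value * (1 - E_Value/N))."""
    e_value = l_count * r_count / n
    std_dev = math.sqrt(e_value * (1 - e_value / n))
    if std_dev == 0:
        raise ValueError("standard deviation is zero; Z-score is undefined")
    return (lr_count - e_value) / std_dev

def z_score_from_support(n, support_l, support_r, support_lr):
    """The same Z-score expressed with Support values."""
    expected = support_l * support_r   # E_Value / N
    std_dev = math.sqrt(n * expected * (1 - expected))
    if std_dev == 0:
        raise ValueError("standard deviation is zero; Z-score is undefined")
    return (n * support_lr - n * expected) / std_dev

# Example counts: N=10, L=6, R=5, LR=4.
z_counts = z_score_from_counts(10, 6, 5, 4)
z_support = z_score_from_support(10, 0.6, 0.5, 0.4)

print(round(z_counts, 3), math.isclose(z_counts, z_support))
```

The value is well inside the range from -3 to 3, so under the threshold stated above the example relationship is not statistically significant.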

The Z-score formulas do not account for ordered items. For more information, see Sequence Analysis.