Interpreting Measures | Association Rules | Vantage Analytics Library

Vantage Analytics Library User Guide

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
Lake
VMware
Product
Vantage Analytics Library
Release Number
2.2.0
Published
March 2023
Language
English (United States)
Last Update
2024-01-02
Product Category
Teradata Vantage

As an example of interpreting the measures of an association rule L → R, consider product ownership association analysis.

Confidence measures the strength of an association: What percent of L customers also own R? A rule with high confidence that applies to very few customers—that is, a rule with low Support—is not very useful.

Another shortcoming of Confidence is that it does not tell whether owning L affects the likelihood of owning R, which is probably more important. If customers who own L own R at the same rate as the entire population, Confidence provides no information. For example, if 20% of all customers own R, you probably want to find the products L for which the Confidence of L → R is significantly greater than 20%.

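This is what Lift measures: the ratio of Confidence to Expected Confidence (the share of all customers who own R), that is, how much owning L increases or decreases the probability of owning R. A Lift above 1 means that L owners are more likely than average to own R; a Lift below 1 means that they are less likely.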
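As a minimal sketch of how these measures relate, the following Python computes them from raw ownership counts. It assumes that Lift is the ratio of Confidence to Expected Confidence, which is consistent with the Lift of 1.1 in the Z-score example on this page. The function name and the customer counts are hypothetical, and the library's own computation may differ in detail.

```python
def rule_measures(n_total, n_left, n_right, n_both):
    """Return Support, Confidence, Expected Confidence, and Lift for L -> R.

    Assumed definitions (not taken from the library's source code):
      Support             = share of all customers who own both L and R
      Confidence          = share of L owners who also own R
      Expected Confidence = share of all customers who own R
      Lift                = Confidence / Expected Confidence
    """
    support = n_both / n_total
    confidence = n_both / n_left
    expected_confidence = n_right / n_total
    lift = confidence / expected_confidence
    return {
        "support": support,
        "confidence": confidence,
        "expected_confidence": expected_confidence,
        "lift": lift,
    }


# Hypothetical counts: 1,000 customers; 100 own L, 200 own R, 40 own both.
measures = rule_measures(n_total=1000, n_left=100, n_right=200, n_both=40)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

With these counts, 20% of all customers own R but 40% of L owners do, so owning L doubles the likelihood of owning R.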

Like Confidence, Lift is much less meaningful with low Support. For example, if 2 customers are expected to own both L and R and 8 actually do, Lift is an impressive 4 (400%). However, because of the small number of customers, the rule is not very useful. The association might even have occurred by chance.

Z-score shows how trustworthy the observed difference between actual and expected ownership is, relative to what could be observed due to chance alone. For example, if the expectation is that 10,000 customers own both L and R, and 11,000 customers actually do, Lift is only 1.1, but the Z-score is very high, because such a large difference is very unlikely to be due to chance.

A large Z-score and a small Lift mean that owning L almost certainly affects owning R, but the effect is small. A small Z-score and a large Lift mean that owning L appears to have a large effect on owning R, but the effect might not be real.
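The contrast between the two cases can be illustrated with a rough sketch. It assumes a normal approximation to the binomial distribution for the number of customers who own both products; this is one common way to compute such a Z-score and is not necessarily the formula the library uses. The function name, the total of 1,000,000 customers, and the counts in the second case are hypothetical.

```python
import math


def lift_and_zscore(n_total, expected_both, observed_both):
    """Return (Lift, Z-score) for a rule, given expected and observed joint counts.

    Assumption: under independence, the number of customers owning both L and R
    is approximately binomial with n_total trials and success probability
    expected_both / n_total.
    """
    p = expected_both / n_total
    std_dev = math.sqrt(n_total * p * (1 - p))
    lift = observed_both / expected_both
    zscore = (observed_both - expected_both) / std_dev
    return lift, zscore


# Case from the text: 10,000 expected, 11,000 observed (population size assumed).
lift_big, z_big = lift_and_zscore(1_000_000, 10_000, 11_000)

# Hypothetical small case: 1 expected, 2 observed.
lift_small, z_small = lift_and_zscore(1_000_000, 1, 2)

print(f"large counts: lift={lift_big:.2f}, z={z_big:.2f}")
print(f"small counts: lift={lift_small:.2f}, z={z_small:.2f}")
```

Under these assumptions, the first rule has a modest Lift of 1.1 but a Z-score of about 10, while the second has a Lift of 2 but a Z-score of about 1, which is too low to rule out chance.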

The strategy for interpreting the measures of association rules depends on the nature of the business problem, but here is a suggestion:
  1. Discard rules with low Z-scores (for example, less than 2).
  2. Filter rules according to Support and Lift.

    Where to set the Support threshold depends on which products are of interest and on performance considerations.

    Where to set the Lift threshold depends on how large a lift is useful from a business perspective. If a Lift threshold of 1.5 does not yield interesting results, set the threshold higher.
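The two-step strategy above can be sketched as a simple filter over a list of rules. The dictionary keys, the sample rules, and the Support threshold of 0.01 are hypothetical and do not reflect the library's output column names; the Z-score and Lift thresholds are the example values from the text.

```python
def filter_rules(rules, min_zscore=2.0, min_support=0.01, min_lift=1.5):
    """Apply the suggested strategy: drop low Z-scores, then filter on Support and Lift."""
    # Step 1: discard rules whose Z-score is below the threshold.
    trusted = [r for r in rules if r["zscore"] >= min_zscore]
    # Step 2: keep only rules that meet both the Support and Lift thresholds.
    return [
        r for r in trusted
        if r["support"] >= min_support and r["lift"] >= min_lift
    ]


# Hypothetical rules with made-up measures.
rules = [
    {"rule": "A -> B", "support": 0.05, "lift": 1.8, "zscore": 6.0},
    {"rule": "C -> D", "support": 0.0001, "lift": 4.0, "zscore": 1.2},  # low Z-score
    {"rule": "E -> F", "support": 0.2, "lift": 1.1, "zscore": 10.0},    # low Lift
    {"rule": "G -> H", "support": 0.002, "lift": 3.0, "zscore": 2.5},   # low Support
]

kept = filter_rules(rules)
kept_names = [r["rule"] for r in kept]
print(kept_names)
```

Keeping the thresholds as parameters makes it easy to raise the Lift threshold if the first pass returns too many uninteresting rules.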