FPGrowth Output

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.0, 8.00
Release Date: May 2019
Content Type: Programming Reference
Publication ID: B700-4003-098K
Language: English (United States)

The FPGrowth function outputs a message and either a pattern table, a rule table, or both (depending on the PatternsOrRules argument).

Output Message Schema

| Column | Data Type | Description |
| --- | --- | --- |
| output_information | VARCHAR | Reports that patterns and rules are kept in tables specified in OutputPatternTable and OutputRuleTable arguments. |

OutputPatternTable Schema

| Column | Data Type | Description |
| --- | --- | --- |
| partition_column | Same as in input table | [Column appears once for each specified partition_column.] Column copied from input table. |
| pattern_target_column | VARCHAR | Pattern composed of transaction items. |
| length_of_pattern | INTEGER | Number of items in the pattern. |
| count | BIGINT | Number of occurrences of the pattern. |
| support | DOUBLE PRECISION | Percentage of transactions that contain the pattern: count/t, where t is the number of transactions. For example, if eggs and milk were purchased together 3 times in 5 transactions, the support value is 3/5, or 60%. |
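The count and support columns can be reproduced by hand on a small data set. The following is a minimal Python sketch, not the function's implementation; the helper name `pattern_support` and the five sample transactions are hypothetical, chosen so that the result matches the eggs-and-milk example above.

```python
def pattern_support(transactions, pattern):
    """Return (count, support) for a pattern over a list of item sets.

    count   = number of transactions that contain every item in the pattern
    support = count / t, where t is the number of transactions
    """
    pattern = set(pattern)
    count = sum(1 for items in transactions if pattern <= set(items))
    return count, count / len(transactions)


# Hypothetical sample data: eggs and milk appear together in 3 of 5 transactions.
transactions = [
    {"eggs", "milk", "bread"},
    {"eggs", "milk"},
    {"milk", "butter"},
    {"eggs", "milk", "butter"},
    {"bread"},
]

count, support = pattern_support(transactions, {"eggs", "milk"})
print(count, support)  # 3 0.6
```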

OutputRuleTable Schema

| Column | Data Type | Description |
| --- | --- | --- |
| antecedent_target_column | VARCHAR | Items in the antecedent of the rule. |
| consequence_target_column | VARCHAR | Items in the consequence of the rule. |
| count_of_antecedent | INTEGER | Count of items in the antecedent of the rule. |
| count_of_consequence | INTEGER | Count of items in the consequence of the rule. |
| cntb | BIGINT | Count of transactions that contain both the antecedent and consequence. |
| cnt_antecedent | BIGINT | Count of transactions that contain the antecedent. |
| cnt_consequence | BIGINT | Count of transactions that contain the consequence. |
| score | DOUBLE PRECISION | Product of two conditional probabilities: (cntb / cnt_antecedent) * (cntb / cnt_consequence) |
| support | DOUBLE PRECISION | Percentage of transactions that contain both the antecedent and consequence: cntb/t, where t is the number of transactions. For example, if eggs and milk were purchased together 3 times in 5 transactions, then the support value is 3/5, or 60%. |
| confidence | DOUBLE PRECISION | Percentage of transactions that contain the antecedent that also contain the consequence: cntb / cnt_antecedent. For example, for the antecedent milk and consequence butter, if cntb=3 and cnt_antecedent=4, then the confidence value is 3/4, or 75%. In other words, 75% of the time, when a person buys milk, the person also buys butter. |
| lift | DOUBLE PRECISION | Ratio of the observed support value to the support value expected if the antecedent and consequence were independent: (cntb/t) / ((cnt_antecedent/t) * (cnt_consequence/t)) |
| conviction | DOUBLE PRECISION | Alternative to confidence that also takes into account how frequently the consequence occurs (cnt_consequence/t): (1 - cnt_consequence/t) / (1 - cntb/cnt_antecedent) |
| leverage | DOUBLE PRECISION | Difference between the percentage of transactions that contain both the antecedent and consequence (cntb/t) and the expectation for cntb/t if the antecedent and consequence were statistically independent: (cntb/t) - ((cnt_antecedent/t) * (cnt_consequence/t)) |
| coverage | DOUBLE PRECISION | Percentage of transactions in which the rule applies: cnt_antecedent/t. Another name for coverage is antecedent support. |
| chi_square | DOUBLE PRECISION | Chi-squared test result, used to test the hypothesis that the antecedent and consequence are not associated. The formula follows this table. |
| z_score | DOUBLE PRECISION | Significance of cntb, assuming that it follows a normal distribution: (cntb - mean(cntb)) / standard_deviation(cntb). If every cntb is the same, then standard_deviation(cntb) is 0, and the function does not compute z_score. |
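The rule metrics in this table all derive from four numbers: cntb, cnt_antecedent, cnt_consequence, and t. The following Python sketch recomputes them from those counts, which can help when sanity-checking rows of the rule table. It is an illustration only, not the function's implementation. The names `rule_metrics` and `z_scores` are hypothetical. The use of the population standard deviation for z_score and the use of `None` for values that cannot be computed are assumptions, because this reference does not specify either. The example reuses cntb=3 and cnt_antecedent=4 from the confidence description; cnt_consequence=5 and t=10 are made-up values.

```python
from statistics import mean, pstdev


def rule_metrics(cntb, cnt_antecedent, cnt_consequence, t):
    """Recompute the per-rule metrics from the four counts."""
    support = cntb / t
    confidence = cntb / cnt_antecedent
    coverage = cnt_antecedent / t             # antecedent support
    consequence_support = cnt_consequence / t
    # The conviction formula divides by (1 - confidence), so it has no value
    # when confidence is 1. Returning None here is an assumption.
    if cntb == cnt_antecedent:
        conviction = None
    else:
        conviction = (1 - consequence_support) / (1 - confidence)
    return {
        "score": confidence * (cntb / cnt_consequence),
        "support": support,
        "confidence": confidence,
        "lift": support / (coverage * consequence_support),
        "conviction": conviction,
        "leverage": support - coverage * consequence_support,
        "coverage": coverage,
    }


def z_scores(cntb_values):
    """z_score of each rule's cntb relative to the cntb of all rules.

    Assumption: population standard deviation. When every cntb is equal,
    the deviation is 0 and no z_score is computed (None is returned).
    """
    mu = mean(cntb_values)
    sigma = pstdev(cntb_values)
    if sigma == 0:
        return [None] * len(cntb_values)
    return [(c - mu) / sigma for c in cntb_values]


metrics = rule_metrics(cntb=3, cnt_antecedent=4, cnt_consequence=5, t=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")

print(z_scores([1, 3]))     # [-1.0, 1.0]
print(z_scores([3, 3, 3]))  # [None, None, None]
```

With these counts, lift is greater than 1 and leverage is positive, which both indicate that the antecedent and consequence occur together more often than independence would predict.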

Formula for chi_square Value

(t * (cntb * (t + cntb - cnt_antecedent - cnt_consequence) - (cnt_antecedent - cntb) * (cnt_consequence - cntb))**2) /
  (cnt_antecedent * (t - cnt_antecedent) * cnt_consequence * (t - cnt_consequence))
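The formula is the chi-squared statistic for a 2x2 contingency table whose cells count the transactions that contain both the antecedent and consequence, only the antecedent, only the consequence, or neither. The following Python sketch spells out those four cells; the function name `chi_square` and the sample counts are hypothetical, and returning `None` when the denominator is 0 is an assumption, because this reference does not describe that case.

```python
def chi_square(cntb, cnt_antecedent, cnt_consequence, t):
    """Chi-squared statistic for a rule, from the four counts."""
    # Cells of the 2x2 contingency table.
    both = cntb
    antecedent_only = cnt_antecedent - cntb
    consequence_only = cnt_consequence - cntb
    neither = t + cntb - cnt_antecedent - cnt_consequence

    numerator = t * (both * neither - antecedent_only * consequence_only) ** 2
    denominator = (
        cnt_antecedent * (t - cnt_antecedent)
        * cnt_consequence * (t - cnt_consequence)
    )
    if denominator == 0:
        # Happens when the antecedent or consequence appears in every
        # transaction or in none. Returning None is an assumption.
        return None
    return numerator / denominator


# Hypothetical counts: 10 transactions, antecedent in 4, consequence in 5.
print(chi_square(cntb=3, cnt_antecedent=4, cnt_consequence=5, t=10))
# With cntb=2 the observed overlap equals the overlap expected under
# independence (4 * 5 / 10), so the statistic is 0.
print(chi_square(cntb=2, cnt_antecedent=4, cnt_consequence=5, t=10))  # 0.0
```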