FPGrowth Output

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

The FPGrowth function outputs a message and either a pattern table, a rule table, or both (depending on the PatternsOrRules syntax element).

Output Message Schema

Column | Data Type | Description
output_information | VARCHAR | Reports that patterns and rules are kept in tables specified in OutputPatternsTable and OutputRulesTable syntax elements.

OutputPatternsTable Schema

Column | Data Type | Description
group_by_column | Same as in InputTable | Column copied from InputTable. [Column appears once for each specified group_by_column.]
pattern_target_column | VARCHAR | Pattern composed of transaction items.
length_of_pattern | INTEGER | Number of items in pattern.
count | BIGINT | Number of occurrences of the pattern.
support | DOUBLE PRECISION | Percentage of transactions that contain the pattern: count/t, where t is the number of transactions. For example, if eggs and milk were purchased together 3 times in 5 transactions, the support value is 3/5, or 60%.
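The support calculation above can be sketched in a few lines of Python. This is only an illustration of the count/t formula, not the FPGrowth implementation; the function name pattern_support and the sample baskets are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def pattern_support(transactions: List[Set[str]], pattern: Iterable[str]) -> Tuple[int, float]:
    """Return (count, support) for a pattern, where support = count / t."""
    items = set(pattern)
    # A transaction contains the pattern when every pattern item is in the basket.
    count = sum(1 for basket in transactions if items <= basket)
    t = len(transactions)
    return count, count / t


# Hypothetical data: eggs and milk appear together in 3 of 5 transactions.
transactions = [
    {"eggs", "milk", "bread"},
    {"eggs", "milk"},
    {"milk", "butter"},
    {"eggs", "milk", "butter"},
    {"bread"},
]
count, support = pattern_support(transactions, ["eggs", "milk"])
print(count, support)  # 3 0.6
```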

OutputRulesTable Schema

Column | Data Type | Contents
antecedent_target_column | VARCHAR | Items in the antecedent of the rule.
consequence_target_column | VARCHAR | Items in the consequence of the rule.
count_of_antecedent | INTEGER | Count of items in the antecedent of the rule.
count_of_consequence | INTEGER | Count of items in the consequence of the rule.
cntb | BIGINT | Count of transactions that contain both the antecedent and consequence.
cnt_antecedent | BIGINT | Count of transactions that contain the antecedent.
cnt_consequence | BIGINT | Count of transactions that contain the consequence.
score | DOUBLE PRECISION | Product of two conditional probabilities: (cntb / cnt_antecedent) * (cntb / cnt_consequence)
support | DOUBLE PRECISION | Percentage of transactions that contain both the antecedent and consequence: cntb/t, where t is the number of transactions. For example, if eggs and milk were purchased together 3 times in 5 transactions, then the support value is 3/5, or 60%.
confidence | DOUBLE PRECISION | Percentage of transactions that contain the antecedent that also contain the consequence: cntb / cnt_antecedent. For example, for the antecedent milk and consequence butter, if cntb=3 and cnt_antecedent=4, then the confidence value is 3/4, or 75%. In other words, 75% of the time, when a person buys milk, the person also buys butter.
lift | DOUBLE PRECISION | Ratio of the observed support value to the expected support value if the antecedent and consequence are independent: (cntb/t) / ((cnt_antecedent/t) * (cnt_consequence/t))
conviction | DOUBLE PRECISION | Alternative to confidence that also accounts for how often the consequence occurs overall. It is the ratio of the percentage of transactions that lack the consequence to the percentage of antecedent transactions that lack the consequence: (1-cnt_consequence/t) / (1-cntb/cnt_antecedent)
leverage | DOUBLE PRECISION | Difference between the percentage of transactions that contain both the antecedent and consequence (cntb/t) and the expectation for cntb/t if the antecedent and consequence were statistically independent: (cntb/t) - ((cnt_antecedent/t) * (cnt_consequence/t))
coverage | DOUBLE PRECISION | Percentage of transactions in which the rule applies: cnt_antecedent/t. Another name for coverage is antecedent support.
chi_square | DOUBLE PRECISION | Chi-squared test result, used to test the hypothesis that the antecedent and consequence are not associated. The formula follows this table.
z_score | DOUBLE PRECISION | Significance of cntb, assuming that it follows a normal distribution: (cntb - mean(cntb)) / standard_deviation(cntb). If every cntb is the same, then standard_deviation(cntb) is 0, and the function does not compute z_score.
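As a cross-check on the rule measures above, the following Python sketch evaluates the count-based formulas for one rule and standardizes cntb across a set of rules. It is an illustration only, not the FPGrowth implementation. The names rule_metrics and z_scores are hypothetical, and two choices are assumptions that the table does not specify: returning infinity for conviction when confidence is 1, and using the population standard deviation for z_score.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional


def rule_metrics(cntb: int, cnt_antecedent: int, cnt_consequence: int, t: int) -> Dict[str, float]:
    """Evaluate the count-based rule measures from the formulas in the table."""
    support = cntb / t
    confidence = cntb / cnt_antecedent
    coverage = cnt_antecedent / t
    consequence_support = cnt_consequence / t
    expected = coverage * consequence_support  # expected support under independence
    # Assumption: report infinity when confidence is 1 (zero denominator);
    # the source does not say what FPGrowth returns in that case.
    if confidence == 1:
        conviction = float("inf")
    else:
        conviction = (1 - consequence_support) / (1 - confidence)
    return {
        "score": confidence * (cntb / cnt_consequence),
        "support": support,
        "confidence": confidence,
        "lift": support / expected,
        "conviction": conviction,
        "leverage": support - expected,
        "coverage": coverage,
    }


def z_scores(cntb_values: List[int]) -> Optional[List[float]]:
    """Standardize cntb across rules; return None when every cntb is the same."""
    # Assumption: population standard deviation; the source does not say
    # whether the population or the sample form is used.
    sd = pstdev(cntb_values)
    if sd == 0:
        return None
    mu = mean(cntb_values)
    return [(c - mu) / sd for c in cntb_values]


# Hypothetical rule milk -> butter: cntb=3 and cnt_antecedent=4 come from the
# confidence example; cnt_consequence=5 and t=10 are made up for illustration.
metrics = rule_metrics(cntb=3, cnt_antecedent=4, cnt_consequence=5, t=10)
print(metrics["confidence"])  # 0.75
print(z_scores([4, 4, 4]))    # None
```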

Formula for chi_square Value

(t * (cntb * (t + cntb - cnt_antecedent - cnt_consequence) - (cnt_antecedent - cntb) * (cnt_consequence - cntb))**2) /
  (cnt_antecedent * (t - cnt_antecedent) * cnt_consequence * (t - cnt_consequence))
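The chi_square formula is the standard chi-squared statistic for a 2x2 contingency table of antecedent presence against consequence presence. The Python sketch below spells out the four cells to make that correspondence visible. It is an illustration under assumptions, not the FPGrowth implementation: the function name chi_square_value is hypothetical, and returning None when the denominator is 0 is an assumption, because the source does not describe that case.

```python
from typing import Optional


def chi_square_value(cntb: int, cnt_antecedent: int, cnt_consequence: int, t: int) -> Optional[float]:
    """Evaluate the chi_square formula from the four counts."""
    # Cells of the 2x2 contingency table.
    both = cntb
    antecedent_only = cnt_antecedent - cntb
    consequence_only = cnt_consequence - cntb
    neither = t + cntb - cnt_antecedent - cnt_consequence

    denominator = (
        cnt_antecedent * (t - cnt_antecedent) * cnt_consequence * (t - cnt_consequence)
    )
    if denominator == 0:
        # Assumption: the statistic is undefined when the antecedent or the
        # consequence occurs in none or all of the transactions.
        return None
    return t * (both * neither - antecedent_only * consequence_only) ** 2 / denominator


# Hypothetical counts. With cntb=2 the observed support (2/10) equals the
# expected support under independence (4/10 * 5/10), so the statistic is 0.
print(chi_square_value(2, 4, 5, 10))  # 0.0
```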