FPGrowth Output - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 9.02, 9.01, 2.0, 1.3
Published: February 2022
Language: English (United States)
Last Update: 2022-02-10
dita:id: B700-4003
lifecycle: previous
Product Category: Teradata Vantage™

The FPGrowth function outputs a message and either a pattern table, a rule table, or both (depending on the PatternsOrRules syntax element).

Output Message Schema

| Column | Data Type | Description |
|---|---|---|
| output_information | VARCHAR | Reports that patterns and rules are stored in the tables specified by the OutputPatternsTable and OutputRulesTable syntax elements. |

OutputPatternsTable Schema

| Column | Data Type | Description |
|---|---|---|
| group_by_column | Same as in InputTable | Column copied from InputTable. [Column appears once for each specified group_by_column.] |
| pattern_target_column | VARCHAR | Pattern composed of transaction items. |
| length_of_pattern | INTEGER | Number of items in the pattern. |
| count | BIGINT | Number of occurrences of the pattern. |
| support | DOUBLE PRECISION | Percentage of transactions that contain the pattern: count/t, where t is the number of transactions. For example, if eggs and milk were purchased together 3 times in 5 transactions, the support value is 3/5, or 60%. |
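The count and support columns can be reproduced by hand for a small data set. The following Python sketch is illustrative only and is not how the FPGrowth function is implemented: the function name pattern_support and the sample transactions are hypothetical, chosen so that eggs and milk occur together in 3 of 5 transactions, as in the example above. It returns support as a fraction (0.6) rather than a percentage.

```python
def pattern_support(transactions, pattern):
    """Return (count, support) for a pattern, with support = count / t."""
    pattern = set(pattern)
    t = len(transactions)  # t is the number of transactions
    # A transaction contains the pattern if every pattern item is in it.
    count = sum(1 for items in transactions if pattern <= set(items))
    return count, count / t


# Hypothetical transactions: eggs and milk appear together in 3 of the 5.
transactions = [
    {"eggs", "milk", "bread"},
    {"eggs", "milk"},
    {"milk", "butter"},
    {"eggs", "milk", "butter"},
    {"bread"},
]

count, support = pattern_support(transactions, {"eggs", "milk"})
print(count, support)
```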

OutputRulesTable Schema

| Column | Data Type | Description |
|---|---|---|
| antecedent_target_column | VARCHAR | Items in the antecedent of the rule. |
| consequence_target_column | VARCHAR | Items in the consequence of the rule. |
| count_of_antecedent | INTEGER | Number of items in the antecedent of the rule. |
| count_of_consequence | INTEGER | Number of items in the consequence of the rule. |
| cntb | BIGINT | Number of transactions that contain both the antecedent and the consequence. |
| cnt_antecedent | BIGINT | Number of transactions that contain the antecedent. |
| cnt_consequence | BIGINT | Number of transactions that contain the consequence. |
| score | DOUBLE PRECISION | Product of two conditional probabilities: (cntb / cnt_antecedent) * (cntb / cnt_consequence) |
| support | DOUBLE PRECISION | Percentage of transactions that contain both the antecedent and the consequence: cntb/t, where t is the number of transactions. For example, if eggs and milk were purchased together 3 times in 5 transactions, then the support value is 3/5, or 60%. |
| confidence | DOUBLE PRECISION | Percentage of transactions that contain the antecedent that also contain the consequence: cntb / cnt_antecedent. For example, for the antecedent milk and consequence butter, if cntb=3 and cnt_antecedent=4, then the confidence value is 3/4, or 75%. In other words, 75% of the time, when a person buys milk, the person also buys butter. |
| lift | DOUBLE PRECISION | Ratio of the observed support value to the expected support value if the antecedent and consequence were independent: (cntb/t) / ((cnt_antecedent/t) * (cnt_consequence/t)) |
| conviction | DOUBLE PRECISION | More reliable alternative to confidence, because it also accounts for how often the consequence occurs overall. It is the ratio of the expected frequency of the antecedent occurring without the consequence (if the two were independent) to the observed frequency: (1-cnt_consequence/t) / (1-cntb/cnt_antecedent). The denominator is 0 when cntb equals cnt_antecedent. |
| leverage | DOUBLE PRECISION | Difference between the percentage of transactions that contain both the antecedent and the consequence (cntb/t) and the expected value of cntb/t if the antecedent and consequence were statistically independent: (cntb/t) - ((cnt_antecedent/t) * (cnt_consequence/t)) |
| coverage | DOUBLE PRECISION | Percentage of transactions to which the rule applies: cnt_antecedent/t. Another name for coverage is antecedent support. |
| chi_square | DOUBLE PRECISION | Chi-squared test result, used to test the hypothesis that the antecedent and consequence are not associated. The formula follows this table. |
| z_score | DOUBLE PRECISION | Significance of cntb, assuming that it follows a normal distribution: (cntb - mean(cntb)) / standard_deviation(cntb). If every cntb is the same, then standard_deviation(cntb) is 0, and the function does not compute z_score. |
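The ratio-based rule columns all derive from four counts: cntb, cnt_antecedent, cnt_consequence, and t. The following Python sketch applies the formulas from the table to one rule. It is an illustration, not the function's implementation. The names rule_metrics and z_scores are hypothetical, as are the values cnt_consequence=5 and t=10 (cntb=3 and cnt_antecedent=4 come from the confidence example). Two choices are assumptions that this page does not confirm: conviction is reported as infinity when confidence is 1, and z_score uses the population standard deviation.

```python
import math
import statistics


def rule_metrics(cntb, cnt_antecedent, cnt_consequence, t):
    """Apply the OutputRulesTable formulas to the four counts of one rule."""
    support = cntb / t
    confidence = cntb / cnt_antecedent
    coverage = cnt_antecedent / t
    consequence_support = cnt_consequence / t
    # Support expected if antecedent and consequence were independent.
    expected_support = coverage * consequence_support
    # Assumption: use infinity when the conviction denominator is 0.
    if confidence < 1:
        conviction = (1 - consequence_support) / (1 - confidence)
    else:
        conviction = math.inf
    return {
        "score": confidence * (cntb / cnt_consequence),
        "support": support,
        "confidence": confidence,
        "lift": support / expected_support,
        "conviction": conviction,
        "leverage": support - expected_support,
        "coverage": coverage,
    }


def z_scores(cntb_values):
    """Return one z_score per rule, or None if all cntb values are equal."""
    mean = statistics.mean(cntb_values)
    # Assumption: population standard deviation (pstdev), not sample.
    spread = statistics.pstdev(cntb_values)
    if spread == 0:
        return None  # z_score is not computed in this case
    return [(cntb - mean) / spread for cntb in cntb_values]


metrics = rule_metrics(cntb=3, cnt_antecedent=4, cnt_consequence=5, t=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
print(z_scores([2, 4]))
print(z_scores([3, 3, 3]))
```

With these hypothetical counts, lift is above 1 and leverage is above 0, both of which indicate that the antecedent and consequence occur together more often than independence would predict.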

Formula for chi_square Value

(t * (cntb * (t + cntb - cnt_antecedent - cnt_consequence) - (cnt_antecedent - cntb) * (cnt_consequence - cntb))**2) /
  (cnt_antecedent * (t - cnt_antecedent) * cnt_consequence * (t - cnt_consequence))
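The formula is easier to read as the chi-squared statistic of a 2x2 contingency table whose cells count the transactions with and without the antecedent and the consequence. The following Python sketch names those four cells and then evaluates the same expression. The function name chi_square and the sample counts are hypothetical, and returning None when the denominator is 0 is an assumption, because this page does not say what the function outputs in that case.

```python
def chi_square(cntb, cnt_antecedent, cnt_consequence, t):
    """Evaluate the chi_square formula for one rule from its four counts."""
    # Cells of the 2x2 contingency table.
    both = cntb
    antecedent_only = cnt_antecedent - cntb
    consequence_only = cnt_consequence - cntb
    neither = t + cntb - cnt_antecedent - cnt_consequence
    # Product of the row totals and column totals of the table.
    denominator = (cnt_antecedent * (t - cnt_antecedent)
                   * cnt_consequence * (t - cnt_consequence))
    if denominator == 0:
        return None  # assumption: statistic is undefined for this rule
    numerator = t * (both * neither - antecedent_only * consequence_only) ** 2
    return numerator / denominator


# Hypothetical counts: 10 transactions, antecedent in 4, consequence in 5.
print(chi_square(cntb=3, cnt_antecedent=4, cnt_consequence=5, t=10))
# With cntb=2 the observed support equals the expected support, so the
# statistic is 0.
print(chi_square(cntb=2, cnt_antecedent=4, cnt_consequence=5, t=10))
```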