The FPGrowth function outputs a message and either a pattern table, a rule table, or both (depending on the PatternsOrRules argument).
Output Message Schema
|output_information||VARCHAR||Reports that the patterns and rules are stored in the tables specified by the OutputPatternTable and OutputRuleTable arguments.|
Output Pattern Table Schema
|partition_column||Same as in input table||[Column appears once for each specified partition_column.] Column copied from input table.|
|pattern_target_column||VARCHAR||Pattern composed of transaction items.|
|length_of_pattern||INTEGER||Number of items in pattern.|
|count||BIGINT||Number of occurrences of the pattern.|
|support||DOUBLE PRECISION||Percentage of transactions that contain the pattern: count/t, where t is the number of transactions.
For example, if eggs and milk were purchased together 3 times in 5 transactions, the support value is 3/5, or 60%.|
Output Rule Table Schema
|antecedent_target_column||VARCHAR||Items in the antecedent of the rule.|
|consequence_target_column||VARCHAR||Items in the consequence of the rule.|
|count_of_antecedent||INTEGER||Count of items in the antecedent of the rule.|
|count_of_consequence||INTEGER||Count of items in the consequence of the rule.|
|cntb||BIGINT||Count of transactions that contain both the antecedent and consequence.|
|cnt_antecedent||BIGINT||Count of transactions that contain the antecedent.|
|cnt_consequence||BIGINT||Count of transactions that contain the consequence.|
|score||DOUBLE PRECISION||Product of two conditional probabilities:
(cntb / cnt_antecedent) * (cntb / cnt_consequence)|
|support||DOUBLE PRECISION||Percentage of transactions that contain both the antecedent and consequence: cntb/t, where t is the number of transactions.
For example, if eggs and milk were purchased together 3 times in 5 transactions, then the support value is 3/5, or 60%.|
|confidence||DOUBLE PRECISION||Percentage of transactions that contain the antecedent that also contain the consequence:
cntb / cnt_antecedent
For example, for the antecedent milk and consequence butter, if cntb=3 and cnt_antecedent=4, then the confidence value is 3/4, or 75%. In other words, 75% of the time, when a person buys milk, the person also buys butter.|
|lift||DOUBLE PRECISION||Ratio of the observed support value to the expected support value if the antecedent and consequence are independent:
(cntb/t) / ((cnt_antecedent/t) * (cnt_consequence/t))|
|conviction||DOUBLE PRECISION||More reliable alternative to confidence:
(1-cnt_consequence/t) / (1-cntb/cnt_antecedent)|
|leverage||DOUBLE PRECISION||Difference between the percentage of transactions that contain both the antecedent and consequence (cntb/t) and the expectation for cntb/t if the antecedent and consequence were statistically independent:
(cntb/t) - ((cnt_antecedent/t) * (cnt_consequence/t))|
|coverage||DOUBLE PRECISION||Percentage of transactions in which the rule applies, that is, transactions that contain the antecedent:
cnt_antecedent / t
Another name for coverage is antecedent support.|
|chi_square||DOUBLE PRECISION||Chi-squared test result, used to test the hypothesis that the antecedent and consequence are not associated. The formula follows this table.|
|z_score||DOUBLE PRECISION||Significance of cntb, assuming that it follows a normal distribution:
(cntb - mean(cntb)) / standard_deviation(cntb)
If every cntb is the same, then standard_deviation(cntb) is 0, and the function does not compute z_score.|
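The per-rule statistics in the table above can be reproduced from four counts: cntb, cnt_antecedent, cnt_consequence, and t. The following Python sketch is only an illustration of those formulas, not the function's implementation. The names rule_metrics and z_scores are hypothetical, coverage is taken to be antecedent support (cnt_antecedent / t), conviction is assumed to be reported as infinity when confidence is exactly 1, and z_score is assumed to use the population standard deviation, which the table does not specify.

```python
import math
from statistics import mean, pstdev


def rule_metrics(cntb, cnt_antecedent, cnt_consequence, t):
    """Compute rule statistics from the counts described in the rule table.

    Hypothetical helper for illustration; not part of FPGrowth itself.
    """
    support = cntb / t
    confidence = cntb / cnt_antecedent
    coverage = cnt_antecedent / t  # antecedent support
    consequence_support = cnt_consequence / t

    # Assumption: when confidence is 1 the denominator is 0, so this
    # sketch reports infinity instead of raising ZeroDivisionError.
    if cntb == cnt_antecedent:
        conviction = math.inf
    else:
        conviction = (1 - consequence_support) / (1 - confidence)

    return {
        "support": support,
        "confidence": confidence,
        "coverage": coverage,
        "lift": support / (coverage * consequence_support),
        "leverage": support - coverage * consequence_support,
        "conviction": conviction,
        "score": confidence * (cntb / cnt_consequence),
    }


def z_scores(cntb_values):
    """Standardize the cntb value of every rule.

    Assumption: population standard deviation. Returns None when all
    values are equal, mirroring the case where z_score is not computed.
    """
    sd = pstdev(cntb_values)
    if sd == 0:
        return None
    mu = mean(cntb_values)
    return [(c - mu) / sd for c in cntb_values]


# Rule milk -> butter with cntb=3 and cnt_antecedent=4 out of t=5
# transactions; cnt_consequence=4 is an assumed value for illustration.
for name, value in rule_metrics(3, 4, 4, 5).items():
    print(f"{name}: {value:.4f}")
```

With these inputs, confidence is 3/4 and support is 3/5, matching the examples in the table. Lift comes out below 1 and leverage below 0, which indicates that the two items co-occur slightly less often than independence would predict.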
Formula for chi_square Value
(t * (cntb * (t + cntb - cnt_antecedent - cnt_consequence) - (cnt_antecedent - cntb) *
(cnt_consequence - cntb))**2) /
(cnt_antecedent * (t - cnt_antecedent) * cnt_consequence * (t - cnt_consequence))
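The chi_square formula is the standard chi-squared statistic for a 2x2 contingency table whose cells are: both antecedent and consequence (cntb), antecedent only, consequence only, and neither. The Python sketch below spells out those cells before applying the formula. The function name chi_square and the choice to return None when the denominator is 0 are assumptions made for illustration; the source does not describe how the function handles that case.

```python
def chi_square(cntb, cnt_antecedent, cnt_consequence, t):
    """Chi-squared statistic for a rule, following the formula above.

    Hypothetical helper for illustration; not part of FPGrowth itself.
    """
    # Cells of the 2x2 contingency table.
    neither = t + cntb - cnt_antecedent - cnt_consequence
    antecedent_only = cnt_antecedent - cntb
    consequence_only = cnt_consequence - cntb

    numerator = t * (cntb * neither - antecedent_only * consequence_only) ** 2
    denominator = (
        cnt_antecedent * (t - cnt_antecedent)
        * cnt_consequence * (t - cnt_consequence)
    )

    # Assumption: the statistic is undefined when the antecedent or the
    # consequence appears in none or all of the transactions.
    if denominator == 0:
        return None
    return numerator / denominator


# cntb=3, cnt_antecedent=4, cnt_consequence=4, t=5 (illustrative counts):
# numerator = 5 * (3*0 - 1*1)**2 = 5, denominator = 4*1*4*1 = 16
print(chi_square(3, 4, 4, 5))  # 0.3125
```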