1.1 - 8.10 - CFilter Output - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

Output Message Schema

Column Data Type Description
message VARCHAR Reports that output table was created successfully.

OutputTable Schema

Output is nondeterministic unless each add_column is unique in the group defined by JoinColumns (for more information, see Nondeterministic Results and UniqueID Syntax Element).

Column Data Type Description
col1_item1 VARCHAR Name of item1.
col1_item2 VARCHAR Name of item2.
cntb INTEGER Count of co-occurrence of both items in partition.
cnt1 INTEGER Count of occurrence of item1 in partition.
cnt2 INTEGER Count of occurrence of item2 in partition.
score DOUBLE PRECISION Product of two conditional probabilities:

P(item2 | item1) * P(item1 | item2)

Because P(item2 | item1) = cntb/cnt1 and P(item1 | item2) = cntb/cnt2, this product equals the following quotient:

(cntb * cntb)/(cnt1 * cnt2)

support DOUBLE PRECISION Percentage of transactions in partition in which the two items co-occur, calculated with this formula:

cntb/tran_cnt

where tran_cnt is the number of transactions in the partition.

For example, if eggs and milk were purchased together 3 times in 5 transactions in the same store, and the data is partitioned by store, then the support value in the partition is 3/5 = 0.6 = 60%.

confidence DOUBLE PRECISION Percentage of the transactions in the partition that contain item1 and also contain item2, calculated with this formula:

cntb/cnt1

For example, if, in the same store, the number of times that a customer buys both milk (item1) and butter (item2) is 3 (cntb) and the number of times that a customer buys milk is 4 (cnt1), then the confidence that a person who buys milk will also buy butter is 3/4 = 0.75 = 75%.

lift DOUBLE PRECISION Ratio of observed support value to expected support value if item1 and item2 were independent; that is:

support(item1 and item2) / [support(item1) * support(item2)]

Value is calculated with this formula:

(cntb/tran_cnt) / [(cnt1/tran_cnt) * (cnt2/tran_cnt)]

If lift > 1, the occurrence of one item has a positive effect on the occurrence of the other item.

If lift = 1, the occurrence of one item has no effect on the occurrence of the other item.

If lift < 1, the occurrence of one item has a negative effect on the occurrence of the other item.

z_score DOUBLE PRECISION Significance of co-occurrence, assuming that cntb follows a normal distribution, calculated with this formula:

(cntb - mean(cntb)) / sd(cntb)

If all cntb values are equal, then sd(cntb) is 0, and the function does not calculate z_score.
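
The formulas in the OutputTable schema can be checked by hand with a short script. The following Python is a minimal sketch, not the function's implementation: the helper names cfilter_metrics and z_scores are hypothetical, the values cnt2 = 3 and tran_cnt = 5 are invented to complete the milk and butter example, and the use of the population standard deviation for sd(cntb) is an assumption, because this page does not state which variant the function uses.

```python
from statistics import mean, pstdev


def cfilter_metrics(cntb, cnt1, cnt2, tran_cnt):
    """Compute score, support, confidence, and lift for one item pair.

    Arguments mirror the output columns; tran_cnt is the number of
    transactions in the partition.
    """
    score = (cntb * cntb) / (cnt1 * cnt2)
    support = cntb / tran_cnt
    confidence = cntb / cnt1
    lift = support / ((cnt1 / tran_cnt) * (cnt2 / tran_cnt))
    return {
        "score": score,
        "support": support,
        "confidence": confidence,
        "lift": lift,
    }


def z_scores(cntb_values):
    """Return (cntb - mean(cntb)) / sd(cntb) for each value.

    Assumption: population standard deviation (pstdev). When all values
    are equal, sd is 0 and no z-score is computed, so None is returned.
    """
    sd = pstdev(cntb_values)
    if sd == 0:
        return [None] * len(cntb_values)
    m = mean(cntb_values)
    return [(c - m) / sd for c in cntb_values]


# Milk (item1) and butter (item2): cntb = 3 and cnt1 = 4 come from the
# confidence example; cnt2 = 3 and tran_cnt = 5 are hypothetical.
pair = cfilter_metrics(cntb=3, cnt1=4, cnt2=3, tran_cnt=5)
for name, value in pair.items():
    print(f"{name}: {value:.4f}")

print(z_scores([1, 2, 3]))  # distinct counts: z-scores are defined
print(z_scores([2, 2, 2]))  # equal counts: sd is 0, so no z-scores
```

With these inputs the confidence is 0.75, matching the milk and butter example, and the lift is above 1, which indicates a positive association between the two items.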