Background - Aster Analytics

Teradata AsterĀ® Analytics Foundation User GuideUpdate 2

Product
Aster Analytics
Release Number
7.00.02
Published
September 2017
Language
English (United States)
Last Update
2018-04-17
dita:mapPath
uce1497542673292.ditamap
dita:ditavalPath
AA-notempfilter_pdf_output.ditaval
dita:id
B700-1022
lifecycle
previous
Product Category
Software

The purpose of association rule mining is to identify strong rules in databases, using different measures of interestingness, and then discover regularities between products in large-scale transaction data recorded by point-of-sale (POS) systems in supermarkets. For example, the association rule {onions, potatoes}=>{hamburger} indicates that if a customer buys onions and potatoes together, he or she is likely to buy hamburger. Unlike sequence mining, association rule mining typically does not consider the order of items, either within a transaction or across transactions.

Association rules are used in making marketing decisions, such as when to offer promotional pricing or where to place products. They are also useful in many other areas, such as Web usage mining, intrusion detection, continuous production, and bioinformatics. However, association rule mining in large databases is computationally expensive, especially when there is a large number of patterns.

An efficient, scalable method for mining the complete set of frequent patterns from a data set is the frequent-pattern growth (FP-growth) algorithm. The FP-growth algorithm stores crucial information about frequent pattern growth in a frequent-pattern tree (FP-tree), an extended prefix-tree structure of compressed information, and then uses a divide-and-conquer strategy. Because the FP-tree retains the item set association information, the FP-growth algorithm does not do candidate generations, which are computationally expensive.

The FPGrowth function does the following:

  1. Divides a large data set into independent partitions.
  2. Compresses the data in each partition, creating an FP-tree to represent association information between items.
  3. Divides the FP-tree into conditional databases, each associated with one frequent pattern.
  4. Mines each conditional database for patterns.
  5. Generates association rules from the patterns.

    To generate an association rule from a pattern, the function takes one or more items in the pattern as the consequence and the remaining items as the antecedent. In the rule {onions, potatoes}=>{hamburger}, {onions, potatoes} is the antecedent and {hamburger} is the consequence.

  6. Determines the interestingness of the association rules.
The FPGrowth function automatically truncates long transactions by removing low-frequency items, guaranteeing that a single transaction generates at most 1 million patterns. Automatic truncation depends only on the value of the MaxPatternLength argument.