Arguments - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product
Aster Analytics
Release Number
6.21
Published
November 2016
Language
English (United States)
Last Update
2018-04-14
dita:id
B700-1021
lifecycle
previous
Product Category
Software
Argument Category Definition
InputTable Required Specifies the name of the table that contains the input sequences. Each row is one item in a sequence. If input_table does not include a schema, the function searches for it in the user’s search path. The function ignores rows that contain any NULL values.
OutputTable Required Specifies the name of the table where the function outputs the subsequences.
PartitionColumns Required Specifies the names of the columns that comprise the partition key of the input sequences.
TimeColumn Optional* Specifies the name of the input table column that determines the order of items in a sequence. Items in the same sequence that have the same time stamp belong to the same set.

*Required when ItemColumn or ItemDefinition is specified.

PathFilters Optional Specifies the filters to use on the input table sequences. Only input table sequences that satisfy all constraints of at least one filter are input to the function.

Each filter has one or more constraints, which are separated by spaces. Each constraint has this syntax:

constraint (item [symbol ...])

Here, symbol is the separator between items. By default, symbol is comma (,). If you specify a different symbol, it applies to all filters. The constraint is one of the following:

  • STW (start-with_constraint)

    The first item set of the sequence must contain at least one of the specified items. For example, STW(c,d) requires the first item set of the sequence to contain c or d. Sequence “(a, c), e, (f, d)” meets this constraint because the first item set, (a,c), contains c.

  • EDW (end-with_constraint)

    The last item set of the sequence must contain at least one of the specified items. For example, EDW(f,g) requires the last item set of the sequence to contain f or g. Sequence “(a, b), e, (f, d)” meets this constraint because the last item set, (f,d), contains f.

  • CTN (containing_constraint)

    The sequence must contain at least one of the specified items. For example, CTN(a,b) requires the sequence to contain a or b. The sequence “(a,c), d, (e,f)” meets this constraint but the sequence “d, (e,f)” does not.

Constraints in the same filter must be different. For example:

  • Valid: 'STW(c,d) EDW(g,k) CTN(e)'
  • Invalid: 'STW(c,d) STW(e,h)'

This argument specifies a separator and uses it in two filters:

PathFilters('Separator(#)', 'STW(c#d) EDW(g#k) CTN(e)', 'CTN(h#k)')
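
The filter semantics described above can be sketched in Python. This is an illustration only, not the function's implementation: the helper names (parse_filter, satisfies, passes_path_filters) are hypothetical, a sequence is modeled as a list of item sets, and the separator is passed as a plain argument instead of being read from a 'Separator(#)' entry.

```python
import re


def parse_filter(filter_text, separator=","):
    """Parse one filter, e.g. 'STW(c,d) EDW(g,k) CTN(e)', into {constraint: items}."""
    constraints = {}
    for name, body in re.findall(r"(STW|EDW|CTN)\(([^)]*)\)", filter_text):
        # Constraints in the same filter must be different.
        if name in constraints:
            raise ValueError(f"duplicate constraint {name} in filter {filter_text!r}")
        constraints[name] = {item.strip() for item in body.split(separator)}
    return constraints


def satisfies(sequence, constraints):
    """Return True if the sequence (a list of item sets) meets every constraint."""
    if not sequence:
        return False
    checks = {
        # First item set contains at least one of the listed items.
        "STW": lambda items: bool(sequence[0] & items),
        # Last item set contains at least one of the listed items.
        "EDW": lambda items: bool(sequence[-1] & items),
        # Any item set contains at least one of the listed items.
        "CTN": lambda items: any(item_set & items for item_set in sequence),
    }
    return all(checks[name](items) for name, items in constraints.items())


def passes_path_filters(sequence, filters, separator=","):
    """A sequence is kept if it satisfies all constraints of at least one filter."""
    return any(satisfies(sequence, parse_filter(f, separator)) for f in filters)
```

For instance, passes_path_filters([{"a", "c"}, {"e"}, {"f", "d"}], ["STW(c,d)"]) evaluates to True, matching the STW example above.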

GroupByColumns Optional Specifies the names of the input table columns by which to group the input table sequences. If you specify this argument, then the function operates on each group separately and copies each group_by_column to the output table.
SeqPatternTable Optional Specifies the name of the table where the function outputs sequence-pattern pairs. For example, if a sequence has a partition value of "1" and contains 3 patterns with IDs 2, 9, and 10, then for that sequence the function outputs the sequence-pattern pairs ("1", 2), ("1", 9), and ("1", 10).

If sequence_pattern_table does not include a schema, the function creates it in the first schema in the user’s search path.

If the function finds no sequence-pattern pairs, then it does not create sequence_pattern_table.

ItemColumn Optional* Specifies the names of the input table columns that contain the items.

*Required if you specify neither ItemDefinition nor PathColumn.

ItemDefinition Optional* Specifies the name of the item definition table and the names of its index, definition, and item columns. If item_definition_table does not include a schema, the function searches for it in the user’s search path.

*Required if you specify neither ItemColumn nor PathColumn.

PathColumn Optional* Specifies the name of the input table column that contains paths in the form of sequence strings. A sequence string has this syntax:
'[item [, ...]]'

In the sequence string syntax, you must type the outer brackets literally; the inner brackets mark optional, repeatable items. The sequence strings in this column can be generated by the nPath function.

If you specify this argument, then each item set can have only one item.

*Required if you specify neither ItemColumn nor ItemDefinition.
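
As a rough illustration of the sequence string syntax, the following Python sketch splits a string such as '[a, b, c]' into its items. The function name parse_sequence_string is hypothetical, and the sketch assumes that item names contain no commas or brackets.

```python
def parse_sequence_string(text):
    """Turn '[a, b, c]' into ['a', 'b', 'c']; each item forms its own item set."""
    text = text.strip()
    # The outer brackets are part of the syntax and must be present.
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError(f"sequence string must be enclosed in brackets: {text!r}")
    inner = text[1:-1].strip()
    if not inner:
        return []
    return [item.strip() for item in inner.split(",")]
```

Because each item set can have only one item when PathColumn is specified, the result is a flat list with no nested sets.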

MinSupport Required Determines the threshold for whether a sequential pattern is frequent. The minimum must be a positive real number.

If minimum is in the range (0,1], then it is a relative threshold: If N is the total number of input sequences, then the threshold is T=N*minimum. For example, if there are 1000 sequences in the input table and minimum is 0.05, then the threshold is 50.

If minimum is in the range (1,+∞), then it is an absolute threshold: Regardless of N, T=minimum. For example, if minimum is 50, then the threshold is 50, regardless of N.

A pattern is frequent if its support value is at least T.

Because the function outputs only frequent patterns, minimum controls the number of output patterns. If minimum is small, processing time increases exponentially; therefore, Teradata recommends starting with a larger value: for example, 5% of the total number of sequences if you know N, and 0.05 otherwise.

If you specify a relative minimum and GroupByColumns, then the function calculates N and T for each group.

If you specify a relative minimum and PathFilters, then N is the number of sequences that meet the constraints of the filters.
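
The threshold rules above can be summarized in a short Python sketch. The function names (support_threshold, is_frequent, thresholds_by_group) are hypothetical and only restate the arithmetic; they are not part of the product.

```python
def support_threshold(min_support, n_sequences):
    """Return the threshold T for a MinSupport value and N input sequences."""
    if min_support <= 0:
        raise ValueError("MinSupport must be a positive real number")
    if min_support <= 1:
        # Range (0,1]: relative threshold, T = N * minimum.
        return n_sequences * min_support
    # Range (1,+inf): absolute threshold, T = minimum regardless of N.
    return min_support


def is_frequent(support, threshold):
    """A pattern is frequent if its support value is at least T."""
    return support >= threshold


def thresholds_by_group(min_support, group_sizes):
    """With GroupByColumns, N and T are calculated separately for each group."""
    return {group: support_threshold(min_support, n) for group, n in group_sizes.items()}
```

With PathFilters, n_sequences would be the number of sequences that pass the filters, not the number of sequences in the input table.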

MaxLength Optional Specifies the maximum length of the output sequential patterns. The length of a pattern is its number of sets. By default, there is no maximum length.
MinLength Optional Specifies the minimum length of the output sequential patterns. The default value is 1.
ClosedPattern Optional Specifies whether to output only closed patterns. The default value is 'false'.
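
The following Python sketch illustrates how MinLength and MaxLength apply to a pattern, given that length counts item sets, not individual items. The function name within_length is hypothetical, and a pattern is modeled as a list of item sets.

```python
def within_length(pattern, min_length=1, max_length=None):
    """Check a pattern (a list of item sets) against MinLength and MaxLength."""
    length = len(pattern)  # number of item sets, not number of items
    if length < min_length:
        return False
    # max_length=None models the default of no maximum length.
    return max_length is None or length <= max_length
```

For example, the pattern (a, c), e has length 2 even though it contains three items, so it passes MaxLength(2) but not MaxLength(1).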