- InputTable
- Specifies the name of the table that contains the data to filter.
- OutputTable
- Specifies the name of the output table that the function creates. The table must not exist, unless DropTable is 'true'.
- InputColumns
- Specifies the names of the input table columns that contain the data to filter.
- JoinColumns
- Specifies the names of join columns, which the function uses as follows:
- The function uses the items in each join column to define groups of items listed in the input columns.
- The function tries to identify items in each input column that often appear in the same group.
For example, a join column might contain a list of sales transactions from a store, and the input column might contain each individual item purchased at the store. A sales transaction can include multiple items. The function tries to identify items that often appear in the same sales transaction (that is, items that are often purchased together).
- AddColumns
- [Optional] Specifies the names of the input columns to copy to the output table. The function partitions the input data and the output table on these columns. Default behavior: The function treats the input data as belonging to one partition. Specifying a column in both AddColumns and JoinColumns causes incorrect counts in partitions.
- PartitionKey
- [Optional] Specifies the name of the output column to use as the partition key. Default: 'col1_item1'.
- MaxItemSet
- [Optional] Specifies the maximum size of the item set. Default: 100. The function uses MaxItemSet to determine the size of the data structures in which it accumulates intermediate results. If the number of distinct items in an input column is greater than MaxItemSet, the function might report incorrect results without an error message.
- DropTable
- [Optional] Specifies whether to drop the output table if it exists. Default: 'false'.
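
The grouping described under JoinColumns can be illustrated with a small Python sketch. This is a conceptual illustration only, not the function's implementation: the helper name `count_pairs` and the sample rows are hypothetical, and the sketch counts only how many groups contain each pair of items.

```python
from collections import defaultdict
from itertools import combinations


def count_pairs(rows):
    """Count how many groups contain each pair of items.

    rows: iterable of (join_value, item) tuples, where join_value plays the
    role of a join column (for example, a transaction ID) and item plays the
    role of an input column (for example, a purchased product).
    """
    # Step 1: use the join value to define groups of items.
    groups = defaultdict(set)
    for join_value, item in rows:
        groups[join_value].add(item)

    # Step 2: count the pairs of items that appear in the same group.
    pair_counts = defaultdict(int)
    for items in groups.values():
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1
    return dict(pair_counts)


# Hypothetical sample data: three sales transactions.
rows = [
    (1, "bread"), (1, "milk"), (1, "eggs"),
    (2, "bread"), (2, "milk"),
    (3, "milk"), (3, "eggs"),
]
pair_counts = count_pairs(rows)
```

In this sample, pairs with higher counts correspond to items that are often purchased together. Sorting the items before pairing gives each pair a single canonical key, so ("bread", "milk") and ("milk", "bread") are counted as the same pair.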