Methods defined here:
- __init__(self, data=None, input_columns=None, join_columns=None, add_columns=None, partition_key='col1_item1', max_itemset=100, data_sequence_column=None)
DESCRIPTION:
The CFilter function is a general-purpose collaborative filter.
PARAMETERS:
data:
Required Argument.
Specifies the teradataml DataFrame that contains the data to filter.
input_columns:
Required Argument.
Specifies the names of the input teradataml DataFrame columns that
contain the data to filter.
Types: str OR list of Strings (str)
join_columns:
Required Argument.
Specifies the names of the input teradataml DataFrame columns to join.
Types: str OR list of Strings (str)
add_columns:
Optional Argument.
Specifies the names of the input columns to copy to the output table.
The function partitions the input data and the output teradataml
DataFrame on these columns. By default, the function treats the input
data as belonging to one partition.
Note: Specifying a column as both an add_column and a join_column causes
incorrect counts in partitions.
Types: str OR list of Strings (str)
partition_key:
Optional Argument.
Specifies the name of the output column to use as the partition key.
Default Value: "col1_item1"
Types: str
max_itemset:
Optional Argument.
Specifies the maximum size of the item set.
Default Value: 100
Types: int
data_sequence_column:
Optional Argument.
Specifies the column(s) that uniquely identify each row of the input
argument "data". The argument is used to ensure deterministic results
for functions that produce results that vary from run to run.
Types: str OR list of Strings (str)
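The note under add_columns warns that a column listed in both add_columns and join_columns causes incorrect partition counts. The sketch below shows one way to check for such an overlap before calling CFilter. Both helpers (as_list and overlapping_columns) are hypothetical names introduced here for illustration; they are not part of teradataml, and the sketch only assumes the "str OR list of str" argument convention described above.

```python
def as_list(value):
    """Normalize a 'str OR list of str' argument to a list (None -> [])."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def overlapping_columns(join_columns, add_columns=None):
    """Return the add_columns entries that also appear in join_columns."""
    joins = set(as_list(join_columns))
    return [col for col in as_list(add_columns) if col in joins]


# "orderid" appears in both arguments, so it is reported as a clash.
clash = overlapping_columns(["orderid"], ["region", "orderid"])
if clash:
    print("Columns used as both join and add columns:", clash)
```

An empty result means the two arguments are disjoint and the warning in the note does not apply.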
RETURNS:
Instance of CFilter.
Output teradataml DataFrames can be accessed using attribute
references, such as CFilterObj.<attribute_name>.
Output teradataml DataFrame attribute names are:
1. output_table
2. output
RAISES:
TeradataMlException
EXAMPLES:
# Load example data.
load_example_data("cfilter", "sales_transaction")
# The provided example table is sales_transaction.
# This input table contains data from an office supply chain store. The columns are:
# orderid: order (transaction) identifier
# orderdate: order date
# orderqty: quantity of product ordered
# region: geographic region of store where order was placed
# customer_segment: segment of customer who ordered product
# prd_category: category of product ordered
# product: product ordered
# Create teradataml DataFrame objects.
sales_transaction = DataFrame.from_table("sales_transaction")
# Example 1 - Collaborative Filtering by Product.
CFilter_out1 = CFilter(data=sales_transaction,
                       input_columns=["product"],
                       join_columns=["orderid"],
                       add_columns=["region"])
# Print the output data
print(CFilter_out1)
# Example 2 - Collaborative Filtering by Customer Segment.
CFilter_out2 = CFilter(data=sales_transaction,
                       input_columns=["customer_segment"],
                       join_columns=["product"])
# Print the output data
print(CFilter_out2)
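To give a feel for what a pairwise collaborative filter computes, here is a minimal pure-Python sketch that runs without a database connection. It groups items by a join value (as Example 1 groups "product" by "orderid") and counts how often two items share a group. The function name cooccurrence_pairs, the toy rows, and the support, confidence and lift formulas (standard association-rule definitions) are assumptions made for illustration; the source does not state which statistics CFilter computes or how, so this should not be read as its implementation.

```python
from collections import Counter, defaultdict
from itertools import permutations


def cooccurrence_pairs(rows):
    """rows: iterable of (join_value, item) tuples, e.g. (orderid, product)."""
    # Collect the distinct items that share each join value.
    baskets = defaultdict(set)
    for join_value, item in rows:
        baskets[join_value].add(item)

    item_counts = Counter()  # number of baskets containing each item
    pair_counts = Counter()  # number of baskets containing both items
    for items in baskets.values():
        item_counts.update(items)
        pair_counts.update(permutations(sorted(items), 2))

    total = len(baskets)
    stats = {}
    for (item1, item2), both in pair_counts.items():
        confidence = both / item_counts[item1]
        stats[(item1, item2)] = {
            "both": both,
            "support": both / total,
            "confidence": confidence,
            "lift": confidence / (item_counts[item2] / total),
        }
    return stats


# Toy data: (orderid, product) pairs for four orders.
rows = [
    (1, "pen"), (1, "paper"),
    (2, "pen"), (2, "paper"), (2, "stapler"),
    (3, "pen"),
    (4, "stapler"),
]
stats = cooccurrence_pairs(rows)
for pair, values in sorted(stats.items()):
    print(pair, values)
```

Each ordered pair gets its own entry because confidence depends on which item comes first: every order that contains "paper" also contains "pen", but not the other way round.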
- __repr__(self)
- Returns the string representation for a CFilter class instance.