Teradata Package for Python Function Reference - CFilter - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.

Teradata® Package for Python Function Reference

Product

Teradata Package for Python

Release Number

17.00

Published

November 2021

Language

English (United States)

Last Update

2021-11-19

Product Category

Teradata Vantage

teradataml.analytics.mle.CFilter = class CFilter(builtins.object)

teradataml.analytics.mle.CFilter(data=None, input_columns=None, join_columns=None, add_columns=None, partition_key='col1_item1', max_itemset=100, data_sequence_column=None, null_handling=True, use_basketgenerator=True)
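The column arguments in this signature accept either a single string or a list of strings, and the parameter notes for this constructor warn that using the same column in both add_columns and join_columns causes incorrect counts. A minimal pre-flight check along those lines might look like the sketch below. The helper names as_column_list and check_cfilter_columns are hypothetical and are not part of teradataml; the sketch only inspects the argument values and does not contact Vantage.

```python
from typing import List, Optional, Sequence, Union

# "str OR list of Strings (str)", as documented for the column arguments.
ColumnArg = Optional[Union[str, Sequence[str]]]


def as_column_list(value: ColumnArg) -> List[str]:
    """Normalize a str-or-list column argument into a list of names."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    return list(value)


def check_cfilter_columns(input_columns: ColumnArg,
                          join_columns: ColumnArg,
                          add_columns: ColumnArg = None) -> None:
    """Hypothetical check (not a teradataml API) run before calling CFilter."""
    inputs = as_column_list(input_columns)
    joins = as_column_list(join_columns)
    adds = as_column_list(add_columns)

    # input_columns and join_columns are documented as required arguments.
    if not inputs:
        raise ValueError("input_columns is required")
    if not joins:
        raise ValueError("join_columns is required")

    # A column used as both an add column and a join column is documented
    # to cause incorrect counts in partitions, so reject it up front.
    overlap = sorted(set(adds) & set(joins))
    if overlap:
        raise ValueError(
            f"columns used in both add_columns and join_columns: {overlap}"
        )


# Passes silently: mirrors the column choices of Example 1 in the docstring.
check_cfilter_columns("product", "orderid", ["region"])
```

Raising early keeps the mistake visible on the client side instead of surfacing later as silently wrong partition counts.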

Methods defined here:

__init__(self, data=None, input_columns=None, join_columns=None, add_columns=None, partition_key='col1_item1', max_itemset=100, data_sequence_column=None, null_handling=True, use_basketgenerator=True)

    DESCRIPTION:
        The CFilter function is a general-purpose collaborative filter.

    PARAMETERS:
        data:
            Required Argument.
            Specifies the teradataml DataFrame that contains the data to filter.

        input_columns:
            Required Argument.
            Specifies the names of the input teradataml DataFrame columns
            that contain the data to filter.
            Types: str OR list of Strings (str)

        join_columns:
            Required Argument.
            Specifies the names of the input teradataml DataFrame columns
            to join.
            Types: str OR list of Strings (str)

        add_columns:
            Optional Argument.
            Specifies the names of the input columns to copy to the output
            table. The function partitions the input data and the output
            teradataml DataFrame on these columns. By default, the function
            treats the input data as belonging to one partition.
            Note: Specifying a column as both an add_column and a
                  join_column causes incorrect counts in partitions.
            Types: str OR list of Strings (str)

        partition_key:
            Optional Argument.
            Specifies the name of the output column to use as the
            partition key.
            Default Value: "col1_item1"
            Types: str

        max_itemset:
            Optional Argument.
            Specifies the maximum size of the item set.
            Default Value: 100
            Types: int

        null_handling:
            Optional Argument.
            Specifies whether to handle null values in the input. If the
            input data contains null values, set this argument to True.
            Note: "null_handling" is only available when teradataml is
                  connected to Vantage 1.3.
            Default Value: True
            Types: bool

        use_basketgenerator:
            Optional Argument.
            Specifies whether to use the BasketGenerator function to
            generate baskets.
            Note: "use_basketgenerator" is only available when teradataml
                  is connected to Vantage 1.3.
            Default Value: True
            Types: bool

        data_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each
            row of the input argument "data". The argument is used to
            ensure deterministic results for functions which produce
            results that vary from run to run.
            Types: str OR list of Strings (str)

    RETURNS:
        Instance of CFilter.
        Output teradataml DataFrames can be accessed using attribute
        references, such as CFilterObj.<attribute_name>.
        Output teradataml DataFrame attribute names are:
            1. output_table
            2. output

    RAISES:
        TeradataMlException

    EXAMPLES:
        # Imports used by the examples. A connection to Vantage is assumed
        # to exist already (for example, one created with create_context()).
        from teradataml import DataFrame, load_example_data
        from teradataml.analytics.mle import CFilter

        # Load example data.
        load_example_data("cfilter", "sales_transaction")

        # Provided example table is: sales_transaction
        # This input table contains data of an office supply chain store.
        # The columns are:
        #   orderid: order (transaction) identifier
        #   orderdate: order date
        #   orderqty: quantity of product ordered
        #   region: geographic region of store where order was placed
        #   customer_segment: segment of customer who ordered product
        #   prd_category: category of product ordered
        #   product: product ordered

        # Create teradataml DataFrame objects.
        sales_transaction = DataFrame.from_table("sales_transaction")

        # Example 1 - Collaborative Filtering by Product.
        CFilter_out1 = CFilter(data = sales_transaction,
                               input_columns = ["product"],
                               join_columns = ["orderid"],
                               add_columns = ["region"]
                               )

        # Print the output data
        print(CFilter_out1)

        # Example 2 - Collaborative Filtering by Customer Segment.
        CFilter_out2 = CFilter(data = sales_transaction,
                               input_columns = ["customer_segment"],
                               join_columns = ["product"]
                               )

        # Print the output data
        print(CFilter_out2)
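To give a feel for what item-to-item collaborative filtering computes, the following pure-Python sketch groups rows into baskets by a join key and counts how often pairs of items appear together. It is an illustration of the general technique only: the function name cooccurrence_stats, the sample rows, and the choice of support, confidence, and lift as measures are assumptions made here, and the sketch does not claim to reproduce the output columns or scoring of CFilter. The column names orderid and product are borrowed from the sales_transaction example, with product playing the role of input_columns and orderid the role of join_columns.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_stats(rows, item_key, basket_key):
    """Return {(item_a, item_b): stats} for item pairs sharing a basket.

    Hypothetical helper for illustration; not part of teradataml.
    """
    # Group items into baskets keyed by the join column (e.g. orderid).
    baskets = defaultdict(set)
    for row in rows:
        baskets[row[basket_key]].add(row[item_key])

    n_baskets = len(baskets)
    item_counts = Counter()   # number of baskets containing each item
    pair_counts = Counter()   # number of baskets containing both items
    for items in baskets.values():
        item_counts.update(items)
        pair_counts.update(combinations(sorted(items), 2))

    stats = {}
    for (a, b), both in pair_counts.items():
        support = both / n_baskets                 # P(a and b)
        confidence = both / item_counts[a]         # P(b | a)
        lift = confidence / (item_counts[b] / n_baskets)
        stats[(a, b)] = {
            "both": both,
            "support": support,
            "confidence": confidence,
            "lift": lift,
        }
    return stats


# Made-up sample rows shaped like the sales_transaction example table.
rows = [
    {"orderid": 1, "product": "pens"},
    {"orderid": 1, "product": "paper"},
    {"orderid": 2, "product": "pens"},
    {"orderid": 2, "product": "paper"},
    {"orderid": 2, "product": "binders"},
    {"orderid": 3, "product": "pens"},
    {"orderid": 4, "product": "binders"},
]

stats = cooccurrence_stats(rows, item_key="product", basket_key="orderid")
for pair, values in sorted(stats.items()):
    print(pair, values)
```

In this toy data, paper and pens share two of the four orders, so the pair has a support of 0.5. In practice the grouping and counting run inside the database rather than in client-side Python.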

__repr__(self): Returns the string representation for a CFilter class instance.

get_build_time(self): Returns the build time of the algorithm, in seconds. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_prediction_type(self): Returns the prediction type of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_target_column(self): Returns the target column of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

show_query(self): Returns the underlying SQL query. When the model object is created using retrieve_model(), None is returned.