Teradata® Package for Python Function Reference

Product: Teradata Package for Python
Release Number: 17.00
Published: November 2021
Language: English (United States)
Last Update: 2021-11-19
Product Category: Teradata Vantage

teradataml.analytics.mle.PathAnalyzer = class PathAnalyzer(builtins.object)

Methods defined here:

__init__(self, data=None, seq_column=None, count_column=None, hash=False, delimiter=',', data_sequence_column=None)
    DESCRIPTION:
        This function generates the parent and children of a particular
        node and calculates the node's depth and number of visits.
        The PathAnalyzer function:
            - Inputs a set of paths to the PathGenerator function.
            - Inputs the PathGenerator output to the PathSummarizer function.
            - Inputs the PathSummarizer output to the PathStart function,
              which outputs, for each parent, all children and the number
              of times that the user traveled each child.

    PARAMETERS:
        data:
            Required Argument.
            Specifies either the name of the input teradataml DataFrame or
            processed NPath output. The input teradataml DataFrame contains
            the paths to analyze. Each path is a string of alphanumeric
            symbols that represents an ordered sequence of page views (or
            actions). Typically, each symbol is a code that represents a
            unique page view.
            To use the output of NPath, first process it to select two
            columns: the column that contains the paths (seq_column) and
            the column that contains the number of times a path was
            traveled (count_column). Group the result by seq_column, so
            that the input teradataml DataFrame has one row for each unique
            path traveled on a web site.

        seq_column:
            Required Argument.
            Specifies the name of the input teradataml DataFrame column
            that contains the paths.
            Types: str

        count_column:
            Optional Argument.
            Specifies the name of the input teradataml DataFrame column
            that contains the number of times a path was traveled.
            Note:
                'count_column' is required when teradataml is connected to
                a Vantage version prior to 1.1.1.
            Default Value: 1
            Types: str

        hash:
            Optional Argument.
            Specifies whether to include the hash code of the output
            column node.
            Default Value: False
            Types: bool

        delimiter:
            Optional Argument.
            Specifies the single-character delimiter that separates symbols
            in the path string.
            Note:
                Do not use any of the following characters as the delimiter
                (they cause the function to fail):
                    - Asterisk (*)
                    - Plus (+)
                    - Left parenthesis (()
                    - Right parenthesis ())
                    - Single quotation mark (')
                    - Escaped single quotation mark (\')
                    - Backslash (\)
            Default Value: ","
            Types: str

        data_sequence_column:
            Optional Argument.
            Specifies the list of column(s) that uniquely identifies each
            row of the input argument "data". The argument is used to
            ensure deterministic results for functions that produce
            results that vary from run to run.
            Types: str OR list of Strings (str)

    RETURNS:
        Instance of PathAnalyzer.
        Output teradataml DataFrames can be accessed using attribute
        references, such as PathAnalyzerObj.<attribute_name>.
        Output teradataml DataFrame attribute name is:
            output_table

    RAISES:
        TeradataMlException

    EXAMPLES:
        # Imports used by the example. The PathAnalyzer import path follows
        # the class location given above. A connection to Vantage must
        # already be established before running the example.
        from teradataml import DataFrame, load_example_data
        from teradataml.analytics.mle import PathAnalyzer

        # Load example data.
        load_example_data("pathanalyzer", "clickstream1")

        # Create teradataml DataFrame objects.
        # The table contains clickstream data, where the "path" column
        # contains symbols for the pages that the customer clicked.
        clickstream1 = DataFrame.from_table("clickstream1")

        # Example 1 - Analyze the paths taken, by parent and children,
        # to reach a page in this clickstream data.
        PathAnalyzer_out = PathAnalyzer(data = clickstream1,
                                        seq_column = "path",
                                        count_column = "cnt",
                                        hash = False,
                                        delimiter = ","
                                        )

        # Print the results.
        print(PathAnalyzer_out)
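The input rules and output concepts described above (one row per unique path, a single-character delimiter, and per-node children, depth, and visit counts) can be illustrated without a Vantage connection. The following is a minimal plain-Python sketch, not the PathAnalyzer implementation and not its output layout. The helper names check_delimiter, count_paths, and summarize are hypothetical, and treating each path prefix as a node whose depth is the number of symbols in the prefix is an assumption made for illustration only.

```python
from collections import Counter

# Characters the reference above says must not be used as the delimiter.
FORBIDDEN_DELIMITERS = set("*+()'\\")


def check_delimiter(delimiter):
    """Return the delimiter if it is a single allowed character, else raise."""
    if len(delimiter) != 1:
        raise ValueError("delimiter must be a single character")
    if delimiter in FORBIDDEN_DELIMITERS:
        raise ValueError(f"delimiter {delimiter!r} is not allowed")
    return delimiter


def count_paths(raw_paths):
    """Collapse raw paths into one (path, count) entry per unique path."""
    return Counter(raw_paths)


def summarize(path_counts, delimiter=","):
    """Illustrative summary (assumption: every path prefix is a node).

    For each prefix, record its depth, its total visits, and the visits
    to each direct child prefix.
    """
    check_delimiter(delimiter)
    nodes = {}
    for path, count in path_counts.items():
        symbols = path.split(delimiter)
        for depth in range(1, len(symbols) + 1):
            prefix = delimiter.join(symbols[:depth])
            node = nodes.setdefault(
                prefix, {"depth": depth, "visits": 0, "children": Counter()}
            )
            node["visits"] += count
            if depth < len(symbols):
                child = delimiter.join(symbols[: depth + 1])
                node["children"][child] += count
    return nodes


# Made-up sample paths, used only to exercise the sketch.
raw = ["A,B,C", "A,B,C", "A,B,D", "A,E"]
path_counts = count_paths(raw)
nodes = summarize(path_counts)
print(nodes["A"]["visits"], dict(nodes["A"]["children"]))
```

In this sketch, count_paths plays the role of grouping by seq_column to obtain count_column, and the delimiter check mirrors the restriction listed for the delimiter argument. The actual function performs its analysis in Vantage and returns its result in output_table.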

__repr__(self): Returns the string representation for a PathAnalyzer class instance.

get_build_time(self): Function to return the build time of the algorithm in seconds. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_prediction_type(self): Function to return the prediction type of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

get_target_column(self): Function to return the target column of the algorithm. When the model object is created using retrieve_model(), the value returned is the one saved in the Model Catalog.

show_query(self): Function to return the underlying SQL query. When the model object is created using retrieve_model(), None is returned.