Teradata R Package Function Reference - 16.20 - PathAnalyzer - Teradata R Package

February 2020
Programming Reference


The PathAnalyzer function (td_path_analyzer_mle):

  1. Inputs a set of paths to the PathGenerator function (td_path_generator_mle).

  2. Inputs the PathGenerator output to the PathSummarizer function (td_path_summarizer_mle).

  3. Inputs the PathSummarizer output to the PathStart function (td_path_start_mle), which outputs, for each parent, all of its children and the number of times that users traveled each child.
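The three steps above can be illustrated locally with a small base R sketch. This is not the package's implementation and does not call any Teradata function; the helper names (expand_prefixes, children_of), the "(root)" label, and the toy data are assumptions made purely for illustration. It expands each path into its prefixes, sums the travel counts per parent/child pair, and then lists the children of a chosen parent.

```r
# Toy input (hypothetical): each path is a delimited string with a travel count.
paths <- data.frame(
  path = c("A,B,C", "A,B,D", "A,C"),
  cnt  = c(2L, 1L, 3L),
  stringsAsFactors = FALSE
)

# Step 1 (prefix generation): expand one non-empty path into parent/child rows.
# The parent of the first symbol is labeled "(root)" (an illustrative choice).
expand_prefixes <- function(path, cnt, delimiter = ",") {
  symbols <- strsplit(path, delimiter, fixed = TRUE)[[1]]
  prefixes <- vapply(
    seq_along(symbols),
    function(i) paste(symbols[seq_len(i)], collapse = delimiter),
    character(1)
  )
  parents <- c("(root)", prefixes[-length(prefixes)])
  data.frame(parent = parents, child = prefixes, cnt = cnt,
             stringsAsFactors = FALSE)
}

expanded <- do.call(rbind, unname(Map(expand_prefixes, paths$path, paths$cnt)))

# Step 2 (summarize): total travel count for every parent/child pair.
summary_tbl <- aggregate(cnt ~ parent + child, data = expanded, FUN = sum)

# Step 3 (children of a parent): list each child and how often it was traveled.
children_of <- function(summary_tbl, parent) {
  out <- summary_tbl[summary_tbl$parent == parent, c("child", "cnt")]
  out[order(out$child), ]
}

print(children_of(summary_tbl, "A"))
```

In this toy data, the parent "A" has the children "A,B" and "A,C"; the sketch only mirrors the parent/child counting idea, not the exact output columns of the Teradata functions.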


  td_path_analyzer_mle (
      data = NULL,
      seq.column = NULL,
      count.column = NULL,
      hash = FALSE,
      delimiter = ",",
      data.sequence.column = NULL
  )



data
Required Argument.
Specifies either the input tbl_teradata that contains the paths to analyze, or the output of the td_npath_sqle function.
Each path is a string of alphanumeric symbols that represents an ordered sequence of page views (or actions). Typically each symbol is a code that represents a unique page view.


seq.column
Required Argument.
Specifies the name of the input tbl_teradata column that contains the paths.
Types: character


count.column
Optional Argument.
Specifies the name of the input tbl_teradata column that contains the number of times a path was traveled.
Note: If this argument is not specified, the function auto-generates a column named "cnt" that holds the number of times each unique path occurs in the "seq.column" column.
Types: character
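As a rough local illustration of the auto-generated count, the base R sketch below counts how often each distinct path occurs. It assumes that "cnt" is the number of occurrences of each unique path, which is one reading of the note above and is not confirmed by this reference; the data frame and its values are hypothetical, and no Teradata function is called.

```r
# Toy input (hypothetical): one row per observed path, with no count column.
observed <- data.frame(
  path = c("A,B", "A,B", "A,C", "A,B", "A,C", "D"),
  stringsAsFactors = FALSE
)

# Count how many times each distinct path occurs, similar in spirit to a
# "GROUP BY path" aggregation with COUNT(*). The result column is named
# "cnt" to match the auto-generated column name mentioned in the note.
counts <- as.data.frame(
  table(path = observed$path),
  responseName = "cnt",
  stringsAsFactors = FALSE
)

print(counts)
```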


hash
Optional Argument.
Specifies whether to include the hash code of the output column node.
Default Value: FALSE
Types: logical


delimiter
Optional Argument.
Specifies the single-character delimiter that separates symbols in the path string.
Note: Do not use any of the following characters as delimiter (they cause the function to fail): Asterisk (*), Plus (+), Left parenthesis ((), Right parenthesis ()), Single quotation mark ('), Escaped single quotation mark (\'), Backslash (\).
Default Value: ","
Types: character
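A client-side check can catch an unsuitable delimiter before the function is called. The sketch below is a minimal base R illustration under the restrictions listed in the note above; the helper names is_valid_delimiter and split_path are hypothetical and are not part of the Teradata R package.

```r
# Single characters that the note above says must not be used as the delimiter.
# In R source, "\\" denotes one backslash character.
forbidden_delimiters <- c("*", "+", "(", ")", "'", "\\")

# TRUE only for a single, non-missing, one-character string that is not
# forbidden. Multi-character values such as an escaped quote are rejected
# by the length check.
is_valid_delimiter <- function(delimiter) {
  is.character(delimiter) &&
    length(delimiter) == 1L &&
    !is.na(delimiter) &&
    nchar(delimiter) == 1L &&
    !(delimiter %in% forbidden_delimiters)
}

# Split a path string into its symbols. fixed = TRUE makes strsplit treat
# the delimiter literally instead of as a regular expression.
split_path <- function(path, delimiter = ",") {
  stopifnot(is_valid_delimiter(delimiter))
  strsplit(path, delimiter, fixed = TRUE)[[1]]
}

print(split_path("home,search,checkout"))
print(is_valid_delimiter("*"))
```

Treating the delimiter literally matters for characters such as "|" or ".", which have special meaning in regular expressions.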


data.sequence.column
Optional Argument.
Specifies the column or vector of columns that uniquely identifies each row of the input argument "data". This argument ensures deterministic results for functions whose results can otherwise vary from run to run.
Types: character OR vector of Strings (character)


The function returns an object of class "td_path_analyzer_mle", which is a named list containing Teradata tbl objects.
Named list members can be referenced directly with the "$" operator using the following names:

  1. output.table

  2. output


    # Get the current context/connection
    con <- td_get_context()$connection
    # Load example data.
    loadExampleData("pathgenerator_example", "clickstream1")
    # Create remote tibble objects.
    clickstream1 <- tbl(con, "clickstream1")
    # Example 1 - Use the click stream data to run path analysis functions
    td_path_analyzer_mle_out1 <- td_path_analyzer_mle(data = clickstream1,
                                                     seq.column = "path",
                                                     count.column = "cnt"
                                                     )
    # Example 2 - Exclude count.column argument, and the function generates the count column
    td_path_analyzer_mle_out2 <- td_path_analyzer_mle(data = clickstream1,
                                                     seq.column = "path"
                                                     )

    # Example 3 - Use the output of NPath (td_npath_sqle) function as an input
    loadExampleData("npath_example1", "bank_web_clicks2")
    # Create remote tibble objects.
    bank_web_clicks2 <- tbl(con, "bank_web_clicks2")
    # Execute npath function.
    td_npath_out <- td_npath_sqle(
                       data1 = bank_web_clicks2,
                       data1.partition.column = c("customer_id", "session_id"),
                       data1.order.column = "datestamp",
                       mode = "nonoverlapping",
                       pattern = "A*",
                       symbols = c("true AS A"),
                       result = c("ACCUMULATE (page OF A) AS page_path")
                       )

    # This takes the td_npath_out object as input and the count column gets auto-generated
    td_path_analyzer_mle_out3 <- td_path_analyzer_mle(data = td_npath_out,
                                                     seq.column = "page_path"
                                                     )