Teradata R Package Function Reference | 17.00 - PathAnalyzer - Teradata R Package

Teradata® R Package Function Reference

prodname: Teradata R Package
vrm_release: 17.00
created_date: September 2020
category: Programming Reference
featnum: B700-4007-090K

Description

The PathAnalyzer function:

  1. Passes a set of paths to the PathGenerator function (td_path_generator_mle).

  2. Passes the PathGenerator output to the PathSummarizer function (td_path_summarizer_mle).

  3. Passes the PathSummarizer output to the PathStart function (td_path_start_mle), which outputs, for each parent, all of its children and the number of times that users traveled to each child.

Usage

  td_path_analyzer_mle (
      data = NULL,
      seq.column = NULL,
      count.column = NULL,
      hash = FALSE,
      delimiter = ",",
      data.sequence.column = NULL
  )

Arguments

data

Required Argument.
Specifies either the input tbl_teradata that contains the paths to analyze, or the output of the td_npath_sqle function.
Each path is a string of alphanumeric symbols that represents an ordered sequence of page views (or actions). Typically each symbol is a code that represents a unique page view.

seq.column

Required Argument.
Specifies the name of the input tbl_teradata column that contains the paths.
Types: character

count.column

Optional Argument.
Specifies the name of the input tbl_teradata column that contains the number of times a path was traveled.
Note: If this argument is not specified, a column "cnt" is auto-generated with the number of unique paths in the "seq.column" column.
Types: character
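
The following base R sketch illustrates what a count column such as "cnt" might hold. It assumes the generated value is the number of times each distinct path string occurs in "seq.column"; that reading is an assumption, and the function's actual server-side behavior may differ. The `paths` vector is hypothetical local data, not part of the example tables.

```r
# Hypothetical local data standing in for the "seq.column" values;
# in practice these rows live in a tbl_teradata.
paths <- c("1,2,3", "1,2", "1,2,3", "4,5", "1,2")

# Count how many times each distinct path string occurs
# (assumed meaning of the auto-generated "cnt" column).
path_counts <- as.data.frame(table(path = paths), stringsAsFactors = FALSE)
names(path_counts)[2] <- "cnt"

# One row per distinct path, with its occurrence count.
path_counts
```

If the input table already stores one row per distinct path with its frequency, passing that column as count.column avoids relying on the generated one.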

hash

Optional Argument.
Specifies whether to include the hash code of the output column node.
Default Value: FALSE
Types: logical

delimiter

Optional Argument.
Specifies the single-character delimiter that separates symbols in the path string.
Note: Do not use any of the following characters as delimiter (they cause the function to fail):
Asterisk (*), Plus (+), Left parenthesis ((), Right parenthesis ()), Single quotation mark ('), Escaped single quotation mark (\'), Backslash (\).
Default Value: ","
Types: character
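
Because a forbidden delimiter causes the function to fail, it can help to check a candidate value before making the call. The sketch below is one way to do that in base R. The helper name `is_valid_path_delimiter` is hypothetical and not part of the Teradata R package; it encodes only the rules stated above (a single character that is not in the forbidden list).

```r
# Hypothetical helper (not part of the Teradata R package): checks a
# candidate delimiter against the rules stated in this section.
is_valid_path_delimiter <- function(delimiter) {
  # Single characters the documentation lists as causing the function to fail.
  forbidden <- c("*", "+", "(", ")", "'", "\\")
  is.character(delimiter) &&
    length(delimiter) == 1 &&
    !is.na(delimiter) &&
    nchar(delimiter) == 1 &&      # also rejects the two-character \' sequence
    !(delimiter %in% forbidden)
}

is_valid_path_delimiter(",")   # default delimiter
is_valid_path_delimiter("*")   # listed as forbidden
```

A check like this could run just before td_path_analyzer_mle() so that an invalid delimiter is caught on the client side.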

data.sequence.column

Optional Argument.
Specifies the vector of column(s) that uniquely identifies each row of the input argument "data". The argument is used to ensure deterministic results for functions which produce results that vary from run to run.
Types: character OR vector of Strings (character)

Value

The function returns an object of class "td_path_analyzer_mle", which is a named list containing objects of class "tbl_teradata".
Named list members can be referenced directly with the "$" operator using the following names:

  1. output.table

  2. output

Examples

    # Get the current context/connection
    con <- td_get_context()$connection
    
    # Load example data.
    loadExampleData("pathgenerator_example", "clickstream1")
    
    # Create object(s) of class "tbl_teradata".
    clickstream1 <- tbl(con, "clickstream1")
    
    # Example 1 - Use the click stream data to run path analysis functions
    td_path_analyzer_mle_out1 <- td_path_analyzer_mle(data = clickstream1,
                                                     seq.column = "path",
                                                     count.column = "cnt"
                                                     )

    # Example 2 - Omit the count.column argument; the function generates the count column.
    td_path_analyzer_mle_out2 <- td_path_analyzer_mle(data = clickstream1,
                                                     seq.column = "path"
                                                     )

    # Example 3 - Use the output of the NPath function, td_npath_sqle(), as input.
    loadExampleData("npath_example1", "bank_web_clicks2")
    
    # Create object(s) of class "tbl_teradata".
    bank_web_clicks2 <- tbl(con, "bank_web_clicks2")
    
    # Execute the NPath function.
    td_npath_out <- td_npath_sqle(
                       data1 = bank_web_clicks2,
                       data1.partition.column = c("customer_id", "session_id"),
                       data1.order.column = "datestamp",
                       mode = "nonoverlapping",
                       pattern = "A*",
                       symbols = c("true AS A"),
                       result = c("ACCUMULATE (page OF A) AS page_path")
                       )

    # Pass the td_npath_out object as input; the count column is auto-generated.
    td_path_analyzer_mle_out3 <- td_path_analyzer_mle(data = td_npath_out,
                                                     seq.column = "page_path"
                                                     )