Sessionize - Teradata R Package Function Reference - 17.00

Teradata® R Package Function Reference

Product name: Teradata R Package
Release: 17.00
Created: September 2020
Category: Programming Reference
Feature number: B700-4007-090K

Description

The Sessionize function maps each click in a session to a unique session identifier. A session is a sequence of clicks by one user in which consecutive clicks are separated by at most n seconds, where n is the "time.out" value.
Note: This function is available only when tdplyr is connected to Vantage 1.1 or later.
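
The session definition above can be illustrated with a small base R sketch. This is not the td_sessionize_mle implementation and does not run on Vantage; the helper name sessionize_sketch, the sample data, and the choice to number sessions from 0 within each user are assumptions made only for illustration.

```r
# Illustrative sketch only: base R, not the td_sessionize_mle implementation.
# `sessionize_sketch` and the sample data below are hypothetical.
sessionize_sketch <- function(clicks, user_col, time_col, time_out) {
  # Order rows by user, then by click time (mirrors partition/order columns).
  clicks <- clicks[order(clicks[[user_col]], clicks[[time_col]]), , drop = FALSE]
  # Seconds since the previous click of the same user (NA for the first click).
  gap <- ave(as.numeric(clicks[[time_col]]), clicks[[user_col]],
             FUN = function(t) c(NA, diff(t)))
  # A session starts at a user's first click or after a gap longer than time_out.
  new_session <- is.na(gap) | gap > time_out
  # Count session starts within each user; subtract 1 so numbering starts at 0.
  clicks$session_id <- ave(as.integer(new_session), clicks[[user_col]],
                           FUN = cumsum) - 1L
  rownames(clicks) <- NULL
  clicks
}

clicks <- data.frame(
  user = c(1, 1, 1, 1, 2, 2),
  clicktime = as.POSIXct("2020-09-01 10:00:00", tz = "UTC") +
    c(0, 30, 100, 150, 5, 200)
)
out <- sessionize_sketch(clicks, "user", "clicktime", time_out = 60)
print(out)
# session_id column: 0 0 1 1 0 1
```

With a 60-second timeout, the 70-second gap for user 1 and the 195-second gap for user 2 each start a new session, while the 30-second and 50-second gaps do not.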

Usage

  td_sessionize_mle (
      data = NULL,
      time.column = NULL,
      time.out = NULL,
      click.lag = NULL,
      emit.null = FALSE,
      data.sequence.column = NULL,
      data.partition.column = NULL,
      data.order.column = NULL
  )

Arguments

data

Required Argument.
Specifies the input tbl_teradata object.

data.partition.column

Required Argument.
Specifies Partition By columns for "data".
To partition by multiple columns, provide the column names as a vector.
Types: character OR vector of Strings (character)

data.order.column

Required Argument.
Specifies Order By columns for "data".
To order by multiple columns, provide the column names as a vector.
Types: character OR vector of Strings (character)

time.column

Required Argument.
Specifies the name of the input column that contains the click times.
Note: The "time.column" must also be one of the "data.order.column" columns.
Types: character

time.out

Required Argument.
Specifies the number of seconds after which the session times out. If more than "time.out" seconds elapse after a click, then the next click starts a new session.
Types: numeric

click.lag

Optional Argument.
Specifies the minimum number of seconds between clicks for the session user to be considered human. If clicks are more frequent, indicating that the user is a "bot", the function ignores the session. The "click.lag" value must be less than the "time.out" value.
Types: numeric
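
As a rough illustration of the "click.lag" rule, the base R sketch below checks the documented constraint that "click.lag" is less than "time.out" and then marks clicks that follow the previous click too quickly. The helper flag_rapid_fire is hypothetical, and flagging individual clicks is an assumption; how the function itself marks or ignores such sessions may differ.

```r
# Illustrative sketch only; `flag_rapid_fire` is a hypothetical helper, and
# flagging individual clicks is an assumption, not the documented behavior.
flag_rapid_fire <- function(times, click_lag, time_out) {
  # Documented constraint: click.lag must be less than time.out.
  stopifnot(is.numeric(click_lag), is.numeric(time_out), click_lag < time_out)
  # Work on click times in ascending order, expressed in seconds.
  times <- sort(as.numeric(times))
  # Seconds since the previous click (NA for the first click).
  gap <- c(NA, diff(times))
  # TRUE where a click follows the previous one by less than click_lag seconds.
  !is.na(gap) & gap < click_lag
}

flags <- flag_rapid_fire(c(0, 0.1, 5, 5.05, 40), click_lag = 0.2, time_out = 60)
print(flags)
# FALSE TRUE FALSE TRUE FALSE
```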

emit.null

Optional Argument.
Specifies whether to output rows that have NULL values in their session id and rapid fire columns, even if their "time.column" has a NULL value.
Default Value: FALSE
Types: logical

data.sequence.column

Optional Argument.
Specifies the vector of column(s) that uniquely identifies each row of the input argument "data". The argument is used to ensure deterministic results for functions which produce results that vary from run to run.
Types: character OR vector of Strings (character)

Value

The function returns an object of class "td_sessionize_mle", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result.

Examples

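    # Load the required packages.
    library(dplyr)
    library(tdplyr)

    # Get the current context/connection.
    # A connection to Vantage must already be established.
    con <- td_get_context()$connection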
    
    # Load example data.
    loadExampleData("sessionize_example", "sessionize_table")

    # Create object(s) of class "tbl_teradata".
    # The table contains web clickstream data recorded as a user navigates through a web site.
    # Events (view, click, etc.) are recorded with a timestamp.
    sessionize_table <- tbl(con, "sessionize_table")

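    # Example: assign session identifiers using a 60-second timeout and
    # a minimum click lag of 0.2 seconds.
    td_sessionize_out <- td_sessionize_mle(data = sessionize_table,
                                           data.partition.column = c("partition_id"),
                                           data.order.column = c("clicktime"),
                                           time.column = "clicktime",
                                           time.out = 60,
                                           click.lag = 0.2)

    # Reference the output tbl_teradata with the "$" operator.
    td_sessionize_out$result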