Teradata R Package Function Reference | 17.00 - 17.00 - NTree - Teradata R Package

Teradata® R Package Function Reference

Teradata R Package
September 2020
Programming Reference


The NTree function is a hierarchical analysis function that can build and traverse tree structures on all worker machines. The function reads the data only once from the disk and creates the trees in memory.


  td_ntree_mle (
      data = NULL,
      data.partition.column = "1",
      data.order.column = NULL,
      root.node = NULL,
      node.id = NULL,
      parent.id = NULL,
      allow.cycles = FALSE,
      starts.with = NULL,
      mode = NULL,
      output = NULL,
      max.distance = 5,
      logging = FALSE,
      result = NULL,
      data.sequence.column = NULL



Required Argument.
Specifies the input tbl_teradata containing the input data.


Optional Argument.
Specifies Partition By columns for "data".
Values to this argument can be provided as a vector, if multiple columns are used for partition.
Default Value: "1"
Types: character OR vector of Strings (character)


Optional Argument.
Specifies Order By columns for "data".
Values to this argument can be provided as a vector, if multiple columns are used for ordering.
Types: character OR vector of Strings (character)


Required Argument.
Specifies the logical SQL expression that defines the root nodes of the trees (for example, parent.id IS NULL).
Types: character


Required Argument.
Specifies the SQL expression whose value uniquely identifies a node in the input tbl_teradata.
Note: A node can appear multiple times in the data set, with different parents.
Types: character


Required Argument.
Specifies the SQL expression whose value identifies the parent node.
Types: character


Optional Argument.
Specifies whether trees can contain cycles. If not, a cycle in the data set causes the function to throw an exception.
Default Value: FALSE
Types: logical


Required Argument.
Specifies the node from which to start tree traversal.
Permitted Values: ROOT, LEAF, or a SQL expression that identifies a node.
Types: character


Required Argument.
Specifies the direction of tree traversal from the start node, either up to the root node or down to the leaf nodes.
Permitted Values: UP, DOWN
Types: character


Required Argument.
Specifies when to output a tuple, either at every node along the traversal path ("all") or only at the end of the traversal path ("end").
Default Value: END
Permitted Values: END, ALL
Types: character


Optional Argument.
Specifies the maximum tree depth.
Default Value: 5
Types: integer


Optional Argument.
Specifies whether the function prints log messages.
Default Value: FALSE
Types: logical


Required Argument.
Specifies aggregate operations to perform during tree traversal. The function reports the result of each aggregate operation in the output tbl_teradata. The syntax of aggregate is:
operation (expression) [ ALIAS alias ], where

  1. operation is either PATH, SUM, LEVEL, MAX, MIN, IS_CYCLE, AVG, or PROPAGATE.

  2. expression is a SQL expression. If operation is LEVEL or IS_CYCLE, then expression must be *.

  3. alias is the name of the output tbl_teradata column that contains the result of the operation.

The default value is the string "operation(expression)" without the quotation marks. For example, PATH(node_name).
Note: The function ignores alias if it is the same as an input tbl_teradata column name.
For the path from the "starts.with" node to the last traversed node, the operations do the following:

  1. PATH: Outputs the value of expression for each node, separating values with "->".

  2. SUM: Computes the value of expression for each node and outputs the sum of these values.

  3. LEVEL: Outputs the number of hops.

  4. MAX: Computes the value of expression for each node and outputs the highest of these values.

  5. MIN: Computes the value of expression for each node and outputs the lowest of these values.

  6. IS_CYCLE: Outputs the cycle (if any).

  7. AVG: Computes the value of expression for each node and outputs the average of these values.

  8. PROPAGATE: Evaluates expression with the value of the "starts.with" node and propagates the result to every node.

Types: character


Optional Argument.
Specifies the vector of column(s) that uniquely identifies each row of the input argument "data". The argument is used to ensure deterministic results for functions which produce results that vary from run to run.
Types: character OR vector of Strings (character)


Function returns an object of class "td_ntree_mle" which is a named list containing object of class "tbl_teradata".
Named list member can be referenced directly with the "$" operator using the name: result.


    # Get the current context/connection.
    con <- td_get_context()$connection
    # Load example data
    loadExampleData("ntree_example", "employee_table", "emp_table_by_dept")

    # Create object(s) of class "tbl_teradata".
    employee_table <- tbl(con, "employee_table")
    emp_table_by_dept <- tbl(con, "emp_table_by_dept")

    # Example 1 - This example finds the employees who report to employee
    # 100 (either directly or indirectly) by traversing the tree
    # of employees from employee 100 downward.
    td_ntree_out1 <- td_ntree_mle(data=employee_table,
                                  root.node = 'mgr_id is NULL',
                                  result='PATH(emp_name) AS path'

    # Example 2 - This example finds the reporting structure by department.
    td_ntree_out2 <- td_ntree_mle(data=emp_table_by_dept,
                                  root.node = "mgr_id = 'none'",
                                  result='PATH(name) AS path, PATH(id) as path2'