Teradata R Package Function Reference | 17.00 - CoxPH - Teradata R Package

Teradata® R Package Function Reference

prodname: Teradata R Package
vrm_release: 17.00
created_date: September 2020
category: Programming Reference
featnum: B700-4007-090K

Description

The CoxPH function is named for the Cox proportional hazards model, a statistical survival model. The function estimates the model coefficients from a set of explanatory variables (the feature columns). The output of the CoxPH function is input to the CoxHazardRatio (td_cox_hazard_ratio_mle) and CoxSurvival (td_cox_survival_mle) functions.
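
For orientation, the Cox proportional hazards model is usually written in the following textbook form. The notation is an assumption made here for illustration and does not come from this reference page: h_0(t) is the baseline hazard, x_1, ..., x_p are the explanatory variables, and beta_1, ..., beta_p are the coefficients to be estimated.

```latex
h(t \mid x_1, \dots, x_p) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right)
```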

Usage

  td_coxph_mle (
      data = NULL,
      feature.columns = NULL,
      time.interval.column = NULL,
      event.column = NULL,
      threshold = 1.0E-9,
      max.iter.num = 10,
      categorical.columns = NULL,
      accumulate = NULL,
      data.sequence.column = NULL
  )

Arguments

data

Required Argument.
Specifies the name of the tbl_teradata object that contains the input parameters.

feature.columns

Required Argument.
Specifies the names of the columns from the input tbl_teradata used in the "data" argument that contain the features of the input parameters.
Types: character OR vector of Strings (character)

time.interval.column

Required Argument.
Specifies the name of the column from the input tbl_teradata used in the "data" argument that contains the time intervals of the input parameters; that is, end_time - start_time, in any unit of time (for example, years, months, or days).
Types: character

event.column

Required Argument.
Specifies the name of the column from the input tbl_teradata used in the "data" argument that contains 1 if the event occurred by end_time and 0 if it did not. (0 represents survival or right-censorship.) The function ignores values other than 1 and 0.
Types: character
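
As a rough illustration of the two columns described above, the following base R sketch derives a time interval column and a 0/1 event column from raw records. The data frame and its column names (start_date, end_date, died) are hypothetical and are not part of this package; loading the result into Teradata is outside the scope of the sketch.

```r
# Hypothetical patient records; all names and values here are illustrative only.
patients <- data.frame(
  id         = 1:4,
  start_date = as.Date(c("2020-01-01", "2020-01-15", "2020-02-01", "2020-02-10")),
  end_date   = as.Date(c("2020-03-01", "2020-06-15", "2020-02-20", "2020-08-10")),
  died       = c(TRUE, FALSE, TRUE, FALSE)
)

# Time interval = end_time - start_time, expressed here in days.
patients$time_int <- as.numeric(patients$end_date - patients$start_date)

# Event indicator: 1 if the event occurred by end_time,
# 0 otherwise (survival or right-censorship).
patients$status <- ifelse(patients$died, 1L, 0L)

print(patients[, c("id", "time_int", "status")])
```

Any consistent unit of time works for the interval, as long as every row uses the same one.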

threshold

Optional Argument.
Specifies the convergence threshold.
Default Value: 1.0E-9
Types: numeric

max.iter.num

Optional Argument.
Specifies the maximum number of iterations that the function runs before finishing, if the convergence threshold has not been met.
Default Value: 10
Types: integer
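
To show how a convergence threshold and an iteration cap typically interact, here is a minimal base R sketch that fits a single-coefficient Cox model by Newton-Raphson on the partial likelihood. This reference does not say which optimizer td_coxph_mle uses or which quantity "threshold" is compared against, so both the algorithm and the step-size stopping rule below are assumptions. The toy data and the helper score_and_info are hypothetical.

```r
# Toy data: follow-up time, event indicator (1 = event, 0 = censored),
# and one numeric covariate. Event times are distinct (no ties).
time   <- c(5, 8, 12, 15, 20, 22)
status <- c(1, 1, 0, 1, 1, 0)
x      <- c(1.2, 0.7, 0.3, 0.9, -0.4, 0.1)

threshold    <- 1.0E-9  # mirrors the default of the "threshold" argument
max.iter.num <- 10      # mirrors the default of the "max.iter.num" argument

# Score (first derivative) and information (negative second derivative)
# of the Cox partial log-likelihood for a single coefficient.
score_and_info <- function(beta) {
  score <- 0
  info  <- 0
  for (i in which(status == 1)) {
    at_risk <- time >= time[i]            # risk set at the i-th event time
    w       <- exp(beta * x[at_risk])     # relative hazards in the risk set
    xbar    <- sum(w * x[at_risk]) / sum(w)
    x2bar   <- sum(w * x[at_risk]^2) / sum(w)
    score   <- score + (x[i] - xbar)
    info    <- info + (x2bar - xbar^2)
  }
  c(score = score, info = info)
}

beta      <- 0
converged <- FALSE
for (iter in seq_len(max.iter.num)) {
  si   <- score_and_info(beta)
  step <- si[["score"]] / si[["info"]]    # Newton-Raphson update
  beta <- beta + step
  if (abs(step) < threshold) {            # assumed stopping rule: small step
    converged <- TRUE
    break
  }
}

cat("iterations:", iter, "converged:", converged, "beta:", beta, "\n")
```

If the loop ends without the step falling below the threshold, the estimate from the last iteration is returned as is, which is why a larger iteration cap can matter for harder problems.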

categorical.columns

Optional Argument.
Specifies the names of the columns from the input tbl_teradata used in the "data" argument that contain categorical predictors. Each categorical column must also be a feature column. By default, the function detects the categorical columns by their SQL data types.
Types: character OR vector of Strings (character)

accumulate

Optional Argument.
Specifies the names of the columns from the input tbl_teradata used in the "data" argument that the function copies to "linear.predictor.table".
Types: character OR vector of Strings (character)

data.sequence.column

Optional Argument.
Specifies the column or vector of columns that uniquely identifies each row of the input argument "data". Use this argument to ensure deterministic results for functions whose results can vary from run to run.
Types: character OR vector of Strings (character)

Value

The function returns an object of class "td_coxph_mle", which is a named list containing objects of class "tbl_teradata".
Named list members can be referenced directly with the "$" operator using the following names:

  1. coefficient.table

  2. linear.predictor.table

  3. output

Examples

    # Get the current context/connection
    con <- td_get_context()$connection
    
    # Load example data.
    # The input table, lungcancer, contains data from a randomized trial of two treatment
    # regimens for lung cancer, used here for survival analysis. There are three categorical
    # predictors and three numerical predictors.

    loadExampleData("coxph_example", "lungcancer")

    # Create object(s) of class "tbl_teradata".
    lungcancer <- tbl(con, "lungcancer")

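    # Example 1 - Fit a Cox proportional hazards model on lungcancer using
    # three categorical and three numerical predictors.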
    td_coxph_out <- td_coxph_mle(data = lungcancer,
                                 feature.columns = c("trt", "celltype", "karno", "diagtime", "age",
                                                     "prior"),
                                 time.interval.column = "time_int",
                                 event.column = "status",
                                 categorical.columns = c("trt", "celltype", "prior")
                                 )
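
The members listed in the Value section can then be inspected with the "$" operator. The following lines are a sketch that assumes td_coxph_out from Example 1 and an active Teradata connection; they do not run on their own.

```r
# Assumes td_coxph_out was created by Example 1 above.

# Table of estimated coefficients.
td_coxph_out$coefficient.table

# Linear predictor table; columns named in "accumulate" are copied here.
td_coxph_out$linear.predictor.table

# Third member of the returned list.
td_coxph_out$output
```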