Teradata® Package for R Function Reference

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for R
Release Number: 17.20
Published: March 2024
Language: English (United States)
Last Update: 2024-05-03
dita:id: TeradataR_FxRef_Enterprise_1720
Product Category: Teradata Vantage

UnivariateStatistics

Description

The td_univariate_statistics_sqle() function displays descriptive statistics for each specified numeric column of the input tbl_teradata.

Usage

  td_univariate_statistics_sqle (
      newdata = NULL,
      target.columns = NULL,
      partition.columns = NULL,
      stats = 'ALL',
      centiles = c(1, 5, 10, 25, 50, 75, 90, 95, 99),
      trim.percentile = 20,
      ...
  )

Arguments

newdata

Required Argument.
Specifies the input tbl_teradata.
Types: tbl_teradata

target.columns

Required Argument.
Specifies the name(s) of the column(s) in "newdata" for which univariate statistics are to be displayed.
Types: character OR vector of Strings (character)

partition.columns

Optional Argument.
Specifies the name(s) of the column(s) in "newdata" on which to partition the input.
Types: character OR vector of Strings (character)

stats

Optional Argument.
Specifies the statistics to calculate.
Permitted Values:

  • SUM

  • COUNT or CNT

  • MAXIMUM or MAX

  • MINIMUM or MIN

  • MEAN

  • UNCORRECTED SUM OF SQUARES or USS

  • NULL COUNT or NLC

  • POSITIVE VALUES COUNT or PVC

  • NEGATIVE VALUES COUNT or NVC

  • ZERO VALUES COUNT or ZVC

  • TOP5 or TOP

  • BOTTOM5 or BTM

  • RANGE or RNG

  • GEOMETRIC MEAN or GM

  • HARMONIC MEAN or HM

  • VARIANCE or VAR

  • STANDARD DEVIATION or STD

  • STANDARD ERROR or SE

  • SKEWNESS or SKW

  • KURTOSIS or KUR

  • COEFFICIENT OF VARIATION or CV

  • CORRECTED SUM OF SQUARES or CSS

  • MODE

  • MEDIAN or MED

  • UNIQUE ENTITY COUNT or UEC

  • INTERQUARTILE RANGE or IQR

  • TRIMMED MEAN or TM

  • PERCENTILES or PRC

  • ALL

Default Value: 'ALL'
Types: character OR vector of Strings (character)
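
Several of the permitted values are named only by abbreviation. The following base R sketch shows the textbook definitions of a few of them on a toy vector. It is an illustration under that assumption and does not claim to reproduce the exact formulas used by the SQL Engine. The vector "x" and the variable names are hypothetical.

```r
# Toy data; the values are illustrative only.
x <- c(1, 2, 4, 8)
n <- length(x)

uss <- sum(x^2)               # uncorrected sum of squares (USS)
css <- sum((x - mean(x))^2)   # corrected sum of squares (CSS)
gm  <- exp(mean(log(x)))      # geometric mean (GM); needs positive values
hm  <- n / sum(1 / x)         # harmonic mean (HM); needs non-zero values
rng <- max(x) - min(x)        # range (RNG)

print(c(USS = uss, CSS = css, GM = gm, HM = hm, RNG = rng))
```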

centiles

Optional Argument.
Specifies the percentile(s) to calculate.
The function ignores "centiles" unless "stats" specifies PERCENTILES, PRC, or ALL.
Default Value: c(1, 5, 10, 25, 50, 75, 90, 95, 99)
Types: integer or vector of integers
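
The "centiles" values are whole-number percentages, not fractions. As a rough local analogue, the base R sketch below converts them to probabilities for quantile(). The interpolation rule used by the SQL Engine is not stated here, so results on real data may differ from R's default method. The data vector is hypothetical.

```r
centiles <- c(25, 50, 75)   # same form as the "centiles" argument
x <- 0:100                  # toy data

# quantile() expects probabilities in [0, 1], so divide by 100.
q <- quantile(x, probs = centiles / 100, names = FALSE)
print(q)
```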

trim.percentile

Optional Argument.
Specifies the trimmed lower percentile used when calculating the trimmed mean (TRIMMED MEAN or TM).
Default Value: 20
Types: integer
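
To give a feel for what a trimmed mean does, the sketch below uses base R's mean() with its "trim" argument on a toy vector. This is only an analogy: whether the SQL Engine trims in exactly the same way as R is an assumption, not something stated in this reference.

```r
x <- c(1, 2, 3, 4, 100)     # toy data with one large outlier
trim.percentile <- 20       # same form as the "trim.percentile" argument

# mean() takes the trimmed fraction per tail, so divide by 100.
tm <- mean(x, trim = trim.percentile / 100)
print(tm)
```

With five values and 20 percent trimmed from each end, the smallest value (1) and the largest (100) are dropped, so the result is the mean of 2, 3 and 4, which is 3.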

...

Specifies the generic keyword arguments SQLE functions accept.
Below are the generic keyword arguments:

persist:
Optional Argument.
Specifies whether to persist the results of the function in a table or not.
When set to TRUE, results are persisted in a table; otherwise, results are garbage collected at the end of the session.
Default Value: FALSE
Types: logical

volatile:
Optional Argument.
Specifies whether to put the results of the function in a volatile table or not.
When set to TRUE, results are stored in a volatile table, otherwise not.
Default Value: FALSE
Types: logical

The function allows the user to partition, hash, order, or local order the input data. These generic arguments are available for each argument that accepts a tbl_teradata as input and can be accessed as:

  • "<input.data.arg.name>.partition.column" accepts character OR vector of Strings (character)

  • "<input.data.arg.name>.hash.column" accepts character OR vector of Strings (character)

  • "<input.data.arg.name>.order.column" accepts character OR vector of Strings (character)

  • "local.order.<input.data.arg.name>" accepts logical

Note:
tdplyr supports these generic arguments only if the underlying SQL Engine function supports them; otherwise, an exception is raised.

Value

The function returns an object of class "td_univariate_statistics_sqle", which is a named list containing an object of class "tbl_teradata".
The named list member can be referenced directly with the "$" operator using the name: result

Examples

  
    
    # Get the current context/connection.
    con <- td_get_context()$connection
    
    # Load the example data.
    loadExampleData("tdplyr_example", "titanic")
    
    # Create tbl_teradata object.
    titanic_data <- tbl(con, "titanic")
    
    # Check the list of available analytic functions.
    display_analytic_functions()
    
    # Example 1: Display descriptive statistics of input
    #            column "fare", partitioned by "sex" and "age".
    obj <- td_univariate_statistics_sqle(
            newdata=titanic_data,
            target.columns='fare',
            partition.columns=c('sex', 'age'),
            stats='ALL',
            centiles=c(1, 5, 10, 25, 50, 75, 90, 95, 99),
            trim.percentile=20)
    
    # Print the result.
    print(obj$result)
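
The example above requests every statistic. A narrower call might look like the sketch below, which asks for a few statistics plus selected percentiles and persists the result. It uses only the arguments documented on this page. The wrapper name "fare_summary" is hypothetical and not part of tdplyr. The call itself assumes tdplyr is loaded, a context exists, and the "titanic" example table has been loaded as shown above.

```r
# Hypothetical helper (not part of tdplyr): wraps the documented call so
# the same settings can be reused on any tbl_teradata with a "fare" column.
fare_summary <- function(data) {
  td_univariate_statistics_sqle(
    newdata = data,
    target.columns = "fare",
    partition.columns = "sex",
    stats = c("MEAN", "MEDIAN", "PRC"),  # abbreviations are permitted
    centiles = c(25, 50, 75),            # used because stats includes PRC
    persist = TRUE)                      # keep the result table after the session
}

# Usage, assuming an active connection and the titanic_data object above:
# obj <- fare_summary(titanic_data)
# print(obj$result)
```

Defining the helper does not contact the database. The SQL Engine function runs only when the helper is called with a tbl_teradata.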