7.00.02.01 - Example 5: Using aa.tapply() - Aster R

Teradata Aster® R User Guide Update 3

prodname: Aster R
vrm_release: 7.00.02.01
created_date: December 2017
category: Programming Reference, User Guide
featnum: B700-1033-700K

The examples in this section illustrate the use of aa.tapply(), which applies an R function to each partition of a virtual data frame. The data frame is partitioned by the values of the column or columns given in the INDEX argument.

These examples use the dataset "npk" from the MASS package.
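For readers without a cluster at hand, the partition-and-apply idea can be approximated locally with base R's tapply(). The following is a minimal sketch, not Aster R code: it assumes that the npk data frame bundled with R's datasets package holds the same data as the MASS copy, and local_avg is a hypothetical name used only for illustration.

```r
# Local, in-memory analogue of partition-and-apply, using base R only.
# Assumption: datasets::npk holds the same data as the MASS copy of npk.
peas <- datasets::npk

# Split the yield column by block and apply mean() to each group.
local_avg <- tapply(peas$yield, INDEX = peas$block, FUN = mean)

# One named element per block.
print(local_avg)
```

Unlike aa.tapply(), this runs entirely in the local R session, so it is only a way to see what kind of result to expect on a small dataset.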

  1. Create the table in the database.
    library(MASS)
    
    ta.create(npk, table = 'npk', 
              schemaName = 'myschema', tableType = 'fact', 
              partitionKey = 'block')
  2. Create the virtual data frame and update the column names to lowercase.
    tadf_peas<-ta.data.frame("npk", schemaName = "myschema")
    
    ta.colnames(tadf_peas)<-c("block","n","p","k","yield")
  3. Create a function vec_avg that returns the mean of a vector. The later steps apply it to the yield values in each partition.
    vec_avg <- function(vect) {
      v <- mean(vect)
      return(v)
    }
    
  4. Use aa.tapply() to run the function vec_avg, calculating the average yield by block.
    r1<-aa.tapply(tadf_peas[,5], FUN = vec_avg, 
                  INDEX=tadf_peas$block, out.format=list(type="object")) 
    

    The output "r1" is a virtual object of class "aa.object" with length 6, one element per block.

    > r1
    $block=6 
    --------------
    [1] 56.35
    
    $block=2 
    --------------
    [1] 57.45
    
    $block=3 
    --------------
    [1] 60.775
    
    $block=4 
    --------------
    [1] 50.125
    
    $block=1 
    --------------
    [1] 54.025
    
    $block=5 
    --------------
    [1] 50.525
    
    > class(r1)
    [1] "aa.object"
    > ta.length(r1)
    [1] 6
  5. Use aa.tapply() to run the function vec_avg, this time partitioning with two columns.
    r2<-aa.tapply(tadf_peas[,5], FUN = vec_avg,
            INDEX=tadf_peas[,c("block","n")], out.format=list(type="object")) 
    
    The output "r2" has one element for each combination of block and n (12 in total):
    > r2
    $block=6,$n=0 
    -------------------
    [1] 54.6
    
    $block=2,$n=1 
    -------------------
    [1] 59.15
    
    $block=2,$n=0 
    -------------------
    [1] 55.75
    
    $block=6,$n=1 
    -------------------
    [1] 58.1
    
    $block=4,$n=1 
    -------------------
    [1] 55.4
    
    $block=3,$n=0 
    -------------------
    [1] 58.9
    
    $block=3,$n=1 
    -------------------
    [1] 62.65
    
    $block=4,$n=0 
    -------------------
    [1] 44.85
    
    $block=1,$n=0 
    -------------------
    [1] 48.15
    
    $block=5,$n=1 
    -------------------
    [1] 50.9
    
    $block=5,$n=0 
    -------------------
    [1] 50.15
    
    $block=1,$n=1 
    -------------------
    [1] 59.9
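The two-column partitioning in step 5 can be cross-checked locally in the same spirit. This is a sketch under assumptions rather than part of the Aster R workflow: it uses base R's aggregate() on datasets::npk (assumed to match the MASS copy), and local_by_block_n is a hypothetical name.

```r
# Local cross-check of the two-key partitioning, using base R only.
# Assumption: datasets::npk holds the same data as the MASS copy of npk.
peas <- datasets::npk
names(peas) <- tolower(names(peas))  # block, n, p, k, yield

# aggregate() returns one row per (block, n) combination,
# with the mean yield of that group in the yield column.
local_by_block_n <- aggregate(yield ~ block + n, data = peas, FUN = mean)

print(nrow(local_by_block_n))
print(local_by_block_n)
```

The groups in the aa.tapply() output above are not listed in sorted order, so compare the two results by key (block and n) rather than by position.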