Example 5: Using aa.tapply() - Aster R

Teradata Aster® R User Guide, Update 3

Product: Aster R
Release Number: 7.00.02.01
Published: December 2017
Language: English (United States)
Last Update: 2018-04-13
Product Category: Software

The examples in this section illustrate aa.tapply(), which applies an R function to each partition of a virtual data frame. The partitions are defined by the values of the column or columns passed in the INDEX argument.
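The partition-then-apply idea can be sketched locally in base R with split() and lapply(). This is only an analogy for what INDEX and FUN do, not a description of how Aster R executes aa.tapply() in the database; the data frame toy and its columns grp and value are hypothetical and not part of the example that follows.

```r
# Hypothetical toy data, used only to illustrate the idea.
toy <- data.frame(
  grp   = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 10, 20, 30)
)

# Partition the values by the grouping column (the role INDEX plays).
parts <- split(toy$value, toy$grp)

# Apply the function to each partition independently (the role FUN plays).
result <- lapply(parts, mean)

print(result)
```

Each element of result holds the function's return value for one partition, named after the partition's key.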

These examples use the dataset "npk" from the MASS package.
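Before loading the data, it can help to inspect it in a local R session. The sketch below uses the copy of npk in R's built-in datasets package so that it runs without loading MASS; the variable name peas is an assumption made for illustration.

```r
# Local copy of the data set; "peas" is a hypothetical name.
peas <- datasets::npk

dim(peas)            # number of rows and columns
names(peas)          # original column names, before they are lowercased
nlevels(peas$block)  # number of distinct blocks
table(peas$block)    # observations per block
```

The column names N, P, and K are uppercase in the local data, which is why the procedure renames the columns to lowercase after creating the virtual data frame.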

  1. Create the table in the database.
    library(MASS)
    
    ta.create(npk, table = 'npk', 
              schemaName = 'myschema', tableType = 'fact', 
              partitionKey = 'block')
  2. Create the virtual data frame and update the column names to lowercase.
    tadf_peas <- ta.data.frame("npk", schemaName = "myschema")
    
    ta.colnames(tadf_peas) <- c("block", "n", "p", "k", "yield")
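  3. Create a function vec_avg that returns the mean of a numeric vector. The following steps use it to calculate the average yield for each partition.
    # Return the arithmetic mean of the values in vect.
    vec_avg <- function(vect) {
      v <- mean(vect)
      return(v)
    }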
    
  4. Use aa.tapply() to run the function vec_avg, calculating the average yield by block.
    # Column 5 is yield; INDEX partitions the rows by block.
    r1 <- aa.tapply(tadf_peas[, 5], FUN = vec_avg,
                    INDEX = tadf_peas$block, out.format = list(type = "object"))
    

    The output "r1" is a virtual object (type "aa.object") of length 6, with one element per block. The blocks are not listed in sorted order.

    > r1
    $block=6 
    --------------
    [1] 56.35
    
    $block=2 
    --------------
    [1] 57.45
    
    $block=3 
    --------------
    [1] 60.775
    
    $block=4 
    --------------
    [1] 50.125
    
    $block=1 
    --------------
    [1] 54.025
    
    $block=5 
    --------------
    [1] 50.525
    
    > class(r1)
    [1] "aa.object"
    > ta.length(r1)
    [1] 6
  5. Use aa.tapply() to run the function vec_avg again, this time partitioning by two columns, block and n.
    r2 <- aa.tapply(tadf_peas[, 5], FUN = vec_avg,
                    INDEX = tadf_peas[, c("block", "n")], out.format = list(type = "object"))
    
    The output "r2" has one element for each combination of block and n (12 in total):
    > r2
    $block=6,$n=0 
    -------------------
    [1] 54.6
    
    $block=2,$n=1 
    -------------------
    [1] 59.15
    
    $block=2,$n=0 
    -------------------
    [1] 55.75
    
    $block=6,$n=1 
    -------------------
    [1] 58.1
    
    $block=4,$n=1 
    -------------------
    [1] 55.4
    
    $block=3,$n=0 
    -------------------
    [1] 58.9
    
    $block=3,$n=1 
    -------------------
    [1] 62.65
    
    $block=4,$n=0 
    -------------------
    [1] 44.85
    
    $block=1,$n=0 
    -------------------
    [1] 48.15
    
    $block=5,$n=1 
    -------------------
    [1] 50.9
    
    $block=5,$n=0 
    -------------------
    [1] 50.15
    
    $block=1,$n=1 
    -------------------
    [1] 59.9
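
As a cross-check, the averages from steps 4 and 5 can be reproduced in a local R session with base R's tapply(). This is a minimal sketch that runs on the in-memory copy of npk from the built-in datasets package, not on the database table; the names peas, by_block, and by_block_n are assumptions made for illustration.

```r
# Local copy of the data with lowercase column names, mirroring step 2.
peas <- datasets::npk
names(peas) <- tolower(names(peas))

# Same function as in step 3.
vec_avg <- function(vect) {
  mean(vect)
}

# Local analogue of step 4: average yield per block.
by_block <- tapply(peas$yield, peas$block, vec_avg)

# Local analogue of step 5: average yield per (block, n) combination.
by_block_n <- tapply(peas$yield,
                     list(block = peas$block, n = peas$n),
                     vec_avg)

print(by_block)
print(by_block_n)
```

Unlike the aa.object returned by aa.tapply(), the local results are an ordinary named vector and a 6 x 2 matrix, ordered by factor level. The values should agree with the outputs shown for r1 and r2.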