Example 3: Regression - Aster R

Teradata Aster® R User GuideUpdate 3

Product
Aster R
Release Number
7.00.02.01
Published
December 2017
Language
English (United States)
Last Update
2018-04-13
dita:mapPath
fop1497542774450.ditamap
dita:ditavalPath
Generic_no_ie_no_tempfilter.ditaval
dita:id
fbp1477004286096
lifecycle
previous
Product Category
Software
The input table in this example is taken from the "cats" dataset in the R package "MASS".

This example reads all of the rows from the input table, builds a linear regression model, and outputs a table containing the original input columns plus a column containing the fitted values from the regression.

  1. Create the R script with the name "cats_regr_1103.R".
    # Script filename: cats_regr_1103.R
    #File descriptor pointing to standard input
    in_table = file(description="stdin",open="r") 
    
    # Read column values from stdin into a vector, nrows = -1 means read all rows
    while(1){ 
        fields<-try(read.table(in_table,header=FALSE,sep="\t",quote="",nrows=-1),silent=TRUE
        ) 
        if(inherits(fields,"try-error"))   
        break 
        #Fields are from database table cats_tmp 
        sex <- as.character(fields[,1]) 
        body_weight <- as.numeric(fields[,2]) 
        heart_weight   <-
        as.numeric(fields[,3]) 
         
        # do regression 
        regr <- lm(heart_weight ~ body_weight) 
        outrec<-data.frame(sex, body_weight, heart_weight, regr$fitted )   
    
        #write to standard output stream 
        write.table(outrec,
        stdout(),col.names=FALSE,row.names=FALSE,quote=FALSE,sep="\t")
    }
    # END
  2. Load the R package "MASS" for use in this example.
    library(MASS)
  3. Create a table in the database based on the input data "cats" from the loaded R package "MASS", and convert it to a virtual data frame.
    ta.create(cats, 
    table="cats_tmp", 
    tableType="dimension", 
    colTypes=c("varchar", "numeric", "numeric")
    )
    
    db.tadf.cats <- ta.data.frame("cats_tmp")
  4. Create the vector defining the output column names and types.
    outputs_cats <- c("sex char","BodyWt numeric","HeartWt numeric","FittedHtWt numeric")
  5. Install the R script on the cluster.
    ta.install.scripts('cats_regr_1103.R', schema="public")
  6. Use the Aster R runner function ta.source() to run the script.
    ta.source(data=db.tadf.cats, script="cats_regr_1103.R", output=outputs_cats)
    The output is shown here.
    
        sex BodyWt HeartWt    FittedHtWt
    1     F    2.0     7.0      7.711463
    2     F    2.0     7.4      7.711463
    3     F    2.0     9.5      7.711463
    4     F    2.1     7.2      8.114869
    5     F    2.1     7.3      8.114869
    6     F    2.1     7.6      8.114869
    7     F    2.1     8.1      8.114869
    8     F    2.1     8.2      8.114869
    9     F    2.1     8.3      8.114869
    10    F    2.1     8.5      8.114869
    11    F    2.1     8.7      8.114869
    12    F    2.1     9.8      8.114869
    13    F    2.2     7.1      8.518276
    14    F    2.2     8.7      8.518276
    15    F    2.2     9.1      8.518276
    16    F    2.2     9.7      8.518276
    17    F    2.2    10.9      8.518276
    18    F    2.2    11.0      8.518276
    19    F    2.3     7.3      8.921682
    20    F    2.3     7.9      8.921682
    ...   ...  ...     ...      ...
    Users can also run this example using ACT commands from within the Aster Database as shown here. The output is the same.
    1. Create the R script and the input table "cats_tmp" in the database, as in Step 1 to 3.
    2. Install the R script.
      \install cats_regr_1103.R
    3. Call the Stream function as shown here.
      
      select * from stream (on cats_tmp 
      script('Rexec --vanilla cats_regr_1103.R') 
      outputs('sex varchar', 'BodyWt numeric','HeartWt numeric', 'FittedHtWt numeric') delimiter('\t') 
      );