The input table in this example is taken from the "cats" dataset in the R package "MASS".
This example reads all of the rows from the input table, builds a linear regression model, and outputs a table containing the original input columns plus a column containing the fitted values from the regression.
-
Create the R script with the name "cats_regr_1103.R".
# Script filename: cats_regr_1103.R #File descriptor pointing to standard input in_table = file(description="stdin",open="r") # Read column values from stdin into a vector, nrows = -1 means read all rows while(1){ fields<-try(read.table(in_table,header=FALSE,sep="\t",quote="",nrows=-1),silent=TRUE ) if(inherits(fields,"try-error")) break #Fields are from database table cats_tmp sex <- as.character(fields[,1]) body_weight <- as.numeric(fields[,2]) heart_weight <- as.numeric(fields[,3]) # do regression regr <- lm(heart_weight ~ body_weight) outrec<-data.frame(sex, body_weight, heart_weight, regr$fitted ) #write to standard output stream write.table(outrec, stdout(),col.names=FALSE,row.names=FALSE,quote=FALSE,sep="\t") } # END
-
Load the R package "MASS" for use in this example.
library(MASS)
-
Create a table in the database based on the input data "cats" from the loaded R package "MASS", and convert it to a virtual data frame.
ta.create(cats, table="cats_tmp", tableType="dimension", colTypes=c("varchar", "numeric", "numeric") ) db.tadf.cats <- ta.data.frame("cats_tmp")
-
Create the vector defining the output column names and types.
outputs_cats <- c("sex char","BodyWt numeric","HeartWt numeric","FittedHtWt numeric")
-
Install the R script on the cluster.
ta.install.scripts('cats_regr_1103.R', schema="public")
-
Use the Aster R runner function ta.source() to run the script.
ta.source(data=db.tadf.cats, script="cats_regr_1103.R", output=outputs_cats)
The output is shown here.sex BodyWt HeartWt FittedHtWt 1 F 2.0 7.0 7.711463 2 F 2.0 7.4 7.711463 3 F 2.0 9.5 7.711463 4 F 2.1 7.2 8.114869 5 F 2.1 7.3 8.114869 6 F 2.1 7.6 8.114869 7 F 2.1 8.1 8.114869 8 F 2.1 8.2 8.114869 9 F 2.1 8.3 8.114869 10 F 2.1 8.5 8.114869 11 F 2.1 8.7 8.114869 12 F 2.1 9.8 8.114869 13 F 2.2 7.1 8.518276 14 F 2.2 8.7 8.518276 15 F 2.2 9.1 8.518276 16 F 2.2 9.7 8.518276 17 F 2.2 10.9 8.518276 18 F 2.2 11.0 8.518276 19 F 2.3 7.3 8.921682 20 F 2.3 7.9 8.921682 ... ... ... ... ...
Users can also run this example using ACT commands from within the Aster Database as shown here. The output is the same.- Create the R script and the input table "cats_tmp" in the database, as in Step 1 to 3.
-
Install the R script.
\install cats_regr_1103.R
-
Call the Stream function as shown here.
select * from stream (on cats_tmp script('Rexec --vanilla cats_regr_1103.R') outputs('sex varchar', 'BodyWt numeric','HeartWt numeric', 'FittedHtWt numeric') delimiter('\t') );