R Table Operator Example: Echo Example - Advanced SQL Engine - Teradata Database

SQL External Routine Programming

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
rin1593638965306.ditamap
dita:ditavalPath
rin1593638965306.ditaval
dita:id
B035-1147
lifecycle
previous
Product Category
Teradata Vantageā„¢

The example shows an R program that echoes a table it reads from input.

Prerequisites

The following is required to run the example:

  • R 3.2.1 or later installed
  • Run DipRTblOp, if not previously run

Echo Example

create table test(i int, j int);

insert into test values(10,1);
insert into test values(20,2);
insert into test values(30,3);
insert into test values(40,4);
insert into test values(50,5);

sel * from test;

sel * from td_sysgpl.execr (
  ON (sel * from test)
  using
  keepLog(1)
  contract
  (
    'library(tdr);
     on_clause_input_stream <- 0;
     on_clause_output_stream <- 0;
     direction <- "R";

     incols <- tdr.GetColDef(on_clause_input_stream, direction);
     tdr.SetOutputColDef(on_clause_output_stream, incols);'
  )
  operator
  (
    'library(tdr);

     streamin <- 0;
     streamout <- 0;
     read_direction <- "R";
     write_direction <- "W";

     options <- 0;
     inHandle <- tdr.Open(read_direction, streamin, options);
     outHandle <- tdr.Open(write_direction, streamout, options);

     numcols <- tdr.GetColCount(streamin, read_direction);

     while (tdr.Read(inHandle) == 0)
     {
       lapply( 0: (numcols - 1),
               function(index){
                 att <- tdr.GetAttributeByNdx(inHandle, index, NULL);
                 tdr.SetAttributeByNdx(outHandle, index, att, NULL);
               });

       tdr.Write(outHandle);
     };

     tdr.Close(inHandle);
     tdr.Close(outHandle);'
  )
) as Rexp;

Result:

          i            j
-----------  -----------
         20            2
         10            1
         40            4
         30            3
         50            5

Explanation of Echo Example

For this query the ON clause is kept simple. There is no Hash By, Partition By, nor Order By specified. Thus none of the rows are redistributed and are read as is on the AMPs for input to the Operator code. Because the first column values are unique, we can specify a hash by i (or j) to ensure that each AMP (on a four AMP system) gets at least one row.

The keepLog key is specified with a value of 1 to write stdout to files in /home/tdatuser. There will be one log file for the Contract and N log files for the Operator (one for each AMP) on an N AMP system.

In both the Contract and the Operator, when using tdr.GetColDef and tdr.Open respectively, streamin is used to reference input from an ON clause by position. Integers from 0 to 15 can be specified because up to 16 ON clauses can be used. In the Contract, the variable is called on_clause_input_stream.

The Contract is executed by a single thread and can store information to be used by the AMPs when they process the Operator code. The main purpose of the Contract is to set up the output column definitions. In this case, the output column definitions are simply the same as the input.

Because each of the rows in the test table are on different AMPs and because the Operator code gets executed by each AMP in parallel, the input for each AMP will typically vary. The consequence of this can be seen in the while loop. The while loop keeps calling tdr.Read on the input handle until it reaches the end of the input stream.

The tdr.Read function reads a row at a time from the input stream. AMPs that have no input rows to process will just close the input and output handles. They don't write anything to the output stream. For AMPs that do have input, the tdr.Read function will read the data into the input handle which can then be used to write output.

This latter case is seen inside the while loop in the example. After the row is read into the input handle, each column value is read from the input handle using tdr.GetAttributeByNdx and then subsequently written to the corresponding column of the output handle using tdr.SetAttributeByNdx whose column types were defined in the contract.