Tokenizing the Text with APPLY Table Operator

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

Use the Apply class to create a teradataml Apply object with the settings you want for the call to the APPLY Table Operator.

In this example, specify the following:
  • The apply_command argument to call the R interpreter in your user environment and execute your script.
  • The returns argument with the list of output variables and types returned by your script.
  • The env_name argument to specify your user environment handler.
  1. Call the Apply class.
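    # Imports required by this call (skip if already imported in your session).
    from teradataml import Apply, display
    from teradatasqlalchemy.types import VARCHAR

    apply_obj = Apply(
        apply_command='Rscript tokenizers.R',   # command run in the user environment
        returns={"OUTPUT": VARCHAR(200)},       # output column name and type
        env_name=demo_env,                      # user environment handler
        style='csv',
        delimiter=';',
        quotechar='#'
    )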
    You can request to view SQL queries submitted by teradataml in the background with the following statement:
    display.print_sqlmr_query = True
    If some of the output is hidden or not visible, you can use the following statement to display 100 rows. You can display up to 999999 rows.
    display.max_rows = 100
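The delimiter and quotechar arguments in the Apply call follow the usual CSV idea of a field separator and a quoting character. As a rough illustration only, the sketch below uses Python's standard csv module (not teradataml, and not necessarily how Vantage formats data internally) to show how a field that contains ';' survives a round trip when '#' is the quote character. The sample field values are made up for this example.

```python
import csv
import io

# Hypothetical sample fields; the first one contains the delimiter itself.
fields = ["now is; the hour", "of our discontent"]

# Write one record using ';' as the delimiter and '#' as the quote character.
buffer = io.StringIO()
writer = csv.writer(
    buffer,
    delimiter=";",
    quotechar="#",
    quoting=csv.QUOTE_ALL,   # quote every field, for illustration
    lineterminator="\n",
)
writer.writerow(fields)
encoded = buffer.getvalue()
print(encoded, end="")

# Read the record back with the same settings; the embedded ';' is preserved.
decoded = next(csv.reader(io.StringIO(encoded), delimiter=";", quotechar="#"))
print(decoded)
```

Because the first field is wrapped in '#', the ';' inside it is treated as data rather than as a column boundary.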
  2. Run the R script inside the user environment with the execute_script method of the Apply class object.
    apply_obj.execute_script().head(n=5)
    Observe that after you run the statement, the system first prints the corresponding SQL query, as requested, and then returns the results.
    SELECT DISTINCT * FROM apply (
    RETURNS (OUTPUT VARCHAR(200))
    using
    environment('tokenizers_r_env')
    apply_command('Rscript tokenizers.R')
    style('csv')
    delimiter(';')
    quotechar('#')
    ) as dt order by 1;
    Output:
    [1] "now i" "ow is" "w is " " is t" "is th" "s the" " the " "the h" "he ho
     [1] "now"   "now "  "now i" "ow "   "ow i"  "ow is" "w i"   "w is"  "w is
     [1] "now" "owi" "wis" "ist" "sth" "the" "heh" "eho" "hou" "our" "uro" "rof
     [1] "nowis" "owist" "wisth" "isthe" "stheh" "theho" "hehou" "ehour" "houro
    [10] " is"   " is "  " is t" "is "   "is t"  "is th" "s t"   "s th"  "s the
    [10] "e hou" " hour" "hour " "our o" "ur of" "r of " " of o" "of ou" "f our
    [10] "ourof" "urofo" "rofou" "ofour" "fourd" "ourdi" "urdis" "rdisc" "disco
    [13] "ofo" "fou" "our" "urd" "rdi" "dis" "isc" "sco" "con" "ont" "nte" "ten
    [19] " our " "our d" "ur di" "r dis" " disc" "disco" "iscon" "scont" "conte
    [19] " th"   " the"  " the " "the"   "the "  "the h" "he "   "he h"  "he ho
    [19] "iscon" "scont" "conte" "onten" "ntent"
    [25] "ent"
    [28] "e h"   "e ho"  "e hou" " ho"   " hou"  " hour" "hou"   "hour"  "hour
    [28] "onten" "ntent"
    [37] "our"   "our "  "our o" "ur "   "ur o"  "ur of" "r o"   "r of"  "r of
    [46] " of"   " of "  " of o" "of "   "of o"  "of ou" "f o"   "f ou"  "f our
    [55] " ou"   " our"  " our " "our"   "our "  "our d" "ur "   "ur d"  "ur di
    [64] "r d"   "r di"  "r dis" " di"   " dis"  " disc" "dis"   "disc"  "disco
    [73] "isc"   "isco"  "iscon" "sco"   "scon"  "scont" "con"   "cont"  "conte
    [82] "ont"   "onte"  "onten" "nte"   "nten"  "ntent" "ten"   "tent"  "ent"
    [[1]]
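
The output above is easier to read once you know what each interleaved stream of tokens is. The tokenizers.R script itself is not shown in this section, so the following is only a sketch under assumptions: that the script produces character shingles (overlapping character n-grams), and that the input text is "Now is the hour of our discontent", which is inferred from the token fragments in the output. The helper char_shingles is a hypothetical name for this illustration and is not part of teradataml; its parameters loosely mirror those of tokenize_character_shingles in the R tokenizers package.

```python
def char_shingles(text, n=3, n_min=None, strip_non_alphanum=True):
    """Return overlapping character n-grams of lengths n_min..n (illustrative helper)."""
    if n_min is None:
        n_min = n
    cleaned = text.lower()
    if strip_non_alphanum:
        # Drop spaces and punctuation so shingles span word boundaries.
        cleaned = "".join(ch for ch in cleaned if ch.isalnum())
    shingles = []
    for start in range(len(cleaned) - n_min + 1):
        for size in range(n_min, n + 1):
            # Near the end of the text, only the shorter lengths still fit.
            if start + size <= len(cleaned):
                shingles.append(cleaned[start:start + size])
    return shingles


# Assumed input text, inferred from the fragments in the output.
text = "Now is the hour of our discontent"

five_with_spaces = char_shingles(text, n=5, strip_non_alphanum=False)
three_to_five = char_shingles(text, n=5, n_min=3, strip_non_alphanum=False)
three_stripped = char_shingles(text, n=3)
five_stripped = char_shingles(text, n=5)

for name, tokens in [
    ("n=5, spaces kept", five_with_spaces),
    ("n=3..5, spaces kept", three_to_five),
    ("n=3, alphanumeric only", three_stripped),
    ("n=5, alphanumeric only", five_stripped),
]:
    print(name, len(tokens), tokens[:3])
```

Under these assumptions, the four calls yield 29, 90, 25, and 23 tokens. These counts agree with the highest R index markers visible in the output: [28] followed by two tokens, [82] followed by nine, [25] followed by one, and the final [19] line with five.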