Requirements for Writing a SQL-MapReduce Function in C - Aster Execution Engine

Teradata Aster® Developer Guide

Product: Aster Execution Engine
Release Number: 7.00.02
Published: July 2017
Language: English (United States)
Last Update: 2018-04-13
Product Category: Software

You define a C-based SQL-MapReduce function by writing a function that satisfies the following requirements and uses the following API resources. (For examples, see the sqlmr-sdk/example directory.)

  1. Headers to include: Include the SQL-MapReduce headers you need from sqlmr/api/c/ (in the sqlmr-sdk/include directory). Most SQL-MapReduce C functions include Core.h, FunctionModule.h, NativeValue.h, RowBuilder.h, RowIterator.h, RowView.h, and RuntimeContract.h. Because it is a row function, the echo_input example must also include RowFunction.h.
  2. Making the function known to an Aster instance: In a SQLMR_FUNCTION_MODULE_BEGIN / END block, register the function by setting the entry's newRowFunction or newPartitionFunction member. For example:
    SQLMR_FUNCTION_MODULE_BEGIN()
    {
       SQLMR_FUNCTION_ENTRY("echo_input")
       {
          entry->newRowFunction = &echo_input_newRowFunction;
       }
    }
    SQLMR_FUNCTION_MODULE_END()
  3. Row/partition implementation requirements: Implement your function by doing one of the following:
    • If the function is a row function, write the prototype and implementation of the my_function_name_newRowFunction() function and the my_function_name_operateOnSomeRows() function. The row function must complete the runtime contract, as explained in RuntimeContract.h.
    • If the function is a partition function, write the prototype and implementation of the my_function_name_newPartitionFunction() function and the my_function_name_operateOnPartition() function. The partition function must complete the runtime contract, as explained in RuntimeContract.h.
  4. Handling arguments: If your SQL-MapReduce function takes arguments in the SQL command, use the facilities provided in ArgumentClause.h (SqlmrArgumentClauseH) to add argument clauses to your function. For an example, see sqlmr-sdk/example/c/repeat_input/repeat_input.c.
  5. Helper files: You can install files on the cluster to act as SQL-MapReduce helper files or to hold data that you do not wish to store in the database. If your SQL-MapReduce function uses or operates on an installed file, use the functions of InstalledFile.h (such as sqlmr_if_getInstalledFiles and sqlmr_if_openForRead). For an example, see sqlmr-sdk/example/c/list_files/list_files.c.
  6. Naming conventions for API functions: The user-facing functions provided by the SQL-MapReduce C API have names starting with "sqlmr_". Often the name also contains a two-letter code indicating which module provides the function. These include:
    • sqlmr_rc_* for functions provided by RuntimeContract
    • sqlmr_ri_* for functions provided by RowIterator
    • sqlmr_rb_* for functions provided by RowBuilder
    • sqlmr_rv_* for functions provided by RowView
  7. Naming conventions for datatypes: Datatype names in the SQL-MapReduce C API follow these rules:
    • All types in the API start with "Sqlmr".
    • Some types also end with "H" (for example, SqlmrRowViewH). The "H" stands for "handle," meaning that such types are actually pointers to some opaque type.
    • By contrast, types that do not end with "H" (for example, SqlmrNativeValue) are value types, and are structs with a non-opaque representation.
  8. Error handling: Handle errors and return error information using the functions declared in sqlmr-sdk/include/sqlmr/api/c/core/Error.h.
  9. Memory Management: See Memory Management in C API Functions.
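
The naming conventions in items 6 and 7 can be illustrated with a small, self-contained sketch. It does not use the SQL-MapReduce SDK: every Demo and demo_ name below is hypothetical and exists only to mirror the pattern of a handle type ending in "H" (a pointer to an opaque struct), a value type without the "H" (a plain struct), and a two-letter module code in each function name. The real types, functions, and signatures are defined by the headers in sqlmr-sdk/include.

```c
/* Hypothetical stand-ins: none of the Demo / demo_ names below belong to
 * the SQL-MapReduce SDK. They only mirror its naming conventions. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Value type: no "H" suffix, so the struct layout is visible to callers. */
typedef struct DemoNativeValue {
    int     isNull;  /* nonzero when the column holds SQL NULL */
    int64_t asInt;   /* payload, meaningful only when isNull == 0 */
} DemoNativeValue;

/* Handle type: "H" suffix, a pointer to a struct that callers treat as
 * opaque. In a real API the struct definition would sit in a private
 * source file; it appears here only to keep the sketch in one block. */
typedef struct DemoRowView *DemoRowViewH;

struct DemoRowView {
    size_t           columnCount;
    DemoNativeValue *columns;
};

/* The "rv" code marks functions of the (hypothetical) row-view module. */
DemoRowViewH demo_rv_create(size_t columnCount)
{
    DemoRowViewH view = malloc(sizeof *view);
    if (view == NULL) {
        return NULL;
    }
    view->columns = calloc(columnCount, sizeof *view->columns);
    if (view->columns == NULL && columnCount > 0) {
        free(view);
        return NULL;
    }
    view->columnCount = columnCount;
    for (size_t i = 0; i < columnCount; ++i) {
        view->columns[i].isNull = 1;  /* every column starts as NULL */
    }
    return view;
}

size_t demo_rv_getColumnCount(DemoRowViewH view)
{
    assert(view != NULL);
    return view->columnCount;
}

void demo_rv_setInt(DemoRowViewH view, size_t index, int64_t value)
{
    assert(view != NULL && index < view->columnCount);
    view->columns[index].isNull = 0;
    view->columns[index].asInt = value;
}

/* Returns a value type by copy; the caller owns the returned struct. */
DemoNativeValue demo_rv_getValue(DemoRowViewH view, size_t index)
{
    assert(view != NULL && index < view->columnCount);
    return view->columns[index];
}

void demo_rv_destroy(DemoRowViewH view)
{
    if (view == NULL) {
        return;
    }
    free(view->columns);
    free(view);
}

/* Usage: sums the non-NULL columns of a three-column row. */
int64_t demo_sum_example(void)
{
    DemoRowViewH view = demo_rv_create(3);
    if (view == NULL) {
        return -1;
    }
    demo_rv_setInt(view, 0, 40);
    demo_rv_setInt(view, 2, 2);   /* column 1 is left as NULL */

    int64_t sum = 0;
    for (size_t i = 0; i < demo_rv_getColumnCount(view); ++i) {
        DemoNativeValue v = demo_rv_getValue(view, i);
        if (!v.isNull) {
            sum += v.asInt;
        }
    }
    demo_rv_destroy(view);
    return sum;
}
```

In this sketch the caller reaches a handle only through the demo_rv_ functions, while a DemoNativeValue is copied and its fields are read directly. This is the distinction the "H" suffix signals.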