16.20 - Guidelines for Manipulating LOBs in an SQL Procedure - Teradata Database - Teradata Vantage NewSQL Engine

Teradata Vantage™ SQL Data Definition Language Detailed Topics

Product
Teradata Database
Teradata Vantage NewSQL Engine
Release Number
16.20
Release Date
March 2019
Content Type
Programming Reference
Publication ID
B035-1184-162K
Language
English (United States)
You must always exercise care when writing SQL procedures that access LOBs for the following reasons.
  • While procedures execute in the PE, LOBs are stored on the AMPs.
  • Procedures store a reference to the LOB that resides on the AMPs.
  • All procedure operations that need access to a LOB must transfer that entire LOB to the PE on which the procedure is running.
The following example converts a CLOB containing XML sales data into a set of rows that are inserted into a sales table. The procedure also does the following:
  • Stores the CLOB in a log table.
  • Stores half of the CLOB into a local variable.
  • Stores half of the CLOB into a different log table.
        CREATE TABLE sales (
          partnum   INTEGER, 
          qtysold   INTEGER,
          storecode INTEGER,
          salesdate DATE)
        PRIMARY INDEX (partnum, qtysold, storecode, salesdate);
    
        CREATE TABLE SalesLog (
          storecode INTEGER, 
          salesdate DATE,
          sales     CLOB,
          logdate   DATE,
          logtime   TIME)
        PRIMARY INDEX (storecode, salesdate, logdate, logtime);
    
        CREATE TABLE saleshalflog (
          storecode INTEGER, 
          salesdate DATE,
          sales     CLOB,
          logdate   DATE,
          logtime   TIME)
        PRIMARY INDEX (storecode, salesdate, logdate, logtime);
    
        CREATE PROCEDURE storessalesprocedure (
          storecode INTEGER,
          salesdate DATE,
          salesclob CLOB)
            BEGIN
             DECLARE localclob CLOB;
             SET localclob = SUBSTR(:salesclob,1,CHARACTERS(:salesclob)/2 );
             INSERT saleslog (:storecode, :salesdate, :salesclob,
                              CURRENT_DATE, CURRENT_TIME );
             INSERT saleshalflog (:storecode, :salesdate, :localclob,
                                  CURRENT_DATE, CURRENT_TIME );
             INSERT sales SELECT * FROM TABLE (xmlparser(:salesclob));
            END;

Suppose the procedure were to be invoked by passing a 1 MB CLOB, as follows.

    USING (
      storecode INTEGER,
      salesdate DATE,
      salesclob CLOB AS DEFERRED)
     CALL storessalesprocedure (:storecode, :salesdate, :salesclob);

For this case, the following process occurs:

  1. The CLOB is transferred from the client application to an AMP.
  2. The procedure is invoked with a reference to the 1 MB CLOB and the other data.
  3. The procedure builds a new CLOB, called localclob, by taking the substring of the 1 MB CLOB.
  4. The entire 1 MB CLOB is transferred to the PE.
  5. A half-MB CLOB is transferred to the AMP to be written for LocalClob.
  6. A reference to LocalClob is stored in the PE.
  7. The INSERT into the SalesLog table is sent to the Parser with a reference to the 1 MB CLOB.
  8. The Parser sends an INSERT step, containing a reference to the 1 MB CLOB, to the AMP.
  9. The AMP retrieves the 1 MB CLOB from the AMP where it is stored and inserts it into the SalesLog table.
  10. The INSERT into the SalesHalfLog table is sent to the Parser with a reference to the half-MB CLOB.
  11. The Parser sends an INSERT step, containing a reference to the half-MB CLOB, to the AMP.
  12. The AMP retrieves the half-MB CLOB from the AMP where it is stored and inserts into the SalesHalfLog table.
  13. The INSERT into the Sales table is sent to the Parser with a reference to the 1 MB CLOB.
  14. The Parser sends INSERT … SELECT steps, containing a reference to the 1 MB CLOB, to all AMPs.
  15. The AMP containing the 1 MB CLOB converts it into rows that are hash-redistributed to other AMPs.
  16. Those rows are inserted into the Sales table.
The use of LocalClob was unnecessary. It would have been more efficient not to use the local variable and instead place the substring expression in the INSERT statement like this.
    INSERT INTO saleshalflog VALUES (:storecode, :salesdate, SUBSTR(
                             :salesclob, 1, CHARACTERS(:salesclob)/2),
                             CURRENT_DATE, CURRENT_TIME);

If it is not possible to place the LOB functions in the SQL statements, then consider converting the data types of the LOBs to VARBYTE or VARCHAR. Each conversion still brings the entire LOB into the PE, but for small LOBs it might be an option.

For SELECT operations inside a procedure, use of the FOR syntax can prevent unnecessary transfer of LOBs. Consider the following example that shows a procedure selecting a LOB and inserting it into another table.

    CREATE TABLE sales (
      partnum   INTEGER, 
      qtysold   INTEGER,
      storecode INTEGER,
      salesdate DATE)
    PRIMARY INDEX (partnum, qtysold, storecode, salesdate);

    CREATE TABLE saleslog (
      storecode INTEGER, 
      salesdate DATE,
      sales     CLOB,
      logdate   DATE,
      logtime   TIME)
    PRIMARY INDEX (storecode, salesdate, logdate, logtime);

    CREATE PROCEDURE stores_sales_log_procedure
      (storecode INTEGER, salesdate DATE )
      BEGIN
          FOR cur AS curname CURSOR FOR
            SELECT * 
            FROM saleslog
            WHERE storecode = :storecode 
            AND   salesdate = :salesdate;
           DO
            IF (:cur.lobdate = DATE-1) THEN
               INSERT sales 
               SELECT *
               FROM TABLE (xml_sales_parser(:cur.sales));
            END IF;
          END FOR;
      END;

The procedure named stores_sales_log_procedure performs the following process.

  1. Stores a reference to the sales CLOB in :cur.sales.
  2. Sends this reference with the INSERT request into the sales table request to the Parser and the INSERT … SELECT steps sent to the AMPs.
  3. If :cur.sales were then copied to a local variable, the CLOB is transferred from the AMP to the PE and then back again to the AMP.

The SELECT INTO- and FETCH INTO-styles of procedure SELECT statements make copies to local variables that cause extra LOB transfers from the AMP to the PE and back to the AMP.

The following case is an example of how not to create an efficient procedure for handling LOB data.

    CREATE TABLE sales (
      partnum   INTEGER, 
      qtysold   INTEGER,
      storecode INTEGER,
      salesdate DATE)
    PRIMARY INDEX (partnum, qtysold, storecode, salesdate);

    CREATE TABLE saleslog (
      storecode INTEGER, 
      salesdate DATE,
      sales     CLOB,
      logdate   DATE,
      logtime   TIME)
    PRIMARY INDEX (storecode, salesdate, logdate, logtime);

    CREATE PROCEDURE stores_sales_stream_proc
     (storecode INTEGER, salesdate DATE, salesclob CLOB)
      BEGIN
          DECLARE resultrownum INTEGER;
          DECLARE partnum      INTEGER;
          DECLARE qtysold      INTEGER;
          INSERT INTO SalesLog VALUES (:storecode, :salesdate, :sales,
                                       CURRENT_DATE, CURRENT_TIME );
        CALL xmlsalesparser(:resultrownum,:partnum,:qtysold,:salesclob);
        WHILE (resultrownum> 0 )
        DO
          INSERT INTO Sales VALUES (:partnum, :qtysold, :storecode,
                                    :salesdate);
        CALL xmlsalesparser(:resultrownum,:partnum,:qtysold,:salesclob);
        END WHILE;
      END;