16.20 - Parallel Debugging Example - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - SQL External Routine Programming

Product: Advanced SQL Engine, Teradata Database
Release Number: 16.20
Release Date: April 2020
Content Type: Programming Reference
Publication ID: B035-1147-162K
Language: English (United States)

This example introduces the parallel debugging facilities of the Teradata C/C++ UDF Debugger. When a SQL request executes a UDF, multiple instances of the UDF may be running on one or more AMPs. Using the Teradata C/C++ UDF Debugger, you can switch between the instances running on those AMPs. This example will refer collectively to those instances as UDFs.

The Simple Debugging Example uses a query that runs a single UDF to illustrate the basic debugging process, and you can reproduce its steps exactly. The details of this example cannot be reproduced exactly because the results depend on your specific database system configuration. What you see may differ from what is shown here, but the example still illustrates some of the available parallel debugging capabilities.

Create a Table

To produce similar results to this example, create a 2-column table, named “testdata,” loaded with these values:

  a    b
---  ---
  1    2
  2    1
  3    4
  4    3
  5    6
  6    5
  7    8
  8    7
  9   10

You can create this table in BTEQ with this statement:

create table testdata ( a integer, b integer ) primary index ( a );

Populate this table with the rows shown above, either by entering them with INSERT statements in BTEQ or by loading them from a file using FastLoad.

For information on using BTEQ, see Basic Teradata® Query Reference, B035-2414. For information on using FastLoad, see Teradata® FastLoad Reference, B035-2411.

Debug Multiple Instances of a UDF

Under the debugger, execute a request to run plusudf (the UDF you created in Simple Debugging Example) for each row in the testdata table, for example:

set session debug function plusudf on;
select a, b, plusudf(a, b) from testdata;
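
The source for plusudf belongs to Simple Debugging Example and is not repeated here. As a rough standalone sketch, assuming the four-argument form that gdb prints later in this example (a, b, result, sqlstate) and the statement shown at plusudf.c line 12, the following C models what the query computes for each row of testdata. The INTEGER typedef, the TestRow type, and the select_plusudf helper are stand-ins invented for this sketch; a real UDF takes its types from the Teradata UDF header and is called by the database, not by a loop like this one.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the Teradata INTEGER type (assumption for this sketch;
   a real UDF gets its types from the Teradata UDF header). */
typedef int INTEGER;

/* Sketch of plusudf, based only on the argument list and the line 12
   statement that gdb displays in this example. */
void plusudf(INTEGER *a, INTEGER *b, INTEGER *result, char sqlstate[6])
{
    assert(a != NULL && b != NULL && result != NULL && sqlstate != NULL);
    *result = *a + *b;   /* the statement gdb shows at plusudf.c line 12 */
}

/* Hypothetical row type holding the two columns of the testdata table. */
typedef struct {
    INTEGER a;
    INTEGER b;
} TestRow;

/* The nine rows listed under "Create a Table". */
static const TestRow TESTDATA[] = {
    {1, 2}, {2, 1}, {3, 4}, {4, 3}, {5, 6},
    {6, 5}, {7, 8}, {8, 7}, {9, 10}
};

#define TESTDATA_ROWS (sizeof TESTDATA / sizeof TESTDATA[0])

/* Hypothetical helper: calls plusudf once per row, as the SELECT does,
   and stores one result per row. Returns the number of rows processed. */
size_t select_plusudf(INTEGER results[], size_t capacity)
{
    size_t n = 0;
    for (size_t i = 0; i < TESTDATA_ROWS && i < capacity; i++) {
        INTEGER a = TESTDATA[i].a;
        INTEGER b = TESTDATA[i].b;
        char sqlstate[6] = "00000";   /* the value gdb shows on entry */
        plusudf(&a, &b, &results[i], sqlstate);
        n++;
    }
    return n;
}
```

Calling select_plusudf with an array of at least nine elements fills in one sum per row. On the database, these calls are spread across the AMPs rather than made in a single loop, which is why the debugger has several UDFs to manage at once.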

Follow the steps described in Simple Debugging Example. After joining this query to the debugger you will see something like:

(gdb) join 1003
Reading symbols for task udfsectsk... (libudf.so, 0x7ffff7bb7000)
   (libudf1.so, 0x7ffff79b4000) (libstdc++.so.6, 0x7ffff76a9000)
   (libjil.so, 0x7ffff748b000) (libnetpde.so, 0x7ffff723f000)
   (libpde.so, 0x7ffff6f80000) (libemf.so, 0x7ffff6d72000)
   (libpdesym.so, 0x7ffff6b5b000) (libpthread.so.0, 0x7ffff693e000)
   (libelf.so.1, 0x7ffff672a000) (libnsl.so.1, 0x7ffff6512000)
   (libm.so.6, 0x7ffff62bc000) (libc.so.6, 0x7ffff5f5a000)
   (libdl.so.2, 0x7ffff5d56000) (ld-linux-x86-64.so.2, 0x7ffff7dde000)
   (libgcc_s.so.1, 0x7ffff5b3f000) (libacl.so.1, 0x7ffff5937000)
   (libcrypto.so.0.9.8, 0x7ffff5598000)
   (libthread_db.so.1, 0x7ffff5390000) (libattr.so.1, 0x7ffff518b000) 
   (libz.so.1, 0x7ffff4f75000) (libudf_1026_17.so, 0x7ffff355b000) done.
Node  Pid   Tid   Type
   3  5758  5758  C   
   2  5748  5748  C   
   1  5743  5743  C   
   0  5753  5753  C   

Instead of just one line in the table listing the UDFs joined, this now shows one line for each AMP (Node 3, 2, 1, and 0). Note that the debugger uses the label “Node” to display the vproc number; an AMP is a specific type of vproc. If you set a breakpoint on plusudf, which the debugger resolves to line 12 of plusudf.c, and continue, you will see something like this:

(gdb) b plusudf
Breakpoint 1 at 0x7ffff356d12c: file plusudf.c, line 12.
(gdb) c
Continuing.
0//5753: Secondary breakpoint #1 at plusudf()   (plusudf.c  line 12)
1//5743: Secondary breakpoint #1 at plusudf()   (plusudf.c  line 12)
3//5758: Secondary breakpoint #1 at plusudf()   (plusudf.c  line 12)
[Switching to 2//5748]

Breakpoint 1, plusudf (a=0x7ffff7fa1940, b=0x7ffff7fa1950, result=0x7ffff7fa1964, sqlstate=0x7ffff7fa196c "00000")
    at plusudf.c:12
12          *result = *a + *b;

The example above shows UDFs reaching the breakpoint on all four AMPs at the same time. Debugging must be performed serially on the UDFs, so the debugger makes one of them current. The breakpoint on vproc 2 was reached first, so the debugger switches the current context to that UDF (indicated by [Switching to 2//5748]). The other UDFs that reached the breakpoint (on vprocs 0, 1, and 3) are labeled Secondary breakpoint and remain stopped there while you work on the thread in the current context.

When you display variables at this point, only the UDF in the current context (vproc 2, thread 5748) is affected:

(gdb) p *a
$1 = 7
(gdb) p *b
$2 = 8
(gdb) p *result
$3 = 0

When you step execution, the function continues from the line where it stopped:

(gdb) n
13      }
(gdb) p *result
$4 = 15

You can see that the line executed and that printing *result returns the value for that UDF.

You can switch to one of the other UDFs waiting at a breakpoint with the context command (abbreviated as an @ sign) followed by the thread selector:

(gdb) @ 3//5758
[3//5758]
(gdb) p *a
$5 = 6

Note that after you select a new UDF and display an argument, it shows the value for the newly selected UDF. You can go through all UDFs currently stopped and step each one individually. You can also set context to all active UDFs and display a variable for all of them with one command:

(gdb) @ all
[1//5743]
(gdb) p *a
[1//5743]
$6 = 9
[0//5753]
$7 = 5
[2//5748]
$8 = 7
[3//5758]
$9 = 6

You can provide one or more specific thread selectors to set context to one or more specific tasks. When multiple tasks are in context, the value for each task is preceded by a line showing the value’s thread selector. Although you can show values for multiple UDFs, you can only step execution in one UDF at a time. The thread selector that is displayed after the @ all command shows which UDF that will be.

Continuing execution resumes all currently active UDFs, which in this example run to completion and have a result similar to the following:

(gdb) c
Continuing.
[Switching to 0//5753]
Breakpoint 1, plusudf (a=0x7ffff7fa1940, b=0x7ffff7fa1950, result=0x7ffff7fa1964, sqlstate=0x7ffff7fa196c "00000")
    at plusudf.c:12
12          *result = *a + *b;

Notice that in this example, the UDF reaches the breakpoint on vproc 0, which is the same vproc on which a UDF completed before but for a different table row. Secondary breakpoints could occur here if multiple UDFs hit the breakpoint at the same time, but only one UDF hit it in this instance.
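
The same vproc stops again because each AMP runs the UDF once for every row it holds. The sketch below is a toy model of that idea only. The modulo placement rule is a hypothetical stand-in: Teradata chooses the AMP by hashing the primary index value, and this is not that hash, so the counts will not match the vproc assignments seen in this session. The function names are likewise invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_AMPS 4   /* this example shows four vprocs: 0, 1, 2, 3 */

/* HYPOTHETICAL placement rule, for illustration only. The database
   hashes the primary index value; this modulo is NOT that hash. */
static int toy_amp_for_row(int a)
{
    return a % NUM_AMPS;
}

/* Counts how many times the UDF would run on each AMP, and therefore how
   many times a breakpoint inside it would be hit there, given the primary
   index value of each row. */
void count_udf_calls_per_amp(const int a_values[], size_t rows,
                             int calls[NUM_AMPS])
{
    for (int amp = 0; amp < NUM_AMPS; amp++)
        calls[amp] = 0;

    for (size_t i = 0; i < rows; i++) {
        assert(a_values[i] >= 0);   /* keep the toy modulo non-negative */
        calls[toy_amp_for_row(a_values[i])]++;
    }
}
```

Under any placement rule, nine rows spread over four AMPs means at least one AMP holds more than one row, so a breakpoint in the UDF is reached more than once on that vproc, each time for a different row.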

At this point, you can display variables and step through the new UDF as before:

(gdb) p *a
$10 = 3
(gdb) p *b
$11 = 4
(gdb) n
13      }
(gdb) p *result
$12 = 7

Whenever the debugger stops, it only reports UDFs that have hit a breakpoint, but it suspends execution of all UDFs that it has joined. To see the others, you can put all threads in context (@ all) and issue the info context long command to display all threads in context. At this point in the debug session, that might show something like:

(gdb) @ all
[2//5748]
(gdb) i context long
all ->   2//5748  0//5753

UDF 2//5748 was not reported when the debugger reached the breakpoint. It was joined to the debugger but had not yet hit its breakpoint when all the UDFs were stopped. Printing UDF variables for 2//5748 will not work because execution stopped inside the line that defines the variables:

(gdb) p *a
[2//5748]
No symbol "a" in current context.
[0//5753]
$15 = 3

Continuing again at this point allows 2//5748 to run to its breakpoint:

(gdb) c
Continuing.
Breakpoint 1, plusudf (a=0x7ffff7fa1940, b=0x7ffff7fa1950, result=0x7ffff7fa1964, sqlstate=0x7ffff7fa196c "00000")
    at plusudf.c:12
12          *result = *a + *b;
(gdb) info context
2//5748

No message about switching context appears here because the current context set by the previous @ command already contains the thread that stopped. Showing the context after the stop confirms which thread stopped, in this case 2//5748.

Continuing again lets this thread run, and the debugger stops when one or more UDFs reach the breakpoint. You can continue until the UDF has executed for every row in the table. If you do not want to step through every UDF, you can quit the debugger, and the query then runs to completion without stopping. Alternatively, you can delete or disable the breakpoint and continue, which lets the query finish without leaving the debugger. At that point, rerunning the query will stop the query again when the first UDF runs.

Parallel debugging offers many more capabilities than this simple example illustrates. If the problem code you want to step through is executed by only a few UDFs, set a breakpoint in that code: only the UDFs that hit it stop, and all other UDFs run silently without stopping. You can also set conditions on breakpoints to stop only when data values of interest are passed to a UDF. You can attach commands to a breakpoint to display variables automatically whenever it is hit; if the last command continues execution, the result is a trace of the UDFs that a query executes without stopping at all.