Four Insertion Methods - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
ft:locale
en-US
ft:lastEdition
2024-12-11
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905

Case 1: INSERT ... SELECT

This method uses the following procedure.

  1. FastLoad the rows into an empty table.
  2. INSERT ... SELECT the rows from the freshly loaded table into the table that has a join index defined.

This procedure uses block-at-a-time optimization and is the fastest of the methods examined.

Case 2: Dropping and Recreating a Join Index after Inserting Rows into Base Table

This method uses the following procedure:
  1. Drop the join index.
  2. Insert the rows into the base table using any method.
  3. Recreate the join index.

The elapsed time for this method averages to about 1.3 times the elapsed time measured for the method provided in Case 1.

Case 3: Teradata Parallel Data Pump

This method uses the Teradata Parallel Data Pump utility to insert rows into the base table with a join index. The Teradata Parallel Data Pump utility has procedural advantages, including restartability and resource throttleability.

Because the Teradata Parallel Data Pump utility performs row-at-a-time operations, its performance for this application is not optimal. Teradata Parallel Data Pump utility, in an nonoptimized run, is more than 200 times slower than the method of Case 1 with respect to elapsed time measurements.

Optimization of the run using techniques like sorting input data in different orders can reduce this figure by an order of magnitude, reducing the cost differential to a 20:1 ratio for a 1-hit-per-data-block situation. As the number of hits per data block increases, the cost differential increases, because methods that use block-at-a-time methods become increasingly efficient.

Case 4: Single Row INSERT Request

This method uses the SQL INSERT statement to insert the rows into the base table. The data presented in Join Index Maintenance Cost as Function of Hits for Each Data Block confirms that this method is better than the method presented in Case 2.