Teradata TPump shares much of its command syntax with MultiLoad. This lets users carry over existing MultiLoad knowledge, eases the conversion of scripts between the two utilities, and supports the useful upsert feature. However, the two utilities differ substantially in how they operate, particularly in these areas:
- Economies of Scale
- Resource Consumption
Economies of Scale
MultiLoad has an economy of scale: it is not necessarily efficient on very large tables when there are only a few rows to insert or update. For MultiLoad to be efficient, it must touch more than one row per data block in the database.
For example, consider a table of two billion 65-byte rows stored in 16KB blocks. The table occupies about 8,125,000 data blocks, so to achieve efficient MultiLoad performance, more than 0.4% of the table (8,125,000 rows) must be affected. While 0.4% of a table is a small update, eight million records is probably more data than should be run through a BTEQ script.
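The arithmetic behind this example can be reproduced with a short sketch. It rests on assumptions not stated in the text: blocks are fully packed with no per-block or per-row overhead, and 16KB is read as 16,000 bytes, which is the reading that reproduces the 8,125,000 figure (with 16,384-byte blocks the result would be about 7.9 million). The function name is illustrative only.

```python
def rows_needed_for_one_per_block(total_rows, row_bytes, block_bytes):
    """Estimate how many rows must be touched to average one row per data block.

    Assumes fully packed blocks and no storage overhead (a simplification).
    Returns (data_blocks, fraction_of_table).
    """
    table_bytes = total_rows * row_bytes
    # Ceiling division: a partially filled last block still counts as a block.
    data_blocks = -(-table_bytes // block_bytes)
    fraction = data_blocks / total_rows
    return data_blocks, fraction


# The example from the text: two billion 65-byte rows, 16KB (16,000-byte) blocks.
blocks, fraction = rows_needed_for_one_per_block(2_000_000_000, 65, 16_000)
print(f"{blocks:,} data blocks, threshold fraction = {fraction:.5f}")
```

An update must affect more rows than the returned block count before MultiLoad is, on average, touching more than one row per block.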
The number of MultiLoad instances that can run concurrently is capped by a limit set in a database variable. Teradata TPump does not impose this limit. In addition, while MultiLoad uses table-level locks, Teradata TPump uses row-hash locks, which makes concurrent updates on the same table possible.
Finally, because of the phased nature of MultiLoad, there are potentially inconvenient windows of time when MultiLoad cannot be stopped without losing access to the target tables. In contrast, Teradata TPump can always be stopped and all of its locks dropped with no ill effect.
Resource Consumption
MultiLoad is designed for the highest possible throughput, and uses any database and host resources that help it achieve this. There is no way to reduce MultiLoad's resource consumption, even if a longer run time for the job is acceptable. Teradata TPump, however, has a built-in resource governing facility.
This facility allows the operator to specify how many updates occur minute by minute (the statement rate), and to change the statement rate while the job continues to run. For example, the operator can increase the statement rate during windows when Teradata TPump is running by itself, and then decrease it later if users log on for ad hoc query access.
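To make the idea of an adjustable statement rate concrete, here is a minimal conceptual sketch of a per-minute throttle whose rate can be changed while work is in progress. It is not Teradata TPump's implementation or interface: the class `StatementRateGovernor`, its methods, and the fixed one-minute window are all hypothetical, chosen only to illustrate the behavior described above.

```python
import time


class StatementRateGovernor:
    """Hypothetical throttle: at most `rate_per_minute` statements per window."""

    WINDOW_SECONDS = 60.0

    def __init__(self, rate_per_minute, clock=time.monotonic, sleep=time.sleep):
        if rate_per_minute < 1:
            raise ValueError("rate_per_minute must be at least 1")
        self.rate_per_minute = rate_per_minute
        self._clock = clock      # injectable so the logic can be tested
        self._sleep = sleep
        self._window_start = clock()
        self._sent = 0           # statements issued in the current window

    def set_rate(self, rate_per_minute):
        """Change the rate mid-run; applies from the next acquire() call."""
        if rate_per_minute < 1:
            raise ValueError("rate_per_minute must be at least 1")
        self.rate_per_minute = rate_per_minute

    def acquire(self):
        """Block until one more statement may be issued."""
        now = self._clock()
        elapsed = now - self._window_start
        if elapsed >= self.WINDOW_SECONDS:
            # A new one-minute window has begun: reset the budget.
            self._window_start = now
            self._sent = 0
        elif self._sent >= self.rate_per_minute:
            # Budget exhausted: wait out the remainder of this window.
            self._sleep(self.WINDOW_SECONDS - elapsed)
            self._window_start = self._clock()
            self._sent = 0
        self._sent += 1


# Usage sketch (not executed here, since it would sleep in real time):
#   governor = StatementRateGovernor(rate_per_minute=6000)
#   for statement in statements:
#       governor.acquire()
#       execute(statement)       # hypothetical function
#   governor.set_rate(600)       # throttle back when ad hoc users log on
```

Holding the rate in a mutable attribute is what lets an operator raise or lower it without restarting the job; a production design would also need thread safety and a smoother distribution of statements within each window.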