fastload() | Teradata Python Package - fastload()

Teradata® Package for Python User Guide

Product

Teradata Package for Python

Release Number

17.00

Published

November 2021

Language

English (United States)

Last Update

2022-01-14

dita:id

B700-4006

Product Category

Teradata Vantage

The fastload() function writes records from a Pandas DataFrame to Vantage using FastLoad, and can be used to quickly load large amounts of data into an empty table on Vantage.

Teradata recommends using the fastload() function when the number of rows in the Pandas DataFrame is greater than 100,000, for better performance. To insert fewer rows, use the copy_to_sql() function for optimized performance.
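The recommendation above can be sketched as a small dispatch helper. This is an illustrative sketch, not part of teradataml: choose_loader() and load_dataframe() are hypothetical names, and load_dataframe() assumes that fastload() and copy_to_sql() both accept the df and table_name keywords and that a Vantage connection already exists.

```python
# Row-count threshold taken from the recommendation in this guide.
FASTLOAD_ROW_THRESHOLD = 100_000


def choose_loader(n_rows, threshold=FASTLOAD_ROW_THRESHOLD):
    """Return the name of the suggested load function for n_rows rows."""
    return "fastload" if n_rows > threshold else "copy_to_sql"


def load_dataframe(pandas_df, table_name, **kwargs):
    """Hypothetical wrapper: call fastload() or copy_to_sql() by row count."""
    # Imported lazily so choose_loader() works without teradataml installed.
    from teradataml.dataframe.copy_to import copy_to_sql
    from teradataml.dataframe.fastload import fastload

    if choose_loader(len(pandas_df)) == "fastload":
        return fastload(df=pandas_df, table_name=table_name, **kwargs)
    return copy_to_sql(df=pandas_df, table_name=table_name, **kwargs)
```

Keeping the decision in choose_loader() makes the threshold easy to adjust or test separately from the database calls.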

The data is loaded in batches.

FastLoad API cannot load duplicate rows from the DataFrame if the target table is a MULTISET indexed table.
FastLoad API does not support all Advanced SQL Engine data types.
For example, a target table with BLOB or CLOB columns cannot be loaded.
If any rows are incorrect because of constraint violations, data type conversion errors, and so on, the FastLoad protocol ignores those rows and inserts all valid rows.
The FastLoad protocol categorizes rows in the DataFrame that failed to insert into errors and warnings, and the FastLoad API stores them in the respective error and warning tables.
If the 'save_errors' argument is set to True, the names of the error and warning tables are shown once the fastload operation is complete. These tables are persisted using copy_to_sql.
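Because of the duplicate-row limitation noted above, it can help to check a DataFrame for repeated rows before loading. The following is a minimal sketch using standard pandas calls; split_duplicates() is a hypothetical helper name and the sample data is made up for illustration.

```python
import pandas as pd


def split_duplicates(pandas_df):
    """Return (unique_rows, duplicate_rows) for a pandas DataFrame."""
    # duplicated() compares column values only (not the index);
    # keep="first" flags every repeat after the first occurrence.
    is_repeat = pandas_df.duplicated(keep="first")
    return pandas_df[~is_repeat], pandas_df[is_repeat]


# Illustrative data: the third row repeats the first.
sample = pd.DataFrame({"emp_id": [133, 144, 133],
                       "emp_name": ["A1", "A2", "A1"]})
unique_rows, duplicate_rows = split_duplicates(sample)
```

Here unique_rows could then be passed to fastload(), while duplicate_rows can be reviewed or logged separately.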

fastload() returns a dict containing the following attributes:

errors_dataframe: A Pandas DataFrame containing error messages raised by fastload. The DataFrame is empty if there are no errors.
warnings_dataframe: A Pandas DataFrame containing warning messages raised by fastload. The DataFrame is empty if there are no warnings.
errors_table: Name of the table containing errors. It is None if the save_errors argument is False.
warnings_table: Name of the table containing warnings. It is None if the save_errors argument is False.
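The returned dict described above can be inspected with a small helper such as the sketch below. summarize_fastload_result() is a hypothetical name, not a teradataml function; it relies only on the four keys listed above and on len() giving the row count of each DataFrame.

```python
def summarize_fastload_result(result):
    """Build a short text summary from the dict returned by fastload()."""
    lines = []
    for kind in ("errors", "warnings"):
        frame = result[f"{kind}_dataframe"]  # Pandas DataFrame, possibly empty
        table = result[f"{kind}_table"]      # None unless save_errors=True
        where = f"saved in {table}" if table is not None else "not saved"
        lines.append(f"{kind}: {len(frame)} row(s), {where}")
    return "\n".join(lines)


# Possible usage, assuming an active Vantage connection:
# result = fastload(df=pandas_df, table_name='my_table', save_errors=True)
# print(summarize_fastload_result(result))
```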

See the FastLoad section of https://pypi.org/project/teradatasql/ for more information about the FastLoad protocol in the teradatasql driver.

Minimum version requirements for fastload()

teradatasql version 16.20.00.48 or later is required for the fastload() API to work properly. If you have a lower version installed, teradatasql raises OperationalError and the fastload() call ends with the following error:

[Teradata Database] [Error 3706] Syntax error: expected something between the beginning of the request and the word 'teradata_require_fastloadINSERT'.

pandas version 0.24 or later is required for the fastload() API to work properly. If you have a lower version installed, the fastload() API fails with the following error:

AttributeError: 'Index' object has no attribute 'to_list'

Install pandas >= 0.24 to solve this issue.
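The minimum versions above can be checked before calling fastload(). The sketch below uses only the Python standard library (importlib.metadata, Python 3.8 or later); the helper names are hypothetical, and the version parsing is a simplification that compares only the leading numeric part of each dot-separated component.

```python
import re
from importlib import metadata

# Minimum versions stated in this guide.
MINIMUMS = {"teradatasql": "16.20.00.48", "pandas": "0.24"}


def parse_version(text):
    """Turn '16.20.00.48' into (16, 20, 0, 48); stop at a non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def check_fastload_requirements():
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    for package, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems.append(f"{package} is not installed")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{package} {installed} is older than {minimum}")
    return problems
```

Running check_fastload_requirements() up front gives a clearer message than the OperationalError or AttributeError shown above.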

Example Prerequisites

>>> from teradataml.dataframe.fastload import fastload
>>> from teradatasqlalchemy.types import VARCHAR, INTEGER, BIGINT, DECIMAL
>>> import pandas as pd

>>> df = {'emp_name': ['A1', 'A2', 'A3', 'A4'],
...       'emp_sage': [100, 200, 300, 400],
...       'emp_id': [133, 144, 155, 177],
...       'marks': [99.99, 97.32, 94.67, 91.00]}

>>> pandas_df = pd.DataFrame(df)

Example 1: Save a Pandas DataFrame with default signature

>>> fastload(df = pandas_df, table_name = 'my_table')

Example 2: Save a Pandas DataFrame with primary_index

>>> pandas_df = pandas_df.set_index(['emp_id'])

>>> fastload(df = pandas_df, table_name = 'my_table_1', primary_index='emp_id')

Example 3: Save a Pandas DataFrame with index and primary_index

>>> fastload(df = pandas_df, table_name = 'my_table_2', index=True, primary_index='index_label')

Example 4: Save a Pandas DataFrame with types, appending to the table if it already exists

>>> fastload(df = pandas_df, table_name = 'my_table_3', schema_name = 'alice',
...          index = True, index_label = 'my_index_label', primary_index = ['emp_id'],
...          if_exists = 'append',
...          types = {'emp_name': VARCHAR, 'emp_sage': INTEGER, 'emp_id': BIGINT, 'marks': DECIMAL})

Example 5: Save a Pandas DataFrame using levels in index of type MultiIndex, replacing the table if it already exists

>>> # Rebuild from the original dict: 'emp_id' was already moved into the index in Example 2
>>> pandas_df = pd.DataFrame(df).set_index(['emp_id', 'emp_name'])

>>> fastload(df = pandas_df, table_name = 'my_table_4', schema_name = 'alice', index = True, index_label = ['index1', 'index2'], primary_index = ['index1'], if_exists = 'replace')