fastexport() | Teradata Package for Python - fastexport() - Teradata Vantage

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

The fastexport() function exports a teradataml DataFrame to a pandas DataFrame or a CSV file using the FastExport data transfer protocol.

Teradata recommends using the fastexport() function when the teradataml DataFrame has at least 100,000 rows. To extract fewer rows, use the regular to_pandas() or to_csv() functions instead.
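The row-count guidance above can be expressed as a simple check. This is a minimal sketch only: choose_export_method and FASTEXPORT_MIN_ROWS are hypothetical names introduced for illustration, not part of teradataml, and the row count is assumed to be known to the caller.

```python
# Hypothetical helper: the names below are illustrative, not teradataml APIs.
FASTEXPORT_MIN_ROWS = 100_000  # threshold taken from the recommendation above


def choose_export_method(row_count: int) -> str:
    """Return the name of the export function the guidance points to."""
    if row_count < 0:
        raise ValueError("row_count must be non-negative")
    # At or above the threshold, FastExport is the recommended path.
    if row_count >= FASTEXPORT_MIN_ROWS:
        return "fastexport"
    # Below it, the regular to_pandas() or to_csv() functions are suggested.
    return "to_pandas"
```

The threshold is a recommendation, not a hard limit, so a check like this is only a starting point for deciding which function to call.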

FastExport opens multiple data transfer connections to the database. The number of data transfer sessions can be set using the keyword argument open_sessions. If open_sessions is not set, the number of data transfer sessions opened by teradataml defaults to the smaller of 8 and the number of available AMPs in Vantage.
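The default described above amounts to taking a minimum. The following is a minimal sketch of that rule only: effective_sessions and its parameters are hypothetical names, not part of teradataml, and the AMP count is supplied by the caller rather than queried from Vantage.

```python
from typing import Optional

DEFAULT_SESSION_CAP = 8  # upper bound on the default, as stated in the text


def effective_sessions(available_amps: int,
                       open_sessions: Optional[int] = None) -> int:
    """Illustrate how many data transfer sessions would be requested."""
    if open_sessions is not None:
        # An explicit open_sessions value is used as given. Any limits the
        # database applies to explicit values are not modeled here.
        return open_sessions
    # Default: the smaller of 8 and the number of available AMPs.
    return min(DEFAULT_SESSION_CAP, available_amps)
```

For example, under this rule a system with 4 AMPs would default to 4 sessions, and a system with 24 AMPs would default to 8.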

  • FastExport does not support all database data types.

    For example, tables with BLOB and CLOB type columns cannot be extracted.

  • FastExport cannot be used to extract data from a volatile or temporary table.
  • For best efficiency, do not use DataFrame.groupby() and DataFrame.sort() with FastExport.

See the FastExport section of https://pypi.org/project/teradatasql/ for more information about the FastExport protocol through the teradatasql driver.

Required Arguments:
  • df: Specifies the teradataml DataFrame to be exported.
Optional Arguments:
  • export_to: Specifies where to export the data.
    Permitted values are:
    • "pandas": Export data to a pandas DataFrame.
    • "csv": Export data to a given CSV file.

    The default value is "pandas".

  • index_column: Specifies the column(s) to be used as the index column for the converted object.

    Default value is None.

    This argument is applicable only when export_to is set to "pandas".
  • catch_errors_warnings: Specifies whether to catch errors and warnings (if any) raised by the FastExport protocol while converting the teradataml DataFrame.

    Default value is False.

    • When export_to is set to "pandas" and catch_errors_warnings is set to True, fastexport() returns a tuple containing:

      a. The pandas DataFrame.

      b. Errors (if any) in a list thrown by fastexport.

      c. Warnings (if any) in a list thrown by fastexport.

      When catch_errors_warnings is set to False, fastexport() prints the errors and warnings (if any) to the standard output.
    • When export_to is set to "csv" and catch_errors_warnings is set to True, fastexport() returns a tuple containing:

      a. Errors (if any) in a list thrown by fastexport.

      b. Warnings (if any) in a list thrown by fastexport.

  • csv_file: Specifies the name of the CSV file to which the data is to be exported.
    This argument is required when export_to is set to "csv".
  • kwargs: Specifies keyword arguments.
    • sep: Specifies a single-character string used to separate fields in a CSV file, with default value ','.
    • quotechar: Specifies a single-character string used to quote fields in a CSV file, with default value '"' (double quote).
    • coerce_float: Specifies whether to convert non-string, non-numeric objects to floating point.
    • parse_dates: Specifies columns to parse as dates.
    • open_sessions: Specifies the number of Teradata data transfer sessions to be opened for fastexport. This argument is only applicable in fastexport mode.
      If open_sessions is not set, the number of data transfer sessions opened by teradataml defaults to the smaller of 8 and the number of available AMPs in Vantage.
    • sep and quotechar cannot be line feed ('\n') or carriage return ('\r').
    • sep and quotechar cannot be the same.
    • Length of sep and quotechar must be 1.

    See the FastExport section of https://pypi.org/project/teradatasql/ for more information about the number of data transfer sessions opened during fastexport.

    See https://pandas.pydata.org/docs/reference/api/pandas.read_sql.html for more information about the coerce_float and parse_dates arguments.
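The three restrictions on sep and quotechar listed above can be checked before calling fastexport(). This is a minimal sketch under stated assumptions: validate_csv_options is a hypothetical helper, not a teradataml function, and the error messages are illustrative rather than the ones teradataml raises.

```python
def validate_csv_options(sep: str = ",", quotechar: str = '"'):
    """Check sep and quotechar against the documented restrictions."""
    for name, value in (("sep", sep), ("quotechar", quotechar)):
        # Each option must be a string of length 1.
        if not isinstance(value, str) or len(value) != 1:
            raise ValueError(f"{name} must be a single-character string")
        # Neither option may be a line feed or a carriage return.
        if value in ("\n", "\r"):
            raise ValueError(f"{name} cannot be a line feed or carriage return")
    # The separator and the quote character must differ.
    if sep == quotechar:
        raise ValueError("sep and quotechar cannot be the same")
    return sep, quotechar
```

Running such a check up front surfaces an invalid combination before any data transfer sessions are opened.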

Returns

The fastexport() function returns:
  • When export_to is set to "pandas" and catch_errors_warnings is set to True, a tuple containing:
    • pandas DataFrame;
    • Errors, if any, in a list of strings thrown by fastexport;
    • Warnings, if any, in a list of strings thrown by fastexport.
  • When export_to is set to "pandas" and catch_errors_warnings is set to False, a pandas DataFrame. The fastexport errors and warnings, if there are any, are printed to the standard output.
  • When export_to is set to "csv" and catch_errors_warnings is set to True, a tuple containing the items below. The data is written to the CSV file named by the csv_file argument.
    • Errors, if any, in a list of strings thrown by fastexport;
    • Warnings, if any, in a list of strings thrown by fastexport.
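Because the return shape depends on export_to and catch_errors_warnings, calling code may find it convenient to normalize it. The following is a minimal sketch: normalize_result is a hypothetical helper, not part of teradataml, and it assumes nothing is returned to inspect when export_to is "csv" and catch_errors_warnings is False, a case the list above does not describe.

```python
def normalize_result(result, export_to="pandas", catch_errors_warnings=False):
    """Map the documented return shapes onto one dictionary layout."""
    if export_to == "pandas":
        if catch_errors_warnings:
            # Tuple of (pandas DataFrame, errors list, warnings list).
            data, errors, warnings = result
        else:
            # Only the pandas DataFrame; messages were printed instead.
            data, errors, warnings = result, [], []
    elif export_to == "csv":
        data = None  # the data lives in the CSV file, not in the result
        if catch_errors_warnings:
            # Tuple of (errors list, warnings list).
            errors, warnings = result
        else:
            # Assumption: no errors or warnings are available to the caller.
            errors, warnings = [], []
    else:
        raise ValueError('export_to must be "pandas" or "csv"')
    return {"data": data, "errors": list(errors), "warnings": list(warnings)}
```

A caller could then write, for example, normalize_result(fastexport(df, catch_errors_warnings=True), catch_errors_warnings=True) and read the "errors" key regardless of the export target.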

Example Setup

>>> from teradataml import DataFrame, fastexport, load_example_data
>>> load_example_data("dataframe", "admissions_train")
>>> df = DataFrame("admissions_train")

Example 1: Export teradataml DataFrame to pandas DataFrame

>>> fastexport(df)

Example 2: Export teradataml DataFrame to pandas DataFrame with settings

This example exports the teradataml DataFrame 'df' to pandas DataFrame, setting index column with argument index_column, converting non-string, non-numeric objects to floating point using argument coerce_float, and catching errors and warnings thrown by fastexport.

>>> pandas_df, err, warn = fastexport(df, index_column="gpa", catch_errors_warnings=True, coerce_float=True)
# Print pandas DataFrame.
>>> pandas_df
# Print errors list.
>>> err
# Print warnings list.
>>> warn

Example 3: Export teradataml DataFrame to pandas DataFrame by opening specific number of sessions

This example exports the teradataml DataFrame 'df' to pandas DataFrame using two sessions.

>>> fastexport(df, open_sessions=2)

Example 4: Export teradataml DataFrame to a given CSV file

>>> fastexport(df, export_to="csv", csv_file="Test.csv")

Example 5: Export teradataml DataFrame to a given CSV file by opening specific number of sessions

>>> fastexport(df, export_to="csv", csv_file="Test_1.csv", open_sessions=2)

Example 6: Export teradataml DataFrame to a given CSV file and catch errors and warnings

>>> err, warn = fastexport(df, export_to="csv", catch_errors_warnings=True, csv_file="Test_3.csv")
# Print errors list.
>>> err
# Print warnings list.
>>> warn

Example 7: Export teradataml DataFrame to CSV file with field separator and field quote character

This example exports the teradataml DataFrame 'df' to a CSV file with (|) as field separator and single quote (') as field quote character.

>>> fastexport(df, export_to="csv", csv_file="Test_4.csv", sep="|", quotechar="'")
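A file written with a custom separator and quote character has to be read back with the same settings. The sketch below uses Python's standard csv module on a small in-memory sample. The sample text is made up for illustration and is not actual fastexport() output, and read_exported_rows is a hypothetical helper name.

```python
import csv
import io


def read_exported_rows(text, sep="|", quotechar="'"):
    """Parse CSV text using the same sep and quotechar used for the export."""
    reader = csv.reader(io.StringIO(text), delimiter=sep, quotechar=quotechar)
    return list(reader)


# Made-up sample: the second field of the data row contains the separator,
# so it is wrapped in the quote character.
sample = "id|note\n1|'a|b'\n"
rows = read_exported_rows(sample)
print(rows)
```

To read a real file such as Test_4.csv, the same csv.reader call can be given a file object opened with open("Test_4.csv", newline="") in place of the io.StringIO wrapper.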