to_pandas() Method

Teradata® Python Package User Guide

brand: Teradata Vantage
prodname: Teradata Python Package
vrm_release: 16.20
category: User Guide
featnum: B700-4006-098K

The to_pandas() function creates a pandas DataFrame from a teradataml DataFrame.

The function takes the following optional parameters:

  • index_column: Column name, or list of column names, to use as the pandas index.
    When index_column is provided, the specified column or columns are used as the pandas index. Otherwise, the index of the teradataml DataFrame (if one exists) or the primary index of the Teradata Database table is used. If none of these indexes exists, the default integer index is used.
  • num_rows: The number of rows to retrieve from the teradataml DataFrame when creating the pandas DataFrame. The default is 99999.
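
The index fallback described above can be read as a simple precedence rule. The following is a minimal sketch of that rule in plain Python, not the teradataml implementation. The helper name resolve_index and its arguments are hypothetical, and the sketch assumes that the teradataml DataFrame index is checked before the table's primary index, following the order in which the text lists them.

```python
from typing import List, Optional, Sequence, Union

ColumnSpec = Union[str, Sequence[str], None]


def resolve_index(index_column: ColumnSpec = None,
                  dataframe_index: ColumnSpec = None,
                  primary_index: ColumnSpec = None) -> Optional[List[str]]:
    """Hypothetical helper: pick the columns that become the pandas index.

    Returns a list of column names, or None when the default integer
    index should be used.
    """
    # Assumed precedence: explicit index_column first, then the
    # teradataml DataFrame index, then the table's primary index.
    for candidate in (index_column, dataframe_index, primary_index):
        if candidate:
            # Normalize a single column name to a one-element list.
            return [candidate] if isinstance(candidate, str) else list(candidate)
    # No candidate exists: fall back to the default integer index.
    return None


explicit = resolve_index(index_column='Feb', primary_index='accounts')
fallback = resolve_index(primary_index='accounts')
default = resolve_index()
```

Here explicit resolves to the "Feb" column, fallback resolves to the "accounts" column, and default is None, which stands for the default integer index.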

Examples Prerequisite

Assume that a teradataml DataFrame "df" has been created from the Teradata table "sales" using the command:
>>> df = DataFrame("sales")

Example: Create a pandas DataFrame without specifying an index

>>> pandas_df = df.to_pandas()

Enter pandas_df to display the pandas DataFrame:

>>> pandas_df
            Feb   Jan   Mar   Apr    datetime
accounts
Alpha Co    210   200   215   250  2017-04-01
Blue Inc     90    50    95   101  2017-04-01
Yellow Inc   90  None  None  None  2017-04-01
Jones LLC   200   150   140   180  2017-04-01
Red Inc     200   150   140  None  2017-04-01
Orange Inc  210  None  None   250  2017-04-01

Example: Create a pandas DataFrame using index_column to set the index to "Feb"

>>> pandas_df = df.to_pandas(index_column = 'Feb')

Enter pandas_df to display the pandas DataFrame:

>>> pandas_df
       accounts   Jan   Mar   Apr    datetime
Feb
210    Alpha Co   200   215   250  2017-04-01
90     Blue Inc    50    95   101  2017-04-01
90   Yellow Inc  None  None  None  2017-04-01
200   Jones LLC   150   140   180  2017-04-01
200     Red Inc   150   140  None  2017-04-01
210  Orange Inc  None  None   250  2017-04-01
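
In the output above, the "Feb" values 210, 200, and 90 each appear twice, so the resulting index is not unique. The sketch below shows how this affects label lookups in pandas. It uses a small hand-built pandas DataFrame as a stand-in for the result of to_pandas(), with values copied from four rows of the output, so it runs without a Teradata connection.

```python
import pandas as pd

# Hand-built stand-in for df.to_pandas(index_column='Feb');
# only a subset of rows and columns from the output is reproduced.
pandas_df = pd.DataFrame(
    {
        "accounts": ["Alpha Co", "Blue Inc", "Jones LLC", "Orange Inc"],
        "Feb": [210, 90, 200, 210],
        "Apr": [250, 101, 180, 250],
    }
).set_index("Feb")

# A pandas index may contain duplicate labels.
has_unique_index = pandas_df.index.is_unique

# Looking up a duplicated label returns every matching row.
rows_210 = pandas_df.loc[210]
matching_accounts = rows_210["accounts"].tolist()
```

If later code needs exactly one row per label, a column with unique values, such as "accounts" in this data, is the safer choice for index_column.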

Example: Create a pandas DataFrame using a list of column names for a multi-column index

>>> pandas_df = df.to_pandas(index_column = ['accounts', 'Feb'])

Enter pandas_df to display the pandas DataFrame:

>>> pandas_df
                 Jan   Mar   Apr    datetime
accounts   Feb
Yellow Inc 90   None  None  None  2017-04-01
Alpha Co   210   200   215   250  2017-04-01
Jones LLC  200   150   140   180  2017-04-01
Orange Inc 210  None  None   250  2017-04-01
Blue Inc   90     50    95   101  2017-04-01
Red Inc    200   150   140  None  2017-04-01
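
Once a multi-column index is in place, rows are addressed by tuples of index values. The following sketch shows typical lookups in pandas. It assumes a hand-built pandas DataFrame standing in for the result of to_pandas(), with values copied from three rows of the output above, and it sorts the index first, which is optional.

```python
import pandas as pd

# Hand-built stand-in for df.to_pandas(index_column=['accounts', 'Feb']);
# values are copied from three rows of the output.
pandas_df = pd.DataFrame(
    {
        "accounts": ["Alpha Co", "Jones LLC", "Blue Inc"],
        "Feb": [210, 200, 90],
        "Jan": [200, 150, 50],
        "Mar": [215, 140, 95],
        "Apr": [250, 180, 101],
    }
).set_index(["accounts", "Feb"]).sort_index()

# Select one row with the full (accounts, Feb) key.
alpha_row = pandas_df.loc[("Alpha Co", 210)]

# Select all rows for one account using only the first index level;
# the result is indexed by the remaining level, "Feb".
jones_rows = pandas_df.loc["Jones LLC"]

# Turn both index levels back into ordinary columns.
flat_df = pandas_df.reset_index()
```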

Example: Create a pandas DataFrame using num_rows to limit the number of rows to 3, with "Feb" as the index

>>> pandas_df = df.to_pandas(index_column = 'Feb', num_rows = 3)

Enter pandas_df to display the pandas DataFrame:

>>> pandas_df
       accounts   Jan   Mar  Apr    datetime
Feb
210    Alpha Co   200   215  250  2017-04-01
210  Orange Inc  None  None  250  2017-04-01
90     Blue Inc    50    95  101  2017-04-01
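
Because num_rows caps the number of rows retrieved, a result whose length equals the cap may be missing rows from the source table. The following is a minimal sketch of a check for that case. The helper name may_be_truncated is hypothetical and is not part of teradataml, and the pandas DataFrame is hand-built from the three rows shown above.

```python
import pandas as pd

DEFAULT_NUM_ROWS = 99999  # default value of num_rows stated in this section


def may_be_truncated(pandas_df: pd.DataFrame,
                     num_rows: int = DEFAULT_NUM_ROWS) -> bool:
    """Hypothetical helper: True when the row count reached the num_rows cap."""
    return len(pandas_df) >= num_rows


# Hand-built stand-in for df.to_pandas(index_column='Feb', num_rows=3).
pandas_df = pd.DataFrame(
    {
        "accounts": ["Alpha Co", "Orange Inc", "Blue Inc"],
        "Feb": [210, 210, 90],
        "Apr": [250, 250, 101],
    }
).set_index("Feb")

hit_cap = may_be_truncated(pandas_df, num_rows=3)
under_default_cap = may_be_truncated(pandas_df)
```

A True result only signals that the cap was reached. It does not prove that rows were dropped, since the table may contain exactly num_rows rows.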