DataFrame Creation on Primary Time Index Tables is Partially Supported

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Release Number: 17.00
Release Date: November 2021
Content Type: User Guide
Publication ID: B700-4006-070K
Language: English (United States)

DataFrame creation on Primary Time Index (PTI) tables is partially supported. A DataFrame can be created on a PTI table only if the table was created without 'timebucket_duration'. If the table was created with 'timebucket_duration', DataFrame creation fails.

Example

Create a PTI table without 'timebucket_duration' using copy_to_sql, and then create a DataFrame on it.

>>> load_example_data("sessionize", "sessionize_table")
>>> df3 = DataFrame('sessionize_table')
>>> copy_to_sql(df3, "test_copyto_pti",
                timecode_column='clicktime',
                columns_list='event')
>>> DataFrame("test_copyto_pti")
                      TD_TIMECODE partition_id adid productid
event                                                       
click  2009-03-19 16:43:26.000000         1199    1      1001
click  2009-07-04 09:18:17.000000         1231    1      1001
click  2009-07-04 09:18:17.000000         1231    1      1001
click  2009-07-16 11:18:16.000000         1039    4      1001
click  2009-07-16 11:18:16.000000         1039    4      1001
click  2009-07-24 04:18:10.000000         1167    2      1001
view   2009-02-09 15:17:59.000000         1263    4      1001
view   2009-03-09 21:17:59.000000         1199    2      1001
view   2009-03-09 21:17:59.000000         1199    2      1001
view   2009-03-13 17:17:59.000000         1071    4      1001

Example

Create a PTI table with 'timebucket_duration' using copy_to_sql. DataFrame creation on the resulting PTI table fails.

>>> load_example_data("sessionize", "sessionize_table")
>>> df3 = DataFrame('sessionize_table')
>>> copy_to_sql(df3, "test_copyto_pti1",
                timecode_column='clicktime',
                columns_list='event',
                timebucket_duration="HOURS(2)")
>>> DataFrame("test_copyto_pti1")
/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/reflection.py:888: SAWarning: index key 'TD_TIMEBUCKET' was not located in columns for table 'test_copyto_pti1'
  "columns for table '%s'" % (flavor, c, table_name)
Traceback (most recent call last):
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/dataframe.py", line 163, in __init__
    self._metaexpr = self._get_metaexpr()
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/dataframe.py", line 395, in _get_metaexpr
    return _MetaExpression(t, column_order = self.columns)
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/sql.py", line 164, in __init__
    self.__t = _SQLTableExpression(table, **kw)
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/sql.py", line 460, in __init__
    raise ValueError('Reflected column names do not match those in DataFrame.columns')
ValueError: Reflected column names do not match those in DataFrame.columns
 
The above exception was the direct cause of the following exception:
 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/dataframe.py", line 171, in __init__
    raise TeradataMlException(Messages.get_message(MessageCodes.TDMLDF_CREATE_FAIL), MessageCodes.TDMLDF_CREATE_FAIL) from err
teradataml.common.exceptions.TeradataMlException: [Teradata][teradataml](TDML_2010) Failed to create Teradata DataFrame.
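
Because support is partial, code that may encounter PTI tables created with 'timebucket_duration' can guard DataFrame creation and report the chained errors instead of stopping. The following is a minimal sketch and not part of teradataml: try_create_dataframe and fake_dataframe are hypothetical names, and fake_dataframe is only a stand-in that imitates the chained exception in the traceback above so the sketch runs without a database connection.

```python
def try_create_dataframe(table_name, factory, error_type=Exception):
    """Call factory(table_name); return (dataframe, None) or (None, messages).

    Hypothetical helper, not a teradataml API.
    """
    try:
        return factory(table_name), None
    except error_type as err:
        # Walk the "raise ... from err" chain so the root cause is reported too.
        messages = []
        current = err
        while current is not None:
            messages.append(f"{type(current).__name__}: {current}")
            current = current.__cause__
        return None, messages


def fake_dataframe(table_name):
    """Stand-in for a DataFrame constructor (assumption, for illustration only)."""
    if table_name == "test_copyto_pti1":
        try:
            raise ValueError(
                "Reflected column names do not match those in DataFrame.columns")
        except ValueError as err:
            raise RuntimeError("Failed to create Teradata DataFrame.") from err
    return f"DataFrame({table_name})"


df, errors = try_create_dataframe("test_copyto_pti1", fake_dataframe, RuntimeError)
if df is None:
    for message in errors:
        print(message)
```

With teradataml, you would presumably pass DataFrame as the factory and TeradataMlException (the class named in the traceback, in teradataml.common.exceptions) as the error type. That usage is an assumption and has not been verified against a live system.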