DataFrame Creation on PTI Tables Partially Supported | Teradata Python Package - DataFrame Creation on Primary Time Index Tables is Partially Supported

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Release Number: 17.00
Published: November 2021
Language: English (United States)
Last Update: 2022-01-14

dita:id

B700-4006

Product Category: Teradata Vantage

DataFrame creation on Primary Time Index (PTI) tables is partially supported. A DataFrame can be created only when the underlying PTI table was created without the 'timebucket_duration' argument. DataFrame creation fails on PTI tables that were created with it.

Example

Create a PTI table without 'timebucket_duration' using copy_to_sql, and then create a DataFrame on it.

>>> from teradataml import DataFrame, copy_to_sql, load_example_data
>>> load_example_data("sessionize", "sessionize_table")
>>> df3 = DataFrame('sessionize_table')
>>> copy_to_sql(df3, "test_copyto_pti",
                timecode_column='clicktime',
                columns_list='event')

>>> DataFrame("test_copyto_pti")
                      TD_TIMECODE partition_id adid productid
event                                                       
click  2009-03-19 16:43:26.000000         1199    1      1001
click  2009-07-04 09:18:17.000000         1231    1      1001
click  2009-07-04 09:18:17.000000         1231    1      1001
click  2009-07-16 11:18:16.000000         1039    4      1001
click  2009-07-16 11:18:16.000000         1039    4      1001
click  2009-07-24 04:18:10.000000         1167    2      1001
view   2009-02-09 15:17:59.000000         1263    4      1001
view   2009-03-09 21:17:59.000000         1199    2      1001
view   2009-03-09 21:17:59.000000         1199    2      1001
view   2009-03-13 17:17:59.000000         1071    4      1001

Example

Create a PTI table with 'timebucket_duration'. DataFrame creation on this PTI table fails.

>>> load_example_data("sessionize", "sessionize_table")
>>> df3 = DataFrame('sessionize_table')
>>> copy_to_sql(df3, "test_copyto_pti1",
                timecode_column='clicktime',
                columns_list='event',
                timebucket_duration="HOURS(2)")

>>> DataFrame("test_copyto_pti1")
/anaconda3/lib/python3.6/site-packages/sqlalchemy/engine/reflection.py:888: SAWarning: index key 'TD_TIMEBUCKET' was not located in columns for table 'test_copyto_pti1'
  "columns for table '%s'" % (flavor, c, table_name)
Traceback (most recent call last):
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/dataframe.py", line 163, in __init__
    self._metaexpr = self._get_metaexpr()
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/dataframe.py", line 395, in _get_metaexpr
    return _MetaExpression(t, column_order = self.columns)
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/sql.py", line 164, in __init__
    self.__t = _SQLTableExpression(table, **kw)
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/sql.py", line 460, in __init__
    raise ValueError('Reflected column names do not match those in DataFrame.columns')
ValueError: Reflected column names do not match those in DataFrame.columns
 
The above exception was the direct cause of the following exception:
 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/pp186043/Github_Repos/teradataml_repo/pyTeradata/teradataml/dataframe/dataframe.py", line 171, in __init__
    raise TeradataMlException(Messages.get_message(MessageCodes.TDMLDF_CREATE_FAIL), MessageCodes.TDMLDF_CREATE_FAIL) from err
teradataml.common.exceptions.TeradataMlException: [Teradata][teradataml](TDML_2010) Failed to create Teradata DataFrame.
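
Because the failure surfaces as an exception, calling code may want to guard DataFrame creation on tables that might be PTI tables with a timebucket duration. The following is a minimal sketch of one way to do that, not part of the package: the helper name try_create_dataframe and its parameters are hypothetical. It takes the constructor and the exception class as arguments, so it does not import teradataml itself. The commented usage assumes an existing Vantage connection and uses the exception class shown in the traceback above.

```python
def try_create_dataframe(factory, table_name, expected_error=Exception):
    """Return factory(table_name), or None if creation raises expected_error.

    Hypothetical helper: 'factory' is any callable that builds a DataFrame
    from a table name, for example the teradataml DataFrame class.
    """
    try:
        return factory(table_name)
    except expected_error as err:
        # For a PTI table created with timebucket_duration, the error text
        # shown above is "Failed to create Teradata DataFrame." (TDML_2010).
        print("Could not create a DataFrame on {!r}: {}".format(table_name, err))
        return None


# Assumed usage against a live Vantage connection (not executed here):
#   from teradataml import DataFrame
#   from teradataml.common.exceptions import TeradataMlException
#   df = try_create_dataframe(DataFrame, "test_copyto_pti1", TeradataMlException)
#   if df is None:
#       pass  # for example, skip this table or report it to the user
```

Returning None keeps the caller's control flow simple. Code that needs the original error details could re-raise instead of swallowing the exception.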