Type Casting: cast() | Teradata Package for Python - Type Casting: cast() - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment
VantageCloud
VantageCore
Edition
Enterprise
IntelliFlex
VMware
Product
Teradata Package for Python
Release Number
20.00
Published
December 2024
ft:locale
en-US
ft:lastEdition
2025-01-23
dita:mapPath
nvi1706202040305.ditamap
dita:ditavalPath
plt1683835213376.ditaval
dita:id
rkb1531260709148
lifecycle
latest
Product Category
Teradata Vantage

Use the cast() method to cast a DataFrame column (Column Expression) to a specified teradatasqlalchemy type. It can be used with the DataFrame methods assign and filter.

Required Argument:
  • type_: Specifies a teradatasqlalchemy type or an object of a teradatasqlalchemy type that the column needs to be cast to.

    Default value is None.

    Types: teradatasqlalchemy type or object of teradatasqlalchemy type

Optional Arguments:
  • format: Specifies a variable length string containing formatting characters that define the display format for the data type.

    Formats can be specified for columns that have character, numeric, byte, DateTime, Period or UDT data types.

    Teradata supports different formats. Refer to "Data Type Formats and Format Phrases" in Teradata Vantage™ - Data Types and Literals, B035-1143.

    Default value: None

    Types: str

  • timezone: Specifies the timezone string.

    Refer to Teradata Vantage™ - SQL Date and Time Functions and Expressions, B035-1211 for supported timezones.

    Type: ColumnExpression or str.

Example Setup

>>> load_example_data("dataframe","admissions_train")
>>> df = DataFrame('admissions_train')
>>> df
   masters   gpa     stats programming  admitted
id
13      no  4.00  Advanced      Novice         1
26     yes  3.57  Advanced    Advanced         1
5       no  3.44    Novice      Novice         0
19     yes  1.98  Advanced    Advanced         0
15     yes  4.00  Advanced    Advanced         1
40     yes  3.95    Novice    Beginner         0
7      yes  2.33    Novice      Novice         1
22     yes  3.46    Novice    Beginner         0
36      no  3.00  Advanced      Novice         0
38     yes  2.65  Advanced    Beginner         1
>>> df.dtypes
id               int
masters          str
gpa            float
stats            str
programming      str
admitted         int

Example 1: Use assign to create a new DataFrame with a new column, casted to VARCHAR(5) type

Use assign() to create a new DataFrame with a new column 'char_id', which is the 'id' column (of type INTEGER) cast to VARCHAR(5) (object of teradatasqlalchemy type corresponding to SQL Type VARCHAR(5)).

>>> from teradatasqlalchemy import VARCHAR
>>> new_df = df.assign(char_id = df.id.cast(type_=VARCHAR(5)))
>>> new_df
   masters   gpa     stats programming  admitted char_id
id
5       no  3.44    Novice      Novice         0       5
34     yes  3.85  Advanced    Beginner         0      34
13      no  4.00  Advanced      Novice         1      13
40     yes  3.95    Novice    Beginner         0      40
22     yes  3.46    Novice    Beginner         0      22
19     yes  1.98  Advanced    Advanced         0      19
36      no  3.00  Advanced      Novice         0      36
15     yes  4.00  Advanced    Advanced         1      15
7      yes  2.33    Novice      Novice         1       7
17      no  3.83  Advanced    Advanced         1      17
>>> new_df.dtypes
id               int
masters          str
gpa            float
stats            str
programming      str
admitted         int
char_id          str

Example 2: Use assign to create a new DataFrame with a new column, casted to VARCHAR type

Use assign() to create a new DataFrame with a new column 'char_id', which is the 'id' column (of type INTEGER) cast to VARCHAR (teradatasqlalchemy type corresponding to SQL Type VARCHAR).

>>> new_df_2 = df.assign(char_id = df.id.cast(type_=VARCHAR))
>>> new_df_2
   masters   gpa     stats programming  admitted char_id
id
5       no  3.44    Novice      Novice         0       5
34     yes  3.85  Advanced    Beginner         0      34
13      no  4.00  Advanced      Novice         1      13
40     yes  3.95    Novice    Beginner         0      40
22     yes  3.46    Novice    Beginner         0      22
19     yes  1.98  Advanced    Advanced         0      19
36      no  3.00  Advanced      Novice         0      36
15     yes  4.00  Advanced    Advanced         1      15
7      yes  2.33    Novice      Novice         1       7
17      no  3.83  Advanced    Advanced         1      17
>>> new_df_2.dtypes
id               int
masters          str
gpa            float
stats            str
programming      str
admitted         int
char_id          str

Example 3: Filter data with match on a column cast to another type

>>> df[df.id.cast(VARCHAR(5)) == '1']
   masters   gpa     stats programming  admitted
id
1      yes  3.95  Beginner    Beginner         0
>>> df[df.id.cast(VARCHAR) == '1']
   masters   gpa     stats programming  admitted
id
1      yes  3.95  Beginner    Beginner         0

Example 4: Create new DataFrame casting 'timestamp_col' column (of type VARCHAR) to TIMESTAMP, using format

>>> dataframe_dict = {"id": [100, 200,300],
>>> "timestamp_col": ['1000-01-10 23:00:12-02:00', '2015-01-08 13:00:00+12:00', '2014-12-10 10:00:35-08:00'],
>>> "timezone_col": ["GMT", "America Pacific", "GMT+10"]}
>>> pandas_df = pd.DataFrame(dataframe_dict)
>>> copy_to_sql(pandas_df, table_name = 'new_table', if_exists = 'replace')
>>> df1 = DataFrame("new_table")
>>> df1
id   timestamp_col                    timezone_col
300  2014-12-10 10:00:35-08:00        GMT+10
200  2015-01-08 13:00:00+12:00        America Pacific
100  1000-01-10 23:00:12-02:00        GMT
>>> df1.dtypes
id               int
timestamp_col    str
timezone_col     str
>>> new_df1 = df1.assign(new_col = df1.timestamp_col.cast(TIMESTAMP, format='Y4-MM-DDBHH:MI:SSBZ'))
id   timestamp_col                    timezone_col         new_col
300  2014-12-10 10:00:35-08:00        GMT+10               2014-12-10 18:00:35
200  2015-01-08 13:00:00+12:00        America Pacific      2015-01-08 01:00:00
100  1000-01-10 23:00:12-02:00        GMT                  1000-01-11 01:00:12
>>> new_df1.tdtypes
id              int
timestamp_col   str
timezone_col    str
new_col         datetime.datetime

Example 5: Create new DataFrame casting 'id' column (of type INTEGER) to VARCHAR, using format

>>> new_df2 = df1.assign(new_col = df1.id.cast(VARCHAR, format='zzz.zz'))
id   timestamp_col              timezone_col      new_col
300  2014-12-10 10:00:35-08:00  GMT+10            300.00
200  2015-01-08 13:00:00+12:00  America Pacific   200.00
100  1000-01-10 23:00:12-02:00  GMT               100.00
>>> new_df2.dtypes
id               int
timestamp_col    str
timezone_col     str
new_col          str

Example 6: Create new DataFrame casting 'timestamp_with_timezone' column (of type TIMESTAMP) to TIMESTAMP WITH TIMEZONE, with offset 'GMT+10'

>>> new_df3 = new_df1.assign(timestamp_with_timezone = new_df1.new_col.cast(TIMESTAMP(timezone=True), timezone='GMT+10'))
id   timestamp_col              timezone_col     new_col              timestamp_with_timezone
300  2014-12-10 10:00:35-08:00  GMT+10           2014-12-10 18:00:35  2014-12-11 04:00:35.000000+10:00
200  2015-01-08 13:00:00+12:00  America Pacific  2015-01-08 01:00:00  2015-01-08 11:00:00.000000+10:00
100  1000-01-10 23:00:12-02:00  GMT              1000-01-11 01:00:12  1000-01-11 11:00:12.000000+10:00
>>> new_df3.dtypes
id                         int
timestamp_col              str
timezone_col               str
new_col                    datetime.datetime
timestamp_with_timezone    datetime.datetime