Access teradataml DataFrame Column

Teradata® Package for Python User Guide

Product: Teradata Package for Python
Release Number: 17.00
Release Date: November 2021
Content Type: User Guide
Publication ID: B700-4006-070K
Language: English (United States)

To use a teradataml DataFrame column, also known as a ColumnExpression, in operations such as filter, assign, or join, you must first access the column.

There are two ways to access a column:
  • Access the column as a DataFrame attribute:

    <dataframe_object>.column_name

  • Access the column like a dictionary key:

    <dataframe_object>["column_name"]

If a column name contains whitespace or special characters, Teradata recommends accessing the ColumnExpression like a dictionary.
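
To see why dictionary-style access is the safer choice for such names, consider the plain-Python sketch below. ToyFrame is a hypothetical stand-in written for illustration, not part of teradataml and not a description of how teradataml is implemented. It only mimics the two access styles to show that a name containing whitespace cannot be written as an attribute.

```python
# Hypothetical stand-in for illustration only; this is NOT teradataml's DataFrame.
class ToyFrame:
    def __init__(self, column_names):
        # Map each column name to a placeholder string standing in for a column object.
        self._columns = {name: f"ColumnExpression({name!r})" for name in column_names}

    def __getitem__(self, name):
        # Dictionary-style access: any string can be used as a key.
        return self._columns[name]

    def __getattr__(self, name):
        # Attribute-style access: called only when normal attribute lookup fails.
        try:
            return self.__dict__["_columns"][name]
        except KeyError:
            raise AttributeError(name) from None


toy = ToyFrame(["gpa", "test score"])

print(toy.gpa)            # attribute access works for a plain name
print(toy["gpa"])         # dictionary access returns the same object
print(toy["test score"])  # dictionary access also handles whitespace

# Writing `toy.test score` is a SyntaxError, because attribute syntax
# requires a valid Python identifier.
print("gpa".isidentifier())         # True
print("test score".isidentifier())  # False
```

The same reasoning applies to any name that is not a valid Python identifier, which is why the dictionary form is recommended for such columns.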

Example Prerequisites

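>>> # Assumes a connection to Vantage has already been established.
>>> from teradataml import DataFrame, load_example_data
>>> from teradataml.dataframe.sql_functions import case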
>>> load_example_data("GLM", ["admissions_train"])
>>> df = DataFrame("admissions_train")
>>> print(df)
   masters   gpa     stats programming  admitted
id
5       no  3.44    Novice      Novice         0
3       no  3.70    Novice    Beginner         1
1      yes  3.95  Beginner    Beginner         0
20     yes  3.90  Advanced    Advanced         1
8       no  3.60  Beginner    Advanced         1
25      no  3.96  Advanced    Advanced         1
18     yes  3.81  Advanced    Advanced         1
24      no  1.87  Advanced      Novice         1
26     yes  3.57  Advanced    Advanced         1
38     yes  2.65  Advanced    Beginner         1

Example: Access a ColumnExpression as an attribute and use it as a filter predicate

>>> gpa = df.gpa
>>> good_df = df[case([(gpa > 3.0, 'good'),
...                    (gpa > 2.0, 'average')],
...                   else_='bad') == 'good']
>>> print(good_df)
   masters   gpa     stats programming  admitted
id
13      no  4.00  Advanced      Novice         1
11      no  3.13  Advanced    Advanced         1
9       no  3.82  Advanced    Advanced         1
26     yes  3.57  Advanced    Advanced         1
3       no  3.70    Novice    Beginner         1
1      yes  3.95  Beginner    Beginner         0
20     yes  3.90  Advanced    Advanced         1
18     yes  3.81  Advanced    Advanced         1
5       no  3.44    Novice      Novice         0
32     yes  3.46  Advanced    Beginner         0
>>> print(good_df.shape)
(35, 6)

Example: Access a ColumnExpression like a dictionary and use it to create a new DataFrame

This example accesses a ColumnExpression like a dictionary and uses it in an assign operation to create a new DataFrame with an additional 'rating' column, using the same case construct as the previous example.

>>> gpa = df['gpa']
>>> whens_df = df.assign(rating = case([(gpa > 3.0, 'good'),
...                                     (gpa > 2.0, 'average')],
...                                    else_='bad'))
>>> print(whens_df)
   masters   gpa     stats programming  admitted   rating
id
5       no  3.44    Novice      Novice         0     good
3       no  3.70    Novice    Beginner         1     good
1      yes  3.95  Beginner    Beginner         0     good
20     yes  3.90  Advanced    Advanced         1     good
8       no  3.60  Beginner    Advanced         1     good
25      no  3.96  Advanced    Advanced         1     good
18     yes  3.81  Advanced    Advanced         1     good
24      no  1.87  Advanced      Novice         1      bad
26     yes  3.57  Advanced    Advanced         1     good
38     yes  2.65  Advanced    Beginner         1  average
>>> print(whens_df.shape)
(40, 7)
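
The case construct in both examples checks its conditions in order and returns the value paired with the first condition that matches, falling back to else_ when none match. The plain-Python sketch below mirrors that logic for a few gpa values taken from the output above. The rate function is a hypothetical helper written for illustration; it runs locally and is not part of teradataml.

```python
def rate(gpa):
    """Mirror case([(gpa > 3.0, 'good'), (gpa > 2.0, 'average')], else_='bad')."""
    if gpa > 3.0:      # first condition checked
        return "good"
    if gpa > 2.0:      # reached only when gpa <= 3.0
        return "average"
    return "bad"       # the else_ branch


# gpa values for ids 5, 24 and 38, copied from the output above.
sample_gpas = {5: 3.44, 24: 1.87, 38: 2.65}
ratings = {row_id: rate(gpa) for row_id, gpa in sample_gpas.items()}
print(ratings)  # {5: 'good', 24: 'bad', 38: 'average'}
```

Because the conditions are evaluated in order, a gpa of 3.44 is rated 'good' even though it also satisfies gpa > 2.0.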