Access teradataml DataFrame Column - Teradata Package for Python

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

To use a teradataml DataFrame column (also called a ColumnExpression) in a filter, assign, or join operation, you must first access the column.

There are two ways to access a column:
  • Access the column as a DataFrame attribute:

    dataframe_object.column_name

  • Access the column like a dictionary key:

    dataframe_object["column_name"]

If the column name contains whitespace or special characters, Teradata recommends accessing the ColumnExpression like a dictionary key.
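The recommendation follows from Python syntax: attribute access only parses when the name is a valid identifier and not a reserved keyword. The sketch below illustrates that check using only the standard library. The helper `can_use_attribute_access` and the names "first name", "gpa%", and "class" are hypothetical, introduced for illustration; they are not part of teradataml or of the example table.

```python
import keyword


def can_use_attribute_access(column_name: str) -> bool:
    """Return True if `obj.<column_name>` is syntactically valid Python.

    Hypothetical helper for illustration only; not part of teradataml.
    """
    return column_name.isidentifier() and not keyword.iskeyword(column_name)


# "gpa" and "masters" come from the admissions_train example; the other
# names are made up to show whitespace, a special character, and a keyword.
for name in ["gpa", "masters", "first name", "gpa%", "class"]:
    if can_use_attribute_access(name):
        suggestion = "df." + name
    else:
        suggestion = 'df["' + name + '"]'
    print(f"{name!r}: use {suggestion}")
```

A name that passes this check can still coincide with an existing DataFrame attribute such as shape, in which case attribute access would presumably return that attribute rather than the column. Dictionary-style access avoids that ambiguity as well.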

Example Setup

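>>> # Assumes a connection to Vantage has already been established (for example, with create_context).
>>> from teradataml import DataFrame, load_example_data
>>> from teradataml.dataframe.sql_functions import case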
>>> load_example_data("GLM", ["admissions_train"])
>>> df = DataFrame("admissions_train")
>>> print(df)
   masters   gpa     stats programming  admitted
id
5       no  3.44    Novice      Novice         0
3       no  3.70    Novice    Beginner         1
1      yes  3.95  Beginner    Beginner         0
20     yes  3.90  Advanced    Advanced         1
8       no  3.60  Beginner    Advanced         1
25      no  3.96  Advanced    Advanced         1
18     yes  3.81  Advanced    Advanced         1
24      no  1.87  Advanced      Novice         1
26     yes  3.57  Advanced    Advanced         1
38     yes  2.65  Advanced    Beginner         1

Example 1: Access a ColumnExpression as an attribute and use it as a filter predicate

>>> gpa = df.gpa
>>> good_df = df[case([(gpa > 3.0, 'good'),
                       (gpa > 2.0, 'average')],
                       else_='bad') == 'good']
>>> print(good_df)
   masters   gpa     stats programming  admitted
id
13      no  4.00  Advanced      Novice         1
11      no  3.13  Advanced    Advanced         1
9       no  3.82  Advanced    Advanced         1
26     yes  3.57  Advanced    Advanced         1
3       no  3.70    Novice    Beginner         1
1      yes  3.95  Beginner    Beginner         0
20     yes  3.90  Advanced    Advanced         1
18     yes  3.81  Advanced    Advanced         1
5       no  3.44    Novice      Novice         0
32     yes  3.46  Advanced    Beginner         0
>>> print(good_df.shape)
(35, 6)

Example 2: Access a ColumnExpression like a dictionary key and use it to create a new DataFrame

This example accesses the ColumnExpression like a dictionary key and uses it in an assign operation to create a new DataFrame with an additional 'rating' column. It applies the same case construct as the previous example.

>>> gpa = df['gpa']
>>> whens_df = df.assign(rating = case([(gpa > 3.0, 'good'),
                                        (gpa > 2.0, 'average')],
                                        else_='bad'))
>>> print(whens_df)
   masters   gpa     stats programming  admitted   rating
id
5       no  3.44    Novice      Novice         0     good
3       no  3.70    Novice    Beginner         1     good
1      yes  3.95  Beginner    Beginner         0     good
20     yes  3.90  Advanced    Advanced         1     good
8       no  3.60  Beginner    Advanced         1     good
25      no  3.96  Advanced    Advanced         1     good
18     yes  3.81  Advanced    Advanced         1     good
24      no  1.87  Advanced      Novice         1      bad
26     yes  3.57  Advanced    Advanced         1     good
38     yes  2.65  Advanced    Beginner         1  average
>>> print(whens_df.shape)
(40, 7)
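
To make the branch order of the case construct concrete, here is a plain-Python sketch of the same logic applied to single values. The function `rate` is a hypothetical stand-in for illustration: it does not use teradataml and evaluates locally, whereas the case expression in the examples is evaluated by the database.

```python
def rate(gpa: float) -> str:
    """Plain-Python mirror of the case construct: the first matching branch wins."""
    if gpa > 3.0:
        return "good"
    if gpa > 2.0:
        return "average"
    return "bad"  # corresponds to else_='bad'


# gpa values copied from rows 5, 38, and 24 of the example output
for gpa in (3.44, 2.65, 1.87):
    print(gpa, rate(gpa))
```

Because the conditions are tested in order, a gpa of 3.44 is rated 'good' even though it also satisfies gpa > 2.0. Filtering on == 'good', as in Example 1, therefore keeps only the rows whose gpa is above 3.0.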